Introduction
Open any email with an attachment, inspect a JWT token, embed a small image directly in your CSS, or send binary data through a JSON API, and you will find Base64 encoding working behind the scenes. It is one of the most widely used encoding schemes in modern computing, yet many developers use it without fully understanding how it works or when it is the right tool for the job.
At its core, Base64 encoding converts binary data into a safe ASCII text representation using a 64-character alphabet. The encoded output can travel through systems that were designed to handle only text, including email protocols, HTML documents, JSON payloads, and URL query strings. Base64 decoding reverses the process, reconstructing the original binary data from the ASCII string.
In this guide, you will learn exactly how the Base64 algorithm works step by step, see working code examples in JavaScript, Python, and PHP, understand when to use it (and when not to), and discover the URL-safe variant used in JWTs and modern APIs.
What Is Base64 Encoding?
Base64 encoding is a binary-to-text encoding scheme that represents binary data using 64 printable ASCII characters. The name "Base64" comes from the fact that the encoding uses an alphabet of exactly 64 characters to represent data, compared to base-10 (decimal), base-16 (hexadecimal), or base-2 (binary).
The scheme is standardized in RFC 4648, which consolidates earlier definitions such as the one used for MIME email encoding (RFC 2045). The fundamental purpose is simple: take any binary data (an image, a PDF, encrypted ciphertext, or raw bytes) and produce a Base64 string that contains only letters, digits, and a few symbols that are safe in text-based contexts.
Unlike URL encoding (which escapes individual characters with percent signs), Base64 re-encodes the entire byte stream into a completely different representation. And unlike encryption, Base64 provides zero security — it is a reversible encoding, not a cipher. Anyone can decode it instantly.
How Base64 Works Step by Step
The Base64 encoding algorithm converts every group of 3 input bytes (24 bits) into 4 output characters (each representing 6 bits). Here is the process broken down:
- Take 3 bytes of input: This gives you 24 bits of binary data.
- Split into four 6-bit groups: 24 bits divide evenly into four groups of 6 bits each.
- Map each 6-bit value to a character: Each value (0-63) maps to a character in the Base64 alphabet.
- Handle padding: If the input length is not a multiple of 3, zero-fill the missing bits of the final group and pad the output with = characters.
Worked Example
Let us encode the string Man (3 ASCII bytes: 77, 97, 110):
Text: M a n
ASCII: 77 97 110
Binary: 01001101 01100001 01101110
Split into 6-bit groups:
010011 010110 000101 101110
Decimal: 19 22 5 46
Base64: T W F u
Result: "TWFu"
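The steps and the worked example above can be sketched in a few lines of Python. This is a minimal illustration written for this guide, not a replacement for a standard library encoder; the names ALPHABET and b64encode_sketch are hypothetical.

```python
# The 64-character standard alphabet, indexed by 6-bit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64encode_sketch(data: bytes) -> str:
    """Encode bytes as standard Base64, 3 input bytes at a time."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short in the final group
        # Pack the chunk into a 24-bit integer, zero-filling any missing bytes.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Slice the 24 bits into four 6-bit values, most significant first.
        sextets = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        chars = [ALPHABET[s] for s in sextets]
        # Characters that carry no input bits are replaced by '=' padding.
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)


print(b64encode_sketch(b"Man"))  # TWFu
```

The shifts 18, 12, 6, and 0 pick out the same four 6-bit groups shown in the worked example (19, 22, 5, 46).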
Base64 Padding
When the input is not a multiple of 3 bytes, Base64 padding fills in the gap. If there is 1 byte left over, the output gets two = signs. If there are 2 bytes left over, the output gets one = sign:
Input "Ma" (2 bytes) → "TWE=" (1 padding character)
Input "M" (1 byte) → "TQ==" (2 padding characters)
Input "Man" (3 bytes) → "TWFu" (no padding needed)
Padding ensures the encoded output length is always a multiple of 4, which makes decoding straightforward. Some implementations (like URL-safe Base64) omit padding entirely, since the decoder can infer the original length from the output length.
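The padding rule, and the reason a decoder can do without it, come down to simple arithmetic. The sketch below assumes standard Base64 as produced by Python's base64 module; padding_count and decoded_length are hypothetical helper names.

```python
import base64


def padding_count(num_bytes: int) -> int:
    """Number of '=' characters appended for an input of this many bytes."""
    return (3 - num_bytes % 3) % 3


def decoded_length(unpadded_chars: int) -> int:
    """Bytes represented by a Base64 string once its padding is stripped."""
    if unpadded_chars % 4 == 1:
        # A single leftover character carries only 6 bits, less than one byte.
        raise ValueError("no valid Base64 string has length 1 modulo 4")
    return unpadded_chars * 6 // 8


for text in (b"M", b"Ma", b"Man"):
    encoded = base64.b64encode(text).decode("ascii")
    stripped = encoded.rstrip("=")
    print(text, encoded, padding_count(len(text)), decoded_length(len(stripped)))
```

Because each character carries 6 bits, the unpadded length alone determines how many whole bytes were encoded.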
The Base64 Alphabet
The standard Base64 alphabet consists of 64 characters plus the padding character =. Here is the complete mapping:
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0 | A | 16 | Q | 32 | g | 48 | w |
| 1 | B | 17 | R | 33 | h | 49 | x |
| 2 | C | 18 | S | 34 | i | 50 | y |
| 3 | D | 19 | T | 35 | j | 51 | z |
| 4 | E | 20 | U | 36 | k | 52 | 0 |
| 5 | F | 21 | V | 37 | l | 53 | 1 |
| 6 | G | 22 | W | 38 | m | 54 | 2 |
| 7 | H | 23 | X | 39 | n | 55 | 3 |
| 8 | I | 24 | Y | 40 | o | 56 | 4 |
| 9 | J | 25 | Z | 41 | p | 57 | 5 |
| 10 | K | 26 | a | 42 | q | 58 | 6 |
| 11 | L | 27 | b | 43 | r | 59 | 7 |
| 12 | M | 28 | c | 44 | s | 60 | 8 |
| 13 | N | 29 | d | 45 | t | 61 | 9 |
| 14 | O | 30 | e | 46 | u | 62 | + |
| 15 | P | 31 | f | 47 | v | 63 | / |
The 65th character, =, is used exclusively for padding. In the URL-safe Base64 variant, + is replaced with - and / is replaced with _.
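The table does not need to be memorized, since it follows a regular pattern: uppercase letters, lowercase letters, digits, then two symbols. As a small sketch, it can be rebuilt in Python from the string module; STANDARD, URL_SAFE, and DECODE are names chosen here for illustration.

```python
import string

# Values 0-25: A-Z, 26-51: a-z, 52-61: 0-9, 62: '+', 63: '/'.
STANDARD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# The URL-safe variant changes only the last two entries.
URL_SAFE = STANDARD[:62] + "-_"

# Reverse lookup (character -> 6-bit value), which is what a decoder needs.
DECODE = {char: value for value, char in enumerate(STANDARD)}

print(STANDARD[19], STANDARD[22], STANDARD[5], STANDARD[46])  # T W F u
print(DECODE["u"])  # 46
```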
Base64 in JavaScript
JavaScript provides two built-in functions for Base64 operations: btoa() and atob(). The names stand for "Binary To ASCII" and "ASCII To Binary" respectively.
// Basic Base64 encode and decode
const encoded = btoa("Hello, World!");
console.log(encoded); // "SGVsbG8sIFdvcmxkIQ=="
const decoded = atob("SGVsbG8sIFdvcmxkIQ==");
console.log(decoded); // "Hello, World!"
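However, btoa() and atob() only handle characters in the Latin-1 range (code points 0-255). btoa() throws on anything outside that range, such as emoji or most non-Latin scripts, and it encodes accented Latin-1 letters as single bytes rather than as UTF-8. To Base64-encode text as UTF-8, use TextEncoder and TextDecoder: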
// UTF-8 safe Base64 encode
function base64Encode(str) {
  const bytes = new TextEncoder().encode(str);
  let binary = '';
  bytes.forEach(b => binary += String.fromCharCode(b));
  return btoa(binary);
}
// UTF-8 safe Base64 decode
function base64Decode(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new TextDecoder().decode(bytes);
}
console.log(base64Encode("Hello")); // "SGVsbG8="
console.log(base64Decode("4pyT")); // "✓" (U+2713), decoded as UTF-8
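The underlying point is independent of JavaScript: Base64 operates on bytes, so the same text produces different output depending on how it is first turned into bytes. The following Python sketch illustrates this for a single accented letter; the comparison is an illustration added here, not part of the browser API.

```python
import base64

text = "é"

# Latin-1 stores 'é' as the single byte 0xE9; UTF-8 stores it as 0xC3 0xA9.
latin1_b64 = base64.b64encode(text.encode("latin-1")).decode("ascii")
utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")

print(latin1_b64)  # 6Q==
print(utf8_b64)    # w6k=

# Characters outside Latin-1 cannot be mapped to a single byte at all.
try:
    "✓".encode("latin-1")
except UnicodeEncodeError:
    print("not representable in Latin-1")
```

Whichever encoding the sender uses, the receiver has to assume the same one when turning the decoded bytes back into text.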
For image to Base64 conversion in the browser, use the FileReader API:
// Convert an image file to a Base64 data URI
const input = document.querySelector('input[type="file"]');
input.addEventListener('change', (e) => {
  const reader = new FileReader();
  reader.onload = () => {
    // reader.result contains: "data:image/png;base64,iVBOR..."
    document.querySelector('img').src = reader.result;
  };
  reader.readAsDataURL(e.target.files[0]);
});
Base64 in Python
Python handles Base64 through the built-in base64 module. Its functions work with bytes objects, so you must encode strings to bytes first:
import base64
# Encode a string to Base64
text = "Hello, World!"
encoded = base64.b64encode(text.encode('utf-8'))
print(encoded) # b'SGVsbG8sIFdvcmxkIQ=='
# Decode Base64 back to string
decoded = base64.b64decode(encoded).decode('utf-8')
print(decoded) # "Hello, World!"
# URL-safe Base64 (replaces + with - and / with _)
url_encoded = base64.urlsafe_b64encode(text.encode('utf-8'))
print(url_encoded) # b'SGVsbG8sIFdvcmxkIQ==' (same as above: this input produces no + or /)
# Encode binary file to Base64
with open('image.png', 'rb') as f:
    img_base64 = base64.b64encode(f.read()).decode('ascii')
data_uri = f"data:image/png;base64,{img_base64}"
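By default, base64.b64decode() silently discards characters outside the Base64 alphabet. When the input comes from an untrusted source, passing validate=True makes it raise an error instead. A minimal sketch, where decode_strict is a hypothetical wrapper name:

```python
import base64
import binascii
from typing import Optional


def decode_strict(value: str) -> Optional[bytes]:
    """Return the decoded bytes, or None if the input is not valid Base64."""
    try:
        # validate=True rejects characters outside the standard alphabet.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        # Raised for stray characters, bad padding, or non-ASCII input.
        return None


print(decode_strict("SGVsbG8sIFdvcmxkIQ=="))  # b'Hello, World!'
print(decode_strict("SGVsbG8*"))              # None
```

Returning None is only one possible design; re-raising a more specific exception may suit an API better.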
Base64 in PHP
PHP has straightforward functions for Base64 operations: base64_encode() and base64_decode(). PHP strings handle binary data natively, so no extra conversion is needed:
// Encode and decode strings
$encoded = base64_encode("Hello, World!");
echo $encoded; // "SGVsbG8sIFdvcmxkIQ=="
$decoded = base64_decode("SGVsbG8sIFdvcmxkIQ==");
echo $decoded; // "Hello, World!"
// Encode an image file
$imageData = file_get_contents('photo.jpg');
$base64Image = base64_encode($imageData);
$dataUri = 'data:image/jpeg;base64,' . $base64Image;
// Validate Base64 before decoding
$input = $_POST['data'] ?? '';
$decoded = base64_decode($input, true); // strict mode
if ($decoded === false) {
    echo "Invalid Base64 input";
} else {
    echo "Decoded: " . $decoded;
}
Common Use Cases for Base64
Base64 appears in nearly every layer of the modern web stack. Here are the most important real-world applications:
| Use Case | How Base64 Is Used | Example |
|---|---|---|
| Data URIs | Embed images, fonts, or SVGs directly in HTML/CSS to eliminate HTTP requests | data:image/png;base64,iVBOR... |
| Email (MIME) | Encode attachments so binary files can travel through SMTP, which was designed for 7-bit ASCII text | Content-Transfer-Encoding: base64 |
| JWT Tokens | The header and payload sections of a JSON Web Token are Base64url-encoded JSON objects | eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0... |
| HTTP Basic Auth | The username:password pair is Base64-encoded in the Authorization header | Authorization: Basic dXNlcjpwYXNz |
| API Payloads | Transmit binary data (files, images, certificates) inside JSON objects | {"file": "JVBERi0xLjQK..."} |
| Cryptography | Encode encrypted ciphertext, keys, and signatures for text-safe transport | PGP armored output, SSL certificates (PEM format) |
For embedding small images (icons and logos under roughly 2-4 KB), Base64 data URIs reduce HTTP requests and can improve page load speed. For larger images, the 33% size increase makes separate files more efficient. Our Base64 Encoder/Decoder tool generates ready-to-use data URIs with automatic MIME type detection, so you can paste the output directly into your HTML or CSS.
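The HTTP Basic Auth row in the table is easy to reproduce. The sketch below builds the header value in Python; basic_auth_header is a hypothetical helper, and the credentials are the same placeholder pair used in the table.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("user", "pass"))  # Basic dXNlcjpwYXNz
```

Since anyone can decode the header value, Basic Auth protects credentials only when the connection itself is encrypted.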
Base64 URL-Safe Variant
The standard Base64 alphabet includes + and /, which have special meanings in URLs. The + character is interpreted as a space in query strings, and / is a path separator. The = padding character also needs URL encoding since it is used for parameter assignment.
URL-safe Base64 (also called Base64url, defined in RFC 4648 Section 5) solves this by substituting two characters and, in most uses, omitting padding:
| Standard Base64 | URL-Safe Base64 | Reason |
|---|---|---|
| + | - (hyphen) | + means space in URL query strings |
| / | _ (underscore) | / is a path separator in URLs |
| = (padding) | Omitted | = is the assignment operator in URL parameters |
JWTs use Base64url encoding for both the header and payload segments. This is why JWT strings contain hyphens and underscores instead of plus signs and slashes. Here is a quick conversion in JavaScript:
// Standard Base64 to URL-safe Base64
function toBase64Url(base64) {
  return base64
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}
// URL-safe Base64 back to standard
function fromBase64Url(base64url) {
  let base64 = base64url
    .replace(/-/g, '+')
    .replace(/_/g, '/');
  // Re-add padding
  while (base64.length % 4 !== 0) {
    base64 += '=';
  }
  return base64;
}
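For comparison, here is a sketch of the same idea in Python, applied to the two JWT segments shown in the use-case table. It only decodes the header and payload and does not verify any signature; b64url_encode and b64url_decode are hypothetical helper names.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode bytes and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode Base64url text, restoring any padding that was stripped."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


header = json.loads(b64url_decode("eyJhbGciOiJIUzI1NiJ9"))
payload = json.loads(b64url_decode("eyJzdWIiOiIxMjM0NTY3ODkwIn0"))
print(header)   # {'alg': 'HS256'}
print(payload)  # {'sub': '1234567890'}
```

The expression -len(text) % 4 yields the number of = characters needed to bring the length back to a multiple of 4.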
Base64 Performance & Size Overhead
The most important tradeoff to understand is the 33% size increase. Because 3 bytes become 4 characters, every piece of Base64-encoded data is approximately one-third larger than the original binary. With MIME Base64 line wrapping (76-character lines followed by CRLF), the overhead increases slightly to about 36-37%.
This has practical implications:
- Inline images: A 10 KB icon becomes ~13.3 KB when Base64-encoded in a data URI. For small icons, eliminating an HTTP request is worth it. For a 500 KB photo, the 167 KB overhead is not.
- API payloads: Sending a 1 MB file as Base64 in a JSON body uses ~1.33 MB of bandwidth, and most JSON parsers must buffer the whole value before decoding it, which makes streaming difficult. For large files, multipart uploads are more efficient.
- Email attachments: A 5 MB PDF attachment becomes ~6.7 MB in the email (slightly more with MIME line breaks), so a message can exceed a provider's size limit even when the file itself is under it.
- Caching: Base64-encoded inline resources cannot be cached independently by the browser, while separate files can be cached with long expiration headers.
As a rule of thumb: use Base64 data URIs for assets under 2-4 KB where reducing HTTP requests matters, and use separate files for anything larger.
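The figures above can be reproduced with a short calculation. This sketch assumes padded standard Base64 and, for the MIME case, that every 76-character line (including the last) ends with CRLF; base64_size and mime_size are hypothetical helper names.

```python
import base64


def base64_size(num_bytes: int) -> int:
    """Length in characters of padded Base64 for an input of num_bytes."""
    return (num_bytes + 2) // 3 * 4


def mime_size(num_bytes: int, line_length: int = 76) -> int:
    """Length including a 2-byte CRLF after every line of encoded output."""
    chars = base64_size(num_bytes)
    lines = -(-chars // line_length)  # ceiling division
    return chars + 2 * lines


for label, size in (("10 KB icon", 10 * 1024), ("500 KB photo", 500 * 1024)):
    encoded = base64_size(size)
    print(f"{label}: {encoded / 1024:.1f} KB encoded "
          f"(+{(encoded - size) / 1024:.1f} KB), "
          f"MIME overhead {mime_size(size) / size - 1:.1%}")
```

The formula can be checked against the standard library: base64_size(n) equals len(base64.b64encode(bytes(n))) for any n.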
Frequently Asked Questions
What is Base64 encoding used for?
Base64 encoding is used to convert binary data into ASCII text so it can be safely transmitted through text-based systems. Common uses include embedding images in HTML and CSS via data URIs, encoding email attachments (MIME), transmitting binary data in JSON APIs, encoding JWT token payloads, and HTTP Basic Authentication headers.
Why does Base64 increase file size by 33%?
Base64 converts every 3 bytes of binary data into 4 ASCII characters. Each Base64 character represents 6 bits of data, but occupies 8 bits (1 byte) in storage. So 3 input bytes (24 bits) become 4 output bytes (32 bits), resulting in a 33% size overhead. With line breaks added for MIME encoding, the overhead can reach approximately 36-37%.
What is the difference between btoa() and atob() in JavaScript?
btoa() encodes a binary string to Base64 (Binary To ASCII), while atob() decodes a Base64 string back to binary (ASCII To Binary). Note that btoa() only handles Latin-1 characters and throws on anything outside that range, such as emoji. To encode arbitrary text as UTF-8, you must first encode with TextEncoder, convert the bytes to a binary string, and then use btoa().
What is URL-safe Base64?
URL-safe Base64 replaces the + character with - (hyphen) and / with _ (underscore) from the standard Base64 alphabet, because + and / have special meanings in URLs. It also typically omits the = padding characters. This variant is used in JWTs, URL parameters, and file names where standard Base64 characters would cause parsing issues.
Is Base64 a form of encryption?
No. Base64 is an encoding scheme, not encryption. It does not provide any security or confidentiality — anyone can decode a Base64 string instantly without a key. If you need to protect data, use proper encryption algorithms like AES-256-GCM. Base64 is often used to encode the output of encryption algorithms so the ciphertext can be transmitted as text.
Conclusion
Base64 encoding is a fundamental tool in every developer's toolkit. It solves a simple but critical problem: safely transporting binary data through text-only channels. The algorithm is straightforward (3 bytes in, 4 characters out), the size overhead is predictable (33%), and every major language has built-in support.
The key rules to remember: use standard Base64 for email and general-purpose encoding, use Base64url for JWTs, URLs, and file names, use data URIs only for small assets, and never use Base64 as a substitute for encryption. When you need to handle UTF-8 text in JavaScript, wrap btoa()/atob() with TextEncoder/TextDecoder.
Need to Base64 encode or Base64 decode something right now? Use our free Base64 Encoder/Decoder — it supports Standard, URL-safe, and MIME modes, handles file uploads with image preview and data URI generation, and processes everything locally in your browser with no data sent to any server.