Base64 is a binary-to-text encoding scheme that converts raw bytes into a string drawn from 64 printable ASCII characters: A-Z, a-z, 0-9, +, and /. It exists because many systems (email protocols, JSON payloads, HTML attributes) were designed to handle text, not arbitrary binary data. Base64 bridges that gap.

This article covers how Base64 works, when to use it (and when not to), code examples in JavaScript, Python, and Go, and the Base64URL variant you'll encounter in JWTs.

If you just need to encode or decode something right now, use the free Base64 decoder and encoder -- results appear as you type.

How Base64 Encoding Works

Base64 takes every 3 bytes of input (24 bits) and maps them to 4 Base64 characters (each representing 6 bits). That's the entire algorithm: regroup bits from 8-bit chunks to 6-bit chunks.

Here's the step-by-step for the string "Man":

Input          M         a         n
ASCII decimal  77        97        110
Binary         01001101  01100001  01101110

Concatenated: 010011010110000101101110

Split into 6-bit groups: 010011 010110 000101 101110

Which map to: T W F u

So "Man" encodes to "TWFu".
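
The regrouping above can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: `encode_triplet` and `ALPHABET` are hypothetical names introduced here, and the helper only handles complete 3-byte groups. The standard library call is included for comparison.

```python
import base64

# Standard Base64 alphabet: index 0-25 = A-Z, 26-51 = a-z, 52-61 = 0-9, 62 = +, 63 = /
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (hypothetical helper)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into one 24-bit integer.
    bits = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Read the integer back as four 6-bit indices, most significant first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_triplet(b"Man"))             # TWFu
print(base64.b64encode(b"Man").decode())  # TWFu (standard library, for comparison)
```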

Padding with =

Base64 processes input in 3-byte blocks. When your input length isn't a multiple of 3, the algorithm pads the output with = characters:

  • 1 leftover byte: 2 Base64 chars + ==
  • 2 leftover bytes: 3 Base64 chars + =

"Ma" (2 bytes) encodes to "TWE=", and "M" (1 byte) encodes to "TQ==".

The = characters are not data. They're structural padding so the decoder knows the original byte count.
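
As a quick check of the padding rule, the sketch below compares a small formula against Python's standard `base64` module. The helper name `padding_length` is an assumption made for this example.

```python
import base64


def padding_length(byte_count: int) -> int:
    """Number of '=' characters expected for an input of byte_count bytes."""
    return (3 - byte_count % 3) % 3


for data in (b"M", b"Ma", b"Man"):
    encoded = base64.b64encode(data).decode("ascii")
    # The count of trailing '=' should match the formula.
    assert encoded.count("=") == padding_length(len(data))
    print(data, encoded, padding_length(len(data)))

# b'M' TQ== 2
# b'Ma' TWE= 1
# b'Man' TWFu 0
```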

Size Overhead

Base64 output is about 33% larger than the input: every three bytes in become four characters out, and padding rounds the last group up to a full four. For a 1 MB image, that's ~1.33 MB of Base64 text. This matters when deciding whether to embed images inline.
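
The exact output size follows from the 3-in, 4-out rule. A small sketch, assuming padded output and using a hypothetical helper name `encoded_length`:

```python
import base64


def encoded_length(byte_count: int) -> int:
    """Length of padded Base64 output for byte_count input bytes."""
    # Round the input up to whole 3-byte groups; each group yields 4 characters.
    return 4 * ((byte_count + 2) // 3)


for n in (1, 3, 1000, 1_000_000):
    actual = len(base64.b64encode(bytes(n)))  # bytes(n) is n zero bytes
    assert actual == encoded_length(n)
    print(n, actual, f"{actual / n:.2f}x")
```

For tiny inputs the ratio is much worse than 1.33x (one byte becomes four characters); it settles near 1.33x as the input grows.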

The Base64 Alphabet

The 64 characters are not arbitrary. They were chosen because they're safe in virtually every text-handling system:

Index   Characters
0-25    A-Z
26-51   a-z
52-61   0-9
62      +
63      /

Padding character: =

These characters survive SMTP transmission, JSON serialization, XML documents, and most legacy systems -- which is exactly why they were chosen. The alphabet goes back to the early email standards (MIME, RFC 2045, uses it) and is specified today in RFC 4648.

Common Use Cases

1. JWT Tokens

Every JWT (JSON Web Token) is three Base64URL-encoded segments separated by dots: header.payload.signature. The payload is Base64URL-encoded JSON -- that's why you can decode the claims from any JWT without a key. The signature verifies integrity; the encoding is not encryption.

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U

Decode the middle segment to see the payload.
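
Here is a minimal Python sketch of that decoding step, applied to the sample token above. `decode_segment` is a hypothetical helper name. Note that this only reads the claims; it does not verify the signature, so it says nothing about whether the token is genuine.

```python
import base64
import json

# Sample token from the text above.
token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIn0."
    "dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U"
)


def decode_segment(segment: str) -> dict:
    """Decode one Base64URL-encoded JSON segment (hypothetical helper)."""
    # JWT segments drop '=' padding; restore it so the length is a multiple of 4.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header_b64, payload_b64, _signature = token.split(".")
print(decode_segment(header_b64))   # {'alg': 'HS256', 'typ': 'JWT'}
print(decode_segment(payload_b64))  # {'sub': '1234567890'}
```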

2. Inline Images in CSS and HTML

.icon { background-image: url("data:image/png;base64,iVBORw0KGgo..."); }

Useful for small icons where an extra HTTP request would cost more than the 33% size overhead. For images larger than ~10 KB, an external file is usually faster.
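
Building such a data URI is a one-liner once the bytes are in hand. The sketch below uses a hypothetical helper `to_data_uri` and feeds it only the 8-byte PNG file signature, which is not a complete image but shows where the familiar `iVBORw0KGgo` prefix comes from.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Every PNG file starts with these 8 bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_uri(png_signature, "image/png"))  # data:image/png;base64,iVBORw0KGgo=
```

In practice you would pass the full file contents, for example the result of reading the image in binary mode.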

3. HTTP Basic Authentication

The Authorization: Basic header encodes username:password as Base64:

Authorization: Basic dXNlcjpwYXNzd29yZA==

dXNlcjpwYXNzd29yZA== decodes to user:password. Basic Auth over HTTP is completely insecure. Basic Auth over HTTPS is fine for internal services -- the Base64 is not the security; TLS is.
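
The round trip can be sketched as follows. `basic_auth_header` and `parse_basic_auth` are hypothetical helper names, and the sketch assumes UTF-8 credentials; it shows how little work it takes to recover the password from the header.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth(header_value: str):
    """Return (username, password) from a Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Split on the first colon only; the password itself may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


print(basic_auth_header("user", "password"))            # Basic dXNlcjpwYXNzd29yZA==
print(parse_basic_auth("Basic dXNlcjpwYXNzd29yZA=="))   # ('user', 'password')
```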

4. Binary Data in JSON

JSON has no binary type. Storing or transmitting binary data (certificates, cryptographic keys, file contents) in JSON requires Base64-encoding it to a string first.
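
A minimal round trip, assuming a made-up document shape with `name` and `data` fields:

```python
import base64
import json

raw = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary bytes that are not valid text

# Encode the bytes to an ASCII string so they can live inside a JSON document.
document = json.dumps({
    "name": "blob.bin",  # field names are illustrative only
    "data": base64.b64encode(raw).decode("ascii"),
})
print(document)  # {"name": "blob.bin", "data": "AP8QgA=="}

# The receiver reverses the two steps: parse the JSON, then decode the field.
restored = base64.b64decode(json.loads(document)["data"])
assert restored == raw
```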

5. Email Attachments (MIME)

SMTP was designed for 7-bit ASCII. MIME's Content-Transfer-Encoding: base64 header allows email clients to send binary attachments by encoding them as Base64 text.

Base64URL: The URL-Safe Variant

Standard Base64 uses + and /, which have special meanings in URLs (+ = space, / = path separator). Base64URL replaces them:

Standard Base64   Base64URL
+                 -
/                 _
=                 padding omitted

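The algorithm is identical, and so is the rest of the alphabet -- only those three differences exist. Padding is always dropped in JWTs, but some encoders keep it unless told otherwise. Base64URL is used in JWTs, OAuth tokens, and anywhere tokens appear in URLs or HTTP headers.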
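
To see the substitution in action, it helps to pick bytes that happen to map to indices 62 and 63. The sketch below uses Python's standard `base64` module; `b64url_decode` is a hypothetical helper that accepts unpadded input.

```python
import base64

data = b"\xfb\xff"  # chosen because it encodes to the two characters that differ

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
unpadded = url_safe.rstrip("=")  # the form used in JWTs

print(standard)  # +/8=
print(url_safe)  # -_8=
print(unpadded)  # -_8


def b64url_decode(text: str) -> bytes:
    """Decode Base64URL text with or without '=' padding (hypothetical helper)."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


assert b64url_decode(unpadded) == data
assert b64url_decode(url_safe) == data
```

Note that Python's `urlsafe_b64encode` keeps the padding, so stripping it is a separate step.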

The most common Base64 bug developers encounter: using standard atob() on a Base64URL string. The decoder fails because it doesn't recognize - or _. You need to replace them back before decoding:

// JavaScript
function decodeBase64URL(str) {
  // Replace Base64URL chars with standard Base64 chars
  const base64 = str.replace(/-/g, '+').replace(/_/g, '/');
  // Restore padding
  const padded = base64.padEnd(base64.length + (4 - base64.length % 4) % 4, '=');
  return atob(padded);
}

Code Examples: Encoding and Decoding

JavaScript

// JavaScript (Browser & Node.js)

// Encode (browser)
const encoded = btoa("Hello, World!");
console.log(encoded); // SGVsbG8sIFdvcmxkIQ==

// Decode (browser)
const decoded = atob("SGVsbG8sIFdvcmxkIQ==");
console.log(decoded); // Hello, World!

// Encode in Node.js
const encodedNode = Buffer.from("Hello, World!").toString("base64");
console.log(encodedNode); // SGVsbG8sIFdvcmxkIQ==

// Decode in Node.js
const decodedNode = Buffer.from("SGVsbG8sIFdvcmxkIQ==", "base64").toString("utf8");
console.log(decodedNode); // Hello, World!

Unicode gotcha: btoa() only accepts Latin-1 strings. Passing a string with any character above code point 255 (emoji, Chinese characters, accented letters outside Latin-1 such as ő) throws InvalidCharacterError. Use TextEncoder first:

// JavaScript -- Unicode-safe encoding
function encodeBase64Unicode(str) {
  const bytes = new TextEncoder().encode(str);
  const binary = String.fromCharCode(...bytes);
  return btoa(binary);
}

function decodeBase64Unicode(encoded) {
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(encodeBase64Unicode("Hello 🌍")); // SGVsbG8g8J+MjQ==
console.log(decodeBase64Unicode("SGVsbG8g8J+MjQ==")); // Hello 🌍

Python

# Python 3
import base64

# Encode
encoded = base64.b64encode(b"Hello, World!").decode("utf-8")
print(encoded)  # SGVsbG8sIFdvcmxkIQ==

# Decode
decoded = base64.b64decode("SGVsbG8sIFdvcmxkIQ==").decode("utf-8")
print(decoded)  # Hello, World!

# Base64URL (for JWTs and URL-safe contexts)
url_encoded = base64.urlsafe_b64encode(b"Hello, World!").decode("utf-8")
print(url_encoded)  # SGVsbG8sIFdvcmxkIQ==

url_decoded = base64.urlsafe_b64decode("SGVsbG8sIFdvcmxkIQ==").decode("utf-8")
print(url_decoded)  # Hello, World!

Go

// Go
package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	input := "Hello, World!"

	// Standard Base64
	encoded := base64.StdEncoding.EncodeToString([]byte(input))
	fmt.Println(encoded) // SGVsbG8sIFdvcmxkIQ==

	decoded, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(decoded)) // Hello, World!

	// URL-safe Base64 (for JWTs, OAuth tokens)
	urlEncoded := base64.URLEncoding.EncodeToString([]byte(input))
	fmt.Println(urlEncoded) // SGVsbG8sIFdvcmxkIQ==

	// Without padding (as used in JWTs)
	rawEncoded := base64.RawURLEncoding.EncodeToString([]byte(input))
	fmt.Println(rawEncoded) // SGVsbG8sIFdvcmxkIQ
}

Base64 Is Not Encryption

This is the most important thing to internalize about Base64: it provides zero security.

Base64 is a reversible, keyless transformation. Anyone who sees a Base64 string can decode it in one function call. There's no secret, no key, no protection.

Common mistakes:

  • Storing passwords as Base64: Use bcrypt, Argon2, or scrypt instead. Never use Base64 for passwords.
  • "Hiding" API keys by Base64-encoding them: The keys are fully exposed. Anyone reading your JS bundle can decode them.
  • Treating a Base64-encoded token as confidential: JWT payloads are Base64URL-encoded and fully readable. The signature verifies the token wasn't tampered with; it doesn't hide the claims.

The correct mental model: encoding is for compatibility, encryption is for secrecy, hashing is for verification. Base64 only serves the first purpose.

Common Gotchas

btoa() fails on non-Latin-1 input

If you pass a string containing characters with code points above 255 (Unicode), btoa() throws. See the Unicode-safe encoding pattern in the JavaScript examples above.
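
The same boundary can be illustrated in Python, as a rough analogy rather than a description of how btoa() is implemented: characters up to code point 255 fit in Latin-1, anything above does not, and converting the text to UTF-8 bytes first sidesteps the problem.

```python
import base64

text = "Hello \U0001F30D"  # "Hello " followed by the globe emoji from the JS example

# Accented letters inside Latin-1 are fine...
assert "caf\u00e9".encode("latin-1") == b"caf\xe9"

# ...but the emoji has no Latin-1 representation, which mirrors the btoa() failure.
try:
    text.encode("latin-1")
except UnicodeEncodeError:
    print("not representable in Latin-1")

# Encoding to UTF-8 bytes first always works, and Base64 operates on those bytes.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(encoded)  # SGVsbG8g8J+MjQ==
assert base64.b64decode(encoded).decode("utf-8") == text
```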

Padding errors with Base64URL

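JWT segments have no = padding. Many Base64 decoders reject input whose length isn't a multiple of 4, and even lenient ones won't accept - and _. Restore the padding (and swap the characters back) before calling atob() or equivalent functions.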

Line breaks in some Base64 implementations

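MIME (RFC 2045) limits Base64 lines to 76 characters, so encoders built for email insert a line break after every 76 characters of output. RFC 4648 itself tells implementations not to add line breaks unless the referring specification requires them, and most modern implementations don't. If you're parsing Base64 from email content, strip whitespace before decoding.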
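
Python's standard library has a MIME-style encoder, `base64.encodebytes`, which makes the wrapping easy to observe. The sketch below assumes 100 bytes of sample data and shows that a strict decoder rejects the embedded newlines until whitespace is removed.

```python
import base64
import binascii

data = bytes(range(100))

# encodebytes wraps its output MIME-style: a newline after every 76 characters.
wrapped = base64.encodebytes(data)
line_lengths = [len(line) for line in wrapped.splitlines()]
print(line_lengths)  # [76, 60]

# A strict decoder treats the newlines as invalid characters.
try:
    base64.b64decode(wrapped, validate=True)
except binascii.Error:
    print("strict decoding rejects the embedded newlines")

# Removing all whitespace first makes the text acceptable to strict decoders.
compact = b"".join(wrapped.split())
assert base64.b64decode(compact, validate=True) == data
```

By default `base64.b64decode` silently discards characters outside the alphabet, so it would accept the wrapped form; other decoders are not always that forgiving.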

Base64 is not the same as hex encoding

Both Base64 and hex (Base16) encode binary as text. Hex uses 2 characters per byte. Base64 uses 4 characters per 3 bytes. Base64 is ~33% overhead; hex is 100% overhead. Hex is easier to read manually; Base64 is more compact.
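
A short comparison of the two encodings on the same 30 bytes, using only Python's standard library:

```python
import base64

data = bytes(range(30))

hex_text = data.hex()                              # 2 characters per byte
b64_text = base64.b64encode(data).decode("ascii")  # 4 characters per 3 bytes

print(len(data), len(hex_text), len(b64_text))  # 30 60 40

# Both encodings are lossless and reversible.
assert bytes.fromhex(hex_text) == data
assert base64.b64decode(b64_text) == data
```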

Frequently asked questions

Is Base64 encoding the same as encryption?

No. Base64 is a reversible, keyless encoding. Any Base64 string can be decoded with a single function call. Use AES or similar for encryption.

Why does Base64 output end with = or ==?

Padding. Base64 processes 3-byte blocks. When the input isn't a multiple of 3 bytes, = characters fill out the final group. They carry no data -- they're structural.

What's the difference between Base64 and Base64URL?

Base64URL replaces + with - and / with _, and omits = padding. It's used in JWTs and tokens that appear in URLs or HTTP headers, where +, /, and = would be misinterpreted.

Does Base64 encoding compress data?

No -- it expands by ~33%. Every 3 bytes of input becomes 4 characters of output. Never use Base64 expecting smaller output.

Can Base64 encode any type of data?

Yes. Base64 operates on raw bytes, so it can encode strings, images, PDFs, certificates, or any binary content. The encoded output is always ASCII text.

Why does btoa() throw an error on my string?

btoa() only accepts Latin-1 (ISO 8859-1) characters (code points 0-255). If your string contains emoji or non-Latin characters, use TextEncoder to convert to bytes first. See the Unicode-safe example above.

Conclusion

Base64 solves a specific problem: moving binary data through text-only channels without corruption. The algorithm is simple -- 3 bytes in, 4 characters out -- and the 64-character alphabet was chosen for maximum compatibility across legacy systems.

Key takeaways:

  • Base64 uses 64 ASCII characters (A-Z, a-z, 0-9, +, /) to represent 6 bits each
  • Output is ~33% larger than input
  • = padding fills incomplete 3-byte blocks; it's not data
  • Base64URL replaces + with -, / with _, and drops padding; used in JWTs and tokens
  • Base64 is not encryption and provides no security whatsoever
  • btoa() in browsers only handles Latin-1; use TextEncoder for Unicode strings

To encode or decode a Base64 string right now, use the free Base64 tool -- no sign-up, no rate limits, all processing in your browser.