What does the 'b' character do in front of a string literal?

The b prefix in front of a string literal creates a bytes object instead of a regular text string. In Python 3, b"Hello" produces a bytes value holding the ASCII encoding of the text Hello, while "Hello" without the prefix produces a str value holding Unicode text. This distinction matters whenever you work with network I/O, binary files, cryptographic functions, or any API that operates on raw bytes rather than human-readable characters.

How `str` and `bytes` Differ

Python 3 enforces a strict separation between text and binary data. A str is a sequence of Unicode code points. A bytes object is a sequence of integers in the range 0-255. The two types are not interchangeable.

python

text = "Hello"
binary = b"Hello"

print(type(text))      # <class 'str'>
print(type(binary))    # <class 'bytes'>

print(text[0])         # 'H'  (a character)
print(binary[0])       # 72   (an integer, the ASCII value of 'H')

This separation prevents an entire class of encoding bugs that plagued Python 2, where str and bytes were conflated and silently converted using ASCII, breaking the moment non-ASCII characters appeared.
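As a small illustration of that separation, the sketch below mixes the two types. The variable names are only for illustration; the behaviour shown is standard Python 3.

```python
text = "Hello"
binary = b"Hello"

# Concatenating str and bytes is rejected outright.
try:
    combined = text + binary
except TypeError:
    combined = None

# Equality between str and bytes is never an error; it is simply False.
same = (text == binary)

# Converting explicitly makes the comparison meaningful.
same_after_encoding = (text.encode("ascii") == binary)

print(combined, same, same_after_encoding)   # None False True
```

Note that the equality check fails quietly rather than loudly, so an explicit encode() or decode() is still the safest habit.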

What You Can Put in a `b""` Literal

A bytes literal only accepts ASCII characters and escape sequences. You cannot include non-ASCII characters directly:

python

# Valid: ASCII characters and escape sequences
data = b"Hello\x00\xff\n"
data = b"cafe"           # fine, all ASCII
data = b"caf\xc3\xa9"    # fine, the UTF-8 bytes of 'é' written as escapes

# Invalid: pasting the actual 'é' character is a SyntaxError
# data = b"café"

If you need to represent non-ASCII text as bytes, encode a str instead:

python

text = "café"
encoded = text.encode("utf-8")
print(encoded)       # b'caf\xc3\xa9'
print(len(text))     # 4 characters
print(len(encoded))  # 5 bytes (the 'é' takes 2 bytes in UTF-8)
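To tie the literal form and the encoded form together, here is a minimal sketch showing that both describe the same five bytes. The string is written with a `\u00e9` escape purely so the snippet stays ASCII-only; it is the same text as "café".

```python
text = "caf\u00e9"            # same text as "café", written with an escape
literal = b"caf\xc3\xa9"      # the UTF-8 bytes spelled out by hand

# Encoding the text yields exactly the bytes in the literal.
matches = (text.encode("utf-8") == literal)

# A bytes object is a sequence of integers, so it converts to a list of ints...
byte_values = list(literal)

# ...and can be rebuilt from one.
rebuilt = bytes(byte_values)

print(matches, byte_values, rebuilt)
```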

Converting Between `str` and `bytes`

The two directions are encode() (str to bytes) and decode() (bytes to str). Both take an encoding argument, which defaults to UTF-8 in Python 3; passing it explicitly keeps the intent clear.

python

# str -> bytes
message = "Hello, world"
as_bytes = message.encode("utf-8")
print(as_bytes)          # b'Hello, world'

# bytes -> str
raw = b"Hello, world"
as_text = raw.decode("utf-8")
print(as_text)           # Hello, world

If you use the wrong encoding, you get garbled text or an exception:

python

data = "café".encode("utf-8")    # b'caf\xc3\xa9'
# data.decode("ascii")           # raises UnicodeDecodeError
wrong = data.decode("latin-1")   # 'cafÃ©' (garbled, but no error)
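The sketch below runs each of those outcomes side by side and also shows the standard `errors` argument of decode(), which the text above does not cover. Treat it as an optional extension, not a recommendation to hide decoding problems.

```python
data = "caf\u00e9".encode("utf-8")   # b'caf\xc3\xa9'

# Strict decoding (the default) raises on bytes the codec cannot handle.
try:
    data.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Latin-1 maps every byte to some character, so it never fails,
# but the two UTF-8 bytes come out as two wrong characters.
mojibake = data.decode("latin-1")

# errors="replace" substitutes U+FFFD for each undecodable byte;
# errors="ignore" drops those bytes entirely.
replaced = data.decode("ascii", errors="replace")
ignored = data.decode("ascii", errors="ignore")

print(strict_failed, len(mojibake), ignored)
```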

When You Need `bytes`

Bytes are required in several common scenarios:

python

# Network sockets
import socket
with socket.create_connection(("example.com", 80)) as s:
    s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    response = s.recv(4096)  # returns bytes

# Writing binary files
with open("output.bin", "wb") as f:
    f.write(b"\x89PNG\r\n\x1a\n")  # PNG file header

# Cryptographic hashing
import hashlib
digest = hashlib.sha256(b"password").hexdigest()

# Base64 encoding
import base64
encoded = base64.b64encode(b"binary data")

Many libraries and protocols (HTTP, WebSockets, database drivers, serialization formats) return bytes and expect bytes as input.
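A common pattern that follows from this is to keep str inside the program and convert only at the boundary. The sketch below assumes a made-up payload and uses only the standard `json` and `base64` modules.

```python
import base64
import json

# Hypothetical payload, used only for illustration.
payload = {"user": "ana", "note": "caf\u00e9"}

# Outbound: text -> bytes, then bytes -> Base64 bytes.
wire = json.dumps(payload).encode("utf-8")
token = base64.b64encode(wire)

# Inbound: reverse each step, decoding back to str before parsing.
received = json.loads(base64.b64decode(token).decode("utf-8"))

print(type(wire).__name__, type(token).__name__, received == payload)
```

Both `wire` and `token` are bytes; only the final `received` value is back in the world of str and ordinary Python objects.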

Comparison Table

| Feature | `str` | `bytes` |
| --- | --- | --- |
| Prefix | None (or `u""` in Python 2) | `b""` |
| Contains | Unicode code points | Integers 0-255 |
| Indexing returns | A one-character string | An integer |
| Encoding | Abstract text (no specific encoding) | Raw encoded data |
| Mutable variant | None (str is immutable) | `bytearray` |
| Concatenation | `"a" + "b"` works | `b"a" + b"b"` works |
| Mixed concatenation | `"a" + b"b"` raises TypeError | Must encode/decode first |
| File mode | `"r"` / `"w"` (text mode) | `"rb"` / `"wb"` (binary mode) |
| Use case | Display, text processing | I/O, binary protocols, hashing |
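The file-mode row can be seen directly by reading one file both ways. This is a minimal sketch; the file name `sample.txt` and the temporary directory are arbitrary choices for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # arbitrary example file

    # Text mode: we hand over str, Python encodes it on the way out.
    # newline="\n" disables newline translation so the bytes are predictable.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("caf\u00e9\n")

    # Text mode read: bytes on disk are decoded back into str.
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

    # Binary mode read: the raw bytes come back untouched.
    with open(path, "rb") as f:
        as_bytes = f.read()

print(type(as_text).__name__, type(as_bytes).__name__, len(as_text), len(as_bytes))
```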

The `bytearray` Variant

If you need a mutable sequence of bytes, use bytearray:

python

data = bytearray(b"Hello")
data[0] = 74  # ASCII for 'J'
print(data)   # bytearray(b'Jello')

# bytearray supports in-place modification
data.append(33)
print(data)   # bytearray(b'Jello!')

bytes is immutable (like str), while bytearray is mutable (like list). Use bytearray when you need to build binary data incrementally or modify it in place.
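As a sketch of building binary data incrementally, the function below assembles a message in a bytearray and freezes it into bytes at the end. The framing scheme (a one-byte length prefix) and the name `build_frame` are hypothetical, chosen only to keep the example small.

```python
def build_frame(payload: bytes) -> bytes:
    """Hypothetical framing: one length byte followed by the payload."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length prefix")
    frame = bytearray()
    frame.append(len(payload))   # append takes a single integer 0-255
    frame.extend(payload)        # extend takes any bytes-like object
    return bytes(frame)          # freeze into an immutable bytes object


frame = build_frame(b"Jello!")
print(frame)   # b'\x06Jello!'
```

Returning bytes rather than the bytearray itself keeps callers from mutating the result by accident.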

Python 2 vs. Python 3

The b prefix exists in Python 2 (2.6 and later) but does nothing. In Python 2, str is already a byte string, and b"hello" is identical to "hello". The prefix was added for forward compatibility so that code could be written to work in both Python 2 and 3.

python

# Python 2
type(b"hello")   # <type 'str'> (bytes and str are the same)
type(u"hello")   # <type 'unicode'>

# Python 3
type(b"hello")   # <class 'bytes'>
type("hello")    # <class 'str'> (str is Unicode)

If you are maintaining code that must run on both versions, using b"" for binary data and u"" for text makes the intent explicit.

Common Pitfalls

- Mixing str and bytes in concatenation or comparison. Python 3 raises TypeError on concatenation and ordering comparisons instead of silently converting, which is intentional. An equality check such as "a" == b"a" does not raise; it simply returns False. Always encode or decode explicitly.
- Using decode("ascii") on data that contains non-ASCII bytes. UTF-8 is almost always the right default unless you are working with a legacy protocol that specifies otherwise.
- Assuming len(encoded) equals len(original_string). Multi-byte characters (anything outside ASCII) take more than one byte in UTF-8. A string of 4 characters can be 4-16 bytes in UTF-8 depending on the characters.
- Forgetting to open files in binary mode ("rb" / "wb") when reading or writing non-text data. Text mode applies encoding/decoding and newline translation, which corrupts binary data.
- Treating bytes as printable text. The repr() of a bytes object shows printable ASCII characters directly and escapes everything else, which can be misleading. A bytes object like b'hello' looks like a string but is not one.
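The length pitfall is easy to check. The sketch below measures the UTF-8 size of four sample characters, written as escapes so the snippet stays ASCII-only; the particular characters are arbitrary picks, one from each UTF-8 width.

```python
# One sample character per UTF-8 width (arbitrary picks).
samples = [
    "A",            # U+0041, ASCII
    "\u00e9",       # U+00E9, e with acute accent
    "\u20ac",       # U+20AC, euro sign
    "\U0001F600",   # U+1F600, an emoji outside the Basic Multilingual Plane
]

sizes = [len(ch.encode("utf-8")) for ch in samples]
print(sizes)   # [1, 2, 3, 4]

# Four characters can therefore occupy anywhere from 4 to 16 bytes.
shortest = len(("A" * 4).encode("utf-8"))
longest = len(("\U0001F600" * 4).encode("utf-8"))
print(shortest, longest)   # 4 16
```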

Summary

- The b prefix creates a bytes literal, a sequence of raw byte values, distinct from Python 3's str type which holds Unicode text.
- Use encode() to convert str to bytes and decode() to convert bytes to str. Always specify the encoding explicitly, defaulting to UTF-8.
- bytes are required for network I/O, binary file operations, cryptographic functions, and most low-level APIs.
- Python 3 deliberately makes str and bytes incompatible to prevent the implicit encoding bugs that were common in Python 2.
- For mutable byte sequences, use bytearray instead of bytes.
