What does the 'b' character do in front of a string literal?
The b prefix in front of a string literal creates a bytes object instead of a regular text string. In Python 3, b"Hello" produces a bytes value containing the ASCII-encoded representation of "Hello", while "Hello" without the prefix produces a str value containing Unicode text. This distinction matters whenever you work with network I/O, binary files, cryptographic functions, or any API that operates on raw bytes rather than human-readable characters.
How str and bytes Differ
Python 3 enforces a strict separation between text and binary data. A str is a sequence of Unicode code points. A bytes object is a sequence of integers in the range 0-255. The two types are not interchangeable.
This separation prevents an entire class of encoding bugs that plagued Python 2, where str and bytes were conflated and silently converted using ASCII, breaking the moment non-ASCII characters appeared.
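As a minimal sketch of the difference, the snippet below puts a str and a bytes value side by side. The variable names are illustrative only.

```python
text = "Hello"      # str: a sequence of Unicode code points
data = b"Hello"     # bytes: a sequence of integers in range(256)

print(type(text).__name__, type(data).__name__)  # str bytes

# Indexing a str gives a one-character str; indexing bytes gives an int.
first_char = text[0]   # 'H'
first_byte = data[0]   # 72, the ASCII code for 'H'

# The two types never compare equal, even when they look alike.
looks_equal = (text == data)   # False

# Mixing them in an operation raises TypeError instead of converting silently.
try:
    text + data
    mix_error = None
except TypeError as exc:
    mix_error = type(exc).__name__
```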
What You Can Put in a b"" Literal
A bytes literal only accepts ASCII characters and escape sequences. You cannot include non-ASCII characters directly:
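The rule above can be checked with a short sketch. Because a non-ASCII character in a bytes literal is rejected when the source is compiled, the invalid literal is built as a string and passed to compile() so the rest of the example still runs.

```python
# Printable ASCII and escape sequences are accepted.
ok_ascii = b"GET / HTTP/1.1\r\n"
ok_escaped = b"caf\xc3\xa9"   # the UTF-8 bytes for "café", written as escapes

# A non-ASCII character inside b"" is a SyntaxError at compile time, so the
# offending source is compiled from a string to keep this file runnable.
source = 'b"caf\u00e9"'       # the text: b"café"
try:
    compile(source, "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
```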
If you need to represent non-ASCII text as bytes, encode a str instead:
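For example, assuming UTF-8 is the encoding you want, encoding a str that contains an accented character looks like this:

```python
text = "café"
encoded = text.encode("utf-8")

print(encoded)                   # b'caf\xc3\xa9'
print(len(text), len(encoded))   # 4 5  ("é" takes two bytes in UTF-8)
```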
Converting Between str and bytes
The two directions are encode() (str to bytes) and decode() (bytes to str). Both take an encoding argument, which defaults to UTF-8 in Python 3. Passing it explicitly keeps the intent clear, and UTF-8 is the standard choice.
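A small round-trip sketch, with an arbitrary sample string, shows both directions along with the equivalent bytes() and str() constructor forms:

```python
original = "naïve résumé"

as_bytes = original.encode("utf-8")     # str -> bytes
round_trip = as_bytes.decode("utf-8")   # bytes -> str

# The constructors accept an encoding and do the same conversions.
same_bytes = bytes(original, "utf-8")
same_text = str(as_bytes, "utf-8")
```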
If you use the wrong encoding, you get garbled text or an exception:
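Both failure modes can be reproduced with the same five bytes. This is an illustrative sketch; the errors="replace" line at the end is one possible way to handle undecodable input, not a recommendation for every case.

```python
data = "café".encode("utf-8")   # b'caf\xc3\xa9'

# Latin-1 maps every byte to a character, so decoding "succeeds" but is wrong.
garbled = data.decode("latin-1")   # 'cafÃ©'

# ASCII defines nothing above 0x7F, so decoding fails outright.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# errors="replace" substitutes U+FFFD for each undecodable byte.
lossy = data.decode("ascii", errors="replace")
```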
When You Need bytes
Bytes are required in several common scenarios:
Many libraries and protocols (HTTP, WebSockets, database drivers, serialization formats) return bytes and expect bytes as input.
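Two of these scenarios, hashing and binary file I/O, can be sketched with the standard library alone. The file name example.bin and the sample payload are placeholders chosen for illustration.

```python
import hashlib
import os
import tempfile

message = "hello"

# Hash functions operate on bytes; passing a str raises TypeError.
try:
    hashlib.sha256(message)
    str_rejected = False
except TypeError:
    str_rejected = True

digest = hashlib.sha256(message.encode("utf-8")).hexdigest()

# Binary file modes read and write bytes with no encoding or newline translation.
payload = b"\x89PNG\r\n\x1a\n"   # the 8-byte PNG signature
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        read_back = f.read()
```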
Comparison Table
| Feature | str | bytes |
| --- | --- | --- |
| Prefix | None (or u"" in Python 2) | b"" |
| Contains | Unicode code points | Integers 0-255 |
| Indexing returns | A one-character string | An integer |
| Encoding | Abstract text (no specific encoding) | Raw encoded data |
| Mutable variant | None (str is immutable) | bytearray |
| Concatenation | "a" + "b" works | b"a" + b"b" works |
| Mixed concatenation | "a" + b"b" raises TypeError | Must encode/decode first |
| File mode | "r" / "w" (text mode) | "rb" / "wb" (binary mode) |
| Use case | Display, text processing | I/O, binary protocols, hashing |
The bytearray Variant
If you need a mutable sequence of bytes, use bytearray.
bytes is immutable (like str), while bytearray is mutable (like list). Use bytearray when you need to build binary data incrementally or modify it in place.
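A brief sketch of the contrast, with illustrative values:

```python
frozen = b"hello"
try:
    frozen[0] = 72           # bytes does not support item assignment
    bytes_mutable = True
except TypeError:
    bytes_mutable = False

buf = bytearray(b"hello")
buf[0] = 72                  # replace 'h' (104) with 'H' (72)
buf.extend(b" world")        # append in place
buf += b"!"

result = bytes(buf)          # freeze into an immutable bytes value
print(result)                # b'Hello world!'
```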
Python 2 vs. Python 3
Python 2.6 and later accept the b prefix, but it has no effect there. In Python 2, str is already a byte string, and b"hello" is identical to "hello". The prefix was added for forward compatibility so that code could be written to work in both Python 2 and 3.
If you are maintaining code that must run on both versions, using b"" for binary data and u"" for text makes the intent explicit.
Common Pitfalls
- Mixing str and bytes in concatenation or comparison. Concatenation and ordering comparisons raise TypeError in Python 3 instead of silently converting, which is intentional, and == between the two types is always False. Always encode or decode explicitly.
- Using decode("ascii") on data that contains non-ASCII bytes. UTF-8 is almost always the right default unless you are working with a legacy protocol that specifies otherwise.
- Assuming len(encoded) equals len(original_string). Multi-byte characters (anything outside ASCII) take more than one byte in UTF-8. A string of 4 characters can be 4-16 bytes depending on the characters and encoding.
- Forgetting to open files in binary mode ("rb" / "wb") when reading or writing non-text data. Text mode applies encoding/decoding and newline translation, which corrupts binary data.
- Treating bytes as printable text. The repr() of a bytes object shows printable ASCII characters directly and escapes everything else, which can be misleading. A bytes object like b'hello' looks like a string but is not one.
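The length pitfall is easy to verify. In this sketch the sample strings are arbitrary. Each has four characters, but their UTF-8 encodings range from 4 to 16 bytes.

```python
samples = ["abcd", "café", "日本語!", "😀😀😀😀"]

# Map each string to (character count, UTF-8 byte count).
lengths = {s: (len(s), len(s.encode("utf-8"))) for s in samples}

for s, (chars, nbytes) in lengths.items():
    print(f"{s!r}: {chars} characters, {nbytes} bytes")
```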
Summary
- The b prefix creates a bytes literal, a sequence of raw byte values, distinct from Python 3's str type which holds Unicode text.
- Use encode() to convert str to bytes and decode() to convert bytes to str. Always specify the encoding explicitly, defaulting to UTF-8.
- bytes are required for network I/O, binary file operations, cryptographic functions, and most low-level APIs.
- Python 3 deliberately makes str and bytes incompatible to prevent the implicit encoding bugs that were common in Python 2.
- For mutable byte sequences, use bytearray instead of bytes.

