What does the 'b' character do in front of a string literal?

The b prefix in front of a string literal creates a bytes object instead of a regular text string. In Python 3, b"Hello" produces a bytes value containing the ASCII-encoded representation of "Hello", while "Hello" without the prefix produces a str value containing Unicode text. This distinction matters whenever you work with network I/O, binary files, cryptographic functions, or any API that operates on raw bytes rather than human-readable characters.

How str and bytes Differ

Python 3 enforces a strict separation between text and binary data. A str is a sequence of Unicode code points. A bytes object is a sequence of integers in the range 0-255. The two types are not interchangeable.

python
text = "Hello"
binary = b"Hello"

print(type(text))      # <class 'str'>
print(type(binary))    # <class 'bytes'>

print(text[0])         # 'H'  (a character)
print(binary[0])       # 72   (an integer, the ASCII value of 'H')

This separation prevents an entire class of encoding bugs that plagued Python 2, where str and bytes were conflated and silently converted using ASCII, breaking the moment non-ASCII characters appeared.
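
As a minimal sketch of that strictness, the snippet below tries to concatenate a str with a bytes object and then shows the two explicit fixes. The variable names are illustrative only, and ASCII is used as the encoding because the sample data contains nothing else.

```python
text = "Hello"
binary = b"Hello"

# Concatenating str and bytes fails loudly instead of guessing an encoding.
try:
    combined = text + binary
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message is not None)   # True: a TypeError was raised

# The explicit fix: pick an encoding and convert one side.
combined_text = text + binary.decode("ascii")
combined_bytes = text.encode("ascii") + binary
print(combined_text)    # HelloHello
print(combined_bytes)   # b'HelloHello'
```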

What You Can Put in a b"" Literal

A bytes literal only accepts ASCII characters and escape sequences. You cannot include non-ASCII characters directly:

python
# Valid: ASCII characters and escape sequences
data = b"Hello\x00\xff\n"
data = b"cafe"            # fine because every character is ASCII
data = b"caf\xc3\xa9"     # fine (escaped bytes of UTF-8 encoded 'é')

# Invalid: a literal non-ASCII character inside b""
# data = b"café"          # SyntaxError if you paste the actual 'é' character
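
Because a SyntaxError stops the whole file from running, one way to observe the rule safely is to compile small source strings at runtime. The sketch below does that with the built-in compile(); the helper name is_valid_source is hypothetical and exists only for this illustration.

```python
def is_valid_source(source):
    """Return True if Python can compile the given expression source."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True

ascii_literal = 'b"cafe"'
escaped_literal = 'b"caf\\xc3\\xa9"'   # source text contains backslash escapes
non_ascii_literal = 'b"caf\u00e9"'     # source text contains a real 'é'

print(is_valid_source(ascii_literal))      # True
print(is_valid_source(escaped_literal))    # True
print(is_valid_source(non_ascii_literal))  # False
```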

If you need to represent non-ASCII text as bytes, encode a str instead:

python
text = "café"
encoded = text.encode("utf-8")
print(encoded)       # b'caf\xc3\xa9'
print(len(text))     # 4 characters
print(len(encoded))  # 5 bytes (the 'é' takes 2 bytes in UTF-8)
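
The same effect scales with the character: UTF-8 uses between one and four bytes per code point. The sketch below measures a few sample characters chosen for illustration (they are written as escapes so the snippet itself stays ASCII).

```python
# One sample character per UTF-8 width: 'A', 'é', '€', and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Each sample is a single code point, so len(ch) is always 1.
    print(repr(ch), len(ch), len(encoded), encoded)

utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)   # [1, 2, 3, 4]
```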

Converting Between str and bytes

The two directions are encode() (str to bytes) and decode() (bytes to str). Both take an encoding argument, which defaults to UTF-8 in Python 3; passing it explicitly makes the intent clear, and UTF-8 is the standard choice.

python
# str -> bytes
message = "Hello, world"
as_bytes = message.encode("utf-8")
print(as_bytes)          # b'Hello, world'

# bytes -> str
raw = b"Hello, world"
as_text = raw.decode("utf-8")
print(as_text)           # Hello, world

If you use the wrong encoding, you get garbled text or an exception:

python
data = "café".encode("utf-8")      # b'caf\xc3\xa9'

# data.decode("ascii") raises UnicodeDecodeError (0xc3 and 0xa9 are not ASCII)
wrong = data.decode("latin-1")     # 'cafÃ©' (wrong, but no error)
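
When a decode might fail, the exception can be caught, or the errors argument of decode() can request a lossy fallback. The following is a rough sketch of both options; whether a lossy fallback is acceptable depends on the application, and the variable names are illustrative.

```python
data = "caf\u00e9".encode("utf-8")   # b'caf\xc3\xa9'

# Option 1: catch the failure and inspect where it happened.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    failure = (exc.start, exc.end)   # byte offsets of the first bad byte
print(failure)    # (3, 4)

# Option 2: ask the codec to substitute or drop undecodable bytes.
replaced = data.decode("ascii", errors="replace")
ignored = data.decode("ascii", errors="ignore")
print(len(replaced))   # 5: 'caf' plus one U+FFFD per undecodable byte
print(ignored)         # caf

# Decoding with the encoding that produced the bytes recovers the text.
correct = data.decode("utf-8")
```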

When You Need bytes

Bytes are required in several common scenarios:

python
# Network sockets
import socket
s = socket.socket()
s.connect(("example.com", 80))
s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
response = s.recv(4096)  # returns bytes

# Writing binary files
with open("output.bin", "wb") as f:
    f.write(b"\x89PNG\r\n\x1a\n")  # PNG file header

# Cryptographic hashing
import hashlib
digest = hashlib.sha256(b"password").hexdigest()

# Base64 encoding
import base64
encoded = base64.b64encode(b"binary data")

Many libraries and protocols (HTTP, WebSockets, database drivers, serialization formats) return bytes and expect bytes as input.
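
In practice this means text usually has to be encoded on the way in and decoded on the way out. The sketch below illustrates that with hashlib and base64 from the standard library; the helper sha256_hex and its default encoding are assumptions made for the example, not part of either module.

```python
import base64
import hashlib

def sha256_hex(text, encoding="utf-8"):
    """Hash a str by encoding it to bytes first (hypothetical helper)."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()

# Passing a str directly is rejected: the hash function wants bytes.
try:
    hashlib.sha256("password")
except TypeError:
    rejected_str = True
else:
    rejected_str = False
print(rejected_str)   # True

digest = sha256_hex("password")
print(digest == hashlib.sha256(b"password").hexdigest())   # True

# base64 also takes bytes and returns bytes; decode to get a str for display.
token = base64.b64encode(b"binary data")
token_text = token.decode("ascii")
print(token)        # b'YmluYXJ5IGRhdGE='
print(token_text)   # YmluYXJ5IGRhdGE=
```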

Comparison Table

| Feature | str | bytes |
| --- | --- | --- |
| Prefix | None (or u"" in Python 2) | b"" |
| Contains | Unicode code points | Integers 0-255 |
| Indexing returns | A one-character string | An integer |
| Encoding | Abstract text (no specific encoding) | Raw encoded data |
| Mutable variant | None (str is immutable) | bytearray |
| Concatenation | "a" + "b" works | b"a" + b"b" works |
| Mixed concatenation | "a" + b"b" raises TypeError | Must encode/decode first |
| File mode | "r" / "w" (text mode) | "rb" / "wb" (binary mode) |
| Use case | Display, text processing | I/O, binary protocols, hashing |
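
The "Indexing returns" and "Contains" rows can be checked directly. As a small illustration, the sketch below also shows that slicing a bytes object, unlike indexing it, gives back bytes; the variable names are only for the example.

```python
binary = b"Hello"

first_item = binary[0]       # indexing yields an int
first_slice = binary[0:1]    # slicing yields a bytes object of length 1
print(first_item)            # 72
print(first_slice)           # b'H'

# Iterating a bytes object yields its integer values.
values = list(binary)
print(values)                # [72, 101, 108, 108, 111]

# bytes() accepts an iterable of integers in the range 0-255.
rebuilt = bytes(values)
print(rebuilt == binary)     # True
```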

The bytearray Variant

If you need a mutable sequence of bytes, use bytearray:

python
data = bytearray(b"Hello")
data[0] = 74  # ASCII for 'J'
print(data)   # bytearray(b'Jello')

# bytearray supports in-place modification
data.append(33)
print(data)   # bytearray(b'Jello!')

bytes is immutable (like str), while bytearray is mutable (like list). Use bytearray when you need to build binary data incrementally or modify it in place.
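
As one sketch of incremental building, the hypothetical helper below assembles a message frame in a bytearray and freezes it into bytes at the end. The frame layout (a 2-byte big-endian length followed by the payload) is an assumption invented for this example, not a real protocol.

```python
def build_frame(payload):
    """Prefix payload with its length as 2 big-endian bytes (made-up format)."""
    frame = bytearray()
    frame += len(payload).to_bytes(2, "big")   # grow the buffer in place
    frame += payload
    return bytes(frame)                        # immutable snapshot

frame = build_frame(b"Hello")
print(frame)   # b'\x00\x05Hello'

# The same item assignment that works on bytearray fails on bytes.
try:
    frame[0] = 1
except TypeError:
    bytes_are_immutable = True
else:
    bytes_are_immutable = False
print(bytes_are_immutable)   # True
```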

Python 2 vs. Python 3

The b prefix exists in Python 2 (2.6 and later) but does nothing. In Python 2, str is already a byte string, and b"hello" is identical to "hello". The prefix was added for forward compatibility so that code could be written to work in both Python 2 and 3.

python
# Python 2
type(b"hello")   # <type 'str'> (bytes and str are the same)
type(u"hello")   # <type 'unicode'>

# Python 3
type(b"hello")   # <class 'bytes'>
type("hello")    # <class 'str'> (str is Unicode)

If you are maintaining code that must run on both versions, using b"" for binary data and u"" for text makes the intent explicit. Note that Python 3 accepts the u"" prefix only from version 3.3 onward.

Common Pitfalls

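  • Mixing str and bytes in concatenation or comparison. Concatenation and ordering comparisons raise TypeError in Python 3 instead of silently converting, which is intentional, while == between a str and a bytes object simply evaluates to False. Always encode or decode explicitly.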
  • Using decode("ascii") on data that contains non-ASCII bytes. UTF-8 is almost always the right default unless you are working with a legacy protocol that specifies otherwise.
  • Assuming len(encoded) equals len(original_string). Multi-byte characters (anything outside ASCII) take more than one byte in UTF-8. A string of 4 characters can occupy anywhere from 4 to 16 bytes in UTF-8, and other encodings give different sizes.
  • Forgetting to open files in binary mode ("rb" / "wb") when reading or writing non-text data. Text mode applies encoding/decoding and newline translation, which corrupts binary data.
  • Treating bytes as printable text. The repr() of a bytes object shows printable ASCII characters directly and escapes everything else, which can be misleading. A bytes object like b'hello' looks like a string but is not one.
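
Two of these pitfalls are easy to reproduce. The sketch below shows a str and a bytes object with the same visible content comparing as unequal, followed by a binary-mode round trip through a temporary file; the file name and payload are placeholders chosen for the example.

```python
import os
import tempfile

# A str and a bytes object never compare equal, even if they look alike.
looks_equal = ("hello" == b"hello")
really_equal = ("hello" == b"hello".decode("ascii"))
print(looks_equal)    # False
print(really_equal)   # True

# Binary mode writes and reads every byte exactly as given.
payload = b"\x89PNG\r\n\x1a\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.bin")
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        read_back = f.read()

print(read_back == payload)   # True
```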

Summary

  • The b prefix creates a bytes literal, a sequence of raw byte values, distinct from Python 3's str type which holds Unicode text.
  • Use encode() to convert str to bytes and decode() to convert bytes to str. Always specify the encoding explicitly, defaulting to UTF-8.
  • bytes are required for network I/O, binary file operations, cryptographic functions, and most low-level APIs.
  • Python 3 deliberately makes str and bytes incompatible to prevent the implicit encoding bugs that were common in Python 2.
  • For mutable byte sequences, use bytearray instead of bytes.
