What is marshalling? What is happening when something is marshalled?
Introduction
Marshalling is the process of transforming data from one representation into another so it can cross a boundary safely. That boundary might be between processes, machines, programming languages, memory layouts, or managed and unmanaged code.
The Basic Idea
Inside one program, a value may live in a format that only that runtime understands. For example, an in-memory object can contain pointers, metadata, or language-specific structure that another system cannot read directly.
When the value is marshalled, the runtime converts it into a portable or expected form. That may include:
- copying fields into a byte sequence,
- converting text encodings,
- reordering bytes for endianness,
- flattening nested objects,
- translating managed types into native memory layouts.
Unmarshalling is the reverse step on the receiving side.
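A minimal sketch of a few of these transformations, using only the Python standard library. The user record and its field names are made up for illustration.

```python
import json

# Hypothetical in-memory value: a nested structure, illustrative only.
user = {"name": "Zoë", "address": {"city": "Kraków"}}

# Converting text encodings: a Unicode string becomes UTF-8 bytes.
name_bytes = user["name"].encode("utf-8")

# Reordering bytes for endianness: the same integer in two byte orders.
big = (1).to_bytes(4, byteorder="big")
little = (1).to_bytes(4, byteorder="little")

# Flattening a nested object into one flat sequence of bytes.
flat = json.dumps(user, ensure_ascii=False).encode("utf-8")

# Unmarshalling reverses the steps on the receiving side.
restored = json.loads(flat.decode("utf-8"))
```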
Marshalling Versus Serialization
The two words overlap, but they are not always identical.
Serialization usually means converting data into a storable or transmittable format such as JSON, XML, or Protocol Buffers.
Marshalling is broader. It includes serialization, but also any conversion needed to satisfy a boundary contract. For example, passing a Python integer into a C function through a foreign-function interface involves marshalling even if no JSON or file format is involved.
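A rough sketch of that kind of marshalling with Python's standard ctypes module. No native library is loaded here; the snippet only shows how an arbitrary-precision Python integer is converted into a fixed-width C integer, which is the step a foreign-function call would perform on its arguments.

```python
import ctypes

# A Python int has arbitrary precision; a C int32_t is exactly 4 bytes.
small = ctypes.c_int32(42)
width = ctypes.sizeof(small)      # size of the native representation
value_back = small.value          # unmarshalled back into a Python int

# ctypes does no overflow checking, so a value that does not fit
# in 32 bits is silently wrapped when it is marshalled.
wrapped = ctypes.c_int32(2**31).value
```

The wrapped value illustrates why the boundary contract matters: the conversion succeeds, but the number the native side sees is not the one the caller passed.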
Example: Marshalling Structured Data to Bytes
A low-level example is packing values into a byte layout before sending them over the network.
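A minimal sketch using Python's standard struct module. The two-field message layout (a 16-bit port followed by a 32-bit sequence number) is an assumption chosen for illustration, not a real protocol.

```python
import struct

# Hypothetical wire layout: unsigned 16-bit port, unsigned 32-bit sequence number.
# "!" selects network (big-endian) byte order with no padding.
WIRE_FORMAT = "!HI"

port, sequence = 8080, 123456

# Marshal: Python integers -> 6 bytes in the agreed layout.
payload = struct.pack(WIRE_FORMAT, port, sequence)

# Unmarshal: the receiver applies the same format to recover the values.
received_port, received_sequence = struct.unpack(WIRE_FORMAT, payload)
```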
With Python's struct module, marshalling means converting Python integers into a specific binary layout. A leading ! in the format string tells struct to use network (big-endian) byte order. The receiver can unmarshal the same bytes by applying the same format string.
That is a clean example of what is "happening" during marshalling: abstract values are being reshaped into an agreed representation.
Example: Marshalling for Interoperability
Marshalling is also common when one runtime calls into another. Imagine a managed environment calling native code. The runtime may need to convert:
- a managed string into a null-terminated native string,
- a Boolean into an integer of the width the callee expects (for example, 1 byte or 4 bytes),
- an object reference into a raw pointer or handle,
- an array into a contiguous memory block.
The caller and callee may both support "strings," but not in the same layout. Marshalling bridges that mismatch.
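The conversions listed above can be sketched with Python's standard ctypes module, used here as a stand-in for a managed runtime preparing arguments for native code. No native function is actually called; the variable names are illustrative.

```python
import ctypes

# A Python str becomes a null-terminated byte buffer (a C "char *").
text = "hello"
native_string = ctypes.create_string_buffer(text.encode("utf-8"))
string_size = ctypes.sizeof(native_string)   # 5 bytes of text + 1 terminator

# A Python bool becomes a fixed-width C integer.
native_flag = ctypes.c_int(True)

# A Python list becomes one contiguous block of three C ints.
values = [10, 20, 30]
IntArray3 = ctypes.c_int * len(values)
native_array = IntArray3(*values)
array_size = ctypes.sizeof(native_array)
```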
Example: Marshalling for APIs
Web APIs use marshalling constantly, even if the term is not always stated explicitly. A Python dictionary is not sent across the network directly; it is turned into JSON text first.
On the receiving side, the JSON text is unmarshalled back into a native data structure.
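A small sketch of both directions with Python's standard json module. The order dictionary and its keys are hypothetical, and the network transfer itself is left out.

```python
import json

# Hypothetical request body; the keys are illustrative only.
order = {"id": 17, "items": ["pen", "ink"], "paid": True}

# Marshal: dictionary -> JSON text -> UTF-8 bytes for the wire.
body = json.dumps(order).encode("utf-8")

# Unmarshal: bytes -> JSON text -> native dictionary on the receiver.
decoded = json.loads(body.decode("utf-8"))
```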
What Actually Changes During Marshalling
The exact work depends on the boundary, but typical transformations include:
- layout changes from objects to bytes,
- encoding changes such as Unicode text to UTF-8 bytes,
- type narrowing or widening,
- copying from one memory space to another,
- validation against a schema or interface contract.
The original meaning of the data should remain the same, but its representation changes so the other side can understand it.
Why Marshalling Exists
Different systems rarely agree on raw in-memory representation. One machine may store integers in a different byte order than another. One language may represent strings with internal metadata that another does not expect. One process cannot safely dereference another process's pointers, because each process has its own address space.
Marshalling solves that by replacing implicit internal representation with an explicit shared format.
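A short sketch of what goes wrong without an agreed format, using the struct module. The same four bytes are read under two different byte-order assumptions.

```python
import struct

# One side writes the integer 1 as 4 bytes in big-endian order.
raw = struct.pack(">I", 1)

# A reader that assumes little-endian order sees a different number.
misread = struct.unpack("<I", raw)[0]

# A reader that follows the agreed format recovers the original value.
correct = struct.unpack(">I", raw)[0]
```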
Common Pitfalls
A common mistake is thinking marshalling is just "saving data." It is more general than that. It is about adapting representation across boundaries.
Another issue is assuming the process is lossless by default. If the target format cannot represent the same precision, encoding, or structure, information can be lost.
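A brief sketch of three ways a round trip can lose information, one each for precision, structure, and encoding. The sample values are arbitrary.

```python
import json
import struct

# Precision: a Python float is 64-bit; "f" packs it as a 32-bit float.
original = 0.1
narrowed = struct.unpack("!f", struct.pack("!f", original))[0]
precision_lost = narrowed != original

# Structure: JSON has no tuple type and only string keys for objects.
before = {1: (2, 3)}
after = json.loads(json.dumps(before))

# Encoding: ASCII cannot represent "é", so strict encoding fails outright.
try:
    "café".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

After the JSON round trip, the integer key has become the string "1" and the tuple has become a list, so the result no longer compares equal to the input.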
Security is another concern. Unmarshalling untrusted data can be dangerous if the format allows code execution or unsafe object reconstruction. That is one reason data-only formats such as JSON are often preferred over language-specific object serialization, such as Python's pickle or Java's built-in serialization, when the input comes from an untrusted source.
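A deliberately harmless sketch of the difference, using Python's pickle and json modules. The Payload class is hypothetical, and the callable it names (len) is benign; the point is only that the serialized data, not the receiver, decides what gets called.

```python
import json
import pickle

class Payload:
    # pickle calls __reduce__ to learn how to rebuild the object.
    # Whatever callable it names is invoked during unpickling.
    def __reduce__(self):
        return (len, ("attacker-chosen",))

blob = pickle.dumps(Payload())

# Loading the bytes calls len(...) instead of rebuilding a Payload.
# A hostile blob could name a far more dangerous callable.
result = pickle.loads(blob)

# JSON decoding only produces dicts, lists, strings, numbers, booleans, and None.
data = json.loads('{"role": "admin"}')
```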
Summary
- Marshalling converts data into a representation suitable for crossing a boundary.
- The boundary might be a network, file, process, language runtime, or native interface.
- Serialization is one form of marshalling, but marshalling is broader.
- The runtime may copy bytes, change layout, convert encoding, or adapt types.
- Unmarshalling reverses the process on the receiving side.

