How to send a structure across multiple processes using MPI_Allreduce()?
Introduction
MPI_Allreduce does not “send a structure” in the general sense. It performs a collective reduction, which means every process contributes a value, MPI combines those values with an operation, and the reduced result is returned to every process. If your data is a struct, you need both an MPI datatype that describes the struct layout and a reduction rule that explains how the fields should be combined.
First Ask Whether MPI_Allreduce Is the Right Operation
Before building a custom datatype, make sure MPI_Allreduce is actually what you need.
Use MPI_Allreduce when:
- every process contributes a value
- those values should be combined across ranks
- every process needs the same reduced result
If you only want to send a struct from one process to all others, MPI_Bcast is usually the correct collective. If you want one process to collect results, MPI_Reduce may be the better fit.
MPI_Allreduce is about reduction, not just transport.
Describe the Struct with an MPI Datatype
Suppose each process owns a struct with an id and a numeric value.
A derived datatype built with MPI_Type_create_struct tells MPI how the struct is laid out in memory so it can move and reduce the data correctly.
Using offsetof is important because structure padding may exist between fields.
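To see why hand-computed offsets are risky, the following plain C sketch prints the real layout of a hypothetical `Data` struct next to the offset a tightly packed layout would imply. The helper names are illustrative, and the printed numbers depend on the compiler and platform.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical struct used for illustration. */
typedef struct {
    int id;
    double value;
} Data;

/* Offset `value` would have if no padding existed (often a wrong guess). */
size_t packed_offset_of_value(void)
{
    return sizeof(int);
}

/* Print the actual layout chosen by the compiler. */
void print_layout(void)
{
    printf("packed guess for value: %zu\n", packed_offset_of_value());
    printf("offsetof(Data, value):  %zu\n", offsetof(Data, value));
    printf("sizeof(Data):           %zu\n", sizeof(Data));
}
```

On many 64-bit platforms the compiler inserts padding after `id` so that `value` is aligned, which makes the packed guess too small.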
Define a Custom Reduction Operation
MPI still needs to know how to combine two Data values. For example, you might want the maximum id and the sum of value.
The user-defined reduction function defines the reduction semantics field by field. MPI cannot guess this for arbitrary structs.
Use MPI_Allreduce with the Custom Pieces
Once you have both the datatype and the reduction operation, the collective call is straightforward.
Every process receives the same reduced global result.
Why Built-in Ops Usually Do Not Work
Built-in operations such as MPI_SUM and MPI_MAX work on predefined datatypes, and MPI_MAXLOC and MPI_MINLOC work on predefined pair types such as MPI_DOUBLE_INT. They do not automatically understand custom structs with mixed field semantics.
That is why custom structs usually require a user-defined MPI_Op unless the problem can be rewritten as several separate all-reductions on primitive arrays.
In practice, separate reductions on primitive fields are sometimes simpler and easier to debug.
Common Pitfalls
A common mistake is trying to use MPI_Allreduce only to distribute one struct. That is a broadcast problem, not a reduction problem.
Another mistake is creating the custom datatype but forgetting to define a matching custom reduction operation.
Developers also sometimes assume struct fields are tightly packed and compute offsets manually. Use offsetof or MPI address utilities such as MPI_Get_address instead.
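Finally, make sure your custom reduction is associative and meaningful for the intended collective semantics. MPI assumes user-defined operations are associative, and the commute argument of MPI_Op_create should be nonzero only if the operation is also commutative. If the rule is not well-defined, the result is not trustworthy.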
Summary
- MPI_Allreduce is for reduction plus distribution, not just sending a struct.
- Custom structs need an MPI datatype that matches their memory layout.
- Arbitrary structs also need a reduction rule, usually via a custom MPI_Op.
- If you only need distribution, prefer MPI_Bcast instead.
- Sometimes separate reductions on primitive fields are simpler than reducing a struct directly.

