C++ group where algorithm

Introduction

There is no standard C++ algorithm literally named "group where." Usually this phrase means "filter the items I care about, then group the remaining items by some key," similar to a WHERE plus GROUP BY pipeline in SQL.

In C++, the usual way to do that is to loop once through the data, apply a predicate, and append each matching element to a map entry keyed by the result of the grouping function. The standard library gives you the pieces, but you assemble the pattern yourself.

What "Group Where" Usually Means

Suppose you have a collection of orders and you want:

only active orders
grouped by region

That is a "group where" style operation:

where: keep only items that satisfy a condition
group: place them into buckets by key

A Simple Generic Implementation

Here is one way to write it in modern C++:

cpp

#include <iostream>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <vector>

// Keeps the items that satisfy pred_fn and buckets them by key_fn(item).
template <typename Range, typename KeyFn, typename PredFn>
auto group_where(const Range& items, KeyFn key_fn, PredFn pred_fn) {
    using Item = typename Range::value_type;
    // The key type is whatever key_fn returns, with references and
    // cv-qualifiers removed.
    using Key = std::decay_t<decltype(key_fn(std::declval<const Item&>()))>;

    std::unordered_map<Key, std::vector<Item>> groups;

    for (const auto& item : items) {
        if (pred_fn(item)) {                       // where
            groups[key_fn(item)].push_back(item);  // group (copies the item)
        }
    }

    return groups;
}

struct Order {
    int id;
    std::string region;
    bool active;
};

int main() {
    std::vector<Order> orders = {
        {1, "east", true},
        {2, "west", false},
        {3, "east", true},
        {4, "north", true}
    };

    auto grouped = group_where(
        orders,
        [](const Order& order) { return order.region; },  // grouping key
        [](const Order& order) { return order.active; }   // filter predicate
    );

    // Bucket order is unspecified for std::unordered_map.
    for (const auto& [region, bucket] : grouped) {
        std::cout << region << ": " << bucket.size() << "\n";
    }
}

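This does what the name suggests: each item is tested against the predicate, and the survivors are appended to the bucket for their key. With the sample data, the east bucket holds two orders and the north bucket holds one. There is no west bucket because the only west order is inactive. The order in which the buckets are printed is unspecified.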

Why a Single Pass Is Useful

The implementation above performs both operations in one pass through the input. That keeps the code simple and avoids creating an intermediate filtered container unless you actually need one.

For many business-style data processing tasks, this is already the right design:

straightforward to read
linear in the number of input items on average, since hash-map insertion is constant time on average
easy to customize with lambdas

When Ordering Matters

std::unordered_map is fast, but it does not preserve key order. If grouped output should be sorted by key, use std::map instead:

cpp

std::map<std::string, std::vector<Order>> groups;

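If you need the groups in the order their keys first appear in the input, neither container provides that. You will need a different structure, such as a vector of groups plus an index from key to position, or a recorded first-seen position per key that a second pass can sort by.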
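
One possible way to keep first-seen key order is sketched below. It is an illustration rather than a canonical implementation: the function name group_active_in_first_seen_order is hypothetical, and the Order struct simply mirrors the one used earlier. The groups live in a vector, and a side std::unordered_map remembers where each region's bucket sits in that vector.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Same shape as the Order struct used earlier in the article.
struct Order {
    int id;
    std::string region;
    bool active;
};

// Hypothetical helper: groups active orders by region, keeping regions in
// the order they were first encountered in the input.
std::vector<std::pair<std::string, std::vector<Order>>>
group_active_in_first_seen_order(const std::vector<Order>& orders) {
    std::vector<std::pair<std::string, std::vector<Order>>> groups;
    // Maps each region to the position of its bucket inside `groups`.
    std::unordered_map<std::string, std::size_t> index_of;

    for (const auto& order : orders) {
        if (!order.active) {
            continue;  // the "where" step
        }
        // try_emplace inserts only if the region has not been seen yet.
        auto [it, inserted] = index_of.try_emplace(order.region, groups.size());
        if (inserted) {
            groups.emplace_back(order.region, std::vector<Order>{});
        }
        groups[it->second].second.push_back(order);
    }
    return groups;
}
```

Iterating the returned vector visits regions in first-seen order, at the cost of one extra hash map for the index. The sketch relies on C++17 for try_emplace and structured bindings.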

Alternative Style with Ranges

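In C++20 and later, you can separate the steps conceptually by filtering first with ranges and then grouping manually. The standard library still does not give you a ready-made group-by container builder, so the final accumulation step remains explicit. (C++23's std::views::chunk_by only splits a range into runs of adjacent elements, so it groups by key only when the input is already sorted by that key.)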

That means ranges improve readability, but they do not remove the need for a grouping data structure.

Choose the Right Key and Storage Type

The "best" grouping function depends on the use case:

group by a string such as region or category
group by a boolean such as even versus odd
group by an integer bucket such as age decade

The value storage also matters. Storing whole objects is simple, but sometimes you only need IDs or counts. In that case, group into std::vector<int> or increment counters instead of copying full records.
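
As an illustration of an integer bucket key combined with counting, the sketch below tallies subscribed people per age decade instead of storing the records. The Person struct, its subscribed field, and the function name count_subscribed_by_decade are all hypothetical, and the decade formula assumes non-negative ages.

```cpp
#include <map>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Person {
    int id;
    int age;
    bool subscribed;
};

// Counts subscribed people per age decade: key 20 covers ages 20 to 29,
// key 30 covers ages 30 to 39, and so on. Assumes non-negative ages.
std::map<int, int> count_subscribed_by_decade(const std::vector<Person>& people) {
    std::map<int, int> counts;
    for (const auto& person : people) {
        if (person.subscribed) {                 // where
            ++counts[(person.age / 10) * 10];    // group: bump the bucket's counter
        }
    }
    return counts;
}
```

No Person is copied here. operator[] creates a zero-initialized counter the first time a decade is seen, and std::map keeps the decades sorted for output.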

Common Pitfalls

Looking for a standard library algorithm named group_where even though the pattern must be composed manually.
Using unordered_map and then being surprised that the output key order is not stable.
Copying large objects into groups when storing pointers, references, IDs, or counts would be cheaper.
Splitting filtering and grouping into multiple passes when one pass would be simpler.
Designing the grouping key poorly so unrelated items end up in the same bucket.
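
To illustrate the copying pitfall, here is a sketch that stores pointers to the original orders rather than copies. The function name group_active_ptrs_by_region is hypothetical, and the approach assumes the source vector outlives the grouped result and is not modified in the meantime.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Same shape as the Order struct used earlier in the article.
struct Order {
    int id;
    std::string region;
    bool active;
};

// Hypothetical helper: groups pointers to the active orders by region.
// The pointers refer into `orders`, so the result is valid only while that
// vector is alive and has not been resized or reordered.
std::unordered_map<std::string, std::vector<const Order*>>
group_active_ptrs_by_region(const std::vector<Order>& orders) {
    std::unordered_map<std::string, std::vector<const Order*>> groups;
    for (const auto& order : orders) {
        if (order.active) {
            groups[order.region].push_back(&order);  // no Order is copied
        }
    }
    return groups;
}
```

The trade-off is lifetime management: this is cheaper for large records, but the caller must keep the source container intact for as long as the groups are in use.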

Summary

"Group where" in C++ usually means filter items by a predicate and group the survivors by a key.
The standard library does not provide a single built-in algorithm with that exact name.
A one-pass loop into std::unordered_map<Key, std::vector<Value>> is a practical solution.
Use std::map instead when key ordering matters.
Pick grouping keys and stored values based on the data you actually need afterward.