Determining if an unordered vectorT has all unique elements

C++

vectors

algorithms

data structures

programming tips

Determining if an unordered vectorT has all unique elements

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

Determining whether all elements in a vector are unique is a classic problem in C++ programming. It comes up in data validation, deduplication checks, and as a subroutine in larger algorithms. Because std::vector does not enforce uniqueness (unlike std::set), you need an explicit check when uniqueness is a requirement.

This article covers three practical approaches with working code, analyzes their time and space tradeoffs, and highlights pitfalls to watch for with custom types.

Approach 1: Sort and Compare Adjacent Elements

Sorting the vector brings duplicates next to each other. After sorting, a single linear pass with std::adjacent_find detects any duplicate pair.

cpp

1#include <vector>
2#include <algorithm>
3#include <iostream>
4
5template <typename T>
6bool allUniqueSorted(std::vector<T> v) {
7    // Note: we take v by value to avoid modifying the original
8    std::sort(v.begin(), v.end());
9    return std::adjacent_find(v.begin(), v.end()) == v.end();
10}
11
12int main() {
13    std::vector<int> nums = {3, 1, 4, 1, 5, 9};
14    std::cout << "All unique: " << std::boolalpha
15              << allUniqueSorted(nums) << std::endl;  // false
16    return 0;
17}

Time complexity: O(n log n) for the sort, plus O(n) for the linear scan. Space complexity: O(n) because we copy the vector. If you can modify the original, pass by reference and the extra space drops to O(1) (ignoring the sort's internal stack usage).

This is a good default choice when you do not want to pull in additional headers and the vector is not enormous.

Approach 2: Insert into a Set

A std::unordered_set provides average O(1) lookups. Insert each element, and if any insertion fails (meaning the element already exists), you have a duplicate.

cpp

1#include <vector>
2#include <unordered_set>
3#include <iostream>
4
5template <typename T>
6bool allUniqueSet(const std::vector<T>& v) {
7    std::unordered_set<T> seen;
8    seen.reserve(v.size());
9    for (const auto& elem : v) {
10        if (!seen.insert(elem).second) {
11            return false;  // duplicate found
12        }
13    }
14    return true;
15}
16
17int main() {
18    std::vector<int> nums = {3, 1, 4, 5, 9, 2, 6};
19    std::cout << "All unique: " << std::boolalpha
20              << allUniqueSet(nums) << std::endl;  // true
21    return 0;
22}

Time complexity: O(n) on average. Space complexity: O(n) for the set.

This approach has the advantage of short-circuiting: it returns as soon as the first duplicate is found, which can be much faster than sorting when duplicates appear early in the vector.

Approach 3: Brute Force (Nested Loop)

For very small vectors, a simple nested loop avoids any overhead from sorting or hashing.

cpp

1template <typename T>
2bool allUniqueBrute(const std::vector<T>& v) {
3    for (size_t i = 0; i < v.size(); ++i) {
4        for (size_t j = i + 1; j < v.size(); ++j) {
5            if (v[i] == v[j]) {
6                return false;
7            }
8        }
9    }
10    return true;
11}

Time complexity: O(n^2). Space complexity: O(1).

This is only practical for vectors with fewer than a few hundred elements. Beyond that, the quadratic time becomes unacceptable.

One-Liner with std::set Size Comparison

If conciseness matters more than early termination, you can compare the size of a set constructed from the vector to the vector's size.

cpp

1#include <set>
2
3template <typename T>
4bool allUniqueOneLiner(const std::vector<T>& v) {
5    return std::set<T>(v.begin(), v.end()).size() == v.size();
6}

This is clean and readable, but it always processes every element (no short-circuiting) and uses std::set which has O(n log n) insertion cost. For a quick check in non-performance-critical code, it works well.

Handling Custom Types

For the set-based approaches to work with custom types, those types must satisfy certain requirements:

std::unordered_set<T> requires a hash function and operator==. For standard types, these are provided. For custom structs, you need to specialize std::hash<T> or provide a custom hasher.
std::set<T> requires operator< (or a custom comparator).
The sort-based approach requires operator<.

cpp

1struct Point {
2    int x, y;
3    bool operator==(const Point& other) const {
4        return x == other.x && y == other.y;
5    }
6    bool operator<(const Point& other) const {
7        return std::tie(x, y) < std::tie(other.x, other.y);
8    }
9};
10
11// For unordered_set, provide a hash
12struct PointHash {
13    size_t operator()(const Point& p) const {
14        return std::hash<int>()(p.x) ^ (std::hash<int>()(p.y) << 16);
15    }
16};
17
18// Usage:
19// std::unordered_set<Point, PointHash> seen;

Common Pitfalls

Floating-point equality. Comparing floating-point numbers with == is unreliable due to precision errors. For vectors of double or float, consider rounding to a fixed number of decimal places or using an epsilon-based comparison, which means the set-based approach will not work directly.
Accidentally modifying the original vector. The sort-based approach modifies the vector in place. If you need the original order preserved, work on a copy.
Poor hash functions for custom types. A bad hash function that produces many collisions will degrade the unordered_set approach from O(n) to O(n^2) in the worst case. Test your hash distribution if performance matters.
Empty and single-element vectors. An empty vector and a vector with one element are trivially unique. Make sure your function handles these edge cases correctly (all approaches above do).

Summary

For most use cases, the unordered_set approach offers the best combination of O(n) average time and early termination on duplicates. The sort-then-scan method is a solid alternative when you want to avoid the overhead of hashing. The brute-force approach is fine for small vectors where simplicity matters most. When working with custom types, make sure to provide the required equality, comparison, or hash operators before reaching for any of these techniques.