Distinguishing extra element from two arrays?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
If one array is the same as another except for one extra element, the best way to find that extra element depends on whether order matters, whether duplicates are possible, and whether the arrays contain numeric values. There is no single best trick for all versions of the problem, but there are a few standard solutions with clear tradeoffs.
Clarify the Problem Shape First
The simplest version is:
- array
Bcontains all elements of arrayA - plus exactly one additional element
Questions that matter:
- are duplicates allowed
- do both arrays contain numbers only
- is order preserved
- do you need the value, the index, or both
If duplicates are allowed, some naive approaches stop being safe. For example, checking only whether an element "exists" in the shorter array is not enough if the same value appears multiple times.
The Fast Numeric Trick: XOR
If the arrays contain integers and exactly one extra value appears in the longer array, XOR is a clean O(n) solution.
Why it works:
- '
x xor xbecomes0' - XOR is associative and commutative
- matching elements cancel out
- only the extra value remains
This is elegant and constant-space, but it only works cleanly when the values support bitwise XOR semantics.
The General Counting Solution
If the arrays contain arbitrary hashable values and duplicates matter, counting is the safest general-purpose answer.
This works even when the extra element is just an extra occurrence of an existing value.
The tradeoff is extra memory proportional to the number of distinct values.
Sum Difference Works, But Only in Narrow Cases
If the arrays contain numbers and there is exactly one extra numeric value, subtracting sums can work.
This is simple, but it is weaker than XOR or counting because:
- it assumes numeric values
- it can hide issues when multiple differences cancel each other
- it does not help if the data is not strictly the same except for one extra item
So it is fine for a tightly defined interview problem, but less robust in real-world code.
Sorting Helps When You Need the Position
If you are allowed to reorder the arrays, sorting and then comparing element by element is another option.
This costs O(n log n) because of sorting, but it is easy to understand and can help when you need the arrays aligned for further analysis.
What If Order Is Preserved?
If the longer array is the same sequence plus one inserted element, you can use a two-pointer linear scan and stop at the first mismatch.
That version is useful when the problem is really about detecting insertion in a stream or versioned sequence rather than arbitrary multiset comparison.
Common Pitfalls
The biggest mistake is ignoring duplicates. An algorithm that works for sets can fail for multisets.
Another mistake is using the sum trick when the real input may contain more than one mismatch or non-numeric values.
A third issue is choosing sorting when order itself is meaningful and should not be destroyed.
Finally, XOR is excellent for integers, but it is not a universal pattern for every data type.
Summary
- Use XOR for the classic integer version when exactly one extra numeric element exists.
- Use
Counteror frequency maps when duplicates and general hashable values matter. - Use sum difference only for tightly constrained numeric cases.
- Use sorting when reorder is allowed and simplicity matters more than asymptotic speed.
- If original order is preserved, a two-pointer scan may be the best fit.
- The right solution depends on duplicates, data type, and whether order matters.

