Algorithm efficient way to remove duplicate integers from an array
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Removing duplicate integers from an array is a classic problem with several approaches at different time/space trade-offs. The most common solutions are using a hash set (O(n) time, O(n) space, preserves order), sorting then deduplicating (O(n log n) time, O(1) extra space), and the in-place two-pointer technique for sorted arrays. The right choice depends on whether order must be preserved, whether the array is already sorted, and whether extra memory is acceptable.
Hash Set Approach (Fastest)
Use a set to track seen values and build a new array of unique elements:
- Time: O(n) — one pass through the array
- Space: O(n) — the set stores up to n elements
- Preserves order: Yes
Sort + Two-Pointer (In-Place, Sorted)
If the array is already sorted (or can be sorted), remove duplicates in place:
- Time: O(n log n) for sorting + O(n) for dedup = O(n log n)
- Space: O(1) extra (in-place)
- Preserves order: No (sorted order)
Language Built-Ins
Two-Pointer for LeetCode-Style In-Place
The classic interview variant modifies a sorted array in place and returns the new length:
Bit Manipulation (Fixed Range)
When values are within a known range, use a bit array:
Uses only (max_val / 8) bytes instead of storing full integer objects in a set.
Comparison Table
| Method | Time | Space | Preserves Order | Sorted Input Required |
| Hash Set | O(n) | O(n) | Yes | No |
| Sort + Two-Pointer | O(n log n) | O(1) | No (sorted) | No (sorts it) |
| Two-Pointer (pre-sorted) | O(n) | O(1) | Yes (sorted) | Yes |
| Bit Array | O(n) | O(max_val) | Yes | No |
dict.fromkeys (Python) | O(n) | O(n) | Yes | No |
Common Pitfalls
- Using
set()and expecting preserved order:set()in Python does not guarantee insertion order. Usedict.fromkeys()or iterate with aseenset and build the result list manually to preserve order. - Applying the two-pointer technique to an unsorted array: The two-pointer in-place approach only works on sorted arrays. On unsorted input it skips non-adjacent duplicates, producing incorrect results. Sort first or use a hash set.
- Ignoring negative integers in bitmap approach: A bit array indexed by value does not handle negative integers directly. Either offset values (
x + offset) or use a hash set for arrays containing negative numbers. - Modifying the array while iterating by index: Removing elements from an array during forward iteration shifts indices and causes skipped elements. Use a write pointer instead, or build a new array.
- Choosing O(n log n) sort when O(n) hash set is available: The sort approach uses less memory (O(1) extra space), but if memory is not constrained and order matters, the hash set approach is faster and simpler. Choose based on the actual constraints.
Summary
- Use a hash set for O(n) time with order preservation — the most common and practical approach
- Use sort + two-pointer for O(1) extra space when order does not matter
- Use the two-pointer in-place technique for already-sorted arrays (classic interview problem)
- Language built-ins like
set(),Distinct(),new Set()handle common cases concisely - Bit arrays work for non-negative integers in a known range with minimal memory
- Always clarify whether order must be preserved and whether the input is sorted before choosing an approach

