Intersection and union of ArrayLists in Java

Introduction

When Java developers ask for the intersection or union of two ArrayList instances, the real question is usually about semantics, not syntax: do duplicates matter, should order be preserved, and do you want mathematical set logic or list logic?

The right implementation depends on those answers. Java gives you quick collection methods such as retainAll and addAll, but you need to know exactly what behavior they produce before you use them.

Start by Defining the Result

There are two common interpretations:

  • set-style operations, where each value appears at most once
  • list-style operations, where duplicates may remain

For example, if list a is [1, 2, 2, 3] and list b is [2, 3, 4], then:

  • a list-style intersection based on a might be [2, 2, 3]
  • a set-style intersection would be [2, 3]
  • a list-style union might be [1, 2, 2, 3, 2, 3, 4]
  • a set-style union would be [1, 2, 3, 4]

If you skip this decision, it is very easy to write code that is technically correct but wrong for the real requirement.
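To make the four interpretations concrete, here is a minimal sketch that implements each one as a small static method. The class name ListSemantics and the method names are hypothetical, chosen only for illustration; the sketch assumes the element type has sensible equals and hashCode implementations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; these names are illustrative, not a standard API.
final class ListSemantics {

    // List-style intersection: every element of a that b contains, duplicates included.
    static <T> List<T> listIntersection(List<T> a, List<T> b) {
        List<T> result = new ArrayList<>(a); // copy so a is not mutated
        result.retainAll(b);
        return result;
    }

    // Set-style intersection: unique values, in the order they first appear in a.
    static <T> List<T> setIntersection(List<T> a, List<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return new ArrayList<>(result);
    }

    // List-style union: plain concatenation, duplicates included.
    static <T> List<T> listUnion(List<T> a, List<T> b) {
        List<T> result = new ArrayList<>(a);
        result.addAll(b);
        return result;
    }

    // Set-style union: unique values, in first-seen order across a and then b.
    static <T> List<T> setUnion(List<T> a, List<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3);
        List<Integer> b = Arrays.asList(2, 3, 4);

        System.out.println(listIntersection(a, b)); // [2, 2, 3]
        System.out.println(setIntersection(a, b));  // [2, 3]
        System.out.println(listUnion(a, b));        // [1, 2, 2, 3, 2, 3, 4]
        System.out.println(setUnion(a, b));         // [1, 2, 3, 4]
    }
}
```

Naming the method after the semantics it implements is one way to keep the decision visible at every call site.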

Intersection With retainAll

If you want the elements from the first list that also appear in the second list, retainAll is the most direct tool. Copy the list first so you do not mutate the original input.

java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        List<Integer> b = Arrays.asList(2, 3, 5);

        // Copy a first so the original list is left untouched.
        List<Integer> intersection = new ArrayList<>(a);
        // Keep only the elements that b also contains.
        intersection.retainAll(b);

        System.out.println(intersection);
    }
}

This prints:

text
[2, 2, 3]

Notice what happened: duplicates from the first list were preserved because retainAll keeps every element from the target list that is also contained in the other collection.
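Because the result is built from whichever list you copy, a retainAll intersection is not symmetric. The sketch below, using a hypothetical keepFrom helper, runs the same two lists in both directions to show the difference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class and method names, for illustration only.
final class RetainAllDirection {

    // Returns the elements of target that other also contains,
    // preserving target's order and duplicates.
    static List<Integer> keepFrom(List<Integer> target, List<Integer> other) {
        List<Integer> result = new ArrayList<>(target);
        result.retainAll(other);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        List<Integer> b = Arrays.asList(2, 3, 5);

        System.out.println(keepFrom(a, b)); // [2, 2, 3]  (duplicates come from a)
        System.out.println(keepFrom(b, a)); // [2, 3]     (b has no duplicates)
    }
}
```

If callers should not care about argument order, that is usually a sign the requirement is really a set-style intersection.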

Union With addAll

If by "union" you simply mean "put both lists together," then you want concatenation:

java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<Integer> b = Arrays.asList(3, 4, 5);

        // Copy a, then append every element of b, duplicates included.
        List<Integer> union = new ArrayList<>(a);
        union.addAll(b);

        System.out.println(union);
    }
}

This prints [1, 2, 3, 3, 4, 5]. That is a list-style union, not a mathematical set union: the value 3 appears twice because duplicates stay in the result.

Set-Style Union and Intersection

If uniqueness matters, use a set. LinkedHashSet is often better than HashSet because it preserves insertion order.

java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        // The set drops the repeated 2 and remembers first-seen order.
        Set<Integer> union = new LinkedHashSet<>(Arrays.asList(1, 2, 2, 3));
        // Only values not already present (4 and 5) are added.
        union.addAll(Arrays.asList(2, 3, 4, 5));

        System.out.println(union); // [1, 2, 3, 4, 5]
    }
}

For a set-style intersection:

java
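import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        // Build from a list: Set.of has no defined iteration order to preserve.
        Set<Integer> left = new LinkedHashSet<>(Arrays.asList(1, 2, 3, 4));
        left.retainAll(Arrays.asList(3, 4, 5));

        System.out.println(left); // [3, 4]
    }
}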

The result contains each common value exactly once.

Streams Can Be Useful, but They Do Not Change the Semantics

Streams are a fine option when you want the rules to be explicit in the pipeline:

java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        Set<Integer> b = Set.of(2, 3, 5); // Set.of requires Java 9 or later

        List<Integer> intersection = a.stream()
            .filter(b::contains) // keep elements of a that b contains
            .distinct()          // drop repeated values
            .collect(Collectors.toList());

        System.out.println(intersection); // [2, 3]
    }
}

Here, distinct() changes the result from list-style to set-style behavior. That is the important part. Streams are just another way to express the rule.
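As a rough illustration of that point, the sketch below writes a list-style intersection (no distinct()) and a set-style union (Stream.concat followed by distinct()) as two pipelines. The class and method names are hypothetical, and the code assumes sequential streams over ordered lists, so encounter order is kept.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class and method names, for illustration only.
final class StreamSemantics {

    // List-style intersection: without distinct(), duplicates from a survive.
    static List<Integer> listIntersection(List<Integer> a, List<Integer> b) {
        return a.stream()
            .filter(b::contains)
            .collect(Collectors.toList());
    }

    // Set-style union: concatenate both lists, then keep the first
    // occurrence of each value.
    static List<Integer> setUnion(List<Integer> a, List<Integer> b) {
        return Stream.concat(a.stream(), b.stream())
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        List<Integer> b = Arrays.asList(2, 3, 5);

        System.out.println(listIntersection(a, b)); // [2, 2, 3]
        System.out.println(setUnion(a, b));         // [1, 2, 3, 4, 5]
    }
}
```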

Performance Notes

If one side is large and you are doing repeated membership checks, convert that collection to a HashSet or LinkedHashSet first. A hash-based set answers contains in roughly constant time on average, whereas contains on a list scans the elements one by one, so the difference grows with the size of the list.

But do not optimize by changing the data structure unless you are also okay with the semantic change. Converting to a set removes duplicates by definition.
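One way to get the faster lookups without the semantic change is to use the set only for membership checks and still build the result from the original list. The following is a minimal sketch under that assumption; FastIntersection and intersectKeepingDuplicates are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class and method names, for illustration only.
final class FastIntersection {

    // List-style intersection: keeps a's order and duplicates,
    // but checks membership against a hash set instead of scanning b.
    static <T> List<T> intersectKeepingDuplicates(List<T> a, Collection<T> b) {
        Set<T> lookup = new HashSet<>(b); // used only for contains()
        List<T> result = new ArrayList<>();
        for (T item : a) {
            if (lookup.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        List<Integer> b = Arrays.asList(2, 3, 5);

        System.out.println(intersectKeepingDuplicates(a, b)); // [2, 2, 3]
    }
}
```

The set never appears in the output, so duplicates and order in the result still come entirely from the first list.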

Common Pitfalls

The most common mistake is calling concatenation a union even when duplicates should have been removed. If uniqueness matters, use a set-based approach.

Another common bug is forgetting that retainAll mutates the list it is called on. Always copy the input first if other code still depends on the original data.
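The mutation is easy to see side by side. This sketch contrasts an in-place call with a copying one; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class and method names, for illustration only.
final class RetainAllMutation {

    // Mutates the list passed as target.
    static void intersectInPlace(List<Integer> target, List<Integer> other) {
        target.retainAll(other);
    }

    // Leaves both inputs untouched and returns a fresh list.
    static List<Integer> intersectCopy(List<Integer> source, List<Integer> other) {
        List<Integer> copy = new ArrayList<>(source);
        copy.retainAll(other);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> original = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> other = Arrays.asList(2, 3, 4);

        System.out.println(intersectCopy(original, other)); // [2, 3]
        System.out.println(original);                       // [1, 2, 3], unchanged

        intersectInPlace(original, other);
        System.out.println(original);                       // [2, 3], the input itself changed
    }
}
```

Copying also sidesteps a related surprise: the list returned by Arrays.asList is fixed-size, so calling retainAll directly on it throws UnsupportedOperationException whenever an element would have to be removed.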

Developers also get surprised when HashSet changes iteration order. If order matters, prefer LinkedHashSet.
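A small sketch of the difference, using hypothetical method names: both helpers remove duplicates, but only the LinkedHashSet version promises that the result follows the order in which values first appeared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical class and method names, for illustration only.
final class OrderDemo {

    // Unique values in first-seen order.
    static List<String> dedupeKeepingOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    // Unique values, but the iteration order of HashSet is unspecified.
    static List<String> dedupeAnyOrder(List<String> items) {
        return new ArrayList<>(new HashSet<>(items));
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("pear", "apple", "pear", "fig");

        System.out.println(dedupeKeepingOrder(fruits)); // [pear, apple, fig]
        System.out.println(dedupeAnyOrder(fruits));     // same three values, order not guaranteed
    }
}
```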

Finally, choose the operation based on meaning, not on whichever collection method looks shortest.

Summary

  • Decide first whether you need set semantics or list semantics.
  • Use retainAll on a copied list for a list-style intersection that keeps the copied list's order and duplicates.
  • Use addAll for list-style union that preserves duplicates.
  • Use LinkedHashSet when you want unique results without losing insertion order.
  • Be explicit about mutation, duplicates, and order so the result matches the real requirement.