
C# Set collection?

Introduction to C# Set Collection

C# offers a wide range of collections in the System.Collections.Generic namespace, each designed for particular data organization and access scenarios. Among these, the HashSet<T> class, which implements the concept of a mathematical set, stands out for its fast membership tests and built-in set operations. This article covers the HashSet<T> collection in C#: its characteristics, core operations, and typical use cases, with explanations and examples.

Understanding HashSet<T>

Definition

The HashSet<T> class represents a collection of unique elements. Because it is implemented on top of a hash table, it performs membership checks quickly and supports set operations such as union, intersection, and difference efficiently. This makes it well suited to scenarios involving large datasets and frequent checks for item existence.

Key Characteristics

  • Uniqueness of Elements: A HashSet<T> does not allow duplicate elements. Adding an item that is already present leaves the set unchanged; Add returns false instead of throwing an exception.
  • Unordered Collection: Items in a HashSet<T> are unordered, meaning there is no guaranteed sequence. It does not preserve the insertion order.
  • Efficient Lookup & Modification Operations: The HashSet<T> class offers average-case O(1) complexity for Add, Remove, and Contains operations.
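
The uniqueness behavior above can be observed through the return value of Add. The following is a minimal sketch, not part of the original article; the class name UniquenessSketch and the method CountInserted are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, used only to illustrate the return value of Add.
static class UniquenessSketch
{
    // Returns how many of the given values were actually inserted into the set.
    public static int CountInserted(HashSet<string> set, params string[] values)
    {
        int inserted = 0;
        foreach (string value in values)
        {
            // Add returns true only when the value was not already present.
            if (set.Add(value))
            {
                inserted++;
            }
        }
        return inserted;
    }
}
```

Calling UniquenessSketch.CountInserted(tags, "csharp", "dotnet", "csharp") on an empty set should report two insertions, because the second "csharp" is rejected as a duplicate.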

Basic Usage in C#

csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Creating a HashSet of integers
        HashSet<int> numbers = new HashSet<int> { 1, 2, 3, 4 };

        // Adding elements
        numbers.Add(5);
        // Attempting to add a duplicate element
        numbers.Add(3); // The element 3 already exists, so it's not added

        // Removing an element
        numbers.Remove(2);

        // Checking for a specific element
        bool containsThree = numbers.Contains(3); // true

        // Displaying elements
        Console.WriteLine("HashSet Elements:");
        foreach (int number in numbers)
        {
            Console.WriteLine(number);
        }
    }
}

Set Operations

The HashSet<T> class includes several methods that facilitate set operations:

  • UnionWith: Adds all unique elements from another collection. The union operation results in a set containing elements that are in either of the two collections.
csharp
var set1 = new HashSet<int> { 1, 2, 3 };
var set2 = new HashSet<int> { 3, 4, 5 };

set1.UnionWith(set2); // set1 now contains { 1, 2, 3, 4, 5 }
  • IntersectWith: Modifies the current set to include only elements that are present in both collections.
csharp
var set1 = new HashSet<int> { 1, 2, 3 };
var set2 = new HashSet<int> { 2, 3, 4 };

set1.IntersectWith(set2); // set1 now contains { 2, 3 }
  • ExceptWith: Removes all elements in the specified collection from the current set.
csharp
var set1 = new HashSet<int> { 1, 2, 3 };
var set2 = new HashSet<int> { 2, 3, 4 };

set1.ExceptWith(set2); // set1 now contains { 1 }
  • SymmetricExceptWith: Modifies the current set to contain only elements that are present in one of the two collections but not both. This is essentially the symmetric difference.
csharp
var set1 = new HashSet<int> { 1, 2, 3 };
var set2 = new HashSet<int> { 3, 4, 5 };

set1.SymmetricExceptWith(set2); // set1 now contains { 1, 2, 4, 5 }
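
All four methods above modify the set they are called on. When the original set must stay intact, one option is to copy it first and run the operation on the copy. The sketch below illustrates this for intersection; SetCopySketch and Intersection are hypothetical names introduced here, not part of the .NET API.

```csharp
using System.Collections.Generic;

// Hypothetical helper, used only to illustrate a non-mutating set operation.
static class SetCopySketch
{
    // Returns the intersection as a new set; neither input is modified.
    public static HashSet<T> Intersection<T>(HashSet<T> first, IEnumerable<T> second)
    {
        // Copy the elements and reuse the same equality comparer as the source set.
        var result = new HashSet<T>(first, first.Comparer);

        // IntersectWith mutates only the copy.
        result.IntersectWith(second);
        return result;
    }
}
```

The second argument is typed as IEnumerable<T> because the set-operation methods accept any enumerable collection, not only another HashSet<T>. The same copy-then-mutate pattern should carry over to UnionWith, ExceptWith, and SymmetricExceptWith.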

Performance Considerations

The hash-based implementation of HashSet<T> gives it a distinct performance edge, particularly in scenarios involving lookups, insertions, and deletions. The computational complexity for these operations is roughly constant time on average, O(1), though actual performance can vary based on the quality of the hash function and the handling of hash collisions.
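
For user-defined element types, the hash function in question is the type's GetHashCode (or that of a supplied IEqualityComparer<T>), and it must agree with Equals. The following is a minimal sketch under those assumptions; Point is a hypothetical type introduced for illustration, and HashCode.Combine assumes a runtime that ships System.HashCode (.NET Core 2.1 or later).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type, used only to illustrate value-based equality.
sealed class Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Two points are equal when both coordinates match.
    public bool Equals(Point other) => !(other is null) && X == other.X && Y == other.Y;

    public override bool Equals(object obj) => Equals(obj as Point);

    // Equal points must produce equal hash codes; Combine mixes both fields.
    public override int GetHashCode() => HashCode.Combine(X, Y);
}
```

With these overrides, a HashSet<Point> treats two separately constructed Point(1, 2) instances as the same element. Without them, a class falls back to reference equality, so both instances would be stored.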

Key Points Summary

| Feature/Aspect | Description/Details |
| --- | --- |
| Element Uniqueness | HashSet<T> ensures that all elements are unique. |
| Unordered Elements | Does not guarantee any order of elements, hence insertion order is not preserved. |
| Performance Efficiency | Offers average O(1) time complexity for key operations such as adding, removing, and searching elements. |
| Set Operations | Supports union, intersection, difference, and symmetric difference with other collections. |
| Thread Safety | Not thread-safe; synchronization is required for multi-threaded access. |
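
One simple way to provide the synchronization mentioned in the table is to guard every access with a lock. The sketch below is a minimal illustration rather than a recommended design; LockedSet<T> is a hypothetical wrapper, not a .NET type, and it exposes only three members.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper; the name and shape are illustrative only.
sealed class LockedSet<T>
{
    private readonly HashSet<T> _items = new HashSet<T>();
    private readonly object _gate = new object();

    // Every access takes the same lock, so calls never overlap on the inner set.
    public bool Add(T item)
    {
        lock (_gate) { return _items.Add(item); }
    }

    public bool Contains(T item)
    {
        lock (_gate) { return _items.Contains(item); }
    }

    public int Count
    {
        get { lock (_gate) { return _items.Count; } }
    }
}
```

A single lock keeps the sketch easy to reason about, at the cost of serializing all callers. Enumeration is deliberately left out, since iterating safely would need the lock held for the whole loop or a snapshot copy.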

Conclusion

The HashSet<T> class offers a robust mechanism for storing and managing collections of items where uniqueness is critical. Its efficiency in handling large datasets makes it an attractive choice for a range of applications, from algorithms requiring fast lookups to those needing efficient set operations. While it provides impressive performance benefits, developers must be mindful of its unordered nature and its lack of thread safety, necessitating external synchronization when used in concurrent scenarios. With careful application, HashSet<T> can significantly enhance the performance of data-centric applications.

