How To Compare Two Sets in Python

Take advantage of built-in methods for easy comparisons


One of my favorite data types in Python is the set. Sets are super handy and are most often used to eliminate duplicate items from an iterable. However, sets are good for more than quick duplicate removal.

Sets have their roots in discrete math—a branch that focuses on countable or distinguishable objects. The first way to observe and interact with sets is to compare them to one another.

This is reflected in Python’s natively available set methods. Before we dive into that, let’s cover a basic primer on creating sets.


Creating Sets in Python

To create a set, use curly braces with each element separated by a comma.

my_first_set = {1, 2, 3, 4, 5}

If you happen to duplicate an element, you’ll notice the set gracefully removes it without causing any sort of error.

my_first_set = {1, 2, 3, 4, 5, 2}
print(my_first_set) # {1, 2, 3, 4, 5}

Make sure you don’t get sets confused with dictionaries, which are also enclosed with curly braces.

my_set = {1, 2, 3}
my_dict = {"term": "definition", "term2": "definition2"}

Finally, if you want to convert an existing data structure into a set, do so with the set() function.

my_list = [1, 2, 1, 2, 3]
my_set = set(my_list)
print(my_set) # {1, 2, 3}
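One pitfall worth flagging when creating sets: empty curly braces produce a dictionary, not a set. The short sketch below uses illustrative variable names (empty_braces, empty_set, and letters are not from the examples above) to show the difference, and to show that set() accepts any iterable.

```python
# Empty curly braces create a dict, so use set() for an empty set.
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable, including strings.
letters = set("hello")
print(sorted(letters))  # ['e', 'h', 'l', 'o']

# Sets are unordered, so equality depends on membership, not position.
print({1, 2, 3} == {3, 2, 1})  # True
```

Because sets are unordered, the sketch sorts the result before printing so the output is predictable.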

Now that we know how to define a set, let’s jump into the meat of this piece and cover six natively available set-comparison methods.


Set-Comparison Methods

We’ll go over six set methods, two at a time. Each method will have a description, a visual diagram, and a code example. Let’s get to it.

.union() and .intersection()

We’ll start with the two easiest — and probably most familiar — set comparison concepts: union and intersection. These should bring you back to your early school years studying Venn diagrams.

A union is a new set that contains elements from all original sets. This can be thought of as the sum of two or more sets.

Union identifies the sum of multiple sets

To create a union set, we simply use the .union() method from one set and supply a second set as the argument.

a = {1, 2, 3}
b = {3, 4, 5}
c = a.union(b)
print(c) # {1, 2, 3, 4, 5}

Notice the number 3 is in both sets but only appears once in our union. Remember that sets contain only unique elements, so duplicates are naturally left out.

An intersection is a new set that contains only the elements from a that are also present in b. Functionally, this includes the elements that overlap, or are in both sets.

Intersection identifies the common elements—or overlap—between multiple sets

To create an intersection set, we use the .intersection() method from one set and supply a second set as the argument.

a = {1, 2, 3}
b = {3, 4, 5}
c = a.intersection(b)
print(c) # {3}

Since both .union() and .intersection() either look at the whole or the overlap, it doesn’t matter which set is used to call the method. However, this will change as we move to more advanced methods.
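As a quick check of that claim, the sketch below confirms that swapping the base set and the argument gives the same result for both methods. It also shows the | and & operators, which behave like .union() and .intersection() when both operands are sets. The third set d is a hypothetical addition used only to show that both methods accept more than one argument.

```python
a = {1, 2, 3}
b = {3, 4, 5}
d = {5, 6}  # hypothetical third set, added for illustration only

# Swapping the base set and the argument gives the same result.
print(a.union(b) == b.union(a))                # True
print(a.intersection(b) == b.intersection(a))  # True

# The | and & operators match the methods when both operands are sets.
print(a | b == a.union(b))         # True
print(a & b == a.intersection(b))  # True

# Both methods accept several arguments at once.
print(sorted(a.union(b, d)))  # [1, 2, 3, 4, 5, 6]
print(a.intersection(b, d))   # set()
```

No element is shared by all three sets, so the last intersection comes back empty.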

.difference() and .symmetric_difference()

Next, we move on to identifying elements in one set that don’t exist in another, known as a difference. From the base set, we’ll eliminate any elements that are present in the other set.

Difference identifies the values that exist in a base set and not in another

To create a new set of elements that are present in the base set and aren’t present in the second set, use the .difference() method.

a = {1, 2, 3}
b = {3, 4, 5}
c = a.difference(b)
print(c) # {1, 2}

Unlike .union() and .intersection(), .difference() depends on which set is the base and which is passed as the argument. Swapping them yields a different result.

a = {1, 2, 3}
b = {3, 4, 5}
c = b.difference(a)
print(c) # {4, 5}
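For readers who prefer operators, here is a minimal sketch of the same idea using -, which matches .difference() when both operands are sets. One distinction worth knowing: the method accepts any iterable as its argument, while the operator requires a set on both sides.

```python
a = {1, 2, 3}
b = {3, 4, 5}

# The - operator matches .difference() when both operands are sets.
print(a - b == a.difference(b))  # True

# The order of the operands changes the result.
print(sorted(a - b))  # [1, 2]
print(sorted(b - a))  # [4, 5]

# The method accepts any iterable, such as a list.
print(sorted(a.difference([3, 4, 5])))  # [1, 2]
```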

Now, to find the values that exist in only one set, we use .symmetric_difference(). Think of this as the union minus the intersection.

Symmetric difference identifies the unique elements across multiple sets

a = {1, 2, 3}
b = {3, 4, 5}
c = a.symmetric_difference(b)
print(c) # {1, 2, 4, 5}

Since we’re subtracting the intersection from the union, and both of those operations are symmetrical, it doesn’t matter here either which set is the base and which is the argument.
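The "union minus intersection" description can be verified directly. The sketch below rebuilds the symmetric difference from the earlier operations and compares it with the built-in method and its ^ operator form. The name sym is just an illustrative variable.

```python
a = {1, 2, 3}
b = {3, 4, 5}

sym = a.symmetric_difference(b)

# The union minus the intersection gives the same set.
print(sym == (a | b) - (a & b))  # True

# So does combining the two one-way differences.
print(sym == (a - b) | (b - a))  # True

# The ^ operator is the operator form, and the order doesn't matter.
print(sym == a ^ b)    # True
print(a ^ b == b ^ a)  # True
```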

.issubset() and .issuperset()

Our final set comparisons are going to validate whether one set fully exists within another. Unlike the previous four methods, which returned new sets, the .issubset() and .issuperset() methods return boolean True or False values.

A subset is a set that entirely exists within another. Visually, a subset would be represented as a circle entirely within a larger circle.

A subset is a set where all elements are present in the larger set

It’s critical to correctly identify the base set as these operands aren’t interchangeable. We’ll modify the values of our a and b sets to demonstrate these methods.

a = {1, 2, 3, 4, 5}
b = {3, 4, 5}
c = a.issubset(b)
print(c) # False

If the False result was surprising to you, it’s because you interpreted the method backwards. The above example should be read as: Is a a subset of b?

If we flip the base set and argument, asking if b is a subset of a, we’ll get a True result.

a = {1, 2, 3, 4, 5}
b = {3, 4, 5}
c = b.issubset(a)
print(c) # True

A superset is predictably the opposite of a subset. A base set is declared a superset if all elements of the supplied argument exist within it. Going back to our overlapping circles, the superset is the big circle.

A superset is a set that includes every element (and potentially more) from another set

Having practiced with .issubset(), we should be able to correctly read .issuperset() and accurately predict the outcome.

a = {1, 2, 3, 4, 5}
b = {3, 4, 5}
c = a.issuperset(b)
print(c) # True
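Both checks also have operator forms, sketched below with the same a and b. One edge case to keep in mind: every set counts as a subset and a superset of itself, so the stricter < and > operators are available when the two sets must also differ.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5}

# <= and >= match .issubset() and .issuperset() when both operands are sets.
print(b <= a)  # True
print(a >= b)  # True

# Every set is a subset (and superset) of itself.
print(a.issubset(a))  # True

# < and > test for a proper subset or superset, which excludes equal sets.
print(b < a)  # True
print(a < a)  # False
```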


Conclusion

Python is such a syntactically simple language to get started with that we forget about the treasure trove of goodies built into the language.

What are your favorite real-world uses of the set built-in methods?

Jamie Larson