Exploring the World of Sets in Python

Understand one of the important data types in Python. Each item in a set is distinct. Sets can store multiple items of various types of data.

Welcome, aspiring data scientists and coding newcomers! Today, we’re venturing into an intriguing aspect of Python that plays a pivotal role in data manipulation and analysis – sets. Python sets are powerful, versatile, and, once understood, can significantly streamline your code, especially when dealing with unique values and set operations. Let’s dive into the essence of sets, elucidating their features and functionalities with clear, practical examples.

What is a set in Python?

A set is a collection type in Python, characterized by three main properties:

  • 1. Unordered: The items in a set do not have a defined order. This means you cannot access items in a set by referring to an index or a key.
  • 2. Mutable: You can add or remove items from a set after its creation.
  • 3. Unique Elements: A set automatically filters out duplicate entries, leaving only unique elements.

1. Create sets

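  • To create a set, use curly braces `{}` with at least one element inside, or the `set()` function. Empty braces `{}` create a dictionary, so an empty set must be written as `set()`.
  • Sets remove duplicate items.
  • Sets can hold items of different data types, as long as each item is hashable.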
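Python
# create a set that mixes strings, integers, floats, and booleans
data = {'Red','Green','Blue',1,2,3,1.5,2.2,3.9,True,False,True}
print(data)
print(type(data))
Output
{False, 1, 2, 3, 1.5, 2.2, 3.9, 'Blue', 'Green', 'Red'}
<class 'set'>

True is missing from the output because True == 1 in Python, so the set treats it as a duplicate of 1. The order of the printed elements may differ on your machine.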
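
One detail worth checking by hand: empty curly braces create a dictionary, not a set. The short sketch below illustrates this; the variable names are only illustrative and do not appear elsewhere in this article.

```python
# {} is an empty dict; set() is the empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# set() accepts any iterable, including a string
letters = set("hello")
print(len(letters))        # 4, because 'l' appears twice
```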

Convert different collection data types to sets.

Python
# create a list and a tuple of odd numbers
odd1 = [1,3,5,7,9,11,13,15,17,19,21]
odd2 = (15,17,19,21,23,25,27)       
print(odd1)
print(odd2)
# Typecast using set()
odd1=set(odd1)
odd2=set(odd2)
Output
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
(15, 17, 19, 21, 23, 25, 27)
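
The example above prints the list and the tuple, but not the sets they become. As a rough sketch, here is what the conversion produces, along with a common use of `set()` for dropping duplicates. The `readings` list is made up for illustration.

```python
odd1 = set([1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21])
odd2 = set((15, 17, 19, 21, 23, 25, 27))
print(type(odd1), len(odd1))   # <class 'set'> 11
print(type(odd2), len(odd2))   # <class 'set'> 7

# converting a list to a set is a common way to drop duplicates
readings = [3, 1, 3, 2, 1]
unique_readings = sorted(set(readings))  # sorted() returns a list in ascending order
print(unique_readings)         # [1, 2, 3]
```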

Creating a set with duplicate items.

Python
# create a set with duplicate items in the literal
odd={1,3,5,7,1,3,5,7}
odd
Output
{1, 3, 5, 7}

Length of sets

Python
# find length of set with len()
rainbow_set={'Violet','Indigo','Blue','Green','Yellow','Orange','Red'}
len(rainbow_set)
Output
7

2. Access Sets

  • Sets don’t have indexes.
  • To go through the elements, loop over the set with a ‘for’ loop (optionally wrapped in ‘enumerate()’). Because sets are unordered, the elements may not come back in the order in which they were written.
Python
# use enumerate() to traverse odd1; the index is discarded with _
for _, j in enumerate(odd1):
    print(j)
Output
1
3
5
7
9
11
13
15
17
19
21
Python
# traverse the set directly with a for loop
for i in odd1:
    print(i)
Output
1
3
5
7
9
11
13
15
17
19
21

The “in” keyword can also be used to check whether a value is present in a set.

Python
# find 11 and 12 in odd1
print(11 in odd1)
print(12 in odd1)
Output
True
False
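
Since sets have no indexes, trying to read an element by position fails. The sketch below shows the error and one possible workaround, which is to sort the elements into a list first. The names `indexing_worked` and `ordered` are illustrative.

```python
odd1 = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21}

# indexing a set raises TypeError because sets have no positions
try:
    odd1[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False
print(indexing_worked)  # False

# when an order is needed, sort the elements into a list first
ordered = sorted(odd1)
print(ordered[0], ordered[-1])  # 1 21
```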

3. Set Manipulation

  • add()
  • update()
  • discard()
  • remove()

1. add function

The add() method adds a single element to a set. If the element is already in the set, the set is left unchanged.

Syntax:

setname.add(element)

Python
# use add() to add an integer to odd2
print(f'Before:{odd2}')
odd2.add(29)
print(f'After:{odd2}')
Output
Before:{15, 17, 19, 21, 23, 25, 27}
After:{15, 17, 19, 21, 23, 25, 27, 29}

There will be no change if you add an element that is already in the set. The example below tries to add 15, which is already in odd2.

Python
# add number already present in the set
print(f'Before:{odd2}')
odd2.add(15)
print(f'After:{odd2}')
Output
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{15, 17, 19, 21, 23, 25, 27, 29}

2. update function

You can add a single item to a set with the `.add()` method and multiple items (from any iterable) with the `.update()` method.

Syntax:

setname.update(iterable_object)

Python
# use update() to add the items of a list to the set
oddlist=list(range(29,40,2))
print(f'Before:{odd2}')
odd2.update(oddlist)
print(f'After:{odd2}')
Output
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}

3. discard function

Items can be removed with `.remove()` or `.discard()`. Both remove an element from the set, but `.remove()` raises a KeyError if the element does not exist, whereas `.discard()` does nothing in that case.

Syntax:

setname.discard(element)

Python
# discard 49 from odd2 (49 is not in the set, so nothing changes)
print(f'Before:{odd2}')
odd2.discard(49)
print(f'After:{odd2}')
Output
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
Python
# discard 36 from odd2 (36 is not in the set either)
print(f'Before:{odd2}')
odd2.discard(36)
print(f'After:{odd2}')
Output
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}

4. remove function

Use the remove() method to remove an element from the set. A KeyError is raised if the element does not exist in the set.

Syntax:

setname.remove(element)

Python
# remove 15 from odd2
print(f'Before:{odd2}')
odd2.remove(15)
print(f'After:{odd2}')
Output
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 17, 19, 21, 23, 25, 27, 29, 31}
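
The examples above never show `.remove()` failing, so here is a small sketch of the difference between the two methods when the element is missing. It uses a shortened copy of odd2, and the `removed` flag is only there to make the outcome visible.

```python
odd2 = {17, 19, 21}

odd2.discard(99)          # 99 is absent: nothing happens
try:
    odd2.remove(99)       # 99 is absent: KeyError
    removed = True
except KeyError:
    removed = False
print(removed)               # False
print(odd2 == {17, 19, 21})  # True, the set is unchanged
```

A reasonable rule of thumb is to use `.discard()` when a missing element is fine, and `.remove()` when a missing element signals a bug that you want to hear about.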

4. Set Operations

Sets in Python support mathematical set operations like union, intersection, difference, and symmetric difference. These operations can be incredibly useful in data analysis for comparing datasets.

1. union

  • Union with the | operator.
  • Union with union().

The union of two sets is a single set containing every element that appears in either of them.

Python
# create two sets, rain and bow
rain={'Violet','Indigo','Blue','Green','Yellow'}
bow={'Blue','Green','Yellow','Orange','Red'}
# union operator with rain and bow.
rain | bow
Output
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
Python
# use the union() method with rain and bow
rain.union(bow)
Output
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
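
The operator and the method are not fully interchangeable. As a sketch, the `union()` method accepts any iterable, while `|` needs a set on both sides. The list `bow_list` and the `operator_ok` flag are illustrative names.

```python
rain = {'Violet', 'Indigo', 'Blue', 'Green', 'Yellow'}
bow_list = ['Blue', 'Green', 'Yellow', 'Orange', 'Red']  # a list, not a set

combined = rain.union(bow_list)   # works with any iterable
print(len(combined))              # 7

try:
    rain | bow_list               # the operator needs a set on both sides
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)                # False
print(len(rain))                  # 5, union() did not modify rain
```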

2. Intersection

  • Intersection with ‘&’
  • Intersection with the method ‘intersection()’
  • Intersection with the method ‘intersection_update()’

When two sets intersect, one set is formed from the common elements of the two sets.

Python
# intersection of rain & bow
rain & bow
Output
{'Blue', 'Green', 'Yellow'}
Python
# intersection of rain and bow with intersection()
rain.intersection(bow)
Output
{'Blue', 'Green', 'Yellow'}
Python
# intersection_update() changes rain in place and returns None
print(rain.intersection_update(bow))
rain
Output
None
{'Blue', 'Green', 'Yellow'}
Python
# declare colors1, colors2, colors3
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}

# intersection_update() with multiple sets
colors1.intersection_update(colors2,colors3)
print(colors1)
Output
{'Black', 'Grey', 'White', 'Brown'}
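
To make the difference between the two methods explicit, the sketch below uses two small illustrative sets, `a` and `b`. `intersection()` builds a new set, while `intersection_update()` overwrites the set it is called on.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

common = a.intersection(b)          # new set; a is untouched
print(common == {3, 4}, len(a))     # True 4

result = a.intersection_update(b)   # modifies a in place, returns None
print(result, a == {3, 4})          # None True
```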

3. Difference

  • Difference with -.
  • Difference with difference().
  • Difference with difference_update().

The difference of two sets keeps the elements of the first set that do not appear in the second; elements that exist only in the second set are ignored.

Python
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}

Input                Output
colors1-colors2      {'Blue', 'Red'}
colors2-colors1      {'Indigo', 'Purple'}
colors3-colors2      {'Blue', 'Green', 'Red', 'Yellow'}
colors2-colors3      {'Indigo', 'Purple'}
colors1-colors3      set()
colors3-colors1      {'Green', 'Yellow'}

Difference between two sets with .difference()

Python
# difference between two sets with .difference()
print(f'{colors1}\n{colors2}')
colors1.difference(colors2)
Output
{'Black', 'Grey', 'White', 'Brown', 'Blue', 'Red'}
{'Black', 'Grey', 'White', 'Brown', 'Indigo', 'Purple'}
{'Blue', 'Red'}

Difference between two sets with .difference_update()

Python
# difference between two sets with .difference_update(); colors1 is changed in place
print(f'{colors1}\n{colors2}')
colors1.difference_update(colors2)
print(colors1)
Output
{'Black', 'Grey', 'White', 'Brown', 'Blue', 'Red'}
{'Black', 'Grey', 'White', 'Brown', 'Indigo', 'Purple'}
{'Blue', 'Red'}
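
Two further points about difference can be sketched with the original three color sets: `difference()` accepts more than one argument, and the order of the operands matters. The name `remaining` is illustrative.

```python
colors1 = {'Black', 'Grey', 'White', 'Brown', 'Blue', 'Red'}
colors2 = {'Black', 'Grey', 'White', 'Brown', 'Indigo', 'Purple'}
colors3 = {'Black', 'Grey', 'White', 'Brown', 'Blue', 'Red', 'Green', 'Yellow'}

# difference() accepts several arguments and removes the elements of all of them
remaining = colors3.difference(colors1, colors2)
print(sorted(remaining))            # ['Green', 'Yellow']

# the - operator can be chained for the same result
print(remaining == colors3 - colors1 - colors2)   # True

# order matters: swapping the operands gives a different set
print(sorted(colors2 - colors1))    # ['Indigo', 'Purple']
```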

4. Symmetric Difference

  • Symmetric difference with ^.
  • Symmetric difference with symmetric_difference().

The symmetric difference of two sets contains the elements that belong to exactly one of them; the elements common to both are left out.

Symmetric Difference Using the ^ Operator

The ^ operator computes the symmetric difference between two sets.

Python
# symmetric difference between two sets with the ^ operator
# colors1 is {'Blue', 'Red'} here, after the difference_update() call above
colors1^colors2
Output
{'Black', 'Blue', 'Brown', 'Grey', 'Indigo', 'Purple', 'Red', 'White'}
Python
# symmetric difference with ^
colors2^colors3
Output
{'Blue', 'Green', 'Indigo', 'Purple', 'Red', 'Yellow'}
Python
# symmetric difference with ^ (colors1 is {'Blue', 'Red'} here)
colors1^colors3
Output
{'Black', 'Brown', 'Green', 'Grey', 'White', 'Yellow'}

symmetric_difference()

This method returns a new set holding the symmetric difference of the two sets; neither original set is modified.

Syntax: 

set1.symmetric_difference(set2)

Symmetric difference between two sets with symmetric_difference()

Python
# symmetric difference between two sets with symmetric_difference()
# colors1 is {'Blue', 'Red'} here, after the difference_update() call above
colors1.symmetric_difference(colors2)
Output
{'Black', 'Blue', 'Brown', 'Grey', 'Indigo', 'Purple', 'Red', 'White'}

Symmetric difference with the two sets swapped (the result is the same)

Python
# symmetric_difference() gives the same result in either direction
colors2.symmetric_difference(colors1)
Output
{'Black', 'Blue', 'Brown', 'Grey', 'Indigo', 'Purple', 'Red', 'White'}

symmetric_difference_update()

This method replaces the contents of the first set with the symmetric difference of the two sets. It works in place and returns None.

Syntax:

set1.symmetric_difference_update(set2)

Python
# symmetric difference between two sets with symmetric_difference_update()
print(colors1)
colors1.symmetric_difference_update(colors2)
print(colors1)
Output
{'Blue', 'Red'}
{'Black', 'Blue', 'Brown', 'Grey', 'Indigo', 'Purple', 'Red', 'White'}
Python
# symmetric difference between two sets with symmetric_difference_update()
print(colors2)
colors2.symmetric_difference_update(colors3)
print(colors2)
Output
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
{'Blue', 'Indigo', 'Red', 'Green', 'Purple', 'Yellow'}
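
One way to check your understanding of symmetric difference is to rebuild it from the operations covered earlier. The sketch below uses two small illustrative sets, `a` and `b`.

```python
a = {'Blue', 'Red', 'Green'}
b = {'Green', 'Yellow'}

sym = a ^ b
print(sorted(sym))                     # ['Blue', 'Red', 'Yellow']
print(sym == (a | b) - (a & b))        # True: union minus intersection
print(sym == (a - b) | (b - a))        # True: both one-sided differences combined
print((a ^ b) == (b ^ a))              # True: the order of the operands does not matter
```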

5. Set Comparison Operators

  • < proper subset of
  • > proper superset of
  • <= subset of or equal to
  • >= superset of or equal to
  • == equal to (and != for not equal to)

Let’s start by making some sets for this section.

Python
num={x for x in range(10)}
odd={x for x in range(10) if x%2!=0}
even={x for x in range(10) if x%2==0}

print(num)
print(odd)
print(even)
Output
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 3, 5, 7, 9}
{0, 2, 4, 6, 8}

The following four examples demonstrate the comparison operators with sets. From the code above, num contains the digits 0 to 9, while odd and even contain the odd and even numbers from that range.

Python
# is num a proper subset of odd?
num<odd
Output
False
Python
# is num a proper superset of odd?
num>odd
Output
True
Python
# is num a subset of or equal to even?
num <= even
Output
False
Python
# Is num superset of or equal to even
num >= even
Output
True
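
The strict and non-strict operators only differ when the two sets are equal, and some pairs of sets are not comparable at all. The sketch below rebuilds the same three sets to show both cases; it is an illustration, not an exhaustive list.

```python
num = set(range(10))
odd = {x for x in num if x % 2 != 0}
even = {x for x in num if x % 2 == 0}

print(odd < num, odd <= num)     # True True
print(num < num, num <= num)     # False True: a set is not a proper subset of itself
print(odd.issubset(num))         # True, same as odd <= num

# odd and even share no elements, so neither is a subset of the other
print(odd <= even, odd >= even)  # False False
print((odd | even) == num)       # True
```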

6. FrozenSet

Frozen sets are immutable sets. They are made by calling the frozenset() function on any iterable, such as a list, tuple, set, or dictionary.

Python
# create frozenset
num_list=[1,2,3,4,5]
num_set=frozenset(num_list )
print( num_set )
Output

frozenset({1, 2, 3, 4, 5})

Convert a tuple to a frozen set.

Python
# typecast tuple to frozenset.
num_tuple=(1,2,3,4,5)
print( frozenset(num_tuple) )
Output

frozenset({1, 2, 3, 4, 5})

Convert a normal set to a frozen set.

Python
# typecast set to frozenset.
num_set={1,2,3,4,5}
print( frozenset(num_set) )
Output
frozenset({1, 2, 3, 4, 5})

Convert a dictionary to a frozen set; only the keys are used.

Python
# typecast dictionary to frozenset.
fict={'A':1,'B':2,'C':3,'D':4,'E':5,'F':6}
print( frozenset(fict) )
Output
frozenset({'B', 'D', 'E', 'F', 'A', 'C'})
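
The practical payoff of immutability is that a frozen set is hashable. As a sketch, it can be used as a dictionary key or stored inside another set, which a regular set cannot. The names `frozen`, `labels`, `mutated`, and `nested_ok` are illustrative.

```python
frozen = frozenset([1, 2, 3])

# frozenset has no add() method, so this raises AttributeError
try:
    frozen.add(4)
    mutated = True
except AttributeError:
    mutated = False
print(mutated)                        # False

# being hashable, a frozenset can be a dict key or a member of another set
labels = {frozenset({'a', 'b'}): 'pair'}
print(labels[frozenset({'b', 'a'})])  # pair

# a regular set cannot be nested: it raises TypeError (unhashable type)
try:
    {{1, 2}}
    nested_ok = True
except TypeError:
    nested_ok = False
print(nested_ok)                      # False
```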

Functions       Explanations
clear()         Removes all items from the set.
copy()          Returns a copy of the set.
isdisjoint()    Returns True if the two sets have no items in common.
issubset()      Returns True if every item of the set is also in the other set.
issuperset()    Returns True if the set contains every item of the other set.
pop()           Removes and returns an arbitrary item; raises KeyError if the set is empty.
remove()        Removes the specified item; raises KeyError if it is not present.
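
A quick sketch of the methods from the table that were not demonstrated earlier, using a small illustrative set `s`:

```python
s = {1, 2, 3}
backup = s.copy()                 # shallow copy, independent of s

print({1, 2}.issubset(s))         # True
print(s.issuperset({1, 2}))       # True
print(s.isdisjoint({4, 5}))       # True

item = s.pop()                    # removes and returns an arbitrary element
print(item in backup, len(s))     # True 2

s.clear()
print(s, backup == {1, 2, 3})     # set() True
```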
