Understand one of the important data types in Python. Each item in a set is distinct. Sets can store multiple items of various types of data.
Welcome, aspiring data scientists and coding newcomers! Today, we’re venturing into an intriguing aspect of Python that plays a pivotal role in data manipulation and analysis – sets. Python sets are powerful, versatile, and, once understood, can significantly streamline your code, especially when dealing with unique values and set operations. Let’s dive into the essence of sets, elucidating their features and functionalities with clear, practical examples.
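A set is a collection type in Python, characterized by three main properties: it is unordered, it holds only unique items, and it is mutable (items can be added or removed, although each item must itself be hashable).
# create a set with items of mixed types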
data = {'Red','Green','Blue',1,2,3,1.5,2.2,3.9,True,False,True}
print(data)
print(type(data))
{False, 1, 2, 3, 1.5, 2.2, 3.9, 'Blue', 'Green', 'Red'}
<class 'set'>
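The set above ends up with fewer items than were written in the literal. In Python, True compares equal to 1 and has the same hash, so a set treats the two as a single item. The sketch below illustrates this; the names `mixed` and `kept` are made up for the example.

```python
# True == 1 and hash(True) == hash(1), so a set treats them as one item.
# The same holds for False and 0, and for 1 and 1.0.
mixed = {1, True, 0, False, 1.0}
print(len(mixed))   # only two distinct items remain

# The value that was inserted first is the one the set keeps.
kept = list({True, 1})
print(kept)
```

If you need to keep both booleans and the integers 0 and 1 apart, a set is not the right container for that data.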
A list or a tuple can be converted to a set with set().
# create a list and a tuple of odd numbers
odd1 = [1,3,5,7,9,11,13,15,17,19,21]
odd2 = (15,17,19,21,23,25,27)
print(odd1)
print(odd2)
# Typecast using set()
odd1=set(odd1)
odd2=set(odd2)
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
(15, 17, 19, 21, 23, 25, 27)
Creating a set with duplicate items.
# create a set with duplicate items in the literal
odd={1,3,5,7,1,3,5,7}
odd
{1, 3, 5, 7}
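Because duplicates collapse automatically, converting a list to a set is a quick way to find its distinct values. A minimal sketch, using made-up data in a hypothetical `readings` list:

```python
# Hypothetical readings with repeats (illustrative data only).
readings = [3, 1, 3, 2, 1, 2, 3]

unique_readings = set(readings)          # duplicates collapse
sorted_unique = sorted(unique_readings)  # a set has no guaranteed order, so sort for display

print(sorted_unique)
```

Note that the original order of the list is not preserved by the set, which is why the result is sorted before printing.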
# find length of set with len()
rainbow_set={'Violet','Indigo','Blue','Green','Yellow','Orange','Red'}
len(rainbow_set)
7
# use enumerate to traverse odd1
for _, j in enumerate(odd1):
    print(j)
1
3
5
7
9
11
13
15
17
19
21
# traverse with the in operator
for i in odd1:
    print(i)
1
3
5
7
9
11
13
15
17
19
21
The “in” keyword can also be used to check whether an item is present in a set.
# find 11 and 12 in odd1
print(11 in odd1)
print(12 in odd1)
True
False
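Membership tests are one of the most common reasons to reach for a set. The sketch below filters a list of words against a small set; the names `stop_words`, `words`, and `kept_words` are illustrative assumptions, not part of the examples above.

```python
# A small, made-up set of words to filter out.
stop_words = {"a", "the", "of"}
words = ["the", "power", "of", "a", "set"]

# Keep only the words that are NOT members of stop_words.
kept_words = [w for w in words if w not in stop_words]
print(kept_words)
```

Set membership is a hash lookup, so it stays fast on average even when the set grows large.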
add()
update()
discard()
remove()
add() function
In the following example, we use the add() method of the set class to add an element to a set. If the element is already in the set, the set does not change.
Syntax:
setname.add(element)
# use add() to add an integer to odd2
print(f'Before:{odd2}')
odd2.add(29)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27}
After:{15, 17, 19, 21, 23, 25, 27, 29}
There is no change if you add an element that is already in the set. Here we try to add 15, which is already in odd2.
# add number already present in the set
print(f'Before:{odd2}')
odd2.add(15)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{15, 17, 19, 21, 23, 25, 27, 29}
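One detail worth knowing about add(): the element must be hashable. The sketch below, with made-up names, shows that a tuple can be added while a list cannot.

```python
items = {1, 2}

# A tuple is hashable, so it can be a set element.
items.add((3, 4))

# A list is mutable and unhashable, so add() raises TypeError.
try:
    items.add([5, 6])
except TypeError as err:
    error_message = str(err)
    print(error_message)
```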
update() function
You can add a single item to a set with the `.add()` method and multiple items (from another iterable) with the `.update()` method.
Syntax:
setname.update(iterable_object)
# use update() to add the items of a list to the set
oddlist=list(range(29,40,2))
print(f'Before:{odd2}')
odd2.update(oddlist)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
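update() accepts any iterable, and more than one at a time. A string is also an iterable, so update() adds its characters one by one, whereas add() adds the string as a single item. A small sketch with illustrative names:

```python
letters = {"a"}
# A list, a tuple, and a string are all iterated element by element.
letters.update(["b", "c"], ("d",), "ef")
print(sorted(letters))

words = {"a"}
# add() treats the string as one element.
words.add("ef")
print(sorted(words))
```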
discard() function
Items can be removed using methods like `.remove()` and `.discard()`. While both remove an element from the set, `.remove()` raises a KeyError if the element does not exist, whereas `.discard()` does not.
Syntax:
setname.discard(element)
# discard 49, which is not in odd2: no error, no change
print(f'Before:{odd2}')
odd2.discard(49)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
# discard 36, which is not in odd2 either
print(f'Before:{odd2}')
odd2.discard(36)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
remove() function
Use the remove() method to remove an element from the set. A KeyError is raised if the element you are trying to remove does not exist in the set.
Syntax:
setname.remove(element)
# remove element from odd2
print(f'Before:{odd2}')
odd2.remove(15)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 17, 19, 21, 23, 25, 27, 29, 31}
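The difference between discard() and remove() only shows up when the element is missing. The sketch below uses a small, separate set named `sample` (an illustrative name) so that it can be run on its own.

```python
sample = {15, 17, 19}

# discard() is silent when the element is absent.
sample.discard(100)

# remove() raises KeyError when the element is absent.
try:
    sample.remove(100)
except KeyError as err:
    missing = err.args[0]
    print(f'{missing} is not in the set')

print(sample)
```

A reasonable rule of thumb: use discard() when a missing element is fine, and remove() when a missing element would indicate a bug you want to hear about.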
Sets in Python support mathematical set operations like union, intersection, difference, and symmetric difference. These operations can be incredibly useful in data analysis for comparing datasets.
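union()
The union of two sets can be computed with the | operator or the union() method. In a union, the two sets are combined into a single set that holds every distinct element of both.
# create two sets, rain and bow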
rain={'Violet','Indigo','Blue','Green','Yellow'}
bow={'Blue','Green','Yellow','Orange','Red'}
# union operator with rain and bow.
rain | bow
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
# use the union() method with rain and bow
rain.union(bow)
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
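The operator and the method give the same result for two sets, but they differ in what they accept. The union() method takes any iterable (and several at once), while the | operator requires both operands to be sets. A sketch with an illustrative `extra` list:

```python
rain = {'Violet', 'Indigo', 'Blue'}
extra = ['Blue', 'Green']          # a list, not a set

# union() accepts lists, tuples, and other iterables.
combined = rain.union(extra, ('Red',))
print(sorted(combined))

# The | operator only works between sets.
try:
    rain | extra
    operator_worked = True
except TypeError:
    operator_worked = False
print(operator_worked)
```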
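The intersection of two sets is a new set formed from the elements common to both. It can be computed with the & operator, the intersection() method, or, in place, with intersection_update().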
# intersection of rain & bow
rain & bow
{'Blue', 'Green', 'Yellow'}
# intersection of rain and bow
rain.intersection(bow)
{'Blue', 'Green', 'Yellow'}
# intersection_update changes rain in place and returns None
print(rain.intersection_update(bow))
rain
None
{'Blue', 'Green', 'Yellow'}
# declare colors1,colors2,colors3.
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}
# intersection_update with multiple sets
colors1.intersection_update(colors2,colors3)
print(colors1)
{'Black', 'Grey', 'White', 'Brown'}
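For comparing datasets, intersection answers "which items appear in both?". The sketch below uses hypothetical customer IDs for two months and contrasts the & operator, which returns a new set, with intersection_update(), which changes the set in place.

```python
# Hypothetical customer IDs seen in each month (illustrative data).
january = {101, 102, 103, 104}
february = {103, 104, 105}

# & returns a new set and leaves both inputs untouched.
returning = january & february
print(returning)

# intersection_update() modifies january itself and returns None.
result = january.intersection_update(february)
print(result)
print(january)
```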
The difference can be computed with the - operator, the .difference() method, or the .difference_update() method. The difference between two sets is computed by removing the elements common to both and keeping the remaining elements of the first set only.
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}
Input | Output |
---|---|
colors1-colors2 | {'Blue', 'Red'} |
colors2-colors1 | {'Indigo', 'Purple'} |
colors3-colors2 | {'Blue', 'Green', 'Red', 'Yellow'} |
colors2-colors3 | {'Indigo', 'Purple'} |
colors1-colors3 | set() |
colors3-colors1 | {'Green', 'Yellow'} |
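# difference between two sets with .difference()
# colors1 is reset here to the result of the earlier intersection_update
colors1={'Black','Grey','White','Brown'}
print(f'{colors1}\n{colors2}')
colors1.difference(colors2)
{'Brown', 'White', 'Black', 'Grey'}
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
set()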
# difference between two sets with .difference_update()
print(f'{colors1}\n{colors2}')
colors1.difference_update(colors2)
print(colors1)
{'Brown', 'White', 'Black', 'Grey'}
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
set()
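Unlike union and intersection, the difference depends on the order of the operands. The sketch below uses small numeric sets with illustrative names to show this, and also shows that difference() can subtract several sets at once.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}
c = {1}

a_minus_b = a - b              # elements of a that are not in b
b_minus_a = b - a              # elements of b that are not in a
a_minus_both = a.difference(b, c)  # remove everything found in b or c

print(a_minus_b, b_minus_a, a_minus_both)
```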
Use the ^ operator or the .symmetric_difference() method to compute symmetric differences. The symmetric difference is calculated by removing the elements common to both sets and combining the remaining elements into one set.
^ operator
The ^ operator computes the symmetric difference between set 1 and set 2.
# symmetric difference between two sets with the ^ operator
# (colors1 is empty at this point, so the result equals colors2)
colors1^colors2
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
# symmetric difference with ^
colors2^colors3
{'Blue', 'Green', 'Indigo', 'Purple', 'Red', 'Yellow'}
# symmetric difference with ^ (colors1 is empty, so the result equals colors3)
colors1^colors3
{'Black', 'Blue', 'Brown', 'Green', 'Grey', 'Red', 'White', 'Yellow'}
symmetric_difference()
This method returns the symmetric difference of the first and second sets as a new set; neither set is modified.
Syntax:
set1.symmetric_difference(set2)
# symmetric difference between two sets with symmetric_difference()
colors1.symmetric_difference(colors2)
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
# symmetric difference between two sets with symmetric_difference()
colors2.symmetric_difference(colors1)
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
symmetric_difference_update()
This method replaces the contents of the first set with its symmetric difference with the second set. It changes the set in place and returns None.
Syntax:
set1.symmetric_difference_update(set2)
# symmetric difference between two sets with symmetric_difference_update()
print(colors1)
colors1.symmetric_difference_update(colors2)
print(colors1)
set()
{'Grey', 'Brown', 'White', 'Indigo', 'Purple', 'Black'}
# symmetric difference between two sets with symmetric_difference_update()
print(colors2)
colors2.symmetric_difference_update(colors3)
print(colors2)
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
{'Blue', 'Indigo', 'Red', 'Green', 'Purple', 'Yellow'}
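The symmetric difference can also be expressed with the other set operations, which may help in checking your understanding. The sketch below uses two small, non-empty sets with illustrative names and confirms that a ^ b equals both (a | b) - (a & b) and (a - b) | (b - a).

```python
a = {'Black', 'Grey', 'Blue'}
b = {'Black', 'Grey', 'Indigo'}

sym = a ^ b                       # elements in exactly one of the two sets
by_union = (a | b) - (a & b)      # everything, minus what is shared
by_differences = (a - b) | (b - a)  # what is only in a, plus what is only in b

print(sorted(sym))
print(sym == by_union == by_differences)
```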
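Sets support the following comparison operators:
< : less than (proper subset)
> : greater than (proper superset)
<= : less than or equal to (subset)
>= : greater than or equal to (superset)
== : equal to, with != for not equal to
Let’s start by making some sets for this session.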
num={x for x in range(10)}
odd={x for x in range(10) if x%2!=0}
even={x for x in range(10) if x%2==0}
print(num)
print(odd)
print(even)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 3, 5, 7, 9}
{0, 2, 4, 6, 8}
The following four examples demonstrate the comparison operators with sets. The num set above contains the digits 0 to 9, and the odd and even sets contain the odd and even numbers from that range.
# Is num a proper subset of odd?
num<odd
False
# Is num a proper superset of odd?
num>odd
True
# Is num a subset of or equal to even?
num <= even
False
# Is num a superset of or equal to even?
num >= even
True
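Set comparisons test containment, not size, so two sets can be "neither less than nor greater than" each other. The sketch below rebuilds the same num, odd, and even sets and collects a few checks in a dictionary named `checks` (an illustrative name), including the method equivalents issubset() and issuperset().

```python
num = set(range(10))
odd = {x for x in num if x % 2 != 0}
even = num - odd

checks = {
    "odd < num": odd < num,                       # proper subset
    "num < num": num < num,                       # a set is not a proper subset of itself
    "num <= num": num <= num,                     # but it is a subset of itself
    "odd.issubset(num)": odd.issubset(num),       # method form of <=
    "num.issuperset(even)": num.issuperset(even), # method form of >=
    "odd < even": odd < even,                     # neither contains the other
    "odd > even": odd > even,
    "odd | even == num": (odd | even) == num,
}

for label, value in checks.items():
    print(f'{label}: {value}')
```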
Frozen sets are immutable sets. They are made by calling the frozenset() function on any iterable, such as a list, tuple, set, or dictionary.
# create frozenset
num_list=[1,2,3,4,5]
num_set=frozenset(num_list)
print(num_set)
frozenset({1, 2, 3, 4, 5})
Convert a tuple to a frozen set.
# typecast tuple to frozenset.
num_tuple=(1,2,3,4,5)
print( frozenset(num_tuple) )
frozenset({1, 2, 3, 4, 5})
Convert a normal set to a frozen set.
# typecast set to frozenset.
num_set={1,2,3,4,5}
print( frozenset(num_set) )
frozenset({1, 2, 3, 4, 5})
Convert a dictionary to a frozen set; only the keys are used.
# typecast dictionary to frozenset.
fict={'A':1,'B':2,'C':3,'D':4,'E':5,'F':6}
print( frozenset(fict) )
frozenset({'B', 'D', 'E', 'F', 'A', 'C'})
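Because a frozenset is immutable, it is hashable, which means it can be used as a dictionary key or as an element of another set. The sketch below uses a hypothetical `route_distance` dictionary keyed by unordered pairs; all names and values are made up for illustration.

```python
# Unordered pairs as dictionary keys: {"A", "B"} and {"B", "A"} are the same key.
route_distance = {frozenset({"A", "B"}): 12}
lookup = route_distance[frozenset({"B", "A"})]
print(lookup)

# Frozen sets can be elements of a normal set; equal ones collapse.
nested = {frozenset({1, 2}), frozenset({2, 1})}
print(len(nested))

# A frozenset has no add() method, so it cannot be changed.
try:
    frozenset({1}).add(2)
    can_add = True
except AttributeError:
    can_add = False
print(can_add)
```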
Functions | Explanations |
---|---|
clear() | Removes all items from the set. |
copy() | Returns a shallow copy of the set. |
isdisjoint() | Returns True if two sets have no items in common. |
issubset() | Returns True if every item of the set is in another set. |
issuperset() | Returns True if the set contains every item of another set. |
pop() | Removes and returns an arbitrary item from the set. |
remove() | Removes the specified item from the set; raises KeyError if it is missing. |
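A few of the methods in the table can be tried together in a short sketch. The set names below are illustrative. Note that pop() removes an arbitrary element, so the example does not assume which one comes out.

```python
original = {1, 2, 3}

# copy() makes an independent set: changing the copy leaves the original alone.
duplicate = original.copy()
duplicate.add(4)

# pop() removes and returns some element; which one is not specified.
popped = duplicate.pop()
print(popped, duplicate)

# isdisjoint() is True when the two sets share no items.
disjoint = original.isdisjoint({7, 8})
print(disjoint)

# clear() empties the set in place.
original.clear()
print(original)
```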