Understand one of the important data types in Python. Each item in a set is distinct. Sets can store multiple items of various types of data.
Welcome, aspiring data scientists and coding newcomers! Today, we’re venturing into an intriguing aspect of Python that plays a pivotal role in data manipulation and analysis – sets. Python sets are powerful, versatile, and, once understood, can significantly streamline your code, especially when dealing with unique values and set operations. Let’s dive into the essence of sets, elucidating their features and functionalities with clear, practical examples.
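A set is a collection type in Python, characterized by three main properties: it is unordered, its items are unique, and it is mutable (items can be added or removed, although each item must itself be hashable). The following example creates a set that mixes several data types: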
# create a set containing items of several data types
data = {'Red','Green','Blue',1,2,3,1.5,2.2,3.9,True,False,True}
print(data)
print(type(data))
{False, 1, 2, 3, 1.5, 2.2, 3.9, 'Blue', 'Green', 'Red'}
<class 'set'>
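A detail that is easy to miss with mixed-type sets: Python treats True and 1 (and likewise False and 0) as equal values with equal hashes, so a set keeps only whichever of the two was inserted first. The short sketch below illustrates this; the variable name mixed is purely illustrative.

```python
# True == 1 and hash(True) == hash(1), so a set sees them as the same item.
# The same holds for 2 and 2.0.
mixed = {1, True, 2.0, 2, 'Red'}

print(len(mixed))      # 3
print(True in mixed)   # True, because 1 is in the set
print(0 in {False})    # True, for the same reason
```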
A list or a tuple can be typecast into a set with set().
# create a list and a tuple of odd numbers
odd1 = [1,3,5,7,9,11,13,15,17,19,21]
odd2 = (15,17,19,21,23,25,27)
print(odd1)
print(odd2)
# Typecast using set()
odd1=set(odd1)
odd2=set(odd2)
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
(15, 17, 19, 21, 23, 25, 27)
Creating a set with duplicate items.
# create a set with duplicate items in the assignment
odd={1,3,5,7,1,3,5,7}
odd
{1, 3, 5, 7}
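Because duplicates collapse automatically, converting a list to a set is a common way to deduplicate data. The sketch below uses made-up sample values (readings is a hypothetical name) and counts how many duplicates were dropped.

```python
readings = [3, 7, 3, 9, 7, 7, 1]        # hypothetical sample data
unique_readings = set(readings)          # duplicates collapse automatically

# sorted() returns a list, which gives a stable order for display
print(sorted(unique_readings))           # [1, 3, 7, 9]
print(len(readings) - len(unique_readings), 'duplicates removed')  # 3 duplicates removed
```

Note that the original order of the list is not preserved by the set itself.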
# find length of set with len()
rainbow_set={'Violet','Indigo','Blue','Green','Yellow','Orange','Red'}
len(rainbow_set)
7
# use enumerate() to traverse odd1; the index is not needed, so it is discarded
for _, j in enumerate(odd1):
    print(j)
1
3
5
7
9
11
13
15
17
19
21
# traverse with the in operator
for i in odd1:
    print(i)
1
3
5
7
9
11
13
15
17
19
21
The “in” keyword can also be used to check the availability of objects.
# find 11 and 12 in odd1
print(11 in odd1)
print(12 in odd1)
True
False
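Membership tests are a natural fit for filtering. As a small sketch (the names allowed and requested are invented for illustration), the in and not in operators can split incoming values into accepted and rejected groups:

```python
allowed = {'Red', 'Green', 'Blue'}              # hypothetical set of valid values
requested = ['Red', 'Pink', 'Blue', 'Black']    # hypothetical input

# keep the input order by looping over the list and testing against the set
accepted = [c for c in requested if c in allowed]
rejected = [c for c in requested if c not in allowed]

print(accepted)   # ['Red', 'Blue']
print(rejected)   # ['Pink', 'Black']
```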
Sets can be modified with the following methods: add(), update(), discard(), and remove().
add() function
In the following example, we use the add() method of the set class to add an element to a set. If the element you are trying to add is already in the set, the set is left unchanged.
Syntax:
setname.add(element)
# use add() to add an integer to odd2
print(f'Before:{odd2}')
odd2.add(29)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27}
After:{15, 17, 19, 21, 23, 25, 27, 29}
The set does not change if you add an element it already contains. Here we try to add 15 to odd2 a second time.
# add number already present in the set
print(f'Before:{odd2}')
odd2.add(15)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{15, 17, 19, 21, 23, 25, 27, 29}
update() function
You can add a single item to a set with the `.add()` method and multiple items (from any iterable) with the `.update()` method.
Syntax:
setname.update(iterable_object)
# use update to add another list into set
oddlist=[x for x in range(29,40,2)]
print(f'Before:{odd2}')
odd2.update(oddlist)
print(f'After:{odd2}')
Before:{15, 17, 19, 21, 23, 25, 27, 29}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
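One thing to watch with update() is that it iterates over whatever it receives, so a string is added character by character, whereas add() stores the string as a single item. The sketch below, using an illustrative set named letters, also shows that update() accepts several iterables in one call.

```python
letters = {'a'}

letters.update('bc')             # a string is an iterable: adds 'b' and 'c'
letters.update(['de'], ('f',))   # several iterables at once: adds 'de' and 'f'
letters.add('gh')                # add() stores the whole string as one item

print(sorted(letters))           # ['a', 'b', 'c', 'de', 'f', 'gh']
```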
discard() function
Items can be removed with the `.remove()` and `.discard()` methods. Both remove an element from the set, but `.remove()` raises a KeyError if the element does not exist, whereas `.discard()` does nothing in that case.
Syntax:
setname.discard(element)
# discard 49 from odd2
print(f'Before:{odd2}')
odd2.discard(49)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
# discard 36 from odd2
print(f'Before:{odd2}')
odd2.discard(36)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
remove() function
Use the remove() method to remove an element from the set. A KeyError is raised if the element you are trying to remove does not exist in the set.
Syntax:
setname.remove(element)
# remove element from odd2
print(f'Before:{odd2}')
odd2.remove(15)
print(f'After:{odd2}')
Before:{33, 35, 37, 39, 15, 17, 19, 21, 23, 25, 27, 29, 31}
After:{33, 35, 37, 39, 17, 19, 21, 23, 25, 27, 29, 31}
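To make the contrast between the two removal methods concrete, here is a minimal sketch on an illustrative set named primes. It shows discard() silently ignoring a missing element, remove() raising KeyError, and a membership check as one way to call remove() safely.

```python
primes = {2, 3, 5, 7}

primes.discard(11)                 # 11 is absent: nothing happens, no error

try:
    primes.remove(11)              # 11 is absent: raises KeyError
except KeyError as err:
    print(f'KeyError: {err}')

if 7 in primes:                    # guard the call when absence is possible
    primes.remove(7)

print(sorted(primes))              # [2, 3, 5]
```

Which method to use is a design choice: remove() is useful when a missing element signals a bug, discard() when absence is expected.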
Sets in Python support mathematical set operations like union, intersection, difference, and symmetric difference. These operations can be incredibly useful in data analysis for comparing datasets.
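union()
A union can be computed with the | operator or the union() method. In a union of two sets, the items of both sets are combined into a single set, and items present in both appear only once.
# create two sets, rain and bow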
rain={'Violet','Indigo','Blue','Green','Yellow'}
bow={'Blue','Green','Yellow','Orange','Red'}
# union operator with rain and bow.
rain | bow
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
# use the union() method with rain and bow
rain.union(bow)
{'Blue', 'Green', 'Indigo', 'Orange', 'Red', 'Violet', 'Yellow'}
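The operator and the method are not fully interchangeable. As a hedged illustration with invented names (warm, cool_list), union() accepts any iterables, while the | operator expects sets on both sides and raises TypeError otherwise:

```python
warm = {'Red', 'Orange'}
cool_list = ['Blue', 'Green']      # a list, not a set

# union() accepts one or more arbitrary iterables and returns a new set
combined = warm.union(cool_list, ('Violet',))
print(sorted(combined))            # ['Blue', 'Green', 'Orange', 'Red', 'Violet']

try:
    warm | cool_list               # the | operator requires a set on both sides
except TypeError as err:
    print('TypeError:', err)

print(sorted(warm))                # ['Orange', 'Red'] (warm itself is unchanged)
```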
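The intersection of two sets is the set formed from the items common to both. It can be computed with the & operator or the intersection() method. The intersection_update() method does the same but modifies the first set in place and returns None.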
# intersection of rain & bow
rain & bow
{'Blue', 'Green', 'Yellow'}
# intersection of rain and bow
rain.intersection(bow)
{'Blue', 'Green', 'Yellow'}
# intersection_update of rain and bow
print(rain.intersection_update(bow))
rain
None
{'Blue', 'Green', 'Yellow'}
# declare colors1,colors2,colors3.
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}
# intersection_update with multiple sets
colors1.intersection_update(colors2,colors3)
print(colors1)
{'Black', 'Grey', 'White', 'Brown'}
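When the sets live in a list, one possible approach is to unpack the list into set.intersection(), which returns a new set and leaves the inputs untouched. The data below (baskets) is invented for illustration, and the call assumes the list contains at least one set.

```python
# hypothetical data: items bought in three separate baskets
baskets = [
    {'milk', 'bread', 'eggs'},
    {'bread', 'eggs', 'jam'},
    {'eggs', 'bread', 'tea'},
]

# unpack the list so the first set is the receiver and the rest are arguments
common = set.intersection(*baskets)
print(sorted(common))        # ['bread', 'eggs']
print(len(baskets[0]))       # 3 (the original sets are not modified)
```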
The difference can be computed with the - operator, the difference() method, or the difference_update() method. The difference between two sets is computed by removing the items they have in common and keeping the remaining items of the first set only.
# for the table below, take the three sets as originally declared
colors1={'Black','Grey','White','Brown','Blue','Red'}
colors2={'Black','Grey','White','Brown','Indigo','Purple'}
colors3={'Black','Grey','White','Brown','Blue','Red','Green','Yellow'}
Input | Output |
---|---|
colors1-colors2 | {'Blue', 'Red'} |
colors2-colors1 | {'Indigo', 'Purple'} |
colors3-colors2 | {'Blue', 'Green', 'Red', 'Yellow'} |
colors2-colors3 | {'Indigo', 'Purple'} |
colors1-colors3 | set() |
colors3-colors1 | {'Green', 'Yellow'} |
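# difference between two sets with .difference()
# note: the outputs below use colors1 as left by intersection_update() earlier, i.e. without 'Blue' and 'Red'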
print(f'{colors1}\n{colors2}')
colors1.difference(colors2)
{'Brown', 'White', 'Black', 'Grey'}
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
set()
# difference between two sets with .difference_update()
print(f'{colors1}\n{colors2}')
colors1.difference_update(colors2)
print(colors1)
{'Brown', 'White', 'Black', 'Grey'}
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
set()
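Unlike union and intersection, difference depends on the order of the operands. The following sketch, on two small illustrative sets a and b, shows both directions and the fact that difference() accepts several iterables at once.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(sorted(a - b))                   # [1, 2]  items only in a
print(sorted(b - a))                   # [5]     items only in b

# difference() can subtract several iterables in one call
print(sorted(a.difference(b, [1])))    # [2]

# difference_update() changes the set in place and returns None
a.difference_update(b)
print(sorted(a))                       # [1, 2]
```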
Use the ^ operator or the symmetric_difference() method to compute symmetric differences. The symmetric difference is calculated by removing the items common to both sets and combining the remaining items into one set.
^ operator
The ^ operator computes the symmetric difference between two sets.
# symmetric difference between two sets with the ^ operator
# (colors1 is empty at this point, so the result equals colors2)
colors1^colors2
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
# symmetric difference with ^
colors2^colors3
{'Blue', 'Green', 'Indigo', 'Purple', 'Red', 'Yellow'}
# symmetric difference with ^
colors1^colors3
{'Black', 'Blue', 'Brown', 'Green', 'Grey', 'Red', 'White', 'Yellow'}
symmetric_difference()
This method returns the symmetric difference between the first and second sets as a new set, leaving both original sets unchanged.
set1.symmetric_difference(set2)
# symmetric difference between two sets with symmetric_difference()
colors1.symmetric_difference(colors2)
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
# symmetric difference between two sets with symmetric_difference()
colors2.symmetric_difference(colors1)
{'Black', 'Brown', 'Grey', 'Indigo', 'Purple', 'White'}
symmetric_difference_update()
This method replaces the contents of the first set with the symmetric difference of the two sets. It modifies the first set in place and returns None.
set1.symmetric_difference_update(set2)
# symmetric difference between two sets with symmetric_difference_update()
print(colors1)
colors1.symmetric_difference_update(colors2)
print(colors1)
set()
{'Grey', 'Brown', 'White', 'Indigo', 'Purple', 'Black'}
# symmetric difference between two sets with symmetric_difference_update()
print(colors2)
colors2.symmetric_difference_update(colors3)
print(colors2)
{'Brown', 'Black', 'Purple', 'Indigo', 'Grey', 'White'}
{'Blue', 'Indigo', 'Red', 'Green', 'Purple', 'Yellow'}
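The symmetric difference can also be expressed through the other operations, which is a handy way to check one's understanding. The sketch below uses two small illustrative sets, left and right, and verifies both identities.

```python
left = {'Black', 'Grey', 'Indigo'}
right = {'Black', 'Grey', 'Blue'}

sym = left ^ right
print(sorted(sym))                                # ['Blue', 'Indigo']

# everything in either set, minus what they share
print(sym == (left | right) - (left & right))     # True

# what is only in left, plus what is only in right
print(sym == (left - right) | (right - left))     # True

# unlike difference, the operand order does not matter
print(left ^ right == right ^ left)               # True
```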
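Sets can also be compared using the following operators:
Operator | Meaning |
---|---|
< | less than (proper subset) |
> | greater than (proper superset) |
<= | less than or equal to (subset) |
>= | greater than or equal to (superset) |
== | equal to (use != for not equal to) |
Let’s start by making some sets for this session.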
num={x for x in range(10)}
odd={x for x in range(10) if x%2!=0}
even={x for x in range(10) if x%2==0}
print(num)
print(odd)
print(even)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
{1, 3, 5, 7, 9}
{0, 2, 4, 6, 8}
The following four examples demonstrate the comparison operators on these sets. The num set contains the digits 0 to 9, and the odd and even sets contain the odd and even numbers from that same range.
# Is num a proper subset of odd?
num<odd
False
# Is num a proper superset of odd?
num>odd
True
# Is num a subset of or equal to even?
num <= even
False
# Is num a superset of or equal to even?
num >= even
True
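Set comparison is a partial ordering, which can surprise readers used to comparing numbers: two sets may be neither smaller, larger, nor equal. The sketch below rebuilds the same three sets and shows this, along with the difference between < and <= and the equivalent issubset() method.

```python
num = set(range(10))                      # 0 to 9
odd = {x for x in num if x % 2 != 0}
even = {x for x in num if x % 2 == 0}

# operator and method forms agree
print(odd <= num, odd.issubset(num))      # True True

# a set is a subset of itself, but not a proper subset
print(num <= num, num < num)              # True False

# odd and even share nothing, so none of the comparisons hold
print(odd < even, odd > even, odd == even)   # False False False
```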
Frozen sets are immutable sets. They are created by calling the frozenset() function on any iterable, such as a list, tuple, set, or dictionary.
# create frozenset
num_list=[1,2,3,4,5]
num_set=frozenset(num_list)
print(num_set)
frozenset({1, 2, 3, 4, 5})
Convert a tuple to a frozen set.
# typecast tuple to frozenset.
num_tuple=(1,2,3,4,5)
print( frozenset(num_tuple) )
frozenset({1, 2, 3, 4, 5})
Convert a normal set to a frozen set.
# typecast set to frozenset.
num_set={1,2,3,4,5}
print( frozenset(num_set) )
frozenset({1, 2, 3, 4, 5})
Convert a dictionary to a frozen set. Only the keys are used.
# typecast dictionary to frozenset.
fict={'A':1,'B':2,'C':3,'D':4,'E':5,'F':6}
print( frozenset(fict) )
frozenset({'B', 'D', 'E', 'F', 'A', 'C'})
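Immutability is what makes a frozen set hashable, so it can serve as a dictionary key or as an item of another set, which a normal set cannot. The following sketch illustrates this with invented names (pair, mix_names) and also shows what happens when you try to modify a frozen set.

```python
pair = frozenset({'Blue', 'Red'})

# a frozenset can be a dictionary key; item order does not matter for lookup
mix_names = {pair: 'Purple'}
print(mix_names[frozenset(['Red', 'Blue'])])   # Purple

# frozensets have no mutating methods such as add()
try:
    pair.add('Green')
except AttributeError as err:
    print('AttributeError:', err)

# a normal set is unhashable, so it cannot be nested inside another set
try:
    nested = {{'Blue', 'Red'}}
except TypeError as err:
    print('TypeError:', err)

nested = {pair}                                # a frozenset can be nested
print(len(nested))                             # 1
```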
Functions | Explanations |
---|---|
clear() | Removes all items from the set. |
copy() | Returns a shallow copy of the set. |
isdisjoint() | Returns True if the two sets have no items in common. |
issubset() | Returns True if every item of the set is also in the other set. |
issuperset() | Returns True if the set contains every item of the other set. |
pop() | Removes and returns an arbitrary item from the set; raises KeyError if the set is empty. |
remove() | Removes the specified item from the set; raises KeyError if it is not present. |
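A short sketch of several of these methods in action follows. The set names are illustrative, and because pop() removes an arbitrary item, the example checks the removed value against a copy rather than assuming which one it is.

```python
odd = {1, 3, 5}
even = {0, 2, 4}

backup = odd.copy()                   # independent shallow copy
print(odd.isdisjoint(even))           # True, no items in common
print({1, 3}.issubset(odd))           # True
print(odd.issuperset({1, 3}))         # True

removed = odd.pop()                   # removes and returns an arbitrary item
print(removed in backup, len(odd))    # True 2

odd.clear()                           # empties the set in place
print(odd, sorted(backup))            # set() [1, 3, 5]
```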