Basic plots with Seaborn

Seaborn is built on top of Matplotlib. Matplotlib handles the low-level drawing, while Seaborn adds a high-level interface for statistically informative graphics.

Installation

The following commands install Seaborn.

Python

# command prompt
pip install seaborn
# JupyterLab / Colab
!pip install seaborn

Run the first command at the command prompt. In Jupyter or Colab, run the second form, prefixed with an exclamation mark, in a code cell.
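To confirm that the installation worked, a small check like the one below can help. It is only a sketch: the helper name is_installed is made up for this example, and it relies on find_spec from the standard-library importlib.util module to test whether a package can be imported.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if the named top-level package can be imported.

    is_installed is a hypothetical helper written for this example.
    """
    return find_spec(package_name) is not None


# Report the status of the libraries used in this post
for name in ("seaborn", "matplotlib", "numpy", "pandas"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If any library is reported as missing, rerun the pip command for that library.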

Importing

When importing Seaborn, you should also import Matplotlib's pyplot module, which is used to display and customize the plots. The following code imports pandas, numpy, seaborn, and matplotlib.pyplot for dataset handling, mathematical functions, and visualization.

Python

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

KDE plots with Distribution Densities

KDE (kernel density estimate) plots show the distribution of the data as a smooth density curve. In the following code, we plot KDEs of artificially generated data.

Python

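# The covariance matrix must be symmetric and positive semi-definite;
# the values below are an illustrative choice that meets both conditions
data = np.random.multivariate_normal([2, 3], [[4, 3], [3, 9]], size=2000)
data = pd.DataFrame(data, columns=['A', 'B'])
sns.set(rc={'figure.figsize':(11.7,8.27)})
# fill=True replaces the deprecated shade=True (seaborn 0.11 and later)
sns.kdeplot(data['A'], fill=True)
sns.kdeplot(data['B'], fill=True)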
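To see what a KDE curve represents, here is a minimal sketch that computes one by hand with NumPy. It assumes a Gaussian kernel and Scott's rule of thumb for the bandwidth; the function name gaussian_kde_1d and the sample data are made up for this example, and Seaborn's own implementation may differ in its details.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate on the points in grid.

    Hypothetical helper for illustration only.
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Scott's rule of thumb for one-dimensional data
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # One Gaussian bump per sample, evaluated at every grid point
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the bumps and rescale so the curve integrates to 1
    return kernels.sum(axis=1) / (n * bandwidth)


rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=2.0, size=500)  # synthetic data
grid = np.linspace(-10, 14, 481)
density = gaussian_kde_1d(samples, grid)

# A density curve should enclose an area of roughly 1
area = density.sum() * (grid[1] - grid[0])
print(f"area under the curve: {area:.3f}")
```

Each data point contributes a small bell curve, and the KDE is their average. A larger bandwidth gives a smoother curve, a smaller one follows the data more closely.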

Distplot

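Distplot draws a histogram with a trend line (a KDE curve) and is useful for studying continuous data. Note that sns.distplot has been deprecated since seaborn 0.11; sns.histplot with kde=True draws a similar plot, so we use it here.

Python

sns.set(rc={'figure.figsize':(11.7,8.27)})
# histplot with kde=True replaces the deprecated distplot
sns.histplot(data['A'], kde=True)
sns.histplot(data['B'], kde=True)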

KDE plot 2D

We can also use KDE to plot the joint (bivariate) distribution of two variables.

Python

sns.kdeplot(x=data['A'], y=data['B'])

Python

with sns.axes_style('white'):
    # recent seaborn versions require keyword arguments here
    sns.jointplot(x="A", y="B", data=data, kind='kde')

Python

with sns.axes_style('white'):
    # recent seaborn versions require keyword arguments here
    sns.jointplot(x="A", y="B", data=data, kind="hex")

Pairplot

Let’s use the planets dataset that ships with Seaborn to render multiple plots in a single figure. Pair plots help in understanding the pairwise relationships among multiple variables.

Python

planets = sns.load_dataset("planets")
planets.head()

method	number	orbital_period	mass	distance	year
Radial Velocity	1	269.300	7.10	77.40	2006
Radial Velocity	1	874.774	2.21	56.95	2008
Radial Velocity	1	763.000	2.60	19.84	2011
Radial Velocity	1	326.030	19.40	110.62	2007
Radial Velocity	1	516.220	10.50	119.47	2009

Python

# height replaces the older size parameter
sns.pairplot(planets, hue='year', height=2.5)

Jointplot

In a joint plot, we get a bivariate graph plus two marginal graphs, one along each axis, that show each variable’s distribution (as histograms by default).

In total, you get three graphs:

One shows the relationship between the two variables.

The other two show each variable individually.

Python

# duration and waiting are columns of the geyser dataset, not of planets
geyser = sns.load_dataset('geyser')
sns.jointplot(x="duration", y="waiting", data=geyser, kind='reg')

kind parameter

We can pass different values to the kind parameter of the joint plot. As an example, we will use kind='hex'.

Python

geyser=sns.load_dataset('geyser')
with sns.axes_style('white'):
    # recent seaborn versions require keyword arguments here
    sns.jointplot(x="duration", y="waiting", data=geyser, kind='hex')

Factor Plot

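For factor plots, we will use the taxis dataset, which is multivariate and contains several data types. This dataset was originally published by the NYC Taxi and Limousine Commission (TLC). Note that sns.factorplot was renamed to sns.catplot in seaborn 0.9, and recent releases only provide catplot.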

Python

taxis = sns.load_dataset('taxis')
taxis.head(5)

pickup	dropoff	passengers	distance	fare	tip	total	color	payment	pickup_zone	dropoff_zone	pickup_borough	dropoff_borough
3/23/2019 20:21	3/23/2019 20:27	1	1.6	7	2.15	12.95	yellow	credit card	Lenox Hill West	UN/Turtle Bay South	Manhattan	Manhattan
3/4/2019 16:11	3/4/2019 16:19	1	0.79	5	0	9.3	yellow	cash	Upper West Side South	Upper West Side South	Manhattan	Manhattan
3/27/2019 17:53	3/27/2019 18:00	1	1.37	7.5	2.36	14.16	yellow	credit card	Alphabet City	West Village	Manhattan	Manhattan
3/10/2019 01:23	3/10/2019 01:49	1	7.7	27	6.15	36.95	yellow	credit card	Hudson Sq	Yorkville West	Manhattan	Manhattan
3/30/2019 13:27	3/30/2019 13:37	3	2.16	9	1.1	13.4	yellow	credit card	Midtown East	Yorkville West	Manhattan	Manhattan

Python

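with sns.axes_style(style='ticks'):
    # catplot replaces factorplot; as a figure-level function it is sized with height and aspect
    g = sns.catplot(x="color", y="fare", hue="pickup_borough", data=taxis, kind="box", height=8.27, aspect=11.7/8.27)
    # x shows the taxi color and y shows the fare
    g.set_axis_labels("Taxi color", "Fare")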
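To read a box plot, it helps to know the numbers behind it. The sketch below computes them with pandas on a tiny made-up table (the fares are invented, not taken from the taxis dataset). It assumes the usual convention that points more than 1.5 times the interquartile range above the upper quartile are drawn as outliers.

```python
import pandas as pd

# Tiny synthetic table; these fares are made up for illustration
rides = pd.DataFrame({
    "color": ["yellow"] * 5 + ["green"] * 5,
    "fare": [5.0, 7.0, 9.0, 11.0, 13.0, 4.0, 6.0, 8.0, 10.0, 30.0],
})

# The box spans the 25% to 75% quantiles, with a line at the median (50%)
summary = rides.groupby("color")["fare"].describe()[["25%", "50%", "75%"]]
summary["iqr"] = summary["75%"] - summary["25%"]
# Points above this fence are drawn individually as outliers
summary["upper_fence"] = summary["75%"] + 1.5 * summary["iqr"]
print(summary)

# Match each ride with the fence of its own group and keep the rides beyond it
fences = rides["color"].map(summary["upper_fence"])
outliers = rides[rides["fare"] > fences]
print(outliers)
```

Here the green fare of 30 lies above its fence of 16, so a box plot would show it as a separate point.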

Time Series Analysis

To illustrate time series analysis, we use the healthexp dataset, which records health spending and life expectancy by country and year. The first plot shows how health spending (in USD) has increased over the years, and the second shows life expectancy.

Python

lifeexp=sns.load_dataset('healthexp')
lifeexp.head(1)

Year	Country	Spending_USD	Life_Expectancy
1970	Germany	252.311	70.6

Python

with sns.axes_style('white'):
    # catplot replaces factorplot; kind="point" matches the old factorplot default
    g = sns.catplot(x="Year", y="Spending_USD", data=lifeexp, kind="point", aspect=4.0)
    g.set_xticklabels(step=5)

Python

with sns.axes_style('white'):
    # same plot type as above, this time for life expectancy
    g = sns.catplot(x="Year", y="Life_Expectancy", data=lifeexp, kind="point", aspect=2.0)
    g.set_xticklabels(step=5)
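Because the dataset has one row per country and year, a point plot does not draw every row. By default it draws the mean of all rows that share the same x value, with error bars around it. The sketch below reproduces that aggregation with pandas on a tiny made-up table; the country names and the numbers are placeholders.

```python
import pandas as pd

# Tiny synthetic table shaped like the healthexp dataset; the values are invented
spending = pd.DataFrame({
    "Year": [1970, 1970, 1971, 1971],
    "Country": ["A", "B", "A", "B"],
    "Spending_USD": [100.0, 200.0, 150.0, 250.0],
})

# One mean per year across all countries, which is what each point represents
yearly_mean = spending.groupby("Year")["Spending_USD"].mean()
print(yearly_mean)
```

To follow each country separately instead, pass hue="Country" to the plotting call.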
