The Matplotlib library helps you create static and dynamic visualizations, where dynamic visualizations are those that are animated or interactive. The library makes it easy to plot data and create graphs.
import matplotlib
# install
install.packages("ggplot2")
# Load package
library("ggplot2")
import matplotlib.pyplot as plt
Let’s use the plot function, to which we pass two columns of data: one for the x-axis and one for the y-axis.
import matplotlib.pyplot as plt
# import data preparation library
import numpy as np
# data handling library
import pandas as pd
bill=[25,30,40,50,70,85,95]
tips=[4,6,6.5,7,8.5,9.3,10.8]
print(len(bill),len(tips))
plt.plot(bill,tips)
plt.title('Bill To Tip Ratio')
plt.ylabel('Tips')
plt.xlabel('Bills')
plt.show()
# Load package
library("ggplot2")
df<- data.frame(bill=c(25,30,40,50,70,85,95),tips=c(4,6,6.5,7,8.5,9.3,10.8))
library(ggplot2)
# Line plot
ggplot(data=df, aes(x=bill, y=tips, group=1)) + geom_line() + geom_point()
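The text above speaks of passing two columns to the plot function, while the Python example uses plain lists. A minimal sketch of the same line plot drawn from DataFrame columns follows. The name df and the column names bill and tips are chosen here for illustration only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same bill and tip values as above, held in a DataFrame (df is an illustrative name)
df = pd.DataFrame({
    "bill": [25, 30, 40, 50, 70, 85, 95],
    "tips": [4, 6, 6.5, 7, 8.5, 9.3, 10.8],
})

fig, ax = plt.subplots()
# Pass two columns: the first goes on the x-axis, the second on the y-axis
line, = ax.plot(df["bill"], df["tips"])
ax.set_title("Bill To Tip Ratio")
ax.set_xlabel("Bills")
ax.set_ylabel("Tips")
```

Calling plt.show() afterwards displays the figure.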
A bar plot represents data as rectangular bars whose lengths are scaled to the values they stand for. Each bar represents an individual value or variable.
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
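figure,axes=plt.subplots(1,1)
# one bar per department, with its height set by the employee count
Department=['Finance','Manager','Executive','Worker']
employees=np.array([20,50,8,100])
axes.bar(Department,employees)
axes.set_title('Bar plot')
axes.set_xlabel('Department')
axes.set_ylabel('Employees')
plt.show()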
library("ggplot2")
df<-data.frame(Department=c('Finance','Manager','Executive'),Employees=c(20,50,8))
ggplot(data=df,aes(x=Department,y=Employees))+geom_bar(stat='identity')
Histograms are similar to bar plots. They group values into ranges and show how many values fall into each range as a rectangular bar. These ranges are called bins.
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
figure,axes=plt.subplots(1,1)
employees=np.array([20,50,8,100])
axes.hist(employees,bins='auto')
axes.set_title('Histogram')
axes.set_xlabel('Employee count')
axes.set_ylabel('Frequency')
plt.show()
library("ggplot2")
df<-c(20,50,8,100)
hist(df,xlab="Employee count",col="green",border="black")
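To see how values are grouped into bins, the counts behind a histogram can be computed directly with NumPy's np.histogram. This is a small sketch on the same employee data. The choice of bins=4 is an assumption made for illustration, whereas the example above lets Matplotlib pick the bins with bins='auto'.

```python
import numpy as np

employees = np.array([20, 50, 8, 100])

# bins=4 splits the range from 8 to 100 into four equal-width bins (illustrative choice)
counts, edges = np.histogram(employees, bins=4)

# edges holds the bin boundaries: 8, 31, 54, 77, 100
print(edges)
# counts holds how many values fall into each bin: 2, 1, 0, 1
print(counts)
```

The values 8 and 20 land in the first bin, 50 in the second, nothing in the third, and 100 in the last, which is exactly what the bar heights of the histogram would show.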
A scatter plot shows the exact position of each data point. Using scatter plots, we can find clusters in the data. We can also use a third variable to set the size of each marker, which adds a third dimension to the graph.
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
Department=['Finance','Manager','Executive','Worker']
employees=[20,50,8,100]
salary=[75,165,469,25]
figure=plt.figure()
axes=figure.add_axes([0,0,1,1])
axes.scatter(Department,employees,color='b')
axes.set_xlabel('Department')
axes.set_ylabel('Employees')
axes.set_title('Scatter_plot')
plt.show()
library("ggplot2")
df<-data.frame(Department=c('Finance','Manager','Executive','Worker'),Employees=c(20,50,8,100),salary=c(75,165,469,25))
ggplot(df,aes(x=Department,y=Employees))+geom_point(size=2,shape=46)
In the following code, we use the third variable, salary, to set the size of each marker and so differentiate the data points from each other.
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
Department=['Finance','Manager','Executive','Worker']
employees=[20,50,8,100]
salary=[75,165,469,25]
figure=plt.figure()
axes=figure.add_axes([0,0,1,1])
axes.scatter(Department,employees,s=salary,color='b')
axes.set_xlabel('Department')
axes.set_ylabel('Employees')
axes.set_title('Scatter_plot')
plt.show()
library("ggplot2")
df<-data.frame(Department=c('Finance','Manager','Executive','Worker'),Employees=c(20,50,8,100),salary=c(75,165,469,25))
ggplot(df, aes(x=Department, y=Employees, size=salary, color=Department)) +
geom_point()
Matplotlib supports subplots, which arrange several plots in a grid within a single figure.
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
Department=['Finance','Manager','Executive','Worker']
employees=[20,50,8,100]
salary=[75,165,469,25]
# first cell of a 1x2 grid: bar plot of salary
plt.subplot(1,2,1)
plt.bar(Department,salary,width=0.5)
plt.xlabel("Department")
plt.ylabel("Salary")
# second cell of the same grid: scatter plot sized by salary
plt.subplot(1,2,2)
plt.scatter(Department,employees,s=salary,color='b')
plt.xlabel('Department')
plt.ylabel('Employees')
plt.title('Scatter_plot')
plt.show()
library("ggplot2")
df<-data.frame(Department=c('Finance','Manager','Executive','Worker'),Employees=c(20,50,8,100),salary=c(75,165,469,25))
ggplot(data=df,aes(x=Department,y=Employees))+geom_bar(stat='identity')
ggplot(df, aes(x=Employees, y=salary, size=salary, color=Department))+geom_point()
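A grid of plots can also be built with plt.subplots(), which returns the figure and an array of axes in one call. The sketch below is one possible layout, not the only one. The figsize value and the tight_layout() call are choices made here for readability.

```python
import matplotlib.pyplot as plt

department = ['Finance', 'Manager', 'Executive', 'Worker']
employees = [20, 50, 8, 100]
salary = [75, 165, 469, 25]

# One row, two columns: axes[0] is the left plot, axes[1] the right
figure, axes = plt.subplots(1, 2, figsize=(10, 4))

axes[0].bar(department, salary, width=0.5)
axes[0].set_xlabel('Department')
axes[0].set_ylabel('Salary')

# s=salary scales each marker by the salary value
axes[1].scatter(department, employees, s=salary, color='b')
axes[1].set_xlabel('Department')
axes[1].set_ylabel('Employees')

figure.tight_layout()  # keep the labels of the two plots from overlapping
```

Calling plt.show() afterwards displays both plots side by side. Each cell of the grid has its own axes object, so every plot gets its own labels and title.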
We have saved the stock data in a pandas DataFrame built from a dictionary of lists. Here we plot the data points with plt.plot() and plt.scatter(), which together create connected scatter plots. Using the following method, you can mix multiple types of plots in a single graph.
# load libraries
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import numpy as np
# create dataset
Stocks=pd.DataFrame({'Year':[2022,2021,2020,2019,2018],'Motors_Turnover':[47263.68,47031.47,43928.17,69202.76,58831.41],'Steel_Turnover':[129021.35,64869.00,60435.97,70610.92,59160.79]})
# plot multiple lines in one plot
plt.plot(Stocks['Year'],Stocks['Motors_Turnover'],label="Tata Motors",color='r')
plt.scatter(Stocks['Year'],Stocks['Motors_Turnover'], color='r')
plt.plot(Stocks['Year'],Stocks['Steel_Turnover'],label="Tata Steel",color='b')
plt.scatter(Stocks['Year'],Stocks['Steel_Turnover'],color='b')
plt.legend()
plt.xlabel('Years')
plt.ylabel('Net TurnOver (Crores)')
plt.title('Information')
plt.show()
# Load library
library("ggplot2")
# load dataset
stock<-data.frame(Year=c(2022,2021,2020,2019,2018),Motors_Turnover=c(47263.68,47031.47,43928.17,69202.76,58831.410),Steel_Turnover=c(129021.35,64869.00,60435.97,70610.92,59160.79))
# Draw both lines in one plot and save it in stock_plot
stock_plot<- ggplot(stock, aes(Year))+geom_line(aes(y=Motors_Turnover),color = "green") + geom_line(aes(y = Steel_Turnover), color = "blue")
stock_plot
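A connected scatter plot can also be drawn with a single call per series by passing a marker to plot(). This is a sketch of that alternative rather than the method used above. The short legend labels and the set_xticks() call are assumptions added for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

stocks = pd.DataFrame({
    'Year': [2022, 2021, 2020, 2019, 2018],
    'Motors_Turnover': [47263.68, 47031.47, 43928.17, 69202.76, 58831.41],
    'Steel_Turnover': [129021.35, 64869.00, 60435.97, 70610.92, 59160.79],
})

fig, ax = plt.subplots()
# marker='o' draws a point at every data value on top of the connecting line
ax.plot(stocks['Year'], stocks['Motors_Turnover'], marker='o', color='r', label='Motors')
ax.plot(stocks['Year'], stocks['Steel_Turnover'], marker='o', color='b', label='Steel')
ax.set_xticks(list(stocks['Year']))  # show only whole years on the x-axis
ax.set_xlabel('Years')
ax.set_ylabel('Net TurnOver (Crores)')
ax.legend()
```

Calling plt.show() afterwards displays the figure. Separate plot() and scatter() calls remain useful when the points need styling that differs from the line, such as marker sizes driven by another column.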