What if we could learn topics in an enjoyable way? What if we could learn to write Python code from gamers' data?
Who doesn’t love playing video games? PUBG, Call of Duty, and Ludo are among the first names that come to mind when we hear the words "video game". Most of us have played a favourite video game at some point in our lives, which is why this data will feel familiar.
We are providing you with a sample dataset of users' gaming sessions.
Each row records the user (User ID), how many members joined the session (that is, whether the user played alone or with friends), how long they played, which special weapon or tool they used, and so on. Isn’t it fascinating to analyze activities that look familiar?
Let’s imagine the data generated by video gamers, as shown in the picture below.
We are asking you to solve 10 analytical questions of the kind stakeholders ask their data analysts and data scientists in real-life projects. Of these 10 questions, 5 are easy, 3 are medium, and 2 are hard.
Please use the following Python code to generate sample gamer data. You may need to install the Faker library if it’s not already installed in your environment.
CONDA: conda install -c conda-forge Faker
PIP: pip3 install Faker
pip install Faker
import pandas as pd
from faker import Faker
import random
# Create a Faker instance to generate fake data
fake = Faker()
# Initialize empty lists for each column
user_ids = []
game_names = []
joined_members = []
special_weapon_used = []
devices = []
genders = []
time_of_playing = []
session_duration = []
value_added_services = []
# Generate 300 rows of data
for _ in range(300):
    user_ids.append(fake.random_int(min=1, max=1000))
    game_names.append(fake.random_element(elements=['Call of Duty', 'Fortnite', 'Apex Legends', 'Valorant', 'Overwatch', 'PUBG', 'Minecraft', 'Warzone']))
    joined_members.append(fake.random_int(min=1, max=6))
    special_weapon_used.append(fake.random_element(elements=['Rocket Launcher', 'Sniper Rifle', 'Grenade Launcher', 'None']))
    devices.append(fake.random_element(elements=['PC', 'Console', 'Mobile']))
    genders.append(fake.random_element(elements=['Male', 'Female']))
    time_of_playing.append(fake.date_time_this_year())
    session_duration.append(fake.random_int(min=30, max=120))
    value_added_services.append(fake.random_element(elements=['Yes', 'No']))
# Create a Pandas DataFrame
data = {
    'UserID': user_ids,
    'Game Name': game_names,
    'Joined Members': joined_members,
    'Special Weapon Used': special_weapon_used,
    'Device Played On': devices,
    'Gender': genders,
    'Time of Playing': time_of_playing,
    'Session Duration (minutes)': session_duration,
    'Value Added Services Bought': value_added_services
}
df = pd.DataFrame(data)
The resulting DataFrame has 300 rows × 9 columns.
Try solving these questions by writing Python code. Learn basic Python concepts in small pieces and apply them to the questions below; along the way you will find yourself getting better at both Python and data analytics.
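Because the generator draws random values, every run produces a different table, so the numbers printed in the rest of this post will not match yours exactly. Here is a minimal sketch of how to make a run repeatable. It assumes a fixed seed of 42 (an arbitrary choice) and uses Python's standard random module in place of Faker to keep the example small; make_sessions is a hypothetical helper, not part of the code above. With Faker itself, calling Faker.seed(42) before generating the data serves the same purpose.

```python
import random

import pandas as pd


def make_sessions(n_rows, seed):
    """Build a small gamers table; the same seed always yields the same rows."""
    rng = random.Random(seed)  # private generator, so other code is unaffected
    return pd.DataFrame({
        'UserID': [rng.randint(1, 1000) for _ in range(n_rows)],
        'Gender': [rng.choice(['Male', 'Female']) for _ in range(n_rows)],
        'Session Duration (minutes)': [rng.randint(30, 120) for _ in range(n_rows)],
    })


first_run = make_sessions(5, seed=42)
second_run = make_sessions(5, seed=42)

# Identical seed, identical data
print(first_run.equals(second_run))
```

Fixing the seed is mainly useful when you want to compare your answers with someone else's or rerun a notebook without the results shifting.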
Gaming has become a global phenomenon, and players of all genders are a part of this exciting world. By delving into the data, we get a snapshot of the gender diversity in the gaming community.
# Calculate the percentage of male and female players
gender_counts = df['Gender'].value_counts()
total_players = len(df)
percentage_male = (gender_counts['Male'] / total_players) * 100
percentage_female = (gender_counts['Female'] / total_players) * 100
print(f"Percentage of Male Players: {percentage_male:.2f}%")
print(f"Percentage of Female Players: {percentage_female:.2f}%")
Percentage of Male Players: 53.00%
Percentage of Female Players: 47.00%
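The same percentages can be obtained in one step with value_counts(normalize=True), which returns shares instead of raw counts. A small sketch on a hypothetical four-row sample (not the generated dataset), so the result is easy to check by hand:

```python
import pandas as pd

# Hypothetical mini-sample standing in for the 'Gender' column of df
sample = pd.DataFrame({'Gender': ['Male', 'Female', 'Male', 'Male']})

# normalize=True returns fractions that sum to 1; multiply by 100 for percentages
gender_share = sample['Gender'].value_counts(normalize=True) * 100

for gender, pct in gender_share.items():
    print(f"Percentage of {gender} Players: {pct:.2f}%")
```

This version also keeps working if a gender is missing from the data, whereas looking up gender_counts['Male'] directly would raise a KeyError in that case.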
# pandas was already imported above; df is the DataFrame generated earlier
import pandas as pd
# Count the number of players using "Rocket Launcher" as a special weapon for each game
rocket_launcher_counts = df[df['Special Weapon Used'] == 'Rocket Launcher']['Game Name'].value_counts()
# Find the game with the highest number of players using "Rocket Launcher"
most_popular_game = rocket_launcher_counts.idxmax()
player_count = rocket_launcher_counts.max()
print(f"The game with the highest number of players using 'Rocket Launcher' is {most_popular_game} with {player_count} players.")
The game with the highest number of players using 'Rocket Launcher' is Call of Duty with 15 players.
Gaming platforms have come a long way, from classic consoles to mobile devices and powerful gaming PCs. In this section, we identify the go-to platform for gamers and discover where the action is happening.
# Count the number of players on each gaming platform
platform_counts = df['Device Played On'].value_counts()
# Find the most popular gaming platform and its player count
most_popular_platform = platform_counts.idxmax()
player_count = platform_counts.max()
print(f"The most popular gaming platform is {most_popular_platform} with {player_count} players.")
The most popular gaming platform is PC with 103 players.
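One caveat worth keeping in mind: each row of the dataset is a gaming session, and the same UserID can appear in several rows, so value_counts() counts sessions rather than distinct players. A sketch of the difference, using a hypothetical six-row sample rather than the generated data:

```python
import pandas as pd

# Hypothetical sample: user 1 plays twice on PC, user 3 plays on two devices
sample = pd.DataFrame({
    'UserID': [1, 1, 2, 3, 3, 3],
    'Device Played On': ['PC', 'PC', 'PC', 'Mobile', 'Mobile', 'Console'],
})

# Number of sessions (rows) per platform
sessions_per_platform = sample['Device Played On'].value_counts()

# Number of distinct users per platform
users_per_platform = sample.groupby('Device Played On')['UserID'].nunique()

print(sessions_per_platform)
print(users_per_platform)
```

Which of the two is the right answer depends on what the stakeholder means by "players", so it is worth asking before reporting a number.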
Late-night gaming sessions are a common sight, but how many players are truly night owls? We explore the gaming habits and uncover the nocturnal gamers.
# Filter players who played after 9 PM
late_night_players = df[df['Time of Playing'].dt.hour >= 21]
player_count = len(late_night_players)
print(f"{player_count} players were playing games after 9 PM.")
50 players were playing games after 9 PM.
The thrill of gaming often lies in the duration of gameplay. By analyzing session durations, we reveal the games that keep players engaged for extended periods.
# Calculate the average session duration for all players
average_duration = df['Session Duration (minutes)'].mean()
# Find the game with the longest average session duration
game_with_longest_duration = df.groupby('Game Name')['Session Duration (minutes)'].mean().idxmax()
print(f"The average session duration for all players is {average_duration:.2f} minutes.")
print(f"The game with the longest average session duration is {game_with_longest_duration}.")
The average session duration for all players is 73.43 minutes.
The game with the longest average session duration is Overwatch.
Are there gender-related preferences when it comes to in-game weaponry? We explore the choices made by male and female gamers to uncover any distinctions.
# Filter the dataset for male and female players
male_players = df[df['Gender'] == 'Male']
female_players = df[df['Gender'] == 'Female']
# Calculate the most commonly used special weapon for each gender
most_common_weapon_male = male_players['Special Weapon Used'].mode().values[0]
most_common_weapon_female = female_players['Special Weapon Used'].mode().values[0]
print(f"Among male players, the most commonly used special weapon is {most_common_weapon_male}.")
print(f"Among female players, the most commonly used special weapon is {most_common_weapon_female}.")
Among male players, the most commonly used special weapon is Grenade Launcher.
Among female players, the most commonly used special weapon is Grenade Launcher.
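The weapon list in the generator includes the string 'None', which means "no special weapon" and can easily come out as the most common value. A sketch of one way to handle this, assuming we want to ignore those sessions: filter them out, then build a gender-by-weapon table with pd.crosstab. The six-row sample is hypothetical.

```python
import pandas as pd

# Hypothetical sample; 'None' here is a text label, not Python's None
sample = pd.DataFrame({
    'Gender': ['Male', 'Male', 'Male', 'Female', 'Female', 'Female'],
    'Special Weapon Used': ['None', 'None', 'Sniper Rifle',
                            'Rocket Launcher', 'Rocket Launcher', 'None'],
})

# Keep only sessions where a special weapon was actually used
armed = sample[sample['Special Weapon Used'] != 'None']

# Rows are genders, columns are weapons, cells are session counts
weapon_table = pd.crosstab(armed['Gender'], armed['Special Weapon Used'])

# For each gender, the weapon (column) with the highest count
top_weapon = weapon_table.idxmax(axis=1)

print(weapon_table)
print(top_weapon)
```

The full table is often more informative than a single winner, because it shows how close the runner-up is.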
The clock never stops ticking in the gaming world. We analyze the peak hours for gaming sessions to understand when players are most active.
# Extract the hour from the "Time of Playing" column
df['Hour of Playing'] = df['Time of Playing'].dt.hour
# Find the hour with the highest number of players
peak_hour = df['Hour of Playing'].mode().values[0]
player_count = len(df[df['Hour of Playing'] == peak_hour])
print(f"The peak hour for gaming sessions is {peak_hour}:00 with {player_count} players online.")
The peak hour for gaming sessions is 3:00 with 19 players online.
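Note that mode() returns every value tied for first place, sorted in ascending order, so taking .values[0] quietly reports only the earliest of the tied hours. A small sketch, on a hypothetical series of hours, of how to list all the tied peak hours instead:

```python
import pandas as pd

# Hypothetical hours: 3 and 21 both occur twice
hours = pd.Series([3, 3, 21, 21, 9], name='Hour of Playing')

# Count sessions per hour, then keep every hour that reaches the maximum count
hour_counts = hours.value_counts()
busiest = hour_counts[hour_counts == hour_counts.max()].index.sort_values().tolist()

print(busiest)
```

With only 300 sessions spread over 24 hours, ties like this are quite plausible in the generated data.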
Part 2: Challenging Insights
In the digital age, value-added services are often a part of the gaming experience. We delve into the world of in-game purchases to identify the most service-savvy gamers.
# Calculate the percentage of players who bought value-added services for each game
service_percentage = (df.groupby('Game Name')['Value Added Services Bought']
.apply(lambda x: (x == 'Yes').mean() * 100)
.sort_values(ascending=False))
# Find the game with the highest percentage of players buying services
highest_service_game = service_percentage.idxmax()
highest_service_percentage = service_percentage.max()
print(f"The game with the highest percentage of players buying services is {highest_service_game} with {highest_service_percentage:.2f}%.")
The game with the highest percentage of players buying services is Call of Duty with 59.52%.
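A percentage on its own hides how many sessions it is based on: a game with very few rows can top the ranking by chance. A sketch, assuming we want the session count next to each percentage, using named aggregation on a hypothetical five-row sample (the helper column 'Bought' is introduced here for illustration):

```python
import pandas as pd

# Hypothetical sample: Minecraft has a single session
sample = pd.DataFrame({
    'Game Name': ['PUBG', 'PUBG', 'PUBG', 'PUBG', 'Minecraft'],
    'Value Added Services Bought': ['Yes', 'No', 'Yes', 'No', 'Yes'],
})

# Boolean helper column: True where a service was bought
sample['Bought'] = sample['Value Added Services Bought'] == 'Yes'

# One row per game: number of sessions and share of sessions with a purchase
summary = sample.groupby('Game Name').agg(
    sessions=('Bought', 'size'),
    pct_yes=('Bought', 'mean'),
)
summary['pct_yes'] = summary['pct_yes'] * 100

print(summary.sort_values('pct_yes', ascending=False))
```

Here Minecraft shows 100% but from one session only, which is exactly the kind of result a stakeholder should see with its sample size attached.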
Do session durations vary significantly between male and female players? We conduct a deep dive into the data to reveal any gender-based differences in gaming habits.
# Calculate the average session duration for each gender
average_duration_male = df[df['Gender'] == 'Male']['Session Duration (minutes)'].mean()
average_duration_female = df[df['Gender'] == 'Female']['Session Duration (minutes)'].mean()
print(f"Average session duration for males: {average_duration_male:.2f} minutes.")
print(f"Average session duration for females: {average_duration_female:.2f} minutes.")
Average session duration for males: 74.00 minutes.
Average session duration for females: 72.78 minutes.
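Comparing two averages does not by itself tell us whether the gap is "significant". One simple way to check, sketched here as an illustration rather than as the method used in this post, is a permutation test: shuffle the durations between the two groups many times and see how often a gap at least as large as the observed one appears by chance. The function permutation_p_value, the number of permutations, and the eight-row sample are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to two groups of the original sizes
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if diff >= observed:
            hits += 1
    return hits / n_permutations


# Hypothetical sample with nearly identical group averages
sample = pd.DataFrame({
    'Gender': ['Male'] * 4 + ['Female'] * 4,
    'Session Duration (minutes)': [70, 80, 75, 72, 71, 79, 76, 73],
})
male = sample.loc[sample['Gender'] == 'Male', 'Session Duration (minutes)']
female = sample.loc[sample['Gender'] == 'Female', 'Session Duration (minutes)']

p_value = permutation_p_value(male, female)
print(f"Permutation p-value: {p_value:.3f}")
```

A large p-value means a gap of this size shows up easily by chance. Since the generator above assigns gender and duration independently, that is the outcome we would expect on the generated data as well.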
The clock becomes our ally as we examine player behaviour throughout the day. By analyzing time-based patterns, we unveil the ebb and flow of the gaming world, revealing when gamers are most and least active.
# Group players by the hour of playing and count the number of players for each hour
hourly_player_counts = df.groupby('Hour of Playing')['UserID'].count()
# Find the most popular and least popular hours for gaming sessions
most_popular_hour = hourly_player_counts.idxmax()
least_popular_hour = hourly_player_counts.idxmin()
print(f"The most popular hour for gaming sessions is {most_popular_hour}:00.")
print(f"The least popular hour for gaming sessions is {least_popular_hour}:00.")
The most popular hour for gaming sessions is 3:00.
The least popular hour for gaming sessions is 16:00.
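A subtle point about the least popular hour: groupby only returns hours that appear in the data, so an hour with no sessions at all would never be reported by idxmin(). A sketch of one way to cover all 24 hours, using reindex with fill_value=0 on a hypothetical four-row sample:

```python
import pandas as pd

# Hypothetical sample: only hours 3, 16 and 21 have any sessions
sample = pd.DataFrame({
    'UserID': [1, 2, 3, 4],
    'Hour of Playing': [3, 3, 16, 21],
})

# Counts for the hours present in the data
hourly = sample.groupby('Hour of Playing')['UserID'].count()

# Add the missing hours (0 to 23) with a count of zero
hourly_full = hourly.reindex(range(24), fill_value=0)

# Every hour that shares the lowest count
quiet_hours = hourly_full[hourly_full == hourly_full.min()].index.tolist()

print(hourly.idxmin())
print(quiet_hours)
```

In this sample idxmin() on the raw counts points to hour 16, while the reindexed version shows that many hours had no sessions at all.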