What if we could learn topics in an enjoyable way? What if we could learn to write Python code from gamers' data?
Who doesn’t love playing video games? PUBG, Call of Duty, and Ludo are the names that come to mind when we first hear the words "video game". Most of us have played a favourite video game at some point in our lives, which is why this data should feel familiar.
We are providing you with a sample dataset of users' gaming sessions, which is given below.
This data contains the user (User ID), whether the user played alone or with friends, how long they played, which special weapon or tool they used while playing, and so on. Isn’t it fascinating to analyze activities that look familiar?
Let’s imagine the data generated by video gamers, as shown in the picture below.
We are asking you to solve 10 analytical questions, phrased the way stakeholders ask their data analysts and data scientists in real-life projects. Of these 10 questions, 5 are easy, 3 are medium, and 2 are hard.
Please use the following Python code to generate sample data of gamers. You may need to install the Faker library if it is not already installed in your environment.
CONDA: conda install -c conda-forge Faker
PIP: pip3 install Faker
pip install Faker
import pandas as pd
from faker import Faker
import random
# Create a Faker instance to generate fake data
fake = Faker()
# Initialize empty lists for each column
user_ids = []
game_names = []
joined_members = []
special_weapon_used = []
devices = []
genders = []
time_of_playing = []
session_duration = []
value_added_services = []
# Generate 300 rows of data
for _ in range(300):
    user_ids.append(fake.random_int(min=1, max=1000))
    game_names.append(fake.random_element(elements=['Call of Duty', 'Fortnite', 'Apex Legends', 'Valorant', 'Overwatch', 'PUBG', 'Minecraft', 'Warzone']))
    joined_members.append(fake.random_int(min=1, max=6))
    special_weapon_used.append(fake.random_element(elements=['Rocket Launcher', 'Sniper Rifle', 'Grenade Launcher', 'None']))
    devices.append(fake.random_element(elements=['PC', 'Console', 'Mobile']))
    genders.append(fake.random_element(elements=['Male', 'Female']))
    time_of_playing.append(fake.date_time_this_year())
    session_duration.append(fake.random_int(min=30, max=120))
    value_added_services.append(fake.random_element(elements=['Yes', 'No']))
# Create a Pandas DataFrame
data = {
    'UserID': user_ids,
    'Game Name': game_names,
    'Joined Members': joined_members,
    'Special Weapon Used': special_weapon_used,
    'Device Played On': devices,
    'Gender': genders,
    'Time of Playing': time_of_playing,
    'Session Duration (minutes)': session_duration,
    'Value Added Services Bought': value_added_services
}
df = pd.DataFrame(data)
300 rows × 9 columns
Try solving these questions by writing code in Python. Learn basic Python concepts in small pieces and apply them to the questions given below, and you will eventually find yourself getting better at Python and data analytics. Because the data is generated randomly, the numbers you get will differ from the outputs shown here.
Gaming has become a global phenomenon, and players of all genders are a part of this exciting world. By delving into the data, we get a snapshot of the gender diversity in the gaming community.
# Calculate the percentage of male and female players
gender_counts = df['Gender'].value_counts()
total_players = len(df)
percentage_male = (gender_counts['Male'] / total_players) * 100
percentage_female = (gender_counts['Female'] / total_players) * 100
print(f"Percentage of Male Players: {percentage_male:.2f}%")
print(f"Percentage of Female Players: {percentage_female:.2f}%")
Percentage of Male Players: 53.00%
Percentage of Female Players: 47.00%
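The same percentages can also be read directly from pandas. The sketch below is only an illustration: it uses a tiny hand-made `sample` frame (hypothetical, not the generated `df`) and `value_counts(normalize=True)`, with `.get()` so that a missing category does not raise a `KeyError`.

```python
import pandas as pd

# Hypothetical mini-sample standing in for the generated df
sample = pd.DataFrame({"Gender": ["Male", "Female", "Male", "Male"]})

# normalize=True returns shares (0..1) instead of raw counts
gender_share = sample["Gender"].value_counts(normalize=True) * 100

# .get() falls back to 0.0 if a gender never appears in the data
male_pct = gender_share.get("Male", 0.0)
female_pct = gender_share.get("Female", 0.0)

print(f"Percentage of Male Players: {male_pct:.2f}%")
print(f"Percentage of Female Players: {female_pct:.2f}%")
```

On the real data you would replace `sample` with `df`. Keep in mind that each row is a session, so these are shares of sessions rather than of distinct users.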
# pandas is already imported above; df is the DataFrame generated earlier
import pandas as pd
# Count the number of players using "Rocket Launcher" as a special weapon for each game
rocket_launcher_counts = df[df['Special Weapon Used'] == 'Rocket Launcher']['Game Name'].value_counts()
# Find the game with the highest number of players using "Rocket Launcher"
most_popular_game = rocket_launcher_counts.idxmax()
player_count = rocket_launcher_counts.max()
print(f"The game with the highest number of players using 'Rocket Launcher' is {most_popular_game} with {player_count} players.")
The game with the highest number of players using 'Rocket Launcher' is Call of Duty with 15 players.
Gaming platforms have come a long way, from classic consoles to mobile devices and powerful gaming PCs. In this section, we identify the go-to platform for gamers and discover where the action is happening.
# Count the number of players on each gaming platform
platform_counts = df['Device Played On'].value_counts()
# Find the most popular gaming platform and its player count
most_popular_platform = platform_counts.idxmax()
player_count = platform_counts.max()
print(f"The most popular gaming platform is {most_popular_platform} with {player_count} players.")
The most popular gaming platform is PC with 103 players.
Late-night gaming sessions are a common sight, but how many players are truly night owls? We explore the gaming habits and uncover the nocturnal gamers.
# Filter players who played after 9 PM
late_night_players = df[df['Time of Playing'].dt.hour >= 21]
player_count = len(late_night_players)
print(f"{player_count} players were playing games after 9 PM.")
50 players were playing games after 9 PM.
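One thing to watch here: every row in the data is a session, and the same UserID can appear more than once, so `len()` counts late-night sessions rather than distinct players. A minimal sketch of the difference, using a small hypothetical `sample` frame instead of the generated `df`:

```python
import pandas as pd

# Hypothetical mini-sample: user 1 has two late-night sessions
sample = pd.DataFrame({
    "UserID": [1, 1, 2, 3],
    "Time of Playing": pd.to_datetime([
        "2024-01-05 21:30", "2024-01-06 23:10",
        "2024-01-06 09:00", "2024-01-07 22:45",
    ]),
})

# Keep sessions that started at 21:00 or later
late = sample[sample["Time of Playing"].dt.hour >= 21]

late_sessions = len(late)                    # counts rows (sessions)
late_unique_players = late["UserID"].nunique()  # counts distinct users

print(f"{late_sessions} late-night sessions by {late_unique_players} distinct players.")
```

Which number you report depends on what the stakeholder means by "players", so it is worth asking before you answer.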
The thrill of gaming often lies in the duration of gameplay. By analyzing session durations, we reveal the games that keep players engaged for extended periods.
# Calculate the average session duration for all players
average_duration = df['Session Duration (minutes)'].mean()
# Find the game with the longest average session duration
game_with_longest_duration = df.groupby('Game Name')['Session Duration (minutes)'].mean().idxmax()
print(f"The average session duration for all players is {average_duration:.2f} minutes.")
print(f"The game with the longest average session duration is {game_with_longest_duration}.")
The average session duration for all players is 73.43 minutes.
The game with the longest average session duration is Overwatch.
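An average on its own hides how many sessions it is based on. As a rough sketch (the `sample` frame and its values are made up for illustration), `agg` can return the mean and the session count side by side:

```python
import pandas as pd

# Hypothetical mini-sample standing in for the generated df
sample = pd.DataFrame({
    "Game Name": ["PUBG", "PUBG", "Minecraft", "Minecraft", "Minecraft"],
    "Session Duration (minutes)": [60, 80, 90, 100, 110],
})

# Mean duration and number of sessions per game, longest average first
summary = (
    sample.groupby("Game Name")["Session Duration (minutes)"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)

top_game = summary.index[0]
print(summary)
print(f"Longest average session: {top_game}")
```

With roughly 300 random rows spread over 8 games, each game has only a few dozen sessions, so small differences between averages should be read with care.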
Are there gender-related preferences when it comes to in-game weaponry? We explore the choices made by male and female gamers to uncover any distinctions.
# Filter the dataset for male and female players
male_players = df[df['Gender'] == 'Male']
female_players = df[df['Gender'] == 'Female']
# Calculate the most commonly used special weapon for each gender
most_common_weapon_male = male_players['Special Weapon Used'].mode().values[0]
most_common_weapon_female = female_players['Special Weapon Used'].mode().values[0]
print(f"Among male players, the most commonly used special weapon is {most_common_weapon_male}.")
print(f"Among female players, the most commonly used special weapon is {most_common_weapon_female}.")
Among male players, the most commonly used special weapon is Grenade Launcher.
Among female players, the most commonly used special weapon is Grenade Launcher.
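Taking `.mode().values[0]` silently picks one weapon when two are tied. The sketch below, on a hypothetical `sample` frame, keeps every tied value per gender so that ties stay visible. It is one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical mini-sample: female players have a tie between two values
sample = pd.DataFrame({
    "Gender": ["Male", "Male", "Male",
               "Female", "Female", "Female", "Female"],
    "Special Weapon Used": ["Sniper Rifle", "Sniper Rifle", "None",
                            "Rocket Launcher", "None",
                            "Rocket Launcher", "None"],
})

# mode() returns all values that share the highest frequency
top_weapons = {
    gender: sorted(group["Special Weapon Used"].mode())
    for gender, group in sample.groupby("Gender")
}

for gender, weapons in top_weapons.items():
    print(f"{gender}: {', '.join(weapons)}")
```

Note that 'None' is an ordinary string in this dataset, so it can show up as the "most used weapon"; you may want to filter it out first, depending on the question.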
The clock never stops ticking in the gaming world. We analyze the peak hours for gaming sessions to understand when players are most active.
# Extract the hour from the "Time of Playing" column
df['Hour of Playing'] = df['Time of Playing'].dt.hour
# Find the hour with the highest number of players
peak_hour = df['Hour of Playing'].mode().values[0]
player_count = len(df[df['Hour of Playing'] == peak_hour])
print(f"The peak hour for gaming sessions is {peak_hour}:00 with {player_count} players online.")
The peak hour for gaming sessions is 3:00 with 19 players online.
In the digital age, value-added services are often a part of the gaming experience. We delve into the world of in-game purchases to identify the most service-savvy gamers.
Part 2: Challenging Insights
# Calculate the percentage of players who bought value-added services for each game
service_percentage = (
    df.groupby('Game Name')['Value Added Services Bought']
    .apply(lambda x: (x == 'Yes').mean() * 100)
    .sort_values(ascending=False)
)
# Find the game with the highest percentage of players buying services
highest_service_game = service_percentage.idxmax()
highest_service_percentage = service_percentage.max()
print(f"The game with the highest percentage of players buying services is {highest_service_game} with {highest_service_percentage:.2f}%.")
The game with the highest percentage of players buying services is Call of Duty with 59.52%.
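If you also want to see the "No" share and the number of sessions behind each percentage, `pd.crosstab` with `normalize="index"` is one option. This is a small sketch on a made-up `sample` frame, not the generated `df`:

```python
import pandas as pd

# Hypothetical mini-sample standing in for the generated df
sample = pd.DataFrame({
    "Game Name": ["PUBG", "PUBG", "PUBG", "PUBG", "Valorant", "Valorant"],
    "Value Added Services Bought": ["Yes", "No", "No", "No", "Yes", "Yes"],
})

# Row-wise percentages: each game's Yes/No shares add up to 100
share = pd.crosstab(
    sample["Game Name"],
    sample["Value Added Services Bought"],
    normalize="index",
) * 100

# Number of sessions per game, to judge how reliable each share is
sessions_per_game = sample["Game Name"].value_counts()

print(share.round(2))
print(sessions_per_game)
```

A game with very few sessions can reach a high percentage by chance, which is why the counts are printed next to the shares.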
Do session durations vary significantly between male and female players? We conduct a deep dive into the data to reveal any gender-based differences in gaming habits.
# Calculate the average session duration for each gender
average_duration_male = df[df['Gender'] == 'Male']['Session Duration (minutes)'].mean()
average_duration_female = df[df['Gender'] == 'Female']['Session Duration (minutes)'].mean()
print(f"Average session duration for males: {average_duration_male:.2f} minutes.")
print(f"Average session duration for females: {average_duration_female:.2f} minutes.")
Average session duration for males: 74.00 minutes.
Average session duration for females: 72.78 minutes.
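Comparing two averages does not yet tell us whether the gap is "significant". One common way to put the gap in context is Welch's t statistic, which divides the difference in means by its standard error. The sketch below computes it by hand on a hypothetical `sample` frame; it stops at the statistic and does not compute a p-value, which would need the t distribution (for example from SciPy, assumed to be an extra install).

```python
import math

import pandas as pd

# Hypothetical mini-sample standing in for the generated df
sample = pd.DataFrame({
    "Gender": ["Male"] * 4 + ["Female"] * 4,
    "Session Duration (minutes)": [70, 80, 90, 100, 60, 70, 80, 90],
})

# Mean, sample variance (ddof=1) and group size per gender
stats = (
    sample.groupby("Gender")["Session Duration (minutes)"]
    .agg(["mean", "var", "count"])
)
male, female = stats.loc["Male"], stats.loc["Female"]

# Welch's t: difference in means over its standard error
std_error = math.sqrt(
    male["var"] / male["count"] + female["var"] / female["count"]
)
t_stat = (male["mean"] - female["mean"]) / std_error

print(f"Difference in means: {male['mean'] - female['mean']:.2f} minutes")
print(f"Welch t statistic: {t_stat:.3f}")
```

As a rule of thumb, a t statistic close to zero suggests the gap is small relative to the noise. Since the generated data is random, that is what you should expect on `df`.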
The clock becomes our ally as we examine player behaviour throughout the day. By analyzing time-based patterns, we unveil the ebb and flow of the gaming world, revealing when gamers are most and least active.
# Group players by the hour of playing and count the number of players for each hour
hourly_player_counts = df.groupby('Hour of Playing')['UserID'].count()
# Find the most popular and least popular hours for gaming sessions
most_popular_hour = hourly_player_counts.idxmax()
least_popular_hour = hourly_player_counts.idxmin()
print(f"The most popular hour for gaming sessions is {most_popular_hour}:00.")
print(f"The least popular hour for gaming sessions is {least_popular_hour}:00.")
The most popular hour for gaming sessions is 3:00.
The least popular hour for gaming sessions is 16:00.
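A subtle point about the least popular hour: `groupby` only returns hours that appear in the data, so an hour with no sessions at all can never be reported by `idxmin()`. A minimal sketch of one way around this, on a hypothetical `sample` frame, is to reindex the counts over all 24 hours:

```python
import pandas as pd

# Hypothetical mini-sample: sessions only at 03:xx and 16:xx
sample = pd.DataFrame({
    "UserID": [1, 2, 3, 4],
    "Time of Playing": pd.to_datetime([
        "2024-03-01 03:15", "2024-03-01 03:40",
        "2024-03-02 16:05", "2024-03-02 03:55",
    ]),
})

hours = sample["Time of Playing"].dt.hour

# Count sessions per hour, then add the missing hours with a count of 0
hourly_counts = hours.value_counts().reindex(range(24), fill_value=0)

busiest_hour = hourly_counts.idxmax()
idle_hours = hourly_counts[hourly_counts == 0].index.tolist()

print(f"Busiest hour: {busiest_hour}:00")
print(f"Hours with no sessions: {idle_hours}")
```

With 300 sessions spread over a year, every hour is likely to appear in `df`, but the reindex step keeps the answer correct for smaller or filtered datasets.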