Hypothesis Testing

Hypothesis testing is a statistical method for deciding whether sample data provide enough evidence to reject an assumption about a population. A hypothesis can be any assumption that can be checked against data. We can apply hypothesis testing to find the better-performing version of a product. We formulate the test so that one version represents the null hypothesis and the other represents the alternate hypothesis.

The null hypothesis is the default assumption: the outcome we would expect to see most often if nothing has changed.

Tests for hypotheses are typically done for two reasons.

To check an established process or accepted truth against data.

To decide between two competing statements, using one of them as the null hypothesis.

For example:

Consider a webpage. Its average user session duration is normally five minutes. We made some changes in order to increase session duration. After the changes, across 1000 observed sessions, the average session duration increased by 10 minutes.

So here the null hypothesis would be:

H₀: Average session duration is the same after the change.

The alternate hypothesis is:

H_a: Average session duration increased after the change.

To decide between the two, we look at a table of probability distributions. The table gives a critical value: the edge beyond which an observation is too unlikely to be accepted under H₀. That edge is set by the level of statistical significance.

You can learn how to read probability distribution tables from the following link:

https://www.accessengineeringlibrary.com/content/book/9780071432085/back-matter/appendix1
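As an alternative to reading the value from a printed table, the critical value can be computed directly. The following is a minimal sketch using only Python's standard library. The function name critical_z and the 0.05 significance level are assumptions made for illustration; they are not part of the example above.

```python
from statistics import NormalDist

def critical_z(alpha: float) -> float:
    """Return the z value that cuts off the upper `alpha` tail of the
    standard normal distribution (one-sided test)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # inv_cdf is the inverse of the cumulative distribution function,
    # i.e. the lookup a z table performs.
    return NormalDist().inv_cdf(1 - alpha)

alpha = 0.05  # assumed significance level
print(f"Reject H0 when z exceeds {critical_z(alpha):.3f}")
```

Any test statistic beyond this value falls outside the range that is acceptable under H₀.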

Statistical Significance

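Assume the data gathered under the hypothesis is normally distributed. That means each observation is placed according to its probability of occurrence, with the most probable values near the center and the least probable values in the tails.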

Statistical significance is a threshold on that probability distribution, a minimum or a maximum, that marks where the most probable values of an observation end.

The null statement in the hypothesis test refers to observations that are considered the norm or most common. This is why we define a border value that separates the most and least probable values in the observations.

When data is collected to test a hypothesis, we need a standard to test it against. We need a threshold in order to reject or retain the null hypothesis. The threshold for the observations that are most likely under the null hypothesis can be found using a probability chart.

Statistical significance can also be expressed as a confidence level: a significance level of 0.05 corresponds to a confidence level of 95%.

In Python’s statsmodels library, Excel, or SPSS, you might find a ‘conf’ return value in statistical tests.
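The link between a significance level and a confidence level can be sketched as a confidence interval for a mean. This example assumes a known standard deviation and uses made-up numbers (a sample mean of 5.5 minutes, a standard deviation of 6 minutes, 1000 sessions). confidence_interval is a hypothetical helper written for this illustration, not a function from statsmodels, Excel, or SPSS.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1 - confidence                   # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)             # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs, chosen only to make the arithmetic visible.
low, high = confidence_interval(sample_mean=5.5, sigma=6.0, n=1000)
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

If the value stated by the null hypothesis lies outside the interval, the two-sided test at the matching significance level rejects it.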

P-value

A probability chart can be used to determine which observations are most likely under the null hypothesis.

So we calculate the probability of the observations that favor the alternate hypothesis. In short, the p-value is the probability, assuming the null hypothesis is true, of getting observations at least as extreme as the ones we saw. If the p-value is below the significance level, the observation falls in the alternate hypothesis region, and it can be used to support the alternate hypothesis.
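A p-value for a one-sided test on a mean could be computed roughly as follows. This is a sketch under stated assumptions: the population standard deviation is treated as known, and the numbers are hypothetical (a smaller increase than in the webpage example, so that the p-value is not vanishingly small). The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for H_a: the true mean is greater than mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 1 - NormalDist().cdf(z)  # upper-tail probability under H0
    return z, p

# Hypothetical inputs: mu0 is the old average, the rest is made up.
z, p = one_sample_z_p_value(sample_mean=5.5, mu0=5.0, sigma=6.0, n=1000)
alpha = 0.05  # assumed significance level
print(f"z = {z:.2f}, p = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```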

A persona is an ideal type of audience that’s the focus of a marketing group for a product. It is very important to understand who you’re selling to and when.

For example:

A smartphone company is about to launch its 5G phone around Diwali. Before launching the new 5G model, it needs to clear its unsold stock, so it is selling that stock during the festival. Which audience should the old stock be sold to, and when is the right time to bring the new phones to market?

There can be two hypothesis tests here: one for the timing of the sale and one for the audience.

1st test

Hypothesis Formulation

H₀: The remaining stock should be sold before Diwali begins, to avoid drawing attention away from the new stock.

(Common practice in the market)

H_a: New stock will still sell if we sell old stock at a discount.

Data Gathering

The next step is to gather sales data from a previous year in which old stock was sold in a similar way. In this case, we can’t use current data.

2nd Test

We can use the low-budget buyer persona and take specific steps to get the attention of the audience that fits that persona.

The company can focus on this persona when selling the old stock under various discount schemes.

Many e-commerce websites have discount sales before big festivals, for example Amazon’s Great Indian Festival and Flipkart’s Big Billion Days.

Hypothesis Formulation

H₀: Run ads focused on discounted prices for the low-budget buyer persona.

H_a: Focused ads won’t change the sales numbers.

Data Gathering

Gather data about the audiences that clicked the ads and the audiences that bought. Those who clicked and those who bought should be recorded as separate groups.
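Once such data is gathered, one possible analysis is a two-proportion z-test that compares the purchase rate of people shown persona-focused ads with the rate of people shown generic ads. This experiment design, the counts, and the function name are all assumptions made for illustration. Note that the sketch uses the conventional setup in which "no difference between the two rates" is the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(bought_a, shown_a, bought_b, shown_b):
    """Two-sided z-test for a difference between two purchase rates."""
    rate_a = bought_a / shown_a
    rate_b = bought_b / shown_b
    # Pooled rate: the shared rate we would estimate if there were
    # really no difference between the two groups.
    pooled = (bought_a + bought_b) / (shown_a + shown_b)
    se = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (rate_a - rate_b) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical counts: group A saw persona-focused ads, group B generic ads.
z, p = two_proportion_z_test(bought_a=120, shown_a=2000,
                             bought_b=90, shown_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```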

Types of Tailed Test

One Tailed test
Two Tailed test

One tailed test

A one-tailed test has one direction: the alternate hypothesis mean is either larger or smaller than the null hypothesis mean, that is, µ_a > µ₀ or µ_a < µ₀.

Two-tailed test

A two-tailed test covers both directions: the alternate hypothesis mean is only assumed to be different from the null hypothesis mean, that is, µ_a ≠ µ₀.
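The difference between the tails shows up directly in the p-value. The sketch below, with a hypothetical helper named tail_p_values and an arbitrary z statistic of 1.8, computes the p-value for each choice of alternate hypothesis.

```python
from statistics import NormalDist

def tail_p_values(z: float) -> dict:
    """P-values of a z statistic for each form of alternate hypothesis."""
    cdf = NormalDist().cdf
    return {
        "greater": 1 - cdf(z),               # one-tailed, µ_a > µ₀
        "less": cdf(z),                      # one-tailed, µ_a < µ₀
        "two_sided": 2 * (1 - cdf(abs(z))),  # two-tailed, µ_a ≠ µ₀
    }

result = tail_p_values(1.8)  # 1.8 is an arbitrary illustrative value
for name, p in result.items():
    print(f"{name:>9}: p = {p:.4f}")
```

With this statistic, the upper one-tailed test rejects H₀ at a 0.05 significance level while the two-tailed test does not, because the two-tailed p-value is twice as large.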

Types of Sampled Test

One sampled test
Two sampled test

One Sample test

With a one sample test, you have data from only one population, and you test whether its sample mean differs from a known or hypothesized mean.

Two Sample test

A two sample test has data from two samples. We compare the means of the two populations for differences.

Types of Hypothesis Tests

T-test

Assumptions

In the T-test, we compare the means of two populations.
Standard deviation is unknown.
Sample size is small, typically 30 or fewer.
Data is approximately normally distributed.
Default null hypothesis is no difference between the sample means.
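A two sample T-test under these assumptions can be sketched with the standard library alone. The example assumes both groups have equal variance (the pooled, or Student's, form of the test) and uses made-up session durations. The critical value 2.101 is the two-tailed value for a 0.05 significance level and 18 degrees of freedom, read from a t table.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a pooled two sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_b) - mean(sample_a)) / se, df

# Hypothetical session durations in minutes, ten sessions per group.
before = [5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 4.7, 5.4, 5.2, 5.0]
after = [5.9, 6.3, 5.8, 6.5, 6.1, 5.7, 6.4, 6.0, 6.2, 5.6]

t, df = pooled_t_statistic(before, after)
T_CRITICAL = 2.101  # from a t table: two-tailed, alpha = 0.05, df = 18
print(f"t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```

When the two variances cannot be assumed equal, Welch's form of the test is the usual alternative; it is not shown here.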

Z-test

Assumptions

Standard deviation is known.
Data samples are independent and drawn from a normal distribution.
Sample size is typically greater than 30.
In the Z-test, we calculate the difference between two means.
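A Z-test on the difference between two means can be sketched as follows. The standard deviations are treated as known, as the assumptions require, and every number is hypothetical. The function name two_sample_z_test is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean_a, mean_b, sigma_a, sigma_b, n_a, n_b):
    """Two-sided z-test for the difference between two means."""
    # Standard error of the difference, using the known sigmas.
    se = sqrt(sigma_a ** 2 / n_a + sigma_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical summary statistics for two groups of 500 sessions each.
z, p = two_sample_z_test(mean_a=5.6, mean_b=5.0,
                         sigma_a=3.0, sigma_b=3.0,
                         n_a=500, n_b=500)
print(f"z = {z:.2f}, p = {p:.4f}")
```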

