A/B Testing: Data-Driven Decision Making

Huseyin Baytar
4 min read · Nov 8


Hello data science enthusiasts. In this article, I will talk about A/B testing. It is a crucial area that helps a data scientist show whether the campaigns their company runs are actually making a profit or a loss. Before explaining A/B testing, I will give you some brief but important background.


Sampling

Sampling means drawing a subset from a population, on the assumption that the subset represents the characteristics of that population well. Let's say there is a city with a population of 10,000 people. To find the average age of all of them, we would need to go to all 10,000 people, record their ages, and calculate the average. This would require a lot of effort, so instead we select an unbiased subset of 100 people that represents the 10,000 people well. This lets us generalize to the whole city without going through all 10,000 people.
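
To make this concrete, here is a minimal sketch of the city example using only Python's standard library. The ages are randomly generated, so the population is hypothetical and the exact numbers mean nothing; the point is only that the mean of 100 randomly chosen people usually lands close to the mean of all 10,000.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of 10,000 residents (made-up data).
population_ages = [random.randint(0, 90) for _ in range(10_000)]

# Simple random sample of 100 residents, drawn without replacement.
sample_ages = random.sample(population_ages, k=100)

population_mean = statistics.mean(population_ages)
sample_mean = statistics.mean(sample_ages)

print(f"Population mean age: {population_mean:.2f}")
print(f"Sample mean age:     {sample_mean:.2f}")
```

I used `random.sample` because it gives every resident the same chance of being picked, which is what "unbiased" means here.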

Source: Omniconvert

Confidence Intervals

A confidence interval is a range, defined by two numbers, that is likely to contain the true value of a population parameter. For example, if we have a website, what is the confidence interval for the average time spent on it? Let's assume that, in a sample of about 100 users, the average is 180 seconds and the standard deviation is 40 seconds. The 95% confidence interval for the average time users spend on the website would then be roughly 172 to 188 seconds.
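
Here is a small sketch of how such an interval can be computed with the normal approximation, mean ± z · (standard deviation / √n). The sample size n = 100 is an assumption I picked because it reproduces the 172 to 188 range; with a different n the interval would be wider or narrower.

```python
import math
from statistics import NormalDist

sample_mean = 180.0  # seconds, from the website example
sample_std = 40.0    # seconds, from the website example
n = 100              # ASSUMED sample size, chosen for illustration
confidence = 0.95

# Two-sided critical value of the standard normal (about 1.96 for 95%).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error: critical value times the standard error of the mean.
margin = z * sample_std / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"{confidence:.0%} CI: {lower:.1f} to {upper:.1f} seconds")
```

For small samples, the t-distribution is usually preferred over the normal critical value, which makes the interval slightly wider.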


Correlation

Correlation is a statistical measure that describes the relationship between two variables, including the direction and strength of that relationship. For example, a positive correlation can be seen between the amount of training a bodybuilder does and their muscle development: as one increases, the other tends to increase too.
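
As a quick illustration, the sketch below computes the Pearson correlation coefficient by hand for the bodybuilder example. The training hours and muscle gain values are made up for this sketch, and `pearson_r` is just a helper name I chose.

```python
import math

# Hypothetical data: weekly training hours and muscle mass gained (kg).
training_hours = [2, 3, 4, 5, 6, 7, 8]
muscle_gain_kg = [0.5, 0.9, 1.1, 1.8, 2.0, 2.6, 2.9]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(training_hours, muscle_gain_kg)
print(f"Pearson r = {r:.3f}")
```

A value near +1 indicates a strong positive relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.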

Source: simplypsychology

Hypothesis Testing

Hypothesis testing is a statistical method used to test a belief or proposition. The primary goal in group comparisons is to determine whether an observed difference is real or merely due to chance. For instance, after making changes to the interface of a mobile application, we want to test whether the average daily time users spend on the application has increased. We designate the interface before the change as A and the interface after the change as B. We then formulate the hypothesis that there is no difference in the time spent between the two designs. We measure the average time users spend on design A and design B; let's say it is 55 minutes for design A and 58 minutes for design B.

Just because it is 58 minutes, can we say that design B is better? No, and this is the most critical point of A/B tests. Numerically, 58 looks better, but we only took a sample, and this difference could have occurred by chance. We need to test statistically whether the difference is too large to be plausibly explained by chance alone.
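
To see how easily chance can produce a gap like this, here is a small simulation sketch. Both groups are drawn from the same distribution, so there is no real difference between them. The group size (50 users), the mean (55 minutes) and the standard deviation (15 minutes) are assumptions I made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def mean_difference_under_no_effect(n_users=50, mean=55.0, sd=15.0):
    """Draw two samples from the SAME distribution and return the gap in means."""
    group_a = [random.gauss(mean, sd) for _ in range(n_users)]
    group_b = [random.gauss(mean, sd) for _ in range(n_users)]
    return statistics.mean(group_b) - statistics.mean(group_a)

n_simulations = 5_000
differences = [mean_difference_under_no_effect() for _ in range(n_simulations)]

# Share of simulated experiments where chance alone produced a gap of 3+ minutes.
share_large_gap = sum(abs(d) >= 3 for d in differences) / n_simulations
print(f"No-effect experiments with a gap of at least 3 minutes: {share_large_gap:.1%}")
```

With these made-up settings, a sizeable share of the experiments shows a gap of 3 minutes or more even though the two designs are identical by construction. That is why we cannot judge design B on the raw averages alone.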

A/B Testing is used when comparing the averages of two groups.

H0: The means of the two groups are equal.

H1: The means of the two groups are not equal.

We interpret the result through the p-value: if the p-value is less than 0.05, H0 is rejected. The independent samples t-test has two assumptions: normality and homogeneity of variance.

There will be 4 steps in the A/B test as follows:

Step 1: Formulate Hypotheses.

Step 2: Assumption Check (normality assumption / homogeneity of variance).

Step 3: Apply the Hypothesis Test.

  • If the assumptions are met, conduct an independent two-sample t-test (parametric).
  • If the assumptions are not met, conduct the Mann-Whitney U test (non-parametric).

Step 4: Interpret the results based on the p-value.
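
The four steps above can be sketched in Python with SciPy roughly as follows. This is a minimal sketch, not the code from my notebook: the function name `ab_test`, the choice of the Shapiro-Wilk test for normality and Levene's test for homogeneity of variance, and the randomly generated usage data are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def ab_test(group_a, group_b, alpha=0.05):
    """Run the four-step workflow; return (test_name, p_value, reject_h0)."""
    # Step 1: H0: mean(A) == mean(B), H1: mean(A) != mean(B).

    # Step 2a: normality of each group (Shapiro-Wilk, H0 = data are normal).
    _, p_norm_a = stats.shapiro(group_a)
    _, p_norm_b = stats.shapiro(group_b)
    normal = p_norm_a >= alpha and p_norm_b >= alpha
    # Step 2b: homogeneity of variance (Levene, H0 = variances are equal).
    _, p_levene = stats.levene(group_a, group_b)
    equal_var = p_levene >= alpha

    # Step 3: choose the test based on the assumption checks.
    if normal and equal_var:
        test_name = "independent two-sample t-test"
        _, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
    else:
        test_name = "Mann-Whitney U test"
        _, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

    # Step 4: reject H0 when the p-value is below alpha.
    return test_name, float(p_value), bool(p_value < alpha)

# Usage with hypothetical data: minutes spent per user on each design.
rng = np.random.default_rng(7)
design_a = rng.normal(loc=55, scale=10, size=200)
design_b = rng.normal(loc=58, scale=10, size=200)

test_name, p_value, reject_h0 = ab_test(design_a, design_b)
print(f"{test_name}: p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

A common variation, when normality holds but the variances differ, is Welch's t-test (`stats.ttest_ind` with `equal_var=False`) rather than the Mann-Whitney U test. The sketch sticks to the simpler rule from the steps above.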

I explain this in more detail in my Kaggle notebook, where I also work through many A/B test examples with different datasets.

To Be Continued…


