YouTube Thumbnail A/B Testing

📌 Introduction

In the digital age, content creators on YouTube constantly compete for viewer attention. One of the most powerful yet often overlooked factors in a video's success is its thumbnail — the first visual impression a potential viewer gets before deciding to click.

This project simulates a real-world A/B test conducted by a YouTuber who wants to find out whether a redesigned thumbnail performs better than the original. By running a controlled experiment over 30 days, we can use data and statistics to make a confident, evidence-based decision — rather than relying on guesswork.

🛠️ Tools Used

  • Python
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • SciPy
  • statsmodels
  • Jupyter Notebook

📌 Import Libraries and Dataset

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

from statsmodels.stats.proportion import proportions_ztest
from scipy.stats import ttest_ind
In [2]:
df = pd.read_csv("/kaggle/input/datasets/akisavujel/thumbnail/Thumbnail_A_B_testing.csv")

🧐 Understand the Dataset (EDA)

In [3]:
print("----- First 5 rows -----")
print(df.head())
----- First 5 rows -----
   Day Variant  Impressions  Clicks  CTR  Avg_watch_time
0    1       A         2175     870  0.4             2.0
1    2       A         3220     966  0.3             2.3
2    3       A         2537    1014  0.4             2.4
3    4       A         2502     500  0.2             2.5
4    5       A         2211     884  0.4             1.9
In [4]:
print("----- Last 5 rows -----")
print(df.tail())
----- Last 5 rows -----
    Day Variant  Impressions  Clicks  CTR  Avg_watch_time
55   26       B         7988    6390  0.8             6.8
56   27       B         7182    6463  0.9             5.9
57   28       B         8711    5226  0.6             6.5
58   29       B         9469    7575  0.8             7.3
59   30       B         9647    7717  0.8             8.3
In [5]:
print("----- Data Info -----")
print(df.info())
----- Data Info -----
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60 entries, 0 to 59
Data columns (total 6 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Day             60 non-null     int64  
 1   Variant         60 non-null     object 
 2   Impressions     60 non-null     int64  
 3   Clicks          60 non-null     int64  
 4   CTR             60 non-null     float64
 5   Avg_watch_time  60 non-null     float64
dtypes: float64(2), int64(3), object(1)
memory usage: 2.9+ KB
None
In [6]:
print("----- Summary Statistics -----")
print(df.describe())
----- Summary Statistics -----
             Day  Impressions       Clicks        CTR  Avg_watch_time
count  60.000000    60.000000    60.000000  60.000000       60.000000
mean   15.500000  5623.683333  3529.566667   0.523333        4.555000
std     8.728484  2880.562554  2753.453335   0.230230        2.384016
min     1.000000  2175.000000   469.000000   0.200000        1.600000
25%     8.000000  2934.750000   902.750000   0.300000        2.300000
50%    15.500000  5248.500000  2844.000000   0.500000        4.200000
75%    23.000000  8395.250000  6132.000000   0.800000        6.650000
max    30.000000  9895.000000  7717.000000   0.900000        8.400000
In [7]:
print("----- Missing Values -----")
print(df.isnull().sum())

print("----- Missing values (%) -----")
print(df.isnull().sum() / len(df) * 100)
----- Missing Values -----
Day               0
Variant           0
Impressions       0
Clicks            0
CTR               0
Avg_watch_time    0
dtype: int64
----- Missing values (%) -----
Day               0.0
Variant           0.0
Impressions       0.0
Clicks            0.0
CTR               0.0
Avg_watch_time    0.0
dtype: float64
In [8]:
print("----- Unique values in 'Variant' -----")
print(df['Variant'].unique())

print("----- Value counts for 'Variant' -----")
print(df['Variant'].value_counts())
----- Unique values in 'Variant' -----
['A' 'B']
----- Value counts for 'Variant' -----
Variant
A    30
B    30
Name: count, dtype: int64
In [9]:
print("----- Number of unique days -----")
print(df['Day'].nunique())
----- Number of unique days -----
30
In [10]:
print("----- Data types -----")
print(df.dtypes)
----- Data types -----
Day                 int64
Variant            object
Impressions         int64
Clicks              int64
CTR               float64
Avg_watch_time    float64
dtype: object
In [11]:
print("----- Column Names -----")
print(df.columns.tolist())
----- Column Names -----
['Day', 'Variant', 'Impressions', 'Clicks', 'CTR', 'Avg_watch_time']
In [12]:
print("----- Dataset shape -----")
print(df.shape)
----- Dataset shape -----
(60, 6)
In [14]:
print("----- Average metrics per Variant -----")
print(df.groupby('Variant')[['Impressions', 'Clicks', 'CTR', 'Avg_watch_time']].mean())
----- Average metrics per Variant -----
         Impressions       Clicks       CTR  Avg_watch_time
Variant                                                    
A        2848.866667   884.433333  0.310000            2.30
B        8398.500000  6174.700000  0.736667            6.81
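
The CTR column above is the unweighted mean of the daily CTR values. This can differ from the pooled rate (total clicks divided by total impressions) when daily impressions vary. A minimal sketch of the comparison, using the per-variant totals and means printed in this notebook; the helper name `pooled_ctr` is illustrative and not part of the original analysis:

```python
# Per-variant totals and mean daily CTR, copied from this notebook's outputs.
totals = {
    "A": {"clicks": 26533, "impressions": 85466, "mean_daily_ctr": 0.310000},
    "B": {"clicks": 185241, "impressions": 251955, "mean_daily_ctr": 0.736667},
}


def pooled_ctr(clicks: int, impressions: int) -> float:
    """Click-through rate over the whole period: total clicks / total impressions."""
    return clicks / impressions


for variant, t in totals.items():
    pooled = pooled_ctr(t["clicks"], t["impressions"])
    print(f"Variant {variant}: pooled CTR = {pooled:.4f}, "
          f"mean of daily CTR = {t['mean_daily_ctr']:.4f}")
```

For this dataset the two figures are close for both variants, so either summary supports the same comparison.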
In [15]:
print("----- Correlation between Impressions, Clicks, CTR, Avg_watch_time -----")
print(df[['Impressions', 'Clicks', 'CTR', 'Avg_watch_time']].corr())
----- Correlation between Impressions, Clicks, CTR, Avg_watch_time -----
                Impressions    Clicks       CTR  Avg_watch_time
Impressions        1.000000  0.974118  0.899967        0.928784
Clicks             0.974118  1.000000  0.965546        0.925448
CTR                0.899967  0.965546  1.000000        0.881409
Avg_watch_time     0.928784  0.925448  0.881409        1.000000
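
These correlations are computed over both variants pooled together, so the large gap between A and B can inflate them. A rough sketch of computing the same correlation within each variant follows. It uses only the ten rows printed in this notebook (days 1 to 5 of each variant), so the numbers are illustrative and not the full-data result:

```python
import pandas as pd

# Days 1-5 of each variant, copied from the notebook's printed rows.
sample = pd.DataFrame({
    "Variant": ["A"] * 5 + ["B"] * 5,
    "Impressions": [2175, 3220, 2537, 2502, 2211, 9148, 8001, 7169, 7029, 9330],
    "CTR": [0.4, 0.3, 0.4, 0.2, 0.4, 0.6, 0.8, 0.6, 0.7, 0.8],
})

# Correlation over all rows mixes within-variant and between-variant variation.
pooled_corr = sample["Impressions"].corr(sample["CTR"])

# Correlation inside each variant removes the between-variant gap.
within_corr = {
    name: group["Impressions"].corr(group["CTR"])
    for name, group in sample.groupby("Variant")
}

print(f"Pooled: {pooled_corr:.3f}")
for name, value in within_corr.items():
    print(f"Within {name}: {value:.3f}")
```

If the within-variant values are much weaker than the pooled one, most of the pooled correlation reflects the difference between the two thumbnails, not a day-to-day relationship between the metrics.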

✨ Cleaning Dataset

In [16]:
df = df.drop_duplicates()
print("----- Duplicates removed -----")
print("New shape:", df.shape)
----- Duplicates removed -----
New shape: (60, 6)
In [17]:
df = df.rename(columns={
    "Avg_watch_time": "Avg_watch_time_sec"
})
print("----- Columns after renaming -----")
print(df.columns)
----- Columns after renaming -----
Index(['Day', 'Variant', 'Impressions', 'Clicks', 'CTR', 'Avg_watch_time_sec'], dtype='object')
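
`drop_duplicates()` only removes rows that are identical in every column. A minimal sketch of two further sanity checks, assuming each (Day, Variant) pair should appear once and that the stored CTR is Clicks / Impressions rounded to one decimal. The sample rows are days 1 to 5 of each variant as printed in this notebook; the variable names are illustrative:

```python
import pandas as pd

# Days 1-5 of each variant, copied from the notebook's printed rows.
sample = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5] * 2,
    "Variant": ["A"] * 5 + ["B"] * 5,
    "Impressions": [2175, 3220, 2537, 2502, 2211, 9148, 8001, 7169, 7029, 9330],
    "Clicks": [870, 966, 1014, 500, 884, 5488, 6400, 4301, 4920, 7464],
    "CTR": [0.4, 0.3, 0.4, 0.2, 0.4, 0.6, 0.8, 0.6, 0.7, 0.8],
})

# Check 1: no (Day, Variant) pair is recorded twice.
duplicate_keys = int(sample.duplicated(subset=["Day", "Variant"]).sum())

# Check 2: stored CTR agrees with Clicks / Impressions.
# The stored value has one decimal, so allow a rounding tolerance of 0.05.
recomputed_ctr = sample["Clicks"] / sample["Impressions"]
ctr_mismatches = int(((recomputed_ctr - sample["CTR"]).abs() > 0.05).sum())

# Check 3: clicks can never exceed impressions.
impossible_rows = int((sample["Clicks"] > sample["Impressions"]).sum())

print("Duplicate (Day, Variant) keys:", duplicate_keys)
print("Rows where CTR disagrees with Clicks / Impressions:", ctr_mismatches)
print("Rows with more clicks than impressions:", impossible_rows)
```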

📊 Visualize the Data

In [18]:
A_data = df[df['Variant'] == 'A']
B_data = df[df['Variant'] == 'B']

print("----- Variant A Data -----")
print(A_data.head())

print("----- Variant B Data -----")
print(B_data.head())
----- Variant A Data -----
   Day Variant  Impressions  Clicks  CTR  Avg_watch_time_sec
0    1       A         2175     870  0.4                 2.0
1    2       A         3220     966  0.3                 2.3
2    3       A         2537    1014  0.4                 2.4
3    4       A         2502     500  0.2                 2.5
4    5       A         2211     884  0.4                 1.9
----- Variant B Data -----
    Day Variant  Impressions  Clicks  CTR  Avg_watch_time_sec
30    1       B         9148    5488  0.6                 8.4
31    2       B         8001    6400  0.8                 6.3
32    3       B         7169    4301  0.6                 6.0
33    4       B         7029    4920  0.7                 8.1
34    5       B         9330    7464  0.8                 8.2
In [19]:
max_val = max(A_data['Impressions'].max(), B_data['Impressions'].max())

plt.figure(figsize=(12,5))

# Variant A - Bar chart
plt.subplot(1,2,1)
sns.barplot(x=A_data['Day'], y=A_data['Impressions'], color='purple')
plt.title('Variant A - Daily Impressions')
plt.xlabel('Day')
plt.ylabel('Impressions')
plt.ylim(0, max_val)

# Variant B - Bar chart
plt.subplot(1,2,2)
sns.barplot(x=B_data['Day'], y=B_data['Impressions'], color='green')
plt.title('Variant B - Daily Impressions')
plt.xlabel('Day')
plt.ylabel('Impressions')
plt.ylim(0, max_val)

plt.show()
[Figure: side-by-side bar charts of daily impressions for Variant A and Variant B]

Impressions — Key Insights

Impressions

  • Variant A receives roughly 2,000 to 4,000 impressions per day (mean about 2,850)
  • Variant B receives roughly 7,000 to 10,000 impressions per day (mean about 8,400)

Key Insights

  • Variant B receives nearly 3x the impressions of Variant A on average (8,398.5 vs 2,848.9 per day), a large and consistent gap
  • There is not a single day where Variant A matched or beat Variant B on impressions
  • Variant B stays high throughout all 30 days
  • Impressions count how often a thumbnail was shown, so this gap may partly reflect how traffic was split between the variants and not only the thumbnail's appeal; CTR, which divides clicks by impressions, is the fairer comparison
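
The claim that Variant A never matched Variant B on any day can be checked directly instead of read off the chart. A minimal sketch, using days 1 to 5 of each variant as printed in this notebook (the full `df` would be used the same way); the column name `B_over_A` is illustrative:

```python
import pandas as pd

# Days 1-5 of each variant, copied from the notebook's printed rows.
sample = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5] * 2,
    "Variant": ["A"] * 5 + ["B"] * 5,
    "Impressions": [2175, 3220, 2537, 2502, 2211, 9148, 8001, 7169, 7029, 9330],
})

# One row per day, one column per variant.
wide = sample.pivot(index="Day", columns="Variant", values="Impressions")

# Daily ratio of B's impressions to A's.
wide["B_over_A"] = wide["B"] / wide["A"]

# Number of days on which A matched or beat B.
days_a_wins = int((wide["A"] >= wide["B"]).sum())

print(wide)
print("Days where A >= B:", days_a_wins)
```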
In [20]:
plt.figure(figsize=(8, 5))

total_clicks = df.groupby('Variant')['Clicks'].sum()

plt.bar(total_clicks.index, total_clicks.values, color=['purple', 'green'], width=0.4)

plt.title('Total Clicks — Variant A vs Variant B', fontsize=14, fontweight='bold')
plt.xlabel('Variant')
plt.ylabel('Total Clicks')
plt.show()
[Figure: bar chart of total clicks, Variant A vs Variant B]

Clicks — Key Insights

Clicks

  • Variant A total clicks: 26,533
  • Variant B total clicks: 185,241

Key Insights

  • Variant B has about 7x the total clicks of Variant A over the 30 days
  • Part of this gap comes from Variant B's roughly 3x impressions; the rest comes from its higher click-through rate
  • The original thumbnail drew far fewer clicks throughout the experiment
  • The redesigned thumbnail drove substantially more traffic
In [21]:
plt.figure(figsize=(8, 5))

sns.boxplot(data=df, x='Variant', y='CTR', palette=['purple', 'green'])

plt.title('CTR Distribution — Variant A vs Variant B', fontsize=14, fontweight='bold')
plt.xlabel('Variant')
plt.ylabel('CTR')
plt.show()
[Figure: box plot of the CTR distribution, Variant A vs Variant B]

CTR Distribution — Key Insights

CTR

  • Variant A CTR ranges from 0.2 to 0.4
  • Variant B CTR ranges from 0.6 to 0.9

Key Insights

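  • Variant A's median is around 0.3, so on about half the days its CTR was at or below 30%
  • Variant B's median is around 0.8 (the 75% row of the summary statistics, which falls in the middle of Variant B's values because all of its CTRs exceed Variant A's), so on about half the days its CTR was at or above 80%
  • Variant B has a slightly wider spread (0.6 to 0.9 versus 0.2 to 0.4), but even its lowest daily CTR is above Variant A's highest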

⚖️ Hypothesis Testing

In [22]:
clicks = [
    A_data['Clicks'].sum(), 
    B_data['Clicks'].sum()
]

impressions = [
    A_data['Impressions'].sum(), 
    B_data['Impressions'].sum()
]

print("Clicks:", clicks)
print("Impressions:", impressions)
Clicks: [np.int64(26533), np.int64(185241)]
Impressions: [np.int64(85466), np.int64(251955)]
In [23]:
stat, pval = proportions_ztest(count=clicks, nobs=impressions, alternative='two-sided')
In [24]:
print("----- Z-test for CTR -----")
print("Z-test statistic:", stat)
print("p-value:", pval)
print("\n")

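# H0: both variants have the same CTR. H1 (two-sided): the CTRs differ.
# A negative statistic means the first group (A) has the lower pooled CTR.
if pval < 0.05:
    higher = "B" if stat < 0 else "A"
    print(f"✅ Reject H0: CTR differs significantly; Variant {higher} is higher")
else:
    print("❌ Fail to reject H0: No significant difference in CTR")
----- Z-test for CTR -----
Z-test statistic: -221.96230653986697
p-value: 0.0


✅ Reject H0: CTR differs significantly; Variant B is higher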
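
The statistic can be reproduced by hand from the totals printed above. This is a minimal sketch of the pooled two-proportion z-test, assuming the default pooled-variance form that `proportions_ztest` uses; it relies only on the standard library. The statistic is z = (p_A - p_B) / sqrt(p(1 - p)(1/n_A + 1/n_B)), where p is the pooled click rate:

```python
import math

# Totals copied from the notebook's printed output.
clicks_a, impressions_a = 26533, 85466
clicks_b, impressions_b = 185241, 251955

# Per-variant and pooled click-through rates.
p_a = clicks_a / impressions_a
p_b = clicks_b / impressions_b
p_pool = (clicks_a + clicks_b) / (impressions_a + impressions_b)

# Standard error of the difference under H0 (equal CTR).
se = math.sqrt(p_pool * (1 - p_pool) * (1 / impressions_a + 1 / impressions_b))
z = (p_a - p_b) / se

# Two-sided p-value from the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"p_A = {p_a:.4f}, p_B = {p_b:.4f}, z = {z:.2f}, p-value = {p_value}")
```

A printed p-value of 0.0 is floating-point underflow: the true value is positive but far smaller than a float can represent. The test also treats every impression as an independent trial, which is a simplifying assumption for data collected as daily totals.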

📝 Conclusion

Based on the 30-day A/B test conducted on two YouTube thumbnail designs, Variant B outperforms Variant A on every metric examined, and the difference in CTR is statistically significant.

Visualization Findings

  • Variant B received nearly 3x the impressions of Variant A on average
  • Variant B generated about 7x the total clicks over the entire experiment (185,241 vs 26,533)
  • Variant B's daily CTR stayed between 0.6 and 0.9, compared with 0.2 to 0.4 for Variant A
  • Variant B's average watch time was about 3x higher (6.81 vs 2.30), although watch time was neither plotted nor formally tested here
  • There was not a single day where Variant A matched or beat Variant B on impressions

Hypothesis Testing Findings

  • Z-test: the CTR difference is statistically significant (z ≈ -222, p < 0.05)
  • We reject H0 (equal CTR): a gap this large is very unlikely to arise from random chance alone
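
As a complementary check that treats each day, not each impression, as the unit of analysis, an exact permutation test can be run on the daily CTR values. This is a swapped-in technique and not the method used in this notebook. The sketch below uses only days 1 to 5 of each variant as printed above, so its p-value is illustrative:

```python
from itertools import combinations
from statistics import mean

# Daily CTR for days 1-5, copied from the notebook's printed rows.
ctr_a = [0.4, 0.3, 0.4, 0.2, 0.4]
ctr_b = [0.6, 0.8, 0.6, 0.7, 0.8]

values = ctr_a + ctr_b
n_a = len(ctr_a)
observed = mean(ctr_b) - mean(ctr_a)

# Enumerate every way of relabelling the 10 days into two groups of 5
# and count how often the gap is at least as large as the observed one.
extreme = 0
total = 0
for idx_a in combinations(range(len(values)), n_a):
    chosen = set(idx_a)
    group_a = [values[i] for i in chosen]
    group_b = [values[i] for i in range(len(values)) if i not in chosen]
    diff = mean(group_b) - mean(group_a)
    total += 1
    # Small tolerance guards against floating-point rounding in the comparison.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / total
print(f"Observed difference: {observed:.2f}")
print(f"Two-sided permutation p-value: {extreme}/{total} = {p_value:.4f}")
```

Because every Variant B day in this sample has a higher CTR than every Variant A day, only the observed labelling and its mirror image are as extreme, which is the smallest p-value five days per group can produce.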

✅ Recommendation

Switch to Variant B

The YouTuber should switch to Variant B as the default thumbnail. The redesigned thumbnail, with bold visuals and the creator's face visible, is far more effective at attracting viewer attention and driving clicks, and Variant B also showed a longer average watch time.
