YouTube Thumbnail A/B Testing
📌 Introduction
In the digital age, content creators on YouTube constantly compete for viewer attention. One of the most powerful yet often overlooked factors in a video's success is its thumbnail — the first visual impression a potential viewer gets before deciding to click.
This project simulates a real-world A/B test conducted by a YouTuber who wants to find out whether a redesigned thumbnail performs better than the original. By running a controlled experiment over 30 days, we can use data and statistics to make a confident, evidence-based decision — rather than relying on guesswork.
🛠️ Tools Used
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- SciPy
- Statsmodels
- Jupyter Notebook
📌 Import Libraries and Dataset
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
from statsmodels.stats.proportion import proportions_ztest
from scipy.stats import ttest_ind
df = pd.read_csv("/kaggle/input/datasets/akisavujel/thumbnail/Thumbnail_A_B_testing.csv")
🧐 Understand the Dataset (EDA)
print("----- First 5 rows -----")
print(df.head())
----- First 5 rows -----
Day Variant Impressions Clicks CTR Avg_watch_time
0 1 A 2175 870 0.4 2.0
1 2 A 3220 966 0.3 2.3
2 3 A 2537 1014 0.4 2.4
3 4 A 2502 500 0.2 2.5
4 5 A 2211 884 0.4 1.9
print("----- Last 5 rows -----")
print(df.tail())
----- Last 5 rows -----
Day Variant Impressions Clicks CTR Avg_watch_time
55 26 B 7988 6390 0.8 6.8
56 27 B 7182 6463 0.9 5.9
57 28 B 8711 5226 0.6 6.5
58 29 B 9469 7575 0.8 7.3
59 30 B 9647 7717 0.8 8.3
print("----- Data Info -----")
df.info()  # info() prints its report directly and returns None, so no print() is needed
----- Data Info -----
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60 entries, 0 to 59
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Day 60 non-null int64
1 Variant 60 non-null object
2 Impressions 60 non-null int64
3 Clicks 60 non-null int64
4 CTR 60 non-null float64
5 Avg_watch_time 60 non-null float64
dtypes: float64(2), int64(3), object(1)
memory usage: 2.9+ KB
print("----- Summary Statistics -----")
print(df.describe())
----- Summary Statistics -----
Day Impressions Clicks CTR Avg_watch_time
count 60.000000 60.000000 60.000000 60.000000 60.000000
mean 15.500000 5623.683333 3529.566667 0.523333 4.555000
std 8.728484 2880.562554 2753.453335 0.230230 2.384016
min 1.000000 2175.000000 469.000000 0.200000 1.600000
25% 8.000000 2934.750000 902.750000 0.300000 2.300000
50% 15.500000 5248.500000 2844.000000 0.500000 4.200000
75% 23.000000 8395.250000 6132.000000 0.800000 6.650000
max 30.000000 9895.000000 7717.000000 0.900000 8.400000
print("----- Missing Values -----")
print(df.isnull().sum())
print("----- Missing values (%) -----")
print(df.isnull().sum() / len(df) * 100)
----- Missing Values -----
Day 0
Variant 0
Impressions 0
Clicks 0
CTR 0
Avg_watch_time 0
dtype: int64
----- Missing values (%) -----
Day 0.0
Variant 0.0
Impressions 0.0
Clicks 0.0
CTR 0.0
Avg_watch_time 0.0
dtype: float64
print("----- Unique values in 'variant' -----")
print(df['Variant'].unique())
print("----- Value counts for 'variant' -----")
print(df['Variant'].value_counts())
----- Unique values in 'variant' -----
['A' 'B']
----- Value counts for 'variant' -----
Variant
A 30
B 30
Name: count, dtype: int64
print("----- Number of unique days -----")
print(df['Day'].nunique())
----- Number of unique days -----
30
print("----- Data types -----")
print(df.dtypes)
----- Data types -----
Day int64
Variant object
Impressions int64
Clicks int64
CTR float64
Avg_watch_time float64
dtype: object
print("----- Column Names -----")
print(df.columns.tolist())
----- Column Names -----
['Day', 'Variant', 'Impressions', 'Clicks', 'CTR', 'Avg_watch_time']
print("----- Dataset shape -----")
print(df.shape)
----- Dataset shape -----
(60, 6)
print("----- Average metrics per Variant -----")
print(df.groupby('Variant')[['Impressions', 'Clicks', 'CTR', 'Avg_watch_time']].mean())
----- Average metrics per Variant -----
Impressions Clicks CTR Avg_watch_time
Variant
A 2848.866667 884.433333 0.310000 2.30
B 8398.500000 6174.700000 0.736667 6.81
print("----- Correlation between Impressions, Clicks, CTR, Avg_watch_time -----")
print(df[['Impressions', 'Clicks', 'CTR', 'Avg_watch_time']].corr())
----- Correlation between Impressions, Clicks, CTR, Avg_watch_time -----
Impressions Clicks CTR Avg_watch_time
Impressions 1.000000 0.974118 0.899967 0.928784
Clicks 0.974118 1.000000 0.965546 0.925448
CTR 0.899967 0.965546 1.000000 0.881409
Avg_watch_time 0.928784 0.925448 0.881409 1.000000
✨ Cleaning Dataset
df = df.drop_duplicates()
print("----- Duplicates removed -----")
print("New shape:", df.shape)
----- Duplicates removed -----
New shape: (60, 6)
df = df.rename(columns={
    "Avg_watch_time": "Avg_watch_time_sec"
})
print("----- Columns after renaming -----")
print(df.columns)
----- Columns after renaming -----
Index(['Day', 'Variant', 'Impressions', 'Clicks', 'CTR', 'Avg_watch_time_sec'], dtype='object')
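The CTR column in the printed rows appears to be stored to one decimal place, so it may be worth recomputing it from Clicks and Impressions before comparing variants. The sketch below does this on the five Variant A rows shown in the head() output. The column name `CTR_exact` is a hypothetical name introduced here, and the five rows stand in for the full DataFrame.

```python
import numpy as np
import pandas as pd

# First five Variant A rows, copied from the df.head() output shown earlier.
sample = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5],
    "Impressions": [2175, 3220, 2537, 2502, 2211],
    "Clicks": [870, 966, 1014, 500, 884],
    "CTR": [0.4, 0.3, 0.4, 0.2, 0.4],
})

# Recompute CTR at full precision (CTR_exact is a hypothetical column name).
sample["CTR_exact"] = sample["Clicks"] / sample["Impressions"]

# Check that the stored CTR equals the recomputed value rounded to one decimal.
matches = np.isclose(sample["CTR_exact"].round(1), sample["CTR"])

print(sample[["Day", "CTR", "CTR_exact"]])
print("Stored CTR matches rounded recomputation:", bool(matches.all()))
```

On the full dataset the same two lines would run against `df` directly. If the check holds there as well, the recomputed column is the safer input for plots and summary statistics.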
📊 Visualize the Data
A_data = df[df['Variant'] == 'A']
B_data = df[df['Variant'] == 'B']
print("----- Variant A Data -----")
print(A_data.head())
print("----- Variant B Data -----")
print(B_data.head())
----- Variant A Data -----
Day Variant Impressions Clicks CTR Avg_watch_time_sec
0 1 A 2175 870 0.4 2.0
1 2 A 3220 966 0.3 2.3
2 3 A 2537 1014 0.4 2.4
3 4 A 2502 500 0.2 2.5
4 5 A 2211 884 0.4 1.9
----- Variant B Data -----
Day Variant Impressions Clicks CTR Avg_watch_time_sec
30 1 B 9148 5488 0.6 8.4
31 2 B 8001 6400 0.8 6.3
32 3 B 7169 4301 0.6 6.0
33 4 B 7029 4920 0.7 8.1
34 5 B 9330 7464 0.8 8.2
max_val = max(A_data['Impressions'].max(), B_data['Impressions'].max())
plt.figure(figsize=(12,5))
# Variant A - Bar chart
plt.subplot(1,2,1)
sns.barplot(x=A_data['Day'], y=A_data['Impressions'], color='purple')
plt.title('Variant A - Daily Impressions')
plt.xlabel('Day')
plt.ylabel('Impressions')
plt.ylim(0, max_val)
# Variant B - Bar chart
plt.subplot(1,2,2)
sns.barplot(x=B_data['Day'], y=B_data['Impressions'], color='green')
plt.title('Variant B - Daily Impressions')
plt.xlabel('Day')
plt.ylabel('Impressions')
plt.ylim(0, max_val)
plt.show()
Impressions — Key Insights
Impressions
- Variant A ranges from roughly 2,000 to 4,000 impressions per day
- Variant B ranges from roughly 7,000 to 10,000 impressions per day
Key Insights
- Variant B averages nearly 3x the impressions of Variant A (about 8,400 vs 2,850 per day), a large and consistent gap
- There is not a single day where Variant A matched or beat Variant B, which strongly suggests the difference is real and not random
- Variant B stays high throughout all 30 days, so the redesigned thumbnail continuously attracted more viewers
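The day-by-day comparison behind these insights can be checked directly instead of being read off the bar charts. The sketch below pivots the data so each day has one column per variant, then counts the days on which B leads. It uses only days 1 to 5 from the two head() outputs, so it illustrates the approach and does not reproduce the 30-day result. `B_over_A` is a hypothetical column name.

```python
import pandas as pd

# Days 1-5 for each variant, copied from the two head() outputs shown earlier.
sample = pd.DataFrame({
    "Day": [1, 2, 3, 4, 5] * 2,
    "Variant": ["A"] * 5 + ["B"] * 5,
    "Impressions": [2175, 3220, 2537, 2502, 2211,
                    9148, 8001, 7169, 7029, 9330],
})

# One row per day, one column per variant.
wide = sample.pivot(index="Day", columns="Variant", values="Impressions")

# Daily ratio of B to A, and the number of days on which B leads.
wide["B_over_A"] = wide["B"] / wide["A"]
days_b_ahead = int((wide["B"] > wide["A"]).sum())

print(wide.round(2))
print(f"B ahead on {days_b_ahead} of {len(wide)} days")
```

Replacing `sample` with the full `df` would give the count over all 30 days.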
plt.figure(figsize=(8, 5))
total_clicks = df.groupby('Variant')['Clicks'].sum()
plt.bar(total_clicks.index, total_clicks.values, color=['purple', 'green'], width=0.4)
plt.title('Total Clicks — Variant A vs Variant B', fontsize=14, fontweight='bold')
plt.xlabel('Variant')
plt.ylabel('Total Clicks')
plt.show()
Clicks — Key Insights
Clicks
- Variant A total clicks: ~26,500
- Variant B total clicks: ~185,000
Key Insights
- Variant B has about 7x more total clicks than Variant A, an enormous difference over 30 days
- The original thumbnail failed to attract clicks consistently throughout the experiment
- The redesigned thumbnail drove significantly more traffic
plt.figure(figsize=(8, 5))
sns.boxplot(data=df, x='Variant', y='CTR', palette=['purple', 'green'])
plt.title('CTR Distribution — Variant A vs Variant B', fontsize=14, fontweight='bold')
plt.xlabel('Variant')
plt.ylabel('CTR')
plt.show()
CTR Distribution — Key Insights
CTR
- Variant A CTR ranges from 0.2 to 0.4
- Variant B CTR ranges from 0.6 to 0.9
Key Insights
- Variant A's median is around 0.3, meaning half the days had a CTR of 30% or lower
- Variant B's median is around 0.75, meaning half the days had a CTR of 75% or higher
- Variant B has a wider spread: its CTR varied more from day to day but never fell below 0.6
⚖️ Hypothesis Testing
clicks = [
    A_data['Clicks'].sum(),
    B_data['Clicks'].sum()
]
impressions = [
    A_data['Impressions'].sum(),
    B_data['Impressions'].sum()
]
print("Clicks:", clicks)
print("Impressions:", impressions)
Clicks: [np.int64(26533), np.int64(185241)]
Impressions: [np.int64(85466), np.int64(251955)]
stat, pval = proportions_ztest(count=clicks, nobs=impressions, alternative='two-sided')
print("----- Z-test for CTR -----")
print("Z-test statistic:", stat)
print("p-value:", pval)
print("\n")
# Two-sided test: the sign of stat (CTR_A - CTR_B) shows which variant is higher.
if pval < 0.05:
    print("✅ Reject H0: Variant B has significantly higher CTR than Variant A")
else:
    print("❌ Fail to reject H0: No significant difference in CTR")
----- Z-test for CTR -----
Z-test statistic: -221.96230653986697
p-value: 0.0
✅ Reject H0: Variant B has significantly higher CTR than Variant A
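To make the test statistic less of a black box, the same pooled two-proportion z-test can be worked by hand from the click and impression totals printed above. The sketch below uses only the standard library. It also adds a 95% Wald confidence interval for the CTR lift, which is an extra not computed in the notebook. It assumes H0 is that both variants share a single CTR, and every variable name is illustrative.

```python
import math

# Totals taken from the printed Clicks and Impressions lists above.
clicks_a, clicks_b = 26533, 185241
n_a, n_b = 85466, 251955

p_a = clicks_a / n_a
p_b = clicks_b / n_b

# Pooled proportion under H0 (both variants share one CTR).
p_pool = (clicks_a + clicks_b) / (n_a + n_b)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se_pool

# Two-sided p-value from the standard normal tails.
p_value = math.erfc(abs(z) / math.sqrt(2))

# 95% Wald interval for the CTR lift (B minus A), using the unpooled SE.
se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
lift = p_b - p_a
ci_low, ci_high = lift - 1.96 * se_diff, lift + 1.96 * se_diff

print(f"CTR A = {p_a:.4f}, CTR B = {p_b:.4f}")
print(f"z = {z:.2f}, p-value = {p_value:.3g}")
print(f"95% CI for lift: [{ci_low:.4f}, {ci_high:.4f}]")
```

The z value should agree with the statistic reported above, and the p-value underflows to 0.0 in the same way. The interval is usually more useful for reporting than the p-value alone, because it states how large the CTR gain is likely to be.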
📝 Conclusion
Based on the 30-day A/B test conducted on two YouTube thumbnail designs, Variant B significantly outperforms Variant A across all key metrics.
Visualization Findings
- Variant B received nearly 3x more impressions than Variant A on average
- Variant B generated about 7x more total clicks over the entire experiment
- Variant B maintained a consistently higher CTR of 0.6 to 0.9 compared to Variant A's 0.2 to 0.4
- Variant B's average watch time was higher (6.81 vs 2.30), although this difference was not formally tested
- There was not a single day where Variant A matched or beat Variant B
Hypothesis Testing Findings
- Z-Test: CTR difference is statistically significant (p < 0.05)
- We reject H0: the difference is very unlikely to be due to random chance
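The z-test covers CTR only, and the notebook imports `ttest_ind` without calling it. One way to check the watch-time difference as well is a Welch t-test on the daily averages, sketched below. The data here is only the ten rows shown in the two head() outputs, so the result is illustrative and is not a finding for the full 30 days.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Days 1-5 of each variant, copied from the head() outputs shown earlier.
# This is a subset of the 60-row dataset, so the numbers are illustrative only.
sample = pd.DataFrame({
    "Variant": ["A"] * 5 + ["B"] * 5,
    "Avg_watch_time_sec": [2.0, 2.3, 2.4, 2.5, 1.9,
                           8.4, 6.3, 6.0, 8.1, 8.2],
})

watch_a = sample.loc[sample["Variant"] == "A", "Avg_watch_time_sec"]
watch_b = sample.loc[sample["Variant"] == "B", "Avg_watch_time_sec"]

# equal_var=False selects Welch's t-test, which does not assume equal variances.
t_stat, p_val = ttest_ind(watch_a, watch_b, equal_var=False)

print(f"Mean A = {watch_a.mean():.2f}, mean B = {watch_b.mean():.2f}")
print(f"t = {t_stat:.2f}, p-value = {p_val:.4g}")
```

Running the same call on `A_data['Avg_watch_time_sec']` and `B_data['Avg_watch_time_sec']` would test the full dataset.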
✅ Recommendation
Switch to Variant B
The YouTuber should immediately switch to Variant B as the default thumbnail. The redesigned thumbnail with bold visuals and the creator's face visible is far more effective at attracting viewer attention, driving clicks, and keeping viewers engaged longer.