Introduction
Analysis of Variance, commonly known as ANOVA, is a statistical method used to test differences between two or more means. It was developed by statistician Ronald Fisher in the 1920s and has become a fundamental technique in experimental research across various fields including psychology, biology, medicine, and social sciences.
ANOVA helps researchers answer a crucial question: Are the differences between group means statistically significant, or are they merely due to random chance? By comparing variances (hence the name), ANOVA determines whether the variation between groups is significantly greater than the variation within groups.
Types of ANOVA
There are several types of ANOVA, each designed for specific research scenarios:
- One-way ANOVA: Tests for differences among three or more independent groups with one independent variable.
- Two-way ANOVA: Examines the influence of two different independent variables on a dependent variable.
- Repeated measures ANOVA: Used when the same subjects are measured multiple times.
- MANOVA (Multivariate Analysis of Variance): Tests for differences in multiple dependent variables simultaneously.
The Logic Behind ANOVA
ANOVA works by partitioning the total variance in a dataset into:
- Between-group variance: Variation due to differences between group means
- Within-group variance: Variation due to differences within each group (random error)
The method then calculates the F-statistic, which is the ratio of between-group variance to within-group variance:

F = MS_between / MS_within = (SS_between / df_between) / (SS_within / df_within)
If the F-statistic is large, it suggests that the between-group variance is larger than would be expected by chance, indicating a significant difference between at least some of the groups.
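This partition can be made concrete in code. The sketch below is self-contained, with its own synthetic groups (the group means 5, 6, 7 are arbitrary illustration values); it computes both variance components and checks the resulting F-statistic against SciPy's implementation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = [rng.normal(loc=m, scale=1.5, size=30) for m in (5, 6, 7)]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: group size times squared deviation
# of each group mean from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: squared deviations from each group's own mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f'Manual F-statistic: {f_stat:.4f}')
```

The manual value matches `scipy.stats.f_oneway` on the same groups.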
ANOVA Assumptions
Before applying ANOVA, several assumptions should be met:
- Independence: Observations within and between groups should be independent.
- Normality: The data within each group should be approximately normally distributed.
- Homogeneity of variance: The variance should be approximately equal across groups.
Implementing One-Way ANOVA in Python
Let’s implement a one-way ANOVA using Python. We’ll use libraries such as NumPy, SciPy, Pandas, and Matplotlib for data manipulation, statistical analysis, and visualization.
Setup and Data Generation
First, let’s import the necessary libraries and create some synthetic data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols
# Set a random seed for reproducibility
np.random.seed(42)
# Generate synthetic data for three treatment groups
group_a = np.random.normal(loc=5, scale=1.5, size=30)
group_b = np.random.normal(loc=6, scale=1.5, size=30)
group_c = np.random.normal(loc=7, scale=1.5, size=30)
# Create a DataFrame
data = pd.DataFrame({
    'value': np.concatenate([group_a, group_b, group_c]),
    'group': np.repeat(['A', 'B', 'C'], repeats=30)
})
# Display the first few rows
print(data.head())
      value group
0  5.745071     A
1  4.792604     A
2  5.971533     A
3  7.284545     A
4  4.648770     A
Visualizing the Data
Before conducting ANOVA, it’s helpful to visualize the data:
# Create a box plot
plt.figure(figsize=(10, 6))
sns.boxplot(x='group', y='value', data=data)
plt.title('Box Plot of Values by Group')
plt.xlabel('Group')
plt.ylabel('Value')
plt.grid(True, linestyle='--', alpha=0.7)
plt.show()
# Create a violin plot for more detailed distribution visualization
plt.figure(figsize=(10, 6))
sns.violinplot(x='group', y='value', data=data)
plt.title('Violin Plot of Values by Group')
plt.xlabel('Group')
plt.ylabel('Value')
plt.grid(True, linestyle='--', alpha=0.7)
plt.show()


Checking ANOVA Assumptions
Before performing ANOVA, we should check if our data meets the assumptions:
# 1. Check for normality within each group using Shapiro-Wilk test
for group_name, group_data in data.groupby('group'):
    stat, p_value = stats.shapiro(group_data['value'])
    print(f'Group {group_name} Shapiro-Wilk test: p-value = {p_value:.4f}')
# 2. Check for homogeneity of variance using Levene's test
stat, p_value = stats.levene(group_a, group_b, group_c)
print(f'Levene\'s test for homogeneity of variance: p-value = {p_value:.4f}')
# 3. Create Q-Q plots to visually check normality
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
for i, (group_name, group_data) in enumerate(data.groupby('group')):
    stats.probplot(group_data['value'], plot=axes[i])
    axes[i].set_title(f'Q-Q Plot for Group {group_name}')
plt.tight_layout()
plt.show()
Group A Shapiro-Wilk test: p-value = 0.6868
Group B Shapiro-Wilk test: p-value = 0.9130
Group C Shapiro-Wilk test: p-value = 0.3654
Levene's test for homogeneity of variance: p-value = 0.8627
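Here all checks pass. Had they failed (clearly non-normal data or unequal variances), a common fallback is the non-parametric Kruskal-Wallis test, which compares groups via ranks without assuming normality. A sketch on deliberately skewed synthetic data (the exponential scales are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical right-skewed groups where the normality assumption is doubtful
g1 = rng.exponential(scale=1.0, size=30)
g2 = rng.exponential(scale=1.5, size=30)
g3 = rng.exponential(scale=2.0, size=30)

# Rank-based analogue of one-way ANOVA
h_stat, p_value = stats.kruskal(g1, g2, g3)
print(f'Kruskal-Wallis H = {h_stat:.4f}, p-value = {p_value:.4f}')
```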

Performing One-Way ANOVA
Now, let’s perform the one-way ANOVA using different methods:
Method 1: SciPy’s f_oneway
# Perform one-way ANOVA using SciPy
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f'F-statistic: {f_stat:.4f}')
print(f'p-value: {p_value:.4f}')
# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: There are significant differences between groups.")
else:
    print("Fail to reject the null hypothesis: No significant differences between groups.")
F-statistic: 19.9192
p-value: 0.0000
Reject the null hypothesis: There are significant differences between groups.
Method 2: Using statsmodels for a more detailed output
# Perform ANOVA using statsmodels
model = ols('value ~ C(group)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print("\nANOVA Table:")
print(anova_table)
ANOVA Table:
              sum_sq    df          F        PR(>F)
C(group)   79.507410   2.0  19.919231  7.545192e-08
Residual  173.629808  87.0        NaN           NaN
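The F value in this table can be reproduced by hand from the sum-of-squares column: divide each sum of squares by its degrees of freedom to get the mean squares, then take their ratio:

```python
# Values taken from the ANOVA table above
ss_between, df_between = 79.507410, 2.0
ss_within, df_within = 173.629808, 87.0

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups (error)

f_stat = ms_between / ms_within
print(f'F = {f_stat:.4f}')
```

This recovers F ≈ 19.92, matching both the table and SciPy's `f_oneway` result.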
Post-hoc Tests
If ANOVA finds significant differences, we need to determine which specific groups differ from each other using post-hoc tests:
# Perform Tukey's HSD test
from statsmodels.stats.multicomp import pairwise_tukeyhsd
# Tukey's HSD test
tukey_results = pairwise_tukeyhsd(data['value'], data['group'], alpha=0.05)
print("\nTukey's HSD Test:")
print(tukey_results)
# Visualize the post-hoc results
# (plot_simultaneous creates its own figure, so we pass figsize directly)
tukey_results.plot_simultaneous(figsize=(10, 6))
plt.title("Tukey's HSD Test for Multiple Comparisons")
plt.grid(True, linestyle='--', alpha=0.7)
plt.show()
Tukey's HSD Test:
Multiple Comparison of Means - Tukey HSD, FWER=0.05
===================================================
group1 group2 meandiff  p-adj  lower  upper reject
---------------------------------------------------
     A      B   1.1005 0.0093 0.2307 1.9702   True
     A      C   2.3015 0.0000 1.4318 3.1713   True
     B      C   1.2011 0.0041 0.3313 2.0708   True
---------------------------------------------------
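Tukey's HSD is one of several multiple-comparison strategies. A simpler, more conservative alternative is running pairwise t-tests with a Bonferroni correction. The sketch below uses its own synthetic groups with the same parameters as the article's data (the draws differ because a different random generator is used):

```python
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Synthetic groups mirroring the article's setup
groups = {
    'A': rng.normal(5, 1.5, 30),
    'B': rng.normal(6, 1.5, 30),
    'C': rng.normal(7, 1.5, 30),
}

pairs = list(itertools.combinations(groups, 2))
adjusted = {}
for name1, name2 in pairs:
    t_stat, p_raw = stats.ttest_ind(groups[name1], groups[name2])
    # Bonferroni: multiply each raw p-value by the number of comparisons
    adjusted[(name1, name2)] = min(p_raw * len(pairs), 1.0)

for (name1, name2), p_adj in adjusted.items():
    print(f'{name1} vs {name2}: Bonferroni-adjusted p = {p_adj:.4f}')
```

Bonferroni controls the familywise error rate like Tukey's HSD does, but with less power when the number of comparisons grows.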

Effect Size
ANOVA tells us whether there is a significant difference, but not how large that difference is. For this, we can calculate an effect size such as eta squared (in a one-way design, eta squared and partial eta squared coincide):
# Calculate eta squared (effect size)
def eta_squared(anova_table):
    sum_sq = anova_table['sum_sq']
    eta_sq = sum_sq.iloc[0] / sum_sq.sum()
    return eta_sq
# Calculate effect size
effect_size = eta_squared(anova_table)
print(f"\nEta squared (effect size): {effect_size:.4f}")
# Interpret the effect size
if effect_size < 0.01:
    interpretation = "very small"
elif effect_size < 0.06:
    interpretation = "small"
elif effect_size < 0.14:
    interpretation = "medium"
else:
    interpretation = "large"
print(f"This indicates a {interpretation} effect size.")
Eta squared (effect size): 0.3141
This indicates a large effect size.
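Eta squared tends to overestimate the population effect in small samples. Omega squared is a commonly used, less biased alternative; a quick sketch using the sums of squares reported in the ANOVA table above:

```python
# Values taken from the ANOVA table computed earlier
ss_between, df_between = 79.507410, 2.0
ss_within, df_within = 173.629808, 87.0

ms_within = ss_within / df_within
ss_total = ss_between + ss_within

# Omega squared subtracts the error expected under the null
# from the between-group sum of squares
omega_sq = (ss_between - df_between * ms_within) / (ss_total + ms_within)
print(f'Omega squared: {omega_sq:.4f}')
```

As expected, the estimate (about 0.296) is slightly smaller than eta squared (0.314) but still indicates a large effect.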
Practical Example: Financial Analysis with ANOVA
Let’s apply ANOVA to a financial context: comparing the annual returns of three different investment strategies (Value, Growth, and Index investing) across multiple years.
# Set seed for reproducibility
np.random.seed(123)
# Generate data for three investment strategies with realistic returns
# Value investing: Mean return 8%, higher volatility
value_returns = np.random.normal(loc=8, scale=15, size=30)
# Growth investing: Mean return 11%, even higher volatility
growth_returns = np.random.normal(loc=11, scale=18, size=30)
# Index investing: Mean return 7%, lower volatility
index_returns = np.random.normal(loc=7, scale=10, size=30)
# Create a DataFrame
investment_data = pd.DataFrame({
    'annual_return': np.concatenate([value_returns, growth_returns, index_returns]),
    'strategy': np.repeat(['Value', 'Growth', 'Index'], repeats=30)
})
# Display summary statistics
print("\nAnnual Returns Summary by Investment Strategy:")
print(investment_data.groupby('strategy')['annual_return'].describe())
# Visualize the data
plt.figure(figsize=(10, 6))
sns.boxplot(x='strategy', y='annual_return', data=investment_data)
plt.title('Annual Returns by Investment Strategy')
plt.xlabel('Investment Strategy')
plt.ylabel('Annual Return (%)')
plt.axhline(y=0, color='r', linestyle='-', alpha=0.3) # Zero return reference line
plt.grid(True, linestyle='--', alpha=0.7)
plt.show()
# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(value_returns, growth_returns, index_returns)
print(f'\nANOVA Results:')
print(f'F-statistic: {f_stat:.4f}')
print(f'p-value: {p_value:.8f}')
# Check if ANOVA is significant
alpha = 0.05
if p_value < alpha:
    print("Result: There are significant differences in returns between investment strategies.")

    # Perform post-hoc test to identify which strategies differ
    tukey_results = pairwise_tukeyhsd(investment_data['annual_return'], investment_data['strategy'], alpha=0.05)
    print("\nTukey's HSD Test:")
    print(tukey_results)

    # Calculate risk-adjusted returns (simplified Sharpe ratio)
    # Assuming a risk-free rate of 2%
    risk_free = 2
    investment_data['excess_return'] = investment_data['annual_return'] - risk_free

    # Calculate Sharpe ratios for each strategy
    # ('strategy_data' avoids shadowing the earlier 'data' DataFrame)
    sharpe_ratios = {}
    for strategy, strategy_data in investment_data.groupby('strategy'):
        mean_excess_return = strategy_data['excess_return'].mean()
        std_dev = strategy_data['excess_return'].std()
        sharpe_ratios[strategy] = mean_excess_return / std_dev if std_dev > 0 else 0

    print("\nRisk-Adjusted Performance (Sharpe Ratio):")
    for strategy, ratio in sharpe_ratios.items():
        print(f"{strategy}: {ratio:.4f}")
else:
    print("Result: No significant differences in returns between investment strategies.")
Annual Returns Summary by Investment Strategy:
          count       mean        std        min       25%        50%        75%        max
strategy
Growth     30.0  13.546734  22.187742 -39.374604 -2.923294  13.819795  28.386517  54.062575
Index      30.0   6.211631  10.624298 -14.231004 -1.359263   5.204392  15.041431  27.871134
Value      30.0   8.670710  17.808009 -28.400189 -4.240566   3.732640  23.035652  41.088951

ANOVA Results:
F-statistic: 1.3601
p-value: 0.26204215
Result: No significant differences in returns between investment strategies.
This example analyzes whether different investment strategies produce statistically different returns. We generate realistic annual return data for three common investment approaches:
- Value Investing: Focusing on undervalued stocks with strong fundamentals
- Growth Investing: Targeting companies with above-average growth potential
- Index Investing: Passive strategy tracking market indices
The ANOVA test helps determine if the observed differences in returns are statistically significant, while the post-hoc test identifies which specific strategies differ from each other. Additionally, we calculate the Sharpe ratio to compare risk-adjusted performance, which is crucial for investment analysis.
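One caveat: the three strategies were simulated with quite different volatilities, so the homogeneity-of-variance assumption deserves extra scrutiny in this example. Welch's ANOVA is a standard heteroskedasticity-robust alternative; SciPy does not expose it directly, so the sketch below implements the textbook formulas (a sketch under that assumption, not production code):

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's heteroskedasticity-robust one-way ANOVA (textbook formulas)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    w = n / variances                                 # precision weights
    grand_mean = np.sum(w * means) / np.sum(w)

    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    denominator = 1 + 2 * (k - 2) * tmp / (k ** 2 - 1)

    f_stat = numerator / denominator
    df2 = (k ** 2 - 1) / (3 * tmp)                    # Welch-Satterthwaite df
    p_value = stats.f.sf(f_stat, k - 1, df2)
    return f_stat, p_value

rng = np.random.default_rng(123)
# Synthetic returns mirroring the article's unequal-volatility setup
value = rng.normal(8, 15, 30)
growth = rng.normal(11, 18, 30)
index = rng.normal(7, 10, 30)

f_w, p_w = welch_anova(value, growth, index)
print(f'Welch F = {f_w:.4f}, p-value = {p_w:.4f}')
```

When the classic F-test and Welch's test disagree, the Welch result is generally the safer one under unequal variances.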
References:
- Fisher, R. A. (1925). Statistical methods for research workers. Oliver and Boyd.
- Analysis of variance. (n.d.). In Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Analysis_of_variance
- McKinney, W. (2010). Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference, 51-56.
- Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. Proceedings of the 9th Python in Science Conference.