Disclaimer: This series is converted from my project in Empirical Finance at VU Amsterdam, co-authoring with Denzel van Beek and Marcin Grobelny. The data used in this post is simulated data.

This series has four parts:

  • Part 1: Can We Predict Stock Returns? Testing the Fama-French Model <– You’re here
  • Part 2: Did That Tax Law Actually Work? A Real-World Policy Analysis
  • Part 3: Predicting Up or Down: When Will Stocks Rise Tomorrow?
  • Part 4: Forecasting Trading Volume: When Simple Beats Complex

Introduction: The Quest to Explain Stock Returns

As an enthusiastic investor or an ambitious trader, you’ve probably asked yourself this question at least once in your life: “Can I predict tomorrow’s stock price?”

Here’s a simple way to think about it: Imagine you’re trying to forecast tomorrow’s weather - where would you start? You could just look at today’s temperature to guess tomorrow’s weather - a basic model. But you could also consider humidity, wind speed, and atmospheric pressure in addition to temperature - a more sophisticated forecast. Intuitively, more data points should lead to better predictions, right? The same logic applies to financial markets. For decades, stock investors, day traders, hedge funds, and even academic researchers have been obsessed with one question: What makes some stocks outperform others?

In the early 1990s, economist Eugene Fama and his colleague Kenneth French challenged the dominant Capital Asset Pricing Model (CAPM) by introducing something better: the Fama-French three-factor model. This framework explained stock returns using three underlying elements rather than just market risk alone. Then in 2015, they went further by adding two more factors to create the five-factor model, designed to capture even more nuances in asset returns. You can think of the Fama-French model as a sophisticated weather forecast for your portfolio—it uses multiple “ingredients” to predict how stocks will perform.

But here’s the million-dollar question we’ll explore in this post: Does adding more factors actually improve predictions? Spoiler alert: The answer isn’t as straightforward as you might think.

What Are These “Factors” Anyway?

The factors in the Fama-French model represent key characteristics of stocks and markets that have predictive power for future returns.

The Three Original Factors (Fama-French 3-Factor Model)

  1. Market Risk Premium (MktRF): Measures how much the overall stock market beats safe investments like Treasury bills. Mathematically, it’s \(R_m - R_f\), the extra return you earn for accepting the risk of investing in stocks instead of playing it safe.
  2. Size (SMB - Small Minus Big): Captures whether small-cap companies outperform large-cap companies. The logic is straightforward: small startups are expected to exhibit more growth potential than mature giants like Apple, so investors expect to be rewarded for that upside.
  3. Value (HML - High Minus Low): Can bargain stocks beat the market’s favourite “golden-egg-laying geese”? Imagine investing in a hidden gem restaurant with steady profits vs. paying top dollar for the next big franchise.

The Two Additional Factors (Making it 5-Factor)

  1. Profitability (RMW - Robust Minus Weak): Can profitable companies beat money-losing ones? It sounds obvious, but the data backs it up: businesses that actually make money tend to outperform those burning cash.
  2. Investment (CMA - Conservative Minus Aggressive): Do companies that invest conservatively beat aggressive spenders? Think of it this way: firms that invest prudently show financial discipline, while aggressive spenders bet big on growth. Historically, the conservative approach tends to win out.

Myth-buster: Do sophisticated models outperform simple ones in prediction?

Our Data: 120 Companies Over 37 Years

Table 1: Dataset Overview

| Characteristic | Value |
|---|---|
| Sample Size | 120 U.S. Companies |
| Time Period | Jan 1986 - Dec 2022 |
| Frequency | Monthly |
| Total Observations | 16,834 |
| Industries Covered | 8 sectors |

What we measured: How much each stock’s return exceeded the risk-free rate (called “excess return”)

Key Findings

Finding #1: More Factors Don’t Always Mean Better Models

The Baseline Model

Before we dive into more complex models, let’s set up two simple baseline Fama-French five-factor (FF5F) models for comparison:

  • Pooled OLS: The one-size-fits-all model that treats all companies the same
  • Firm Fixed Effects: A corrected model that acknowledges each company is different and deserves its own baseline

What We Found

Table 3: Model Performance Comparison

| Factor | Pooled OLS | Firm FE |
|---|---|---|
| Market Risk Premium | 1.0311*** | 1.0311*** |
| SMB (Size) | 0.1641*** | 0.1641*** |
| HML (Value) | 0.3502*** | 0.3502*** |
| RMW (Profitability) | 0.2319*** | 0.2319*** |
| CMA (Investment) | 0.1765*** | 0.1765*** |
| R-squared | 0.2608 | 0.2611 |

*** = statistically significant at 1% level

Key Insight: The coefficients are identical across the two models! Here’s why: firm fixed effects add nothing when all your explanatory variables are market-level factors. The Fama-French factors vary over time, but they’re the same for every company in any given month.

Look closely at the model specification:

$$ \text{EXC\_RET}_{it} = \beta_0 + \beta_1 \text{MktRF}_t + \beta_2 \text{SMB}_t + \beta_3 \text{HML}_t + \beta_4 \text{RMW}_t + \beta_5 \text{CMA}_t + \varepsilon_{it} $$

Can you spot the difference? Returns are indexed \({it}\) (firm \(i\) at time \(t\)), but the factors are just \({t}\) (time only). They’re market-wide, not firm-specific.
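To see why the within (fixed-effects) transformation changes nothing here, consider a toy panel where the only regressor is a market-wide factor. In a balanced panel, demeaning such a regressor by firm is the same as demeaning it overall, so the pooled and fixed-effects slopes coincide exactly. A minimal Python sketch with made-up numbers (not our dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_months = 5, 200

# One market-wide factor: varies over time, identical for every firm
factor = rng.normal(size=n_months)
x = np.tile(factor, n_firms)                      # stacked panel, firm-major order

# Firm-specific alphas plus a common loading of 1.0 on the factor
alphas = rng.normal(size=n_firms)
y = np.repeat(alphas, n_months) + 1.0 * x + rng.normal(scale=0.1, size=x.size)

# Pooled OLS slope (regression with a common intercept)
X = np.column_stack([np.ones_like(x), x])
beta_pooled = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Within (fixed-effects) slope: demean y and x by firm
firm = np.repeat(np.arange(n_firms), n_months)
y_dm = y - np.bincount(firm, y)[firm] / n_months
x_dm = x - np.bincount(firm, x)[firm] / n_months
beta_fe = (x_dm @ y_dm) / (x_dm @ x_dm)

print(beta_pooled, beta_fe)   # the two slopes coincide
```

The firm demeaning removes the alphas from \(y\), but since those are constant within each firm they are orthogonal to the demeaned factor, leaving the slope untouched.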

Statistical Test: F-test for Fixed Effects

Since we include firm fixed effects, we need to test whether they are necessary via an F-test:

  • Null hypothesis: \(H_0: \alpha_i = \beta_0, \ i = 1,...,120\) (all firms share the same intercept \(\beta_0\)).
  • Alternative hypothesis: \(H_1:\) At least one \(\alpha_i \neq \alpha_j, \text{ for } i \neq j\) (firms have different intercepts).
  • Test statistic:

$$ F = \frac{N-k_1}{k_1-k_0} \cdot \frac{SSR_0 - SSR_1}{SSR_1} $$ where \(SSR_0\) is the sum of squared residuals under null (pooled model), \(SSR_1\) is the sum of squared residuals under alternative (fixed effects model), and \(k_0,k_1\) are the number of parameters under null and alternative.
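The formula is easy to compute by hand. Here is a short Python sketch with hypothetical SSR values (not our actual regression output); with 120 firms, the fixed-effects model adds 119 extra intercepts, so \(k_1 - k_0 = 119\):

```python
def f_stat(ssr0, ssr1, k0, k1, n):
    """F statistic comparing a restricted model (ssr0, k0 parameters)
    against an unrestricted one (ssr1, k1 parameters) on n observations."""
    return ((ssr0 - ssr1) / (k1 - k0)) / (ssr1 / (n - k1))

# Hypothetical SSRs: 119 extra firm intercepts barely reduce the residual sum,
# so the F statistic lands well below conventional critical values
print(round(f_stat(ssr0=1000.0, ssr1=996.0, k0=6, k1=125, n=16834), 3))  # 0.564
```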

## 
## 	F test for individual effects
## 
## data:  EXC_RET ~ MktRF + SMB + HML + RMW + CMA
## F = 0.57551, df1 = 37, df2 = 16791, p-value = 0.9819
## alternative hypothesis: significant effects

Result: F-statistic = 0.5755, p-value = 0.9819. With such a high p-value, we fail to reject the null hypothesis. This confirms that firm fixed effects add no meaningful value to the model.

What does this tell us?

  1. Simpler models are often better (Occam’s Razor applies to finance too!)
  2. Time-based factors capture most of the action in stock returns
  3. Individual company characteristics are already reflected in the factor exposures

Finding #2: Stock Returns Don’t Move in Straight Lines

The Non-Linearity Test

We also need to check if the model is actually linear. Does adding non-linear terms (specifically, squared terms) improve our predictions?

How it works:

  • Add squared versions of each factor to the model
  • If they’re significant, the model contains non-linear relationships

Intuitively, we’re checking whether the car (our FF5F model) moves at a constant speed or accelerates over time.

Auxiliary Regression Model:

$$ \begin{aligned} \text{EXC\_RET}_{it} &= \beta_0 + \beta_1 \text{MktRF}_t + \beta_2 \text{SMB}_t + \beta_3 \text{HML}_t + \beta_4 \text{RMW}_t + \beta_5 \text{CMA}_t \\ &\quad + \gamma_1 \text{MktRF}_t^2 + \gamma_2 \text{SMB}_t^2 + \gamma_3 \text{HML}_t^2 + \gamma_4 \text{RMW}_t^2 + \gamma_5 \text{CMA}_t^2 + \varepsilon_{it} \end{aligned} $$

We run a joint F-test to see if there is non-linearity:

  • Null hypothesis: \(H_0: \gamma_1 = \gamma_2 = \gamma_3 = \gamma_4 = \gamma_5 = 0\)
  • Alternative hypothesis: \(H_1:\) At least one \(\gamma_j \neq 0, \ j = 1,...,5\).
  • Test statistic:

$$ F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(n-k)} $$

where \(RSS_R\) = residual sum of squares from the restricted model (linear only), \(RSS_U\) = residual sum of squares from the unrestricted model (with squared terms), \(q = k_1-k_0 =5\) is the number of restrictions (squared terms), \(n\) = number of observations, and \(k\) = number of parameters in the unrestricted model.
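The mechanics of this test are straightforward to reproduce on synthetic data: fit the model with and without squared terms, compare residual sums of squares, and form the F statistic. A Python sketch with two invented stand-in factors, where one genuinely has curvature:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
f1, f2 = rng.normal(size=(2, n))        # two stand-in "factors"

# True model: linear in f1, but with genuine curvature in f2
y = 1.0 * f1 + 0.5 * f2 + 0.3 * f2**2 + rng.normal(scale=0.5, size=n)

ones = np.ones(n)
X_r = np.column_stack([ones, f1, f2])                  # restricted: linear only
X_u = np.column_stack([ones, f1, f2, f1**2, f2**2])    # unrestricted: + squares

def rss(X, y):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    return e @ e

q = X_u.shape[1] - X_r.shape[1]            # 2 restrictions (the squared terms)
F = ((rss(X_r, y) - rss(X_u, y)) / q) / (rss(X_u, y) / (n - X_u.shape[1]))
print(F > 10)   # the planted curvature makes the joint test fire strongly
```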

Results:

Table 5: Auxiliary Regression with Squared Terms

| Factor | Linear | Squared |
|---|---|---|
| MktRF | 1.0340*** | 0.0022 |
| SMB | 0.0017 | 0.2155*** |
| HML | 0.1712*** | -0.0032 |
| RMW | -0.0065 | 0.1686*** |
| CMA | 0.3453*** | 0.0100 |

Joint F-test: χ² = 12.17, p-value = 0.0325

Interpretation & Implications

What does this mean?

The joint F-test (p-value = 0.0325) rejects the linear specification, meaning that at least some factor relationships are curved, not straight lines. Looking at individual factors, we can see that:

  • SMB² (0.2155***) and RMW² (0.1686***): The size and profitability effects are non-linear. Extremely small companies don’t just underperform, they face accelerating disadvantages (think liquidity crises, higher bankruptcy risk). Likewise, highly profitable firms might hit diminishing returns or attract regulatory attention.
  • MktRF², HML², CMA²: The squared terms for these factors are insignificant, so their effects appear to be approximately linear.

The Intuition

Think of it this way:

  1. Linear relationship: Company size drops X% → returns drop Y% (proportional)
  2. Non-linear relationship: Same X% drop → returns plunge beyond Y% (small firms hit harder by liquidity issues, market access problems, etc.)

It’s like driving: going from 50 to 100 km/h doubles your speed, but fuel consumption more than doubles due to air resistance. Similarly, extreme values in size and profitability don’t just scale excess returns, they amplify them.

Implication

The standard linear FF5F model is clearly misspecified for the data at hand, evidenced by the curvature at extreme values. Incorporating these non-linear terms could enhance our predictive power, especially for companies at the tails of the size and profitability distributions.

Finding #3: The “Value Premium” Mystery Solved

The Puzzle

Experienced traders often observe a puzzling pattern: “value stocks” (high book-to-market ratio) consistently outperform “growth stocks.” But have you ever stopped to ask why?

Our Investigation

We compared the HML coefficient (value factor) in two models:

Table 7: HML Coefficient Comparison

| | FF 3-Factor | FF 5-Factor |
|---|---|---|
| HML Coefficient | 0.4725*** | 0.3502*** |
| Change | | ↓ 25.9% |
| Adjusted R² | 0.2572 | 0.2606 |

The Big Reveal

When we added profitability (RMW) and investment (CMA) factors, the HML coefficient dropped by 26%!

What’s Happening Here?

  • Before (3-Factor Model):

    • HML captured everything about “value stocks”. This included their tendency to be more profitable (robust) and invest conservatively (conservative).
    • HML was getting credit for effects that weren’t really about “value”.
  • After (5-Factor Model):

    • RMW captures the profitability effect while CMA captures the investment effect.
    • HML now shows the true value premium (which is smaller than we thought!).

Intuitive Example

Remember the classic spurious correlation? Ice cream sales seem to “cause” shark attacks because both spike in summer. Once you control for summer swimming, the relationship disappears - it was never about ice cream. The FF5F model does exactly this: it separates each factor’s genuine effect from credit it was borrowing from correlated factors.

Why the 5-Factor Model is Better

  1. Less omitted variable bias: We’re not giving HML credit for profitability/investment effects.
  2. Higher R²: Explains 0.34 percentage points more variance.
  3. More precise estimates: Each factor captures its unique contribution.

Correlation Check

Table 9: Factor Correlations

| | HML | RMW | CMA |
|---|---|---|---|
| HML | 1.0000 | 0.3086 | 0.6507 |
| RMW | 0.3086 | 1.0000 | 0.1844 |
| CMA | 0.6507 | 0.1844 | 1.0000 |

These correlations explain why the 3-factor model overstated HML’s effect!
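The mechanism behind the shrinking HML coefficient is classic omitted-variable bias, and it is easy to reproduce: simulate two correlated factors, give each its own effect, and watch the short regression overstate the first one. A Python sketch with invented numbers (a 0.65 correlation, mirroring the HML-CMA correlation above):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Build a CMA-like factor correlated ~0.65 with the HML-like factor
hml = rng.normal(size=n)
cma = 0.65 * hml + np.sqrt(1 - 0.65**2) * rng.normal(size=n)

# True return process: both factors matter
y = 0.35 * hml + 0.18 * cma + rng.normal(scale=2.0, size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_short = ols(np.column_stack([np.ones(n), hml]), y)[1]        # omit CMA
b_long = ols(np.column_stack([np.ones(n), hml, cma]), y)[1]    # control for it

# The short regression credits HML with part of CMA's effect:
# roughly 0.35 + 0.18 * 0.65 ≈ 0.47, versus ~0.35 once CMA is included
print(b_short, b_long)
```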

Finding #4: Testing For Causal Effect Of Policies

The Policy Experiment

Suppose a new tax law targeting the Manufacturing and Finance sectors was introduced on January 1, 2011. Should you be concerned about this policy change?

Difference-in-Differences (DiD) Approach

We can treat the new tax policy as a scientific experiment:

  • Treatment group: Manufacturing & Finance companies (affected by the policy)
  • Control group: All other sectors (weren’t affected)
  • Before period: 1986-2010 (25 years)
  • After period: 2011-2022 (12 years)

The Question: Do treated firms’ returns improve relative to control firms after the policy?

The DiD Test

Model Setup:

$$ \text{EXC\_RET}_{it} = \alpha_i + \beta_1 \text{After}_t + \beta_2 (\text{Sec\_MF}_i \times \text{After}_t) + \varepsilon_{it} $$

where:

  • \(\alpha_i\) = Firm-specific baseline
  • \(\beta_1\) = General time trend affecting all firms
  • \(\beta_2\) = The causal effect we care about (DiD estimator)

We test on the coefficient of the interaction term:

  • Null hypothesis: \(H_0: \beta_2 = 0\) (no causal effect of tax law)
  • Alternative hypothesis: \(H_1: \beta_2 \neq 0\) (causal effect of tax law exists)
  • Test statistic:

$$ t = \frac{\hat{\beta_2}}{\text{s.e.}(\hat{\beta_2})} $$
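In the two-group, two-period case the DiD estimator is simply a difference of mean differences, which a few lines of Python can illustrate with invented returns and a planted treatment effect:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical monthly excess returns (%), with a planted +0.5 treatment effect
ctrl_before = rng.normal(0.8, 1.0, 1000)
ctrl_after  = rng.normal(0.6, 1.0, 1000)   # common -0.2 time trend
trt_before  = rng.normal(1.0, 1.0, 1000)
trt_after   = rng.normal(1.3, 1.0, 1000)   # 1.0 - 0.2 trend + 0.5 effect

# DiD estimator: (treated change) minus (control change)
did = (trt_after.mean() - trt_before.mean()) - (ctrl_after.mean() - ctrl_before.mean())
print(did)   # recovers roughly the planted +0.5
```

Subtracting the control group’s change strips out the common time trend, which is exactly what the \(\beta_1 \text{After}_t\) term does in the regression version.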

Results

Table 11: Difference-in-Differences Results

| Variable | Coefficient | p-value |
|---|---|---|
| After (time effect) | -0.1030 | 0.634 |
| Treated × After (DiD) | 0.0539 | 0.857 |

Test statistic: t = 0.1804. Conclusion: fail to reject H₀ (no effect).

We fail to reject the null hypothesis at the 5% significance level (t = 0.1804, p-value = 0.857). The policy change did not produce a meaningful differential impact on Manufacturing/Finance firms relative to other industries (as defined earlier). Translation: the tax law had no measurable impact on Manufacturing/Finance firms’ stock returns.

What could be the reasons?

  1. Markets are efficient: If investors expected the benefit, it was already priced in before 2011.
  2. Policy was ineffective: The tax benefit was too small or poorly designed.
  3. Offsetting factors: Other economic changes (2008 financial crisis recovery) dominated.
  4. Compliance issues: Companies didn’t actually benefit as intended.

Key Takeaways for Investors

  1. The Market Factor Dominates
     • Market Risk Premium coefficient ≈ 1.03
     • Meaning: If the market goes up 1%, your stock goes up ~1.03%
     • This explains most of the variation in returns
  2. Size and Value Still Matter (But Less Than We Thought)
     • Small stocks outperform by 0.16% per month (1.9% annually)
     • Value stocks outperform by 0.35% per month (4.2% annually) in the 5-factor model
     • But remember: these are risk premiums, not free money
  3. Profitability is Underrated
     • RMW coefficient = 0.23 (highly significant)
     • Profitable companies outperform by 2.8% annually
     • This was previously hidden in the “value” effect
  4. Non-Linearity Creates Opportunities
     • Extreme values of factors have disproportionate effects
     • Simple linear models miss these nuances
     • Sophisticated investors can exploit these patterns
  5. Policy Impacts Are Hard to Measure
     • Even well-intentioned policies may not affect stock returns
     • Markets are forward-looking and efficient
     • Be skeptical of claims that policy changes will “boost” certain sectors

Technical Summary Of The Analysis

Statistical Rigour: What We Tested

  • Test 1: Fixed Effects Necessity

    • Null hypothesis: All firms share the same intercept
    • Test: F-test comparing pooled vs. fixed effects models
    • Result: F = 0.58, p = 0.98 → Fixed effects not needed
    • Conclusion: Time effects dominate firm effects
  • Test 2: Linearity

    • Null hypothesis: All squared terms = 0 (linear relationships)
    • Test: Joint F-test on five squared factor terms
    • Result: \(\chi^2\) = 12.17, p = 0.033 → Reject linearity
    • Conclusion: At least some relationships are non-linear
  • Test 3: Model Comparison (FF3 vs FF5)

    • Metric: Adjusted R²
    • FF3: 0.2572
    • FF5: 0.2606
    • Improvement: +0.34 percentage points (meaningful in finance)
  • Test 4: Causal Effect of a Hypothetical Tax Policy

    • Null hypothesis: \(\beta_2 = 0\) (no causal effect)
    • Test: t-test on interaction term
    • Result: t = 0.18, p = 0.86 → Fail to reject \(H_0\)
    • Conclusion: No evidence of policy impact

Technical Notes

  • Standard Errors: All models use heteroskedasticity-robust standard errors (White’s correction).

  • Significance Levels:

    • *** = p < 0.01 (99% confidence)
    • ** = p < 0.05 (95% confidence)
    • * = p < 0.10 (90% confidence)
  • Software: Analysis conducted in R using packages: plm, lmtest, sandwich, car

  • Data Source: simulated data for stock return performance
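For intuition on what White’s correction does, the sandwich covariance \((X'X)^{-1} X' \operatorname{diag}(e^2) X (X'X)^{-1}\) can be computed by hand. A Python sketch on simulated heteroskedastic data (an illustration, not our regression):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)
# Heteroskedastic errors: the error spread grows with |x|
y = 2.0 + 1.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
# Classical (homoskedastic) covariance
cov_ols = (e @ e) / (n - 2) * XtX_inv
# White (HC0) sandwich covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = X.T @ (X * e[:, None] ** 2)
cov_hc0 = XtX_inv @ meat @ XtX_inv

se_ols, se_hc0 = np.sqrt(np.diag(cov_ols)), np.sqrt(np.diag(cov_hc0))
print(se_ols[1] < se_hc0[1])   # robust slope SE is larger here
```

With variance rising in \(|x|\), the classical formula understates the slope’s standard error, which is why robust errors are the safe default in this kind of panel.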

Code Sample

  • Data Preparation

# Load required packages
library(tidyverse)  # read_csv(), mutate(), pipes
library(lubridate)  # ymd()
library(plm)        # panel data structures and estimators

# Load data
raw_dt <- read_csv("Data/data_case_1_group_79.csv") # change your data path here
raw_dt$date <- ymd(paste(raw_dt$Year, raw_dt$Month, "01", sep = "-"))

# Create excess returns
raw_dt$EXC_RET <- raw_dt$RET - raw_dt$RF

# Convert to panel data structure
raw_dt <- pdata.frame(raw_dt, index=c("IDCODE", "time")) # firm ID and month
  • Model Estimation

# Five-factor model with firm fixed effects
fixed_i <- plm(EXC_RET ~ MktRF + SMB + HML + RMW + CMA, 
               data=raw_dt, 
               model="within", 
               effect="individual")
  • Non-Linearity Test

# Add squared terms
mdl_aux <- plm(EXC_RET ~ MktRF + I(MktRF^2) + SMB + I(SMB^2) + 
                         HML + I(HML^2) + RMW + I(RMW^2) + 
                         CMA + I(CMA^2), 
               data=raw_dt, model="pooling")

# Joint test on the squared terms (from the car package)
library(car)
linearHypothesis(mdl_aux, 
                 c("I(MktRF^2)=0", "I(SMB^2)=0", "I(HML^2)=0", 
                   "I(RMW^2)=0", "I(CMA^2)=0"))
  • DiD Analysis

# Create treatment indicators
raw_dt <- raw_dt %>% 
  mutate(after = if_else(time >= 301, 1, 0),              # month index 301 = Jan 2011
         treated = if_else(Industry %in% c(4, 8), 1, 0))  # Manufacturing & Finance

# DiD regression (the treated main effect is absorbed by the firm fixed effects)
did_model <- plm(EXC_RET ~ after + treated:after, 
                 data=raw_dt, 
                 model="within", 
                 effect="individual")
  • Session Info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.7.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Amsterdam
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] kableExtra_1.4.0 knitr_1.48       car_3.1-2        carData_3.0-5   
##  [5] sandwich_3.1-1   lmtest_0.9-40    zoo_1.8-12       plm_2.6-4       
##  [9] lubridate_1.9.3  forcats_1.0.0    stringr_1.5.1    dplyr_1.1.4     
## [13] purrr_1.0.2      readr_2.1.5      tidyr_1.3.1      tibble_3.2.1    
## [17] ggplot2_3.5.1    tidyverse_2.0.0 
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.5        xfun_0.46           bslib_0.8.0        
##  [4] collapse_2.0.19     lattice_0.22-6      tzdb_0.4.0         
##  [7] numDeriv_2016.8-1.1 vctrs_0.6.5         tools_4.4.2        
## [10] Rdpack_2.6.2        generics_0.1.3      parallel_4.4.2     
## [13] fansi_1.0.6         highr_0.11          pkgconfig_2.0.3    
## [16] Matrix_1.7-1        stringmagic_1.1.2   lifecycle_1.0.4    
## [19] compiler_4.4.2      maxLik_1.5-2.1      munsell_0.5.1      
## [22] htmltools_0.5.8.1   sass_0.4.9          yaml_2.3.10        
## [25] Formula_1.2-5       crayon_1.5.3        pillar_1.9.0       
## [28] jquerylib_0.1.4     MASS_7.3-61         cachem_1.1.0       
## [31] abind_1.4-5         nlme_3.1-166        tidyselect_1.2.1   
## [34] bdsmatrix_1.3-7     digest_0.6.36       stringi_1.8.4      
## [37] bookdown_0.40       miscTools_0.6-28    fastmap_1.2.0      
## [40] grid_4.4.2          lfe_3.1.1           colorspace_2.1-1   
## [43] cli_3.6.5           magrittr_2.0.3      utf8_1.2.4         
## [46] withr_3.0.2         dreamerr_1.4.0      scales_1.3.0       
## [49] bit64_4.0.5         timechange_0.3.0    rmarkdown_2.27     
## [52] bit_4.0.5           blogdown_1.19       hms_1.1.3          
## [55] evaluate_0.24.0     rbibutils_2.3       fixest_0.12.1      
## [58] viridisLite_0.4.2   rlang_1.1.6         Rcpp_1.0.13-1      
## [61] xtable_1.8-4        glue_1.8.0          xml2_1.3.6         
## [64] vroom_1.6.5         svglite_2.1.3       rstudioapi_0.16.0  
## [67] jsonlite_1.8.8      R6_2.5.1            systemfonts_1.1.0

What’s Next?

In Part 2, we’ll shift from predicting return magnitudes to predicting return directions:

  • Can we predict whether a stock will go up or down tomorrow?
  • How does market volatility (VIX) affect these predictions?
  • Do more complex models actually improve prediction accuracy?

Stay tuned for logistic regression, ROC curves, and the surprising finding that simpler models sometimes win!


References & Further Reading

  1. Fama, E. F., & French, K. R. (1993). “Common risk factors in the returns on stocks and bonds.” Journal of Financial Economics, 33(1), 3-56.

  2. Fama, E. F., & French, K. R. (2015). “A five-factor asset pricing model.” Journal of Financial Economics, 116(1), 1-22.