Pair Trading
Market-neutral statistical arbitrage
Historical Backtest Results
Backtest of the NVIDIA (NVDA) and AMD pair over 2013-2014, demonstrating the strategy in action.

Pair Trading Positions
Trading positions for NVDA vs AMD pair showing entry/exit points based on Z-score signals.

Portfolio Performance
Total portfolio performance showing Z-score oscillations and corresponding asset value changes.

Engle-Granger Test Results
Statistical output from Engle-Granger two-step cointegration test showing significant results.
Statistical Arbitrage Theory
Pair trading rests on cointegration: two asset prices are cointegrated when each wanders like a random walk on its own, yet some linear combination of them is stationary, so they move together in the long run despite short-term divergences.
The classic analogy is "a drunk man with a dog": each wanders unpredictably, but the invisible leash (the statistical relationship) keeps them from drifting far apart.
When one asset becomes overvalued relative to its pair, we short the overvalued asset and go long the undervalued one.
The strategy profits when the pair converges back to its historical relationship.
This approach was pioneered by quantitative analysts at Morgan Stanley in the 1980s and remains a cornerstone of statistical arbitrage.
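To make the leash analogy concrete, here is a minimal simulation sketch that is not part of the original material. It assumes a random-walk price `x`, a hypothetical intercept `alpha = 10` and hedge ratio `beta = 1.5`, and a mean-reverting AR(1) disturbance standing in for the leash; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible
n = 1000

# The "drunk man": a random walk, which is non-stationary
x = 100 + np.cumsum(rng.normal(0.0, 1.0, n))

# The "leash": a stationary AR(1) disturbance with persistence 0.8
eps = np.zeros(n)
shocks = rng.normal(0.0, 1.0, n)
for t in range(1, n):
    eps[t] = 0.8 * eps[t - 1] + shocks[t]

# The "dog": tied to x through an assumed long-run relationship
alpha, beta = 10.0, 1.5
y = alpha + beta * x + eps

# The spread strips out the shared random walk and leaves only the leash
spread = y - (alpha + beta * x)
print(f"std of x: {x.std():.2f}, std of y: {y.std():.2f}, std of spread: {spread.std():.2f}")
```

Both price series drift freely, while the spread stays in a narrow band around zero. That bounded spread is what the strategy trades.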
Mathematical Foundation
1. Cointegration Relationship
Long-run equilibrium relationship between two assets, where εₜ is the residual that should be stationary.
2. Engle-Granger Step 1
OLS estimation of the cointegrating coefficient β using historical price data.
3. Augmented Dickey-Fuller Test
Test for stationarity of the residuals. H₀: γ = 0 (unit root); reject if p-value < 0.05.
4. Error Correction Model (Step 2)
The adjustment coefficient δ_y must be negative, indicating mean reversion to equilibrium.
5. Z-Score Calculation
Normalized residual used for signal generation. Trading signals are triggered at ±1σ or ±2σ thresholds.
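Written out in standard textbook notation, the five steps above take roughly the following form. This is a reconstruction consistent with the descriptions and with the code in the next section, where X is the regressor and Y the dependent price. The lag length p and the symbols α, c, δ_x, φ_i, u_t, v_t, μ_ε and σ_ε are notational assumptions, and deterministic terms in the ADF regression are omitted for brevity.

```latex
\begin{align}
Y_t &= \alpha + \beta X_t + \varepsilon_t
  && \text{(1) long-run equilibrium} \\
\hat{\varepsilon}_t &= Y_t - \hat{\alpha} - \hat{\beta} X_t
  && \text{(2) residual from the OLS fit} \\
\Delta \hat{\varepsilon}_t &= \gamma\, \hat{\varepsilon}_{t-1}
  + \sum_{i=1}^{p} \phi_i\, \Delta \hat{\varepsilon}_{t-i} + u_t
  && \text{(3) ADF regression, } H_0\colon \gamma = 0 \\
\Delta Y_t &= c + \delta_x\, \Delta X_t + \delta_y\, \hat{\varepsilon}_{t-1} + v_t
  && \text{(4) error correction model, } \delta_y < 0 \\
Z_t &= \frac{\hat{\varepsilon}_t - \mu_{\varepsilon}}{\sigma_{\varepsilon}}
  && \text{(5) Z-score}
\end{align}
```

Here μ_ε and σ_ε are the mean and standard deviation of the residuals over the estimation window.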
Core Algorithm Implementation
The pair trading algorithm consists of three main components: cointegration testing, signal generation, and position management.
Engle-Granger Two-Step Method
import pandas as pd
import statsmodels.api as sm

def EG_method(X, Y, show_summary=False):
    # Step 1: estimate the long-run equilibrium Y = alpha + beta * X + epsilon
    model1 = sm.OLS(Y, sm.add_constant(X)).fit()
    epsilon = model1.resid
    if show_summary:
        print(model1.summary())
    # Residuals must be stationary: reject the unit-root null at the 5% level
    if sm.tsa.stattools.adfuller(epsilon)[1] > 0.05:
        return False, model1
    # Step 2: error correction model, regressing dY on dX and the lagged residual
    X_dif = sm.add_constant(pd.concat([X.diff(), epsilon.shift(1)], axis=1).dropna())
    Y_dif = Y.diff().dropna()
    model2 = sm.OLS(Y_dif, X_dif).fit()
    # The adjustment coefficient (the last parameter) must be negative
    if list(model2.params)[-1] > 0:
        return False, model1
    return True, model1

Tests for cointegration using the Engle-Granger methodology. Returns True, together with the step 1 model, if the assets are cointegrated with a negative error-correction coefficient.
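The sign condition in step 2 can be checked on synthetic data. The sketch below is an illustration under stated assumptions, not the page's implementation: it uses plain NumPy least squares in place of statsmodels, skips the ADF test entirely, and the helper name `error_correction_coef` and all simulation parameters are hypothetical.

```python
import numpy as np

def error_correction_coef(x, y):
    """Return (beta, delta_y): the step 1 slope and the step 2 adjustment coefficient."""
    # Step 1: OLS of y on a constant and x
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    # Step 2: regress dy on a constant, dx and the lagged residual
    dx, dy = np.diff(x), np.diff(y)
    B = np.column_stack([np.ones_like(dx), dx, resid[:-1]])
    ecm, *_ = np.linalg.lstsq(B, dy, rcond=None)
    return coef[1], ecm[2]

rng = np.random.default_rng(0)
n = 2000
x = 50 + np.cumsum(rng.normal(size=n))       # random-walk regressor
eps = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    eps[t] = 0.5 * eps[t - 1] + shocks[t]    # mean-reverting disturbance
y = 5.0 + 2.0 * x + eps                      # assumed true beta = 2

beta_hat, delta_y = error_correction_coef(x, y)
print(f"beta_hat = {beta_hat:.3f}, delta_y = {delta_y:.3f}")
```

With a disturbance persistence of 0.5, the adjustment coefficient should come out near 0.5 - 1 = -0.5, meaning about half of any deviation from equilibrium is corrected each period. A positive value would mean deviations grow rather than shrink, which is why the function above rejects such pairs.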
Signal Generation Logic
import numpy as np
import pandas as pd

def signal_generation(asset1, asset2, method, bandwidth=250):
    signals = pd.DataFrame()
    signals['asset1'] = asset1['Close']
    signals['asset2'] = asset2['Close']
    signals['signals1'] = 0
    signals['signals2'] = 0
    for i in range(bandwidth, len(signals)):
        # Test cointegration on the trailing window (rows i-bandwidth to i-1)
        coint_status, model = method(signals['asset1'].iloc[i-bandwidth:i],
                                     signals['asset2'].iloc[i-bandwidth:i])
        if coint_status:
            # Out-of-sample residual for day i, normalized into a Z-score
            alpha, beta = model.params.iloc[0], model.params.iloc[1]
            fitted = alpha + beta * signals['asset1'].iloc[i]
            residual = signals['asset2'].iloc[i] - fitted
            z_score = (residual - np.mean(model.resid)) / np.std(model.resid)
            # Generate signals based on Z-score thresholds
            if z_score > 1:  # Upper threshold: asset2 rich relative to asset1
                signals.at[signals.index[i], 'signals1'] = 1
            elif z_score < -1:  # Lower threshold: asset2 cheap relative to asset1
                signals.at[signals.index[i], 'signals1'] = -1
    # asset2 takes the opposite side; positions are day-over-day signal changes
    signals['signals2'] = -signals['signals1']
    signals['positions1'] = signals['signals1'].diff().fillna(0)
    signals['positions2'] = signals['signals2'].diff().fillna(0)
    return signals

Generates trading signals based on Z-score deviations from the mean. signals1 = 1 means long asset1 and short asset2; signals1 = -1 means the reverse.
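The threshold rule on its own can be sketched as a small vectorized function. This is an illustrative helper, not part of the original code: the name `zscore_signals`, its arguments, and the sample residuals are all assumptions.

```python
import numpy as np

def zscore_signals(residuals, mean, std, threshold=1.0):
    """Map residuals to +1 / -1 / 0 signals for asset1 via Z-score thresholds."""
    z = (np.asarray(residuals, dtype=float) - mean) / std
    # z > threshold: asset2 is rich versus its fitted value, so long asset1 (+1)
    # z < -threshold: asset2 is cheap, so short asset1 (-1)
    signals1 = np.where(z > threshold, 1, np.where(z < -threshold, -1, 0))
    return z, signals1

z, signals1 = zscore_signals([0.2, 1.5, -0.4, -2.6, 3.0], mean=0.0, std=1.0)
signals2 = -signals1  # asset2 always takes the opposite side
print(signals1.tolist())  # [0, 1, 0, -1, 1]
print(signals2.tolist())  # [0, -1, 0, 1, -1]
```

Raising `threshold` to 2.0 would drop the 1.5 observation, leaving only the two largest deviations as trades.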
Portfolio Management
import pandas as pd

def portfolio(data):
    # data must carry prices (asset1, asset2) and trades (positions1, positions2),
    # where each positions column is the day-over-day change in the signal
    capital0 = 20000
    # Shares per unit position, sized so one full position never exceeds capital0
    positions1 = capital0 // max(data['asset1'])
    positions2 = capital0 // max(data['asset2'])
    # Cumulative exposure: +1 long, -1 short, 0 flat
    cumsum1 = data['positions1'].cumsum()
    cumsum2 = data['positions2'].cumsum()
    portfolio = pd.DataFrame()
    portfolio['holdings1'] = cumsum1 * data['asset1'] * positions1
    portfolio['cash1'] = capital0 - (data['positions1'] * data['asset1'] * positions1).cumsum()
    portfolio['total_asset1'] = portfolio['holdings1'] + portfolio['cash1']
    # Repeat for asset2 with opposite positions
    portfolio['holdings2'] = cumsum2 * data['asset2'] * positions2
    portfolio['cash2'] = capital0 - (data['positions2'] * data['asset2'] * positions2).cumsum()
    portfolio['total_asset2'] = portfolio['holdings2'] + portfolio['cash2']
    portfolio['total_asset'] = portfolio['total_asset1'] + portfolio['total_asset2']
    return portfolio

Manages portfolio allocation and tracks the performance of each asset separately before combining them into the total portfolio value.
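The holdings and cash bookkeeping is easier to follow on a tiny worked example for a single asset. The four prices and signals below are made up for illustration; only the accounting formulas mirror the function above.

```python
import numpy as np

capital0 = 20000
prices = np.array([10, 11, 12, 13])
signals = np.array([0, 1, 1, 0])        # flat, long, long, flat

shares = capital0 // prices.max()        # 20000 // 13 = 1538 shares per unit position
trades = np.diff(signals, prepend=0)     # [0, 1, 0, -1]: buy on day 1, sell on day 3
exposure = np.cumsum(trades)             # [0, 1, 1, 0]: position currently held

holdings = exposure * prices * shares
cash = capital0 - np.cumsum(trades * prices * shares)
total = holdings + cash
print(total.tolist())  # [20000, 20000, 21538, 23076]
```

Buying 1538 shares at 11 and selling at 13 yields 1538 × 2 = 3076, which matches the final total. Sizing by the maximum price over the sample guarantees the position is affordable, though it uses information that would not be known in advance in live trading.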
Implementation Steps
1. Identify two potentially cointegrated assets (same industry, stock vs ETF)
2. Apply the Engle-Granger two-step method to test for cointegration
3. Calculate the spread between the two assets using OLS regression
4. Normalize the residuals to create a Z-score for signal generation
5. Set trigger conditions based on ±1σ or ±2σ from the mean spread
6. Execute trades when the Z-score exceeds the threshold levels
7. Monitor cointegration status on rolling windows (typically 250 days)
8. Clear positions immediately when the cointegration relationship breaks
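Steps 3 to 6 can be sketched as a rolling-window computation. The version below is a simplified illustration in plain NumPy: the function name `rolling_zscores` and the synthetic data are assumptions, and it deliberately omits the per-window cointegration test of steps 2 and 7, which a fuller version would run before trusting each Z-score.

```python
import numpy as np

def rolling_zscores(x, y, bandwidth=250):
    """Out-of-sample Z-score of the spread for each day after the first window."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    z = np.full(len(x), np.nan)
    for i in range(bandwidth, len(x)):
        xw, yw = x[i - bandwidth:i], y[i - bandwidth:i]
        beta, alpha = np.polyfit(xw, yw, 1)          # step 3: OLS hedge ratio on the window
        resid = yw - (alpha + beta * xw)             # in-sample spread
        today = y[i] - (alpha + beta * x[i])         # spread on day i, not used in the fit
        z[i] = (today - resid.mean()) / resid.std()  # step 4: normalize
    return z

rng = np.random.default_rng(1)
x = 100 + np.cumsum(rng.normal(size=600))
y = 3.0 + 0.8 * x + rng.normal(scale=0.5, size=600)  # assumed relationship plus noise
z = rolling_zscores(x, y)
valid = ~np.isnan(z)
entries = valid & (np.abs(np.nan_to_num(z)) > 2)     # steps 5-6: 2-sigma trigger
print(f"{entries.sum()} trigger days out of {valid.sum()}")
```

Fitting on rows `i - bandwidth` to `i - 1` and scoring row `i` keeps each Z-score free of look-ahead, since the day being traded never enters its own regression.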
Practice Implementation
Prerequisites
Mathematical Background
- Linear regression and OLS estimation
- Time series analysis (stationarity, unit roots)
- Hypothesis testing and p-values
- Basic econometrics (error correction models)
Technical Skills
- Python programming (pandas, numpy)
- Statistical libraries (statsmodels)
- Data visualization (matplotlib)
- Financial data handling (yfinance)
Complete Implementation
Access the full Python implementation from the original quantitative trading repository:
# Complete pair trading implementation
git clone https://github.com/je-suis-tm/quant-trading.git
cd quant-trading
python "Pair trading backtest.py"
# Modify tickers and parameters for your own analysis
Learning Checkpoints
Understand Cointegration
Can you explain why two assets might be cointegrated and what breaks this relationship?
Interpret Statistical Tests
Practice reading ADF test results and deciding when the evidence supports cointegration (the unit-root null is rejected for the residuals) and when it does not.
Signal Generation
Implement Z-score calculations and understand threshold selection (±1σ vs ±2σ).
Risk Management
Understand position sizing, monitoring regime changes, and exit strategies.
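For the threshold question in the Signal Generation checkpoint, one rough way to compare ±1σ with ±2σ is the share of days a Z-score would spend beyond each band. The sketch below rests on the strong assumption that the Z-score is standard normal, which real spreads only approximate; `tail_probability` is a hypothetical helper.

```python
import math

def tail_probability(k):
    """P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2))

for k in (1.0, 2.0):
    print(f"+/-{k:.0f} sigma: {tail_probability(k):.1%} of observations")
```

Under that assumption a ±1σ band is breached on roughly a third of days and a ±2σ band on fewer than one day in twenty, so the tighter band trades far more often but on smaller deviations.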
Recommended Learning Path
Immediate Actions
- Download and run the Python script
- Test with different asset pairs
- Experiment with threshold parameters
Advanced Studies
- Learn Johansen cointegration test
- Study Vector Error Correction Models
- Explore multiple asset pair trading
Important Disclaimer
This strategy involves significant risk. Historical cointegration relationships can break permanently. Always use proper risk management, position sizing, and never risk more than you can afford to lose. Paper trade extensively before using real capital.