WquGuru·QuantLearn

Interactive Strategy Learning

Arbitrage · Intermediate

Pair Trading

Market-neutral statistical arbitrage

Historical Backtest Results

Backtest results for the NVIDIA (NVDA) and AMD pair over 2013-2014, demonstrating the strategy in action.

Pair Trading Positions

Trading positions for NVDA vs AMD pair showing entry/exit points based on Z-score signals.

Portfolio Performance

Total portfolio performance showing Z-score oscillations and corresponding asset value changes.

Engle-Granger Test Results

Statistical output from Engle-Granger two-step cointegration test showing significant results.

Statistical Arbitrage Theory

Pair trading is based on the concept of cointegration: two assets move together in the long run despite short-term divergences.

The strategy works like "a drunk man with a dog" - the invisible leash (statistical relationship) keeps both assets in check.

When one asset becomes relatively overvalued compared to its pair, we short the overvalued asset and long the undervalued one.

The strategy profits when the pair converges back to its historical relationship.

This approach was pioneered by quantitative analysts at Morgan Stanley in the 1980s and remains a cornerstone of statistical arbitrage.

Mathematical Foundation

1. Cointegration Relationship

Y_t = α + β X_t + ε_t

Long-run equilibrium relationship between two assets, where ε_t is the residual that should be stationary.

2. Engle-Granger Step 1

β̂ = Σ(X_t - X̄)(Y_t - Ȳ) / Σ(X_t - X̄)²

OLS estimation of the cointegrating coefficient β using historical price data.
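
As a minimal sketch of the formula above, the slope can be computed directly from deviations around the means, with the intercept following as α̂ = Ȳ - β̂ X̄. The function name ols_hedge_ratio and the price arrays are hypothetical, chosen only for illustration.

```python
import numpy as np

def ols_hedge_ratio(x, y):
    """Return (alpha, beta) from regressing y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    # beta = sum of cross-deviations over sum of squared x-deviations
    beta = np.sum(x_dev * y_dev) / np.sum(x_dev ** 2)
    alpha = y.mean() - beta * x.mean()
    return alpha, beta

# Hypothetical prices where y is exactly 2 + 0.5 * x
x = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
y = 2.0 + 0.5 * x
alpha, beta = ols_hedge_ratio(x, y)
```

On real prices the fit is not exact, and the leftover y - (alpha + beta * x) is the residual series ε_t that the next step tests for stationarity.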

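3. Augmented Dickey-Fuller Test

Δε_t = γ ε_{t-1} + Σ φ_i Δε_{t-i} + u_t

Test for stationarity of the residuals. H₀: γ = 0 (unit root, so no cointegration); reject H₀ if the p-value is below 0.05. Because ε_t comes from an estimated regression, the standard ADF p-value is somewhat lenient; the Engle-Granger critical values used by statsmodels.tsa.stattools.coint are stricter.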
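
To make the test regression concrete, here is a simplified sketch with no lagged difference terms (the plain Dickey-Fuller case). It only estimates γ and its t-statistic; turning the statistic into a p-value needs the non-standard critical-value tables that statsmodels' adfuller supplies. The function name and the simulated series are assumptions for illustration.

```python
import numpy as np

def dickey_fuller_stat(resid):
    """Estimate gamma in d(eps_t) = gamma * eps_{t-1} + u_t and its t-statistic."""
    resid = np.asarray(resid, dtype=float)
    lagged = resid[:-1]
    delta = np.diff(resid)
    gamma = np.dot(lagged, delta) / np.dot(lagged, lagged)
    errors = delta - gamma * lagged
    sigma2 = np.dot(errors, errors) / (len(delta) - 1)
    t_stat = gamma / np.sqrt(sigma2 / np.dot(lagged, lagged))
    return gamma, t_stat

rng = np.random.default_rng(0)
shocks = rng.normal(size=500)

# Unit-root series: gamma should be close to 0
random_walk = np.cumsum(shocks)

# Stationary AR(1) with phi = 0.5: gamma should be close to phi - 1 = -0.5
mean_reverting = np.zeros(500)
for t in range(1, 500):
    mean_reverting[t] = 0.5 * mean_reverting[t - 1] + shocks[t]

gamma_rw, t_rw = dickey_fuller_stat(random_walk)
gamma_mr, t_mr = dickey_fuller_stat(mean_reverting)
```

A strongly negative t-statistic, as for the mean-reverting series here, is what leads to rejecting the unit root.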

4. Error Correction Model (Step 2)

ΔY_t = α_y + δ_y ε_{t-1} + Σ γ_{yi} ΔY_{t-i} + Σ λ_{yi} ΔX_{t-i} + v_{yt}

The adjustment coefficient δ_y must be negative: when ε_{t-1} is positive (Y above its equilibrium level), Y is then pulled back down, indicating mean reversion to equilibrium.
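
The sign condition can be checked on simulated data. The sketch below is a simplified version of the model above: it drops the lagged ΔY and ΔX terms and regresses ΔY_t on a constant, ΔX_t and ε_{t-1} only. The simulated pair and its parameters (intercept 1, slope 2, AR coefficient 0.7) are assumptions picked so that the true adjustment speed is known to be -0.3.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Hypothetical cointegrated pair: X is a random walk, Y = 1 + 2*X + eps,
# where eps is AR(1) with phi = 0.7, so the true adjustment speed is 0.7 - 1 = -0.3
x = np.cumsum(rng.normal(size=n))
eps = np.zeros(n)
shocks = rng.normal(scale=0.5, size=n)
for t in range(1, n):
    eps[t] = 0.7 * eps[t - 1] + shocks[t]
y = 1.0 + 2.0 * x + eps

# Step 1: long-run regression and its residuals
beta, alpha = np.polyfit(x, y, 1)
resid = y - (alpha + beta * x)

# Step 2: regress dY_t on a constant, dX_t and the lagged residual eps_{t-1}
design = np.column_stack([np.ones(n - 1), np.diff(x), resid[:-1]])
coefs, *_ = np.linalg.lstsq(design, np.diff(y), rcond=None)
delta = coefs[2]  # estimated adjustment coefficient
```

The estimated delta lands near -0.3, meaning roughly 30% of any deviation from equilibrium is corrected each period.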

5. Z-Score Calculation

Z_t = (ε_t - μ_ε) / σ_ε

Normalized residual used for signal generation. Trading signals trigger when Z_t crosses ±1 or ±2, that is, when the residual is one or two standard deviations from its mean.
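
A small worked example shows how the threshold choice changes the outcome. This is a sketch with made-up residual values; the function name zscore_signal is hypothetical, and the sign convention (1 means long X and short Y) is assumed to follow the signals1 column used later on this page.

```python
import statistics

def zscore_signal(residual, mean, std, threshold=1.0):
    """Return (z, signal): 1 if z > threshold, -1 if z < -threshold, else 0."""
    z = (residual - mean) / std
    if z > threshold:
        return z, 1    # Y is rich relative to X: long X, short Y
    if z < -threshold:
        return z, -1   # Y is cheap relative to X: short X, long Y
    return z, 0

# Hypothetical residuals from the estimation window
window_resid = [-1.0, 0.5, 0.0, 1.0, -0.5]
mu = statistics.fmean(window_resid)
sigma = statistics.pstdev(window_resid)  # population std, like np.std's default

z_today, signal_at_1 = zscore_signal(1.0, mu, sigma, threshold=1.0)
_, signal_at_2 = zscore_signal(1.0, mu, sigma, threshold=2.0)
```

Here today's residual of 1.0 is about 1.41 standard deviations above the mean, so it triggers a trade at the ±1 threshold but not at ±2. Wider thresholds trade less often but wait for larger dislocations.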

Core Algorithm Implementation

The pair trading algorithm consists of three main components: cointegration testing, signal generation, and position management.

Engle-Granger Two-Step Method

python

import pandas as pd
import statsmodels.api as sm

def EG_method(X, Y, show_summary=False):
    # Step 1: Estimate long-run equilibrium Y = alpha + beta * X
    model1 = sm.OLS(Y, sm.add_constant(X)).fit()
    epsilon = model1.resid

    # Residuals must be stationary: fail if the ADF test cannot reject a unit root
    if sm.tsa.stattools.adfuller(epsilon)[1] > 0.05:
        return False, model1

    # Step 2: Error correction model, regressing dY on dX and the lagged residual
    X_dif = sm.add_constant(pd.concat([X.diff(), epsilon.shift(1)], axis=1).dropna())
    Y_dif = Y.diff().dropna()
    model2 = sm.OLS(Y_dif, X_dif).fit()

    if show_summary:
        print(model1.summary())
        print(model2.summary())

    # Adjustment coefficient (the last parameter) must be negative
    if model2.params.iloc[-1] > 0:
        return False, model1
    return True, model1

Tests for cointegration using the Engle-Granger methodology. Returns a (status, model) pair: status is True only when the residuals are stationary and the adjustment coefficient is negative, and model is the step 1 OLS fit.

Signal Generation Logic

python

import numpy as np
import pandas as pd

def signal_generation(asset1, asset2, method, bandwidth=250):
    signals = pd.DataFrame()
    signals['asset1'] = asset1['Close']
    signals['asset2'] = asset2['Close']
    signals['signals1'] = 0
    signals['signals2'] = 0

    for i in range(bandwidth, len(signals)):
        # Test cointegration on the trailing window (day i itself is excluded)
        coint_status, model = method(signals['asset1'].iloc[i-bandwidth:i],
                                     signals['asset2'].iloc[i-bandwidth:i])

        if coint_status:
            # Z-score of day i's residual against the window's residuals.
            # Computed as a scalar: comparing a whole Series in `if` raises ValueError.
            alpha, beta = model.params.iloc[0], model.params.iloc[1]
            fitted = alpha + beta * signals['asset1'].iloc[i]
            residual = signals['asset2'].iloc[i] - fitted
            z_score = (residual - np.mean(model.resid)) / np.std(model.resid)

            # Generate signals based on Z-score thresholds
            if z_score > 1:  # asset2 rich relative to asset1: long asset1
                signals.at[signals.index[i], 'signals1'] = 1
            elif z_score < -1:  # asset2 cheap relative to asset1: short asset1
                signals.at[signals.index[i], 'signals1'] = -1

    # asset2 always takes the opposite side of asset1
    signals['signals2'] = -signals['signals1']
    return signals

Generates trading signals based on Z-score deviations from the mean. signals1 = 1 means long asset1 and short asset2; signals1 = -1 means the reverse.
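
The portfolio function in the next section reads positions1, positions2, cumsum1 and cumsum2 columns, which the snippet above does not create. One plausible way to derive them, assuming a position is the day-over-day change in the signal and the cumulative sum is the net exposure, is sketched below. The helper name add_position_columns and the sample signals are hypothetical.

```python
import pandas as pd

def add_position_columns(signals):
    """Derive trade and net-exposure columns from the signals1 column."""
    out = signals.copy()
    out['signals2'] = -out['signals1']  # second leg mirrors the first
    for leg in ('1', '2'):
        # Trade on each day = change in signal; assumes the first signal is 0
        out['positions' + leg] = out['signals' + leg].diff().fillna(0)
        # Net exposure = running total of trades
        out['cumsum' + leg] = out['positions' + leg].cumsum()
    return out

# Hypothetical signal path: flat, long asset1 for two days, flat, then short
demo = pd.DataFrame({'signals1': [0, 1, 1, 0, -1]})
out = add_position_columns(demo)
```

With this signal path the asset1 trades come out as 0, 1, 0, -1, -1: buy on day two, hold, close on day four, and open a short on day five.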

Portfolio Management

python

import pandas as pd

def portfolio(data):
    capital0 = 20000  # starting capital allocated to each leg
    # Shares per unit position, sized so the highest price is still affordable
    shares1 = capital0 // max(data['asset1'])
    shares2 = capital0 // max(data['asset2'])

    result = pd.DataFrame()
    # cumsum1 is the net position; positions1 is the daily change in position
    result['holdings1'] = data['cumsum1'] * data['asset1'] * shares1
    result['cash1'] = capital0 - (data['positions1'] * data['asset1'] * shares1).cumsum()
    result['total_asset1'] = result['holdings1'] + result['cash1']

    # Repeat for asset2 with opposite positions
    result['holdings2'] = data['cumsum2'] * data['asset2'] * shares2
    result['cash2'] = capital0 - (data['positions2'] * data['asset2'] * shares2).cumsum()
    result['total_asset2'] = result['holdings2'] + result['cash2']

    result['total_asset'] = result['total_asset1'] + result['total_asset2']
    return result

Manages portfolio allocation and tracks performance of both assets separately before combining into total portfolio value.
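
The holdings-plus-cash accounting is easiest to follow on a three-day example for a single leg. The sketch below reproduces the same arithmetic in plain Python; the prices, share count and function name leg_value are made up for illustration.

```python
from itertools import accumulate

def leg_value(prices, positions, shares, capital0):
    """Daily total value (holdings + cash) of one leg of the pair."""
    net = list(accumulate(positions))  # net exposure: +1 long, 0 flat, -1 short
    # Cumulative cash spent on trades (negative when a sale brings cash in)
    spent = accumulate(p * px * shares for p, px in zip(positions, prices))
    cash = [capital0 - s for s in spent]
    holdings = [n * px * shares for n, px in zip(net, prices)]
    return [h + c for h, c in zip(holdings, cash)]

# Buy 100 shares at 10, hold while the price is 12, sell at 11
total = leg_value([10.0, 12.0, 11.0], [1, 0, -1], shares=100, capital0=20000)
```

The leg starts at 20000, is marked up to 20200 while the position is open at a price of 12, and ends at 20100 once the 100 shares are sold at 11, a realized gain of 100.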

Implementation Steps

1. Identify two potentially cointegrated assets (same industry, stock vs ETF)
2. Apply Engle-Granger two-step method to test for cointegration
3. Calculate the spread between the two assets using OLS regression
4. Normalize residuals to create Z-score for signal generation
5. Set trigger conditions based on ±1σ or ±2σ from mean spread
6. Execute trades when Z-score exceeds threshold levels
7. Monitor cointegration status on rolling windows (typically 250 days)
8. Clear positions immediately when cointegration relationship breaks

Key Metrics

Cointegration test p-value < 0.05 (reject null hypothesis of no cointegration)

Adjustment coefficient δ < 0 (negative for mean reversion)

Half-life of mean reversion (how quickly spreads converge)

Maximum drawdown during divergence periods

Sharpe ratio of market-neutral returns

Hit rate (percentage of profitable trades)

Average holding period per position
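
Several of these metrics reduce to a few lines of NumPy. The sketch below makes some simplifying assumptions: the half-life comes from an AR(1) fit of the residuals, the Sharpe ratio uses a zero risk-free rate and 252 trading days per year, and all function names and sample numbers are hypothetical.

```python
import numpy as np

def half_life(resid):
    """Half-life in periods from an AR(1) fit resid_t = c + phi * resid_{t-1}."""
    resid = np.asarray(resid, dtype=float)
    phi, _ = np.polyfit(resid[:-1], resid[1:], 1)
    return -np.log(2) / np.log(phi)  # only meaningful for 0 < phi < 1

def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    equity = np.asarray(equity, dtype=float)
    running_peak = np.maximum.accumulate(equity)
    return np.max((running_peak - equity) / running_peak)

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    returns = np.asarray(returns, dtype=float)
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)

# A residual that halves every period has a half-life of exactly one period
hl = half_life([8.0, 4.0, 2.0, 1.0, 0.5])

# Equity peaks at 120 and falls to 90: drawdown of 30 / 120 = 25%
mdd = max_drawdown([100.0, 120.0, 90.0, 110.0])

sr = sharpe_ratio([0.01, -0.005, 0.02])
```

A half-life much longer than the intended holding period suggests the spread converges too slowly for the chosen thresholds to pay off.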

Risk Considerations

Cointegration relationships can break due to fundamental changes (regime shifts)

Market conditions are dynamic - historical relationships may not persist

Company-specific events (new products, mergers) can permanently alter relationships

Example: NVIDIA vs AMD diverged after AI/crypto boom despite historical correlation

Model assumes Gaussian distribution of residuals, which may not hold during market stress

Transaction costs and slippage can erode profits from frequent trading

Leverage amplifies both profits and losses in divergent markets
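
The cost point is worth quantifying, because every round trip in a pair involves four trades: opening and closing each of the two legs. The following back-of-the-envelope sketch uses a hypothetical per-trade cost expressed as a fraction of notional; the numbers are illustrative, not measured.

```python
def net_round_trip_return(gross_return, cost_per_trade, legs=2):
    """Gross return on a round trip minus costs for opening and closing each leg."""
    trades = 2 * legs  # one entry and one exit per leg
    return gross_return - trades * cost_per_trade

# Hypothetical: the spread converges for a 1.0% gross gain, costs are 0.1% per trade
net = net_round_trip_return(gross_return=0.01, cost_per_trade=0.001)
```

Under these assumed numbers, 40% of the gross gain goes to costs, leaving 0.6%. Tighter thresholds such as ±1σ trade more often and are hit harder by this drag.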

Practice Implementation

Prerequisites

Mathematical Background

• Linear regression and OLS estimation
• Time series analysis (stationarity, unit roots)
• Hypothesis testing and p-values
• Basic econometrics (error correction models)

Technical Skills

• Python programming (pandas, numpy)
• Statistical libraries (statsmodels)
• Data visualization (matplotlib)
• Financial data handling (yfinance)

Complete Implementation

Access the full Python implementation from the original quantitative trading repository:

bash

# Complete pair trading implementation
git clone https://github.com/je-suis-tm/quant-trading.git
cd quant-trading
python "Pair trading backtest.py"
# Modify tickers and parameters for your own analysis

Learning Checkpoints

Understand Cointegration

Can you explain why two assets might be cointegrated and what breaks this relationship?

Interpret Statistical Tests

Practice reading ADF test results and understanding when to accept/reject cointegration.

Signal Generation

Implement Z-score calculations and understand threshold selection (±1σ vs ±2σ).

Risk Management

Understand position sizing, monitoring regime changes, and exit strategies.

Recommended Learning Path

Immediate Actions

• Download and run the Python script
• Test with different asset pairs
• Experiment with threshold parameters

Advanced Studies

• Learn Johansen cointegration test
• Study Vector Error Correction Models
• Explore multiple asset pair trading

Important Disclaimer

This strategy involves significant risk. Historical cointegration relationships can break permanently. Always use proper risk management, position sizing, and never risk more than you can afford to lose. Paper trade extensively before using real capital.
