Problem Description
Solution
Here is a Pairs Trading strategy implemented in Python for the QuantConnect Lean Engine.
Strategy Overview
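This strategy uses a mean-reversion approach that assumes two related assets (PepsiCo and Coca-Cola) are cointegrated, so that the spread between their prices tends to return to its mean. The code does not test for cointegration; it takes the relationship as given.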
- Data Ingestion: We subscribe to PEP and KO at Daily resolution.
- Statistical Calculation:
- We use a rolling window (e.g., 60 days) of historical closing prices.
- We calculate the Hedge Ratio using Ordinary Least Squares (OLS) regression ($Price_Y = \beta \times Price_X + \alpha$).
- We derive the Spread as the regression residual: $Spread = Price_Y - (\beta \times Price_X + \alpha)$. Subtracting the constant $\alpha$ does not change the Z-Score.
- We calculate the Z-Score of the current spread relative to its historical mean and standard deviation.
- Entry Logic:
- Short the Spread (Sell Y, Buy X) if Z-Score > 2.0 (Spread is statistically too high).
- Long the Spread (Buy Y, Sell X) if Z-Score < -2.0 (Spread is statistically too low).
- Exit Logic:
- Close all positions once the Z-Score crosses back through 0, i.e. the spread has reverted to its mean.
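The statistical steps above can be sketched without any dependencies. This is a minimal illustration, not the strategy code itself: the function names `ols_hedge_ratio` and `latest_zscore` are hypothetical, and the closed-form slope and intercept are used in place of `statsmodels`, which gives the same fit for a single regressor with an intercept.

```python
from math import sqrt


def ols_hedge_ratio(y, x):
    """Fit y = beta * x + alpha by ordinary least squares (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = cov_xy / var_x
    alpha = mean_y - beta * mean_x
    return beta, alpha


def latest_zscore(y, x):
    """Z-Score of the most recent spread value within the window."""
    beta, alpha = ols_hedge_ratio(y, x)
    # Spread is the regression residual for every bar in the window.
    spread = [yi - (beta * xi + alpha) for xi, yi in zip(x, y)]
    mean = sum(spread) / len(spread)
    # Population standard deviation, matching np.std's default.
    std = sqrt(sum((s - mean) ** 2 for s in spread) / len(spread))
    if std == 0:
        return None  # perfect fit: no spread to trade
    return (spread[-1] - mean) / std


# Toy window: y tracks 2x + 1 except for a jump on the last bar.
x_prices = [1.0, 2.0, 3.0, 4.0, 5.0]
y_prices = [3.0, 5.0, 7.0, 9.0, 12.0]
print(ols_hedge_ratio(y_prices, x_prices))
print(latest_zscore(y_prices, x_prices))
```

Because the residuals of a regression with an intercept average to zero, the mean term mostly guards against rounding; the Z-Score is effectively the last residual divided by the residual standard deviation.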
Python Implementation
# region imports
from AlgorithmImports import *
import numpy as np
import statsmodels.api as sm
# endregion


class PairsTradingAlgorithm(QCAlgorithm):

    def initialize(self):
        # 1. Set Setup Parameters
        self.set_start_date(2018, 1, 1)
        self.set_end_date(2023, 1, 1)
        self.set_cash(100000)

        # 2. Add Assets (Pepsi and Coca-Cola)
        self.pep = self.add_equity("PEP", Resolution.DAILY).symbol
        self.ko = self.add_equity("KO", Resolution.DAILY).symbol

        # 3. Strategy Parameters
        self.lookback = 60          # Days for regression calculation
        self.entry_threshold = 2.0  # Z-Score to enter trade
        self.exit_threshold = 0.0   # Z-Score to exit trade (mean reversion)

        # Warm up period to ensure we have data immediately
        self.set_warm_up(self.lookback)

    def on_data(self, data: Slice):
        # Ensure we are not warming up and data exists for both symbols
        if self.is_warming_up:
            return
        if not (data.contains_key(self.pep) and data.contains_key(self.ko)):
            return

        # 1. Get Historical Data
        history = self.history([self.pep, self.ko], self.lookback, Resolution.DAILY)

        # Check if history is empty or incomplete
        if history.empty or 'close' not in history.columns:
            return

        # Unstack to get a DataFrame where columns are symbols and rows are time
        df = history['close'].unstack(level=0)

        # Ensure we have enough data points after unstacking
        if len(df) < self.lookback:
            return

        # 2. Perform OLS Regression to find Hedge Ratio (Beta)
        # Y = PEP, X = KO
        y_vals = df[self.pep].values
        x_vals = df[self.ko].values

        # Add constant for OLS (Intercept)
        x_with_const = sm.add_constant(x_vals)
        model = sm.OLS(y_vals, x_with_const).fit()
        beta = model.params[1]
        intercept = model.params[0]

        # 3. Calculate Spread and Z-Score
        # Spread = Y - (Beta * X + Intercept)
        # We calculate the spread series for the whole lookback window to get stats
        spread_series = y_vals - (beta * x_vals + intercept)
        mean_spread = np.mean(spread_series)
        std_spread = np.std(spread_series)
        current_spread = spread_series[-1]

        if std_spread == 0:
            return

        z_score = (current_spread - mean_spread) / std_spread

        # 4. Execution Logic
        # Check if we have open positions
        invested = self.portfolio.invested

        # --- Entry Logic ---
        if not invested:
            # Short the Spread: Spread is too high, expect it to drop.
            # Sell PEP (Y), Buy KO (X)
            if z_score > self.entry_threshold:
                self.set_holdings(self.pep, -0.5)
                self.set_holdings(self.ko, 0.5)
                self.debug(f"Entry Short Spread | Z-Score: {z_score:.2f}")

            # Long the Spread: Spread is too low, expect it to rise.
            # Buy PEP (Y), Sell KO (X)
            elif z_score < -self.entry_threshold:
                self.set_holdings(self.pep, 0.5)
                self.set_holdings(self.ko, -0.5)
                self.debug(f"Entry Long Spread | Z-Score: {z_score:.2f}")

        # --- Exit Logic ---
        else:
            # We exit when the spread reverts to the mean (crosses 0)
            # Depending on direction, we check if it crossed the exit threshold

            # If we are Short the Spread (Short PEP, Long KO)
            if self.portfolio[self.pep].is_short and z_score <= self.exit_threshold:
                self.liquidate()
                self.debug(f"Exit Short Spread (Mean Reversion) | Z-Score: {z_score:.2f}")

            # If we are Long the Spread (Long PEP, Short KO)
            elif self.portfolio[self.pep].is_long and z_score >= self.exit_threshold:
                self.liquidate()
                self.debug(f"Exit Long Spread (Mean Reversion) | Z-Score: {z_score:.2f}")
Key Components Explained
- initialize:
- We select PEP and KO as our pair. These are historically cointegrated consumer staples.
- self.lookback = 60 defines the rolling window size. A shorter window makes the strategy more reactive but noisier; a longer window makes it smoother but slower.
- on_data:
- Data Handling: We use self.history to fetch the last 60 days of closing prices. The .unstack(level=0) method is crucial to transform the multi-index dataframe into a format where columns represent symbols.
- OLS Regression: We use statsmodels (sm.OLS) to dynamically calculate the relationship between the two stocks. This provides the beta (hedge ratio).
- Z-Score: This normalizes the spread. A Z-Score of +2.0 means the spread is 2 standard deviations above its mean.
- Execution:
- set_holdings: We allocate 50% of capital to the long side and 50% to the short side. This creates a dollar-neutral portfolio (ignoring beta weighting for simplicity in this example).
- Mean Reversion: The core philosophy is that the Z-Score will eventually return to 0. We liquidate positions when this happens to capture the profit.
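The entry and exit rules described above amount to a small state machine, which can be checked in isolation from the Lean engine. The sketch below is illustrative only: `next_position`, `replay`, and the position labels are hypothetical names, and the thresholds mirror the 2.0 and 0.0 values used in the strategy.

```python
def next_position(z_score, position, entry=2.0, exit_level=0.0):
    """Return the position to hold after observing z_score.

    position is one of "flat", "short_spread", "long_spread".
    """
    if position == "flat":
        if z_score > entry:
            return "short_spread"  # sell Y, buy X
        if z_score < -entry:
            return "long_spread"   # buy Y, sell X
        return "flat"
    # Already in a trade: leave only when the Z-Score crosses the exit level.
    if position == "short_spread" and z_score <= exit_level:
        return "flat"
    if position == "long_spread" and z_score >= exit_level:
        return "flat"
    return position


def replay(z_scores):
    """Run the rules over a sequence of Z-Scores, one per bar."""
    position = "flat"
    path = []
    for z in z_scores:
        position = next_position(z, position)
        path.append(position)
    return path


print(replay([0.5, 2.3, 1.1, -0.1, -2.4, -1.0, 0.2]))
# ['flat', 'short_spread', 'short_spread', 'flat', 'long_spread', 'long_spread', 'flat']
```

Keeping the decision rule as a pure function like this makes it easy to unit test threshold changes before running a full backtest.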
Q&A: Pairs Trading on QuantConnect
Q: Why use Resolution.DAILY instead of MINUTE?
A: Pairs trading relies on statistical relationships that are often more stable on daily timeframes. Using minute data introduces microstructure noise and bid-ask bounce, which can generate false signals unless the strategy is highly optimized for high-frequency execution.
Q: How can I make the portfolio "Beta Neutral" instead of "Dollar Neutral"?
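A: In the code above, we use 0.5 and -0.5 (Dollar Neutral). To be Beta Neutral, you would size the two legs using the calculated beta. Because the regression is run on prices ($Price_Y = \beta \times Price_X + \alpha$), beta is a share ratio: if Beta is 1.2, for every share of Y you sell, you should buy 1.2 shares of X (and the reverse when going long the spread).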
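One way to turn that share ratio into portfolio weights is sketched below. It is an assumption-laden illustration: `hedge_ratio_weights` is a hypothetical helper, it assumes a positive beta, and it scales the two legs so that gross exposure stays at 100% like the 0.5 / -0.5 allocation in the strategy.

```python
def hedge_ratio_weights(beta, price_y, price_x, long_spread, gross=1.0):
    """Portfolio weights for 1 share of Y against beta shares of X.

    Returns (weight_y, weight_x); the absolute weights sum to `gross`.
    """
    if beta <= 0:
        raise ValueError("this sketch assumes a positive hedge ratio")
    notional_y = price_y         # dollar value of one share of Y
    notional_x = beta * price_x  # dollar value of beta shares of X
    total = notional_y + notional_x
    sign = 1.0 if long_spread else -1.0
    weight_y = sign * gross * notional_y / total
    weight_x = -sign * gross * notional_x / total
    return weight_y, weight_x


# Beta 1.2 with Y at $100 and X at $50: long Y, short X.
print(hedge_ratio_weights(1.2, 100.0, 50.0, long_spread=True))
```

The returned values could be passed to set_holdings in place of the fixed 0.5 and -0.5. Note that the result is generally no longer dollar neutral, since the two legs carry different notional amounts.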
Q: What happens if the correlation breaks?
A: This is the main risk of pairs trading. If the fundamental relationship between PEP and KO changes (e.g., one company is acquired), the spread may keep diverging instead of reverting. To mitigate this, add stop-loss logic, for example liquidating if the Z-Score moves further against the position and its magnitude exceeds 4.0.
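A stop-loss check along those lines might look like the following sketch. The function name `stop_loss_hit` and the position labels are hypothetical, and the 4.0 level is just the example value from the answer above.

```python
def stop_loss_hit(z_score, position, stop=4.0):
    """True when the spread has moved further against an open position.

    position is one of "flat", "short_spread", "long_spread".
    """
    if position == "short_spread":
        # Short the spread loses as the Z-Score keeps rising.
        return z_score > stop
    if position == "long_spread":
        # Long the spread loses as the Z-Score keeps falling.
        return z_score < -stop
    return False


print(stop_loss_hit(4.3, "short_spread"))  # True
print(stop_loss_hit(4.3, "long_spread"))   # False
```

With the entry rule used in the strategy, a stopped-out position would be re-opened on the next bar while the Z-Score is still beyond 2.0, so a stop like this needs to be paired with some cool-down or re-entry condition; which one to use is a design choice left open here.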