Problem Description
Solution
Yes, you can use NumPy to calculate the standard deviation of historical closing prices in your QuantConnect algorithm. The Lean engine fully supports standard Python data science libraries, including NumPy and Pandas.
To do this, you typically request historical data using the self.history method, which returns a Pandas DataFrame. You can then pass the specific column (e.g., 'close') directly to np.std.
Implementation Example
Below is a complete algorithm demonstrating how to import NumPy, request historical data, and calculate the standard deviation of the closing prices.
# region imports
from AlgorithmImports import *
import numpy as np
# endregion

class NumpyStdDevAlgorithm(QCAlgorithm):

    def initialize(self):
        # Set start date and cash
        self.set_start_date(2023, 1, 1)
        self.set_end_date(2023, 6, 1)
        self.set_cash(100000)

        # Add an equity
        self.symbol = self.add_equity("SPY", Resolution.DAILY).symbol

        # Define lookback period for standard deviation
        self.lookback = 30

    def on_data(self, data: Slice):
        # Ensure we have data for the symbol
        if not data.contains_key(self.symbol):
            return

        # Request history for the symbol
        # This returns a Pandas DataFrame indexed by (Symbol, Time)
        history = self.history(self.symbol, self.lookback, Resolution.DAILY)

        # Check if history is empty or missing the 'close' column
        if history.empty or 'close' not in history.columns:
            return

        # Extract the closing prices
        # We use .values to get the underlying numpy array, though pandas series works too
        closing_prices = history['close'].values

        # Calculate Standard Deviation using NumPy
        # ddof=1 provides the sample standard deviation (unbiased estimator)
        std_dev = np.std(closing_prices, ddof=1)

        self.log(f"Date: {self.time} | {self.lookback}-Day Std Dev for {self.symbol}: {std_dev:.4f}")

        # Example logic: Buy if price is below mean - 1 std dev (Mean Reversion)
        mean_price = np.mean(closing_prices)
        current_price = self.securities[self.symbol].price

        if current_price < (mean_price - std_dev) and not self.portfolio.invested:
            self.set_holdings(self.symbol, 1.0)
        elif current_price > mean_price and self.portfolio.invested:
            self.liquidate(self.symbol)
Key Concepts
- Importing NumPy: Ensure you include import numpy as np at the top of your file.
- Fetching History: self.history(symbol, period, resolution) is the most efficient way to get past data. It returns a multi-index Pandas DataFrame.
- Data Access: Access the closing prices column using history['close'].
- Calculation: np.std(array) calculates the standard deviation.
  - By default, np.std calculates the population standard deviation.
  - If you require the sample standard deviation (common in finance), use the argument ddof=1 (Delta Degrees of Freedom).
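The difference between the default and ddof=1 is easiest to see on a handful of numbers. The sketch below uses five made-up closing prices (not real market data) whose mean is exactly 103, so the arithmetic can be checked by hand.

```python
import numpy as np

# Hypothetical closing prices, chosen so the arithmetic is easy to follow.
closes = np.array([100.0, 102.0, 101.0, 105.0, 107.0])

# The mean is 103, so the squared deviations are 9, 1, 4, 4, 16 (sum 34).
population_std = np.std(closes)       # ddof=0 (default): sqrt(34 / 5), about 2.6077
sample_std = np.std(closes, ddof=1)   # ddof=1:           sqrt(34 / 4), about 2.9155

print(f"population: {population_std:.4f}")
print(f"sample:     {sample_std:.4f}")
```

The gap between the two shrinks as the window grows, so with a 30-bar lookback the choice matters less than it does here, but it is still worth being explicit about which one you intend.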
Q&A
Q: Does self.history return a NumPy array directly?
A: No, self.history returns a Pandas DataFrame by default. However, you can easily convert a DataFrame column to a NumPy array using .values or .to_numpy().
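As a rough illustration that runs outside of Lean, the snippet below builds a small stand-in for the DataFrame that self.history returns. The (symbol, time) index layout follows the description above; the index names and price values are assumptions made up for this sketch.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame self.history returns: a (symbol, time) MultiIndex
# with a 'close' column. Index names and values are made up for illustration.
times = pd.date_range("2023-01-02", periods=4, freq="D")
index = pd.MultiIndex.from_product([["SPY"], times], names=["symbol", "time"])
history = pd.DataFrame({"close": [380.0, 382.5, 381.0, 385.5]}, index=index)

closes_series = history["close"]             # pandas Series, index preserved
closes_array = history["close"].to_numpy()   # plain 1-D NumPy array

# np.std accepts either form and returns the same number.
std_from_series = np.std(closes_series, ddof=1)
std_from_array = np.std(closes_array, ddof=1)

print(type(closes_array).__name__, round(float(std_from_array), 4))
```

One thing to watch: closes_series.std() called as a pandas method defaults to ddof=1, while np.std defaults to ddof=0, so passing ddof explicitly avoids surprises when switching between the two.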
Q: Can I use self.std instead of NumPy?
A: Yes, QuantConnect provides built-in indicators like self.std(symbol, period). This creates a rolling window indicator that updates automatically with every new data point. Using NumPy is better for ad-hoc calculations on historical batches, while self.std is better for continuous monitoring in on_data.
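To make the distinction concrete, here is a plain-Python sketch of the two styles. RollingStd is a hypothetical helper written for this illustration; it is not Lean's indicator class and makes no claim about how self.std is implemented internally (including which ddof it uses).

```python
from collections import deque

import numpy as np


class RollingStd:
    """Hypothetical helper: sample std over the last `period` values."""

    def __init__(self, period: int):
        # A bounded deque drops the oldest value automatically.
        self.window = deque(maxlen=period)

    def update(self, value: float) -> None:
        self.window.append(value)

    @property
    def is_ready(self) -> bool:
        return len(self.window) == self.window.maxlen

    @property
    def value(self) -> float:
        return float(np.std(np.array(list(self.window)), ddof=1))


prices = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]  # made-up closes

# Continuous style: feed one value per new bar, read the result any time.
rolling = RollingStd(period=3)
for price in prices:
    rolling.update(price)

# Ad-hoc style: take a one-off slice of past data and compute once.
batch = float(np.std(np.array(prices[-3:]), ddof=1))

print(rolling.is_ready, rolling.value, batch)
```

Both approaches agree on the last three prices; the difference is where the state lives. The rolling object carries its window between bars, while the batch call needs the history handed to it each time.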
Q: Why use ddof=1 in np.std?
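A: In financial statistics, we usually work with a sample of data rather than the entire population. Setting ddof=1 divides the sum of squared deviations by N - 1 instead of N (Bessel's correction), which makes the resulting variance an unbiased estimator of the population variance. Its square root, the sample standard deviation, is still slightly biased, but it is the conventional estimate.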
Q: Is NumPy faster than iterating through a Python list?
A: Yes, NumPy is significantly faster for numerical calculations because it uses vectorized operations implemented in C, avoiding the overhead of Python loops.
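The sketch below shows what the vectorized call replaces: an explicit loop that computes the same quantity. The function std_with_loop and the synthetic price series are assumptions for illustration only. The block checks that the two agree and leaves timing to you, since the speed-up depends on array size and hardware.

```python
import math

import numpy as np


def std_with_loop(values, ddof=0):
    """Standard deviation computed with an explicit Python loop."""
    n = len(values)
    mean = sum(values) / n
    squared = 0.0
    for v in values:
        squared += (v - mean) ** 2
    return math.sqrt(squared / (n - ddof))


# Synthetic random-walk "prices"; seeded so the run is repeatable.
rng = np.random.default_rng(seed=0)
prices = 100.0 + rng.standard_normal(10_000).cumsum()

loop_result = std_with_loop(prices.tolist(), ddof=1)
numpy_result = float(np.std(prices, ddof=1))

print(math.isclose(loop_result, numpy_result, rel_tol=1e-9))
```

To compare speed on your own machine, wrap each call in timeit.timeit. For a short window such as 30 daily bars, the difference is usually too small to matter; it becomes noticeable on large arrays.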