Using NumPy for Statistical Analysis in Your QuantConnect Python Algorithm

quantconnect | Published on: 2025/12/2
The following content is generated by EasyQuant.

Problem Description

Can I use NumPy to calculate the standard deviation of historical closing prices in my algorithm?

Solution

Yes. QuantConnect's Lean engine supports the standard Python data science libraries, including NumPy and pandas, so you can calculate the standard deviation of historical closing prices with NumPy inside your algorithm.

To do this, you typically request historical data using the self.history method, which returns a Pandas DataFrame. You can then pass the specific column (e.g., 'close') directly to np.std.

Implementation Example

Below is a complete algorithm demonstrating how to import NumPy, request historical data, and calculate the standard deviation of the closing prices.

# region imports
from AlgorithmImports import *
import numpy as np
# endregion

class NumpyStdDevAlgorithm(QCAlgorithm):

    def initialize(self):
        # Set start date and cash
        self.set_start_date(2023, 1, 1)
        self.set_end_date(2023, 6, 1)
        self.set_cash(100000)
        
        # Add an equity
        self.symbol = self.add_equity("SPY", Resolution.DAILY).symbol
        
        # Define lookback period for standard deviation
        self.lookback = 30

    def on_data(self, data: Slice):
        # Ensure we have data for the symbol
        if not data.contains_key(self.symbol):
            return

        # Request history for the symbol
        # This returns a Pandas DataFrame indexed by (Symbol, Time)
        history = self.history(self.symbol, self.lookback, Resolution.DAILY)

        # Skip if history is empty, lacks a 'close' column, or has too few rows for ddof=1
        if history.empty or 'close' not in history.columns or len(history) < 2:
            return

        # Extract the closing prices
        # We use .values to get the underlying numpy array, though pandas series works too
        closing_prices = history['close'].values

        # Calculate Standard Deviation using NumPy
        # ddof=1 gives the sample standard deviation (divides by N - 1 instead of N)
        std_dev = np.std(closing_prices, ddof=1)

        self.log(f"Date: {self.time} | {self.lookback}-Day Std Dev for {self.symbol}: {std_dev:.4f}")
        
        # Example logic: Buy if price is below mean - 1 std dev (Mean Reversion)
        mean_price = np.mean(closing_prices)
        current_price = self.securities[self.symbol].price
        
        if current_price < (mean_price - std_dev) and not self.portfolio.invested:
            self.set_holdings(self.symbol, 1.0)
        elif current_price > mean_price and self.portfolio.invested:
            self.liquidate(self.symbol)

Key Concepts

  1. Importing NumPy: Ensure you include import numpy as np at the top of your file.
  2. Fetching History: self.history(symbol, period, resolution) requests past data on demand. Called with a single symbol, it returns a multi-index pandas DataFrame indexed by (symbol, time).
  3. Data Access: Access the closing prices column using history['close'].
  4. Calculation: np.std(array) calculates the standard deviation.
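    • By default (ddof=0), np.std calculates the population standard deviation. Note that pandas Series.std defaults to ddof=1, so the two can give different results on the same data.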
    • If you require the sample standard deviation (common in finance), use the argument ddof=1 (Delta Degrees of Freedom).
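
The effect of ddof described in item 4 can be checked on a small series. This is a minimal sketch using made-up prices rather than QuantConnect data:

```python
import numpy as np

# Hypothetical closing prices, chosen only for illustration
closing_prices = np.array([100.0, 102.0, 101.0, 105.0, 107.0])

# Population standard deviation: divides the squared deviations by N
population_std = np.std(closing_prices)

# Sample standard deviation: divides the squared deviations by N - 1
sample_std = np.std(closing_prices, ddof=1)

print(f"population (ddof=0): {population_std:.4f}")
print(f"sample     (ddof=1): {sample_std:.4f}")
```

The sample value is always the larger of the two, and the gap shrinks as the number of observations grows.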

Q&A

Q: Does self.history return a NumPy array directly?
A: No, self.history returns a Pandas DataFrame by default. However, you can easily convert a DataFrame column to a NumPy array using .values or .to_numpy().
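
As a rough illustration of that conversion, the sketch below builds a small DataFrame by hand to stand in for a history result. The index level names, the "SPY" label, and the prices are assumptions for illustration, not the exact output of self.history:

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for a history result: a (symbol, time) MultiIndex
# with a 'close' column. Level names and values are illustrative assumptions.
times = pd.date_range("2023-01-03", periods=4, freq="D")
index = pd.MultiIndex.from_product([["SPY"], times], names=["symbol", "time"])
history = pd.DataFrame({"close": [380.0, 383.5, 379.0, 388.0]}, index=index)

# Both calls return a plain one-dimensional NumPy array of closing prices
closes_values = history["close"].values
closes_numpy = history["close"].to_numpy()

std_dev = np.std(closes_numpy, ddof=1)
print(type(closes_numpy).__name__, closes_numpy.shape, round(std_dev, 4))
```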

Q: Can I use self.std instead of NumPy?
A: Yes, QuantConnect provides built-in indicators like self.std(symbol, period). This creates a rolling window indicator that updates automatically with every new data point. Using NumPy is better for ad-hoc calculations on historical batches, while self.std is better for continuous monitoring in on_data.
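
To make the rolling-versus-batch distinction concrete, here is a plain NumPy sketch of a rolling window. RollingStd is a hypothetical helper written for this example; it is not how Lean implements its indicator, and the prices are made up:

```python
from collections import deque
from typing import Optional

import numpy as np


class RollingStd:
    """Hypothetical helper: standard deviation over the last `period` values."""

    def __init__(self, period: int, ddof: int = 1):
        self.window = deque(maxlen=period)  # oldest value drops out automatically
        self.ddof = ddof

    @property
    def is_ready(self) -> bool:
        return len(self.window) == self.window.maxlen

    def update(self, value: float) -> Optional[float]:
        self.window.append(value)
        if not self.is_ready:
            return None  # not enough data yet
        return float(np.std(list(self.window), ddof=self.ddof))


prices = [100.0, 102.0, 101.0, 105.0, 107.0, 104.0]
rolling = RollingStd(period=3)
rolling_values = [rolling.update(p) for p in prices]

# Batch equivalent: a single np.std call on the last three prices
batch_value = float(np.std(prices[-3:], ddof=1))
print(rolling_values[-1], batch_value)
```

The rolling object is fed one value at a time and keeps its own state, whereas the batch call needs the whole window handed to it each time.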

Q: Why use ddof=1 in np.std?
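A: In financial statistics, we usually work with a sample of data rather than the entire population. Setting ddof=1 divides by N - 1 instead of N, which gives the sample standard deviation. Its square, the sample variance, is an unbiased estimator of the population variance; the standard deviation itself is still slightly biased, but less so than with ddof=0.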
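
One way to see the effect of ddof on bias is to average each estimator over every possible sample from a tiny population. The sketch below uses a made-up four-value population and samples of size two, chosen only so that the enumeration stays small:

```python
from itertools import product

import numpy as np

# Tiny hypothetical "population" whose true variance is known exactly
population = np.array([1.0, 2.0, 3.0, 4.0])
true_variance = np.var(population)  # ddof=0 is correct for a full population

# Every equally likely sample of size 2, drawn with replacement
samples = np.array(list(product(population, repeat=2)))

# Average each estimator over all possible samples
mean_var_ddof1 = np.var(samples, axis=1, ddof=1).mean()
mean_var_ddof0 = np.var(samples, axis=1, ddof=0).mean()
mean_std_ddof1 = np.std(samples, axis=1, ddof=1).mean()

print(true_variance, mean_var_ddof1, mean_var_ddof0)
print(np.sqrt(true_variance), mean_std_ddof1)
```

Averaged over all samples, the ddof=1 variance matches the population variance, the ddof=0 variance comes out too low, and the ddof=1 standard deviation still falls below the true standard deviation.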

Q: Is NumPy faster than iterating through a Python list?
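A: Generally yes. For numerical calculations on arrays of meaningful size, NumPy is significantly faster because it uses vectorized operations implemented in C, avoiding the per-element overhead of Python loops. On very small inputs the difference is negligible.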
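
A quick way to compare the two approaches on your own machine is sketched below. std_with_loop is a hypothetical helper and the price series is synthetic. Timings depend on hardware, so the script prints them rather than assuming a particular speedup:

```python
import math
import timeit

import numpy as np


def std_with_loop(values, ddof=1):
    """Sample standard deviation using an explicit Python loop."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = 0.0
    for v in values:
        squared_deviations += (v - mean) ** 2
    return math.sqrt(squared_deviations / (n - ddof))


# Synthetic price-like series with a fixed seed so runs are repeatable
rng = np.random.default_rng(seed=42)
prices_array = 100.0 + rng.standard_normal(100_000).cumsum()
prices_list = prices_array.tolist()

loop_result = std_with_loop(prices_list)
numpy_result = float(np.std(prices_array, ddof=1))

# Time five runs of each approach
loop_seconds = timeit.timeit(lambda: std_with_loop(prices_list), number=5)
numpy_seconds = timeit.timeit(lambda: np.std(prices_array, ddof=1), number=5)
print(f"loop: {loop_seconds:.4f}s  numpy: {numpy_seconds:.4f}s")
```

Both functions compute the same quantity, so the results should agree to within floating-point rounding.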