🚀 AI One-Click Generation of quantconnect Strategy Code

Using Pandas `unstack` to Organize Multi-Symbol History Data in QuantConnect

quantconnect | Published on: 2025/12/2
The following content is generated by EasyQuant .

Problem Description

When I get historical data for multiple symbols, how can I organize it into a DataFrame with symbols as columns?

Solution

To organize historical data for multiple symbols into a DataFrame where symbols are columns, you need to manipulate the MultiIndex DataFrame returned by the self.history() method.

By default, when you request history for a list of symbols, QuantConnect returns a DataFrame indexed by [symbol, time]. To rearrange this so that Time is the index and Symbols are the columns, you use the pandas .unstack(level=0) method.

Core Solution

Here is the specific pattern to fetch data and pivot it:

# 1. Get the MultiIndex DataFrame (Index: Symbol, Time)
history_df = self.history(self.my_symbols, 30, Resolution.DAILY)

# 2. Select the column you want (e.g., 'close') and unstack the 'symbol' index
# level=0 refers to the first index, which is the Symbol
close_prices = history_df['close'].unstack(level=0)

# Result: A DataFrame where Index = Time, Columns = Symbols

Full Algorithm Example

Below is a complete, runnable algorithm demonstrating how to initialize multiple symbols, fetch their history, and transform the data into a clean DataFrame with symbols as columns.

# region imports
from AlgorithmImports import *
# endregion

class HistoryDataFrameAlgorithm(QCAlgorithm):

    def initialize(self):
        self.set_start_date(2023, 1, 1)
        self.set_end_date(2023, 6, 1)
        self.set_cash(100000)
        
        # 1. Define a list of tickers
        tickers = ["SPY", "AAPL", "MSFT", "GOOG"]
        
        # 2. Add equities and save their Symbol objects to a list
        self.symbols = [self.add_equity(ticker, Resolution.DAILY).symbol for ticker in tickers]

    def on_end_of_algorithm(self):
        # 3. Fetch history for all symbols at once
        # Returns a MultiIndex DataFrame: (symbol, time)
        history = self.history(self.symbols, 10, Resolution.DAILY)
        
        if history.empty:
            self.log("No history found.")
            return

        # 4. Transform: Select 'close' column and unstack the symbol index (level 0)
        # This pivots the table so Symbols become columns
        close_price_df = history['close'].unstack(level=0)
        
        # Logging to demonstrate the structure
        self.log(f"\nTransformed DataFrame Head:\n{close_price_df.head()}")
        
        # Example: Calculate returns using the new DataFrame structure
        daily_returns = close_price_df.pct_change().dropna()
        self.log(f"\nDaily Returns Head:\n{daily_returns.head()}")

        # Note: If you want ALL data (Open, High, Low, Close) with symbols as columns,
        # you can unstack the whole dataframe. This creates a MultiIndex Column.
        full_unstacked = history.unstack(level=0)
        # Accessing specific data: full_unstacked['open']['SPY']

Key Explanations

  1. self.history(self.symbols, ...): When passing a list of Symbol objects, QuantConnect returns a pandas DataFrame with a MultiIndex.
    • Level 0: Symbol
    • Level 1: Time
  2. ['close']: You usually want to isolate a specific data point (Close, Open, High, Low, or Volume) before pivoting.
  3. .unstack(level=0): This pandas command takes the index at level 0 (the Symbol) and moves it to the columns. The remaining index (Time) stays as the row index.

Handling Missing Data

If one symbol has less data than another (e.g., a recently IPO'd stock vs. SPY), unstack will automatically fill the missing periods with NaN. You may need to handle this using .dropna() or .fillna() depending on your strategy logic.


Q&A

Q: What happens if I don't select a specific column like 'close' before unstacking?
A: If you run history.unstack(level=0) on the entire DataFrame, you will get a DataFrame with MultiIndex Columns. The top level of columns will be the data type (open, high, low, close, volume), and the second level will be the symbols. You would access data like df['close']['AAPL'].

Q: Can I use this method for Minute resolution data?
A: Yes. The logic is identical regardless of resolution (Resolution.MINUTE, Resolution.HOUR, etc.). However, be careful with the amount of data you request; requesting years of minute data for multiple symbols can be memory-intensive.

Q: Why do I get an error saying "Index contains duplicate entries"?
A: This usually happens if you have subscribed to the same symbol multiple times or if the history request covers a period where data might overlap in a strange way (rare in standard equity data). Ensure your self.symbols list contains unique Symbol objects.

Q: How do I get the symbol string instead of the Symbol object as the column header?
A: The unstack method uses the index values. In QuantConnect, the index is the Symbol object (which contains ID, ticker, market, etc.). If you strictly want the string ticker (e.g., "AAPL") as the column name, you can map the columns after unstacking:

df = history['close'].unstack(level=0)
# Convert Symbol objects to their string Ticker representation
df.columns = [symbol.value for symbol in df.columns]