Problem Description
Solution
To retrieve the last 30 days of closing prices for all Dow Jones stocks in QuantConnect, the best approach combines two specific techniques:
- Dynamic Universe Selection: Use the DIA (SPDR Dow Jones Industrial Average ETF) to automatically define the universe. This keeps your list of stocks historically accurate, handling additions and removals (e.g., when Amazon replaced Walgreens) automatically.
- Batch History Request: Use self.history with a list of symbols and .unstack() the resulting DataFrame to create a clean table where columns are symbols and rows are dates.
Implementation Code
Here is a complete, runnable algorithm demonstrating this approach.
# region imports
from AlgorithmImports import *
# endregion

class DowJonesHistoryAlgorithm(QCAlgorithm):

    def initialize(self):
        self.set_start_date(2023, 1, 1)
        self.set_end_date(2024, 1, 1)
        self.set_cash(100000)

        # 1. Define the Universe using the DIA ETF (Dow Jones Industrial Average)
        # This automatically handles constituent changes over time.
        self.universe_settings.resolution = Resolution.DAILY
        self.add_universe(self.universe.etf("DIA"))

        # List to store current Dow Jones symbols
        self._dow_symbols = []

        # Schedule a function to run every day, 10 minutes after market open, to get the data
        self.schedule.on(
            self.date_rules.every_day("DIA"),
            self.time_rules.after_market_open("DIA", 10),
            self.get_historical_data
        )

    def on_securities_changed(self, changes: SecurityChanges):
        # 2. Maintain the list of active symbols
        for security in changes.added_securities:
            if security.symbol not in self._dow_symbols:
                self._dow_symbols.append(security.symbol)

        for security in changes.removed_securities:
            if security.symbol in self._dow_symbols:
                self._dow_symbols.remove(security.symbol)

    def get_historical_data(self):
        if not self._dow_symbols:
            return

        # 3. Request History
        # We request 30 bars (days) at Daily resolution for all current symbols
        history_df = self.history(self._dow_symbols, 30, Resolution.DAILY)
        if history_df.empty:
            return

        # 4. Process Data
        # The raw history_df has a MultiIndex (Symbol, Time).
        # We select the 'close' column and unstack level 0 (Symbol)
        # to make Symbols the columns and Time the index.
        closing_prices = history_df['close'].unstack(level=0)

        # Example: Log the most recent close of a specific stock (e.g., AAPL) if it exists
        # This demonstrates how to access the clean DataFrame
        target_symbol = "AAPL"

        # Check if AAPL is currently in the columns (it might not be in the Dow in earlier years)
        # Note: We convert Symbol objects to string for column matching if needed,
        # but unstack usually preserves Symbol objects as column headers.
        matching_col = [x for x in closing_prices.columns if x.value == target_symbol]
        if matching_col:
            recent_price = closing_prices[matching_col[0]].iloc[-1]
            self.log(f"Date: {self.time.date()} | {target_symbol} Close: {recent_price}")

        # Example: Calculate the 30-day mean for all stocks
        means = closing_prices.mean()
Key Concepts Explained
1. self.universe.etf("DIA")
Hardcoding tickers (e.g., ["AAPL", "MSFT", ...]) is bad practice for backtesting because the Dow Jones index changes over time: a fixed list built from today's members applies knowledge of future index changes to earlier dates (look-ahead bias). By targeting the ETF constituents, QuantConnect automatically updates the universe to reflect the actual members of the Dow Jones on any specific date in the past.
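To make the point-in-time idea concrete, here is a minimal pure-Python sketch that does not use QuantConnect at all. The change log, the truncated starting membership, and the helper name members_on are all hypothetical; the tickers echo the Amazon/Walgreens swap mentioned earlier, and the effective date should be treated as illustrative.

```python
from datetime import date

# Hypothetical change log: (effective_date, added, removed).
# The date is illustrative; check an official source before relying on it.
CHANGES = [
    (date(2024, 2, 26), {"AMZN"}, {"WBA"}),
]

# Hypothetical, truncated starting membership (the real index has 30 names).
INITIAL_MEMBERS = {"AAPL", "MSFT", "WBA"}


def members_on(day: date) -> set:
    """Replay every change that took effect on or before `day`."""
    members = set(INITIAL_MEMBERS)
    for effective, added, removed in sorted(CHANGES, key=lambda c: c[0]):
        if effective <= day:
            members -= removed
            members |= added
    return members


before = members_on(date(2024, 1, 2))
after = members_on(date(2024, 3, 1))
print(sorted(before))
print(sorted(after))
```

A hardcoded list corresponds to always using the latest membership, whatever the backtest date. The ETF universe plays the role of members_on here, with the constituent data supplied by the platform rather than a hand-maintained table.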
2. self.history(symbols, 30, Resolution.DAILY)
This function fetches the data.
- Input: A list of Symbol objects.
- Output: A pandas DataFrame with a MultiIndex. The index levels are [Symbol, Time].
3. .unstack(level=0)
This is the most critical step for usability.
- Before Unstack: The DataFrame is "tall." Every row is a single point in time for a single symbol.
- After Unstack: The DataFrame is "wide."
- Rows: Dates.
- Columns: Tickers (Symbols).
- Values: Closing prices.
This format allows you to easily perform vectorized operations (like .mean(), .pct_change(), or correlation matrices) across all stocks simultaneously.
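The tall-to-wide reshape can be tried outside QuantConnect with plain pandas. The sketch below uses mock data as a stand-in for the history result: the tickers AAA and BBB, the index level names, and the prices are all made up, and plain strings replace Symbol objects.

```python
import pandas as pd

# Mock stand-in for the history result: a (symbol, time) MultiIndex
# with one 'close' value per row. All values are invented.
dates = pd.to_datetime(["2023-06-01", "2023-06-02", "2023-06-05"])
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], dates], names=["symbol", "time"]
)
tall = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 20.0, 19.0, 21.0]}, index=index
)

# Move the symbol level (level 0) into the columns.
wide = tall["close"].unstack(level=0)

# Vectorized operations now run across every symbol at once.
mean_close = wide.mean()                    # one mean per symbol
daily_returns = wide.pct_change().dropna()  # first row has no prior bar
correlations = daily_returns.corr()         # symbol-by-symbol matrix

print(wide)
print(mean_close)
```

In the tall frame each of these calculations would need a groupby on the symbol level; in the wide frame they are single method calls.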
Q&A: QuantConnect Data Handling
Q: Why use Resolution.DAILY in the history request?
A: The request asked for the "last 30 days." If you use Resolution.MINUTE, requesting "30" would only give you the last 30 minutes of data. Explicitly setting Resolution.DAILY ensures you get 30 daily bars regardless of the algorithm's internal data frequency. Note that these are 30 trading days, which span roughly six calendar weeks.
Q: What happens if a stock was added to the Dow Jones 10 days ago?
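A: Index membership does not limit the history request. self.history returns whatever price data exists for each symbol, so a stock that joined the Dow Jones 10 days ago still gets its full 30 daily bars, provided it was trading over that period. NaN (Not a Number) values appear in the unstacked DataFrame only when a symbol has no bar for a date that other symbols do have (for example, a recently listed stock). You should handle these potential missing values using pandas methods like .dropna() or .fillna() depending on your strategy logic.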
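The missing-value handling mentioned above can be sketched with plain pandas. The wide table here is mock data: the tickers AAA and NEW and all prices are invented, with NEW missing its first two bars.

```python
import pandas as pd

nan = float("nan")
dates = pd.to_datetime(
    ["2023-06-01", "2023-06-02", "2023-06-05", "2023-06-06"]
)

# Mock wide table of closes; NEW has no data for the first two dates.
wide = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0, 13.0], "NEW": [nan, nan, 50.0, 51.0]},
    index=dates,
)

missing_counts = wide.isna().sum()       # gaps per symbol
complete_columns = wide.dropna(axis=1)   # keep symbols with full history
complete_rows = wide.dropna(axis=0)      # keep dates where all symbols have data

# Forward-filling only repairs gaps that follow a known value,
# so leading NaNs remain after ffill().
still_missing = wide.ffill().isna().sum()

print(missing_counts)
print(complete_columns.columns.tolist())
```

Which option fits depends on the calculation: dropping columns keeps the date range intact, while dropping rows keeps every symbol but shortens the window.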
Q: Can I get other data besides closing prices?
A: Yes. The history_df contains columns for open, high, low, close, and volume. To get the volume, for example, you would use history_df['volume'].unstack(level=0).
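As a rough illustration of combining fields, the sketch below unstacks two columns from a mock history_df and multiplies them to get dollar volume per symbol per day. The data, tickers, and index level names are invented, and the frame is built with plain pandas rather than returned by self.history.

```python
import pandas as pd

# Mock stand-in for history_df with two of the available columns.
dates = pd.to_datetime(["2023-06-01", "2023-06-02"])
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], dates], names=["symbol", "time"]
)
history_df = pd.DataFrame(
    {
        "close": [10.0, 11.0, 20.0, 19.0],
        "volume": [100.0, 200.0, 300.0, 400.0],
    },
    index=index,
)

# Unstack each field separately; both results share the same
# date index and symbol columns, so they align element-wise.
close = history_df["close"].unstack(level=0)
volume = history_df["volume"].unstack(level=0)
dollar_volume = close * volume

print(dollar_volume)
```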