The only problem is, if you run this right now for ticker TQQQ you will see that Polygon is missing three whole days' worth of market data (9/22-9/24).
Do others have this issue with Polygon? Is this because we are on some kind of “free” version of Polygon using the Alpaca keys? What’s the most reliable way to get NASDAQ market data? It seems like every provider has a story like this out there, or they don’t even support NASDAQ (e.g. Alpha Vantage).
I don’t seem to be having any problem pulling data from Polygon.
I used historic_agg to pull the hour bars for the dates you mentioned and the data is all there.
I reached out on the Alpaca Slack and here’s the answer from Dan Whitnable that helped me:
There isn’t any issue with missing Polygon data (at least from what I have ever seen). The issue is the polygon.historic_agg_v2 method limits the number of rows which are returned. If one requests data between two dates, and the number of returned rows is greater than that limit, the result will simply be truncated. It unfortunately does this ‘silently’ without any error mentioning the result isn’t complete. It also isn’t documented exactly what the cutoff is. That is why it looked like there were several days missing in the data. They had been truncated. One can verify this behavior by simply requesting incrementally longer timeframes of data. Start with requesting 1 day, then 2, and so forth. Everything will be correct up until about 10 days (depending upon how the weekends fall). After that, the data for the last dates are dropped.
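The check described above (request 1 day, then 2, and so on) can be scripted. This is only a rough sketch: `find_truncation_point` and the `fetch` callable are hypothetical names, not part of any Alpaca or Polygon API, and `fetch(ticker, start, end)` is assumed to wrap whatever call returns a dataframe indexed by bar timestamp. It uses weekdays as a stand-in for trading days, so a market holiday at the end of a window would show up as a false positive.

```python
import pandas as pd

def find_truncation_point(fetch, ticker, start, max_days=30):
    """Return the first window length (in calendar days) whose result
    stops before the last expected weekday, or None if none does.

    `fetch(ticker, start, end)` is a hypothetical wrapper that returns
    a dataframe indexed by bar timestamp.
    """
    start = pd.Timestamp(start).normalize()
    for n in range(1, max_days + 1):
        end = start + pd.Timedelta(days=n - 1)
        expected = pd.bdate_range(start, end)  # weekdays only, ignores holidays
        if len(expected) == 0:
            continue
        df = fetch(ticker, start, end)
        if df is None or df.empty:
            return n
        last_returned = df.index.max()
        # Drop the timezone (if any) so it compares with the naive dates above
        if last_returned.tz is not None:
            last_returned = last_returned.tz_localize(None)
        if last_returned.normalize() < expected[-1]:
            return n
    return None
```

With the Alpaca client, `fetch` could be something like `lambda t, s, e: api.polygon.historic_agg_v2(t, 1, 'hour', _from=s, to=e).df`, assuming `api` is an already-configured client.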
I’m sure there are more elegant solutions to get around this behavior, but I use a small function to fetch data one day at a time. Like this:
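import pandas as pd

# Assumes `api` is an already-configured Alpaca REST client
def get_hourly_history(ticker, from_date, to_date):
    """
    Get Polygon data between two dates and return in a dataframe
    """
    frames = []
    date_range = pd.date_range(start=from_date,
                               end=to_date,
                               normalize=True)
    # Request one day per call so no single response hits the row limit
    for day in date_range:
        data = api.polygon.historic_agg_v2(ticker, 1, 'hour', _from=day, to=day).df
        frames.append(data)
    # DataFrame.append was removed in pandas 2.0, so concatenate instead
    return pd.concat(frames) if frames else None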
That will fetch one day at a time and return all the data in a single dataframe. It could be sped up by fetching data in longer ‘chunks’ of days, and only fetching data for trading days, but this is simple. Hope that helps.
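For anyone who wants the faster version mentioned above, here is a minimal sketch of fetching in chunks of weekdays. `get_history_in_chunks`, `fetch` and `chunk_days` are names made up for illustration; `fetch(ticker, start, end)` is assumed to wrap the Polygon call and return a dataframe. The default of 5 is a guess meant to stay well under the undocumented row limit for hourly bars, and `pd.bdate_range` skips weekends but not market holidays.

```python
import pandas as pd

def get_history_in_chunks(fetch, ticker, from_date, to_date, chunk_days=5):
    """Fetch data in chunks of `chunk_days` weekdays and return one dataframe.

    `fetch(ticker, start, end)` is a hypothetical wrapper around the
    data call; it must return a dataframe for the inclusive date range.
    """
    # Weekdays only; market holidays are not excluded
    days = pd.bdate_range(start=from_date, end=to_date)
    frames = []
    for i in range(0, len(days), chunk_days):
        chunk = days[i:i + chunk_days]
        # One request per chunk, from its first weekday to its last
        frames.append(fetch(ticker, chunk[0], chunk[-1]))
    if not frames:
        return None
    return pd.concat(frames)

# Possible usage, assuming `api` is an already-configured client:
# fetch = lambda t, s, e: api.polygon.historic_agg_v2(t, 1, 'hour', _from=s, to=e).df
# df = get_history_in_chunks(fetch, 'TQQQ', '2020-09-01', '2020-09-25')
```

If the result still looks short, lowering `chunk_days` is the first thing to try, since smaller bar sizes hit the row limit sooner.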