Environment
Language
Python 3.7
Alpaca SDK Version
Latest
Other Environment Details
Using historic_agg_v2 polygon function to get aggregate data
Problem
Summary
There seem to be variations in the number of entries returned by the “polygon.historic_agg_v2” function depending on the timespan, ticker & date range values. I assume limits exist on the number of results per api call, but I fail to find any consistency in the results from which I can deduce a pattern. I do recall reading the limit being 3000 in the documentation somewhere, but that doesn’t reconcile with what I’m seeing either.
Case 1 (pasted below): Hourly data on ‘AAPL’ from 2012-01-01 to 2013-07-01: Returns 1253 entries (starting from 2013-03-06).
Case 2 (omitted for brevity): Hourly data on ‘SPY’ from the same date range: Returns 1041 entries.
Case 3 (pasted below): Minute data on ‘AAPL’ from the same date range: Returns exactly 50,000 entries, but the returned start date is almost the same (off by a day) as it was in Case 1.
Case 4 (omitted for brevity): Minute data on ‘SPY’ from the same date range: Returns exactly 50,000 entries, again with an almost identical start date to Case 2 (hourly data).
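A quick way to flag the truncation described in these cases is to compare the first returned timestamp against the requested start date. The helper below is only a sketch: `looks_truncated` and its `tolerance_days` parameter are hypothetical names, not part of the Alpaca SDK, and it assumes the result is a pandas DataFrame with a DatetimeIndex like the output pasted in this post.

```python
import pandas as pd

def looks_truncated(df, requested_start, tolerance_days=7, tz="America/New_York"):
    """Return True if the first bar begins well after the requested start date.

    Hypothetical helper: tolerance_days allows for weekends and holidays
    at the start of the range.
    """
    if df.empty:
        return False
    start = pd.Timestamp(requested_start, tz=tz)
    first = df.index[0]
    # Localize a naive index so the subtraction below is well defined.
    if first.tzinfo is None:
        first = first.tz_localize(tz)
    return (first - start) > pd.Timedelta(days=tolerance_days)
```

For Case 1, a request starting 2012-01-01 whose first bar is dated 2013-03-06 would be flagged, which at least tells a backtest loader that it needs to re-request the earlier part of the range.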
The goal is to retrieve intraday data on a universe of tickers in order to run a backtest. I'm trying to understand what limits exist on the Polygon API and how I can fetch historical data en masse. Surely this isn't the way the API was designed to function. What am I missing?!
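One possible workaround, assuming the cap applies per call, is to split the date range into smaller windows and concatenate the results. This is a sketch, not a documented approach: `date_windows`, `fetch_in_chunks`, and `days_per_window` are hypothetical names, and `fetch` is any callable you supply that returns a DataFrame for one window. The window size would need to be small enough that a single call stays under whatever cap applies (50,000 rows in Cases 3 and 4).

```python
import pandas as pd

def date_windows(start, end, days_per_window):
    """Yield (window_start, window_end) date strings covering [start, end]."""
    cursor = pd.Timestamp(start)
    last = pd.Timestamp(end)
    one_day = pd.Timedelta(days=1)
    while cursor <= last:
        window_end = min(cursor + pd.Timedelta(days=days_per_window) - one_day, last)
        yield cursor.strftime("%Y-%m-%d"), window_end.strftime("%Y-%m-%d")
        cursor = window_end + one_day

def fetch_in_chunks(fetch, symbol, timespan, start, end, days_per_window=30):
    """Call fetch(symbol, timespan, window_start, window_end) once per window
    and return the concatenated, de-duplicated, time-sorted result."""
    frames = []
    for w_start, w_end in date_windows(start, end, days_per_window):
        df = fetch(symbol, timespan, w_start, w_end)
        if df is not None and not df.empty:
            frames.append(df)
    if not frames:
        return pd.DataFrame()
    combined = pd.concat(frames).sort_index()
    # Adjacent windows may return the same boundary bar twice; keep one copy.
    return combined[~combined.index.duplicated(keep="first")]

# Hypothetical usage, wrapping the call shown in this post:
# fetch = lambda s, t, a, b: api.polygon.historic_agg_v2(s, 1, timespan=t, _from=a, to=b).df
# aapl = fetch_in_chunks(fetch, "AAPL", "minute", "2012-01-01", "2013-07-01")
```

Passing the fetch function in as an argument keeps the windowing logic independent of the SDK, so it can be checked against a fake data source before spending API calls on it.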
Paper or Live Trading?
Tried it on both; identical results.
Example Code
Case 1:
aapl = api.polygon.historic_agg_v2('AAPL', 1, timespan='hour', _from='2012-01-01', to='2013-07-01').df
                              open     high      low    close      volume
timestamp
2013-03-06 08:00:00-05:00  61.8500  61.8500  61.8286  61.8286      2800.0
2013-03-06 09:00:00-05:00  61.8243  62.1786  60.9071  61.0000  25419079.0
2013-03-06 10:00:00-05:00  61.0000  61.2857  60.8157  61.1114  21132972.0
2013-03-06 11:00:00-05:00  61.1157  61.5257  61.0143  61.1514  12994989.0
2013-03-06 12:00:00-05:00  61.1341  61.2543  60.8400  60.8771   8170974.0
...                            ...      ...      ...      ...         ...
2013-06-28 15:00:00-04:00  56.9286  57.1757  56.5329  56.5943  19002739.0
2013-06-28 16:00:00-04:00  56.5829  57.1414  56.2157  56.7071  27717753.0
2013-06-28 17:00:00-04:00  56.7014  56.7371  56.5725  56.7286   5209022.0
2013-06-28 18:00:00-04:00  56.7214  56.7286  56.7143  56.7286     15925.0
2013-06-28 19:00:00-04:00  56.7214  56.7429  56.7214  56.7286     23758.0
[1253 rows x 5 columns]
Case 3:
aapl = api.polygon.historic_agg_v2('AAPL', 1, timespan='minute', _from='2012-01-01', to='2013-07-01').df
                              open     high      low    close    volume
timestamp
2013-03-05 09:45:00-05:00  60.6429  60.7014  60.5843  60.6543  626703.0
2013-03-05 09:46:00-05:00  60.6600  60.6928  60.1429  60.6721  590422.0
2013-03-05 09:47:00-05:00  60.6729  60.6729  60.5300  60.5457  385637.0
2013-03-05 09:48:00-05:00  60.5457  60.6429  60.5429  60.6429  491316.0
2013-03-05 09:49:00-05:00  60.6186  60.6571  60.6029  60.6286  505883.0
...                            ...      ...      ...      ...       ...
2013-06-28 19:43:00-04:00  56.7357  56.7357  56.7357  56.7357    2100.0
2013-06-28 19:50:00-04:00  56.7400  56.7414  56.7400  56.7414    3703.0
2013-06-28 19:54:00-04:00  56.7429  56.7429  56.7286  56.7286    4823.0
2013-06-28 19:55:00-04:00  56.7286  56.7286  56.7286  56.7286    3416.0
2013-06-28 19:59:00-04:00  56.7286  56.7286  56.7286  56.7286    2191.0
[50000 rows x 5 columns]