Alpaca Python API only returns 100 days data

A few things are going on here. First, a change 3 days ago set the default number of bars returned to 100. This was actually always the default; it just wasn’t enforced uniformly.

One consequence is that the get_aggs method now defaults to returning 100 bars. That wouldn’t be a big issue except that this API has no way to specify a limit, so you are stuck with a maximum of 100 bars. The solution is to use the get_barset method instead. The get_aggs method/API wasn’t documented and was on the road to being deprecated. It won’t be included in the V2 updates coming out in a few months.

The get_barset method (the bars API) is the recommended way to get Alpaca aggregated trade data. It defaults to 100 bars, so if the date range covers more bars than that, you need to specify a limit parameter, like this:

bar_data = api.get_barset(symbols='SPY', timeframe='day', start=from_date_iso, end=to_date_iso, limit=1000)

The API returns a maximum of 1000 bars, which is about 4 years of daily data or 2-3 days of minute data (depending on the number of after-hours bars).
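
If a date range needs more than 1000 bars, one option is to split it into smaller windows and call get_barset once per window. Below is a minimal sketch of just the splitting step. The helper name split_date_range and the 1000-calendar-day window are my own assumptions for illustration, not part of the Alpaca SDK. Calendar days always outnumber trading days, so each window of daily bars should stay under the limit. The sketch does not call the API.

```python
from datetime import date, timedelta


def split_date_range(start, end, max_days=1000):
    """Split [start, end] into consecutive windows of at most max_days calendar days.

    Hypothetical helper for illustration; not part of the Alpaca SDK.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    windows = []
    window_start = start
    while window_start <= end:
        # Each window is inclusive of both endpoints, hence max_days - 1.
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# Made-up example range covering six calendar years.
windows = split_date_range(date(2015, 1, 1), date(2020, 12, 31))
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each (start, end) pair could then be converted to ISO format and passed to a separate get_barset call with limit=1000.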

If migrating from the get_aggs method, note that the date parameters should be in ISO format. Here and here are forum posts showing how to ensure dates are formatted correctly by using the isoformat method.
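
As a rough illustration of the isoformat approach, here is a sketch that builds the start and end strings with the standard library. It assumes timezone-aware timestamps in UTC (you may prefer the market’s timezone instead), the to_iso helper is hypothetical, and the dates are made up. The variable names match the get_barset call shown earlier.

```python
from datetime import datetime, timezone


def to_iso(year, month, day):
    """Return midnight UTC on the given date as an ISO 8601 string.

    Hypothetical helper for illustration; UTC is an assumption.
    """
    return datetime(year, month, day, tzinfo=timezone.utc).isoformat()


from_date_iso = to_iso(2020, 1, 2)
to_date_iso = to_iso(2020, 6, 30)
print(from_date_iso)  # 2020-01-02T00:00:00+00:00
print(to_date_iso)    # 2020-06-30T00:00:00+00:00
```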

A little background on this method: the primary use case is to fetch data in a trading algorithm relative to the current day or time. As an example, perhaps one wants to make a trade decision based on the 20-trading-day average price of a stock. This is sometimes called the Simple Moving Average or SMA. Rather than fetching the current day and calculating the start day, this could be done like this:

bar_data = api.get_barset(symbols='SPY', timeframe='day', limit=20)
sma_20 = bar_data.df['SPY'].close.mean()

Similarly, if one wanted the max price over the last 60 one-minute bars, it could be done like this:

bar_data = api.get_barset(symbols='SPY', timeframe='minute', limit=60)
max_price = bar_data.df['SPY'].high.max()

These work because end defaults to the current datetime. In an algorithm, one may never need to deal with dates at all; just use the limit to count back from the default current date or time.

One thing to consider when dealing with bars: if there is no data for a bar (i.e. the stock didn’t trade during that time), there won’t be a bar. A lot of folks ask why there are ‘missing bars’ or why the method sometimes returns 1 day of data and sometimes 2. That’s the reason.
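
To see which bars are absent, one can compare the timestamps that came back against the full set of expected ones. This sketch uses plain datetime objects and made-up timestamps rather than a real API response, and find_missing_minutes is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timedelta


def find_missing_minutes(bar_times, start, end):
    """Return every minute in [start, end] for which no bar timestamp exists.

    Hypothetical helper for illustration; not part of the Alpaca SDK.
    """
    present = set(bar_times)
    missing = []
    current = start
    while current <= end:
        if current not in present:
            missing.append(current)
        current += timedelta(minutes=1)
    return missing


# Made-up timestamps: the stock did not trade at 09:32 or 09:34.
bar_times = [
    datetime(2020, 1, 2, 9, 30),
    datetime(2020, 1, 2, 9, 31),
    datetime(2020, 1, 2, 9, 33),
]
missing = find_missing_minutes(
    bar_times,
    start=datetime(2020, 1, 2, 9, 30),
    end=datetime(2020, 1, 2, 9, 34),
)
print([t.strftime("%H:%M") for t in missing])  # ['09:32', '09:34']
```

Here 09:32 and 09:34 come back as missing, which mirrors what happens when the stock simply didn’t trade in those minutes.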

I see there was a comment “There are also ongoing issues with credential validation.” That is a little off topic from data but what is the issue? There shouldn’t be problems with authorization. Maybe open a separate post and we can get it resolved.
