Hello, I have been having some trouble using the Alpaca API as it is giving me incorrect data, such that the API is useless to me. I am still practising with the paper API as i’m not sure whether to use this API for live trading yet.
I have attached the sample code below,
I can prove that alpaca returns unreliable data. NIO released its IPO on September 12, 2018, yet somehow data is returned since 2003-10-01. What concerns me more is there is no missing data in any of these rows, so how is this data being populated? How many other stocks is this the case for? Please see the NIO example below.
import alpaca_trade_api as alpaca_api
nio = alpaca_api.REST(**ALPACA_HEADERS).polygon.historic_agg_v2('AAPL', 1, 'day', _from='1900-01-01', to='2099-12-31').df # returns data since 2003
nio.index.min()
Out[8]: Timestamp('2003-10-01 00:00:00-0400', tz='America/New_York')
nio.index.to_series().diff().max()
Out[23]: Timedelta('887 days 00:00:00')
Agree with @melgazar9 I suggest not to use Alpaca data as your source; it is not reliable (In addition, I experienced the unreliability of Alpaca paper trading)
I checked their historical data, and there is a significant discrepancy between data in yahoo finance and alpaca data.
Please, check this example.
# alpaca data
NY = 'America/New_York'
start_iso = pd.Timestamp('2016-07-21 09:30:00', tz=NY).isoformat()
end_iso = pd.Timestamp('2016-07-21 16:00:00', tz=NY).isoformat()
api.get_barset(['FNF'], 'day', start=start_iso, end=end_iso).df
FNF
open high low close volume
time
2016-07-21 00:00:00-04:00 37.06 37.73 36.8 37.52 2652549
and
# yahoo finance data
import yfinance as yf
yf.download(tickers='FNF',start='2016-07-21',end='2016-07-22')
[*********************100%***********************] 1 of 1 completed
Open High Low Close Adj Close Volume
Date
2016-07-20 26.678699 26.693142 26.555958 26.592058 23.105539 4056804
2016-07-21 26.758123 27.241877 26.570396 27.083033 23.532139 4427291
The data returned by the get_barset method is not adjusted for splits or dividends. Furthermore, it’s only data reported from several major exchanges and not the broader market. It is provided free for all users with the intended use case to be debugging algorithms in paper trading.
The Polygon.io data is provided is by default adjusted for splits but not dividends. It is generally very reliable. It can be fetched as split or non-split adjusted.
Ok, the alpaca devs uses Polygon api for their stock data so they cant really do anything about the prices being incorrect, but
BARSET have been depreciated so use get_bars instead
alpaca_api = tradeapi.REST(key["PUBLIC_KEY"],key["SECRET_KEY"],key["END_POINT"])
barset = alpaca_api.get_bars([ticker], "5Day", start =trade_Date, end= "2020-08-05", adjustment="all"
,limit=507)
#adjustment = "all" is the most important thing here, it will count all stock splits, dividends, and what ever, making the data accurate.
However there still are times when the API gives back pre IPO price for no reason like seen below:
def removeExtraFromBarset(self,barset):
if len(barset)==0:
return []
removeindex = -1
for i in range(len(barset)):
if barset[i].v ==0:
removeindex = i
if(removeindex!=-1 and barset[i].v!=0):
return barset[removeindex+2:]