Market historical data discrepancies

I recently began experimenting with Alpaca’s free-tier Market Data API. Specifically, I have been testing the “Historical bars” stock endpoint, which returns OHLC aggregates based on a symbol and timeframe. When using an intraday interval such as 5 minutes, 1 hour, or 4 hours, there is a discrepancy between the candle data returned by the API and the candle data available on platforms such as Yahoo Finance or TradingView. Is this an issue with the free-tier subscription? Could someone explain what is causing this discrepancy?

@matt.bt123 It may help to understand a few things about market data. Essentially the trade and quote data is ‘owned’ by the exchanges and they charge a fee to use or display any current data. Data older than 15 minutes (ie not ‘current’) they allow for use without a fee. Alpaca licenses data from the exchanges and passes along those fees for current data in the form of a paid subscription. Alpaca paid subscriptions have access to full market historical and current data. Alpaca free subscriptions have access to the same full market historical data (ie older than 15 minutes) but not full market current data.

In an effort to help users test and debug algorithms, Alpaca provides data from the IEX exchange which uniquely does NOT charge for using their data. The free data subscriptions have access to current market data but only those trades and quotes (and associated bars) which executed on the IEX exchange.

So, if one is querying non-current data (ie data older than 15 minutes) it will be the same whether one has a free or a paid subscription. It is helpful to explicitly specify feed=sip to ensure one is getting the full market sip data and not the subset provided with feed=iex. The full market sip data should match other sources (unless they are for some reason not providing full market data).
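
As a rough illustration of specifying the feed explicitly, here is a minimal sketch that only builds the request URL for a historical bars query. The base URL, the path layout, and the parameter names other than `feed` are my assumptions about the v2 stock bars endpoint, and the symbol and dates are made up, so check them against the current API docs before relying on this.

```python
from urllib.parse import urlencode

# Assumed base URL for the v2 stock market data API (not stated in this thread).
BASE_URL = "https://data.alpaca.markets/v2/stocks"


def build_bars_url(symbol, timeframe, start, end, feed="sip"):
    """Build a historical bars URL with the feed spelled out explicitly.

    Parameter names other than `feed` are assumptions for illustration.
    """
    params = {
        "timeframe": timeframe,  # e.g. "5Min" or "1Hour" (assumed format)
        "start": start,          # RFC 3339 timestamp
        "end": end,              # RFC 3339 timestamp
        "feed": feed,            # "sip" = full market, "iex" = IEX-only subset
    }
    return f"{BASE_URL}/{symbol}/bars?{urlencode(params)}"


# Hypothetical example values, chosen only to show the shape of the request.
url = build_bars_url("AAPL", "5Min", "2023-01-03T14:30:00Z", "2023-01-03T21:00:00Z")
print(url)
```

The request itself would be sent with your API key and secret in the headers. Building the same URL with `feed="iex"` and comparing the two responses is a quick way to see how much of the discrepancy comes from the feed alone.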

This may clear up any questions, but if not, could a specific example be provided? Include the symbol, start time, end time, and bar interval, and perhaps a screenshot of other sources which don’t seem to match. Also indicate the current time (to determine if it’s a request for current data or not).
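
For the "current or not" question, a small sketch like the following can help when putting together an example. It simply checks whether the end of the requested window falls inside the last 15 minutes, using the 15-minute threshold described above. The function name and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Data newer than this is treated as 'current' (threshold taken from the explanation above).
CURRENT_DATA_WINDOW = timedelta(minutes=15)


def is_current_request(end, now=None):
    """Return True if the request's end time is within the last 15 minutes of `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return end > now - CURRENT_DATA_WINDOW


# Hypothetical timestamps for illustration only.
now = datetime(2023, 1, 3, 15, 0, tzinfo=timezone.utc)
recent_end = datetime(2023, 1, 3, 14, 50, tzinfo=timezone.utc)
older_end = datetime(2023, 1, 3, 14, 40, tzinfo=timezone.utc)

print(is_current_request(recent_end, now))  # end is 10 minutes old
print(is_current_request(older_end, now))   # end is 20 minutes old
```

If the check comes back true on a free subscription, a mismatch with other sources is expected, since the most recent bars would come from IEX trades only.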
