Incorrect spikes/peaks in historical data

I use the (paid) SIP feed for historical 1-minute OHLC data. The data is mostly in line with other sources, except that from time to time there are random incorrect spikes (see image, top is Alpaca data, bottom is Yahoo).

What’s going on there and is there a fix? This makes the data pretty useless for backtesting because these spikes will almost certainly (incorrectly) trigger stop-loss or take-profit limits.

I’ve seen this with historical data before as well. I also see it a lot when streaming live trade data, where wild prices will come through, but when compared to Yahoo, Fidelity, or even Alpaca’s own data after the current minute bar is finalized, those prices “never happened”.

I see the same issues with historical trades data.
https://data.alpaca.markets/v2/stocks/{symbol}/trades

I observe these spikes (outside of the current quote) also on the stream from polygon.io. Some get corrected, but some stay. I assume they really are part of the tape, and if everyone else is also seeing them, they become part of the game, and might cause reactions by algorithms and traders :man_shrugging:

Answer from @Dan_Whitnable_Alpaca to a similar question on Slack:

Those are actual reported trades from the SIPs and are typically 1) an odd trade condition such as a ‘Derivatively Priced - 4’ or ‘Prior Reference Price - P’, which would normally be excluded from bar calculations, or 2) an incorrectly reported price, which will get corrected by the SIPs later in the day.

And also this Slack thread:

Alpaca was inadvertently not filtering for the U condition (Extended Hours Sold (Out of Sequence)). These are trades which 1) executed outside of normal market hours and 2) were reported late. These typically do not occur that frequently. However, in the case of TSLA on 2023-12-01, there apparently were several trades which executed before market hours but didn’t get reported until 9:28 ET. These were the trades which were messing up the minute bar data.
A fix has been put in place to correctly exclude these trades with a U condition and the bar data has been updated.
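
For anyone who wants to check their own bars against the raw tape, here is a minimal sketch of the idea described above: drop trades carrying one of the problem conditions, then rebuild minute bars from what is left. It assumes trade records shaped like the JSON from the trades endpoint (`t` timestamp, `p` price, `s` size, `c` list of condition codes) and that they arrive in chronological order. The excluded set (`4`, `P`, `U`), the helper names, and the sample prices are my own illustrative choices, not Alpaca's actual bar rules, which treat more conditions and in a more nuanced way.

```python
# Assumption: condition codes to drop before building bars.
# "4" = Derivatively Priced, "P" = Prior Reference Price,
# "U" = Extended Hours Sold (Out of Sequence).
EXCLUDED_CONDITIONS = {"4", "P", "U"}


def is_bar_eligible(trade):
    """Return True if the trade carries none of the excluded conditions."""
    return not EXCLUDED_CONDITIONS.intersection(trade.get("c", []))


def minute_bars(trades):
    """Build 1-minute OHLCV bars from eligible trades.

    Assumes trades are already sorted by time and that "t" is an
    RFC 3339 string, so the first 16 characters identify the minute.
    """
    bars = {}
    for trade in trades:
        if not is_bar_eligible(trade):
            continue  # skip late-reported or specially priced trades
        minute = trade["t"][:16]  # e.g. "2023-12-01T14:28"
        price, size = trade["p"], trade["s"]
        bar = bars.get(minute)
        if bar is None:
            bars[minute] = {"o": price, "h": price, "l": price,
                            "c": price, "v": size}
        else:
            bar["h"] = max(bar["h"], price)
            bar["l"] = min(bar["l"], price)
            bar["c"] = price
            bar["v"] += size
    return bars


# Made-up sample: the middle trade has the U condition and an off-market price.
sample = [
    {"t": "2023-12-01T14:28:01Z", "p": 100.00, "s": 100, "c": ["@", "T"]},
    {"t": "2023-12-01T14:28:05Z", "p": 112.50, "s": 50, "c": ["U"]},
    {"t": "2023-12-01T14:28:30Z", "p": 100.25, "s": 200, "c": ["@", "T"]},
]

print(minute_bars(sample))
```

If a bar rebuilt this way no longer shows the spike while the downloaded bar still does, the spike most likely came from a trade with one of these conditions rather than from a price that actually traded in sequence.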