Too much noise in historical OHLC market data

It seems that the historical data that can be obtained through the Alpaca API is just useless in terms of strategy development… Does anyone have any idea what to do about this problem?

@Serg What are you seeing, or not seeing, in the historical data that makes you feel it “is useless in terms of strategy development”? What are you trying to do, and what would you expect to see?

I would like to see the data without noise. Here is the same period, just OHLC taken from another source. If you look at these two images, you can see the difference…

#Spikes #Outliers in aggregate history bars from Alpaca

@Serg The Alpaca data for AAPL on 2021-08-31 is correct. The large dips in the lows (and an occasional high) in this case seem to be the result of trades with a P condition. As an example, below is the single trade which caused the 14:03 bar to have a low of $126.96.

While I might argue that a P condition should stand for “Problematic”, it actually stands for “Prior Reference Price Trade”. Basically, this is a trade which actually occurred at the time specified (the timestamps, by the way, are when the trade actually occurred). However, the trade should have occurred earlier. It’s not a late reported trade but actually a late executed trade. These trades can be minutes, hours, or days late. There’s a bit more information here on the FINRA site. So, while this is sort of an “oops”, it is a valid trade which gets reported like most other trades.

In determining which trades to include and which to exclude, Alpaca relies on our SIP providers’ guidance. Both UTP and CTA recommend using P condition trades in calculating the high, low, and volume, and, with qualifications, in calculating the open and close. See the respective tables below.

Alpaca implements the SIP recommendations very faithfully. Many other providers (perhaps the one referenced above) take shortcuts. Specifically, other data providers often only filter trades by the “Last” or close conditions and then, incorrectly, use those trades in calculating the open, high, low, close, and volume. Alpaca filters each value separately based upon the SIP recommendations. That is why other sources may have different high and low values from Alpaca.
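To make the difference concrete, here is a rough Python sketch of the two approaches. It is not Alpaca's actual code. The Trade class, both function names, and all prices and sizes except the $126.96 low are made up for illustration, and the "with qualifications" rule for open and close is simplified to "P trades never set the open or close". Both functions assume at least one non-P trade in the bar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    price: float
    size: int
    conditions: frozenset = frozenset()  # e.g. frozenset({"P"})


def bar_per_field(trades):
    """Filter each bar field separately (simplified SIP-style rules)."""
    # Simplification: P trades never update open/close in this sketch.
    open_close = [t for t in trades if "P" not in t.conditions]
    # P trades DO count toward high, low, and volume.
    return {
        "open": open_close[0].price,
        "high": max(t.price for t in trades),
        "low": min(t.price for t in trades),
        "close": open_close[-1].price,
        "volume": sum(t.size for t in trades),
    }


def bar_last_eligible_only(trades):
    """Shortcut: keep only close-eligible trades, use them for every field."""
    eligible = [t for t in trades if "P" not in t.conditions]
    return {
        "open": eligible[0].price,
        "high": max(t.price for t in eligible),
        "low": min(t.price for t in eligible),
        "close": eligible[-1].price,
        "volume": sum(t.size for t in eligible),
    }


# One hypothetical minute of trades, in time order. Only 126.96 comes
# from the thread; the other prices and all sizes are invented.
trades = [
    Trade(152.10, 100),
    Trade(152.25, 200),
    Trade(126.96, 50, frozenset({"P"})),
    Trade(152.18, 100),
]

per_field = bar_per_field(trades)
shortcut = bar_last_eligible_only(trades)
print("per-field low:", per_field["low"], "volume:", per_field["volume"])
print("shortcut low: ", shortcut["low"], "volume:", shortcut["volume"])
```

With these made-up trades, both bars get the same open and close, but only the per-field bar picks up the P trade in its low and volume. That is the kind of gap you would see when comparing against a source that drops P trades entirely.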