I have the Pro data plan, and I ran WebSocket trade updates for AAPL on March 10, 2021 on both Alpaca and Polygon for comparison. I consistently received only around 80% of the trades from Alpaca that I received from the Polygon WebSocket.
Here are the stats for a 10-minute window:
Start time: 2021-03-10T17:10:00Z (NYC 12:10 PM)
End time: 2021-03-10T17:20:00Z (NYC 12:20 PM)
#Trades received in Alpaca websocket = 9170
#Trades received in Polygon websocket = 11236
#Trades in that time window in historic data download via Polygon = 11236
Alpaca is missing roughly 18% of the trades (2,066 of 11,236), while Polygon has 100% coverage.
I'm wondering what is wrong on the Alpaca side.
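The counts for that window can be checked with a quick calculation. This is only arithmetic on the numbers posted above; the variable names are mine.

```python
# Trade counts for AAPL, 2021-03-10T17:10:00Z to 17:20:00Z, as posted above.
alpaca_trades = 9170
polygon_trades = 11236

missing = polygon_trades - alpaca_trades
coverage = alpaca_trades / polygon_trades

print(f"missing trades: {missing}")
print(f"Alpaca coverage: {coverage:.1%}")
print(f"gap: {1 - coverage:.1%}")
```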
I did a bit more analysis on the missing trades. Here are the trades in a 1-second window (2021-03-10T17:10:04.000Z - 2021-03-10T17:10:05.000Z) from the Alpaca stream and the Polygon data.
Polygon has 23 trades but Alpaca has only 13. An example of a trade that is missing on the Alpaca side:
Have you tried looking at the exchange ids? Is it possible Alpaca is missing trades from 1 or more exchanges? Or can you find a pattern with trade conditions? For some reason, some trades with special conditions could be filtered. Just an idea.
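A minimal sketch of that kind of check, assuming both feeds have already been normalized into dicts with hypothetical `exchange`, `id`, and `conditions` fields. The two feeds may encode exchanges and trade ids differently, so a mapping step would likely be needed first. The sample records are made up for illustration.

```python
from collections import Counter

def trade_key(trade):
    # Assumption: (exchange, id) identifies the same print in both feeds.
    return (trade["exchange"], trade["id"])

def find_missing(reference, candidate):
    """Return trades present in `reference` but absent from `candidate`."""
    seen = {trade_key(t) for t in candidate}
    return [t for t in reference if trade_key(t) not in seen]

def summarize(trades):
    """Count trades per exchange and per condition code."""
    by_exchange = Counter(t["exchange"] for t in trades)
    by_condition = Counter(c for t in trades for c in t["conditions"])
    return by_exchange, by_condition

# Made-up records, only to show the shape of the comparison.
polygon = [
    {"exchange": 4, "id": "a1", "conditions": [14, 41]},
    {"exchange": 4, "id": "a2", "conditions": [37]},
    {"exchange": 11, "id": "b1", "conditions": []},
    {"exchange": 12, "id": "c1", "conditions": [14, 41]},
]
alpaca = [
    {"exchange": 11, "id": "b1", "conditions": []},
    {"exchange": 12, "id": "c1", "conditions": [14]},
]

missing = find_missing(polygon, alpaca)
by_exchange, by_condition = summarize(missing)
print("missing by exchange:", dict(by_exchange))
print("missing by condition:", dict(by_condition))
```

If the missing trades cluster on one exchange id or one condition code, that points to a filter rather than random loss.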
Can someone at Alpaca comment on this difference?
Where is the Alpaca data being sourced from in general?
I assumed they were just going to offer Polygon straight through the UI for the $50/month, but that doesn’t seem to be the case.
But in the Polygon feed and historic data, most condition 14 (Intermarket Sweep) trades also have condition 41 (Trade Thru Exempt) attached to them.
For trading decisions I generally ignore trades with condition 41 because they don't reflect the true price. But on the Alpaca side this condition is missing.
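The filter described above might look like this. It is a sketch only: the `conditions` field name and the sample prices are assumptions, and the numeric codes 14 and 41 are the ones quoted in this post.

```python
INTERMARKET_SWEEP = 14   # condition code as quoted above
TRADE_THRU_EXEMPT = 41   # condition code as quoted above

def drop_conditions(trades, excluded=(TRADE_THRU_EXEMPT,)):
    """Keep only trades that carry none of the excluded condition codes."""
    excluded = set(excluded)
    return [t for t in trades if excluded.isdisjoint(t["conditions"])]

# Made-up trades for illustration.
trades = [
    {"price": 120.10, "size": 100, "conditions": []},
    {"price": 119.80, "size": 500, "conditions": [INTERMARKET_SWEEP, TRADE_THRU_EXEMPT]},
    {"price": 120.12, "size": 200, "conditions": [INTERMARKET_SWEEP]},
]

kept = drop_conditions(trades)
print([t["price"] for t in kept])
```

If a feed never attaches condition 41, a filter like this silently keeps those trades, so the same strategy would see different prices on the two feeds.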
It seems a potential lack of dark-pool trades is the leading hypothesis for the 10-20% tape gap. It would make sense, since the tape can be perpetually ~10 seconds behind for all dark pool executions. Polygon states that it covers all 16 exchanges and all US volume, and it sounds like Alpaca’s Pro description of “all US exchanges” is equivalent.
Is everyone seeing equivalent:
This is what I’m trying to figure out. If they are sourcing the data from the SIP feeds, everything would be included. Dark pools are required to report to FINRA, and FINRA’s data gets broadcast through CTA and UTP. They might be sourcing data from a third party? I really have no idea, but that would make more sense than them purposefully excluding 10-20% of the data. If they are sourcing the data through a third party, they have to get the NBBO data from somewhere to execute trades at those NBBO prices (which I assume they are doing).
I found numerous data issues with Alpaca, to the point that I’m quite confident that the team behind it is not competent and has no fucking idea what they are doing. You won’t hear from them why it’s messed up because they don’t know themselves.
Ran into a similar issue today, and based on my research, I believe the issue is that Alpaca is using the exchange-provided timestamp for their time windows, while Polygon is using the SIP timestamps.
While Polygon provides both the SIP and exchange timestamps, Alpaca unfortunately only seems to provide/use the exchange timestamp. This causes some quite significant discrepancies in how bars are aggregated, leading to incorrect prices and volume. I will say that the full volume of trades does seem to exist, just not in the correct bars. E.g. on 5/4/2022 the 8AM EST bar for TQQQ shows around 200k volume on Polygon/TradingView, but only ~20k (10%) on Alpaca. However, if I look at the volume of trades spanning 7AM EST to 8AM EST, the total volumes are more or less the same. Unfortunately Alpaca isn’t just off by 1 or 2 bars; sometimes the trades are an hour off compared to Polygon.
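To illustrate how the choice of timestamp moves volume between bars, here is a minimal sketch that aggregates the same trades twice, once per timestamp. The field names `exchange_ts` and `sip_ts` and the sample trades are hypothetical; this is not either vendor's actual aggregation logic.

```python
from collections import defaultdict
from datetime import datetime, timezone

def bar_volumes(trades, ts_field, bar_seconds=3600):
    """Sum trade sizes into fixed-width bars keyed by bar start time (UTC)."""
    bars = defaultdict(int)
    for trade in trades:
        epoch = int(trade[ts_field].timestamp())
        bar_start = epoch - epoch % bar_seconds
        bars[datetime.fromtimestamp(bar_start, tz=timezone.utc)] += trade["size"]
    return dict(bars)

def utc(hour, minute, second):
    return datetime(2022, 5, 4, hour, minute, second, tzinfo=timezone.utc)

# Made-up trades: the first one straddles the 12:00 UTC bar boundary.
trades = [
    {"exchange_ts": utc(11, 59, 58), "sip_ts": utc(12, 0, 1), "size": 300},
    {"exchange_ts": utc(12, 0, 5), "sip_ts": utc(12, 0, 6), "size": 100},
]

by_exchange_ts = bar_volumes(trades, "exchange_ts")
by_sip_ts = bar_volumes(trades, "sip_ts")

for start, volume in sorted(by_exchange_ts.items()):
    print("exchange ts bar", start.isoformat(), volume)
for start, volume in sorted(by_sip_ts.items()):
    print("SIP ts bar     ", start.isoformat(), volume)
```

Total volume is identical in both runs; only the per-bar split differs, which matches the observation that the trades exist but land in different bars.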
I currently have a ticket open with support, but I’m not feeling very hopeful. I get the feeling that the support person doesn’t understand my issue, and for whatever reason isn’t able/willing to pull in someone from the data team to provide support.
Alpaca could be the perfect offering if they get these data issues fixed. I’m currently paying $200/mo for Polygon, and I’m really hoping I can switch to Alpaca sooner rather than later.