V2 websocket stream missing data compared to Polygon

I’m running a simple test to check the quality of the V2 stream data, following the recent switch from Polygon to the Pro plan. Looking at the trade stream, it seems to be missing about 10% of the trades compared to Polygon.

I aggregated trade data over the same 10-minute window for both the Pro plan and Polygon for a few tickers, and I consistently saw about 8-15% of trades missing on Alpaca Pro, depending on the ticker.

For example, between 13:50 and 14:00 on March 1st:
Ticker  Source   # of trades  Sum of trade size
TSLA    Pro             9089             290766
TSLA    Polygon         9991             314831
FB      Pro             2168             144316
FB      Polygon         2568             164978
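
For anyone who wants to check the percentages, here is a minimal sketch that computes the shortfall from the figures in the table above. The names `observed`, `shortfall_pct`, and `shortfalls` are just illustrative, and Polygon is taken as the baseline only because it reports the larger numbers.

```python
# Figures copied from the table above: (ticker, source) -> (trade count, total size)
observed = {
    ("TSLA", "Pro"): (9089, 290766),
    ("TSLA", "Polygon"): (9991, 314831),
    ("FB", "Pro"): (2168, 144316),
    ("FB", "Polygon"): (2568, 164978),
}


def shortfall_pct(pro, polygon):
    """Percentage of the Polygon figure that is absent from the Pro figure."""
    return 100.0 * (polygon - pro) / polygon


def shortfalls(ticker):
    """Return (trade-count shortfall %, trade-size shortfall %) for one ticker."""
    pro_count, pro_size = observed[(ticker, "Pro")]
    poly_count, poly_size = observed[(ticker, "Polygon")]
    return shortfall_pct(pro_count, poly_count), shortfall_pct(pro_size, poly_size)


for ticker in ("TSLA", "FB"):
    count_gap, size_gap = shortfalls(ticker)
    print(f"{ticker}: {count_gap:.1f}% fewer trades, {size_gap:.1f}% less size")
```

This prints a gap of 9.0% by trade count and 7.6% by size for TSLA, and 15.6% by trade count and 12.5% by size for FB.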

I also spot-checked a few trades, and it seems every trade from Pro can be found in Polygon, but not vice versa.
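
The spot check can be done over a whole window rather than by hand. Below is a rough sketch, assuming each trade has already been reduced to a `(timestamp, price, size)` tuple that is comparable across the two feeds. That matching key is my assumption and the sample trades are made up; real feeds may need timestamps normalized before the tuples line up.

```python
from collections import Counter


def unmatched(left, right):
    """Trades in `left` with no counterpart in `right`.

    Counter subtraction keeps duplicates honest: two identical prints in
    `left` need two identical prints in `right` to be fully matched.
    """
    return sorted((Counter(left) - Counter(right)).elements())


# Made-up sample trades: (timestamp, price, size)
polygon_trades = [
    ("13:50:00.101", 700.10, 100),
    ("13:50:00.250", 700.12, 50),
    ("13:50:00.250", 700.12, 50),  # a genuine repeat of the previous print
    ("13:50:01.003", 700.15, 200),
]
pro_trades = [
    ("13:50:00.101", 700.10, 100),
    ("13:50:00.250", 700.12, 50),
    ("13:50:01.003", 700.15, 200),
]

only_in_pro = unmatched(pro_trades, polygon_trades)
only_in_polygon = unmatched(polygon_trades, pro_trades)
print("only in Pro:", only_in_pro)
print("only in Polygon:", only_in_polygon)
```

An empty `only_in_pro` together with a non-empty `only_in_polygon` would match what I saw: Pro looks like a strict subset of Polygon.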

I wonder whether this is a known issue and whether the Alpaca team can investigate and fix it. I’m also not sure whether Alpaca has done any benchmarking of data quality. If it is indeed an issue, it could be a big problem for algo trading…

I checked the historical trade data from Pro, and the behavior is the same.

Another possible explanation is that Polygon includes data from some exchanges that Alpaca Pro doesn’t. Below is from the Polygon FAQ:

We include all 16 public exchanges + dark pools. Our feed contains 100% market volume for US trading.

It looks like Polygon includes dark pools. I’m not sure whether Alpaca does.
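
One way to test the dark pool theory would be to tally the trades that appear only in Polygon by reporting venue. This is only a sketch: the `(timestamp, price, size, venue)` layout, the venue labels, and the sample trades are all placeholders I made up, not the real exchange codes or message fields of either feed.

```python
from collections import Counter

# Made-up trades: (timestamp, price, size, venue). Venue labels are placeholders.
polygon_trades = [
    ("13:50:00.101", 700.10, 100, "VENUE_A"),
    ("13:50:00.250", 700.12, 50, "OFF_EXCHANGE"),
    ("13:50:01.003", 700.15, 200, "VENUE_B"),
    ("13:50:01.400", 700.11, 75, "OFF_EXCHANGE"),
]
pro_trades = [
    ("13:50:00.101", 700.10, 100, "VENUE_A"),
    ("13:50:01.003", 700.15, 200, "VENUE_B"),
]


def missing_by_venue(reference, candidate):
    """Count trades present in `reference` but not in `candidate`, per venue.

    Trades are matched on (timestamp, price, size) only, so differing venue
    labels between the two feeds do not prevent a match.
    """
    seen = Counter(trade[:3] for trade in candidate)
    tally = Counter()
    for trade in reference:
        key = trade[:3]
        if seen[key] > 0:
            seen[key] -= 1  # consume one matching print
        else:
            tally[trade[3]] += 1  # unmatched: attribute it to its venue
    return tally


print(missing_by_venue(polygon_trades, pro_trades))
```

If nearly all of the missing trades fall under a single off-exchange venue, that would support the dark pool explanation. If they are spread across the lit exchanges as well, something else is going on.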

Bumping this question. I’d also like to know whether anyone else has tested Polygon vs. Pro data.

I’d like to hear a response to this as well. Missing dark pool data may very well be the explanation, but this needs to be confirmed by Alpaca. Transparency about the data we will now have to pay for is essential.

Bump, this issue has been open for too long. I am considering the paid option, but if this issue still exists then there is really no point in paying for garbage data.