Polygon aggregate data glitch

Zachary_Warren · April 15, 2020, 4:09pm

So, I am sending REST requests every minute for 1-minute aggregate data for a ticker and returning the MAX and MIN in the data set. However, after about 14-20 requests, the dataset I receive is completely different and comes from an earlier time frame. Has anyone else had issues with this?

nabeel · April 21, 2020, 12:54am

I’m also seeing behavior like this, reading straight from Polygon too. There are invalid values for high or close or something, and it’s throwing calculations off considerably.

Zachary_Warren · April 21, 2020, 12:03pm

This is causing serious problems in my code (taking positions when it should not). I actually made a short data processor that tests all the incoming objects from the server. It checks the timestamps of the data and rejects the objects if they are out of tolerance or empty. It just seems odd that I would have to do this for a service that I pay for.

nabeel · April 21, 2020, 1:44pm

How are you checking tolerance? That’s what I’m struggling with at the moment… I have the data in Pandas and if I plot it, I see spikes where there shouldn’t be. Timestamps look OK to me, Pandas is sorting it

Zachary_Warren · April 21, 2020, 2:02pm

Please excuse the messy code. The problem that I was having was that the time frames I was receiving were either incorrect or the object was empty. I took the timestamp of NYC and placed a tolerance at 99.9995% difference. So if any timestamp that I receive is outside of that tolerance, it is rejected. I do the same for empty data objects. It will store the most recent valid data set. This is set up to do HF requests for historic data, so there can be packet loss. However, once I started using the code below, I stopped having problems. You will also see how many invalid data objects you actually receive, which is a lot.
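The code from this post did not survive in the thread. What follows is only a minimal sketch of the checks described above, not the original code. It assumes each bar is a dict with an epoch-millisecond timestamp under the key `t`, and it reads the 99.9995% figure as a minimum ratio between the bar timestamp and the current time. The names `is_valid` and `BarFilter` are made up for illustration.

```python
import time

# Assumed reading of the "99.9995%" tolerance: the smaller of the two
# timestamps must be at least this fraction of the larger one.
TOLERANCE = 0.999995


def is_valid(bars, now_ms, tolerance=TOLERANCE):
    """Return True if bars is non-empty and every timestamp is near now_ms."""
    if not bars:
        return False  # empty data object
    for bar in bars:
        ts = bar.get("t")  # assumed field name: epoch milliseconds
        if not ts or ts <= 0:
            return False
        ratio = min(ts, now_ms) / max(ts, now_ms)
        if ratio < tolerance:
            return False  # timestamp out of tolerance
    return True


class BarFilter:
    """Keeps the most recent valid data set and counts rejected ones."""

    def __init__(self, tolerance=TOLERANCE):
        self.tolerance = tolerance
        self.last_valid = None
        self.rejected = 0

    def update(self, bars, now_ms=None):
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if is_valid(bars, now_ms, self.tolerance):
            self.last_valid = bars
        else:
            self.rejected += 1
        return self.last_valid
```

Under this reading, a ratio of 0.999995 on present-day epoch timestamps allows a window of roughly a couple of hours, so a fixed tolerance in seconds may be easier to reason about. Calling `update()` with each response always returns the latest data set that passed the checks, and `rejected` shows how many were thrown away.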

This brings me to another topic. Is there a better way to do this? I am definitely not a pro at this and would like some feedback if anyone has any. Thanks and I hope this helps you out! Cheers.

nabeel · April 21, 2020, 2:11pm

Thanks… the method I was thinking of was adding an additional column which has the difference from the previous item (pandas.diff() or something) and then deleting rows based on that column’s values.

Something like these two:

https://thispointer.com/python-pandas-how-to-drop-rows-in-dataframe-by-conditions-on-column-values/

I’ve found that usually with Pandas, there’s a 2 or 3 line way of doing anything you want
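A rough sketch of that diff-based idea, under some assumptions: the column name `close`, the 5% threshold, and the function name `drop_spikes` are all illustrative. A row is treated as a spike only when it jumps away from both the previous and the next row, so the normal bar right after a spike is kept.

```python
import pandas as pd


def drop_spikes(df, column="close", max_jump=0.05):
    """Drop rows whose value jumps more than max_jump (as a fraction)
    relative to both the previous and the next row."""
    s = df[column]
    # Relative change versus the previous row and versus the next row.
    jump_prev = s.diff().abs() / s.shift(1).abs()
    jump_next = s.diff(-1).abs() / s.shift(-1).abs()
    # Comparisons against NaN (first/last row) evaluate to False,
    # so the edge rows are never flagged.
    is_spike = (jump_prev > max_jump) & (jump_next > max_jump)
    return df[~is_spike]


if __name__ == "__main__":
    bars = pd.DataFrame({"close": [100.0, 101.0, 150.0, 101.0, 102.0]})
    print(drop_spikes(bars))
```

This only catches single-bar spikes in the middle of the data. A bad first or last bar, or several bad bars in a row, would need a different check, such as comparing against a rolling median.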

Zachary_Warren · April 21, 2020, 3:02pm

I think I might give that a try… Thanks!
