Polygon aggregate data glitch

So, I am sending REST requests every minute for 1-minute aggregate data for a ticker and returning the MAX and MIN of the data set. However, after about 14-20 requests, the dataset I receive is completely different and comes from an earlier time frame. Has anyone else had issues with this?

I’m also seeing behavior like this, reading straight from Polygon too. I get invalid values for high or close or something, and it’s throwing my calculations off considerably.

This was causing serious problems in my code (taking positions when it should not). I actually made a short data processor that tests all the incoming objects from the server. It checks the timestamps of the data and rejects the objects if they are out of tolerance or empty. It just seems odd that I would have to do this for a service that I pay for.

How are you checking tolerance? That’s what I’m struggling with at the moment… I have the data in Pandas and if I plot it, I see spikes where there shouldn’t be. Timestamps look OK to me, Pandas is sorting it

Please excuse the messy code. The problem I was having was that the time frames I was receiving were either incorrect or the object was empty. I took the current NYC timestamp and set a tolerance of 99.9995%. If any timestamp I receive is outside that tolerance, it is rejected. I do the same for empty data objects. The processor stores the most recent valid data set. This is set up to do high-frequency requests for historic data, so there can be packet loss. However, once I started using the code below, I stopped having problems. You will also see how many invalid data objects you actually receive, which is a lot.

This brings me to another topic. Is there a better way to do this? I am definitely not a pro at this and would like some feedback if anyone has any. Thanks and I hope this helps you out! Cheers.
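A minimal sketch of the kind of check described above (not the poster's original code). It assumes each response is a list of bar dicts with a "t" field holding a Unix timestamp in milliseconds; the class name LastGoodBars and the two-minute tolerance are hypothetical choices for illustration. Unix timestamps are UTC-based, so no New York time conversion is needed for the comparison.

```python
import time

# Hypothetical tolerance: how far the newest bar may be from "now".
MAX_AGE_MS = 2 * 60 * 1000


class LastGoodBars:
    """Keep the most recent response that passed the timestamp check."""

    def __init__(self, max_age_ms=MAX_AGE_MS):
        self.max_age_ms = max_age_ms
        self.last_good = None  # most recent accepted list of bars
        self.rejected = 0      # count of empty or out-of-tolerance responses

    def update(self, bars, now_ms=None):
        """Accept `bars` if non-empty and fresh; always return the last good set."""
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if self._is_fresh(bars, now_ms):
            self.last_good = bars
        else:
            self.rejected += 1
        return self.last_good

    def _is_fresh(self, bars, now_ms):
        # Reject empty responses outright.
        if not bars:
            return False
        stamps = [bar.get("t") for bar in bars]
        # Reject responses containing a bar without a timestamp.
        if any(ts is None for ts in stamps):
            return False
        # The newest bar must fall within the tolerance window around "now".
        return abs(now_ms - max(stamps)) <= self.max_age_ms
```

Calling update() on every response and computing the high and low only from its return value means a stale or empty response never reaches the trading logic, and the rejected counter shows how often that happens.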

Thanks… the method I was thinking of was adding an additional column holding the difference from the previous item (DataFrame.diff() or something) and then dropping rows based on that column's values.

Something like these two:

https://thispointer.com/python-pandas-how-to-drop-rows-in-dataframe-by-conditions-on-column-values/
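A rough sketch of the diff-and-drop idea, assuming the bars are already in a DataFrame sorted by time with a "close" column; the function name drop_spikes and the threshold are hypothetical. One caveat with a plain diff against the previous row is that the good row right after a spike also shows a large jump, so this version only drops a row when it differs sharply from both neighbours.

```python
import pandas as pd


def drop_spikes(df, column="close", max_jump=5.0):
    """Drop rows whose value jumps more than max_jump from both neighbours."""
    prev_jump = df[column].diff().abs()     # |x[i] - x[i-1]|
    next_jump = df[column].diff(-1).abs()   # |x[i] - x[i+1]|
    # The first and last rows have a NaN on one side; comparisons with NaN
    # are False, so those rows are never flagged.
    spike = (prev_jump > max_jump) & (next_jump > max_jump)
    return df[~spike]


# Made-up prices with one obvious spike in the middle.
bars = pd.DataFrame({"close": [100.0, 100.5, 150.0, 101.0, 100.8]})
clean = drop_spikes(bars)
```

An absolute threshold is used here for simplicity; a percentage change (Series.pct_change()) may be a better fit when prices vary widely across tickers.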

I’ve found that usually with Pandas, there’s a 2 or 3 line way of doing anything you want

I think I might give that a try… Thanks!