Why is a small subset of 1m OHLCV bars delayed by 30s from SIP websockets data connection?

I’ve been testing the streaming SIP websocket 1m bars on the unlimited data subscription, and I have consistently noticed that a subset of tickers is returned 30s late. The timestamp of the late bars is aligned with the “original” minute. The stocks affected are not always the same. I do not think it is caused by anything on my end. I receive the majority of bars on time, but there are always a few stragglers that come in at the 30s mark between minutes. Any idea what’s causing this?

  1. Are these actually coming late or is someone convinced it’s on my end? [At this point, I don’t think it’s on my end, but I can always be wrong.]
  2. If they are coming late, do the data still represent the indicated timestamp? [or should the timestamps actually be updated to be xx:xx:30 through xx:xx:30]?
  3. If they are coming late, why are they coming late, and is there any chance they could be sent sooner? I’m OK with some kind of grace period for considering data accumulated, but 30s is pretty much the maximum possible grace period for 1 min resolution data, and I consider that pretty late if that data point is supposed to be useful in a “real-time” context.

For example, in my logs, after hundreds/thousands of on-time entries for each minute, I’ll end up with a few entries like this:

Long Log Snippet of a particular example of tickers that came in 30s late
... lots of on-time entries... followed by a ~30s wait and then... stragglers... 

2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=SBUX, value={'open': 118.35, 'close': 118.385, 'high': 118.4, 'low': 118.34, 'volume': 3673}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=SYF, value={'open': 49.48, 'close': 49.5, 'high': 49.53, 'low': 49.48, 'volume': 9911}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=SITE, value={'open': 175.55, 'close': 175.3104, 'high': 175.8, 'low': 175.3104, 'volume': 7743}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=CRTD, value={'open': 4.9152, 'close': 5.027, 'high': 5.06, 'low': 4.88, 'volume': 158400}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=CETX, value={'open': 2.09, 'close': 2.04, 'high': 2.1, 'low': 2.02, 'volume': 1006254}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=PBR, value={'open': 10.8499, 'close': 10.84, 'high': 10.855, 'low': 10.84, 'volume': 32487}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=EWZ, value={'open': 38.62, 'close': 38.6, 'high': 38.63, 'low': 38.6, 'volume': 65083}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=SPCE, value={'open': 43.79, 'close': 43.89, 'high': 43.89, 'low': 43.7, 'volume': 222394}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=FDX, value={'open': 301.19, 'close': 300.93, 'high': 301.19, 'low': 300.89, 'volume': 7035}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=FHN, value={'open': 16.88, 'close': 16.9, 'high': 16.9, 'low': 16.875, 'volume': 50655}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=PAYO, value={'open': 10.46, 'close': 10.46, 'high': 10.46, 'low': 10.45, 'volume': 766}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=LPLA, value={'open': 138.18, 'close': 138.11, 'high': 138.28, 'low': 138.11, 'volume': 3695}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=DT, value={'open': 62.085, 'close': 62.25, 'high': 62.3193, 'low': 62.085, 'volume': 169884}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=CERS, value={'open': 5.06, 'close': 5.055, 'high': 5.06, 'low': 5.055, 'volume': 3297}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=LLY, value={'open': 236.4, 'close': 236.685, 'high': 236.685, 'low': 236.36, 'volume': 3534}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=TM, value={'open': 178.49, 'close': 178.49, 'high': 178.49, 'low': 178.49, 'volume': 176}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=TSLA, value={'open': 684.64, 'close': 684.3, 'high': 684.85, 'low': 683.46, 'volume': 94224}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=PHB, value={'open': 19.595, 'close': 19.595, 'high': 19.595, 'low': 19.595, 'volume': 1600}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=VALE, value={'open': 22.15, 'close': 22.165, 'high': 22.1764, 'low': 22.15, 'volume': 108245}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=FXI, value={'open': 43.9164, 'close': 43.935, 'high': 43.935, 'low': 43.91, 'volume': 14870}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=VIPS, value={'open': 18.825, 'close': 18.825, 'high': 18.835, 'low': 18.805, 'volume': 9241}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=FB, value={'open': 351.405, 'close': 351.535, 'high': 351.55, 'low': 351.18, 'volume': 22571}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=CAG, value={'open': 35.9, 'close': 35.9, 'high': 35.91, 'low': 35.89, 'volume': 35684}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=IBB, value={'open': 163.15, 'close': 163.2, 'high': 163.2, 'low': 163.04, 'volume': 7242}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=MPLX, value={'open': 29.81, 'close': 29.82, 'high': 29.82, 'low': 29.81, 'volume': 1505}
2021/07/12 09:32:30 INFO MainThread on_msg() > Data Entry: [sip_bars_1m] ts=2021-07-12 14:32:00+00:00, key=CAT, value={'open': 218.3, 'close': 218.633, 'high': 218.64, 'low': 218.2367, 'volume': 10253}

…by the way, if anyone noticed my timestamps seemed “off by 1m” by Alpaca standards, that’s because I use “right-justified” time as a convention. See “Timestamp on SIP websocket minute bars -- beginning or end of minute?”, where I asked about that. So my timestamps represent the end of the 1m accumulation period rather than the beginning.
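For anyone comparing logs across the two conventions, here is a minimal sketch of the conversion, assuming 1m bars and timezone-aware timestamps. The helper names `to_right_label` and `to_left_label` are hypothetical and not part of any Alpaca library.

```python
from datetime import datetime, timedelta, timezone

BAR_LENGTH = timedelta(minutes=1)  # assumes 1m bars


def to_right_label(bar_start: datetime) -> datetime:
    """Convert a start-of-minute ("left") label to an end-of-minute ("right") label."""
    return bar_start + BAR_LENGTH


def to_left_label(bar_end: datetime) -> datetime:
    """Convert an end-of-minute ("right") label back to a start-of-minute label."""
    return bar_end - BAR_LENGTH


# A bar covering 14:31:00 <= t < 14:32:00 UTC is labeled 14:31 when
# left-justified and 14:32 when right-justified (as in the log above).
start = datetime(2021, 7, 12, 14, 31, tzinfo=timezone.utc)
end = to_right_label(start)
```

With this convention, a log line stamped `ts=2021-07-12 14:32:00+00:00` that arrives at :32:30 is 30s past the end of its accumulation period.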

I recommend you raise a ticket with them specifically about these issues or else you probably won’t hear back. Recent response times on forum posts have been extremely long.

@polymathcoder If there are ‘valid’ trades during a minute, then a bar will be generated which includes trades having a “participant timestamp” (i.e., when the trade occurred) within that minute. So for example the minute bar labeled 9:45 would include trades with 9:45 <= timestamp < 9:46.
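The half-open interval above amounts to flooring the trade's timestamp to the minute. A minimal sketch, assuming timezone-aware timestamps; `bar_label` is a hypothetical helper used only for illustration, not Alpaca's actual implementation:

```python
from datetime import datetime, timezone


def bar_label(trade_ts: datetime) -> datetime:
    """Return the start-of-minute label of the bar a trade falls into.

    A trade at time t belongs to the bar labeled m when m <= t < m + 1 minute.
    """
    return trade_ts.replace(second=0, microsecond=0)


# A trade at 9:45:59.999 still belongs to the 9:45 bar...
late_in_minute = datetime(2021, 7, 12, 9, 45, 59, 999000, tzinfo=timezone.utc)
# ...while a trade at exactly 9:46:00 opens the 9:46 bar.
on_boundary = datetime(2021, 7, 12, 9, 46, 0, tzinfo=timezone.utc)

label_a = bar_label(late_in_minute)
label_b = bar_label(on_boundary)
```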

There is a delay between the time a trade is executed and when Alpaca receives the data. Typically this is about 20ms but can be longer. The SEC actually only requires trades be reported within 10 seconds (15 minutes in after-hours trading). Therefore, there can be situations where valid trades are received by Alpaca too late to be included in the minute bar calculations (though they should have been). To account for this, we wait 30 seconds. If any valid trades arrive that should have been included in the previous bar, an ‘updated’ bar is sent out including these trades. That is why there may be bars at the 30 second point. They are updates. They should overwrite the bar received at the minute (if any).

This isn’t well documented but should be. Good catch.
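On the client side, the "overwrite" rule can be handled by keying stored bars on (symbol, bar timestamp), so that an update arriving at the 30 second point replaces the earlier bar instead of creating a duplicate. This is only a sketch under that assumption: the `BarStore` class, the symbol `XYZ`, and the bar values are hypothetical illustrations, not Alpaca's API or real market data.

```python
from datetime import datetime, timezone


class BarStore:
    """Keep the latest version of each bar, keyed by (symbol, bar timestamp)."""

    def __init__(self):
        self._bars = {}
        self.revisions = 0  # how many stored bars were overwritten by an update

    def on_bar(self, symbol, ts, bar):
        """Store a bar; return True if it replaced an earlier bar for the same minute."""
        key = (symbol, ts)
        is_update = key in self._bars
        if is_update:
            self.revisions += 1
        self._bars[key] = bar  # the later message always wins
        return is_update

    def get(self, symbol, ts):
        return self._bars[(symbol, ts)]


store = BarStore()
minute = datetime(2021, 7, 12, 14, 31, tzinfo=timezone.utc)

# Hypothetical bar received on the minute...
first = store.on_bar("XYZ", minute, {"open": 10.0, "close": 10.1, "volume": 500})
# ...and a hypothetical updated bar for the same minute received ~30s later.
second = store.on_bar("XYZ", minute, {"open": 10.0, "close": 10.2, "volume": 650})
```

A bar that appears only at the 30 second point (with no earlier version) is simply stored as new, which matches the "(if any)" case above.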

Thanks for the response. That makes sense. Would it be possible to reduce that delay (e.g. to 10-15s)? It would make things feel more real-time if revised data arrived in the earlier part of the minute. If the data is coming in as much as 20s late, then that doesn’t seem feasible, but if the data is coming in within 10s, I would think it would be feasible to output the updates pretty much immediately.