[v2 API] What is the timestamp in the stream trade schema, and how can late-reported trades be filtered?

Reference: Real-time data - Documentation | Alpaca

I am using the streaming API to get real-time trade data.
Which timestamp does the trade schema carry? The documentation only says:
“t RFC-3339 formatted timestamp with nanosecond precision.”

Assuming it is the SIP timestamp, how can we filter out trades that were reported too late?
For example, if a trade was reported 10 seconds late, I simply want to ignore it.

Here is an example of a trade that the SIP feed reported more than 4 seconds late (AAPL, 2021-02-26, data obtained from the Polygon APIs):

price: 122.6233
size: 400
trade_id: “1060”
exchange_id: 4
trf_id: 10
tape: 3
sip_timestamp {
seconds: 1614349806
nanos: 90977051
}
exchange_timestamp {
seconds: 1614349801
nanos: 788000000
}
trf_timestamp {
seconds: 1614349806
nanos: 90618064
}
sequence_no: 195469

You can see that “sip_timestamp” is about 4.3 seconds after “exchange_timestamp”, so the trade was reported several seconds after it actually took place at the exchange.
The majority of trades (~95%) are reported within 100-200 ms, but some are reported much later, and I want to ignore those.
Without the exchange timestamp in the stream trade schema, that is impossible.
Could the exchange timestamp be added as well?
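For reference, the reporting delay in the record above can be worked out directly from the two seconds/nanos pairs. This is only a small sketch of the arithmetic; the helper name `to_ns` is made up for illustration, and the numbers are copied from the example trade.

```python
NS_PER_SECOND = 1_000_000_000


def to_ns(seconds, nanos):
    """Combine a (seconds, nanos) pair into integer nanoseconds since the epoch."""
    return seconds * NS_PER_SECOND + nanos


# Values copied from the example trade above.
sip_ns = to_ns(1614349806, 90977051)
exchange_ns = to_ns(1614349801, 788000000)

# Integer arithmetic avoids the rounding a float would introduce at this scale.
delay_ns = sip_ns - exchange_ns
delay_seconds = delay_ns / NS_PER_SECOND

print(f"reporting delay: {delay_seconds:.3f} s")
```

Keeping the timestamps as integer nanoseconds matters here: a 64-bit float cannot represent a 2021 epoch time to nanosecond precision.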

The current timestamp is actually the ‘participant_timestamp’, not the ‘sip_timestamp’, so it is the time at which the trade took place. Alpaca is looking into possibly including both, but that’s the current state.

The best way to filter ‘late’ data is to look at the trade condition. Trades which are reported ‘late’ will have an L as one of the trade conditions. This implies it was reported more than 10 seconds after the actual trade during market hours or 15 minutes after the trade during extended hours trading.
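A minimal sketch of that condition-based filter might look like the following. It assumes each trade message has already been decoded into a dict and that the trade conditions arrive as a list of codes under the key `"c"`; that key, the sample messages, and the helper names are assumptions for illustration, so check them against the actual stream schema.

```python
# Assumption: "L" marks a late-reported trade, as described in the reply above.
LATE_CONDITIONS = frozenset({"L"})


def is_late_by_condition(trade, late_conditions=LATE_CONDITIONS):
    """Return True if any of the trade's condition codes marks it as late."""
    conditions = trade.get("c") or []  # a missing or empty list means "not late"
    return any(code in late_conditions for code in conditions)


def drop_late_trades(trades):
    """Keep only the trades that are not flagged as late."""
    return [t for t in trades if not is_late_by_condition(t)]


# Hypothetical decoded messages, for illustration only.
sample = [
    {"S": "AAPL", "p": 122.62, "s": 400, "c": ["@"]},
    {"S": "AAPL", "p": 122.6233, "s": 400, "c": ["@", "L"]},
    {"S": "AAPL", "p": 122.60, "s": 100},
]
kept = drop_late_trades(sample)
print(len(kept), "of", len(sample), "trades kept")
```

The set of codes is a parameter so that other conditions you want to ignore can be added without touching the filter itself.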

As an alternative, one could simply compare the current time to the trade timestamp. Consider using the clock API to get the current time synchronized with the markets.
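The time comparison could be sketched roughly as below. The trade timestamp is RFC-3339 with nanosecond precision, which Python's `datetime` cannot hold, so this sketch parses the fractional part by hand. The function names and the 10-second default are illustrative assumptions, and the current time is passed in by the caller (from the clock API, or `time.time_ns()` if the local clock is trusted).

```python
import calendar
import re
from datetime import datetime

_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def rfc3339_to_ns(ts):
    """Convert an RFC-3339 timestamp to integer nanoseconds since the epoch."""
    match = _RFC3339.match(ts)
    if match is None:
        raise ValueError(f"not an RFC-3339 timestamp: {ts!r}")
    base, frac, offset = match.groups()
    if offset == "Z":
        offset = "+00:00"
    # Parse whole seconds with the UTC offset; the fraction is handled separately.
    dt = datetime.strptime(base + offset, "%Y-%m-%dT%H:%M:%S%z")
    seconds = calendar.timegm(dt.utctimetuple())
    # Pad or truncate the fraction to exactly nine digits (nanoseconds).
    nanos = int((frac or "").ljust(9, "0")[:9])
    return seconds * 1_000_000_000 + nanos


def is_stale(trade_ts, now_ns, max_delay_ns=10_000_000_000):
    """Return True if the trade is older than max_delay_ns relative to now_ns."""
    return now_ns - rfc3339_to_ns(trade_ts) > max_delay_ns


trade_time = "2021-02-26T14:30:01.788Z"
received_ns = rfc3339_to_ns(trade_time) + 11_000_000_000  # pretend it arrived 11 s later
print(is_stale(trade_time, received_ns))
```

Note that this measures the gap between the trade time and the moment the message is processed, so it also includes network and processing latency on the receiving side.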

Hope that helps.

After running the websocket for a day, I could verify that it is indeed the “participant_timestamp”. That is sufficient for me.

Thanks for the response, and thanks for so actively responding to everyone's questions on Slack and here. I have learned a lot just by reading your responses to various questions. :smile:

A quick update. Alpaca is looking into changing the way v2 data is presented and the way aggregate bars are calculated. First, both the participant_timestamp and the sip_timestamp will be surfaced. Additionally, aggregated bars will be available immediately after the minute, with an additional check made a bit later (we are thinking 30 seconds). If there were any late but still ‘valid’ trades which should have appeared in the original bar, we will send out an ‘updated bar’. A field in the bar will indicate whether it is an update or not. This lets us provide timely ‘real time’ data while balancing that with more correct aggregates.
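On the consuming side, one way to handle such updated bars is to key each bar by symbol and bar start time and let a later bar with the same key overwrite the earlier one. The sketch below is purely illustrative: the field names `"S"`, `"t"` and `"v"` and the `"is_update"` flag are hypothetical, since the final schema had not been announced at this point.

```python
def apply_bar(bars, bar):
    """Store a bar, replacing any earlier bar for the same symbol and minute.

    Returns True if an existing bar was replaced (i.e. this was a correction).
    """
    key = (bar["S"], bar["t"])  # hypothetical fields: symbol and bar start time
    replaced = key in bars
    bars[key] = bar
    return replaced


bars = {}

# Hypothetical messages: the original bar, then a corrected bar about 30 s later.
original = {"S": "AAPL", "t": "2021-02-26T14:30:00Z", "v": 1000, "is_update": False}
updated = {"S": "AAPL", "t": "2021-02-26T14:30:00Z", "v": 1400, "is_update": True}

first_replaced = apply_bar(bars, original)
second_replaced = apply_bar(bars, updated)

print(bars[("AAPL", "2021-02-26T14:30:00Z")]["v"])
```

Overwriting by key means the consumer does not strictly depend on the update flag; the flag is still useful for deciding whether anything derived from the original bar needs to be recomputed.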

Thanks for the update. Providing both sip_timestamp and participant_timestamp will be super cool.

Has this change been implemented yet? Is the SIP timestamp now available in the real-time stream?