Selective WebSocket Feed Failures During Market Open - Need Feedback

I’m experiencing selective WebSocket feed failures for individual symbols during market open hours that are causing missed trading opportunities. Looking for community feedback on whether others have experienced this and potential solutions.

:bar_chart: My Setup

  • Data Plan: Algo Trader Plus ($99/month)

  • Connection: Single WebSocket connection (as per plan limits)

  • Symbols: ~35 symbols subscribed via batched subscriptions

  • Implementation: Python with alpaca-trade-api, real-time algorithmic trading

:detective: Specific Incident Details

Date: August 7, 2025 Symbol: TFPM Breakout: $25.45

Timeline:

  • 09:29:47 - TFPM: $25.22 (Extended hours, below breakout, Tier: active)

  • 09:30:34 - TFPM: $25.45 (Regular hours, exactly at breakout, Tier: active) :white_check_mark:

  • 09:30:34 - 09:34:05 - NO TFPM PRICE UPDATES (~3.5-minute gap) :cross_mark:

  • 09:34:05 - Trade triggered at $25.50 limit price

  • 09:34:05 - TFPM: $26.28 (price already $0.78 above limit!) :cross_mark:

Result: Order never filled, missed profitable trade

:magnifying_glass_tilted_left: Key Observations

What Was Working:

  • Other symbols updated normally during the gap period

  • WebSocket connection remained active (no disconnection messages)

  • No spread rejections for TFPM

  • System correctly classified TFPM as “active tier”

What Failed:

  • TFPM specifically stopped receiving price updates

  • No error messages or warnings in logs

  • Only TFPM affected, indicating selective feed failure

  • Resumed normal updates after the gap

:laptop: Code Implementation

Using standard WebSocket subscription in batches:

python

# One-time setup: subscribe to trades and quotes for each batch of symbols
for batch in symbol_batches:
    await ws.send_json({
        "action": "subscribe",
        "trades": batch,
        "quotes": batch
    })

:red_question_mark: Questions

  1. Has anyone experienced selective symbol feed failures during market open (9:30-9:35 AM)?

  2. Is this a known issue with high volatility periods or specific symbols?

  3. Are there WebSocket best practices I should implement for market open reliability?

  4. Should I implement more aggressive REST polling as a failsafe during market open?

  5. Is there a way to detect when individual symbols stop updating within a WebSocket connection?

:shield: Current Mitigation Strategies

I’m considering implementing:

  • Market open enhanced monitoring (9:30-9:35 AM)

  • Selective symbol health checks

  • Aggressive REST failover for stale symbols (see the sketch after this list)

  • Pre-market connection validation
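
A rough sketch of what the health checks and REST failover might look like (illustrative only; fetch_latest_quote_rest is a placeholder for whatever REST call ends up being the fallback):

python

import asyncio
import time

# Wall-clock time of the most recent WebSocket update seen per symbol;
# the message handler is expected to refresh this dict on every message.
last_update = {}

async def symbol_watchdog(symbols, fetch_latest_quote_rest, stale_after=5.0):
    """Flag symbols whose feed has gone quiet and poll REST for them instead."""
    while True:
        now = time.monotonic()
        for sym in symbols:
            seen = last_update.get(sym)
            if seen is not None and now - seen > stale_after:
                # Feed looks stale for this symbol only: log it and fall back to REST.
                print(f"{sym}: no WebSocket updates for {now - seen:.1f}s, polling REST")
                await fetch_latest_quote_rest(sym)
        await asyncio.sleep(1.0)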

:light_bulb: Community Input Needed

For Algo Trader Plus subscribers:

  • Have you seen similar selective feed issues?

  • What market open reliability strategies work for you?

  • Any recommended failsafe implementations?

For Alpaca Team:

  • Is this a known issue during high volatility periods?

  • Are there specific symbols that commonly have feed issues?

  • Any planned improvements to WebSocket reliability?

:chart_increasing: Impact

This type of selective feed failure can be costly for algorithmic trading strategies, especially during market open when opportunities are time-sensitive. With a $99/month data plan, I expect better reliability for individual symbol feeds.

Any insights, similar experiences, or solutions would be greatly appreciated!


Technical Details Available Upon Request:

  • Full log files from the incident

  • WebSocket connection details

  • Symbol subscription patterns

  • Failover implementation details

@DirtyDel There weren’t any system issues at the time which would cause specific symbols to not stream.

The first thing is to verify you are connecting to the sip feed. If you are using the alpaca-trade-api SDK, the DataStream object should be instantiated like this:

python

my_data_stream = DataStream(key_id='xxxx', secret_key='xxxx', feed='sip')

If you are connecting to the sip feed, then I can check the logs to help troubleshoot. A few questions: do you know the IP address of your client (that is ideal, but I understand if you don’t)? What is the account number associated with the API key you used to authenticate? What exact symbols did you subscribe to, and were they trades, quotes, or both? Also verify this was an issue on 2025-08-07 between 9:30-9:34 EDT (i.e. market time). I can use that information to search the logs for your connection entries.

There were a number of connections which didn’t keep up with the data stream yesterday. This is very common. There can be a huge amount of data streamed immediately at market open and many algos simply cannot or do not process it fast enough.

A check which every algo should do is to compare the trade or quote timestamp with the local system time. If the local system is synchronized to NIST time, quotes should typically never have a delta of more than 25ms. Trades should also be in this range but could be as much as 10 seconds behind (execution venues have 10 seconds to report a trade). If your algo sees time deltas of more than 50ms for quotes, and certainly anything over 10 seconds for trades, then the algo isn’t reading the data fast enough and is more than likely missing data. Log whenever the delta exceeds these values.

Also verify this was an issue on 2025-08-07 between 9:30-9:34 EDT (ie market time)
Yes, I’m verifying.
Here are the symbols that were being quoted. Active symbols are updated once per second; the others are updated every 10s - 1m based on their percentage distance from the breakout price.

Active Symbols:

  • ENVA, ATAT, CALM, LPLA, NPB, NVMI, CBOE, IBN, RL, URBN, ADSK, BSY, FTNT, HLI, SNEX, CASH, GLW, FOXA, TOST, IBEX, RAMP, ENSG

Additional Symbols from Extended List:

  • YMM, COCO, AEM, RACE, TBBK, QFIN, BE, JBTM, CAT, PNR, ALV, AYI, WPM, FIVE, ALL, AMP, ROAD, UI, BMI, EWBC, MLI, B, GFI, LBRDK

From your response, here are some features I’m implementing:

Immediate Diagnostic Steps

Check WebSocket Processing Speed:

  1. Add Timestamp Comparison

python

# In your WebSocket message handler:
from datetime import datetime, timezone

if item['T'] == 'q':  # Quote data
    symbol = item['S']
    alpaca_timestamp = item['t']  # Alpaca timestamp (as a timezone-aware datetime)
    local_timestamp = datetime.now(timezone.utc)  # compare in UTC, not naive local time
    delta_ms = (local_timestamp - alpaca_timestamp).total_seconds() * 1000

    if delta_ms > 50:  # More than 50ms lag
        log_warning(f"WebSocket lag detected: {symbol} {delta_ms:.1f}ms behind")
  2. Monitor Processing Time

python

# Time how long each update_price() call takes
start_time = time.time()
await update_price(symbol)
processing_time = (time.time() - start_time) * 1000
if processing_time > 25:  # Should be under 25ms
    log_warning(f"Slow processing: {symbol} took {processing_time:.1f}ms")

I can share my client ip and account # with you privately.

@DirtyDel The symbols you listed do not include TFPM. Could it be as simple as you weren’t subscribed to TFPM?

What do you mean by “Here are the symbols that were being quoted”? Does that mean you subscribe to quotes for those symbols, or are you subscribing to both trades and quotes?

Also, what do you mean by “Active symbols are updated once per second, the others are updated 10s - 1m based on percentage away from breakout price”? Are you changing the symbols you are subscribing to every 1-10 seconds? Are you unsubscribing and then re-subscribing?

Also, could you direct message me the API key (not the secret) and/or the account number associated with the key you are using to authenticate? There are thousands of log entries and I’m trying to pinpoint yours.

@Dan_Whitnable_Alpaca TFPM was 100% included; that list was all the other symbols. Also, thanks for clarifying the terminology. Let me explain my setup:

WebSocket Subscriptions:

  • I subscribe to BOTH “trades” and “quotes” for all symbols at startup

  • Subscription happens once via batched messages (10 symbols per batch)

  • I never unsubscribe or resubscribe during trading hours

  • All ~35 symbols remain subscribed for the entire session

Processing Logic (NOT subscription changes):

  • I receive quote data for ALL symbols via WebSocket

  • I process the received data at different frequencies:

    • Active symbols (near breakout): Process immediately when received

    • Watch symbols (2-5% from breakout): Process every 5 minutes

    • Dormant symbols (>5% from breakout): Process every hour

  • This is CLIENT-SIDE processing logic, not server-side subscription changes

The TFPM Issue:

  • TFPM was subscribed to both trades and quotes (like all symbols)

  • We should have been receiving quote data continuously

  • Between 09:30:34 - 09:34:05, no quote messages arrived for TFPM

  • Other symbols continued receiving quote data normally

  • After 09:34:05, TFPM quote data resumed

Code Implementation:

python

# Startup: Subscribe once to all symbols
for batch in symbol_batches:
    await ws.send_json({
        "action": "subscribe", 
        "trades": batch,     # Subscribe to trades
        "quotes": batch      # Subscribe to quotes
    })

# Runtime: Process received data based on symbol tier
if item['T'] == 'q':  # Quote message received
    symbol = item['S']
    # Process based on symbol's current tier (but always subscribed)
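
To make the tier logic concrete, here is a simplified sketch of that client-side throttling (illustrative only, not my exact code; the intervals mirror the tiers described above):

python

import time

# Seconds to wait between acting on a symbol's quotes, by tier
# (active = act immediately, watch = every 5 minutes, dormant = hourly).
TIER_INTERVALS = {"active": 0, "watch": 300, "dormant": 3600}
last_processed = {}

def should_process(symbol, tier):
    """Return True if enough time has passed to act on this symbol's quote."""
    now = time.monotonic()
    if now - last_processed.get(symbol, 0.0) >= TIER_INTERVALS[tier]:
        last_processed[symbol] = now
        return True
    return False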

Please check private message for account info.

@DirtyDel Thank you for the ip and account info. That makes finding your connection in the logs much easier.

So, I confirmed 1) you are connecting to the sip feed, 2) you subscribe to quotes and trades for 29 symbols, in ‘batches’ of 10 as you mentioned, and 3) as of 9:30 ET 2025-08-07 you were subscribed to TFPM as you noted. Your connection was actually established at 16:01:54 ET the previous afternoon, 2025-08-06, and remained solid overnight.

The issue is that at 9:30:36 the Alpaca websocket server stopped getting the required ‘pong’ handshake from the app, thought the app had disconnected, and closed the connection. Specifically, the log error is ConnectionClosedError. The WebSocket connection suffered from excessive latency and was closed after reaching the timeout of websockets’ keepalive mechanism. The websockets docs state:

There are two main reasons why latency may increase:

  • Poor network connectivity.

  • More traffic than the recipient can handle.

A poor network connection can be an issue, but in your case the issue is probably the latter. Your app simply isn’t keeping up with the messages, gets bogged down in processing them, and doesn’t write back the required ‘pong’ message to the Alpaca server in a timely manner.
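
One general pattern for keeping up with the market-open burst (just a sketch of the idea, not anything specific to the Alpaca SDKs; process stands in for whatever your strategy does with each message) is to do no real work in the message callback: enqueue each message and let a separate task do the heavy processing, so the event loop stays free to answer the keepalive pings promptly.

python

import asyncio

queue = asyncio.Queue()

async def on_message(msg):
    # Keep the callback trivially cheap: just hand the message to the worker.
    queue.put_nowait(msg)

async def worker(process):
    # Run the heavy per-message work in a thread so the event loop stays
    # responsive and the keepalive pong is never delayed.
    while True:
        msg = await queue.get()
        await asyncio.to_thread(process, msg)
        queue.task_done()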

This is transparent to you and your app for two reasons. First, the alpaca-py DataStream class implements automatic re-connect. Second, websockets maintains a local low-level buffer of the streamed data, which your app may still be reading from while the server is disconnected, giving the impression there is still a connection.

So at 9:30:36 the Alpaca websocket server stopped getting the required ‘pong’ handshake from the app and disconnected. At 9:31:29 the connection was re-established automatically by alpaca-py. It took a few seconds to reinstate the subscriptions and everything was up and running again at 9:31:31. Unfortunately, the app re-connects ‘silently’ so you can miss data without knowing it.

This is actually quite typical. There is a big jump in data when markets open at 9:30 and many apps get bogged down. Below is a graph of the total systemwide ConnectionClosedErrors per minute. Notice the spike starting at 9:30:24. These are the number of apps which didn’t respond with the required ‘pong’ in a timely manner probably because they couldn’t handle the large quantity of messages at market open.

During the time the server was disconnected, your app wouldn’t have received any symbol data. I’m not sure why you feel you got data for the other symbols but just not TFPM.

Below are the specific log entries where the connection is lost, disconnected, and re-established with 3 batches of subscriptions.

{"level":"warn","timestamp":"2025-08-07T13:30:36.131Z","caller":"stream/conn_handler.go:102","msg":"maintain connection error","error":"read error: timeout","clientIP":"38.30.175.126","connID":10477910}

{"level":"info","timestamp":"2025-08-07T13:30:36.131Z","caller":"stream/client.go:507","msg":"client disconnected","clientIP":"38.30.175.126","connID":10477910}

{"level":"info","timestamp":"2025-08-07T13:31:30.022Z","caller":"stream/client.go:769","msg":"client subscription changed","clientIP":"38.30.175.126","connID":42867381,"path":"/v2/sip","query":"","auth_type":"account","auth_id":"74d8c2f8-5a50-4c63-ba1a-f67a94c53acd","kind":"subscribe","corrections":{"symbols":"ENVA,ATAT,CALM,NVMI,IBN,RL,ADSK,BSY,FTNT,HLI","count":10},"cancel_errors":{"symbols":"ENVA,ATAT,CALM,NVMI,IBN,RL,ADSK,BSY,FTNT,HLI","count":10},"trades":{"symbols":"ENVA,ATAT,CALM,NVMI,IBN,RL,ADSK,BSY,FTNT,HLI","count":10},"quotes":{"symbols":"ENVA,ATAT,CALM,NVMI,IBN,RL,ADSK,BSY,FTNT,HLI","count":10}}

{"level":"info","timestamp":"2025-08-07T13:31:30.980Z","caller":"stream/client.go:769","msg":"client subscription changed","clientIP":"38.30.175.126","connID":42867381,"path":"/v2/sip","query":"","auth_type":"account","auth_id":"74d8c2f8-5a50-4c63-ba1a-f67a94c53acd","kind":"subscribe","trades":{"symbols":"CASH,FOXA,TOST,IBEX,RAMP,ENSG,TFPM,AMP,GRMN,NWG","count":20},"quotes":{"symbols":"CASH,FOXA,TOST,IBEX,RAMP,ENSG,TFPM,AMP,GRMN,NWG","count":20},"corrections":{"symbols":"CASH,FOXA,TOST,IBEX,RAMP,ENSG,TFPM,AMP,GRMN,NWG","count":20},"cancel_errors":{"symbols":"CASH,FOXA,TOST,IBEX,RAMP,ENSG,TFPM,AMP,GRMN,NWG","count":20}}

{"level":"info","timestamp":"2025-08-07T13:31:31.984Z","caller":"stream/client.go:769","msg":"client subscription changed","clientIP":"38.30.175.126","connID":42867381,"path":"/v2/sip","query":"","auth_type":"account","auth_id":"74d8c2f8-5a50-4c63-ba1a-f67a94c53acd","kind":"subscribe","cancel_errors":{"symbols":"RSI,MTSI,AII,GOOG,PJT,PINS,ACAD,ODD,AS","count":29},"trades":{"symbols":"RSI,MTSI,AII,GOOG,PJT,PINS,ACAD,ODD,AS","count":29},"quotes":{"symbols":"RSI,MTSI,AII,GOOG,PJT,PINS,ACAD,ODD,AS","count":29},"corrections":{"symbols":"RSI,MTSI,AII,GOOG,PJT,PINS,ACAD,ODD,AS","count":29}}

Your app is actually consistently dropping the connection in a similar manner (which you probably weren’t aware of). Below is a graph showing the number of times your app disconnected per hour on 2025-08-06 and 2025-08-07.

Notice your app generally didn’t disconnect when markets were closed. This is why I do not think the issue is network related but rather simply the volume of trades and quotes.

To test, try subscribing to a single symbol. As an alternative, subscribe to the symbols you have but omit any logic in your callback routine. Only include pass and return. Your algo should be able to process that very quickly. I can monitor the logs and see if your app disconnects. As a solution, consider not streaming but rather simply using the REST APIs to fetch data. It’s the same data and available with about the same 25ms latency.
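
For the second test, the callback can be stripped down to something like this (a minimal sketch; it does nothing except return immediately):

python

async def noop_handler(msg):
    # Intentionally empty: if the connection still drops with this handler,
    # the problem isn't the app's processing speed.
    return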

@Dan_Whitnable_Alpaca I couldn’t have asked for a better explanation. Thank you for the pictures. I will decipher your analysis and continue development. I will get back to you shortly with potential enhancements.
