Open/Close Daily Bar Prices vs. Open/Close Auction Prices on Primary Exchange

I’m curious what data source Alpaca uses for determining the open and close prices for daily bars. Is this the SIP data feed?

I have noticed that the open/close daily bar values across all equities in Alpaca’s data are nearly identical to the values seen on TradingView, so it doesn’t look like there is an Alpaca-specific data feed difference.

However, something appears to be more systematically incorrect: these open/close values do not always match the open/close print on the primary exchange on which the stock is listed. Perhaps I am making a really broad assumption, but I have always thought that even if the opening print doesn’t happen right at the open, the data feed eventually corrects to this price at some point during the day. Is this not the case?

For instance, on 5/16/24, the open daily bar price for XOM is 118.54 (using StockBarsRequest)

'XOM': [{   'close': 117.87,
      'high': 119.3,
      'low': 117.54,
      'open': 118.54,
      'symbol': 'XOM',
      'timestamp': datetime.datetime(2024, 5, 16, 4, 0, tzinfo=TzInfo(UTC)),
      'trade_count': 117796.0,
      'volume': 15686785.0,
      'vwap': 118.233548}]

However, the actual Open Auction print on NYSE is 118.89 (using StockTradesRequest)

{'symbol': 'XOM',
  'timestamp': Timestamp('2024-05-16 13:33:00.236880+0000', tz='UTC'),
  'exchange': 'N',
  'price': 118.89,
  'size': 181275.0,
  'id': 52983525034828,
  'conditions': [' ', 'O'],
  'tape': 'A'}

Note that the Opening Print from the Market Maker on the Primary Exchange does not take place until 3 minutes after the market has opened.

I’m retrieving the primary exchange open/close information using StockTradesRequest and then filtering on trades with conditions ‘O’ (open) and ‘6’ (close).
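
A minimal sketch of that kind of filter, assuming the trades have already been fetched and converted to plain dicts shaped like the one above. `auction_prints` is a hypothetical helper for illustration, not part of alpaca-py.

```python
OPEN_CONDITION = "O"   # opening print
CLOSE_CONDITION = "6"  # closing print


def auction_prints(trades, exchange):
    """Return (open_print, close_print) for one exchange.

    `trades` is an iterable of dicts with 'exchange', 'price' and
    'conditions' keys, in chronological order. Either element of the
    result is None if no matching trade is found.
    """
    open_print = None
    close_print = None
    for trade in trades:
        if trade["exchange"] != exchange:
            continue
        if OPEN_CONDITION in trade["conditions"] and open_print is None:
            open_print = trade  # keep the earliest opening print
        if CLOSE_CONDITION in trade["conditions"]:
            close_print = trade  # keep the latest closing print
    return open_print, close_print
```

For XOM the primary exchange code would be `'N'`, so the call would be `auction_prints(trades, 'N')`.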

If anybody has any information on this phenomenon, I would love to hear more.

@dpats Excellent questions on open prices. I’ll try to answer.

First, Alpaca uses SIP data for all historical bar calculations. This is the same data most, if not all, data providers use to calculate daily bars. There is a small caveat. As long as the specified end datetime is not within 15 minutes of the current time, both the free and the paid Alpaca data plans will return bars based upon SIP data. However, due to licensing restrictions, if the end datetime includes the most current 15 minutes, the free data plan will return bars based only upon trades executed on the IEX exchange. To ensure one is getting bars based upon full market SIP data, it’s best to specify feed=sip in the request.
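
The 15 minute rule can be written down as a small check. This is only a sketch of the behaviour described above when no feed is specified; `default_feed` is a hypothetical name, and the exact handling right at the 15 minute boundary is an assumption.

```python
from datetime import datetime, timedelta, timezone

SIP_DELAY = timedelta(minutes=15)


def default_feed(end, now, paid_plan):
    """Guess which feed the bars are built from when none is requested.

    Paid plans get SIP data. Free plans get SIP data only when the
    requested end time is at least 15 minutes old, otherwise IEX.
    """
    if paid_plan or end <= now - SIP_DELAY:
        return "sip"
    return "iex"


now = datetime.now(timezone.utc)
print(default_feed(now - timedelta(hours=1), now, paid_plan=False))  # sip
print(default_feed(now, now, paid_plan=False))                       # iex
```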

Now, a bit about how Open, High, Low, and Close prices are determined. There are two Securities Information Providers (SIPs) and each publishes guidelines on which types of trades to include in the High/Low and the Close/Last calculations. That guidance is on page 64 of the Consolidated Tape System (CTS) Specification and page 43 of the UTP Specification. The glaring omission, however, is that neither has any guidance on how to calculate Open prices. Each data provider (such as Alpaca) is left to decide that on their own. Alpaca calculates the Open price by filtering for the same trade conditions used to calculate the Close/Last prices and then simply taking the first trade of the day. While this may often be an exchange’s opening print, it isn’t guaranteed. In case you are interested, there is an article here which explains in detail how bars are calculated.

Other data providers may choose to use an opening auction price or an exchange’s opening print as the daily open. This creates several issues. First, not all symbols participate in opening auctions. Additionally, there can be multiple opening auctions on multiple exchanges, so choosing which exchange’s price to use can be a bit arbitrary. That is why Alpaca chose to use a very deterministic rule of simply using the first ‘valid’ trade of the day.
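
To make the "first valid trade" rule concrete, here is a rough sketch. It is not Alpaca's actual implementation: `daily_open` is a hypothetical helper, and the excluded set below contains only the odd lot condition `I` as a stand-in. The full list of conditions that update Close/Last is in the SIP specifications.

```python
# Assumption: only odd lots ('I') are excluded here. The real list of
# excluded trade conditions comes from the CTS and UTP specifications.
EXCLUDED_CONDITIONS = {"I"}


def daily_open(trades):
    """Return the price of the first trade with no excluded condition.

    `trades` is an iterable of dicts with 'timestamp', 'price' and
    'conditions' keys. Timestamps can be any sortable value. Returns
    None if no trade qualifies.
    """
    for trade in sorted(trades, key=lambda t: t["timestamp"]):
        if EXCLUDED_CONDITIONS.isdisjoint(trade["conditions"]):
            return trade["price"]
    return None
```

Because the rule looks only at timestamps and conditions, it gives the same answer whether or not the symbol took part in any opening auction.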

If one wants to know the opening auction prices (e.g. to determine fill prices for MOO orders), there is an endpoint for that called auctions.

Let’s look at the specific XOM example on 2024-05-16 from above. This is the daily bar for that day.

    "XOM": [
      {
        "c": 117.87,
        "h": 119.3,
        "l": 117.54,
        "n": 117965,
        "o": 118.54,
        "t": "2024-05-16T04:00:00Z",
        "v": 15745202,
        "vw": 118.232237
      }

The OHLC match the example above. Notice the trade count and volume are both a bit higher than those posted above. The reason? Those two values include after-hours trades. The request above was apparently made before 20:00 ET, when after-hours trading was still going on, so it didn’t include the entire day.

However, let’s look at the opening trades.

Notice the first 20 or so trades are all ‘odd lots’ (i.e. trades under 100 shares with a trade condition I). These are excluded from the bar calculation per SIP guidelines (using the guidelines for Close). Therefore the first ‘valid’ trade after that is 118.54. Alpaca simply uses that first trade as the Open. This was actually the opening auction trade on the NASDAQ Int exchange (notice trade conditions O, X and exchange T), but it could have just as easily been a regular market trade too.

Now let’s look at the opening auction trades (sometimes called ‘cross’ on the NASDAQ exchanges).

{
  "auctions": {
    "XOM": [
        "d": "2024-05-16",
        "o": [
          {
            "c": "O",
            "p": 118.54,
            "s": 124,
            "t": "2024-05-16T13:30:01.02339584Z",
            "x": "T"
          },
          {
            "c": "Q",
            "p": 118.54,
            "s": 124,
            "t": "2024-05-16T13:30:01.023398912Z",
            "x": "T"
          },
          {
            "c": "Q",
            "p": 118.95,
            "s": 2,
            "t": "2024-05-16T13:30:13.56120704Z",
            "x": "P"
          },
          {
            "c": "O",
            "p": 118.89,
            "s": 181275,
            "t": "2024-05-16T13:33:00.236880128Z",
            "x": "N"
          },
          {
            "c": "Q",
            "p": 118.89,
            "s": 181275,
            "t": "2024-05-16T13:33:00.2368832Z",
            "x": "N"
          }

The first thing to note is that XOM participated in the opening auctions, or provided opening prints, on three separate exchanges (NASDAQ Int “T”, NYSE Arca “P”, and NYSE “N”). Also note they all have different prices. This illustrates the fundamental problem with using the “official open” from the exchanges. Which exchange to choose?
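
If one does settle on a single exchange, for example the primary listing exchange, picking its print out of the auctions data is straightforward. A rough sketch, assuming each day's entry is a dict with the `o` list shown above; `opening_price` is a hypothetical helper, and treating both `O` and `Q` as opening conditions is an assumption.

```python
def opening_price(auction_day, exchange, conditions=("O", "Q")):
    """Return the first opening auction price printed by `exchange`.

    `auction_day` is one day's entry from the auctions data, with the
    opening prints under the 'o' key. Returns None if the exchange
    reported no opening print that day.
    """
    for entry in auction_day.get("o", []):
        if entry["x"] == exchange and entry["c"] in conditions:
            return entry["p"]
    return None


xom_day = {
    "d": "2024-05-16",
    "o": [
        {"c": "O", "p": 118.54, "s": 124, "x": "T"},
        {"c": "Q", "p": 118.95, "s": 2, "x": "P"},
        {"c": "O", "p": 118.89, "s": 181275, "x": "N"},
    ],
}
print(opening_price(xom_day, "N"))  # 118.89
```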

So, I hope this addresses your issue on opening prices. There is nothing ‘systematically incorrect’ with the open prices. Alpaca simply uses the first ‘valid’ trade. Other data providers, since there are no guidelines provided by the SIPs, may choose other approaches.

Thank you for taking the time to write this incredible, high-quality response, @Dan_Whitnable_Alpaca!

I definitely want to review the specification links you provided as I have never run across these before. Thank you as well for providing the auctions endpoint. I see this has not been integrated into the alpaca-py library yet. Are there any plans to do so? I’m primarily using that library and would rather not mix and match library calls with raw curl requests if I can avoid it.

So, it seems that open prices are a bit of a wild west scenario. I can handle that since it appears everyone is having to handle it :slightly_smiling_face:. However, if I want to use Alpaca in a production scenario, is there any mechanism by which Alpaca can guarantee an open auction order participates on the primary exchange on which the symbol is listed? I ask because I know that Alpaca gets paid for order flow and funnels opening auction orders to various providers. Do these providers disclose which exchange they send their open auction orders to, by chance?

Thanks again! I really appreciate your insightful response.