Sort positions by creation time

Hello everyone,
I was wondering if it is possible to sort my current positions by time of creation in ascending or descending order in Python.
Also, one idea: since I’m developing a pairs trading strategy, it would be awesome to have some sort of labeling in the alpaca-py library, to make these current pairs much easier to find in the long run.

Have a great day!
Vejas

@Vejas Good ideas/suggestions. The basic issue with the current position representation is that it’s static. It’s simply a snapshot of one’s current holdings. Because it’s a literal ‘picture’ there aren’t attributes for ‘create time’ or the ability to add labels. Not that this functionality couldn’t be added but the devil is in the details.
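
For example, the position objects the API returns only carry current-state fields, so there is nothing like a created_at to sort on. A quick illustration (using the REST object set up in the code below):

# each position is just a snapshot of the current holding
for p in api.list_positions():
  print(p.symbol, p.qty, p.side, p.avg_entry_price)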

I personally create a positions dataframe from the orders. It works like this:

A ‘position’ always starts with 0 qty. The first filled order determines if the position is long or short (a buy is long and a sell is short). There can be multiple buys and sells but the position is closed if/when the net qty is 0. A new fill starts a new position. Each time a new position is started it is assigned a sequential ‘position_number’. If there is a non-zero value in the position qty field it implies the position is open.
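
As a tiny made-up illustration of that numbering (hypothetical fills, not the actual code):

import pandas as pd

# hypothetical fills for one symbol: buy 10, sell 10 (closes position 1),
# then buy 5 and buy 5 (position 2, still open)
fills = pd.DataFrame({'side': ['buy', 'sell', 'buy', 'buy'],
                      'qty':  [10, -10, 5, 5]})

# running net position qty is 10, 0, 5, 10
# the row where it returns to 0 ends position 1, the next fill starts position 2,
# and position 2 is open because its net qty is not 0
fills['position_qty'] = fills.qty.cumsum()
print(fills)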

The profit and gain values are really only valid for closed positions. Calculating the cost basis for open positions is 1) dependent upon the method one chooses (eg FIFO or average cost) and 2) a bit complex. I simply didn't address those.
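
For anyone who wants to tackle that, a FIFO cost basis for whatever long quantity remains open could be sketched roughly like this (purely illustrative and not part of the code below; the function name is made up):

from collections import deque

def fifo_open_cost_basis(fills):
  # fills is a list of (qty, price) tuples: a positive qty is a buy, a negative qty is a sell
  # returns the FIFO cost basis of whatever long quantity is still open
  lots = deque()
  for qty, price in fills:
    if qty > 0:
      lots.append([qty, price])            # a buy adds a new lot
    else:
      remaining = -qty
      while remaining > 0 and lots:
        lot = lots[0]
        matched = min(lot[0], remaining)   # a sell consumes the oldest lots first
        lot[0] -= matched
        remaining -= matched
        if lot[0] == 0:
          lots.popleft()
  return sum(q * p for q, p in lots)

# buy 10 @ 100, buy 10 @ 110, sell 15 -> 5 shares remain from the 110 lot
print(fifo_open_cost_basis([(10, 100.0), (10, 110.0), (-15, 105.0)]))  # 550.0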

Below is the code if anyone is interested. There is a bit of initialization and then three functions. A positions dataframe can be generated something like this:

# First get orders.
# Ideally the start would be a time when all positions are zero.
# The normalize method ensures the time is midnight.

start_time = pd.to_datetime('2023-10-26').tz_localize('America/New_York').normalize()
orders = get_orders(start_time)
transactions = create_transactions(orders)
positions = create_positions(transactions)

A requirement is that all positions must be at 0 at the start_time. One can simply use a start_time at or before the first trade in the account (ie fetch all orders in the account) or pick a date when all positions were 0. This works for me because I generally close all positions at the end of each day. Just be mindful of this limitation.
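
One way to catch a bad start_time is to compare the computed open quantities against what the broker currently reports. A quick sanity-check sketch (not part of the original code):

# if the account really was flat at start_time, the open qty computed from
# orders should match the broker's current positions
computed = positions.query('qty != 0').groupby('symbol').qty.sum()
broker = {p.symbol: float(p.qty) for p in api.list_positions()}

for symbol, qty in computed.items():
  if broker.get(symbol, 0) != qty:
    print(f'{symbol}: computed {qty} but broker reports {broker.get(symbol, 0)}')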

Current (ie open) positions can be found by filtering the dataframe by qty != 0, since an open position has a non-zero qty. The dataframe can also be sorted by the opened_at time to order positions by when they were created.
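
So the original question boils down to something like this (a sketch using the positions dataframe built above):

# open positions have a non-zero qty; sort by creation time
# (ascending here, pass ascending=False for newest first)
open_positions = positions.query('qty != 0').sort_values('opened_at')
print(open_positions)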

# First install and import the Alpaca python API SDK 'wrapper'
!pip install -q alpaca-trade-api
import alpaca_trade_api as alpacaapi
# Import some other useful packages
import pandas as pd
import numpy as np
# Alpaca keys etc
ALPACA_API_ENDPOINT = 'https://paper-api.alpaca.markets'
ALPACA_API_KEY = 'xxxx'
ALPACA_API_SECRET_KEY = 'xxxx'

# Instantiate the REST object
api = alpacaapi.REST(ALPACA_API_KEY, ALPACA_API_SECRET_KEY, ALPACA_API_ENDPOINT)
def get_orders(start_time):
  # Initialize things needed for the while loop
  CHUNK_SIZE = 500
  all_orders = []
  check_for_more_orders = True

  while check_for_more_orders:
    # Fetch a 'chunk' of orders and append it to our list
    api_orders = api.list_orders(status='all',
                                  after=start_time.isoformat(),
                                  direction='asc',
                                  limit=CHUNK_SIZE,
                                  nested=False,
                                  )
    all_orders.extend(api_orders)

    if len(api_orders) == CHUNK_SIZE:
      # Since length equals the CHUNK_SIZE there may be more orders
      # Set the ending timestamp for the next chunk of orders
      # A hack to work around complex orders having the same submitted_at time
      # and avoid potentially missing one, is to get more than we need
      start_time = all_orders[-3].submitted_at

    else:
      # That was the final chunk so exit
      check_for_more_orders = False

  # Convert the list into a dataframe and drop any duplicate orders
  orders_df = pd.DataFrame([order._raw for order in all_orders])
  orders_df.drop_duplicates('id', inplace=True)

  return orders_df
def create_transactions(all_orders):
  # only keep orders which actually have fills (partially filled or canceled orders can still have a non-zero filled qty)
  orders = all_orders.query('status in ["filled", "partially_filled", "canceled"] and filled_qty != "0"').copy()
  
  # create an empty dataframe to hold transactions
  transactions = pd.DataFrame()

  # add fields we need from orders to transactions
  transactions['symbol'] = orders.symbol
  transactions['side'] = orders.side
  transactions['filled_at'] = pd.to_datetime(orders.filled_at).dt.tz_convert('America/New_York')
  transactions['filled_avg_price'] = orders.filled_avg_price.astype('float')

  # set filled qty sign based upon buy or sell
  transactions['qty'] = orders.filled_qty.astype('float')
  transactions['qty'] = transactions.qty.where(transactions.side=='buy', -transactions.qty)

  # calculate filled dollar amount (a sell increases cash and is positive)
  transactions['dollar_amt'] = -(transactions.qty * transactions.filled_avg_price)

  # create a running net position qty
  transactions['position_qty'] = transactions.sort_values('filled_at').groupby('symbol').expanding().qty.sum().droplevel(0)

  # create a position number for each time a position goes to zero
  # for example buy 10 then sell 10 creates a position with a position number 1
  # those two transactions are part of position 1
  # buying another 10 will create a new position with position number 2
  # any additional transactions will be part of position 2 until the net_qty goes to 0
  transactions['position_number'] = (transactions.loc[transactions.position_qty==0]
                                    .groupby('symbol')
                                    .expanding().side.count().droplevel(0)
  )
  transactions['position_number'] = transactions.groupby('symbol').position_number.bfill()

  # transactions after the last close belong to the currently open position and
  # have no number yet, so give them the next sequential number for that symbol
  # (otherwise open positions would be dropped when grouping by position_number)
  next_number = transactions.groupby('symbol').position_number.transform('max').fillna(0) + 1
  transactions['position_number'] = transactions.position_number.fillna(next_number)

  return transactions
def create_positions(transactions):
  # create an empty dataframe to hold our positions
  positions = pd.DataFrame()

  # add fields from transactions for open and close time and price
  positions['opened_at'] = transactions.groupby(['symbol', 'position_number']).filled_at.first()
  positions['closed_at'] = transactions.groupby(['symbol', 'position_number']).filled_at.last()
  positions['open_price'] = transactions.groupby(['symbol', 'position_number']).filled_avg_price.first()
  positions['close_price'] = transactions.groupby(['symbol', 'position_number']).filled_avg_price.last()
  positions['qty'] = transactions.groupby(['symbol', 'position_number']).qty.sum()

  # add the position 'side' replace labels buy->long and sell->short for easier reading
  positions['side'] = transactions.groupby(['symbol', 'position_number']).side.first()
  positions['side'] = positions.side.replace(['buy', 'sell'], ['long', 'short'])
  
  # calculate realized gain from cost basis and sales
  # log gain is used so values can be averaged etc
  positions['cost_basis'] = -transactions.query('side=="buy"').groupby(['symbol', 'position_number']).dollar_amt.sum().round(2)
  positions['sales'] = transactions.query('side=="sell"').groupby(['symbol', 'position_number']).dollar_amt.sum().round(2)
  positions['profit'] = positions.sales - positions.cost_basis
  positions['gain'] = np.log1p(positions.profit / positions.cost_basis)

  return positions

Yes, I see it now. Thank you very much for the response and your code! Definitely staying tuned for the upcoming updates on Alpaca Markets!