Hello everyone,
I was wondering if it is possible to sort my current positions by time of creation in ascending or descending order in Python.
Also one idea, since I’m developing a pairs trading strategy, it would be awesome to have some sort of labeling in the alpaca-py librairy, to find these current pairs much easier in the long-run.
@Vejas Good ideas/suggestions. The basic issue with the current position representation is that it’s static. It’s simply a snapshot of one’s current holdings. Because it’s a literal ‘picture’ there aren’t attributes for ‘create time’ or the ability to add labels. Not that this functionality couldn’t be added but the devil is in the details.
I personally create a positions dataframe from the orders which looks like this
A ‘position’ always starts with 0 qty. The first filled order determines if the position is long or short (a buy is long and a sell is short). There can be multiple buys and sells but the position is closed if/when the net qty is 0. A new fill starts a new position. Each time a new position is started it is assigned a sequential ‘position_number’. If there is a non-zero value in the position qty field it implies the position is open.
The profit and gain values are really only valid for closed positions. Calculating the cost basis for open positions is 1) dependent upon the method one chooses (eg FIFO or average) and 2) is a bit complex. I simply didn’t address those.
Below is the code if anyone is interested. There is a bit of initialization and then 3 functions. A positions dataframe can be generated something like this
# First get orders.
# Ideally the start would be a time when all positions are zero.
# The normalize method ensures the time is midnight.
start_time = pd.to_datetime('2023-10-26').tz_localize('America/New_York').normalize()
orders = get_orders(start_time)
transactions = create_transactions(orders)
positions = create_positions(transactions)
A requirement is that a position must be at 0 at the start_time. One can simply use a start_time at or before the first trade in the account (ie fetch all orders in an account) or pick a date when all positions are 0. This works for me because I generally close all positions at the end of each day. Just be mindful of this limitation.
Current positions can be found by filtering the dataframe by qty==0. The dataframe can also be sorted by the opened_at time.
def get_orders(start_time):
# Initialize things needed for the while loop
CHUNK_SIZE = 500
all_orders = []
check_for_more_orders = True
while check_for_more_orders:
# Fetch a 'chunk' of orders and append it to our list
api_orders = api.list_orders(status='all',
after=start_time.isoformat(),
direction='asc',
limit=CHUNK_SIZE,
nested=False,
)
all_orders.extend(api_orders)
if len(api_orders) == CHUNK_SIZE:
# Since length equals the CHUNK_SIZE there may be more orders
# Set the ending timestamp for the next chunk of orders
# A hack to work around complex orders having the same submitted_at time
# and avoid potentially missing one, is to get more than we need
start_time = all_orders[-3].submitted_at
else:
# That was the final chunk so exit
check_for_more_orders = False
# Convert the list into a dataframe and drop any duplicate orders
orders_df = pd.DataFrame([order._raw for order in all_orders])
orders_df.drop_duplicates('id', inplace=True)
return orders_df
def create_transactions(all_orders):
# just look at the filled orders
orders = all_orders.query('status in ["filled", "partially_filled", "canceled"] and filled_qty != "0"').copy()
# create an empty dataframe to hold transactions
transactions = pd.DataFrame()
# add fields we need from orders to transactions
transactions['symbol'] = orders.symbol
transactions['side'] = orders.side
transactions['filled_at'] = pd.to_datetime(orders.filled_at).dt.tz_convert('America/New_York')
transactions['filled_avg_price'] = orders.filled_avg_price.astype('float')
# set filled qty sign based upon buy or sell
transactions['qty'] = orders.filled_qty.astype('float')
transactions['qty'] = transactions.qty.where(transactions.side=='buy', -transactions.qty)
# calculate filled dollar amount (a sell increases cash and is positive)
transactions['dollar_amt'] = -(transactions.qty * transactions.filled_avg_price)
# create a running net position qty
transactions['position_qty'] = transactions.sort_values('filled_at').groupby('symbol').expanding().qty.sum().droplevel(0)
# create a position number for each time a position goes to zero
# for example buy 10 then sell 10 creates a position with a position number 1
# those two transactions are part of position 1
# buying another 10 will create a new position with position number 2
# any additional transactions will be part of position 2 until the net_qty goes to 0
transactions['position_number'] = (transactions.loc[transactions.position_qty==0]
.groupby('symbol')
.expanding().side.count().droplevel(0)
)
transactions['position_number'] = transactions.groupby('symbol').position_number.bfill()
return transactions
def create_positions(transactions):
# create an empty dataframe to hold our positions
positions = pd.DataFrame()
# add fields from transactions for open and close time and price
positions['opened_at'] = transactions.groupby(['symbol', 'position_number']).filled_at.first()
positions['closed_at'] = transactions.groupby(['symbol', 'position_number']).filled_at.last()
positions['open_price'] = transactions.groupby(['symbol', 'position_number']).filled_avg_price.first()
positions['close_price'] = transactions.groupby(['symbol', 'position_number']).filled_avg_price.last()
positions['qty'] = transactions.groupby(['symbol', 'position_number']).qty.sum()
# add the position 'side' replace labels buy->long and sell->short for easier reading
positions['side'] = transactions.groupby(['symbol', 'position_number']).side.first()
positions['side'] = positions.side.replace(['buy', 'sell'], ['long', 'short'])
# calculate realized gain from cost basis and sales
# log gain is used so values can be averaged etc
positions['cost_basis'] = -transactions.query('side=="buy"').groupby(['symbol', 'position_number']).dollar_amt.sum().round(2)
positions['sales'] = transactions.query('side=="sell"').groupby(['symbol', 'position_number']).dollar_amt.sum().round(2)
positions['profit'] = positions.sales - positions.cost_basis
positions['gain'] = np.log1p(positions.profit / positions.cost_basis)
return positions