Skip to content

Top Records

Add key Source: adopted from here

Introduction

The first thing to do in most quantitative analysis in the electronic trading is to determine the data scope, which covers two aspects:

  • the sample period, i.e. the start and end date/time of the data
  • the cross sectional space, i.e. the universe of securities included in the analysis

A common approach to determine the security universe is to include the top n most actively traded symbols.

Question

Assuming a trade table, which has three columns: date, ticker and volume, is simulated with the function simulateTrades defined below. This function generates 500000 trades for all week days in March 2020. The stock ticker is generated as three random letters. The column volume indicates how many shares are traded.

getTradingDays:{
    firstDate:2020.03.01;
    lastDate:2020.03.31;
    dates:firstDate+til (lastDate-firstDate)+1;
    dates where not (dates mod 7) in 0 1
  };

simulateTrades:{[seed;nTrades]
    tradingDays:getTradingDays[];

    system "S ",string seed;
    dates:nTrades?tradingDays;

    system "S ",string seed;
    tickers:nTrades?`3;

    system "S ",string seed;
    volumes:100*nTrades?1+til 100;

    ([] date:dates;ticker:tickers;volume:volumes)
  };
trades:simulateTrades[-314159;5000000]

Find the top 10 symbols that are most actively traded (measured by total traded shares) on each day. The returned table should have 220 rows, i.e. 10 rows for each day and the following schema:

topTradedSymbols:([]
  date:`date$();
  ticker:`symbol$();
  dailyVol:`long$()
)

Answer

First we compute the daily volume from the trade tick data and sort the volume in descending order for each date. Note that 0! is used to remove the key from the table.

dailyVolume:`date`dailyVol xdesc 0!select dailyVol:sum volume by date,ticker from trades;

Below four different approaches are provided to find the top 10 most actively trades symbols.

Use fby

Find the first 10 indices on each day and then filter out these rows.

select from dailyVolume where ({x in 10#x};i) fby date

Use sublist

ungroup select sublist[10] ticker,sublist[10] dailyVol by date from dailyVolume

Use #

Given sublist is implemented internally with take operator, using # directly is slightly faster than using sublist.

ungroup select 10#ticker,10#dailyVol by date from dailyVolume

Use group

select from dailyVolume where i in raze 10#/:group date

Remarks

The last approach using group is preferred due to two reasons:

  • it runs fastest
  • it is cleaner if multiple columns are present in the table