Top Records¶

Add key Source: adopted from here

Introduction¶

The first thing to do in most quantitative analysis in the electronic trading is to determine the data scope, which covers two aspects:

the sample period, i.e. the start and end date/time of the data
the cross sectional space, i.e. the universe of securities included in the analysis

A common approach to determine the security universe is to include the top n most actively traded symbols.

Question¶

Assuming a trade table, which has three columns: date, ticker and volume, is simulated with the function simulateTrades defined below. This function generates 500000 trades for all week days in March 2020. The stock ticker is generated as three random letters. The column volume indicates how many shares are traded.

getTradingDays:{
    firstDate:2020.03.01;
    lastDate:2020.03.31;
    dates:firstDate+til (lastDate-firstDate)+1;
    dates where not (dates mod 7) in 0 1
  };

simulateTrades:{[seed;nTrades]
    tradingDays:getTradingDays[];

    system "S ",string seed;
    dates:nTrades?tradingDays;

    system "S ",string seed;
    tickers:nTrades?`3;

    system "S ",string seed;
    volumes:100*nTrades?1+til 100;

    ([] date:dates;ticker:tickers;volume:volumes)
  };
trades:simulateTrades[-314159;5000000]

Find the top 10 symbols that are most actively traded (measured by total traded shares) on each day. The returned table should have 220 rows, i.e. 10 rows for each day and the following schema:

topTradedSymbols:([]
  date:`date$();
  ticker:`symbol$();
  dailyVol:`long$()
)

Answer¶

First we compute the daily volume from the trade tick data and sort the volume in descending order for each date. Note that 0! is used to remove the key from the table.

dailyVolume:`date`dailyVol xdesc 0!select dailyVol:sum volume by date,ticker from trades;

Below four different approaches are provided to find the top 10 most actively trades symbols.

Use `fby`¶

Find the first 10 indices on each day and then filter out these rows.

select from dailyVolume where ({x in 10#x};i) fby date

Use `sublist`¶

ungroup select sublist[10] ticker,sublist[10] dailyVol by date from dailyVolume

Use `#`¶

Given sublist is implemented internally with take operator, using # directly is slightly faster than using sublist.

ungroup select 10#ticker,10#dailyVol by date from dailyVolume

Use `group`¶

select from dailyVolume where i in raze 10#/:group date

Remarks¶

The last approach using group is preferred due to two reasons:

it runs fastest
it is cleaner if multiple columns are present in the table

Top Records¶

Introduction¶

Question¶

Answer¶

Use fby¶

Use sublist¶

Use #¶

Use group¶

Remarks¶

Use `fby`¶

Use `sublist`¶

Use `#`¶

Use `group`¶