Functional Select¶
Source: adopted from here
Introduction¶
Sometimes the source data we have to deal with is given and we have no control on how the data is named, typed and stored. One example is there is a space in the table column name or a q keyword is used as the column name. In this case, querying the table becomes tricky.
Question¶
Below is a simple "badly" constructed table dataTbl
, which has a column named order type
, noting the space between the two words.
dataList:(
"date,sym,order type,orderQty"
;"2020.03.09,AAPL.OQ,MID QUOTE,80000"
;"2020.03.09,AAPL.OQ,FAR TOUCH,50000"
;"2020.03.09,AAPL.OQ,NEAR TOUCH,120000"
;"2020.03.10,AAPL.OQ,MID QUOTE,100000"
;"2020.03.10,AAPL.OQ,FAR TOUCH,70000"
;"2020.03.10,AAPL.OQ,NEAR TOUCH,170000"
;"2020.03.09,IBM.N,MID QUOTE,83000"
;"2020.03.09,IBM.N,FAR TOUCH,54000"
;"2020.03.09,IBM.N,NEAR TOUCH,129000"
;"2020.03.10,IBM.N,MID QUOTE,130000"
;"2020.03.10,IBM.N,FAR TOUCH,79000"
;"2020.03.10,IBM.N,NEAR TOUCH,175000"
;"2020.03.09,BABA.N,MID QUOTE,120000"
;"2020.03.09,BABA.N,FAR TOUCH,68000"
;"2020.03.09,BABA.N,NEAR TOUCH,930000"
;"2020.03.10,BABA.N,MID QUOTE,150000"
;"2020.03.10,BABA.N,FAR TOUCH,96000"
;"2020.03.10,BABA.N,NEAR TOUCH,2030000"
);
dataTbl:("DSSJ"; enlist ",") 0:dataList;
Answer the following two questions:
- Find the total order qty (
orderQty
) across all dates grouped byorder type
- Find the total order qty for order type
MID QUOTE
, grouped bysym
Answer¶
When selecting from a "problematic" table with space in column name or keyword as column name, the classical select ... from ...
template does not work. We have two ways to deal with tables like this.
Functional Select¶
The basic syntax of functional select is ?[t;a;b;c]
where:
t
: a table or table namea
: a dictionary of aggregatesb
: a dictionary of group byc
: a dictionary of constraints
More details on functional select are here and here.
Question 1¶
a:();
b:(enlist `orderType)!(enlist `$"order type");
c:(enlist `totalOrderQty)!(enlist (sum;`orderQty));
?[dataTbl;a;b;c]
Question 2¶
a:(enlist (=;`$"order type";enlist `$"MID QUOTE"));
b:(enlist `sym)!(enlist `sym);
c:(enlist `totalOrderQty)!(enlist (sum;`orderQty));
?[dataTbl;a;b;c]
Rename Columns¶
We can use xcol
to rename the column names. Also note that we need to caste the string "MID QUOTE" to a symbol since it contains a space.
Question 1¶
select totalOrderQty:sum orderQty by orderType from `date`sym`orderType xcol dataTbl
Question 2¶
select totalOrderQty:sum orderQty by sym from (
`date`sym`orderType xcol dataTbl
) where orderType=`$"MID QUOTE"
Alternatively, you can also use like
to match a string pattern:
select totalOrderQty:sum orderQty by sym from (
`date`sym`orderType xcol dataTbl
) where orderType like "MID QUOTE"
Note that using like
is slower than the first approach. It is 20%
slower when run from my machine for 10 million times.