Skip to content

Loading Data From File to Kdb

span style="display:block;text-align:center"> Query Source: adopted from here

Introduction

I/O operation is one of the most common features supported by all programming languages. Reading from and writing to a file in q is extremely powerful and flexible. Reading a standard csv is quite straightforward and is covered extensively in Q For Mortals. However, the files we have to read sometimes are not standard and some pre-processing is required before they are loaded into kdb.

Question

NASDAQ publishes the listed symbols on its exchange on a daily basis. Download the text file nasdaqlisted.txt from here and save the file on your personal computer. The file you downloaded has a header and footer. For example

Header

Symbol|Security Name|Market Category|Test Issue|Financial Status|Round Lot Size|ETF|NextShares

Footer

File Creation Time: 0316202018:01|||||||

Write a q script snippet to load this text file into kdb table nasdaqlisted with the following schema:

q) meta nasdaqlisted
| c              | t f a |
|----------------|-------|
| symbol         | s     |
| securityName   | s     |
| marketCategory | s     |
| isTestIssue    | b     |
| lotSize        | j     |
| isETF          | b     |

Answer

The following q script snippet is proposed. On my machine, I save the file under C:\data\. Note that the path separator in q is forward slash (/) on all Operating Systems including Windows, Linux, and Mac.

infile:hsym `$"C:/data/nasdaqlisted.txt";
nasdaqlisted:("SSSB JB ";enlist "|") 0:-1_read0 infile;
`symbol`securityName`marketCategory`isTestIssue`lotSize`isETF xcol nasdaqlisted

Some additional details on a few keywords used in the answer:

  • hsym is used to create a file handle from a symbol.
  • read0 reads a text file and the last line is dropped with -1_.
  • 0: interprets a field-delimited list of strings as a matrix.
  • Use a space to omit a field from the load if you don't want to load it.
  • Use xcol to rename column name with space in it as discussed last week.