Addons/tables/csv
tables/csv - CSV utilities
- Provides verbs to read from and write to comma-separated-value (CSV) files or strings.
- Supports appending arrays to an existing CSV file.
- Can convert fields to numeric type where possible.
- Old code that uses the base library csv script should not need any modification (apart from loading) to use this addon instead.
- CSV is a specific case of the delimiter-separated-value (DSV) format, and the verbs in this addon are covers of those in the tables/dsv addon.
Browse history, source and examples in SVN.
Verbs available
- appendcsv: Appends an array to a csv file
- fixcsv: Convert csv data into J array
- makecsv: Makes a CSV string from an array
- makenum: Converts cells in array of boxed literals to numeric where possible
- enclose: Encloses string in quotes
- readcsv: Reads csv file into a boxed array
- writecsv: Writes an array to a csv file
Installation
Use JAL/Package Manager to install both the tables/csv and tables/dsv addons.
If you wish to replace the use of the base library csv script with the tables/csv addon, add the following lines to your ~config/startup.ijs script:
PUBLIC_j_=: (<<<({."1 PUBLIC_j_) i. <'csv'){PUBLIC_j_
buildpublic_j_ 0 : 0
csv ~addons/tables/csv/csv
)
If you do this, then require 'csv' and load 'csv' will target the csv addon rather than the base library csv script.
Usage
Load the csv addon with the following line:
load 'tables/csv'
Verbs are documented in the csv.ijs script.
   ]dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23
┌──┬──┬───────┬─────┐
│34│45│hello  │_5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
   datatype each dat
┌───────┬───────┬───────┬────────┐
│integer│literal│literal│floating│
├───────┼───────┼───────┼────────┤
│integer│literal│literal│floating│
└───────┴───────┴───────┴────────┘
   makecsv dat
34,"45","hello",-5.34
12,"32","goodbye",1.23
   dat writecsv jpath '~temp/test.csv'
47
   ]datcsv=: freads jpath '~temp/test.csv'
34,"45","hello",-5.34
12,"32","goodbye",1.23
   fixcsv datcsv
┌──┬──┬───────┬─────┐
│34│45│hello  │-5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
   readcsv jpath '~temp/test.csv'
┌──┬──┬───────┬─────┐
│34│45│hello  │-5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
Note: to use custom field and/or string delimiters, see the tables/dsv addon. The tables/csv addon is a special case of the tables/dsv addon, with the field delimiter set to ',' and the string delimiter set to '"'.
To see more samples of usage, open and inspect the test_csv.ijs script.
Comparison with `csv.ijs` script in base library
The tables/csv addon is no longer as concise (and clean) as the original csv script in the base library. However, it supports more features, fixes some apparent bugs and, in most cases, performs better than the original.
Most of the verbs from the base library csv script are unchanged. The structural changes can be summarised as follows:
- The algorithm used by chopcsv to convert a line from a CSV string into a list of boxed fields has been replaced.
- The portion of writecsv used to make a CSV string from a J array has been factored out into a separate verb, makecsv.
- The algorithm used by makecsv to make a CSV string from a J array has been replaced. The new algorithm depends on the type of the J array.
- appendcsv was added to allow a J array to be converted to a CSV string and appended to an existing file.
- makenum was added to convert cells of arrays created with fixcsv to numeric types where possible.
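The two added verbs can be tried together in a short script. This is a minimal sketch, not taken from the addon's own tests: the file name is a made-up scratch path, and it assumes that appendcsv takes its arguments in the same order as writecsv (array on the left, file name on the right) and that makenum can be used monadically on the result of readcsv.

```j
load 'tables/csv'

NB. Hypothetical scratch file; the name is an assumption for this sketch
fn=: jpath '~temp/append_demo.csv'

NB. Write a 2-row table, then append one more row to the same file
(2 2 $ 1;'one';2;'two') writecsv fn
(1 2 $ 3;'three') appendcsv fn

NB. Read everything back; cells arrive as boxed literals
raw=: readcsv fn

NB. Convert the cells that can be read as numbers
num=: makenum raw

NB. Inspect the per-cell types before and after conversion
datatype each raw
datatype each num
```

Comparing the two datatype results should show which cells makenum was able to convert.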
Features
Feature changes from the base library csv script:
- Supports appending arrays to an existing CSV file.
- Optional user-defined field delimiter and string delimiter(s); see Addons/tables/dsv.
- Only literal cells of a J array are enclosed by string delimiters.
- writecsv/makecsv can handle boxed arrays with cells containing numeric arrays, boxed or complex data:
   ]tstarry=: ((34j3;2;<<4),:2;3 6;3)
┌────┬───┬───┐
│34j3│2  │┌─┐│
│    │   ││4││
│    │   │└─┘│
├────┼───┼───┤
│2   │3 6│3  │
└────┴───┴───┘
   load '~system/packages/files/csv.ijs'
   tstarry writecsv jpath '~temp/tstcsv.csv'
|domain error: writecsv
|   dat=.,each 8!:2 each x
   load 'tables/csv'
   tstarry writecsv jpath '~temp/tstcsv.csv'
19
   freads jpath '~temp/tstcsv.csv'
34j3,2,4
2,3 6,3
Fixed bugs?
- writecsv/makecsv does not append LF to an empty string.
- fixcsv correctly unescapes quotes embedded in fields
   tstcsv=: '"Symbol "" is Rank",38,"abc"',LF,'"Hello world",56,"efg"',LF
   load '~system/packages/files/csv.ijs'
   fixcsv tstcsv
┌─────────────────┬──┬───┐
│Symbol "" is Rank│38│abc│
├─────────────────┼──┼───┤
│Hello world      │56│efg│
└─────────────────┴──┴───┘
   load 'tables/csv'
   fixcsv tstcsv
┌────────────────┬──┬───┐
│Symbol " is Rank│38│abc│
├────────────────┼──┼───┤
│Hello world     │56│efg│
└────────────────┴──┴───┘
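The unescaping fix can also be checked as a round trip from a J array. The sketch below is illustrative only: the sample data is invented, and it assumes that makecsv escapes an embedded double quote in a way that fixcsv reverses. No output is shown because it has not been verified against a particular addon version.

```j
load 'tables/csv'

NB. One-row table of literals; the first cell contains embedded double quotes
row=: 1 2 $ 'say "hi"';'plain'

NB. Make a CSV string, then parse it back into boxed fields
csv=: makecsv row
back=: fixcsv csv

NB. 1 if the first field survived the round trip unchanged
('say "hi"') -: > {. , back
```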
Performance
- Performance of fixcsv is essentially unchanged (slightly faster, if anything).
- The new algorithms in makecsv generally use 3 to 9 times less space and are faster in most cases.
- Large arrays of a single type, or with columns that are each of a single type, are processed at least as fast as by the old version, and simple numeric arrays are over 4 times faster.
- For small arrays containing different datatypes, the new version can be up to twice as slow as the old version, but because the total time taken is small, this is unlikely to matter in practice.
- Large arrays with multiple types within a column are about 80% as fast as with the old version, but use about 8 times less space. See the table below.
| Data type | Iterations | Code | Library csv.ijs time | Library csv.ijs space | Addon csv.ijs time | Addon csv.ijs space | Time ratio (library/addon) | Space ratio (library/addon) |
|---|---|---|---|---|---|---|---|---|
| Simple numeric | 100 | makecsv i. 50 70 | 0.0153 | 2913090 | 0.0035 | 417344 | 4.422 | 6.980 |
| Simple numeric (big) | 1 | makecsv i.5000 70 | 2.3214 | 293626000 | 0.5485 | 45474900 | 4.232 | 6.457 |
| Boxed numeric | 100 | makecsv <"0 i. 50 70 | 0.0148 | 2913340 | 0.0092 | 850624 | 1.602 | 3.425 |
| Boxed numeric (big) | 1 | makecsv <"0 i.5000 70 | 2.3212 | 293626000 | 1.9981 | 87621100 | 1.162 | 3.351 |
| Simple literal (big) | 1 | makecsv 5000 70$'abcd' | 4.5013 | 644609000 | 4.0594 | 645135000 | 1.109 | 0.999 |
| Columns of single type | 100 | makecsv simpcol | 0.0002 | 38272 | 0.0003 | 9792 | 0.619 | 3.908 |
| Columns of single type (big) | 1 | makecsv 5000$simpcol | 0.3163 | 45443200 | 0.0904 | 5180220 | 3.499 | 8.772 |
| Columns of mixed type | 100 | makecsv mixcol | 0.0003 | 33536 | 0.0004 | 11648 | 0.589 | 2.879 |
| Columns of mixed type (big) | 1 | makecsv 5000$mixcol | 0.2862 | 38818700 | 0.3302 | 4959490 | 0.867 | 7.827 |
| String (small) | 100 | fixcsv ssimpcol | 0.0002 | 10624 | 0.0002 | 10496 | 1.029 | 1.012 |
| String (big) | 1 | fixcsv 171250$ssimpcol | 0.2588 | 4530620 | 0.2400 | 4530690 | 1.078 | 1.000 |
   simpcol
┌──┬────────────────┬─┬─┬────┬────┬──┐
│12│The black dog   │1│E│9.32│54  │XL│
├──┼────────────────┼─┼─┼────┼────┼──┤
│15│likes to        │0│R│4.45│5.24│  │
├──┼────────────────┼─┼─┼────┼────┼──┤
│22│eat             │1│E│    │455 │XS│
├──┼────────────────┼─┼─┼────┼────┼──┤
│96│juicy, red bones│1│W│5.45│924 │M │
└──┴────────────────┴─┴─┴────┴────┴──┘
   mixedcol
┌────┬─────────────┬────────┬───┬────────────────┬────┐
│12  │The black dog│1       │E  │9.32            │54  │
├────┼─────────────┼────────┼───┼────────────────┼────┤
│XL  │15           │likes to│0  │R               │4.45│
├────┼─────────────┼────────┼───┼────────────────┼────┤
│5.24│             │22      │eat│1               │E   │
├────┼─────────────┼────────┼───┼────────────────┼────┤
│    │455          │XS      │96 │juicy, red bones│1   │
└────┴─────────────┴────────┴───┴────────────────┴────┘
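Time and space figures of this kind can be collected with J's timing and space foreigns. The source does not say how its numbers were measured, so the following is only a sketch of one common way to do it: ts is a widely used helper idiom, not part of the addon, and the two sentences are taken from the Code column for simple numeric arrays.

```j
load 'tables/csv'

NB. ts is a helper defined here, not an addon verb.
NB. x ts y runs sentence y x times and returns two numbers:
NB.   average seconds per run (6!:2) , bytes needed to run it (7!:2)
ts=: 6!:2 , 7!:2@]

NB. Small case: 100 iterations, as in the Iterations column
small=: 100 ts 'makecsv i. 50 70'

NB. Big case: a single iteration
big=: 1 ts 'makecsv i. 5000 70'

NB. Show both results, time then space, one row per case
small ,: big
```

To compare against the base library script, the same sentences would be rerun after loading that script instead of the addon. Absolute numbers will differ by machine and J version.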
Authors
Adapted from the base library csv script by Ric Sherlock
Suggestions and/or SVN improvements to the addon are welcome.
See Also
- csvedit addon - GUI application for creating and editing CSV files.
- dsv addon - general utility for any delimiter-separated-value formatted string.