Essays/Inverted Table
Introduction
A table is 2-dimensional data in which the items of a column have the same data type; an inverted table is a table in which the columns are boxed. An inverted table is more efficient in time and space than other representations, such as one in which each atom of the table is boxed.
x below is an example of an inverted table. It has 5 columns and 8 rows. The i-th row is i{&.>x .
x0=: ];._1 ' Smith Jones Chan Wilson Saxon Angelo Smith Wilson' x1=: ];._1 ' John Dakota Wilson Diana Joan Roberto John John' x2=: 0 1 0 1 1 0 0 1 x3=: 23 29 47 23 31 19 23 23 x4=: 1.25 0.97 2.11 1.25 2.8 1.11 1.25 1.25 x=: x0;x1;x2;x3;x4 x ββββββββ¬ββββββββ¬ββββββββββββββββ¬ββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββ βSmith βJohn β0 1 0 1 1 0 0 1β23 29 47 23 31 19 23 23β1.25 0.97 2.11 1.25 2.8 1.11 1.25 1.25β βJones βDakota β β β β βChan βWilson β β β β βWilsonβDiana β β β β βSaxon βJoan β β β β βAngeloβRobertoβ β β β βSmith βJohn β β β β βWilsonβJohn β β β β ββββββββ΄ββββββββ΄ββββββββββββββββ΄ββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββββββ ,.&.> x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βSmith βJohn β0β23β1.25β βJones βDakota β1β29β0.97β βChan βWilson β0β47β2.11β βWilsonβDiana β1β23β1.25β βSaxon βJoan β1β31β 2.8β βAngeloβRobertoβ0β19β1.11β βSmith βJohn β0β23β1.25β βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ $ x 5 #&> x 8 8 8 8 8 (<_1 2) {&.> x ββββββββ¬ββββββββ¬ββββ¬ββββββ¬ββββββββββ βWilsonβJohn β1 0β23 47β1.25 2.11β βChan βWilson β β β β ββββββββ΄ββββββββ΄ββββ΄ββββββ΄ββββββββββ
Alternative Representation
As indicated above, an alternative representation of a table is one where the table atoms are individually boxed. It is straightforward to convert from an inverted table to this representation and vice versa.
] x1=: |:@:(<"_1&>) x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βSmith βJohn β0β23β1.25β ββββββββΌββββββββΌββΌβββΌβββββ€ βJones βDakota β1β29β0.97β ββββββββΌββββββββΌββΌβββΌβββββ€ βChan βWilson β0β47β2.11β ββββββββΌββββββββΌββΌβββΌβββββ€ βWilsonβDiana β1β23β1.25β ββββββββΌββββββββΌββΌβββΌβββββ€ βSaxon βJoan β1β31β2.8 β ββββββββΌββββββββΌββΌβββΌβββββ€ βAngeloβRobertoβ0β19β1.11β ββββββββΌββββββββΌββΌβββΌβββββ€ βSmith βJohn β0β23β1.25β ββββββββΌββββββββΌββΌβββΌβββββ€ βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ 7!:5 ;:'x x1' NB. # bytes 640 2816 x2=: <@(>"1)@|: x1 x -: x2 1
The remainder of this note presents utilities for working with inverted tables. If a table has "key" columns, then computations such as index of should be applied to those columns rather than to the entire table.
Assertions
tassert asserts that the argument is an inverted table: a boxed vector with a least one element, having the same number of items in each box.
tassert=: 3 : 0 assert. (1>:#$y) *. 32=3!:0 y NB. boxed scalar or vector assert. 1=#~.#&>y NB. same # items in each box (with at least one box) 1 ) tassert x 1 tassert x1 |assertion failure: tassert | (1=#$y)*.32=3!:0 y tassert (i.5);i.6 |assertion failure: tassert | 1=#~.#&>y
Tally
The number of rows in an inverted table is the tally (#) of one of its opened atoms.
ttally=: #@>@{. ttally x 8
From
i tfrom x selects record(s) i from table x .
tfrom=: <@[ {&.> ] 4 3 tfrom x ββββββββ¬ββββββββ¬ββββ¬ββββββ¬βββββββββ βSaxon βJoan β1 1β31 23β2.8 1.25β βWilsonβDiana β β β β ββββββββ΄ββββββββ΄ββββ΄ββββββ΄βββββββββ
Index of
To compute index-of (analog of i.), replace column i by its indices in column i of x , then compute i. as usual on the transposed tables of integers.
y=: 3 1 1 2 tfrom x |: x i.&> x 0 0 0 0 0 1 1 1 1 1 2 2 0 2 2 3 3 1 0 0 4 4 1 4 4 5 5 0 5 5 0 0 0 0 0 3 0 1 0 0 |: x i.&> y 3 3 1 0 0 1 1 1 1 1 1 1 1 1 1 2 2 0 2 2 (x i.&> x) i.&|: (x i.&> y) 3 1 1 2 tindexof=: i.&>~@[ i.&|: i.&> x tindexof y 3 1 1 2
Member of
x tmemberof y are those indices in y tindexof x that are less than the number of rows in y ; alternatively, it can be computed more directly from the transposed tables of integer indices on columns of y rather than on columns of x . Thus:
y tindexof x 4 1 3 0 4 4 4 4 ttally y 4 (y tindexof x) < ttally y 0 1 1 1 0 0 0 0 tmemberof1=: tindexof~ < ttally@] x tmemberof1 y 0 1 1 1 0 0 0 0 y i.&> x 4 1 3 0 4 4 4 0 4 1 3 0 4 4 4 4 3 0 3 0 0 3 3 0 0 1 3 0 4 4 0 0 0 1 3 0 4 4 0 0 y i.&> y 0 1 1 3 0 1 1 3 0 0 0 3 0 1 1 3 0 1 1 3 (y i.&> x) e.&|: (y i.&> y) 0 1 1 1 0 0 0 0 tmemberof=: i.&>~ e.&|: i.&>~@] x tmemberof y 0 1 1 1 0 0 0 0
Less
x tless y are rows of x which are not rows of y . Thus:
tless=: <@:-.@tmemberof #&.> [ ,.&.> x tless y ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βSmith βJohn β0β23β1.25β βSaxon βJoan β1β31β 2.8β βAngeloβRobertoβ0β19β1.11β βSmith βJohn β0β23β1.25β βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ
Nub Sieve and Nub
The nub sieve of a table can be computed by comparison of its self-tindexof values against the indices from 0 to n-1 ; alternatively, it can be computed directly from the transposed table of integer indices.
x tindexof x 0 1 2 3 4 5 0 7 i. ttally x 0 1 2 3 4 5 6 7 (x tindexof x) = i. ttally x 1 1 1 1 1 1 0 1 tnubsieve1=: tindexof~ = i.@ttally tnubsieve1 x 1 1 1 1 1 1 0 1 x i.&> x 0 1 2 3 4 5 0 3 0 1 2 3 4 5 0 0 0 1 0 1 1 0 0 1 0 1 2 0 4 5 0 0 0 1 2 0 4 5 0 0 ~: |: x i.&> x 1 1 1 1 1 1 0 1 tnubsieve=: ~:@|:@:(i.&>)~ tnubsieve x 1 1 1 1 1 1 0 1
Nub obtains by selecting each column using the nub sieve.
tnub=: <@tnubsieve #&.> ] tnub x ββββββββ¬ββββββββ¬ββββββββββββββ¬βββββββββββββββββββββ¬ββββββββββββββββββββββββββββββββββ βSmith βJohn β0 1 0 1 1 0 1β23 29 47 23 31 19 23β1.25 0.97 2.11 1.25 2.8 1.11 1.25β βJones βDakota β β β β βChan βWilson β β β β βWilsonβDiana β β β β βSaxon βJoan β β β β βAngeloβRobertoβ β β β βWilsonβJohn β β β β ββββββββ΄ββββββββ΄ββββββββββββββ΄βββββββββββββββββββββ΄ββββββββββββββββββββββββββββββββββ ,.&.> tnub x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βSmith βJohn β0β23β1.25β βJones βDakota β1β29β0.97β βChan βWilson β0β47β2.11β βWilsonβDiana β1β23β1.25β βSaxon βJoan β1β31β 2.8β βAngeloβRobertoβ0β19β1.11β βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ
Key
x and y are inverted tables; rows of x specify keys for the corresponding rows of y . Then: x u tkey y applies u to rows of y having identical keys.
tkey=: 1 : '<@tindexof~@[ u/.&.> ]' (0 1{x) < tkey 2 3 4{x βββββββββββββββββββ¬ββββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ ββββββ¬ββ¬ββ¬ββ¬ββ¬ββ¬ββββββββββ¬βββ¬βββ¬βββ¬βββ¬βββ¬βββββββββββββββ¬βββββ¬βββββ¬βββββ¬ββββ¬βββββ¬ββββββ ββ0 0β1β0β1β1β0β1βββ23 23β29β47β23β31β19β23βββ1.25 1.25β0.97β2.11β1.25β2.8β1.11β1.25ββ ββββββ΄ββ΄ββ΄ββ΄ββ΄ββ΄ββββββββββ΄βββ΄βββ΄βββ΄βββ΄βββ΄βββββββββββββββ΄βββββ΄βββββ΄βββββ΄ββββ΄βββββ΄ββββββ βββββββββββββββββββ΄ββββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββββββββ (0 1{x) +/ tkey 2 3 4{x βββββββββββββββ¬βββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββ β0 1 0 1 1 0 1β46 29 47 23 31 19 23β2.5 0.97 2.11 1.25 2.8 1.11 1.25β βββββββββββββββ΄βββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββ
Grade and Sort
A table can be graded by grading column c-1 , then column c-2 , and so on, to column 0 .
(0{::x) (]/:{~) (1{::x) (]/:{~) (2{::x) (]/:{~) (3{::x) (]/:{~) /:4{::x 5 2 1 4 0 6 3 7 tgrade =: > @ ((] /: {~)&.>/) @ (}: , /:&.>@{:) tgradedown=: > @ ((] \: {~)&.>/) @ (}: , \:&.>@{:) tgrade x 5 2 1 4 0 6 3 7 tgradedown x 7 3 0 6 4 1 2 5 ,.&.> (<tgrade x) {&.> x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βAngeloβRobertoβ0β19β1.11β βChan βWilson β0β47β2.11β βJones βDakota β1β29β0.97β βSaxon βJoan β1β31β 2.8β βSmith βJohn β0β23β1.25β βSmith βJohn β0β23β1.25β βWilsonβDiana β1β23β1.25β βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ ,.&.> (<tgradedown x) {&.> x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βWilsonβJohn β1β23β1.25β βWilsonβDiana β1β23β1.25β βSmith βJohn β0β23β1.25β βSmith βJohn β0β23β1.25β βSaxon βJoan β1β31β 2.8β βJones βDakota β1β29β0.97β βChan βWilson β0β47β2.11β βAngeloβRobertoβ0β19β1.11β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ
The grade can also be computed by replacing each column with the rankings within each column, with equal items being assigned equal rankings. (!.0 is used because /: is non-tolerant.) Thus:
ranking=: i.!.0~ { /:@/: ranking&> x 4 2 1 6 3 0 4 6 3 0 7 1 2 6 3 3 0 4 0 4 4 0 0 4 1 5 7 1 6 0 1 1 2 0 6 2 7 1 2 2 /: |: ranking&> x 5 2 1 4 0 6 3 7 tgrade1=: /: @ |: @: (ranking&>) tgrade1 x 5 2 1 4 0 6 3 7
Sort obtains by indexing with the grade.
tsort=: <@tgrade {&.> ] ,.&.> tsort x ββββββββ¬ββββββββ¬ββ¬βββ¬βββββ βAngeloβRobertoβ0β19β1.11β βChan βWilson β0β47β2.11β βJones βDakota β1β29β0.97β βSaxon βJoan β1β31β 2.8β βSmith βJohn β0β23β1.25β βSmith βJohn β0β23β1.25β βWilsonβDiana β1β23β1.25β βWilsonβJohn β1β23β1.25β ββββββββ΄ββββββββ΄ββ΄βββ΄βββββ
Selfie
Index-of with both arguments being the same, an instance of a βselfieβ. (i.~ x) are like ID numbers; questions of identity on x can often be answered more efficiently on (i.~ x) than on x itself.
tselfie=: |:@:(i.~&>) tselfie x 0 0 0 0 0 1 1 1 1 1 2 2 0 2 2 3 3 1 0 0 4 4 1 4 4 5 5 0 5 5 0 0 0 0 0 3 0 1 0 0
Collected Definitions
ifa =: <@(>"1)@|: NB. inverted from atoms afi =: |:@:(<"_1@>) NB. atoms from inverted tassert=: 3 : 0 assert. (1>:#$y) *. 32=3!:0 y NB. boxed scalar or vector assert. 1=#~.#&>y NB. same # items in each box (with at least one box) 1 ) ttally =: #@>@{. tfrom =: <@[ {&.> ] tindexof =: i.&>~@[ i.&|: i.&> tmemberof =: i.&>~ e.&|: i.&>~@] tless =: <@:-.@tmemberof #&.> [ tnubsieve =: ~:@|:@:(i.&>)~ tnub =: <@tnubsieve #&.> ] tkey =: 1 : '<@tindexof~@[ u/.&.> ]' tgrade =: > @ ((] /: {~)&.>/) @ (}: , /:&.>@{:) tgradedown=: > @ ((] \: {~)&.>/) @ (}: , \:&.>@{:) tsort =: <@tgrade {&.> ] tselfie =: |:@:(i.~&>) NB. alternatives tmemberof1=: tindexof~ < ttally@] tnubsieve1=: tindexof~ = i.@ttally ranking =: i.!.0~ { /:@/: tgrade1 =: /: @ |: @: (ranking&>)
Contributed by Roger Hui. [[Category:Inverted table primitives]