reorder-columns
Overview¶
kgtk reorder-columns reorders file columns while copying a KGTK file from input to output.
You provide a list of column names in the order in which they should appear in the output file.
Note
You may not omit any column names from the list unless you use
a column range (..), an ellipsis (...), or --trim, described below.
Note
This command can be used to reorder the columns of non-KGTK input TSV files (quasi-KGTK files)
by using the expert option --mode=NONE.
Note
The output file must still contain the required columns (id for a KGTK node file; node1, label, and node2
for a KGTK edge file). This requirement may be disabled with the expert option --mode=NONE, but the
output file will then not be a valid KGTK node or edge file.
Info
See kgtk remove-columns if you wish to remove columns.
See kgtk rename-columns if you wish to rename columns.
See kgtk select-columns if you wish to select a subset of
columns, dropping unmentioned columns. This provides the same functionality as
kgtk reorder-columns --trim.
List of Column Names¶
When you use this command, you supply the --columns option with
a list of column names in the
order you wish them to appear in the output file.
Column Ranges¶
When listing the output column names, you may use .. to indicate a range of column names (e.g., first .. last)
in the order of the columns in the input file.
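As a rough illustration of the range rule, the sketch below expands a first .. last pair against the input column order. The function name `expand_range` is hypothetical and this is not KGTK's own code. It assumes both endpoints are included, which matches the node1 .. node2 example further down. It rejects a reversed range, which is a choice made for this sketch only.

```python
def expand_range(input_columns, first, last):
    """Return the input columns from `first` through `last`, inclusive,
    in the order they appear in the input file."""
    start = input_columns.index(first)  # raises ValueError if the name is unknown
    end = input_columns.index(last)
    if start > end:
        # Assumption for this sketch only: a reversed range is treated as an error.
        raise ValueError(f"{first!r} comes after {last!r} in the input file")
    return input_columns[start:end + 1]


input_columns = ["node1", "label", "node2", "location", "years"]
print(expand_range(input_columns, "node1", "node2"))
# ['node1', 'label', 'node2']
```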
Ellipsis¶
When listing the output column names, you may use ... to indicate all columns not explicitly mentioned.
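The following sketch shows one way to picture the ... expansion. The helper `expand_ellipsis` is a hypothetical name, not part of KGTK. It assumes the list contains at most one ... and that unmentioned columns keep their input-file order, as the examples on this page show.

```python
def expand_ellipsis(input_columns, requested):
    """Replace '...' in `requested` with every input column that is not
    named explicitly, keeping those columns in input-file order."""
    if requested.count("...") > 1:
        # Assumption for this sketch: only a single '...' is supported.
        raise ValueError("at most one '...' is supported in this sketch")
    named = {name for name in requested if name != "..."}
    remaining = [name for name in input_columns if name not in named]
    output = []
    for name in requested:
        if name == "...":
            output.extend(remaining)
        else:
            output.append(name)
    return output


input_columns = ["node1", "label", "node2", "location", "years"]
print(expand_ellipsis(input_columns, ["location", "..."]))
# ['location', 'node1', 'label', 'node2', 'years']
print(expand_ellipsis(input_columns, ["...", "years", "location"]))
# ['node1', 'label', 'node2', 'years', 'location']
```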
Trimming Columns¶
You may remove ("trim") columns with this command using the --trim option.
When this option is specified, all unmentioned columns will be removed
from the output file.
Note
An ellipsis (...) consumes all remaining column names, leaving none
for --trim to remove.
Note
kgtk reorder-columns --trim will not complain if there are no column names to
remove.
Note
If you use this option to remove a required column name (id for KGTK
node files, (node1, label, node2) for a KGTK edge file), then
you will create an invalid KGTK file (a quasi-KGTK file). You must
include the expert option --mode=NONE on the command line to
accomplish this.
Note
kgtk reorder-columns --trim may be used as an alternative to kgtk remove-columns.
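The interaction between --trim and ... described in the notes above can be sketched as follows. `resolve_columns` is a hypothetical helper written for illustration, not KGTK's implementation. It returns both the output column order and the columns that would be dropped, and it raises an error when columns are omitted without trimming.

```python
def resolve_columns(input_columns, requested, trim=False):
    """Return (output_columns, trimmed_columns) for a requested column list."""
    named = [name for name in requested if name != "..."]
    unmentioned = [name for name in input_columns if name not in named]
    output = []
    for name in requested:
        if name == "...":
            output.extend(unmentioned)  # '...' absorbs every unmentioned column
        else:
            output.append(name)
    leftover = [name for name in input_columns if name not in output]
    if leftover and not trim:
        raise ValueError(f"columns omitted without trimming: {leftover}")
    return output, leftover


input_columns = ["node1", "label", "node2", "location", "years"]

# With trimming, the unmentioned columns are dropped.
print(resolve_columns(input_columns, ["label", "node1", "node2"], trim=True))
# (['label', 'node1', 'node2'], ['location', 'years'])

# With '...', nothing is left for trimming to remove, and no error is raised.
print(resolve_columns(input_columns, ["label", "..."], trim=True))
# (['label', 'node1', 'node2', 'location', 'years'], [])
```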
Usage¶
usage: kgtk reorder-columns [-h] [-i INPUT_FILE] [-o OUTPUT_FILE] -c
COLUMN_NAME [COLUMN_NAME ...]
[--as COLUMN_NAME [COLUMN_NAME ...]]
[--trim [True|False]] [-v [optional True|False]]
This command reorders one or more columns in a KGTK file.
Reorder all columns using --columns col1 col2
Reorder selected columns using --columns col1 col2 ... coln-1 coln
Move a column to the front: --columns col ...
Move a column to the end: --columns ... col
Extract named columns, omitting the rest: --columns col1 col2 --trim
Move a range of columns: --columns coln .. colm ...
If no input filename is provided, the default is to read standard input.
Additional options are shown in expert help.
kgtk --expert reorder-columns --help
optional arguments:
-h, --help show this help message and exit
-i INPUT_FILE, --input-file INPUT_FILE
The KGTK input file. (May be omitted or '-' for
stdin.)
-o OUTPUT_FILE, --output-file OUTPUT_FILE
The KGTK output file. (May be omitted or '-' for
stdout.)
-c COLUMN_NAME [COLUMN_NAME ...], --columns COLUMN_NAME [COLUMN_NAME ...], --column COLUMN_NAME [COLUMN_NAME ...]
The list of reordered column names, optionally
containing '...' for column names not explicitly
mentioned.
--as COLUMN_NAME [COLUMN_NAME ...], --as-columns COLUMN_NAME [COLUMN_NAME ...], --as-column COLUMN_NAME [COLUMN_NAME ...]
Replacement column names.
--trim [True|False] If true, omit unmentioned columns. (default=False).
-v [optional True|False], --verbose [optional True|False]
Print additional progress messages (default=False).
Examples¶
Sample Data¶
Suppose that file1.tsv contains the following table in KGTK format:
kgtk cat -i examples/docs/reorder-columns-file1.tsv
| node1 | label | node2 | location | years |
|---|---|---|---|---|
| john | zipcode | 12345 | home | 10 |
| john | zipcode | 12346 | | |
| peter | zipcode | 12040 | home | |
| peter | zipcode | 12040 | cabin | |
| peter | zipcode | 12040 | work | 5 |
| peter | zipcode | 12040 | | 6 |
| steve | zipcode | 45601 | | 3 |
| steve | zipcode | 45601 | | 4 |
| steve | zipcode | 45601 | | 5 |
| steve | zipcode | 45601 | home | 1 |
| steve | zipcode | 45601 | work | 2 |
| steve | zipcode | 45601 | cabin | |
Giving a Complete List of Column Names¶
Copy file1.tsv, sending the output to standard output, giving
the complete list of columns, while swapping the order of
some columns:
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns label node1 node2 years location
| label | node1 | node2 | years | location |
|---|---|---|---|---|
| zipcode | john | 12345 | 10 | home |
| zipcode | john | 12346 | | |
| zipcode | peter | 12040 | | home |
| zipcode | peter | 12040 | | cabin |
| zipcode | peter | 12040 | 5 | work |
| zipcode | peter | 12040 | 6 | |
| zipcode | steve | 45601 | 3 | |
| zipcode | steve | 45601 | 4 | |
| zipcode | steve | 45601 | 5 | |
| zipcode | steve | 45601 | 1 | home |
| zipcode | steve | 45601 | 2 | work |
| zipcode | steve | 45601 | | cabin |
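For readers who want to see the effect on raw tab-separated text, here is a minimal Python sketch of the same reordering using the standard csv module. It is not how kgtk reorder-columns is implemented. The function name `reorder_tsv` and the two-row sample are illustrative assumptions, and the sketch ignores KGTK-specific validation.

```python
import csv
import io


def reorder_tsv(text, output_columns):
    """Rewrite tab-separated `text` with its columns in `output_columns` order."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=output_columns, delimiter="\t",
                            quoting=csv.QUOTE_NONE, lineterminator="\n",
                            extrasaction="ignore")  # unlisted columns are dropped
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Two rows taken from the sample data; the second has an empty location.
sample = (
    "node1\tlabel\tnode2\tlocation\tyears\n"
    "john\tzipcode\t12345\thome\t10\n"
    "peter\tzipcode\t12040\t\t6\n"
)
print(reorder_tsv(sample, ["label", "node1", "node2", "years", "location"]))
```

Empty cells stay empty and simply move with their column, which is why some rows in the table above end with a blank location.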
Giving a Range of Column Names¶
Copy file1.tsv, sending the output to standard output, giving
a range of columns, while swapping the order of
the last two columns:
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns node1 .. node2 years location
| node1 | label | node2 | years | location |
|---|---|---|---|---|
| john | zipcode | 12345 | 10 | home |
| john | zipcode | 12346 | | |
| peter | zipcode | 12040 | | home |
| peter | zipcode | 12040 | | cabin |
| peter | zipcode | 12040 | 5 | work |
| peter | zipcode | 12040 | 6 | |
| steve | zipcode | 45601 | 3 | |
| steve | zipcode | 45601 | 4 | |
| steve | zipcode | 45601 | 5 | |
| steve | zipcode | 45601 | 1 | home |
| steve | zipcode | 45601 | 2 | work |
| steve | zipcode | 45601 | | cabin |
Move a Column to the Beginning¶
Copy file1.tsv, sending the output to standard output, with
the location column first. Use the ellipsis (...) to name
all other columns.
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns location ...
The result will be the following table in KGTK format:
| location | node1 | label | node2 | years |
|---|---|---|---|---|
| home | john | zipcode | 12345 | 10 |
| | john | zipcode | 12346 | |
| home | peter | zipcode | 12040 | |
| cabin | peter | zipcode | 12040 | |
| work | peter | zipcode | 12040 | 5 |
| | peter | zipcode | 12040 | 6 |
| | steve | zipcode | 45601 | 3 |
| | steve | zipcode | 45601 | 4 |
| | steve | zipcode | 45601 | 5 |
| home | steve | zipcode | 45601 | 1 |
| work | steve | zipcode | 45601 | 2 |
| cabin | steve | zipcode | 45601 | |
Swap Two Columns, Naming Only Those Columns¶
Copy file1.tsv, sending the output to standard output, swapping
the positions of the location and years columns at the end
of the list of column names.
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns ... years location
The result will be the following table in KGTK format:
| node1 | label | node2 | years | location |
|---|---|---|---|---|
| john | zipcode | 12345 | 10 | home |
| john | zipcode | 12346 | | |
| peter | zipcode | 12040 | | home |
| peter | zipcode | 12040 | | cabin |
| peter | zipcode | 12040 | 5 | work |
| peter | zipcode | 12040 | 6 | |
| steve | zipcode | 45601 | 3 | |
| steve | zipcode | 45601 | 4 | |
| steve | zipcode | 45601 | 5 | |
| steve | zipcode | 45601 | 1 | home |
| steve | zipcode | 45601 | 2 | work |
| steve | zipcode | 45601 | | cabin |
Trimming Omitted Columns¶
Copy file1.tsv, sending the output to standard output, giving
a partial list of columns and trimming the remainder:
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns label node1 node2 \
--trim
| label | node1 | node2 |
|---|---|---|
| zipcode | john | 12345 |
| zipcode | john | 12346 |
| zipcode | peter | 12040 |
| zipcode | peter | 12040 |
| zipcode | peter | 12040 |
| zipcode | peter | 12040 |
| zipcode | steve | 45601 |
| zipcode | steve | 45601 |
| zipcode | steve | 45601 |
| zipcode | steve | 45601 |
| zipcode | steve | 45601 |
| zipcode | steve | 45601 |
Trimming a Required Column¶
Copy file1.tsv, sending the output to standard output, giving
a partial list of columns and trimming the remainder, which includes
required columns:
kgtk reorder-columns -i examples/docs/reorder-columns-file1.tsv \
--columns node1 location \
--trim --mode=NONE
| node1 | location |
|---|---|
| john | home |
| john | |
| peter | home |
| peter | cabin |
| peter | work |
| peter | |
| steve | |
| steve | |
| steve | |
| steve | home |
| steve | work |
| steve | cabin |
Note
Quasi-KGTK input files may also be processed by specifying --mode=NONE.
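The required-column rule that --mode=NONE bypasses can be pictured with a small check like the one below. This is a sketch under stated assumptions: the names `REQUIRED_COLUMNS` and `missing_required_columns` are hypothetical, and only the literal column names given on this page are checked, so any alternative names KGTK may accept are ignored.

```python
# Required columns as described on this page (hypothetical constant name).
REQUIRED_COLUMNS = {
    "edge": ("node1", "label", "node2"),
    "node": ("id",),
}


def missing_required_columns(output_columns, file_type):
    """List the required columns for `file_type` that are absent from the output."""
    return [name for name in REQUIRED_COLUMNS[file_type]
            if name not in output_columns]


# The trimmed output above keeps only node1 and location.
print(missing_required_columns(["node1", "location"], "edge"))
# ['label', 'node2']
```

A non-empty result corresponds to the case where the output is a quasi-KGTK file rather than a valid KGTK edge file.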