Lesson 0: Exploring qsv help messages and syntax
Listing all commands
This may be your first time using qsv, so let’s see what it has to offer. We’ll run qsv with the --list flag:
qsv --list
Installed commands (61):
apply Apply series of transformations to a column
behead Drop header from CSV file
cat Concatenate by row or column
clipboard Provide input from clipboard or output to clipboard
count Count records
datefmt Format date/datetime strings
dedup Remove redundant rows
describegpt Infer extended metadata using a LLM
diff Find the difference between two CSVs
edit Replace a cell's value specified by row and column
enum Add a new column enumerating CSV lines
excel Exports an Excel sheet to a CSV
exclude Excludes the records in one CSV from another
explode Explode rows based on some column separator
extdedup Remove duplicates rows from an arbitrarily large text file
extsort Sort arbitrarily large text file
fetch Fetches data from web services for every row using HTTP Get.
fetchpost Fetches data from web services for every row using HTTP Post.
fill Fill empty values
fixlengths Makes all records have same length
flatten Show one field per line
fmt Format CSV output (change field delimiter)
foreach Loop over a CSV file to execute bash commands
frequency Show frequency tables
geocode Geocodes a location against the Geonames cities database.
headers Show header names
help Show this usage message
index Create CSV index for faster access
input Read CSVs w/ special quoting, skipping, trimming & transcoding rules
join Join CSV files
joinp Join CSV files using the Pola.rs engine
json Convert JSON to CSV
jsonl Convert newline-delimited JSON files to CSV
lens View a CSV file interactively
luau Execute Luau script on CSV data
partition Partition CSV data based on a column value
pro Interact with the qsv pro API
prompt Open a file dialog to pick a file
pseudo Pseudonymise the values of a column
rename Rename the columns of CSV data efficiently
replace Replace patterns in CSV data
reverse Reverse rows of CSV data
safenames Modify a CSV's header names to db-safe names
sample Randomly sample CSV data
schema Generate JSON Schema from CSV data
search Search CSV data with a regex
searchset Search CSV data with a regex set
select Select, re-order, duplicate or drop columns
slice Slice records from CSV
snappy Compress/decompress data using the Snappy algorithm
sniff Quickly sniff CSV metadata
sort Sort CSV data in alphabetical, numerical, reverse or random order
sortcheck Check if a CSV is sorted
split Split CSV data into many files
sqlp Run a SQL query against several CSVs using the Pola.rs engine
stats Infer data types and compute summary statistics
table Align CSV data into columns
tojsonl Convert CSV to newline-delimited JSON
to Convert CSVs to PostgreSQL/XLSX/SQLite/Data Package
transpose Transpose rows/columns of CSV data
validate Validate CSV data for RFC4180-compliance or with JSON Schema
sponsored by datHere - Data Infrastructure Engineering (https://qsv.datHere.com)
Here we see a list of commands with a brief description of each.[1]
Viewing a command’s help message
You may view a command’s help message by running:
qsv <command> --help
For example, you can run the following to get the help message for the headers command:
qsv headers --help
Prints the fields of the first row in the CSV data.
These names can be used in commands like 'select' to refer to columns in the
CSV data.
Note that multiple CSV files may be given to this command. This is useful with
the --intersect flag.
For examples, see https://github.com/jqnatividad/qsv/blob/master/tests/test_headers.rs.
Usage:
qsv headers [options] [<input>...]
qsv headers --help
headers arguments:
<input>... The CSV file(s) to read. Use '-' for standard input.
If input is a directory, all files in the directory will
be read as input.
If the input is a file with a '.infile-list' extension,
the file will be read as a list of input files.
If the input are snappy-compressed files(s), it will be
decompressed automatically.
headers options:
-j, --just-names Only show the header names (hide column index).
This is automatically enabled if more than one
input is given.
-J, --just-count Only show the number of headers.
--intersect Shows the intersection of all headers in all of
the inputs given.
--trim Trim space & quote characters from header name.
Common options:
-h, --help Display this message
-d, --delimiter <arg> The field delimiter for reading CSV data.
Must be a single character. (default: ,)
Usually you’ll find a similar structure for other qsv commands:
Description about the command
More details
Examples and/or a link to them
Usage format
Subcommands[2]
Arguments
Options (flags)
Displaying headers of a CSV
Let’s try viewing the headers in the fruits.csv file located in lessons/0. Based on the command format in the “Usage” section of the help message for qsv headers, we’ll run:
qsv headers fruits.csv
1 fruit
2 price
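The options listed in the headers help message follow the same usage format. The sketch below is illustrative only: it assumes qsv may or may not be on your PATH, and it writes a hypothetical stand-in for fruits.csv to a temporary directory (only the header names fruit and price come from the output above; the data row is made up). No expected qsv output is shown, since it depends on your installed version.

```shell
# Hypothetical sample file: only the two header names come from the lesson output.
workdir="$(mktemp -d)"
printf 'fruit,price\napple,1.00\n' > "$workdir/fruits.csv"

# The header row is simply the first line of the file.
header_line="$(head -n 1 "$workdir/fruits.csv")"
echo "first line: $header_line"

# Run the qsv calls only when qsv is actually installed.
if command -v qsv >/dev/null 2>&1; then
  # Default output: column index followed by header name.
  qsv headers "$workdir/fruits.csv"
  # --just-names (-j): header names without the column index.
  qsv headers --just-names "$workdir/fruits.csv"
  # --just-count (-J): only the number of headers.
  qsv headers --just-count "$workdir/fruits.csv"
else
  echo "qsv not found on PATH; skipping the qsv headers calls"
fi
```

Comparing the first line of the file with the default qsv headers output shows that the command reads nothing beyond the first row.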
Recap
In this lesson we’ve covered how to:
List all available qsv commands with qsv --list
View the help message for an individual command with qsv <command> --help
Interpret the parts of a command help message
Run a command on an arbitrary CSV file, getting the headers with qsv headers <filepath>
Now it’s your turn to take on the first exercise.
Exercise 0: Total rows
Using a qsv command, get the total number of rows that are in the fruits.csv file.
Hint
The count command may be useful for this exercise. Make sure to learn how qsv count determines the row count in order to complete this exercise as intended.
Solution
As with the solutions to upcoming exercises, there may be many ways to solve an exercise with qsv. One solution is to run:
qsv count fruits.csv --no-headers
And the output should be:
4
Why not 3?
The exercise requires finding the total number of rows in fruits.csv. As described in the help message for qsv count (you may run qsv count -h to get the help message):
Note that the count will not include the header row (unless --no-headers is given).
If you run qsv count fruits.csv, you should see 3 as the output in your terminal. Running it again with the --no-headers flag (or -n for short) gives the correct total of 4 rows, which includes the header row (the first row in the CSV file).
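It may sound unusual that using the --no-headers flag causes the header row to be included in the row count. The flag tells qsv not to treat the first row as a header, so that row is counted like any other record. You may share any ideas for improvements to qsv on qsv’s GitHub discussions.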