csvkit [1] is a collection of command-line utilities that perform simple manipulations and statistics on CSV files and can also import data into SQL databases ad hoc.

Here I show how one can easily do simple statistics, and also how one can use sparklines [2] for quick plotting in the terminal. For illustration purposes, I use the European Union’s daily COVID-19 statistics [3].

csvkit can do much more; go and explore.

$ wget https://opendata.ecdc.europa.eu/covid19/nationalcasedeath_eueea_daily_ei/csv/data.csv
$ csvsql --db sqlite:///covid19.db --insert data.csv
$ csvcut -n data.csv
  1: dateRep
  2: day
  3: month
  4: year
  5: cases
  6: deaths
  7: countriesAndTerritories
  8: geoId
  9: countryterritoryCode
 10: popData2020
 11: continentExp
$ echo 'select year,month,sum(cases) from data group by year,month order by year,month;' | sqlite3 covid19.db
2021.0|7.0|1920799.0
2021.0|8.0|2475308.0
2021.0|9.0|2532891.0
2021.0|10.0|6016018.0
2021.0|11.0|11312325.0
2021.0|12.0|14230992.0
2022.0|1.0|36056524.0
2022.0|2.0|23847825.0
2022.0|3.0|23225316.0
2022.0|4.0|15142428.0
2022.0|5.0|6717141.0
2022.0|6.0|2574159.0
$ echo 'select sum(cases) from data group by year,month order by year,month;' | sqlite3 covid19.db | sparklines -n3
      █
     ▁██▇▂
▁▁▁▄▇█████▄▁_

Note (Sun 19 Jun 2022 01:47:22 PM CEST): Sparklines look much better in the terminal; my code block stylesheet doesn’t render them well enough.
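
For readers who want to see what the two steps above amount to, here is a minimal sketch in plain Python that uses only the standard library. It is not how csvkit or the sparklines package work internally: the sample rows are made up for illustration (they are not taken from the ECDC file), and `monthly_totals` and `sparkline` are hypothetical helper names. The query groups by both year and month, so that the same calendar month of two different years is never summed together, and the sparkline is a simple single-line linear scaling onto eight block characters.

```python
import sqlite3

# Hypothetical sample rows (year, month, cases); NOT taken from the ECDC data.
ROWS = [
    (2021, 11, 100),
    (2021, 11, 50),
    (2021, 12, 300),
    (2022, 1, 800),
    (2022, 1, 400),
    (2022, 2, 600),
]

BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def monthly_totals(rows):
    """Sum cases per (year, month) using an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.execute("create table data (year integer, month integer, cases integer)")
    con.executemany("insert into data values (?, ?, ?)", rows)
    result = con.execute(
        "select year, month, sum(cases) from data "
        "group by year, month order by year, month"
    ).fetchall()
    con.close()
    return result


def sparkline(values):
    """Scale each value linearly onto one of the eight block characters."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - lo) * top // span)] for v in values)


totals = monthly_totals(ROWS)
for year, month, cases in totals:
    print(f"{year}|{month}|{cases}")
print(sparkline([cases for _, _, cases in totals]))
```

With the sample rows above, this prints four monthly totals (150, 300, 1200, 600) followed by the line `▁▂█▄`. For real work, the csvsql and sparklines commands shown earlier are the more convenient route; the sketch is only meant to make the logic visible.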

  1. “csvkit: A suite of utilities for converting to and working with CSV, the king of tabular file formats.”; The csvkit team; URL: https://github.com/wireservice/csvkit 

  2. “sparklines”; Gherman, Dinu; URL: https://pypi.org/project/sparklines/ 

  3. “Data on the daily number of new reported COVID-19 cases and deaths by EU/EEA country”; European Centre for Disease Prevention and Control; URL: https://www.ecdc.europa.eu/en/publications-data/data-daily-new-cases-covid-19-eueea-country