Today I discovered a command-line tool that lets you run SQL queries against a CSV file. Its name is very hard to search for on Google, let alone click on as a link: q. Mwehehe.
I used it recently to clean a CSV file of shipping information that had invalid postcodes:
q -d, -O -H "SELECT DISTINCT * FROM shipping/data.csv WHERE year = 2015 AND postcode != 'Invalid'" > shipping/data2015.csv
-d is the field delimiter, in this case a comma.
-O keeps the header line in the output.
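-H tells q that the input has a header row, so that row is skipped as data and used for the column names (which is why the query can refer to year and postcode).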
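If you want to see what that query does without installing q, here is a rough Python sketch of the same filter using only the standard library. The sample rows, the `item` column, and the `filter_shipping` name are all made up for illustration, and this is not a claim about how q works internally; it just runs the same SELECT against an in-memory SQLite table.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for shipping/data.csv.
SAMPLE_CSV = """year,postcode,item
2015,3000,book
2015,3000,book
2015,Invalid,lamp
2014,2000,desk
2015,4000,chair
"""


def filter_shipping(csv_text):
    """Return the header plus the distinct 2015 rows with a valid postcode."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    # Assumed schema: year is numeric, postcode stays text so that
    # values such as 'Invalid' (or leading zeros) are preserved.
    conn.execute("CREATE TABLE shipping (year INTEGER, postcode TEXT, item TEXT)")
    conn.executemany("INSERT INTO shipping VALUES (?, ?, ?)", data)

    result = conn.execute(
        "SELECT DISTINCT * FROM shipping "
        "WHERE year = 2015 AND postcode != 'Invalid'"
    ).fetchall()
    conn.close()

    # Put the header first, like -O does for q's output.
    return [tuple(header)] + result


if __name__ == "__main__":
    for row in filter_shipping(SAMPLE_CSV):
        print(",".join(str(value) for value in row))
```

The duplicate 2015 row collapses because of DISTINCT, and the 2014 and 'Invalid' rows are dropped by the WHERE clause, so only the header and two data rows remain.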