
The `join` command does a similar job and is included by default on at least Mac OS X (10.14) and Ubuntu 16.04.

I think it's much more common than that implies, but I have not looked beyond the two machines I had immediate access to.

The interface is clumsier in some ways, but it's already there, which is often a win when writing scripts.



xsv author here.

There is very little overlap between what xsv does and what standard Unix tools like `join` do. Chances are, if you're using xsv for something like this, then you probably can't correctly use `join` to do it because `join` does not understand the CSV format.

If your CSV data happen to fall into the subset of the CSV format that does not include escaped field separators (or record separators), then a tool like `join` could work. Notably, this might include TSV (tab separated files) or files that use the ASCII field/record separators (although I have literally never seen such a file in the wild). But if it's plain old comma separated values, then using a tool like `join` is perilous.
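To make the failure mode concrete, here is a minimal sketch in Python (not xsv, and not how `join` is implemented). It compares a line-oriented split on commas, which is roughly the view a tool like `join -t,` has, against a CSV-aware parse using the standard library `csv` module. The sample data is made up for illustration.

```python
import csv
import io

# Hypothetical sample data: a header row and two records.
# Record 1 has a comma inside a quoted field; record 2 has a
# newline inside a quoted field.
data = 'id,name,note\n1,"Smith, Jane",ok\n2,Bob,"line one\nline two"\n'

# Line-oriented view: split on newlines, then on commas.
naive = [line.split(",") for line in data.splitlines()]

# CSV-aware view: the parser honors quoting, so separators inside
# quoted fields are treated as data.
parsed = list(csv.reader(io.StringIO(data)))

print(len(naive), [len(r) for r in naive])    # 4 [3, 4, 3, 1]
print(len(parsed), [len(r) for r in parsed])  # 3 [3, 3, 3]
```

The line-oriented view sees four "records" with inconsistent field counts, while the CSV parser sees three records of three fields each. Any key-based join built on the first view can silently misalign columns.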

I didn't write xsv out of ignorance of standard line oriented tools. I wrote it specifically to target problems that cannot be solved by standard line oriented tools.

You might also argue that data should not be formatted in such a way, and philosophically, I don't necessarily disagree with you. But xsv is not a philosophical tool. It is a practical tool to deal with the actual data you have.


In case folks don't know, burntsushi is also the author of ripgrep (rg), written in Rust, and possibly the fastest grep in the world at the moment.

rg has all but replaced grep for me.


>burntsushi is also the author of ripgrep

They are also the source of much highly practical domain-specific knowledge and advice on how to write Rust code that does, e.g., efficient text file reading, streaming, state machines, etc.

Many thanks for these contributions.


And that advice applies further than just Rust, too. Highly recommend checking out their blog and work!


@burntsushi ... Sir .. need to say this.. many thanks for your tools! They make my life so clean and so manageable. Data wrangling on the command line with your tools is such a great experience (minimalistic and efficient).

Is there anything for `json` files that you would recommend? `jq` is awesome but just wondering.


Thanks! And no, I just use `jq`. I do often have to consult its man page for its DSL syntax, but probably because I don't use it frequently enough.


Thanks for the info. That is definitely good to have spelled out.

I don't believe I ever said that xsv shouldn't exist, or has no legitimate purpose, or was written out of ignorance - just that join is useful for similar tasks and worth knowing about, particularly because it's usually available without an install or compile.

You've obviously done a lot more CSV wrangling than I have, and the info here on what exactly xsv does that join cannot is helpful.

Thanks for sharing!


Xsv is excellent. It’s included in this list of “awesome csv” tools I’ve been collating, because I am inordinately fond of csv: https://github.com/secretGeek/AwesomeCSV/blob/master/README....


The join command has been around for over 30 years!

See p. 7 of this PDF (which contains SunOS 4.1.2 manual pages): http://www.bitsavers.org/pdf/sun/sunos/4.1.2/800-6641-10_Ref...

The short description it gives for the command is "relational database operator".

(The "Last change: 4 March 1988" at the bottom of the page refers to the last change of the intro(1) manual page, not the whole book.)


The problem most of the time is correctly parsing the CSV files.
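As a rough illustration of parsing first and joining second, here is a small Python sketch of an inner join on a named column. It assumes both inputs fit in memory and have header rows; `inner_join` and the sample data are hypothetical names made up for this example, and this is not a description of how xsv or `join` work internally.

```python
import csv
import io

def inner_join(left_text, right_text, key):
    """Inner-join two CSV documents on a shared column name."""
    left = list(csv.DictReader(io.StringIO(left_text)))
    right = list(csv.DictReader(io.StringIO(right_text)))

    # Index the right-hand rows by key so each left row is a dict lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            merged = dict(lrow)
            merged.update(rrow)  # right-hand columns win on name clashes
            joined.append(merged)
    return joined

# Hypothetical inputs with commas inside quoted fields.
people = 'id,name\n1,"Smith, Jane"\n2,Bob\n'
cities = 'id,city\n1,"Portland, OR"\n3,Austin\n'

rows = inner_join(people, cities, "id")
print(rows)
# [{'id': '1', 'name': 'Smith, Jane', 'city': 'Portland, OR'}]
```

Because the parser handles quoting before the join sees any keys, the embedded commas in both files stay inside their fields.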


This command is even covered in LPIC-1.



