
I’m about to turn 40 and have been wondering whether I should learn Perl or AWK or go with a “modern” solution for text wrangling on the command line.

The general advice here seems to be “learn Perl while young”



What's the modern solution to text processing on the command line?

I don't think anyone tried seriously addressing that use case after Perl. Like, obviously you can do text processing in any language, but you're not going to be doing it in the context of shell pipelines and one-liners. The preferred interaction modes are totally different.


I've been disappointed to see a massive uptick in people writing a whole Python subprocess script instead of just an awk column select. They usually know awk, but Python is more familiar so they replace ten characters with a couple hundred.

It seems like fewer even want to learn a shell pipeline.
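To make the size difference concrete, here is a rough sketch of what the Python stand-in for an awk column select such as `awk '{print $2}'` tends to look like. The name `select_column` and the sample input are made up for illustration, and the splitting only roughly matches awk's default field handling.

```python
SAMPLE = "foo 1\nbar 2\nbaz 3\n"  # illustrative input; swap in sys.stdin for a real pipeline


def select_column(lines, index):
    """Yield the whitespace-separated field at 1-based `index`, roughly like awk's $index."""
    for line in lines:
        fields = line.split()
        # Like awk, fall back to an empty string when the line has too few fields.
        yield fields[index - 1] if len(fields) >= index else ""


if __name__ == "__main__":
    for value in select_column(SAMPLE.splitlines(), 2):
        print(value)
```

Even stripped down like this, it is several lines against a dozen or so characters of awk, and that is before any subprocess plumbing.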


There were actually attempts to recreate Awk inside Python. See https://github.com/alecthomas/pawk


That was bad phrasing on my part, I meant "Is there a modern alternative I should go with instead?" - I also don't know what it would be.

But I would love to know if there is a non-regular expression language with native support for csv, json and yaml that one can pipe files in and out of.
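For reference, this is roughly what the CSV-to-JSON half of that looks like with only the Python standard library (which ships `csv` and `json` but no YAML module). It is a sketch, not a recommendation: the function name `csv_to_json_lines` and the sample data are assumptions for illustration.

```python
import csv
import io
import json


def csv_to_json_lines(stream):
    """Yield one JSON object per CSV row, keyed by the header row."""
    for row in csv.DictReader(stream):
        yield json.dumps(row)


if __name__ == "__main__":
    # Illustrative input; pass sys.stdin instead to use this as a pipeline filter.
    sample = io.StringIO("name,count\nfoo,1\nbar,2\n")
    for line in csv_to_json_lines(sample):
        print(line)
```

Note that every value comes out as a string, since `csv` does no type inference.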


Not exactly what you're looking for, but you'd probably like looking at miller / mlr. https://github.com/johnkerl/miller


> The general advice here seems to be “learn X while young”

I generalised your statement for you. When people learn anything in their youth, they tend to think of it as "better | easier | True".

It's true for the things we call culture, language, religion, and relatively simple phenomena such as cuisine and programming languages.

We form a personal canon from the first cognitive imprint. It is hard for people to change even when confronted with bias. The programming community is no exception.*

* pun intended


Perl is a superset of all things. So go with Perl.

>>The general advice here seems to be “learn Perl while young”

The best time to plant a tree was 20 years ago. The next best time is now.

- A Chinese Saying.


On the origin of that phrase:

https://english.stackexchange.com/questions/603690/origins-o...

Seems like it _may_ not be Chinese. I've often seen it attributed as Chinese. Funny how that sticks.


I thought it was Uncle Iroh.


Learn AWK. It's tremendously simpler than Perl, and even more omnipresent. In a week or two of study and tinkering you'll be an expert.


Awk is too specialized. Perl is amazing at text processing AND also amazing at much more. The basics of Perl needed for awk-like functionality are really not much to learn. Skip awk and learn something more useful.

This works because perl is layered. To do even fairly elaborate awk-like text processing you only need to learn a little perl. Nothing like all of perl. To do the rest (of elaborate text processing), you learn a little more. In the end, most likely you never need to learn all of perl. But it's there to add to your toolbox as needed as you go.


awk is purpose-built, lightweight, and perfect for line-oriented tasks. Perl can do everything awk does and much more, but that power comes with complexity. If you just need text processing, awk is simpler and sufficient. If you need more than text processing, consider modern alternatives like Python that are easier to maintain and collaborate with.


I've used awk for a lot of weird shit over the last ~30 years and while I agree it's much more straightforward and convenient than Perl for most simple tasks, "expert in a week or two" is a no.


Perl was 1 line vs. 100 lines in everything else, in text processing. Now it is 1 line vs. 2-3 lines that are clearer and have stricter semantics. It also has support for running one-liners straight from the CLI. You probably don’t want to learn it today, as it won’t make that big of a difference.


What's the 2-3 line alternative to perl you are referring to?

I find piping into python to be a lot more than 2-3 lines before I'm even ready to do any manipulation of input, which again can get quite verbose. So I'm guessing it is not python.


Ruby? It has perlisms and awkisms around, here's a stupid example:

    echo -e "foo 1\nbar 2\nbaz 3" | ruby -n -e 'BEGIN { puts "===" }; $_ =~ /^b\S+ (.*)/ and puts "#{$_.chomp.upcase} => #{$1}"; END { puts "===" }'

There's -p too, -0777 works for slurp mode, throw in -rjson to get battery-included pretty_generate, interpolation can be nicer, but $_ is not implicitly used so it can get a bit more verbose than Perl.
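For comparison with the Python complaint upthread, here is a sketch of the same filter in Python. It is meant to mirror the Ruby one-liner (banner, regex match, uppercase line plus captured group, banner); the name `transform` and the sample input are illustrative only.

```python
import re

# Same pattern as the Ruby example: lines starting with "b", capture everything after the first space.
PATTERN = re.compile(r"^b\S+ (.*)")


def transform(lines):
    """Return the output lines, wrapped in '===' banners like the BEGIN/END blocks."""
    out = ["==="]
    for line in lines:
        line = line.rstrip("\n")
        match = PATTERN.match(line)
        if match:
            out.append(f"{line.upper()} => {match.group(1)}")
    out.append("===")
    return out


if __name__ == "__main__":
    # Illustrative input matching the echo above; use sys.stdin in a real pipeline.
    print("\n".join(transform(["foo 1\n", "bar 2\n", "baz 3\n"])))
```

Readable enough, but clearly not something you would type at a prompt.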


I meant average line ratio, not literal one-liners, apologies for confusion.


When you're talking about doing things interactively (in a shell, either directly via a terminal or indirectly via an editor) then the constant factors matter far more than the asymptotics. This has always been perl's target for optimization. That's why it has a million obscure operators that no other language has.


In my mind Perl was the Rust of the 90s. Perl evangelists were everywhere and wanted to rewrite everything in one line of Perl.


I think you would really have needed to learn Perl in the 1990s.

Perl had three huge advantages (at least for me) when I first used it in 1992: 1: familiar syntax which paralleled well-loved tools (sed, troff, grep) 2: associative arrays as a built-in 3: regular expressions as a top-level language feature (not buried in some library with clunky syntax). A few years later, CPAN was another huge advantage.

Nowadays, syntax compatibility with sed is a disadvantage not advantage, associative arrays/dictionaries exist or are easily available for basically every language, and CPAN was the model that everybody else copied or did better. So really the "regexp syntax is immediately available" is the only remaining advantage, and it's rare that it would be worth it.


Awk is a tiny language that you could learn in an afternoon. The real trick is remembering how to use Awk when you need it.


The things you need from Awk you can also find in Perl and learn in an afternoon. Whether you need anything else from Perl is an independent question.


Go with AWK. When your AWK code gets too complicated, that's the signal to switch to something more documentable like Python, Ruby, etc.


> The general advice here seems to be “learn Perl while young”

I think what people are actually saying is: "Perl was great in certain regards, back when I could expect my colleagues to understand Perl, which incidentally was when I was young"

It may take many more lines to apply a regex in any other language - but if my colleagues know Python but not Perl, saving 10 minutes of coding will cost me 5 hours of training/coaching/debugging after the code has been 'maintained' by someone who doesn't know their $_ from their ->@*


Depends on what you mean by "on the command line".

I personally find python most useful to learn, especially in combination with invoking a real editor to edit multi-line commands:

bash has C-x C-e and the `fc` builtin, other shells have equivalents.

https://www.gnu.org/software/bash/manual/bash.html#index-edi...

I have used awk extensively, and it falls short when mid-size problems grow into larger projects.

I have also used perl quite a bit, and if you love sigils you can become very, very happy with perl. I personally never warmed to sigils; I have developed sigil-itis, so to speak.



