What's the modern solution to text processing on the command line?
I don't think anyone tried seriously addressing that use case after Perl. Like, obviously you can do text processing in any language, but you're not going to be doing it in the context of shell pipelines and one-liners. The preferred interaction modes are totally different.
I've been disappointed to see a massive uptick in people writing a whole Python subprocess script instead of just an awk column select. They usually know awk, but Python is more familiar so they replace ten characters with a couple hundred.
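To illustrate the kind of "ten characters" the comment means (the input data here is made up), an awk column select is a single pipeline stage:

```shell
# Print the second whitespace-separated column of each line.
printf 'alice 42\nbob 7\n' | awk '{print $2}'
# prints "42" then "7"
```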
It seems like fewer even want to learn a shell pipeline.
That was bad phrasing on my part, I meant "Is there a modern alternative I should go with instead?" - I also don't know what it would be.
But I would love to know if there is a non-regular expression language with native support for csv, json and yaml that one can pipe files in and out of.
> The general advice here seems to be “learn X while young”
I generalised your statement for you. When people learn anything in their youth, they tend to think of it as "better | easier | True".
It's true for the things we call culture, language, religion, and relatively simple phenomena such as cuisine and programming languages.
We form a personal canon from that first cognitive imprint, and it is hard for people to change even when the bias is pointed out. The programming community is no exception.
Awk is too specialized. Perl is amazing at text processing AND also amazing at much more. The basics of perl for basic awk-like functionality are really not much to learn. Skip awk and learn something more useful.
This works because perl is layered. To do even fairly elaborate awk-like text processing you only need to learn a little perl. Nothing like all of perl. To do the rest (of elaborate text processing), you learn a little more. In the end, most likely you never need to learn all of perl. But it's there to add to your toolbox as needed as you go.
awk is purpose-built, lightweight, and perfect for line-oriented tasks. Perl can do everything awk does and much more, but that power comes with complexity. If you just need text processing, awk is simpler and sufficient. If you need more than text processing, consider modern alternatives like Python that are easier to maintain and collaborate on.
I've used awk for a lot of weird shit over the last ~30 years and while I agree it's much more straightforward and convenient than Perl for most simple tasks, "expert in a week or two" is a no.
For text processing, Perl used to be 1 line vs. 100 lines in everything else. Now it is 1 line vs. 2-3 lines that are clearer and have stricter semantics. It also has support for running one-liners straight from the CLI. You probably don't want to learn Perl today, as it won't make that big of a difference.
What's the 2-3 line alternative to perl you are referring to?
I find piping into python to be a lot more than 2-3 lines before I'm even ready to do any manipulation of input, which again can get quite verbose. So I'm guessing it is not python.
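For comparison, here is a hypothetical Python equivalent of the one-character-class awk select, which gives a sense of the extra ceremony the comment is describing:

```shell
# Same job as awk '{print $2}', via python3 -c with an inline script.
printf 'alice 42\nbob 7\n' | python3 -c '
import sys
for line in sys.stdin:
    print(line.split()[1])
'
# prints "42" then "7"
```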
There's Ruby's -p too, -0777 works for slurp mode, throw in -rjson to get the batteries-included pretty_generate, and interpolation can be nicer, but $_ is not used implicitly, so it can get a bit more verbose than Perl.
When you're talking about doing things interactively (in a shell, either directly via a terminal or indirectly via an editor) then the constant factors matter far more than the asymptotics. This has always been perl's target for optimization. That's why it has a million obscure operators that no other language has.
I think you would really have needed to learn Perl in the 1990s.
Perl had three huge advantages (at least for me) when I first used it in 1992:

1. familiar syntax which paralleled well-loved tools (sed, troff, grep)
2. associative arrays as a built-in
3. regular expressions as a top-level language feature (not buried in some library with clunky syntax)

A few years later, CPAN was another huge advantage.
Nowadays, syntax compatibility with sed is a disadvantage not advantage, associative arrays/dictionaries exist or are easily available for basically every language, and CPAN was the model that everybody else copied or did better. So really the "regexp syntax is immediately available" is the only remaining advantage, and it's rare that it would be worth it.
> The general advice here seems to be “learn Perl while young”
I think what people are actually saying is: "Perl was great in certain regards, back when I could expect my colleagues to understand Perl, which incidentally was when I was young"
It may take many more lines to apply a regex in any other language - but if my colleagues know Python but not Perl, saving 10 minutes of coding will cost me 5 hours of training/coaching/debugging after the code has been 'maintained' by someone who doesn't know their $_ from their ->@*
I have used awk extensively, and it falls short when mid-size problems grow into larger projects.
I also have used perl quite a bit, and if you love the sigil you can become very, very happy with perl. I personally never warmed to sigils; I have developed sigil-itis, so to speak.
The general advice here seems to be “learn Perl while young”