Kinda odd that this is written in Python, and not Perl. If it were written in Perl, it'd have the portability that the underlying translated commands have!
I'd be happy to help you port it to modern, high-quality, readable Perl if that's something you're interested in. Being able to install this by dropping a single Perl file into ~/bin/ would be neat.
@bihla take this man up on his offer :-). I thought it was an April Fools' joke where it talks about Perl and then implements everything in Python. But no, it's actually Python. If you (peteretep) do port it to Perl, please post a Show HN :-)
I agree, if it were entirely Perl it would have essentially no dependencies. Right now the audience is limited to those with a currently functioning Python3 environment or willing to configure one.
My only concerns are:
I do not know what the process of getting comparable tab completion would be in Perl. I depended on the argcomplete package for this feature which I consider to be essential...the clarity of the syntax depends on being a bit verbose, and this is only usable because tab completion makes it fast to use.
Also, I'm somewhat reluctant to develop in Perl, though I'm willing to do it provided the benefit + demand is there. It would be ironic to have learned Perl as a consequence of building a tool whose primary function is to prevent one having to learn Perl.
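For anyone curious what the argcomplete dependency looks like in practice, here is a minimal sketch. The program name `bsed-sketch` and the three-argument grammar are made up for illustration and are not bsed's actual parser; the import is guarded so the script still runs where argcomplete is not installed.

```python
import argparse

try:
    import argcomplete  # third-party package; optional in this sketch
except ImportError:
    argcomplete = None


def build_parser():
    # Hypothetical, simplified grammar: "<file> <command> <pattern>".
    parser = argparse.ArgumentParser(prog="bsed-sketch")
    parser.add_argument("file")
    parser.add_argument("command", choices=["replace", "delete", "append"])
    parser.add_argument("pattern")
    return parser


def main(argv=None):
    parser = build_parser()
    if argcomplete is not None:
        # Answers shell completion requests; returns immediately on a normal run.
        argcomplete.autocomplete(parser)
    return parser.parse_args(argv)
```

The shell side still has to be registered once (argcomplete ships a `register-python-argcomplete` helper for that), which is part of the install friction a single-file Perl script would also have to solve in its own way.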
I haven't used the following (I've simply never had to implement shell autocompletion, as I try to avoid the shell as much as I can, preferring to organize my code in actual files that I can reuse from my editor), but I guess the best start would be to include significant parts of this code in your final script, if the goal is a single, easily usable script that does everything, as somebody here mentioned as the ideal (still, take care to keep the same or a compatible license):
You can surely omit the lines between =pod and =cut (they are used to create the documentation) and most probably also the "description" parts.
> It would be ironic to have learned Perl as a consequence of building a tool whose primary function is to prevent one having to learn Perl.
It's less ironic than you think. I consider it the bigger irony that Python turned out to be the major compatibility problem in all but single-user, single-program environments (it's really not trivial to work with multiple Python variants as dependencies for different code). You can be sure that any Perl 5 skills you obtain will remain quite portable and usable across different environments.
As I've also written here before, I consider Perl a significantly "safer" language than Python: when I modify big Perl scripts, the Perl compiler (when used with "use strict" and the -w option) will very often tell me early enough what is missing for the program to function correctly again, as Perl is simply more explicit, in some weird way almost like the typed languages. I never have the same feeling of "it will work" when modifying Python scripts.
That "line noise" is actually meaningful, much more consistent than the shell languages, and safer under changes than Python.
Forgot to say it in the other comment but regardless of whether we proceed with this, thank you for offering to help. It means a lot to get a positive response from people, especially to the extent that individuals will contribute to the project.
Perl aside, I would say it's almost imperative to learn sed and awk. They are invaluable tools for getting things done in *nix. They seem cryptic at first, but put a weekend into learning them and you learn that they are simply concise.
And then you have a script that needs to run on a Mac as well, and you can either fight to get people to install the GNU version or just rewrite your one-liner in Perl.
I'm not a huge fan of Perl, but I've been writing my one-liners in it for a few years now and I've had fewer problems with portability.
I would use another language for distributing a script (probably Python). Awk and sed I use more on a one-off basis, although I do have a list of ones I use frequently.
This is cool! I’ve wanted to explore something similar but incorporating a broad range of Unix commands.
I imagined a drag and drop interface (sort of like the Scratch programming language) with English descriptors of the functions being performed. You’d make it a web UI or something that generates a bash script you can paste into the terminal.
some notes on the sample awk/sed/perl one-liner given:
# input line has to be explicitly printed
awk '{gsub(/Jack/,"Jill")} 1' file.txt
# -i will do inplace editing, unlike the awk command
# -i by itself won't work on non-GNU versions, needs backup extension
sed 's/Jack/Jill/g' file.txt
# use single quotes always, unless double is needed
# -p will behave like default sed
perl -pe 's/Jack/Jill/g' file.txt
personally, I prefer the terseness of these commands over verbose SQL like syntax (and also the fact that I don't know SQL like tools)
However, I would agree that initial learning curve is tough for sed/awk/perl. Once you get familiar with their idioms, they become the swiss army knife of cli text processing (along with grep). I have an entire repo dedicated to such tools[1]
That looks very nice, thank you for sharing/writing it!
Are you responsible for updating the Cargo package? It's the first time I've used Cargo and was wondering how often I should re-run it or check for updates?
You can run cargo install-update -a every now and then to check for updates. At some point in a few weeks I hope to have binaries distributed with every release, which means there will be better/faster ways to install sd than cargo.
looks interesting, bookmarked (will possibly add tutorial on this someday)
one question though, sed is not just search and replace, it is filter + search and replace (without going into arcane commands like n,N,x,h,etc) - filtering can be done based on regex, line number, combination of these two for blocks of line, etc.. Does sd support these?
tl;dr it's only find & replace with smart defaults and easy, straightforward syntax because I hated wrangling with sed's quirks. For more features, you're better off with sed. But if you're like me and only really use sed for find/replace, you will have a good time with sd.
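To make the "filter + search and replace" distinction concrete, here is a rough Python sketch of what sed's addresses add on top of plain substitution. The function name and parameters are invented for illustration, and Python's `re` syntax differs from sed's BRE/ERE, so this only mirrors the idea of `sed '/regex/s/a/b/g'` and `sed '2,3s/a/b/g'`.

```python
import re


def filtered_replace(lines, pattern, replacement, line_filter=None, line_range=None):
    """Replace pattern only on lines selected by a regex and/or a 1-based inclusive range."""
    out = []
    for number, line in enumerate(lines, start=1):
        selected = True
        # Regex address: only touch lines that match the filter.
        if line_filter is not None and not re.search(line_filter, line):
            selected = False
        # Line-number address: only touch lines inside the range.
        if line_range is not None and not (line_range[0] <= number <= line_range[1]):
            selected = False
        out.append(re.sub(pattern, replacement, line) if selected else line)
    return out
```

With no filter and no range, this degenerates to a plain find & replace over every line, which is the subset sd is described as covering.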
Really looking forward to trying this out. The syntax looks very clean & clear. I hate the insanity of sed, and assuming the performance is here, I’d switch to this in a heartbeat.
sed provides only BRE/ERE, which has a lot fewer features than Perl's regexes; however, in my experience speed is better with BRE/ERE than PCRE, except for patterns with backreferences. see also [1]
a simple example:
$ time sed 's/at/AT/g' /usr/share/dict/words > /dev/null
real 0m0.041s
user 0m0.037s
sys 0m0.004s
$ time perl -pe 's/at/AT/g' /usr/share/dict/words > /dev/null
real 0m0.056s
user 0m0.051s
sys 0m0.004s
$ time LC_ALL=C sed -nE '/^([a-z]..)\1$/p' /usr/share/dict/words > /dev/null
real 0m0.049s
user 0m0.048s
sys 0m0.000s
$ time perl -ne 'print if /^([a-z]..)\1$/' /usr/share/dict/words > /dev/null
real 0m0.041s
user 0m0.033s
sys 0m0.007s
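Single runs in the 40-60 ms range sit close to process start-up noise, so repeating each command and keeping the fastest run may give a steadier comparison. A small sketch of that, with `best_time` being a name made up here:

```python
import subprocess
import time


def best_time(argv, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs of argv."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard stdout, like the "> /dev/null" in the timings above.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)
```

For example, `best_time(["sed", "s/at/AT/g", "/usr/share/dict/words"])` against `best_time(["perl", "-pe", "s/at/AT/g", "/usr/share/dict/words"])`, assuming both tools and the word list exist on the machine.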
I'd definitely incorporate it into my projects if it were ported to Go, Rust, or any reasonable redistributable form, and continued to use a BSD license.
Way clearer than cryptic sed and awk one-liners! Big kudos to you for making this and sharing it with the world!
Python apps which are more than a single script are a non-starter for me. It's a big pain to get everyone on my 120+ person dev team to install it properly, they're all over the place with regard to environments. "Oops, I'm on Python 3.4", or python 2.6, or "I use this other python path"; it's inevitably a headache.
> if it were ported to Go, Rust, or any reasonable redistributable form
I still consider Perl, from the portability standpoint, to be the best language for pre-processing a single text line that is to be translated into a Perl one-liner and then executed.
I wish I could disagree, but your point is true. Hopefully soon the python 2 vs 3 disease will be mostly cured, but right now anyone not currently using python3 is unlikely to use bsed given that the value is convenience and simplicity--a value nullified by any install frustrations.
One can have “experience” with something, and not “know” it.
Could just as easily say: “I have experience in scenarios where I must use Python or run its scripts in an environment, and it has been a headache, but I can’t say I dislike the language because I haven’t needed to explicitly create programs with it.”
Sure. But the way yoklov wrote it, I thought he meant that Python is objectively bad for this use case, which you can't know without knowing the language.
This is very cool for simplifying interactive use :) but I'd like to caution people away from immortalizing it in their scripts. sed and awk are both standardized by POSIX and can be found on pretty much any system, and trivially ported to new systems. Porting Python, on the other hand, is comparatively a mammoth proposition, and often impossible on some platforms. For this reason, I'm generally allergic to "new $x replaces $y!" for any $y which is governed by portable standards.
This is something I love about Perl: in an age of many subtly incompatible operating systems, Perl said "F it, we'll do what it takes to compile and run your program everywhere." People don't appreciate this now, when everything is Linux, Mac, or Windows, but back when you regularly had to deal with various BSDs, HPUX, AIX, Mac OS 9, and even the occasional VAX, Perl was a godsend.
Seriously, I see so many ways of re-inventing these wheels, written in the fad languages/systems du jour. Perl works very well for a large subset of tasks that are common.
The same code that ran 25+ years ago (in perl4) on my Sun and Irix machines runs today on my laptops under Linux and Mac OS X. I remember building Perl atop the Cray J90 running Unicos in grad school, as I had built run automation software in it (not a queuing system, but code to encapsulate some of my runs and return the relevant data). I had developed that code on my OS/2-based desktop with Perl.
Perl is a go-to language. It gets stuff done. Without pain.
And even though Perl grew up on and for Unix, there are a couple of distributions that work well on Windows, too, even though I think Perl feels more at home on Unix, where it came from.
That's a good point. I think something like this would work extremely well in manual workflows and personally I can see myself using it from within my text editor, such as vscode.
Scripts, as you said, might be better off with sed/awk.
Edit: Or, as OP said in a sister comment, using -t and using the Perl command in scripts makes a lot of sense.
> bsed giant_malformatted.json replace '\'' with '\"' | bsed replace 'True' with 'true' | bsed replace 'False' with 'false'
bsed giant_malformatted.json replace '\'' that begins or ends a string with '\"' | bsed replace unquoted 'True' with 'true' | bsed replace unquoted 'False' with 'false'
Fair point. I mostly put those commands together to illustrate concepts. Two things you've touched on though that I haven't implemented yet: compound conditionals, and a nice way to indicate word boundaries (as opposed to the regex solution)
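For reference, "the regex solution" for word boundaries amounts to wrapping the literal in `\b` anchors. A minimal Python sketch, where `replace_word` is a hypothetical helper and not part of bsed; note that it only handles word boundaries, so a quoted 'True' would still be rewritten and the quoted/unquoted distinction from the parent comment is not covered.

```python
import re


def replace_word(text, old, new):
    """Replace old only where it stands as a whole word."""
    # re.escape keeps metacharacters in old literal; \b anchors both ends.
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function replacement avoids backslash handling in new.
    return re.sub(pattern, lambda _match: new, text)
```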
Neat, but a little scary because you rely on it to interpret your intentions correctly. Usually when I am trying to do something, I build up my command in multiple steps, printing outputs and verifying it is correct before actually running the command. Maybe you could have a "diff" or "dryrun" mode that would show the potential changes. Another idea would be for this script to "compile" a CLI command using grep/awk/sed etc instead of doing the actual editing itself.
The interpretation is only translating into standard Perl one-liners, and by default the execution writes to stdout instead of modifying the file. I haven't written code to actually do the text transformations.
My normal workflow is to write commands, pipe them together as needed, and when I think I am satisfied I'll store to a temporary file. If I really need to be extra sure then I will use my preferred diff tool to compare against the original before doing the swap.
I may have misinterpreted, but I hope that between this and the unit tests your concerns are alleviated.
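The "compare against the original before doing the swap" step could also be folded into a dry-run mode, as suggested upthread. A rough sketch using only the standard library; `preview_replace` is a name invented here and this is not something bsed is known to provide:

```python
import difflib
import re


def preview_replace(text, pattern, replacement, name="input"):
    """Return a unified diff of what a regex replacement would change."""
    before = text.splitlines(keepends=True)
    after = re.sub(pattern, replacement, text).splitlines(keepends=True)
    diff = difflib.unified_diff(before, after, fromfile=name, tofile=name + " (after)")
    # Empty string means the replacement would change nothing.
    return "".join(diff)
```

Printing the result before writing anything gives the same safety net as the temporary-file workflow, without leaving the command line.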
How does it compare performance-wise to other tools with similar goals? Seeing this is written in 100% python, I suspect it will be an order of magnitude slower.
The python code only carries out translation into Perl. The actual execution of the text transformation is Perl, which is generally the fastest of the available tools.
Perl is actually executing the text transformations. The python is a translation layer between the bsed syntax and the Perl command, so you get the performance of Perl with a very minor time cost for interpretation done in Python.
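As a toy illustration of that translation layer, the sketch below turns the `replace OLD with NEW` phrasing seen elsewhere in this thread into a Perl argv list. The function `to_perl` and its error handling are assumptions made for this example; bsed's real grammar and code are certainly richer.

```python
def to_perl(words):
    """Translate ['replace', OLD, 'with', NEW] into a perl argv list."""
    if len(words) != 4 or words[0] != "replace" or words[2] != "with":
        raise ValueError("expected: replace OLD with NEW")
    # Escape the s/// delimiter so it cannot end the expression early.
    # OLD is still treated as a Perl regex, not as a literal string.
    old = words[1].replace("/", r"\/")
    new = words[3].replace("/", r"\/")
    return ["perl", "-pe", f"s/{old}/{new}/g"]
```

Passing the result plus a filename to `subprocess.run` would have Perl do the actual work and write to stdout, so the Python side costs only the time to build the list.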
Yeah, there's not a lot of point in having the Python layer. Perl excels at these kinds of things. I'm guessing that the original developer just doesn't know Perl well enough to implement that part. Which is not to denigrate his work, but this could easily be a drop-in single-file Perl executable if someone does the actual (not that much) work.
Can you elaborate? The tool doesn't require you to learn/know perl, that's just what operates under the hood. I actually chose to exclude any reference to Perl in the title, though it appears a moderator changed it.