At a previous place I worked, anyone working on the CLI (e.g. in psql or similar) would always use these two steps, either of which provides adequate protection:
1. Start a transaction before even thinking of writing the delete/update/etc (BEGIN; ...)
2. Always write the WHERE query out first, THEN go back to the start of the line and fill out the DELETE/UPDATE/etc.
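In psql, step 1 looks roughly like this (the table name and predicate are made up for illustration); the row count psql reports after the DELETE is your sanity check before anything becomes permanent:

```sql
BEGIN;

-- step 2 in action: the WHERE clause was written first, the DELETE verb added after
DELETE FROM orders WHERE created_at < '2015-01-01';
-- psql prints e.g. "DELETE 1342" here; check that the count is plausible

ROLLBACK;  -- count looks wrong: nothing happened
-- COMMIT; -- count looks right: make it permanent
```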
It worked well, and it's a habit I've since tried to keep on doing myself as well.
Once I wanted to do `rm -fr *~` to delete backup files, but the `~` key didn't register...
Now I have learnt to instinctively stop before doing anything destructive and double-check and double-check again! This also applies to SQL `DELETE` and `UPDATE`!
I know that `-r` was not necessary, but hey, that was a biiiig mistake of mine!
This gives you a static script with a bunch of rm's. You can read it, check it, give it to people to validate, and when you eventually run it, it deletes exactly those files.
I do this too, although I usually do `echo rm *.txt~` and then just remove the `echo` once everything works. It also works well for things like `parallel` and `xargs`, where you can do `parallel echo rm -- ::: *.txt` and it basically prints out a script. Then you remove the `echo` to run it.
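A runnable sketch of the echo-prefix dry run, in a scratch directory with hypothetical backup-file names:

```shell
# scratch directory with some hypothetical backup files
tmp=$(mktemp -d)
cd "$tmp"
touch a.txt~ b.txt~

# dry run: the leading `echo` turns each delete into a printed line
for f in *.txt~; do echo rm -- "$f"; done

# same idea through xargs: this prints a little script you can eyeball
printf '%s\n' *.txt~ | xargs -I{} echo rm -- {}

# once the output looks right, drop the `echo` and the same pipeline deletes for real
printf '%s\n' *.txt~ | xargs -I{} rm -- {}
```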
If you ever type really dangerous commands, it is good practice to prefix them with a space (or whatever your favorite shell convention is) to make sure they are not saved in your history.
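For bash specifically, that convention is opt-in via `HISTCONTROL` (zsh has `setopt HIST_IGNORE_SPACE`); a minimal sketch:

```shell
# bash: don't record commands that start with a space
export HISTCONTROL=ignorespace   # ignoreboth also collapses consecutive duplicates

#  rm -rf ./scratch   <- typed with a leading space, so it never lands in history
```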
One of my "oopsies" involved aggressive <up> and <enter> usage and a previous `rm -rf *` running in /home...
One time I was debugging the path resolver in a static site generator I was writing. I generated a site at ~/foo, thinking it would resolve to /home/shantaram/foo, but instead it created a directory literally named '~' in the current directory. Then I did `rm -rf ~` without thinking. The command took super long, I wondered what was going on, ctrl-c'd in horror... that was fun.
I’m really curious: without cheating by using the GUI, what would be the proper way to delete such an obnoxiously-named directory? Would “$PWD/~” work?
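For what it's worth, tilde expansion only applies to an unquoted leading `~`, so quoting it or prefixing `./` sidesteps the danger entirely. A sketch, reproducing the situation in a scratch directory:

```shell
# reproduce the accident in a scratch dir
tmp=$(mktemp -d)
cd "$tmp"
mkdir -- '~'           # a directory literally named ~

# any of these deletes only the literal ~ dir, never $HOME:
rm -rf -- './~'        # ./ prefix: tilde is not leading, so no expansion
# rm -rf -- '~'        # single quotes also suppress expansion
# rm -rf -- "$PWD/~"   # and yes, "$PWD/~" works: ~ isn't expanded inside quotes
```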
That may be fun in a trivial setup such as op’s, but when millions of customers or billions of transactions are affected it’s a nightmare. A competent engineer runs queries against a local and then a UAT db, verifies the results, and only then runs them on prod. And if you must do it in prod, it must be limited in scope.
We're on the same page with the best approach, I just don't consider corrupting an unpredictable subset of my database much of an improvement. It's not closer to correct, it's just still incorrect.
This can also bite you if your dataset is larger than the buffer pool (or whatever other RDBMS calls it), and the particular table you’re querying isn’t commonly accessed.
Turns out when you start loading millions of rows of useless data into memory, the useful data has to get kicked out, and that makes query latency skyrocket.
Same thing on a Unix/Linux level: When using find, I always do it first with -print until I see the files I want. Only then do I add the actual action I want.
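The preview-first find workflow, sketched in a scratch directory with hypothetical file names:

```shell
# scratch dir with files we want to keep and files we want gone
tmp=$(mktemp -d)
cd "$tmp"
touch keep.txt junk1.tmp junk2.tmp

# step 1: -print only, to see exactly what matches
find . -name '*.tmp' -print

# step 2: once the list looks right, swap in the destructive action
# (keep the tests *before* -delete, or it deletes everything it visits)
find . -name '*.tmp' -delete

ls                     # keep.txt survives
```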
Don’t ask me how I learned this.