>> If we don't want new file we can redirect the output to same file which will overwrite original file
You need to be a little careful with that. If you do:
uniq -u movies.csv > movies.csv
The shell will first open movies.csv for writing (the redirect part) then launch the uniq command connecting stdout to the now emptied movies.csv.
Of course when uniq opens movies.csv for consumption, it'll already be empty. There will be no work to do.
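If you want to see this for yourself without risking real data, here's a small throwaway demo. The scratch directory and the sample rows are made up for illustration; only the `uniq -u movies.csv > movies.csv` shape comes from the example above.

```shell
# Reproduce the truncation in a scratch directory (sample data is invented).
demo_dir=$(mktemp -d)
printf 'Alien\nAlien\nBrazil\n' > "$demo_dir/movies.csv"

# The shell truncates movies.csv for the redirect before uniq ever reads it.
uniq -u "$demo_dir/movies.csv" > "$demo_dir/movies.csv"

# Report the size in bytes of what is left.
wc -c < "$demo_dir/movies.csv"
```

The byte count comes back as zero: the original rows are gone, and uniq never saw them.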
There are a couple of options to deal with this, but the temporary intermediate file is my preference, provided there's sufficient space. It's easily understood: if someone else comes across the construct in your script, they'll grok it.
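For what it's worth, a minimal sketch of the temporary-file approach might look like this. The function name `dedupe_in_place` is hypothetical, and I'm assuming `mktemp` is available (it is on most Linux and BSD systems, but it isn't strictly POSIX).

```shell
# dedupe_in_place FILE  (hypothetical helper name)
# Writes the filtered output to a temp file next to FILE, then renames it
# over the original only if uniq succeeded.
dedupe_in_place() {
    file=$1
    # Create the temp file in the same directory so mv is a rename,
    # not a cross-filesystem copy.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if uniq -u "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

You'd call it as `dedupe_in_place movies.csv`. One caveat: mktemp creates the file with mode 0600, so the replaced file ends up with those permissions unless you chmod it afterwards.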
sponge is cool. But on Debian/Ubuntu, it's packaged up in moreutils, which includes a few helpful tools. However, a programme called parallel is also in moreutils, and that's not as powerful as GNU's parallel. So I often end up uninstalling sponge/moreutils. :(
The classic book "The UNIX Programming Environment" by Kernighan and Pike has a tool in it, called 'overwrite', that does this - letting you safely overwrite a file with the result of a command or pipeline, IIRC.
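I don't have the book to hand, so the following is not its code, just a rough sketch of the same idea under my own assumptions: run the command with its output going to a temp file, and only copy the result over the target once the command has finished successfully. The argument order and the use of `mktemp` are my choices.

```shell
# overwrite TARGET COMMAND [ARGS...]
# Sketch only: collects COMMAND's stdout in a temp file, then copies it
# onto TARGET if COMMAND exited successfully. TARGET is untouched on failure.
overwrite() {
    target=$1
    shift
    tmp=$(mktemp) || return 1
    if "$@" > "$tmp"; then
        # cat into the target keeps the existing inode and permissions.
        cat "$tmp" > "$target"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage would be `overwrite movies.csv uniq -u movies.csv`. For a pipeline, you'd have to wrap it, e.g. `overwrite movies.csv sh -c 'sort movies.csv | uniq -u'`. Note the final copy is not atomic, unlike a rename.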
I came here for that. I learned long ago, the hard way, to never ever use the same file for writing as reading. I was wondering if that rule had changed on me.
The rename is atomic; anyone opening "filename" will get either the old version, or the new version. (Although it breaks one of my other favorite idioms for monitoring log files, "tail -f filename", because the old inode will never be updated.)
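A quick way to see the inode point, as a sketch: write the new version to a second file, rename it over "filename", and compare inode numbers before and after. The scratch directory and file contents here are invented for the demo.

```shell
# Show that renaming a new file over "filename" swaps the inode.
log_dir=$(mktemp -d)
printf 'old\n' > "$log_dir/filename"
before=$(ls -i "$log_dir/filename" | awk '{print $1}')

# Build the new version alongside, then rename it into place.
printf 'new\n' > "$log_dir/filename.tmp"
mv "$log_dir/filename.tmp" "$log_dir/filename"
after=$(ls -i "$log_dir/filename" | awk '{print $1}')

echo "inode before: $before, after: $after"
```

The two numbers differ, which is why a running "tail -f filename" keeps watching the old, now-unlinked file. If your tail supports it (GNU and BSD tail both do), "tail -F filename" follows the name rather than the open file and picks up the replacement.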
You could write directly to uniqMovie.csv in your example. I would do it like below, but ONLY once I am certain it is exactly what I want. Usually I just make one clearly named result file per operation without touching the original.