
>> If we don't want new file we can redirect the output to same file which will overwrite original file

You need to be a little careful with that. If you do:

    uniq -u movies.csv > movies.csv

The shell will first open movies.csv for writing (the redirect part), then launch the uniq command, connecting stdout to the now-emptied movies.csv.

Of course when uniq opens movies.csv for consumption, it'll already be empty. There will be no work to do.
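
A quick way to see this for yourself, as a rough sketch: it works in a scratch directory made with mktemp, and the sample titles and the `demo_dir` / `truncated_size` names are made up for illustration.

```shell
# Scratch directory so no real data is touched (names here are illustrative).
demo_dir=$(mktemp -d)
printf 'Alien\nAlien\nBrazil\n' > "$demo_dir/movies.csv"

# The shell truncates movies.csv for the redirect before uniq gets to read it.
uniq -u "$demo_dir/movies.csv" > "$demo_dir/movies.csv"

# Count the bytes left in the file after the command finishes.
truncated_size=$(wc -c < "$demo_dir/movies.csv" | tr -d ' ')
echo "size after redirect onto the input file: $truncated_size bytes"
```

The input is gone and nothing was written, so only try this on a throwaway copy.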

There are a couple of options to deal with this, but the temporary intermediate file is my preference, provided there's sufficient space. It's easily understood: if someone else comes across the construct in your script, they'll grok it.



The utility to do this is called sponge.

http://linux.die.net/man/1/sponge

    uniq -u movies.csv | sponge movies.csv


sponge is cool. But on debian/ubuntu, it's packaged up in moreutils, which includes a few helpful tools. However a programme called parallel is in moreutils, and that's not as powerful as GNU's parallel. So I often end up uninstalling sponge/moreutils. :(


There's some attempt underway to fix that fwiw: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=749355


GNU Parallel is an indispensable heavy lifter on the command line. I was expecting it to show up in the article.


For my usage, the draw of moreutils is 'chronic': prepended to a command, it suppresses the output unless the command fails, which stops cron from alerting on any non-error output. Big fan.


Thanks for throwing this out there. Never heard of this command before! Definitely a good one for my bag o' tricks.


The classic book "The UNIX Programming Environment" by Kernighan and Pike has a tool in it called 'overwrite' that does this, letting you safely overwrite a file with the result of a command or pipeline, IIRC.


I came here for that. I learned long ago, the hard way, to never ever use the same file for writing as reading. I was wondering if that rule had changed on me.


Thank you for the input. How about this?

uniq -u movies.csv > temp.csv

temp.csv > movie.csv

rm temp.csv


long_running_process > filename.tmp && mv filename.tmp filename

The rename is atomic; anyone opening "filename" will get either the old version, or the new version. (Although it breaks one of my other favorite idioms for monitoring log files, "tail -f filename", because the old inode will never be updated.)
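
If you want that wrapped up for reuse, here is a minimal sketch. It assumes the temp file should live next to the target so that mv is a rename on the same filesystem rather than a copy; the `replace_with_output` helper and the sample data are hypothetical, not from any standard tool.

```shell
# Hypothetical helper: run a command and replace the target file with its stdout.
# Usage: replace_with_output TARGET COMMAND [ARGS...]
replace_with_output() {
    target=$1
    shift
    # Create the temp file beside the target so the final mv is a same-filesystem rename.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        mv "$tmp" "$target"
    else
        # The command failed: leave the original file untouched.
        rm -f "$tmp"
        return 1
    fi
}

# Illustrative run in a scratch directory.
demo_dir=$(mktemp -d)
printf 'Alien\nAlien\nBrazil\n' > "$demo_dir/movies.csv"
replace_with_output "$demo_dir/movies.csv" uniq -u "$demo_dir/movies.csv"
cat "$demo_dir/movies.csv"
```

One caveat with this sketch: mktemp creates the file with restrictive permissions, and the replaced file keeps those, so you may want a chmod afterwards if the original mode matters.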


> Although it breaks one of my other favorite idioms for monitoring log files, "tail -f filename", because the old inode will never be updated

You should look into the '-F' option of tail; it follows the filename, and not the inode.


You could write directly to uniqMovie.csv in your example. I would do it like below, but ONLY once I am certain it is exactly what I want. Usually I just make one clearly named result file per operation without touching the original.

uniq -u movies.csv > /tmp/temp.csv && mv /tmp/temp.csv movies.csv


  $ temp.csv > movie.csv
  temp.csv: command not found


He forgot his cat.


You might want to use sort -u
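
Worth spelling out why, since the two are easy to mix up: uniq only compares adjacent lines, and with -u it prints only the lines that are not repeated, whereas sort -u keeps one copy of every distinct line. A small sketch with made-up titles (the variable names are just for illustration):

```shell
# Illustrative data: duplicates exist but are not adjacent.
demo_dir=$(mktemp -d)
printf 'Brazil\nAlien\nBrazil\nAlien\nClue\n' > "$demo_dir/movies.csv"

# uniq only looks at neighbouring lines, so on unsorted input nothing is filtered out.
unsorted_uniq=$(uniq -u "$demo_dir/movies.csv")

# After sorting, uniq -u drops every title that occurs more than once.
only_singletons=$(sort "$demo_dir/movies.csv" | uniq -u)

# sort -u keeps exactly one copy of each distinct title.
deduplicated=$(sort -u "$demo_dir/movies.csv")

echo "uniq -u on unsorted input:"; printf '%s\n' "$unsorted_uniq"
echo "sort | uniq -u:";            printf '%s\n' "$only_singletons"
echo "sort -u:";                   printf '%s\n' "$deduplicated"
```

Which one you want depends on whether repeated titles should be dropped entirely or collapsed to a single line.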



