
Where I used to work, I found that cron jobs usually failed because they weren't adequately tested. I tried using MAILTO, but it didn't do what I wanted, so I started putting in something like this (it's been a while):

45 5 * * * /bin/bash -eux -C /path/to/real/cmd >> /var/log/somefile 2>&1 | mail -s "/path/to/real/cmd... at 10.2.3.4 failed with details in /var/log/somefile" to-address < tail /var/log/somefile

...then I'd test failures to make sure the email and everything else worked as expected (mail might not be set up correctly on the box by default, for example). Either overwrite /var/log/somefile with ">" instead of ">>", or use logrotate. Of course, /path/to/real/cmd, if it's a shell script, should have something like "set -eux" or at least "set -e" at the top (and be well tested); otherwise it won't always exit non-zero on failure, and this has no chance of working.
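For anyone who wants the same idea in a more readable shape, here is a minimal sketch of a wrapper that appends the job's output to a log and mails the tail of the log only on a non-zero exit. The names run_and_report and MAILER, the 20-line tail, and the example address are my own placeholders, not anything from the original setup.

```shell
# Hypothetical wrapper: run a command, append its output to a log,
# and mail the tail of the log only if the command exits non-zero.
# MAILER can be overridden (e.g. with a stub) when testing.
MAILER="${MAILER:-mail}"

run_and_report() {
    local log="$1" to="$2" status=0
    shift 2

    # Append both stdout and stderr of the real command to the log.
    "$@" >> "$log" 2>&1 || status=$?

    if [ "$status" -ne 0 ]; then
        # -s sets the subject; the mail body is the last 20 lines of the log.
        tail -n 20 "$log" |
            "$MAILER" -s "$* at $(uname -n) failed (exit $status), details in $log" "$to"
    fi
    return "$status"
}
```

From a crontab this could be called through a small script, e.g. `run_and_report /var/log/somefile to-address /path/to/real/cmd`. The real command still needs "set -e" (or its own error handling), since the wrapper only sees the exit status.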

I didn't (in mild use) see unreported failures after that, and it was really handy for diagnosing problems when something did go wrong.

But after any change I had to carefully retest every failure mode, because it seemed so easy to miss something that causes unexpected behavior. I may even have had to wrap it in a single-line "if" statement ("...else mail...").

It would be fun but time-consuming to automate those tests, maybe with shunit2 (or something named roughly like that), and rerun them periodically to make sure ops hadn't changed the mail config and broken this setup, or something.
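A rough idea of what such a check could look like, written as plain shell rather than shunit2 so it has no dependencies. It shadows mail with a stub, so it only verifies the shell logic (mail fires on failure and not on success), not the box's real mail config; check_alerting, the stub, and the address are all made up for illustration.

```shell
# Hypothetical self-check for the "job || mail" pattern.
# Returns 0 if mail is sent only when the job fails.
check_alerting() {
    local dir rc=0
    dir=$(mktemp -d) || return 2
    (
        # Stub that shadows the real mail: records its arguments and stdin.
        mail() { { printf '%s\n' "$*"; cat; } >> "$dir/sent"; }

        # Passing job: no mail expected.
        true >> "$dir/log" 2>&1 ||
            tail "$dir/log" | mail -s "job failed" ops@example.com
        [ ! -e "$dir/sent" ] || exit 1

        # Failing job: mail expected, carrying the subject and the log tail.
        sh -c 'echo disk full; exit 1' >> "$dir/log" 2>&1 ||
            tail "$dir/log" | mail -s "job failed" ops@example.com
        grep -q "job failed" "$dir/sent" && grep -q "disk full" "$dir/sent"
    ) || rc=$?
    rm -rf "$dir"
    return "$rc"
}
```

To cover the mail config as well, the stub would have to go and the test would need some way to confirm delivery, which is where it gets time-consuming.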

I know that looks awful but I enjoyed it. It might just be easier to use your replacement. How did you advertise?



In case anyone ever reads that (too late for me to edit): "file 2>&1 | mail -s ..." should be "file 2>&1 || mail -s ...", or more likely the wrapped "if" mentioned.
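To make the difference concrete, here is a small sketch that can be run outside cron. job and notify are hypothetical stand-ins for /path/to/real/cmd and the mail command; the crontab lines in the comments are my reconstruction of the intent, not something I have tested on a real box.

```shell
# Stand-ins so the two forms can be tried without cron or a mail setup.
job()    { echo "job output"; return 1; }   # pretends the real command fails
notify() { echo "would mail: $1"; }         # pretends to be mail -s ...

log=$(mktemp)

# Form 1: "||" runs the right-hand side only when the job exits non-zero.
# A plain "|" would start mail every time, whether or not the job failed.
# Roughly, in a crontab:
#   45 5 * * * /path/to/real/cmd >> /var/log/somefile 2>&1 || tail /var/log/somefile | mail -s "..." to-address
job >> "$log" 2>&1 || notify "job failed, details in $log"

# Form 2: the single-line "if ... else mail" variant.
# Roughly, in a crontab:
#   45 5 * * * if /path/to/real/cmd >> /var/log/somefile 2>&1; then :; else tail /var/log/somefile | mail -s "..." to-address; fi
if job >> "$log" 2>&1; then :; else notify "job failed, details in $log"; fi
```

Both forms notify once per failed run here. The "if" version leaves room to do something different on success later.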



