Use the Unofficial Bash Strict Mode (Unless You Love Debugging) (redsymbol.net)
180 points by gkst on March 18, 2016 | 57 comments



The article is dangerously wrong in its discussion of IFS.

What you should do to avoid the problem of mishandling spaces is use proper quoting (for i in "$@"; do ...), not changing IFS; setting IFS to \n\t will still break embedded tabs and newlines.

In general, in bash scripts any use of $ should always be between double quotes unless you have a reason to do otherwise.
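
A rough sketch of why the \n\t trick still breaks (hypothetical filename):

    IFS=$'\n\t'
    f=$'with\ttab.txt'
    printf '<%s>\n' $f      # two fields: <with> and <tab.txt>
    printf '<%s>\n' "$f"    # one field; the quoting is what actually fixes it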


Seconded. It is quite off the mark. This will break code which depends on splitting, like when you have some variable called FOO_FLAG which contains "--blah arg" that's supposed to expand to two arguments. Observing proper quoting is the way (except for internal data representations that you can guarantee not to have spaces).
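
A rough sketch of what I mean (some_command is just a stand-in):

    FOO_FLAG="--blah arg"
    some_command $FOO_FLAG      # relies on splitting: two arguments, --blah and arg
    some_command "$FOO_FLAG"    # one argument "--blah arg", which some_command won't understand

With IFS set to \n\t, the unquoted line no longer splits on the space either, so the flag and its argument arrive glued together.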

Also, the newline and tab are not explained! What's with that?

"We don't want accidental field splitting of interpolated expansions on spaces, ... but we do want it on embedded tabs or newlines?"

Huh?

If you don't want field splitting, set IFS to empty! (And then you don't need the dollar sign Bash extension for \t and \n):

   $ VAR="a b c d"
   $ for x in $VAR ; do echo $x ; done
   a
   b
   c
   d
   $ IFS='' ; for x in $VAR ; do echo $x ; done
   a b c d
No splitting on anything: not spaces, tabs or newlines!


Agreed. In addition to still having trouble with tabs and newlines, setting IFS still leaves the other big problem with unquoted variables: unexpected expansion of wildcards. The shell considers any unquoted string that contains *, ?, or [ to be a glob expression, and will replace it with a list of matching files. This can cause some really strange bugs.
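
A small illustration with a hypothetical pattern:

    pattern="backup-[0-9]*"
    echo $pattern      # may be replaced by matching filenames in the current directory
    echo "$pattern"    # always prints the literal string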

Also, an unquoted variable that happens to be null will essentially vanish from the argument list of any command it's used with, which can cause another class of weird bugs. Consider the shell statement:

if [ -n $var ]; then

... which looks like it should take the then-branch only if $var is nonblank, but in fact will take it even if $var is blank (the reason is complex, I'll leave it as a puzzle for the reader).

Setting IFS is a crutch that only partly solves the problem; putting double-quotes around variable references fully solves it.


The test command has certain rules depending on the number of arguments. The most pertinent rule is: For one argument, the expression is true if, and only if, the argument is not null.

In this case

    [ -n $var ] 
is the same as

    test -n $var
$var is not quoted, so when this command is run and $var is empty, its expansion vanishes entirely after word splitting, leaving test with the single argument -n. That falls under the one-argument rule above.

Therefore, always quote your variables.


> the reason is complex, I'll leave it as a puzzle for the reader

    [ -n ]
is the same as

    test -n
In this case -n has no argument, so it cannot be parsed as "-n STRING", instead it is parsed as "STRING", where STRING is "-n", with the behaviour "True if string is not empty.".
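
So the fix is simply the quoting, e.g.:

    var=""
    [ -n $var ]   && echo "unquoted: true"    # prints, because the test really ran as [ -n ]
    [ -n "$var" ] && echo "quoted: true"      # prints nothing, as intended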


Google's Testing on the Toilet for Bash talks about $, and is a great reference for those trying to improve their bash scripting:

http://robertmuth.blogspot.com/2012/08/better-bash-scripting...


it would almost make sense (but still be wrong) if it were discussing POSIX shell, where arrays do not exist and these contortions are necessary.

what the author is doing is like this in Python:

    stuff=["a b", "c d", "e f"]
    for thing in '\n'.join(stuff).split('\n'):
        print thing


That's right for $@, but AFAIK only $@ – for instance you can't do:

    for filename in "*.txt" ; do...


   for filename in *.txt ; do ...
has no issue with spaces in filenames. If *.txt matches "foo bar.txt", then that's what the filename variable is set to. In the body of the loop you have to make sure you have "$filename".

You don't need to play games with IFS to correctly process filesystem entry names expanded from a pattern.
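
Something like this sketch handles names with spaces just fine:

    for filename in *.txt ; do
        wc -l "$filename"    # quote in the body; the glob itself needs no quoting
    done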


Wildcards are not variables. Wildcards don't get expanded in quotes. Variables get expanded in double quotes but not single quotes. $@ obeys all the same expansion rules as all other variables. Command substitution with both $() and `` follow the same rules as variables.


No, $@ is special where $* is regular. Consider the following script foo.sh and call it like so:

    ./foo.sh one "a b" two
The $* part will print one line, the $@ part will print three lines.

    #!/bin/bash
    for x in "$*"; do
        echo $x
    done
    for x in "$@"; do
        echo $x
    done
I use "$@" often, but to this day I don't fully understand how $@ works...


Unquoted $@ works exactly like unquoted $*.

Quoted "$*" joins the parameters using the first character stored in IFS, or with nothing if IFS is unset/null. Usually, the first character of IFS is a space, giving the impression that "$*" means "separate with spaces"; i.e. that it's just an ordinary quote job around $* (i.e. that it is "regular", in your words).

Quoted "$@" does ... what you clearly understand.


> $@ obeys all the same rules as all other variables.

That's hardly the case. Most other variables do not represent the positional parameters, and don't have the logic of "$@" which effectively produces "$1" "$2" "$3" ...


I was referring to the expansion behavior, which was the point in contention. I've clarified the original comment.


Agreed, but that's not something you can automatically enforce.


I guess you could require a comment on any line with an unquoted variable expansion...


How about some naming convention? If a variable contains a word with no spaces, call it $foo_w or something. Shell linting programs can be patched to recognize that and suppress their warnings.

Heck the shell language itself should have a declaration for this!

IMAGINARY FEATURE:

   typeset -w foo  # foo is not expected to contain spaces
(Or more generally, expansions of foo are not expected to undergo field splitting by IFS regardless of content.)

Now if you have an unquoted $foo that undergoes field-splitting, bash produces an error if that splitting actually breaks the contents of foo into two or more pieces.

Furthermore, a way could be provided to declare that a variable requires splitting. Maybe "typeset -W". This could even assert how many pieces: "typeset -W3 foo" means that expansions of foo are expected to undergo splitting, and it must be into three fields.

Then there could be a global diagnostic option (similar to set -u and set -e) which diagnoses all unquoted expansions of variables, except for the -W and -w ones. The -w ones are diagnosed if they are subject to splitting, and splitting actually occurs. The -W ones are diagnosed if they are quoted, or if they are unquoted and splitting doesn't produce the required number of pieces, if specified.


It's also a good idea to check your complex scripts with the awesome shellcheck tool before running them. http://www.shellcheck.net/


Thanks for this!

https://github.com/koalaman/shellcheck

and `brew install shellcheck`


To be honest, the build time required for this (and usually GHC as well) gets really annoying. Especially for simple updates.


Does homebrew not support precompiled binaries?


Yep, it does and installing shellcheck took about four seconds for me.


Yes, but for use in a CI it's a real drag


Why is that? Are the download servers slow?


http://mywiki.wooledge.org/BashFAQ/105:

> Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?

> set -e was an attempt to add "automatic error detection" to the shell. Its goal was to cause the shell to abort any time an error occurred, so you don't have to put || exit 1 after each important command. That goal is non-trivial, because many commands intentionally return non-zero.

http://mywiki.wooledge.org/BashFAQ/112:

> What are the advantages and disadvantages of using set -u (or set -o nounset)?

> Bash (like all other Bourne shell derivatives) has a feature activated by the command set -u (or set -o nounset). When this feature is in effect, any command which attempts to expand an unset variable will cause a fatal error (the shell immediately exits, unless it is interactive).

pipefail is not quite as bad, but is nevertheless incompatible with most other shells.
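
To make the nounset behaviour concrete (name is just an illustrative unset variable):

    set -u
    echo "${name:-fallback}"    # fine: a default sidesteps the check
    echo "$name"                # fatal: "name: unbound variable", and the shell exits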


The example given in the article:

grep some-string /some/file | sort

is a good example of why -e and pipefail are dangerous. grep will return an error status if it gets an error (e.g. file not found) or if it simply fails to find any matches. With -e and pipefail, this command will terminate the script if there happen to be no matches, so you have to use something like || true at the end... which completely breaks the exit-on-error behavior that was the point of the exercise.

Solution: do proper error checking.
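
For example, a sketch that keeps pipefail on but treats grep's exit status 1 ("no matches") as a normal outcome and only aborts on real failures:

    rc=0
    matches=$(grep some-string /some/file | sort) || rc=$?
    if [ "$rc" -gt 1 ]; then
        echo "grep failed with status $rc" >&2
        exit "$rc"
    fi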


From the first link (105).

> GreyCat's personal recommendation is simple: don't use set -e. Add your own error checking instead.

> rking's personal recommendation is to go ahead and use set -e, but beware of possible gotchas. It has useful semantics, so to exclude it from the toolbox is to give into FUD.


To be honest, I think traditional shells are now only good for environments where you know you aren’t doing anything too weird and all the most likely inputs work as expected without a lot of effort. This is spending time wisely; just because everything except a zero can technically be part of a Unix filename doesn’t mean that I want to invest hours or days making damn sure everything works for pathological cases.

If I actually do want to guard against every case imaginable, I immediately switch to Python or some other language that at least knows how to quote things unambiguously without a lot of effort.


Shell is a lot better than the other languages I know at many tasks involving I/O redirection, spawning programs, etc. It's kind of arcane, but so are the APIs for doing that stuff in other scripting languages. I'm eagerly awaiting some new contender in the system scripting language arena though.


Fails to mention what is in my opinion the most devious, subtle potential pitfall with `set -e`: assigning (or even just a bare evaluation of) an arithmetic zero. `foo=0` won't do anything surprising, but `let foo=0` will return 1, and thus abort your script if you're not careful.
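
A sketch of the trap:

    set -e
    foo=0            # plain assignment: exit status 0, nothing surprising
    let foo=0        # the expression evaluates to 0, so let returns 1 and the script dies here
    count=0
    (( count++ ))    # same trap: post-increment yields the old value, 0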

Also, as an alternative to the proposed `set +e; ...; set -e` wrapper for retrieving the exit status of something expected to exit non-zero (generally cleaner in my opinion, if slightly "clever"):

    retval=0
    count=$(grep -c some-string some-file) || retval=$?


I wrote a library for shell scripts with the design goal of working properly in strict mode: https://github.com/vlisivka/bash-modules


One solution is not to use Bash. There are more basic shells that are equally or more POSIX-like.


I still don't get why bash (or zsh) doesn't try to integrate more Korn shell (88 & 93) scripting features. But there the focus seems to be more on colorful prompts and autocompletion handholding…

And even despite more free licenses (AFAIR, IANAL), you can't depend on actual Korn shells being available on Unices. At least the dependent-app situation has been getting a lot better, mostly through the death of workstations and their proprietary OSs (try depending on almost any grep/awk/sed option/switch when it has to run on Solaris/AIX/HP-UX). Although "all the world's a GNU/Linux" seems to be the new plague upon our lands here…

So after all these years, I'd say we're still in pretty much the same situation that birthed Perl. Which would still be my preferred choice if I actually had to distribute scripts and we're not talking about my own private, context-specific shortcuts, scripts and functions.


which particular ksh features do you miss in zsh?


I use set -e or not based on the needs of the script. Many times I want the script to continue and sometimes I don't. Sometimes I don't set it until partway down the script, so the top isn't strict. I wouldn't want to set it on every script.


I've encountered a fair number of people who blindly set -eu on every script as a matter of course, as suggested by the author here. While this article goes through a bunch of the pitfalls, in my experience people often fail to account for the many (sometimes unintuitive) ways this can cause an abrupt exit. Sometimes stopping halfway through is just as bad as continuing under false assumptions, particularly if the script is so simplistic that it isn't obvious to the user that it exited midway through.

I almost never use -e, and as a result I have to stay vigilant and test return codes all over the place. I prefer that to the kludge of forcing everything to return zero and I think it produces overall better results. Ultimately you want to handle an error, not just abort, and -e doesn't do much to promote handling properly.


Absolutely, but it's a good general default. I don't want set -e on my backup aggregator script (if one sync fails, keep doing the others), but for most other scripts it's good to have.

I've had mixed luck with set -e combined with trap ERR though - they seem to conflict with each other and I don't have the shell knowledge to sort it out.
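
From what I've gathered, the missing piece is usually the -E (errtrace) flag; roughly:

    #!/bin/bash
    set -eE    # -E lets the ERR trap fire inside functions and subshells too
    trap 'echo "error near line $LINENO" >&2' ERR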


Instead of “|| true”, I prefer “|| :” – it’s shorter, and : is always a built-in, even in bare-bones shells where “true” is an external command.

Since “|| :” is always written at the end of lines, it should be short and visually unobtrusive.


':' isn't valid as a command in the fish shell (unless you make an executable named ':' and put it in your path)

  user@host ~> :
  fish: Unknown command ':'


The fish shell? Does it claim POSIX /bin/sh compatibility?

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3...



Shellcheck is great when you integrate it with your editor.

I use it all the time when I have to write shell scripts.


I think the author's advice will save you trips to this other tool.


A simpler solution is to use the Plan 9 rc shell for scripting. It has more sensible syntax and doesn't rescan input, so many of the issues raised in the article just don't occur.


What nonsense.

Instead of handling non-zero exit statuses in a correct way, the article suggests interrupting the script right in the middle, probably with some temporary files and processes hanging around that can't be cleaned up if something goes wrong.

The same BS goes through the entire article.

Has the author actually written anything bigger than echo "Hello world!" in Bash?


In bash you clean up in the exit handler:

  trap mycleanupfunc EXIT
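
Fleshed out a little (the temp directory is just for illustration):

  tmpdir=$(mktemp -d)
  mycleanupfunc() {
      rm -rf "$tmpdir"
  }
  trap mycleanupfunc EXIT    # runs on normal exit and when set -e aborts the script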


I've read it, don't worry. So basically we create a problem with the use of the -e flag and then solve it with traps... And what if our logic depends on exit statuses, for example when we check whether some utility or a file is present in the system? I don't want the script to exit, I want it to go down another logic branch!

P.S. No, temporarily disabling the option is not a solution, it's another workaround for the problem created out of nothing.


Exit traps for cleanup are a good idea in any case. Scripts can be killed by signals and whatnot as well.

I'm not sure what scenario you're imagining with your other concern. This does what you would expect:

    set -o errexit
    if [ -f somefile ]
    then
        echo "File exists."
    else
        echo "File does not exist."
    fi


Do you understand how the flag works? You can still condition on exit status. It's only an 'unhandled' non-zero exit status that causes script termination.


Clearly silently ignoring errors is the better approach. Or, if you are actually handling errors, then it doesn't matter either way.


Or simply start with

    #!/bin/sh
and stick to POSIX-compliant code.

I'm not sure that writing scripts that rely on bash-specific features is such a great idea.


Exactly how does that improve the error handling in your shell script beyond "ignore all errors", as discussed by the OP?

I am genuinely curious how you would write a command with a pipe in plain POSIX /bin/sh such that a non-zero exit status from the program that writes into the pipe is detected (as can be done in bash with "set -o pipefail" or "$PIPESTATUS").
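
For reference, the bash version I have in mind looks roughly like:

    grep some-string /some/file | sort
    if [ "${PIPESTATUS[0]}" -gt 1 ]; then
        echo "grep itself failed" >&2
    fi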


Note that these options don't propagate to subshells. So be aware of your commands between ` marks.


They do propagate to subshells. The only exception is that bash (unlike other shells) clears -e in command substitutions.


They might propagate to subshells. Someone did a lot of checking:

http://www.in-ulm.de/~mascheck/various/set-e/


On the subject, as an aside to the point you're making: I'd recommend never using `backticks`. $() is POSIX and far less error-prone. It can easily do multiple nested subshells, escaping is less insane, it keeps your teeth whiter and your smile brighter and all that sort of thing. `backticks` stomps on your lunch, steals the affection of your (girl/boy)friend and makes you feel stupid.
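
For example, nesting (some-tool is hypothetical):

    # $() nests without extra escaping:
    dir=$(dirname "$(command -v some-tool)")
    # the backtick equivalent needs the inner backticks escaped, and quoting gets hairy:
    dir=`dirname \`command -v some-tool\``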


there's also this insane idea of not scripting in bash. ffs!



