How to run external programs from Python and capture their output

dalke · on Dec 2, 2014

I weep! This code is only useful when security and portability are not concerns. Admittedly that's the case here, but for more general cases, use subprocess properly. Here's the code:

    cmd = 'vw --loss_function logistic --cache_file {} -b {} 2>&1'.format( 
        path_to_cache, b )  
    output = subprocess.check_output( '{} | tee /dev/stderr'.format( cmd ), shell = True )

What happens if path_to_cache contains a space, or some other shell metacharacter? The documentation warns that 'shell = True' is a security hazard. And this isn't portable to a non-Unix OS.
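To make the space problem concrete, here's a small sketch. The path and the "-b" value are made up for illustration, and shlex.split only approximates how a POSIX shell splits a command line, but it shows the argument list going wrong without running anything:

```python
import shlex

# Hypothetical values, for illustration only.
path_to_cache = "my data/train.cache"   # note the space
b = 18

# Building one command string, as the original code does.
cmd = "vw --cache_file {} -b {}".format(path_to_cache, b)

# Roughly what a POSIX shell would hand to the program:
# the path is split into two separate arguments.
shell_args = shlex.split(cmd)

# Passing a list keeps the path as a single argument.
list_args = ["vw", "--cache_file", path_to_cache, "-b", str(b)]

print(shell_args)
print(list_args)
```

With the string form the program sees "my" and "data/train.cache" as two arguments. With the list form no shell parsing happens, so there is nothing to quote.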

A possibly better solution is:

    output = subprocess.check_output(
            ["vw", "--loss_function", "logistic", "--cache_file",
            path_to_cache, "-b", str(b)],
            stderr = subprocess.STDOUT)

This doesn't have the tee, which exists "so [output] gets printed on screen while at the same time it goes to standard output and is captured."

Since that's important, another solution is:

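    import subprocess, sys

    p = subprocess.Popen(
            ["vw", "--loss_function", "logistic", "--cache_file",
            path_to_cache, "-b", str(b)],
            stdout = subprocess.PIPE,
            stderr = subprocess.STDOUT,
            universal_newlines = True)  # text mode, so lines are str on Python 3 too
    output = []
    for line in p.stdout:
        sys.stderr.write(line)  # echo to the screen as it arrives
        output.append(line)     # and keep a copy
    # wait() returns the exit status; nonzero means failure
    if p.wait() != 0:
        raise AssertionError("There was an error")
    output = "".join(output)  # join the captured lines into one string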

I'm the first to admit that this is longer, which may be an important concern for some. I prefer the greatly reduced number of gotchas.
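For anyone who wants to try the Popen approach without vw installed, here's a minimal sketch wrapped in a function. The name run_and_tee is my own invention, the child process is just the Python interpreter printing two lines, and raising CalledProcessError instead of AssertionError is a substitution I made to mirror what check_output does:

```python
import subprocess
import sys

def run_and_tee(args, echo=sys.stderr):
    """Run args without a shell, echo each output line, and return all output."""
    p = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # merge stderr into stdout, like 2>&1
        universal_newlines=True,       # text mode: lines are str, not bytes
    )
    lines = []
    for line in p.stdout:
        echo.write(line)               # show it on screen
        lines.append(line)             # and capture it
    p.stdout.close()
    returncode = p.wait()
    if returncode != 0:
        # Nonzero exit status means the command failed.
        raise subprocess.CalledProcessError(
            returncode, args, output="".join(lines))
    return "".join(lines)

# Stand-in child process: the current Python interpreter printing two lines.
child = [sys.executable, "-c", "print('first'); print('second')"]
output = run_and_tee(child)
```

The argument list is passed straight to the program, so spaces in any argument are harmless and nothing here depends on a Unix shell or on tee.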