I think my new favourite way of managing runbooks is to build them as a file tree of simple python subcommand scripts, with a run.sh script that scans the file system and uses argparse to construct a CLI for calling each script.
# call ./runbooks/stack/update_secret.py
# could update a secret in a vault, or update it in your deployed app
./run.sh stack update_secret --env=dev --name=foo --file=secret.txt
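The dispatcher behind run.sh can stay very small. Here's a minimal sketch of the discovery and routing logic; the function names and layout are my own assumptions, not the original setup:

```python
import pathlib

def discover(root: pathlib.Path) -> dict[str, pathlib.Path]:
    """Map space-joined command paths (e.g. 'stack update_secret')
    to the script files found under the runbooks tree."""
    commands = {}
    for script in sorted(root.rglob("*.py")):
        parts = script.relative_to(root).with_suffix("").parts
        commands[" ".join(parts)] = script
    return commands

def resolve(root: pathlib.Path, argv: list[str]):
    """Walk argv tokens down the tree until a script is found;
    everything after the match is left for the script's own argparse."""
    path = root
    for i, token in enumerate(argv):
        script = path / f"{token}.py"
        if script.is_file():
            return script, argv[i + 1:]
        path = path / token
        if not path.is_dir():
            break
    return None, argv
```

run.sh itself can then be a one-liner that execs a python entry point with `"$@"`, which hands the matched script the remaining flags.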
Most of the time my python scripts are glorified CLI commands like `docker service update` called through subprocess, so you shouldn't need to install dependencies beyond what you'd already be typing at the CLI. It's also easy to add a verbose option that prints the commands it runs, so you can run them manually instead.
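A script in that style might look like this sketch; the docker flags, service name, and the extra --dry-run flag are illustrative assumptions on my part:

```python
import argparse
import subprocess

def run(cmd: list[str], verbose: bool = False, dry_run: bool = False):
    """Echo the command so it can be copy-pasted, then execute it.
    With dry_run the command is printed but never executed."""
    if verbose or dry_run:
        print("+", " ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)

def main():
    parser = argparse.ArgumentParser(description="Update a secret on a service")
    parser.add_argument("--env", required=True)
    parser.add_argument("--name", required=True)
    parser.add_argument("--file", required=True)
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()
    # Hypothetical wrapped command; real flags depend on your stack.
    run(["docker", "service", "update", f"--secret-add={args.name}",
         f"myapp-{args.env}"], verbose=args.verbose, dry_run=args.dry_run)

if __name__ == "__main__":
    main()
```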
Anything that can't be automated prints instructions on what to do and waits for you to type yes/no before continuing.
# call ./runbooks/get_crash_report.py
./run.sh get_crash_report --out=./crashes/
> # Copying crashes from AWS to './crashes/'
> # Manual Step: Fill out crashes spreadsheet: docs.google/example_sheet
> Continue [y/n]?
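The yes/no gate above can be a tiny shared helper. A sketch, assuming the prompt wording shown and an abort-on-no policy:

```python
import sys

def manual_step(instructions: str) -> bool:
    """Print instructions for a step that can't be automated,
    then block until the operator confirms or aborts."""
    print(f"# Manual Step: {instructions}")
    while True:
        answer = input("Continue [y/n]? ").strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            print("Aborting runbook.")
            sys.exit(1)
        # Anything else: re-prompt until we get a clear answer.
```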
The other really nice thing with this setup is that run.sh can build up --help output showing what actions are available and what params they take, because it's all just python argparse. That makes discovering what to do, or looking up params, really quick.
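One way the aggregated --help could be assembled, assuming the dispatcher has already discovered the scripts (the mapping and naming are illustrative):

```python
import argparse

def build_parser(commands: dict) -> argparse.ArgumentParser:
    """Register each discovered runbook as a subcommand so that
    `./run.sh --help` lists every available action."""
    parser = argparse.ArgumentParser(prog="run.sh")
    sub = parser.add_subparsers(dest="command", metavar="action")
    for name, path in sorted(commands.items()):
        sub.add_parser(name, help=f"runs {path}")
    return parser

parser = build_parser({
    "stack update_secret": "runbooks/stack/update_secret.py",
    "get_crash_report": "runbooks/get_crash_report.py",
})
help_text = parser.format_help()
```

Per-action param help comes for free too: forwarding `--help` to the matched script prints that script's own argparse usage.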
At this point, the only culture you need to build is one where everyone uses the run.sh scripts rather than doing things manually. That pushes people to fix the scripts when something changes.
YMMV, but I've found this has simplified a lot of processes for myself at least.