Use TOML for `.env` Files? (snarky.ca)
42 points by edmorley on Feb 10, 2023 | 63 comments



The author mentions that .env files come from the 12factor design, but the 12factor way is just using environment variables directly for configuration. Environment variables are language/library agnostic by design, and they aren't supposed to be cross-platform by design (they're meant to be tied to the "environment", which includes the OS).

If you're storing config in a .env file that's read directly by your application (as opposed to sourcing it from your shell or reading it with Docker when launching your container), you might just as well use any other file format for your config and call it a config file, not .env. It'll still be 12factor compatible as long as the config can still be overridden by environment variables directly.


Indeed, the whole point of the 12 factor design is to decouple the app from a single config file. Instead you use whatever method you like to define environment variables and the app should just work. Also helps avoid committing secrets to your git repo.

I use systemd extensively and it's pretty neat, you just:

- Define default Environment variables in your unit file

- Customize a service by adding an overrides.conf file in a foo.service.d/ folder

This makes it easy to define exactly the environment for a service.
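
For example, a hedged sketch (the unit name "foo", the variable names, and the paths are made up):

  # /etc/systemd/system/foo.service -- defaults in the unit file
  [Service]
  Environment=LOG_LEVEL=info
  Environment=PORT=8080

  # /etc/systemd/system/foo.service.d/overrides.conf -- per-machine customisation
  [Service]
  Environment=LOG_LEVEL=debug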

.env files are helpful in dev mode to customise machines where I don't want to pollute the system env, but they aren't the 'config format' and it makes sense that they are specific to the OS/shell they need to run in.


Yes! I’ve really disliked the tendency recently to treat .env files as a sidecar database versus being actual environment variables. Keeping them as KEY=value newline delimited text makes them far more versatile and usable in shell environments.


>you might just as well use any other file format for your config and call it a config file, not .env

If you name it .env it's much easier to set up source control rules to exclude .env files. If you name them, say, .config, then you increase the chances of accidentally leaking creds (by checking them into the repo).
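
For example, a couple of ignore patterns along those lines (the ".env.example" exception is just one common convention, not required):

  # .gitignore
  .env
  .env.*
  !.env.example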


I always treated .env files as "this is the current set of environment variables", containing just NAME="value" pairs.

If there are _different_ sets of variables, they should not be in that file, but in some different place, like the suggested settings.toml. Or perhaps use a bunch of different files in .envs/whatever.env and symlink them.

Let's not make everything complicated :)


Hear hear!

The author claims "There is no standard" but I think the standard is so simple it hasn't been written down. The standard is what you said, KEY="value" and that's it. Simple, easy to parse, fast, and compatible with how environment variables have been declared in `/etc/environment` forever.

Having different .env files for different OSes is easy as well. You have one `.env` that provides the default values, then `.env.linux` for Linux, `.env.windows` for Windows and so on. At runtime, first read .env, have values from .env.$os overwrite those, and finally have whatever the actual environment has overwrite those.
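
A minimal sketch of that layering in Python (hypothetical helper; sys.platform is used as a stand-in for the OS suffix, and the quoting rules are deliberately naive):

  import os
  import sys

  def read_env_file(path):
      # Parse simple KEY="value" lines; skip blanks and comments.
      values = {}
      try:
          with open(path) as f:
              for line in f:
                  line = line.strip()
                  if not line or line.startswith("#") or "=" not in line:
                      continue
                  key, _, value = line.partition("=")
                  values[key.strip()] = value.strip().strip('"')
      except FileNotFoundError:
          pass
      return values

  # Defaults from .env, then OS-specific overrides, then the real environment wins.
  config = read_env_file(".env")
  config.update(read_env_file(f".env.{sys.platform}"))  # e.g. .env.linux, .env.win32
  config.update(os.environ)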

Again, simple and hard to misunderstand.


I wish the 'dotenv' configuration language was so simple that it need not be written down! An off-hand comment (also cited by Brett) says that python-dotenv files "should mostly look like Bash files". Sadly, the vague implication that there's a highly compatible, at-least-ascii-safe subset of dotenv files and bash files is .. very far from the truth.

Here's a line of bash code that sets the variable X to a single-quote character:

    X=''\'''
(lest you think that's an unduly obtuse way to do it, this is what `git rev-parse --sq-quote` does! If not 'best practice' it's surely at least 'practice that's gotta be supported'!)

Here's what python-dotenv gets:

    Python-dotenv could not parse statement starting at line 1
Similarly, when you use python-dotenv to set a key with a value containing only a single quote

    dotenv.set_key('.env', 'X', "'")
the file is not acceptable to bash:

    bash: .env: line 1: unexpected EOF while looking for matching `''


Sounds like a problem with python-dotenv rather than a problem with environment variables. Env vars have been around forever and have a well-established syntax at this point; the fact that tool X doesn't handle it properly doesn't mean the format is wrong.


I have had situations where name="value" will give me ""value"" (i.e. double double-quotes) in some languages, while others give me just the value.

It's even more confusing if the value contains a space.

It is definitely not consistent.


You can try using ' instead


And the nice thing about those NAME="value" pairs is that you can source the file directly in shells, or read it with some straightforward, usually built-in library in many programming languages and supporting tools.


Until the value has a bang or any other special character that ends up being expanded by the shell and wasting half a day finding it!


> Until the value has a bang or any other special character that ends up being expanded by the shell and wasting half a day finding it!

Just how large are your env files?


It's not the size, it's the content of the values in the env files when they contain things outside of your control, like a password with a bang or a dollar sign.


I don't understand this article. It complains about parsing an .env file, and that .env files come from 12factor web app methodology?

Environment variables existed long before web apps did. And you are supposed to source the .env file to set the variables in the environment. You don't parse them yourself. You call `getenv()` to get the value for your current environment.
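
For example, a minimal sketch of that split (DATABASE_URL is a made-up variable):

  # launching shell: `set -a; . ./.env; set +a; python app.py`
  # the application never parses the file, it only reads its environment:
  import os

  database_url = os.environ.get("DATABASE_URL", "sqlite:///dev.db")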


It is because the article is just "bad"...

> And then you could take the table idea farther and have a table for a specific [purpose] . You could have a [purpose.test] or [purpose.production], all without having to use separate files where you may accidentally leave out a common setting that every .env file needs to define for your application. (I'm also a fan of less configuration files, not more.)

That is the complete OPPOSITE of the purpose of .env files!! The point is that the information never conflicts, and that no simple "flag" switches between the production, local, and test environments.


I think .env is suitable if you’re just teeing up shell environment variables. That’s the purpose of an env file. The most common problem I have found is when you want to support things like lists and booleans. Both must be parsed.
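
For example, a hedged sketch of that parsing in Python (the separator and the set of truthy strings are my own choices, not any standard):

  import os

  def env_bool(name, default=False):
      raw = os.environ.get(name)
      if raw is None:
          return default
      return raw.strip().lower() in ("1", "true", "yes", "on")

  def env_list(name, sep=","):
      raw = os.environ.get(name, "")
      return [item.strip() for item in raw.split(sep) if item.strip()]

  DEBUG = env_bool("DEBUG")
  ALLOWED_HOSTS = env_list("ALLOWED_HOSTS")  # e.g. "a.example,b.example" -> ['a.example', 'b.example']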

The cool thing is, a file will always be configuration data, so decide what format works for you and your team and keep your wheels on the ground. YAML, despite its opponents, is my first choice. I’ve never run into the problems people say they have with it, and I use it frequently to configure Docker Compose and Kubernetes resources anyway.


I find the Twelve-Factor App's design of using environment variables for configuration (https://12factor.net/config) unintuitive and perhaps a bad design choice. I believe environment variables are bad for the same reasons global variables are bad.

Pros listed for env vars include not committing them to the repo and not encouraging grouping them together as environments such as dev, staging or prod. I don't agree that these are always good goals, but if they are, the same can be achieved with config files: don't commit them to the repo, generate them on the fly.

The existence and prevalence of .env files is proof that using environment variables as an alternative has failed. Using Twelve-factor as a reference and .env files at the same time is a bit of a contradiction.

Another alternative to consider for both env vars and config files are command line arguments.


> I believe environment variables are bad for the same reasons global variables are bad.

Global variables are bad, but environment variables are actually more like dynamic variables: http://www.chriswarbo.net/blog/2021-04-08-env_vars.html

Dynamic scope is useful for things the caller knows better than the implementor, e.g. configuration, credentials, etc.

> Another alternative to consider for both env vars and config files are command line arguments

The two things which distinguish CLI arguments from env vars are:

- Env vars are usually readable from anywhere, whilst CLI args are usually passed around explicitly (more like lexical scope)

- Env vars are inherently key=value pairs, whilst CLI arguments are better suited to checking presence/absence (e.g. 'foo' versus 'foo --force'), parameters which don't need names (e.g. 'foo myFile') and variable-length lists of parameters (e.g. 'foo file1 file2 file3')


Hi Chris! Thanks for the link, it's an enlightening read, I learned about dynamic variable scopes today.

It did make me change my mind partially about "environment variables are bad for the same reasons global variables are bad." I concur that environment variables are more like constants than mutable globals, even in my language of choice, Python. If you only use them at process boundaries, they are fine; I admit to using them that way too:

  import argparse, os
  parser = argparse.ArgumentParser()
  parser.add_argument("--foo", default=os.environ.get("FOO"))
If they are used at a boundary within a process, however:

  def foo_function():
    return foo_implementation(os.environ.get("FOO"))
Then testing foo_function() becomes a problem because os.environ isn't dynamically scoped within the process. Each test case can set os.environ["FOO"], but then the tests have mutable globals even if the app doesn't. I know three ways to solve this, each with its pros and cons:

- 1. Treat the script as a black box, only test the script as a whole -- or not at all. How env vars are used internally doesn't matter. Works well for smaller scripts.

- 2. Keep the code as is, test functions individually by setting and resetting the environment variables in each test setup and teardown (a minimal sketch follows this list). Don't run tests in parallel.

- 3. Push all environment variable usage to process boundaries and make all inner functions pure functions that are only affected by their explicit input parameters. If needed, I even make standard in/out/error, logger instances and other similar globals explicit parameters or class members. Requires more boilerplate, works better for more complex projects. Testing any behavior becomes easier.
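
For option #2, a minimal sketch of the setup/teardown dance (a hypothetical test reusing foo_function/foo_implementation from the snippet above; unittest.mock.patch.dict restores the original environment on exit):

  import os
  from unittest import mock

  def test_foo_function_reads_env():
      with mock.patch.dict(os.environ, {"FOO": "bar"}):
          assert foo_function() == foo_implementation("bar")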

I prefer to go with option #1 or #3, as #2 feels dirty and makes my test cases smell of workarounds. #3 could look like this, with a few details omitted:

  import argparse
  import os

  parser = argparse.ArgumentParser()
  parser.add_argument("--foo", default=os.environ.get("FOO"))
  args = parser.parse_args()

  def foo_function(foo_value):
    return foo_implementation(foo_value)

  def main():
    ...
    foo_result = foo_function(foo_value=args.foo)
    ...

  ...
  
To agree with you, it would be great if the ex-globals-turned-parameters I'm passing around during option #3 would be dynamically scoped. Not shown in the example above, but imagine that instead of printing to sys.stderr, functions receive an stderr: io.IOBase parameter or a custom dataclass that contains such a field. The point is to get rid of mutable global state in all cases.

To disagree with you, I think the correct term for "things the caller knows better than the implementor" are parameters. I'm not sure there's a benefit to preferring dynamic scope for parameters when most languages default to lexical scope.

About your last two points I somewhat agree and somewhat still disagree: "CLI args are usually passed around explicitly" -- I think this is a pro, not a con. Further, CLI arguments are strictly more flexible than environment variables; most argument parsing libraries support key-value parsing in addition to boolean flags and lists.

However, regarding your overall point, which I understand as: environment variables used at process boundaries behave like dynamically scoped variables, and these are fine. I agree, as long as they stay at process boundaries.


> "CLI args are usually passed around explicitly" -- I think this is a pro, not a con.

Sure; I never said it's a con. They have different characteristics, and are both useful in certain situations :)

> I think the correct term for "things the caller knows better than the implementor" are parameters.

True; that's also the name Racket gives to dynamically-scoped variables https://docs.racket-lang.org/guide/parameterize.html

In fact, Racket uses a parameter (dynamically-scoped variable) to store the environment. This is actually slightly annoying, since the parameter is one big hashmap of all the env vars, but I usually want to override them individually. One of my Racket projects actually defines a helper function to override individual env vars while keeping the rest of the environment intact: https://github.com/Warbo/theory-exploration-benchmarks/blob/...


Global variables are primarily bad when they're mutable. Environment variables are (usually anyway) global constants, which are indispensable.

And generating config files sounds like a pain, probably more complexity than a lot of us really need. Though I don't disagree that it's a little silly to take env files too seriously as a format.


Files are written to disk. In a cloud setup that means possibly leaking credentials when your disk is re-assigned to another tenant.

Yes, cloud providers are supposed to properly erase hard drives before reassigning them. Can you be 100% sure they do, though?

With environment variables in RAM the problem is moot. Committing and/or generating .env files on production systems is completely missing the point.


If you're worried about that you should be worried about what gets written to memory too. You have little control over where virtual memory ends up actually storing your bits and bytes. Unless you run without swap, but that's just a bad idea overall.


Lots of distributions clean swap on boot or shutdown precisely for security reasons. Also, cleaning a relatively small swap is faster than zeroing a full disk.

Your argument does not justify the use of an easily recoverable .env file. Recovering a .env file is easier than recovering virtual memory.


"Files are written to disk" is not strictly true. In the use case where the config contains (hopefully short-lived) credentials, one would pass them in a temporary file that usually only lives in RAM (unless /tmp doesn't use tmpfs or the temporary config file is put somewhere else) and of course doesn't get committed to the repo. (I'm not sure if you meant git commit or filesystem commit.)

I sometimes find secrets to be safer inside config files since so many times the environment variables get dumped into logs – hence all the popular CI/CD products have features to try to scrub such secrets from their logs.

I agree about not using .env files in production; I'd not use them at all.


This is an advantage with sqlite as a config store as well - an initial db config file augmented in-memory with secrets, accessible from all major languages, without relying on the vagaries of the filesystem (Windows vs Linux tmp mount points), and easy to have multiple switchable configurations depending on environment, test mode (integration tests after deployment etc.) or customer.


TOML is a bad file format for human configuration.

For example, in the following file

  [hosts]
  "example.org" = "localhost:8000"
  "foo.com" = "localhost:9000"
  "sub.example.com" = "localhost:9002"

  certfile = "path/to/cert.pem"
  keyfile = "path/to/key.pem"
Ordinary human readers would generally think that the hosts table has 3 entries. But TOML considers certfile and keyfile to also be entries in the hosts table.

TOML has no way to end a [table] on its own; tables continue until EOF or until the next [table].
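
A quick check with Python's stdlib tomllib (3.11+) bears this out:

  import tomllib

  doc = """
  [hosts]
  "example.org" = "localhost:8000"

  certfile = "path/to/cert.pem"
  """

  data = tomllib.loads(doc)
  print(list(data["hosts"]))  # ['example.org', 'certfile'] -- certfile landed under hosts
  print("certfile" in data)   # False -- it is not a top-level key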


It's weird that the only way to get the following structure:

  {
    "hosts": {
      "example.org": "localhost:8000",
      "foo.com": "localhost:9000",
      "sub.example.com": "localhost:9002"
    },
    "certfile": "path/to/cert.pem",
    "keyfile":"path/to/key.pem"
  }
is to write the following toml:

  certfile = "path/to/cert.pem"
  keyfile = "path/to/key.pem"

  [hosts]
  "example.org" = "localhost:8000"
  "foo.com" = "localhost:9000"
  "sub.example.com" = "localhost:9002"
and other toml orderings, like in the parent comment, fail.


This is a tradeoff so that INI/TOML can avoid braces and whitespace rules, which is actually a win for (non-programmer) human usability. To most users, fixing an "unbalanced braces" error is "programming", especially since that's exactly the kind of error that parsers can't give useful advice for in the error message.

The tradeoff is that your most general top-level settings must come before your category-specific settings, which is usually a pretty natural layout anyway.


Why would you want to require a specific order in the resulting object representation? If you need ordering, use arrays. And if you need to do stuff like content signatures, it makes sense to use some form of normalizer anyway (e.g. alphabetic key order).


There isn't a need for specific ordering of keys in the object representation. But I want to be able to write parts of the toml document in different orders (e.g. "hosts" table before "certfile"; or "hosts" table after "certfile") and still have the same effective object.


Good point, but on the other hand a clear visual separation between global and section-specific settings makes sense as well.

Apache configs are my personal favourite hate subject here.


The "global" and "section specific" stuff stops being so clear once you need to have subsections. And the "clear visual enforcement" can also lead to forced unreadability, when the ordering forced on you doesn't match the most sensible/readable order for a human reader.

This is repeating the same mistake I see all the lightweight markup formats (Markdown, Org Mode, etc.) do - using implicit terminators for hierarchy nodes. It's a superbly annoying feature of outliner tools (including the one I otherwise love: org mode) that forces you to create extra levels of structure just for the sake of being able to surround subtrees with context.


I'm an ordinary human and I disagree. I would not expect a blank line to end the [hosts] table. I find it nice that you can visually separate different categories of entries under a table. I also find it nice that global entries must be at the top of the file – more organized and fewer opportunities for such global entries to get lost.


I’ve always liked YAML but I feel the world is moving towards TOML. Keen to see more of TOML’s issues to build a fair comparison.


It doesn’t help that YAML has footguns. I love it as an editable format, especially compared to JSON, but for library writers, supporting the whole spec (safely) tends to not be possible; Some even go as far as having a “safe load” function that purposefully violates the spec to remove footguns.


If you use "NO" it gets parsed as norway


The whole "`NO` is Norway" thing is indeed one of the footguns people love to bring up. However, as someone who is writing the YAML manually, and has a syntax highlighter, these issues don't manifest.


I dunno, this is where I said fuck it and never used it again.

I don't know what kind of mind would consider that an OK reserved word for a programming language, but OK..

I guess I'm lucky I didn't need to use yaml for my work.


I thought it was the other way around? i.e. "NO", in a document that intends "NO" to mean "Norway", is parsed as "False".
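
A quick check with PyYAML (assuming it is installed) shows exactly that:

  import yaml

  print(yaml.safe_load("country: NO"))    # {'country': False}
  print(yaml.safe_load('country: "NO"'))  # {'country': 'NO'} -- quoting keeps it a string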


I really like the Gura format (https://github.com/gura-conf/gura). Seems to combine the best of yaml, json, and toml for the use case of human created configuration files.


I've also seen the Dhall configuration language (https://dhall-lang.org/) mentioned.


Windows user here. Use .env files. Most unix-ish coding is done on WSL these days and bash is the default shell there.


I will admit, env files can be pretty "interesting" at times.

I've run into so many issues over the years around incompatibilities with how Docker Compose v1, Docker Compose v2 and Kubernetes tools process an .env file.

Oftentimes it's related to having characters like $ in your value. Across many different examples, sometimes you need to single quote the values, other times you need to use double quotes. Oftentimes with Kubernetes tools you can't use quotes (certain ways of populating config maps and secrets from an env file have serious issues if you use quotes). Sometimes you need to escape certain characters, etc. For a long time Docker Compose didn't allow `export MYVAR=coolvalue` because it had a space in it (it does now).

With that said, I can't realistically see dropping them for a config file because Docker Compose lets you use variable interpolation from an env file in a docker-compose.yml file which is awesome for reducing duplication. Having an env file that you can source in a shell script is also very convenient for ancillary commands that go with your project.
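
For example, a hypothetical fragment showing that interpolation (the variable names are made up):

  # .env
  TAG=1.2.3
  WEB_PORT=8000

  # docker-compose.yml
  services:
    web:
      image: myapp:${TAG}
      ports:
        - "${WEB_PORT}:8000"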


So, for my own projects in Go, I created a simple library for loading configuration, so it can load configuration either from .toml files, from .env files, or just from environment variables.

Then you define your config like that:

  type ServerConfig struct {
    Address  string `env:"HTTP_ADDRESS"`
    Port     int    `env:"HTTP_PORT"`
    UseTLS   bool   `env:"HTTP_USE_TLS"`
    CertFile string `env:"HTTP_CERT_FILE"`
    KeyFile  string `env:"HTTP_KEY_FILE"`

    Timeout int `env:"HTTP_TIMEOUT"`
  }

  type Config struct {
    Server ServerConfig `env:"SERVER"`
  }

So, for example, with code like this:

  if err := config.LoadToml(&cfg, tomlFileName); err != nil {
    return nil, err
  }

  if err := config.LoadOverrides(&cfg); err != nil {
    return nil, err
  }
It can load from a TOML file and from environment variables. This means the service can either be started in a container, or run locally with a .toml file.



.env is fine. Don't need another format.


TBH, config is one of those Pandora's boxes - what happens if I want to do an integration test using a dev database and local Docker images? What about testing after deployment to prod?

Given the myriad of configurations that can exist, including secrets management, static data etc. I would be very tempted to try to build tooling around a sqlite database, storing all your configs including static data, and update it with secrets at runtime. This way you can even remotely interface with the configuration for debug/monitoring, lint the config before commits etc.


One way to solve this is to be able to import into the .env file:

    import values from local/or/public/private/url
    ...values
When reading the file, the environment variables will be obtained from the URL and populate the environment.

This is what I had in mind when designing the import functionality for deon [1].

Being able to import also makes it easy to have a .base, a .production, a .local setup, and combine them accordingly.

[1] https://github.com/plurid/deon


Neat solution; sqlite gives you much more though. What happens with .env files with inheritance, e.g. env.linux overridden by env.linux.local? You need one URL per permutation, or to call them in the correct order, and now you have a config for your config.

What about if you require multiple local configurations for eg. testing inside and outside docker?

What if you have several developers working on the same codebase with different settings? What if you have static data that changes with environment? Sqlite can answer all of this, including parsing quotes and handling filepaths in a uniform fashion


Not sure what you mean by "config for your config"; it seems that is what you would end up with for SQLite too: someone would still have to manage those databases. The way I use deon for inheritance is with one directory per project, with one file to be read at startup.

file tree:

  environment/
    .env.base.deon
    .env.local.deon
    .env.production.deon
.env.local.deon file:

  import base from ./.env.base.deon

  {
    ...#base
    NEW_VALUE foo
    OVERWRITING_VALUE boo
  }
Then the node process will be started with:

  deon environment ./environment/.env.base.deon -- node build/index.js
If you want to take it a step further you could import values from a URL, even using a token from an environment variable for authentication, such as:

  import values from ./.env.base.deon
  import overwrites from https://deon-data.example with #$DEON_TOKEN

  {
    ...#values
    ...#overwrites
  }
Not sure what's the problem with several developers on the same codebase. Aren't they using their own, individual machines? Each developer can have their own environment file as they wish, or their own environment DEON_TOKEN.

If the static data changes with the environment then it's not that static, or I don't know what you mean. If I were using deon, I would split the .base file into two or more, and import accordingly.

Not sure what you mean by "including parsing quotes and handling filepaths in a uniform fashion". Have you found a bug in deon?

Anyhow, if your use case is too complex, of course you will need special tooling. deon is more of a research project for my own requirements; it still needs to be rewritten in a compiled language, and for now it is only for the JavaScript ecosystem.


My major pain point was that I had big JSON files inside my .env files. We use Cloud Foundry at work and it uses plenty of these for configuration.

I wrote a small Rust tool called ‘json_env’[0] to read JSON files and supply them as ENV vars to a program. I’m working on it in my free time and eventually want to also replace direnv with it. TOML and YAML support is also planned.

[0] https://github.com/brodo/json_env


But why? Is it not easier to store the JSON as file and have an env variable that points to it?

JSON as env value is utter madness.


Having no config files is part of the 12 factors [0]. All cloud providers I have worked with adhere to this.

[0]: https://12factor.net


12factor doesn't mandate that you put the entirety of a JSON file into a single variable. That is what you implied with your original comment.


Or better yet, have sane defaults and just stick to exporting whatever you need to the local environment. If that ends up being tons of variables, chances are something is very wrong somewhere.


I still don't see why people use .env when direnv is a thing, and more Unixy.


Isn't direnv a tool for interactive shells (which also relies on current directory)? There are more ways to launch a process than from an interactive shell (standing in a certain directory, even).


Direnv loads the environment variables listed in .envrc in the current shell. On Unix all spawned processes inherit the environment of their parent.

It's perfect for development, and reading .envrc outside of interactive shells is just a "source .envrc" away, or use the appropriate plugin (such as Emacs direnv mode)


Yes. I'm aware of this. My point is, not everything that starts a process is an interactive shell (or a subprocess of an interactive shell), and even when you do launch things from a shell, you're not necessarily in the directory of the thing you launch. Examples would be automatic builds on code commits, cron jobs or whatever.

So in the end the magic of direnv is only helpful in special circumstances. Outside those circumstances, you'd have to treat its directory-local configuration file like an .env file anyway, and a more complicated one at that, given that it's likely to contain shell keywords (such as unset) which your parser has to be aware of - so now you're worse off than with regular .env files.


> There is no standard

> Not cross-platform

> python-dotenv

ehem ... shell.nix

https://nixos.wiki/wiki/Development_environment_with_nix-she...



