Using the sqlite db as the canonical configuration is a nice idea, but reminds me of the fun of bind, editing one file to re-generate another. It is as if he is ignoring the advice in his older post about making software usable.
Instead of having a simple text format that is the configuration, he encourages everyone to write their own. I can imagine the fun now on a mailing list - debugging n+1 languages to start a webserver.
When you deploy, do you deploy the sqlite database file or the script that generates it? Does the daemon check if the db is out of date on startup? Does the script clear the existing db when it runs? Does it have merge logic? How do you keep them in sync?
'But it's powerful' I hear you cry, but my retort is simply 'it is effort to maintain, deploy and configure'. It smacks of: here is a bus port, go build yourself a keyboard. This sort of toolsmithery is what brings us delights like autoconf.
A simple change would improve things a lot, whilst maintaining the flexibility he craves:
instead of 'generate a config file' with your tool and hand it off to mongrel, the tool you write is part of mongrel.
You write a script called /etc/mongrel/makeconfig, that takes one argument, the name of the mongrel settings database. When mongrel starts, it invokes makeconfig to build the database.
You can provide a bunch of trivial makeconfig scripts as defaults, including a shell script that just runs a bunch of sqlite statements.
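For illustration, here is a minimal sketch of what such a makeconfig script might do, written in Python. The `setting` table, its columns, and the default values are hypothetical and not Mongrel2's actual schema. The sketch rebuilds the database from scratch on every run, which is one way to sidestep the merge-logic question.

```python
import sqlite3

# Hypothetical settings; a real makeconfig would compute or read these.
DEFAULT_SETTINGS = {"host": "localhost", "port": "6767"}


def build_config(db_path, settings=DEFAULT_SETTINGS):
    """Rebuild the settings database at db_path from scratch."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the inserts on success, rolls back on error
            # Table name and columns are assumptions, not Mongrel2's schema.
            conn.execute("DROP TABLE IF EXISTS setting")
            conn.execute(
                "CREATE TABLE setting (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO setting (key, value) VALUES (?, ?)",
                sorted(settings.items()),
            )
    finally:
        conn.close()
```

Saved as /etc/mongrel/makeconfig, the script would end by calling build_config(sys.argv[1]) with the database name the server passes in.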
> Instead of having a simple text format that is the configuration, he encourages everyone to write their own.
Uh no, you obviously didn't read the article where at the end I show you the default format we provide to you in m2sh. I think before you go waxing poetical about something you should learn more about it.
And it's not powerful, it's simple. Take a look at the code that loads the configuration in mongrel2 from sqlite, and compare that to similar C code (also written by me) in the m2sh source to just parse a config file. Code doesn't lie and in this case, parsing a config is a hell of a lot harder than just loading it out of a sqlite file.
> parsing a config is a hell of a lot harder than just loading it out of a sqlite file
Is there really no library for that? </rhetorical> The usual INI file format has been unchanged since the 80s!
[section1]
param1=value1
param2=value2
; this is a comment
[section2]
param3=value3
And the reason it's stuck around for so long, on every platform you can think of, is that it works, it's bulletproof, and everyone understands it at a glance. If anyone can come up with their own scheme, you end up where Javaland is right now, where the "application" is just a runtime for the real application, which is written in a custom one-off language: the 10,000-line XML configuration...
Anyway, the people who get called at 3am when production is down won't thank you for introducing yet another config file format.
Agreed, the potential issue of different people having different configurations sounds a bit dangerous. OTOH, most issues you mention are true whether you use a db or a plain configuration file - they are just more explicit with sqlite3 (which acts as the model in an MVC organization of your app settings).
Incidentally, I am thinking quite hard about those issues (configuration representation, live change/querying, versioning) for internal apps; it is not an easy issue once your app has more than a few parameters.
Bad call. The canonical version of the server configuration doesn't play nice with version control, and so staff has to work with a facsimile and hope that nobody makes out-of-process changes.
Because programmers are expensive, but sysadmins are free.
If you're a programmer working for a large organization, despite sitting in an open plan office surrounded by dozens of other programmers, and knowing that there are hundreds in the building, you probably nevertheless believe that yours is the only code running in production, and the entire operations staff exists in order to support it...
All I see are things about religion, dressing like the status quo, and other things that say "old way of doing things". I think you mean to say "official" or something like that. Let's use official.
In this case, the config file you use is the "official" configuration. If you use the m2sh format then that's the default official format. If you put those into version control then you have solved your problem.
But then, I'm sure you have some witty reply to this reply to your canonical one-liner:
"Conforming to orthodox or well-established rules or patterns, as of procedure."
I used the word in the sense of "authoritative"; M-W might say "the most solemn and unvarying part". Which seems like the right term to use for "the form of the config the server actually honors".
You chose this as a design feature worth highlighting; you asked for people's opinions about it. I think it's flawed. Other tools I like are flawed in similar ways (tinydns, for example). Don't worry, I'm not leading a movement to overthrow you.
So you meant "official" then. I actually don't mind your rebuttals and overthrow attempt, I live for that kind of debate. You just tend to write one-liners that really have no meaning and provide nothing useful as feedback.
For example, the problem you describe of a lack of an official configuration exists not because of the storage mechanism but because you have a storage mechanism. If you use a VCS then you have the same problem since the stored version can be different from the actual version running on any one server.
That raises the question of how what's on the server gets to be different from the "official" stored version, and in this case it's people. It's always people.
So, let's look at the two mechanisms and compare:
1. VCS and just a config file means you have no idea who changed that config file. It's not in the VCS remember? Post crash forensics are then difficult and you really only have a diff.
2. VCS, a config file, and a sqlite storage with commit logs (we have those, you can see who changed stuff) means you have diffs of the config file from the VCS, and logs from the sqlite database. Diff the config in the VCS just like before, and if someone is going in the super back door to run SQL behind the scenes (which I don't get), then you know they did it. If they use the tools then you have the commit log in the database. You've now got way more data than you did with just config files.
But, if I were to extrapolate your real complaint, it is probably this:
"When the server breaks, I can not run 'diff' on the sqlite3 database config and see what changed; some jackass could run raw SQL against it and I would never know. Can I please have a way of comparing a given database to what a config file would produce? Kind of a config delta?"
With that in mind, I present to you the m2sh diff command ticket:
I'm glad I was able to help you clarify your design.
The "diff" command you propose seems like a lot of machinery just to make up for the fact that your design --- which is, in essence, a Unix take on the Windows Registry --- doesn't have a canonical human-readable form you can just pop open in a text editor.
In the interests of full disclosure: I'm very unlikely to ever use Mongrel2. My goal isn't to provide you with useful feedback. Instead, you opened up a conversation about this design decision, and I think the fact that it's prone to a particular failure mode is relevant --- especially to the (presumably numerous) HN readers who might think of emulating you.
I hope you read the "what year is it" comment as a judgement on INI files though, and not your code. We appear to share an opinion about INI files.
> which is, in essence, a Unix take on the Windows Registry
Ah, the windows registry canard. You are the canard king man. Let's see you've pulled out djb and windows registry. I think your next linux fear buzzword will be "embrace and extend" or possibly "free as in freedom". That'll really enhance the F in your effing statements.
The flaw of the Windows registry is not that it wasn't a text file, but that it was a massive monolithic database that held every program's configuration, had no enforceable schema, and couldn't be examined. Our database is only for mongrel2 and has a solid clean schema everyone can go look at with "sqlite3 .dump". So that's a pointless comparison.
And you keep using this word "canonical" where it doesn't apply, so let's use official. The "official" configuration file is loadable into a text file, but again, I think that's irrelevant for the use cases the design targets. Either you're doing your own setup and m2sh with mongrel2.conf is fine, or you're doing large scale automation and the database is better.
In my design you get both, plus more choice. In your design you only get the status quo configuration which sucks.
Also, if I do m2sh diff it'll be just for fun. Nobody really needs that since most people don't hire Ninjas.
I choked on the whole SQLite configuration aspect of Mongrel2 as well, but I continued to read further and started to see how it ties in to the entire mongrel2 philosophy: "Mongrel2 is an application, language, and network architecture agnostic web server that focuses on web applications using modern browser technologies.". Agnosticism is important to Mongrel2. If you open your mind to the concept, you can begin to see how using SQLite for configuration becomes a bike shed in the entire scope of the project.
If I'm a shop looking for a single web server to host apps written in a variety of languages, chances are that I'm going to have an opinion about the process used to configure and administer that server. That opinion is going to include the structure, process, and implementation of the entire server and management tool stack. The same thing happens with servers that use plaintext configuration formats as well. Have a look at the differences in Apache administration on Redhat based distributions versus Debian based distributions. Look at the number of tools that exist for managing Apache conf files. True, not everyone uses them, but the manageability of editing these text files by hand only scales so far. At some point, you turn to a tool that abstracts away the underlying (canonical?) configuration file. At that point, the underlying format -- text or SQLite -- becomes less important. Hence, Zed's repeated referral to this whole issue as a "bike shed".
What I think is really cool about this entire approach is that there will likely be a wide variety of tools built to manage different types of Mongrel2 hosting environments. The power of SQLite will come to bear when implementing these tools. SQL is a powerful tool for collecting and updating information. There are plenty of text libraries that perform similar tasks, but text is a lowest common denominator. At the end of the day, I think it's a pretty interesting choice.
I agree with you - I prefer a human readable/git friendly config file. That's why I invested 10 minutes coming up with this solution and putting it into run_mongrel2.sh.
[edit: earlier version of this post was unnecessarily snarky.]
Congratulations, you have invented the "facsimile" that tptacek spoke of. Because your "canonical" dump file is just a dump file. At any given moment your server cannot be guaranteed to be in a configuration that is specified by your dump file, or by any version of your dump file. Why? Because the server exposes a tempting non-file-based interface to its internal state, and people who are trying to fix the servers at three in the morning will tend to use this to make "out-of-process changes".
If the server is broken, do you dare to naively overwrite its configuration with the version that is in git? The general answer is "no". Because two weeks ago somebody did an INSERT into the server's internal database to make Customer X's site work, and afterwards they didn't take time to properly dump the config and commit the dump to git. Because it was 3AM, and the edit was only one of seventeen switches that they were desperately flipping, and they just forgot. When you restart the server that config setting will be lost and the customer will crash and you will spend hours trying to recapture the lost state of the server.
You can prevent all this by making it difficult or impossible to alter the server config without altering the config file. But then why did we bother exposing the non-file-based interface in the first place?
The non-human readable interface is exposed for experts who wish to have python/ruby/whatever edit the sqlite interface. If you don't like this interface, don't use it. If your sysadmins are desperate to screw things up, they will.
If you really want to make it hard, your runmongrel2.py script can store the actual sqlite file in os.path.join('/tmp',str(uuid.uuid4())), thereby making it more difficult for sysadmins to find/edit.
Lastly, if you are truly paranoid, you can put something along these lines into your server restart script:
> At any given moment your server cannot be guaranteed to be in a configuration that is specified by your dump file, or by any version of your dump file.
It's a database, and one that's used by the entire world to reliably store data guaranteed. So this is just false on technical merits.
> Why? Because the server exposes a tempting non-file-based interface to its internal state, and people who are trying to fix the servers at three in the morning will tend to use this to make "out-of-process changes".
Yet, you say people store things in version control. You have the exact same problem then. People can go and make changes out of band that aren't in version control and forget about them. Happens all the time, and that's a people problem.
What you seem to not understand is that you just need to take people out of the equation. Automation is the future of operations, and a system that allows any programming language with sqlite3 bindings manage the config is going to help with automation.
Once you don't have humans logging onto machines like it's the 90's then you've actually solved the problem. A config file won't do that.
Your last sentence is hard to argue with, but it's worth pointing out that there's a cost to designing systems as if the universe you want to live in is the universe we actually live in. Let's call it "the Bernstein tax".
Huh? I have no idea what this is intended to mean.
For what it's worth: I'm an ardent admirer of Bernstein, and happily pay the Bernstein tax on my systems. All I'm saying is that I recognize that I'm paying it.
Whenever someone wants to shoot down a new system that is weird, they pull out djb. "DJB did this thing with qmail and everyone hated it, but he wanted it his way and that's why nobody uses qmail."
It's another fear buzzword for sysadmins. It's effectively saying, "If you use Mongrel2 then it'll be like qmail. BEWARE!"
If Mongrel2 was like qmail, it would in fact be very likely that I would use it. If you think I'm making up my admiration for Bernstein's work for the sake of argument, here's what 1 minute of Google searching turns up:
Not myself being a sysadmin, I couldn't tell you what the fear buttons are. I had that job in '96-'97 (the same time period where I read qmail, which I deployed in early beta versions at the rather popular ISP that employed me, because I love qmail), became a developer, and never looked back.
As a sysadmin for about ten years, this argument makes no sense to me.
Zed has created a new, flexible configuration method using sqlite, which lets me easily automate changes to the web server configuration. If version control is an issue, I can easily automate taking a human-readable, source-controllable dump of the database however often is reasonable.
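As a rough sketch of that kind of automated dump, Python's sqlite3 module can write a database out as plain SQL text that is fit for committing. The function name and the paths passed to it are placeholders for illustration, not part of Mongrel2 or m2sh.

```python
import sqlite3


def dump_config(db_path, dump_path):
    """Write the database at db_path to dump_path as plain SQL text."""
    conn = sqlite3.connect(db_path)
    try:
        with open(dump_path, "w", encoding="utf-8") as out:
            # iterdump() yields the SQL statements that recreate the database.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()
```

Run from cron or a commit hook, the resulting file can be checked into whatever VCS is already in use.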
Bernstein, on the other hand, decided that everything about the UNIX system was wrong, and built his system around what he asserted was a better way. All of qmail went into /var/qmail, which meant you had to either symlink everything in place anyway or add it to your $PATH, which caused unnecessary confusion.
He encouraged a different (non-human-readable) timestamp for log files (http://cr.yp.to/libtai/tai64.html), making it a huge hassle to take a quick glance through your log files for something without piping it through at least one extra process.
While we're talking about config files, he also used one config file for every variable (allowed hosts, virtual domains, etc), meaning that checking to make sure you configured something right involved cat'ing all kinds of files back and forth and scrolling up and down in your terminal.
He designed the qmail build process to hard-code the user IDs into the binaries, meaning you couldn't move a compiled version of qmail from one server to another unless all their UIDs/GIDs were identical. He also made qmail use inode numbers as file names for messages in the mail queue, meaning you couldn't move a queue from one server to another without renumbering each file.
Finally, he prevented people from shipping modified copies of his source code, meaning that adding any new features to qmail required patches, with a very significant chance that two patches for different features would interact in unexpected or inconsistent ways. It also meant that if you didn't have the original source tree and wanted to add one single feature (or even change the UIDs qmail ran under) there was a significant chance that you would lose features and/or your config would break.
Presumably this last rule was put in place to prevent people from fixing all the brain damage that went into qmail's ridiculous design. It caused nothing but massive headaches, and linux distributions had to jump through hoops to provide qmail as a package, since one single change to the source required the installer to download, compile, and install it for you.
qmail was a system administrator's nightmare because DJB had in his head what he believed to be a much better way to manage software on a UNIX system, and he went out of his way to make his software use that and force people to deal with it if they wanted to use that software (making qmail open source for most of its lifetime). The problem is that that might well have been a great system, if everything was using it, but since no one but him used his wildly varied system it just made qmail a headache. For as long as sendmail was the only alternative, it was a headache worth dealing with, but once Postfix came along most (open-minded) sysadmins jumped ship at the first opportunity.
Qmail is horrid, non-free software, but, like Windows, it was horrid, non-free software we all had to deal with for a long time. Now there are other, better options.
So I'll reiterate: qmail was a horrible mess that made my life, as a system administrator, a huge hassle. Mongrel2 and its DB config file, on the other hand, are ideas I'm ridiculously excited about, and can't wait to put into production, because the flexibility and power available in this sort of config philosophy are things I can do a great deal with.
Don't compare Zed to DJB. DJB looked down his nose at anyone who didn't like his pure, pristine system, and if you didn't like it, too bad for you. Zed, on the other hand, has gone out of his way to make his software work with whatever workflows we, as sysadmins, need to implement. That's worth some high praise, as far as I'm concerned.
I skimmed this briefly, saw the sentence "qmail is horrid, non-free software", and didn't bother to read closely, knowing that it was written by someone who hasn't even taken the time to learn that qmail is in the public domain. You probably aren't even engaging the argument, which does not revolve around Bernstein's software being bad.
Djb did try and get users to replace init and the fhs in order to use his mail app. It's not a buzzword, it's a genuine issue that made people not want to use qmail.
A configuration management system like cfengine or puppet is probably distributing files directly from revision control. Changes made to files on the system will likely be reverted unless someone specifically disables the agent.
> a system that allows any programming language with sqlite3 bindings manage the config is going to help with automation.
It's just a different config language; there's nothing that inherently allows more automation. The guy writing a GUI interface might find sqlite3 more useful. The guy using puppet and erb templates will probably prefer the text file.
The point of a configuration language is to provide a human interface. If you don't need the human interface then why do you expose configuration at all?
You, after examining diff outputs: "No, you are adding a second config file. Use the existing config file and deploy script."
Contributor: "Ok, fixed and rebased, now there is no trace of my stupid misconfiguration in the repository. I'll be sure to learn about the project conventions before committing in the future. My mistake."
I don't follow: surely a SQL dump is possible with sqlite, and you could use just that in your VCS?
As for ini files, the format is supported in the Python stdlib, which makes it a natural default if you don't need too much. If .ini is not enough, I think in general you need something with a bit of logic, at which point using the database itself makes sense in mongrel2's case.
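The stdlib support in question is the configparser module. A minimal sketch, reusing the sample INI text from earlier in the thread; the function name load_ini is just for illustration.

```python
import configparser

SAMPLE = """\
[section1]
param1=value1
param2=value2
; this is a comment
[section2]
param3=value3
"""


def load_ini(text):
    """Parse INI text into a plain dict of {section: {param: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}
```

Every value comes back as a string, so anything that needs types or logic has to be layered on top.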
What? You should totally be into .ini files. I mean, one incomprehensible difficult to automate bizarre ass user interface file is no better than any other.
It's a false dichotomy to suggest that because INI files are incomprehensible, difficult to automate, and have bizarre asses, all text configuration files must be the same.
But, like your comment, that's just message board geekery. I wouldn't want to suggest that it's worth arguing about.
You keep using logic terms that you don't seem to understand. It's not a false dichotomy, it's a generalization that has no external validity. It has internal validity though.
Vocabulary really seems to be tripping us up today. Let's just cite the Nizkor list when critiquing arguments. For reference, I'm invoking #24, "False Dilemma":
Bill: "We'll have to keep our configurations in a database and avoid flat text files."
Jill: "Why?"
Bill: "Because otherwise we'll be stuck with incomprehensible text files with bizarre asses."
I wonder: what kind of format is easy to automate? What makes configuration files difficult to "automate" is their multiple formats (especially on unix), the fact that it is not easy to reload a configuration (most applications don't even allow it), and that there is no easy way to check that your configuration matches how the app is actually configured (especially for bad software). The format only helps with the first point.
Cisco configuration files don't play nice with version control either. Anyone (with access) can log into any Cisco router, make changes and forget to write the running config to the startup config. Even if they do that, they might forget to copy the new startup config to the TFTP server. And even if they do THAT, they might not even bother doing the "svn commit" or "git commit" or "whatevervcweareusing commit". I view the sqlite database as the running configuration of Mongrel2. And from where I sit, it seems vastly simpler to do a diff of a configuration file stored in sqlite than it is to do a diff of a configuration from a Cisco router.
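One way such a diff could be done is to compare the SQL text dumps of two database files. This is only a sketch using Python's sqlite3 and difflib modules; the function names are made up here and it is not the m2sh tooling.

```python
import difflib
import sqlite3


def dump_lines(db_path):
    """Return the SQL statements that recreate the database, one per item."""
    conn = sqlite3.connect(db_path)
    try:
        return list(conn.iterdump())
    finally:
        conn.close()


def diff_configs(old_db, new_db):
    """Return a unified diff (list of lines) between two config databases."""
    return list(
        difflib.unified_diff(
            dump_lines(old_db),
            dump_lines(new_db),
            fromfile=old_db,
            tofile=new_db,
            lineterm="",
        )
    )
```

An empty list means the two databases dump to identical SQL.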
I did the opposite in a large Windows-based real-time algorithmic system with tens of thousands of parameters. The parameters were stored in an MS-SQL database, under a complex schema.
For each branch and version, every developer was adding new parameters, so after merges or upgrades it quickly became parameter hell.
I changed the parameter storage implementation to use CSV files instead of MS-SQL, while the external API didn't change.
Now we were able to add .CSV files to the source control and parameters were versioned and automatically merged.
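A small sketch of what CSV-backed parameter storage can look like, in Python. The two-column name,value layout is an assumption made for the example; the system described above was surely more involved.

```python
import csv


def load_parameters(csv_path):
    """Read a two-column name,value CSV file into a dict."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row["name"]: row["value"] for row in csv.DictReader(handle)}


def save_parameters(csv_path, parameters):
    """Write parameters sorted by name so diffs and merges stay stable."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["name", "value"])
        writer.writerows(sorted(parameters.items()))
```

Writing the rows in sorted order is a deliberate choice: two developers adding different parameters then touch different lines, which is what lets the VCS merge them automatically.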
Additional bonus - you can edit .CSV files in Excel ;)
No opinion one way or another about how Mongrel2 handles configuration, but are INI files really "the fad today"? Just strikes me as an odd insult for something so old. (I don't work with Python, so maybe that's the faddish part of all this?)
A technology doesn't have to be recent to still be applicable. INI files have been around a long time, but they are a well-understood mechanism for storing configuration settings and have mature, well-debugged components for handling them. Not that I think sqlite isn't mature, but it seems like overkill for storing simple configuration settings.
I usually find that when people request something weird, it's because they have a "special sauce" idea they want to implement and not tell me about.
As for sqlite3 being overkill, go check out the code for loading the config with sqlite, and for just parsing the config in m2sh. Parsing is way harder and more overkill than just querying a db.
You misunderstood my point (maybe my post was too short). I wasn't complaining that the INI format was too old. I just found it odd that Zed's post began by making broad jokes about fads and "the future" and then went on to talk about INI files. I didn't think INI-style configuration was a current fad - though now I wonder if it is among Python devs.
SQLite seems like more of a PITA for some cases, but when using something like chef to configure an instance, SQLite makes more sense than using something like sed to mangle a config file.
Now, yes, you can define a whole config file in a template, but sometimes you need to have a separate recipe that makes a change to an existing config (I'm looking at you engineyard), and for that, we must unfortunately resort to sed.
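For comparison, this is roughly what the database version of "a separate recipe that changes one existing setting" could look like in Python. The `setting` table with key and value columns is hypothetical and not Mongrel2's real schema.

```python
import sqlite3


def set_setting(db_path, key, value):
    """Change one setting in place, leaving everything else untouched."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back if we raise
            cursor = conn.execute(
                "UPDATE setting SET value = ? WHERE key = ?", (value, key)
            )
            if cursor.rowcount == 0:
                # Unlike a sed pattern that silently fails to match,
                # a missing key is reported to the caller.
                raise KeyError(key)
    finally:
        conn.close()
```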
Exactly. With m2sh the regular small usage case is very easy, and you get tons of capabilities other servers don't have. It's not preventing you from working how you work now, just gives one more step.
But, for everyone else, and the future of operations, this kind of configuration storage is insanely useful. The idea that we could point mongrel2 at a redis store in the future and have the 1000s of servers some companies put out automatically update their configs in realtime is just sexy as hell.
That's pretty much what we do at our company (albeit with Riak and it's not mongrel2) - we have x servers with shared config and they all need to get that config on startup. I shudder to remember the early days of running sed on multiple configuration files on multiple servers using parallel-ssh.
I hate config files, having written a software tool GUI to configure lots of daemons that traditionally use config files.
The steps, in general, were parse-modify-overwrite, a real hassle considering the diversity of grammars the daemons employ, each following the "make your own parser" mantra.
I welcome Zed's initiative of using sqlite; making a configuration GUI tool for it would be a breeze.
I haven't found an answer to the question of why mongrel2 doesn't use ini files.
It's not so hard to generate a sqlite file, but it is even easier to generate ini, yaml, json or xml if you don't like a particular ini format. And you can manage these files with git/svn, and the config loader could show you the line with the error if loading fails.
There are a number of json and yaml libraries for C. SQLite is portable for sure, but json and yaml parsers usually consist of a few files without any dependencies.
A couple people have mentioned this, and really I'll tell you a little secret:
The config loading in Mongrel2 is all abstracted away and could support anything. I haven't done anything explicitly to let you write a "config load module" but it'd be possible. In theory, since the design kept this MVC idea, you could configure out of a config file, redis, couchdb, really anything you need to get your ops mojo working.
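To make the idea of swappable config sources concrete, here is a sketch in Python. It is not Mongrel2's C code; the loader functions, the `setting` table, the `[settings]` section, and the dispatch on file suffix are all assumptions made for the example.

```python
import configparser
import sqlite3


def load_from_sqlite(db_path):
    """Load key/value settings from a hypothetical 'setting' table."""
    conn = sqlite3.connect(db_path)
    try:
        return dict(conn.execute("SELECT key, value FROM setting"))
    finally:
        conn.close()


def load_from_ini(ini_path):
    """Load key/value settings from a hypothetical [settings] section."""
    parser = configparser.ConfigParser()
    with open(ini_path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return dict(parser["settings"])


# Each backend returns the same plain dict, so the server never needs to
# know where its configuration came from.
LOADERS = {".sqlite": load_from_sqlite, ".ini": load_from_ini}


def load_config(path):
    """Pick a loader by file suffix and return the settings dict."""
    for suffix, loader in LOADERS.items():
        if path.endswith(suffix):
            return loader(path)
    raise ValueError(f"no config loader for {path!r}")
```

A redis or couchdb backend would be one more function with the same return type, registered under whatever key scheme suits it.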
I could have easily just gone with a config file as the default setup instead of sqlite3, but then where would all the FUD slingers and bikeshedders go to waste their energy? :-)
I haven't seen the Python community complaining about the config methodology. It usually just seems to be the people who can't understand new things that complain about change. As someone who would be managing this in production, I welcome it.
Already get those. There's folks who want to use redis, couchdb, just about anything that can store configurations.
I think the thing most existing sysadmins don't get is that this is written for people who have to manage mongrel2 servers in a modern way. It doesn't prevent you from doing your usual manual ssh work, but it's aimed at the future where people won't be doing that so much.
It's kind of sad, because I want sysadmins to learn to code so bad, and I even wrote a book so they could learn it, and have helped tons of them get better jobs, yet they still resist awesome features like this. They're really just holding themselves back by not realizing automation and code are their future.
I don't mean to nitpick, but the sysadmins you're talking about are the bad sysadmins (or maybe just mediocre sysadmins). Any sysadmin worth his salt knows that when you automate something, you only have to do it once, and you remove any mistakes down the road. Once you have the script written and tested, you don't have to worry about typos ever again. If it weren't for automation, my job would be unbearable, but writing scripts to automate systems is the most fun I get to have in my day job.