Most application's file formats are structured as a tree, not as flat tables. If your application's data is flat tables or name-value pairs then SQLite is an obvious choice. But if it is tree structured then it is less obvious. You can still save your tree in JSON format as a blob in a SQLite table but in this case the benefits are fewer. But if in addition to the JSON you have images or other binary data then once again SQLite offers benefits, because each of those binary files can be additional rows in the SQLite table. This is far easier to handle than storing them in ZIP format.
Maybe not as obvious for those without formal education in """database normalization""" but it's pretty trivial to convert from a tree structure to a flat table structure using foreign key relations. Recursive queries aren't even that difficult in SQLite, so self-referential data can be represented cleanly too, if not a bit more difficult to write. IME most applications "tree structures" aren't self-referential and are better formalized as distinct entities with one-to-one relationships (ie. a subtree gets a table).
There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
Until just now, I've been trying to figure out why people think that JSON is necessary in the database? Yes, lots of data is hierarchical, and you just normalize it into tables and move on. The fact that some people don't work this way, and would like to put this data as it stands into a JSON tree hadn't occurred to me.
What problem does normalization solve? You don't have to parse and run through a tree every time you're looking for data. You would, however, need to rebuild the tree through self joins or other references in other cases, I suppose. It depends how far you break down your data. I understand that we all see data structures a bit differently, however.
> There's always the lazy approach of storing JSON blobs in TEXT fields, but I personally shy away from that because you lose out on a huge part of the benefits of using a SQL DB in the first place, most importantly migrations and querying/indexing.
SQLite at least provides functions to make the “querying” part of that straightforward: https://sqlite.org/json1.html
What problem are you trying to solve with this approach? Unless your document is huge and you need the ability to read or update portions of it, it is better to just read and write JSON.
There's a laundry list of benefits that all add up, not like one specific killer feature. Some applications really do have very complex configuration needs, but it's sorta situation dependent on whether embedding a scripting language or a database is the right solution (for really simple cases I'm more likely to reach for TOML).
An incomplete list of benefits of using SQLite:
- Runtime config changes for free
- type safety
- strong migration support
- incorrect configurations can be unrepresentable (or at least enforced with check constraints)
- interactable from text-based interfaces and strong off-the-shelf GUI support
Type safety as a benefit of SQLite? For me type safety is a negative of SQLite. Being able to store a different type that what the column is declared to store is a bug (not a feature). I also find the lack of DATE and DATETIME/TIMESTAMP to be less than ideal.
We are talking about an application file format, so "type errors" are about who's right: the application (even better, multiple equally right implementation of a specification) or random hackers altering the file in incorrect ways.
Loose type checks, e.g. NOT NULL columns of "usually" text, are loose only compared to typical SQL table definitions; compared to the leap forward of using abstract tables and changing them with abstract SQL instead of using text or byte buffers and making arbitrary changes, enforcing data types on columns would be only a marginal improvement.
Most frameworks can serialize and deserialize JSON from strongly typed classes. For example, Newtonsoft in .NET. The rest isn't worth the effort for most people. Your scenario may be unusual.
I've certainly had some unusual contents in the past where we had approximately 10,000 configurable properties on the system, but we didn't use SQLite for that. Regardless, you ignored 3 of the 4 (I'll ignore the last one, it applies to JSON too) other points I made. My use cases aren't that weird and I'm not saying reach for SQLite every time, it's one option out of many. Migrations and runtime configuration change alone justify it for me in many cases.
I am not really classically trained on the subject but I think this is the idea behind relational storage, it is to have better extraction options, you don't have to treat your data as a single document at a time.
Naively, most data looks hierarchical and the instinctive reaction is to make your file format match. But if you think of this as a set of documents stacked on top of each other if you take the data as a bunch of 90 degree slices down through the stack now your data is relational, you loose the nice native hierarchical format, but you gain all sorts of interesting analysis and extraction options.
It is too bad relational data types tend to be so poorly represented in our programming languages, generally everything has to be mapped back to a hierarchical type.
I had some json data that I wanted an annotation interface for. So I asked codex to put it into sqlite and make a little annotation webserver. It worked quickly/easily and without hassle. Sqlite supports queries over json-like objects.
Maybe a very simple document oriented db would have been better?
My biggest gripe is that the sqlite cli is brutally minimal (makes sense given design), but I probably should have been using a nicer cli.
My issue with SQLite's JSON implementation is that it cannot index arrays. SQLite indexes can only contain a single value per row (except for fulltext indexes but that's not what I want most of the time). SQLite has nothing like GIN indexes in Postgres.