As I see people mentioning Datalog-related software, I'm gonna throw a pebble into the yard as well - Graql (https://github.com/graknlabs/grakn) is a declarative graph-based query language with rule-based reasoning heavily inspired by Datalog and Logic Programming.
Just to add and clarify: it supports recursion (since a couple releases back) and pattern negation based on set difference (on current master branch, soon to be released).
Yes, the DOOP folks under Iannis Smaragdakis have been doing this for a while. They use souffle, I believe: https://souffle-lang.github.io/.
This is Semmle's business model (https://semmle.com/). They have a custom Datalog with OO features they call QL.
I've heard LogicBlox used to do program analysis as well, but AIUI they turned more towards business analytics and then got bought out, and the Datalog engine team all jumped ship.
The Rust compiler has been exploring implementing some of their type system (the borrow checker, I think?) in a Datalog.
On the academic front, Tamás Szabó and Sebastian Erdweg have been working on an incremental Datalog engine for static analysis; https://github.com/szabta89/IncA. I know Tamas has worked for Itemis, but I don't know whether they're using this tech.
I'm working on a functional language, Datafun, inspired by Datalog, but it's too much of a toy to do any real analysis with at the moment. All in all, Datalog is still a niche language and static analysis a niche application of it, but it seems to be on the rise.
1. You can implement some minor flavor of Doop in differential dataflow (https://github.com/TimelyDataflow/differential-dataflow/tree...). The fragment came from Yannis which he described as a minimal non-trivial analysis (vs other, even smaller fragments), but it still uses some 900 dataflow operators (updates in 10s of ms when inputs change, though). The DD impl isn't meant to be understood, I'm afraid.
2. Many ex-LB folks are now at relational.ai doing .. advanced weird tech like they did with LB. I'd watch them (I do).
The LogicBlox platform is still in use and in active development at Infor. But yes, many of the principals have indeed left.
DOOP actually used LogicBlox before Souffle existed, but it is difficult to be competitive when Souffle doesn't need things like transactions, ACID, datasets larger than memory, incremental evaluation, etc.
Doop is a fantastic base system to extend for anyone wanting to do program analysis generally! I built my thesis work on top of it (recently defended but unfortunately not published yet) and Datalog is incredibly productive to prototype ideas.
(Performance is another matter -- one has to pay careful attention to clause ordering, and watch out for O(n^2) / O(n^3) / ... loop nests, but at least it's easier to try options than refactoring an ad-hoc analysis engine.)
Datalog is awesome as is Logic Programming generally. It's not only useful in static program analysis but also in areas as diverse as synthetic biology and data integration.
There's lots of datalog and similar technology in most Knowledge Graph platforms, too. For example, http://stardog.com/.
I don't see anywhere that mentions datalog in relationship to Stardog. I see OWL2 and SparQL, which as far as I can tell do not support recursion other than in experimental prototypes. If there is no recursion, it is just conjunctive queries and not datalog.
> If there is no recursion, it is just conjunctive queries and not datalog.
SPARQL has negation (FILTER NOT EXISTS), though, making query answering PSpace-hard, whereas conjunctive query answering is NP-complete, so the two are very much not the same, although still less expressive then Datalog (which is ExpTime-complete). CQ answering for the Horn fragment of OWL2 is already 2ExpTime-complete, however, making it vastly more expressive than pure Datalog.
No it's not. The query language of Datomic, and that taught on learndatalogtoday.com, is a Datomic-proprietary language similar to Prolog-in-LISP implementations or frame-based expert systems of old.
There is also Flix language inspired by Datalog that adds the features of monotone functions and lattice models. This post is such a coincidence for me since I have just started to work on Flix and souffle for my thesis.
I think it's more cultural than anything else. SQL, despite being declarative itself, has a more imperative feel to it for some people (recursion being at the fringes of its standard) and people seem to believe more declarative means slower and more difficult to comprehend. The former is true only due to the implementation effort that went into SQL engines and the latter is plain false.
It probably doesn't help that the standard presentation of Datalog looks very much like Prolog, which most people think is an esoteric and failed programming language.
As for its limitations, it's disciplined in a number of ways, but those are features rather than shortcomings. So you can see in the post that it employs syntactic restrictions to ensure domain independence, it doesn't allow function symbols to have termination. It requires predicates that are not purely logical (extralogical) to satisfy certain conditions for safe execution. These make sure that your queries are well-behaved even before you start running the query. There is research on relaxing these restrictions (cough my work cough) without compromising on the nice properties that come along with it.
The nice thing is of course that if you want to do the same for SQL you start with pandora's box open and try to contain the chaos inside again, while with Datalog you're carefully trying to make the best out of it without unleashing madness. The latter is a much nicer foundation.
On the continum of relational algebra to logic programming catalog is somewhere in the middle.
It is better at handling recursive queries, and logical inference than SQL. On the flip side SQL can be better at more set style operations (for instance windowing functions are painful in datalog).
It is different tooling, essentially it comes down to do you want to extract answers from your data with varying relational algebra but holding the logic the same (SQL), or keeping the relational algebra the same but varying the logic (datalog).
That doesn't help at all because although Datalog is a syntactic fragment, its dynamic semantics are very different, so the operations defined in Prolog standard are suggestive at best and completely nonsensical at worst.
For example, the program `p(X) :- p(X)` with query `?- p(X)` is an infinite loop in Prolog but in Datalog it terminates trivially and is equal to the empty set of facts for `p` in the absence any other facts.
My take on miniKanren is that it's implementation is designed to be small and hackable. Especially small with microKanren.
If you don't know scheme/lisp, but want to roll your own for educational purposes, then check this out: https://codon.com/hello-declarative-world
Vs Prolog, the emphasis in miniKanren is on constraint programming --- especially writing new constraints to extend it to more problems. Where (chief variants of) Prolog have been optimized in various ways for certain types of problems.
What I read about Datalog is that it is basically a recursive relational database. It's set up for "deductive queries". This wording reminds me of miniKanren --- but the Kanrens are Turing complete.
Ah, Wikipedia has this to say:
In contrast to Prolog, Datalog,
1.) disallows complex terms as arguments of predicates, e.g., p (1, 2) is admissible but not p (f (1), 2),
2.) imposes certain stratification restrictions on the use of negation and recursion,
3.) requires that every variable that appears in the head of a clause also appears in a nonarithmetic positive (i.e. not negated) literal in the body of the clause,
4.) requires that every variable appearing in a negative literal in the body of a clause also appears in some positive literal in the body of the clause[4]
I'd have to put some thought into checking this list against the capabilities of miniKanren. Generally it doesn't ring any bells.
miniKanren is Turing-complete; Datalog deliberately isn't.
Despite that, there are things Datalog can do easily that miniKanren can't. miniKanren can't handle negation natively, but Datalog can (as long as it's stratified, that is, a predicate can't use its own negation).
Datalog is also forward-chaining, so some things that would infinite loop in miniKanren work fine in Datalog - for example, computing the transitive closure of a cyclic graph. If someone added tabling to miniKanren that would solve this.
There's also the question of performance. Some miniKanren implementations are surprisingly fast for what it does, but I suspect they can't compete with, say, Souffle, for the kind of static analysis workloads it's aimed at. Would love to be proven wrong, though.