
The "magic" of Prolog is built upon two interesting concepts: Unification (https://en.wikipedia.org/wiki/Unification_(computer_science)...) and Backtracking (https://en.wikipedia.org/wiki/Backtracking).

Bad teachers often present only the declarative aspect of the language.

By virtue of being declarative, it lets you express inverse problems in a dangerously simple fashion, but it doesn't give you any clue toward a solution. You then end up using a declarative language to feed clues to a weak engine to guide it toward a solution, which makes the whole program an awful mashup of declarative and imperative code.

Rules:

- N integer, a integer > 1, b integer > 1

- N := a * b

Goal:

N = 2744977

You can state such a simple problem easily, but solving it is another thing.
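
For instance, here is a sketch using SWI-Prolog's library(clpfd) (my choice of solver, not something prescribed above); stating the factoring problem really is a few lines:

    :- use_module(library(clpfd)).

    % factor(+N, -A, -B): N is a product of two integers greater than 1.
    factor(N, A, B) :-
        A #> 1,
        B #> 1,
        A #=< B,        % break the A*B / B*A symmetry
        A * B #= N,
        label([A, B]).

    ?- factor(2744977, A, B).

Whether label/1 finds the factors in reasonable time is entirely up to the engine; reaching for labeling/2 strategy options instead is exactly the kind of imperative "clue" described above.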

The real surge of Prolog and other declarative, constraint-programming languages will come when the solving engines get better.

Unification is limited to first-order logic; higher-order unification is undecidable in the general case. So we will probably have to rely on heuristics. By recasting Prolog goal solving as a game, you can use deep-learning algorithms in the AlphaGo style (Monte Carlo tree search).

Such an engine would internally add intermediate logical rules to your simply stated problem, based on similar problems it has encountered in its training set, and would then solve it the way an LLM does: by heuristically picking the right rule from intuition.

The continuous analogue of unification, in a sense, is Rao-Blackwellisation (done automagically by deep learning from its training experience), which lets you pick the right associations efficiently, much the same way that a most-general-unifier algorithm picks the right variable bindings to unify two terms.



> The continuous analogue of unification, in a sense, is Rao-Blackwellisation (done automagically by deep learning from its training experience), which lets you pick the right associations efficiently, much the same way that a most-general-unifier algorithm picks the right variable bindings to unify two terms.

I don't know how to reconcile this statement about deep learning with my understanding of Rao-Blackwell. Can you explain:

- What is the value being estimated?

- What is the sufficient statistic?

- What is the crude estimator, and what is the improved estimator?

Roughly, I think sufficient statistics don't really do anything useful in deep learning. If they did, they would give a recipe for embarrassingly parallel training that would be guaranteed to reach exactly the same value as fully sequential training. And from an information-geometry perspective, because sufficient statistics are geodesics, the exploratory (hand-waving) and slow nature of SGD could be skipped.


Once you view Prolog goal reaching as a game, you can apply reinforcement-learning methodology. The goal is writing a valid proof, i.e. a sequence of valid rule choices and variable assignments.
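
To make "a proof is a sequence of rule choices" concrete, here is a classic vanilla meta-interpreter extended to record the clause chosen at each step (a sketch; it assumes the interpreted predicates are visible to clause/2, e.g. declared dynamic):

    % prove(+Goal, -Choices): Choices is the list of Head-Body clause
    % selections made while proving Goal, i.e. the move sequence a
    % learned policy would score.
    prove(true, []).
    prove((A, B), Choices) :-
        prove(A, ChoicesA),
        prove(B, ChoicesB),
        append(ChoicesA, ChoicesB, Choices).
    prove(Goal, [Goal-Body|Choices]) :-
        Goal \= true,
        Goal \= (_, _),
        clause(Goal, Body),
        prove(Body, Choices).

A learned policy would reorder the clause/2 alternatives by intuition instead of taking them in textual order.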

The value being estimated: the expected discounted reward of reaching the goal. The shorter the proof, the better.

The sufficient statistic: the embedding representation of the current solving state (the inner state of the LLM, or whatever model you use to make your choices). You make sure it is sufficient by checking that you can regenerate the state from the representation (an auto-encoder or VAE does the trick). You build this statistic across various instances of problems; it tells you, from experience, what a judicious choice of variable looks like. Similar problems yield similar choices.

The crude estimator: all choices have the same value, hence a uniformly random choice. The improved estimator: the value of each choice is conditioned on the current embedding representation of the state via a neural network.

You can apply Rao-Blackwell once again by also conditioning on a one-step look-ahead (or, in the limit, applying it infinitely many times by solving the Bellman equation).

(You can alternatively view each update step of your model as an application of the Rao-Blackwell theorem to your previous estimator. You do have to make sure, though, that there is no mode collapse.)

You don't have to do any of this explicitly; it happens under the hood through how you choose to model the decision.


A unique property of Prolog is that, given an answer, it can arrive at the original question (or, more broadly, a set of questions).

Or, in layman's terms, a Prolog programme can be run backward.


To be precise, a small number of very small Prolog programs can be run backwards.

There are essentially no significant Prolog programs that are reversible with acceptable efficiency.


To be even more precise, Prolog programs only ever run forward because the order of evaluation is fixed as top-down, left-to-right. These notions of "forward" and "backward" are very unhelpful and should be given up. Beginners find the order of evaluation hard enough to understand, let's not confuse them even more.
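
In the examples that follow, assume a declarative definition of list_length/2 along these lines (an assumed definition using library(clpfd); the original comment doesn't show one):

    :- use_module(library(clpfd)).

    list_length([], 0).
    list_length([_|Tail], Length) :-
        Length #> 0,
        Length #= Length0 + 1,
        list_length(Tail, Length0).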

Also, the notion is woefully incomplete. Let's say we consider this "forward":

    ?- list_length([a, b, c], Length).
    Length = 3.
Then you would say that this is "backward":

    ?- list_length(List, 3).
    List = [_A, _B, _C].
Fine, but what's this then? "Inward"?

    ?- list_length([a, b, c], 3).
    true.
And then presumably this is "outward":

    ?- list_length(List, Length).
    List = [], Length = 0 ;
    ...
    List = [_A, _B, _C], Length = 3 .
None of these cases change the order of evaluation. They are all evaluated top-down, left-to-right. The sooner beginning Prolog programmers understand this, the better. The sooner we stop lying to people to market Prolog, the better.


What's going on in the second example? Did Prolog generate a list term stuffed with gensym variables, which satisfies the list_length being 3?


It generated a list stuffed with distinct logical variables. Any list of length 3 is unifiable with this list. Or in other words, this is the unique (up to variable renaming) most general list of length 3.

Whether this is a "yes" to your question depends on what your mental model of "gensym variables" is. They are variables, not symbols (which Prolog would call atoms).


I was largely joking. Even though the capability is there, it is neither computationally practical nor, in general, possible to accomplish such a feat for any sufficiently complex programme.

In the most extreme case, attempting to run a complex Prolog programme backwards will result in an increase in entropy levels in the universe to such an extent that it will cause an instant cold (or hot) death of our universe, and life as we know it will perish momentarily (another tongue-in-cheek joke).


The bidirectional (relational) aspect of Prolog is what got me into this. I love symmetries, so it had a natural appeal even before I learned about logic programming (Sean Parent gave a Google talk about similar ideas implemented in C++). That said, it's very limited. But I wonder how far it could go. (The Kanren folks might have more clues.)


Do you see a good way to include backtracking in an imperative programming language?

I can imagine how unification would work, since the ubiquitous "pattern matching" is a special case of Prolog's unification. But I've never seen how backtracking could be useful...
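
As for the unification half: the generalisation over pattern matching is that unbound variables may occur on both sides of an equation. For example:

    ?- f(X, b) = f(a, Y).
    X = a,
    Y = b.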


Backtracking is perilous in general; logic-programming languages have really nice facilities for it, but I don't know how to avoid pathological inefficiency.


With memoization as in tabling (a.k.a. SLG resolution):

https://www.swi-prolog.org/pldoc/man?section=tabling

Re-evaluation of a tabled predicate is avoided by memoizing the answers. This can realise huge performance enhancements as illustrated in section 7.1. It also comes with two downsides: the memoized answers are not automatically updated or invalidated if the world (set of predicates on which the answers depend) changes and the answer tables must be stored (in memory).
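
A minimal sketch of the payoff (the edge/2 facts are made up for illustration): with tabling, even a left-recursive reachability definition over a cyclic graph terminates:

    :- table path/2.

    edge(a, b).
    edge(b, c).
    edge(c, a).    % a cycle: without the table directive, path/2 loops

    path(X, Z) :- path(X, Y), edge(Y, Z).
    path(X, Y) :- edge(X, Y).

Now ?- path(a, X). yields each of b, c and a exactly once and terminates.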

Known to the Prolog community since about the 1980s, if I got my references right.



