I think the parent is referring less to a system that can construct a single model without an explicit schema, and more to the thing that'd be a few steps after that—the ability to:
1. dynamically notice world-features ("instrumental goal features") that seem to correlate with terminal reward signals;
2. build and train entirely new contextual sub-models in response, ones that "notice" features relevant to activating the instrumental-goal feature;
3. shape goal-planning in terms of exploiting sub-model features to activate instrumental goals, rather than attempting to achieve terminal preferences directly. (And maybe also in terms of discovering sense-data that is "surprising" to the N most-useful sub-models.)
In other words, the AI should be able to interact with reward-stimuli at least as well as Pavlov's dog.
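To make that loop concrete, here's a minimal sketch of what I have in mind, in plain numpy. Everything in it (the PavlovianAgent and SubModel names, the correlation threshold, the one-step-ahead prediction) is my own illustrative assumption, not an existing system or anyone's actual architecture:

```python
import numpy as np

class SubModel:
    """Tiny linear predictor trained only to predict one instrumental feature."""
    def __init__(self, obs_dim, lr=0.05):
        self.w = np.zeros(obs_dim)
        self.lr = lr

    def predict(self, obs):
        return float(self.w @ obs)

    def update(self, obs, feature_value):
        # plain gradient step on squared error
        err = feature_value - self.predict(obs)
        self.w += self.lr * err * obs

class PavlovianAgent:
    """
    1. Tracks which observation features correlate with terminal reward.
    2. Promotes strongly correlated features to "instrumental goals" and
       spawns a fresh SubModel for each one.
    3. Scores candidate next-states by how much the sub-models expect them to
       activate instrumental features, instead of predicting reward directly.
    """
    def __init__(self, obs_dim, corr_threshold=0.5):
        self.obs_dim = obs_dim
        self.corr_threshold = corr_threshold
        self.history_obs = []
        self.history_reward = []
        self.sub_models = {}  # feature index -> SubModel

    def observe(self, obs, reward):
        if self.history_obs:
            prev = self.history_obs[-1]
            # each sub-model learns to predict its instrumental feature one
            # step ahead, from the previous observation
            for idx, model in self.sub_models.items():
                model.update(prev, obs[idx])
        self.history_obs.append(obs)
        self.history_reward.append(reward)
        self._maybe_spawn_submodels()

    def _maybe_spawn_submodels(self):
        if len(self.history_obs) < 10:
            return
        obs = np.array(self.history_obs)
        rew = np.array(self.history_reward)
        if rew.std() == 0:
            return
        for idx in range(self.obs_dim):
            if idx in self.sub_models:
                continue
            col = obs[:, idx]
            if col.std() == 0:
                continue
            corr = np.corrcoef(col, rew)[0, 1]
            if abs(corr) > self.corr_threshold:
                # this feature looks instrumentally relevant: give it its own model
                self.sub_models[idx] = SubModel(self.obs_dim)

    def choose(self, candidate_obs):
        """Pick the candidate next-observation the sub-models expect to
        activate the most instrumental features (not raw reward)."""
        if not self.sub_models:
            return np.random.randint(len(candidate_obs))
        scores = [
            sum(m.predict(c) for m in self.sub_models.values())
            for c in candidate_obs
        ]
        return int(np.argmax(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    agent = PavlovianAgent(obs_dim=3)
    for _ in range(200):
        bell = rng.integers(0, 2)      # feature 0: the "bell"
        noise = rng.random(2)          # features 1-2: irrelevant
        obs = np.array([float(bell), *noise])
        reward = float(bell)           # the treat arrives exactly when the bell rings
        agent.observe(obs, reward)
    print("instrumental features noticed:", sorted(agent.sub_models))
```

The point of the sketch is the separation: each sub-model only ever predicts its own instrumental feature, and planning only consults the sub-models, so the terminal-reward machinery and the contextual sub-models stay decoupled in the way the dog analogy suggests.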
Right now, ML research does include the concept of "general game-playing" agents—but AFAIK, these agents are only expected to ever play one game per instance of the agent, with the generality being in how the same algorithm can become good at different games when "born into" different environments.
Humans (most animals, really) can become good at far more than a single game, because biological minds seem to build contextual models that communicate with, but don't interfere with, the functioning of the terminal-preference-trained model.
So: is anyone trying to build an AI that can 1. learn that treats are tasty, and then 2. learn to play an unlimited number of games for treats, at least as well as a not-especially-smart dog?