Then where is the publication that uses GPT-2 for a long-term NLP task, such as reconstructing the rules of the Great Vowel Shift from a corpus of all pre-20th-century English?
Synchronously parsing the meaning a single text has at one moment in time involves no time series at all.
The parent comment isn't thinking in time-series terms like you and I are.
Being able to follow multiple agents and correctly deduce their relationships at a given time t is very hard.
NLP "time-series" does a fine job at making back references within a text, but wouldn't be able to have multiple representations of a word or character through the years.
It's very hard to get the computer to say "ah, the context is 16th century, so here are the relationships" without fudging it, i.e. tailoring the model via hand-tailored corpora.
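For what it's worth, here's a toy sketch of what that fudging looks like in practice (pure Python, made-up data and names, not anyone's actual pipeline): the era tag is supplied by hand, so the same surface word ends up with separate, unrelated distributional representations per century, and the model never learns the link between them.

    from collections import defaultdict, Counter

    def per_era_contexts(tagged_sentences, window=2):
        # tagged_sentences: iterable of (era, list_of_tokens) pairs,
        # where the era label is assigned by hand ahead of time.
        # Returns {era: {word: Counter of context words}} -- one crude
        # distributional "representation" per era.
        reps = defaultdict(lambda: defaultdict(Counter))
        for era, tokens in tagged_sentences:
            for i, w in enumerate(tokens):
                context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                for c in context:
                    reps[era][w][c] += 1
        return reps

    reps = per_era_contexts([
        ("16th_c", "the meate was sweete and goode".split()),
        ("19th_c", "the meat was sweet and good".split()),
    ])
    # "sweete" and "sweet" are never linked across eras; that link has to be
    # imposed from outside, which is exactly the tailoring problem.
    print(reps["16th_c"]["sweete"])
    print(reps["19th_c"]["sweet"])

The counting method doesn't matter; the point is that the split by era is done for the model rather than inferred by it.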
Depends. In the article's Uber example, the failure wasn't with detection, it was with context switching.
The detection kept changing, and so the model kept going "oh, new object, restart decision process."
Lacking the ability to generate and maintain its own context is an area where a human would do better. We might not know what the object was, but our "slow down" response wouldn't keep resetting depending on what we classified the object as.
Same as words switching meanings within a piece or sentence. It's hard, but most humans can pick up when the usage changes.
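To make the Uber point concrete, a minimal sketch (hypothetical names, nothing to do with the actual stack) of keeping one decision context alive across re-classifications instead of restarting it every time the detector changes its mind:

    from dataclasses import dataclass

    @dataclass
    class Track:
        track_id: int
        label: str           # latest classification: "unknown", "vehicle", "bicycle", ...
        braking: bool = False

    def update(track, new_label, distance_m):
        # Re-classification just updates the label; it never resets the
        # decision that's already in flight.
        track.label = new_label
        if distance_m < 30.0:
            track.braking = True   # "slow down" persists regardless of the label
        return track

    t = Track(track_id=7, label="unknown")
    for label, dist in [("unknown", 45.0), ("vehicle", 28.0), ("bicycle", 20.0)]:
        t = update(t, label, dist)
        print(label, t.braking)    # braking stays True once engaged

The failure described above was roughly the inverse: each new label started a fresh decision process, so the "slow down" state never accumulated.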