Hacker News

How will AI write about a world it never experiences? By training on the work of human beings.


The training sets can already include direct data series about the world, where the "work of human beings" is just setting up the collection devices. So models can absolutely "experience the world".

But I'm not suggesting they'll advance much, in the near term, without any human-authored training data.

I'm just pointing out the cold hard fact that lots of recent breakthroughs came via training on synthetic data - text prompted by, generated by, & selected by other AI models.

That practice has now generated a bunch of notable wins in model capabilities – contra the upthread post's sweeping & confident wrongness alleging "Ai generated content is inherently a regression to the mean and harms both training and human utility".


> models can absolutely "experience the world"

How does the banana bread taste at the café around the corner? What's the vibe like there? Is it a good place for people-watching?

What's the typical processing time for a family reunion visa in Berlin? What are the odds your case worker will speak English? Do they still accept English-language documents or do they require a certified translation?

Is the Uzbek-Tajik border crossing still closed? Do foreigners need to go all the way to the northern crossing? Is the Pamir highway doable on a bicycle? How does bribery typically work there? Are people nice?

The world is so much more than the data you have about it.


Of course, training on synthetic data can't do everything! My main point is: it's been doing a bunch of surprisingly-beneficial things, contra the obsolete belief, to which I was initially responding, that model output is worthless (or even deleterious!) for further training.

But also: with regard to claims about what models "can't experience", such claims are pretty contingent on transient conditions, and expiring fast.

To your examples: despite their variety, most if not all could soon have useful answers collected by largely-automated processes.

People will comment publicly about the "vibe" & "people-watching" – or it'll be estimable from their shared photos. (Or even: personally-archived life-stream data.) People will describe the banana bread taste to each other, in ways that may also be shared with AI models.

Official info on policies, processing time, and staffing may already be public records with required availability; recent revisions & practical variances will often be a matter of public discussion.

To the extent all your examples are questions expressed in natural-language text, they will quite often be asked, and answered, in places where third parties – humans and AI models – can learn the answers.

Wearable devices, too, will keep shrinking the gap between things any human is able to see/hear (and maybe even feel/taste/smell) and that which will be logged digitally for wider consultation.


You’ve used an LLM to write that, haven’t you?


No - and you can compare the style & written tics for continuity with my 18y of posts here.

I used 'delving' in an HN comment more than a decade before LLMs became a thing!

https://news.ycombinator.com/item?id=1278663


So in the end, there is still a human doing the work.


> data series about the world, where the "work of human beings" is just setting up the collection devices. So models can absolutely "experience the world"

But not experience it the way humans do.

We don’t experience a data series; we experience sensory input in a complicated, nuanced way, modified by prior experiences, emotions, etc. Remember that qualia are subjective, with a biological underpinning.


Perhaps. But these models can already clearly write about the world, in useful ways, without such 'qualia' or 'biological underpinnings'.


Sure, and there are many such writings that can be useful. No denying. But the LLM cannot experience like humans do and so will forever be outside our circle. Whether it also remains outside our circle of empathy, or us outside of its, remains to be discovered.



