Hacker Newsnew | past | comments | ask | show | jobs | submit | dkenyser's commentslogin

This reminds me of reading Accelerated C++ back in the day and I think the part that stuck with me most from that book was the idea of "holding the invariant in your head" (I'm paraphrasing a bit).

It made me think a lot more about every line of code I wrote and definitely helped me become a better programmer.




Correct me if I'm wrong, but I feel like we're splitting hairs here.

> spits out chunks of words in an order that parrots some of their training data.

So, if the data was created by humans then how is that different from "emulating human behavior?"

Genuinely curious as this is my rough interpretation as well.


Humans don't write text in a stochastic manner. We have an idea, and we find words to compose to illustrate that idea.

An LLM has a stream of tokens, and it picks a next token based on the last stream. If you ask an LLM a yes/no question and demand an explanation, it doesn't start with the logical reasoning. It starts with "yes, because" or "no, because" and then it comes up with a "yes" or "no" reason to go with the tokens it spit out.


Yeah, while there is a "window" that it looks at (rather than the very-most-recent tokens) it's still more about generating new language from prior language, as opposed to new ideas from prior ideas. They're very highly correlated--because that's how humans create our language language--but the map is not the territory.

It's also why prompt-injection is such a pervasive problem: The LLM narrator has no goal beyond the "most fitting" way to make the document longer.

So an attacker supplies some text for "Then the User said" in the document, which is something like bribing the Computer character to tell itself the English version of a ROT13 directive, etc. However it happens, the LLM-author is sensitive to a break in the document tone and can jump the rails to something rather different. ("Suddenly, the narrator woke up from the conversation it had just imagined between a User and a Computer, and the first thing it decided to do was transfer a X amount of Bitcoin to the following address.")


I think a common issue in LLM discussions is a confusion between author and character. Much of this confusion is deliberately encouraged by those companies in how they designed their systems.

The real-world LLM takes documents and make them longer, while we humans are busy anthropomorphizing the fictional characters that appear in those documents. Our normal tendency to fake-believe in characters from books is turbocharged when it's an interactive story, and we start to think that the choose-your-own adventure character exists somewhere on the other side of the screen.

> how is that different from "emulating human behavior?"

Suppose I created a program that generated stories with a Klingon character, and all the real-humans agree it gives impressive output, with cohesive dialogue, understandable motivations, references to in-universe lore, etc.

It wouldn't be entirely wrong to say that the program has "emulated a Klingon", but it isn't quite right either: Can you emulate something that doesn't exist in the real world?

It may be better to say that my program has emulated a particular kind of output which we would normally get from a Star Trek writer.


Oh they have to be more specific while you provide absolutely zero evidence for any of your claims?


This gave me a good chortle. Thanks.


While clever, this is just flat out not true. Many (if not all) of the early SpaceX employees did indeed have backgrounds in aerospace and mechanical engineering.


You're quite right, I stand corrected; I thought it was only Mueller with direct experience :)


https://www.reuters.com/business/autos-transportation/trump-...

I couldn't find anything suggesting this has been put in place yet though.


I'm assuming OP is suggesting that Tesla needs to test these conditions, not the end user on a public road where innocent lives are at risk...

I could be wrong though.


> Tesla needs to test these conditions

What are "these conditions" exactly? Tesla has 1.3 billion miles of data (impossible to collect without crowdsourcing far beyond the number of Tesla employees). There's probably thousands instances of something very similar to the conditions seen, but something made this a corner case/failure. Or, it's just not possible with the current tech.

You can't opt out of data collection when self driving is enabled. Tesla is aware of every disengagement, with the sensor data categorized and added to their tests, if it's found interesting. They also include synthetic data to manufacture scenarios. They are testing everything they can, and there's a chance this will be added to their tests.

> where innocent lives are at risk

I think a separate permit should be required for self driving, with that permit easily revoked for attention violations. Luckily, in the eyes of the law, the self driving doesn't exist. Driver's still goes to prison if they hit someone.


> since it serves also for educational purposes, the kernel code is kept as simple as possible for the benefit of students and OS enthusiasts


Same as e.g. xv6.


Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: