Hacker Newsnew | past | comments | ask | show | jobs | submit | Tv9m's commentslogin

Not sure if you meant this comment rhetorically or not, but try asking ChatGPT to do something useful like convert some text into JSON, or what conjugation of a verb you should use, or for writing feedback on an email. If you treat it like a toy it will be disappointing but it's useful as a tool.


It's not very useful, unless you only want to write a single isolated class in Java. Every time I refine my prompt I get a completely different answer and often it just repeats answers it gave me earlier which wipes out improvements made in the meantime. It produces bad code faster than you can check it.

When you ask it to do something simple like chess it whips out crazy illegal moves. "the omnipotent f6 pawn" is already an internet meme [0]. You might now claim that it is a language model and it shouldn't be able to play chess but this is a very real limitation that you are going to pretend doesn't exist later on when it is about "language tasks". If it can't follow basic rules like rooks only moving horizontally or vertically or bishops only moving diagonally which is purely defined by logic then how on earth can it possibly do something that actually requires complex reasoning?

The problem with these large language models is that it gets harder and harder to notice the limitations, but they are still the same regardless of the size of the model. So every iteration you get to hype up more people about the same thing. They will make the same bad projections into the future like everyone else.

[0] https://www.reddit.com/r/AnarchyChess/comments/10yym4o/why_i... (Context: ChatGPT will play pawn f6 which summons a pawn out of nowhere, beating the queen)


What is the difference? The original prompt isn't what gets used anyways.


Isn't the parent talking about an AI running on the backend? That seems new.


So what? There's no evidence that it finds any more exploitable security vulnerabilities than existing tools.


I mean that the AI is what's being attacked. It's likely that backend LLM agents will have access to sensitive non-public APIs.


That can happen with any system exposed to untrusted clients. Such vulnerabilities have nothing to do with AI or LLM agents per se, so raising it as a concern with Bing Chat is just a red herring. There are well known best practices for mitigating such risks, including using an API firewall and other techniques.


I just don't know if there's any point in fighting this fight. We either decide to ride the bandwagon and get our money, or we decide to just wait this out and wait for the bros to realize it's just ML dressed up all over again. It's just embarrassing to watch the hype cycle play out with the same suckers over and over.

Let me explain this hype to folks:

1. People suck at googling

2. People suck at information literacy - aka, the abstract ability to consume many sources and discern a sort of perceived truth from the supported commonality between them (read through threads here for an example) (and yes, this is inherently nuanced, so much so that again I'm not sure how to properly describe it).

3. People love being told what to do/think. (look at every influencer/podcasts, even including the ilk that is/was popular among the HN crowds)

4. "Take down Google", for many folks, is implicitly translated into "Microsoft can noticeably cut into Google's ad market revenue by making a better AI-powered search".

There are so, so, so many inferences, assumptions, pitfalls in #4 that I simply don't know how to explain it other than to just laugh and shrug and keep my head down on real work.

EDIT:

More and more it's becoming quite clear. Some folks are principled and care about information and education and society and understand the risks of misinformation at scale. And some folks see a way to get rich, or get some personal utility and can't or don't give a rats ass about the rest. Literally I just saw a thread where someone is pointing out clear CURRENT harm caused by ChatGPT powered products and the response was "IDC, it helped me, yolo".

I really just... want to jump ahead to a robot shooting me so I don't have to live through our ignorant enabling of silicon-valley-driven-killer-robots because I swear to god that's where the ignorant lot is driving us to.


What is the current harm?


> Is prompt injection even a problem worth worrying about?

It depends what API access the AI has. If it's just a chat bot, prompt injection can only reveal facts about its language model. But if the AI has POST access to something, depending on what it is, prompt injection can set off arbitrary human-caused disasters.


That's not the correct way to do security vulnerability analysis. If an API call can cause a disaster then fix the API. Whether the API consumer is an AI or some other type of system is irrelevant.


> If an API call can cause a disaster then fix the API

By "API" I'm not referring just to publicly facing REST endpoints. I mean things like shell access for system maintenance, that normally only human professionals like you would be given. In the future it's not clear that humans will be able to dominate that role forever.

Hopefully the issues will be recognized while LLM-based agents are still only serving as retrieval systems.


> But for most usecases, online models trained by megacorps still win.

Whisper and whisper.cpp have gotten us close to the tipping point.


You can ask it why yourself. That might have something to do with it.


What would be evidence that a prediction machine had developed a theory of mind?


well, I'd first need to see that it had a Theory of Feet... ;-)

More seriously, that it can actually understand and wield abstract concepts. Can it accurately and repeatedly understand that "the foot attaches to the shin bone, which attaches to the thigh bone, which attaches to the hip bone...", and that these have certain degrees of freedom, but not others, and that one foot goes in front of the other, and to easily and reliably distinguish a normal walk from a silly walk . . .

Yes, these are different levels of abstraction, especially the last one, and they need to be very accurate to even reach a young child's level of understanding, and this is just one branch of a branch of a branch in the entire fractal pattern of understanding that is necessary for a more general intelligence.

Once that is in place, and it can show evidence that it can model it's own mind, then it might be able to model someone else's mind.

While the statistical 'abstraction' and remixing seen in these "AI" systems is sometimes impressive and useful, it is frequently revealed that there is utterly no conceptual understanding beneath it. It is merely a statistical re-mixer abstracting patterns of words that occur near other words, remixing them and filtering for grammatical output.

It hasn't got a theory of anything, nevermind a theory of mind.


"statistical re-mixer" doesn't describe these systems very well. I see this complaint a lot, that supposedly DL models can only manipulate existing content without creating anything of their own. That's just false, unless your standard for originality is so high that humans can't reach it either.

These models that have hundreds of billions of "synapses", it's not very shocking to me that they can learn the abstract form of concepts. In fact, it's kind of beautiful that human concepts have this mathematical nature. It vindicates Plato, and disappoints everyone who has claimed that language and meaning is arbitrary.

But the main issue here is that for every conceivable empirical test we can perform, you'll still make the same complaint. Even after it's demonstrated better ToM abilities than you, by predicting and explaining other people's mental states better than you can, you'll say the same thing.

Maybe it's because you think that "understanding" requires not just accuracy, but having a certain kind of inner experience that a human could relate to.


Yes, I understand that it can appear to synthesize something new, and no, I'm not looking for some inner experience.

I'm looking for it to show an ability to wield not only a set of strings (with language associations), but something actually like the platonic ideals - objects, with properties and relations.

A few errors show quickly there is no such concept being weilded.

>> I saw a fine example of this failure the other day: "Mike's mom has four kids. three are named Danielle, Liam, and Kelly. What is the fourth kid's name?" ChatGPT's reply is explanation of how there isn't enough info in question to tell. Told "The answer is in the question.", ChatGPT just doubles down on the answer. (Sorry, couldn't find the original example)

>> "My sister was half my age when I was six years old. I'm now 60 years old. How old is my sister?" ChatGPT: "Your sister is now 30 years old". [0]

>> Or this one where ChatGPT entirely fails to understand order/sequence of events. [1]

Or a plethora of math problem fails found...

Similarly, the image "AI"s fail to understand relationships between objects (or parts of one object), and cannot abstract a particular person's image from a photo, showing it has no understanding of what is a body... (I can look those up if necessary).

And, of course, the answers are entirely untethered from reality - it is completely by chance whether the answer is correct or just wrong. It is run through a grammatical filter/generator at the end so it's usually grammatical, but no sort of truth filter (or ethical filter for that matter either).

I don't expect some abstract experience, I expect it to be able to break down it's work into fundamental abstract concepts and then construct an answer, and this it cannot do, or it would not be making these kinds of errors.

[0] https://twitter.com/Bestie_se_smeje/status/16210919157469184...

[1] https://twitter.com/albo34511866/status/1621608358003474432


> A few errors show quickly there is no such concept being weilded

I would have given similar examples to show that ChatGPT makes the same kinds of mistakes that humans do. The first one is good, because ChatGPT can solve it easily when you present it as a riddle rather than being a genuine question. Humans use context and framing in the same way; I'm sure you've heard of the Wason selection task: https://en.wikipedia.org/wiki/Wason_selection_task

When posed as a logic problem, few people can solve it. But when framed in social terms, it becomes apparently simple. This shows how humans aren't using fundamental abstract concepts here, but rather heuristics and contextual information.

The second example you give is even better. It's designed to trick the reader into thinking of the number 30 by putting the phrase "half my age" before the number 60. It's using context as obfuscation. In this case, showing ChatGPT an analogous problem with different wording lets it see how to solve the first problem. You might even say it's able to notice the fundamental abstract concepts that both problems share.

The third problem is also a good example, but for the wrong reason: I can't solve it either. If you had spoken it to me slowly five times in a row, I doubt I could have given the right answer. If you gave me a pencil and paper, I could work through the steps one by one in a mechanical way... but solving it mentally? Impossible for me.

> It is run through a grammatical filter/generator at the end so it's usually grammatical, but no sort of truth filter (or ethical filter for that matter either).

I kind of thought it did get censored by a sort of "ethical filter" (very poorly, obviously), and also I wasn't aware of it needing grammatical assistance. Do you remember where you heard this?

Here's my chat with it, if you're interested: https://pastebin.com/raw/hQQ8bpsB

But comparing 1 human to 1 GPT is mistaken to begin with. It's like comparing 1 human with 1 Wernicke's area or 1 angular gyrus. If you had 100 different ChatGPTs, each optimized for a different task and able to communicate with each other, then you'd have something more similar to the human brain.


>>trick the reader into thinking of the number 30 by putting the phrase "half my age" before the number 60

Yet it is exactly the process of conceptualizing "half" and applying it to "at six years old" instead of "of 60" that is the key to solving it.

These things aren't abstracting out any concepts, they only operate at the level of "being fooled by" semantics. The fact that humans sometimes fail this way gives us little more than [sure a human not really thinking about the problem may offer a bad solution based only on the superficial semantic]. ChatGPT reliably gives us the error based on the superficial semantics.

>>If you had 100 different ChatGPTs, each optimized for a different task and able to communicate with each other, then you'd have something more similar to the human brain.

YES, that is the route we need to go to get towards actual intelligent processing. Taking 100 of these tuned for different areas, and abstracting out the various entities and relationships.

Kind of like the visual cortex model that extracts out edges, motion, etc., and then higher areas in the visual cortex, combined with other areas of the brain allow us to sort out faces, bodies, objects passing behind each other, the fact that Alice entered the room before Bob, and that this is because Bob was polite...

They also mut know when they are making errors, and NONE of these systems comes even close — they happily spout their bullshirt as confidently as any fact.

I gave a deposition in a legal case where the deposing attnys used an "AI" transcription system. Where a human would ask if anything was unclear, and always at the next break get proper spellings of all names, addresses, etc., this thing just went merrily along inserting whatever seemed most likely in the slot. Entire meanings of sentences were reversed (e.g., "you have a problem" edited to "I have a problem"), names were substituted (e.g., the common "Jack Kennedy" replaced "John Kemeny").

There's the Stable Diffusion error with a bikini-clad girl sitting on a boat, where we see her head and torso facing us, as well as her butt cheeks, with thighs & knees facing away. It looks great for about 1.5 sec. until you see the error that NO human would make (except as a joke).

The mere fact that some humans can sometimes make superficial errors which resemble the superficial errors these "AI" things frequently and consistently make does not mean that because humans often have a deeper mode, these "AI"s must also have a deeper understanding.

It means either nothing, i.e., insufficient data to decide, or that these are indeed different, because there is zero evidence of deeper understanding in a ChatGPT or Stable Diffusion.

EDIT: Typos


You might like some of the work being done under the label "Factored Cognition". It's an approach that treats LLMs as building blocks instead of being complete AIs. Instead of asking the LM to solve a problem directly in one pass, you ask it to divide the problem between several different virtual copies of itself, which then themselves subdivide further, and so on until each subtask is small enough that the LM can solve it directly. For this to work the original problem needs to be acyclic and fairly tree-like, i.e., not something that requires having a sudden "Eureka!" moment to solve.

But I've only seen this done with a single model. Sometimes it gets prompted to act like a different agent in different contexts, or given API access to external tools, but it's still just one set of weights.


Hmm, that sounds like a nod in the right direction, but a rapid initial skim maybe indicates that it's more parallelizing the problem than abstracting it. I've got to read more about it - thanks!

While Minsky & Papert's book on Perceptrons was enormously destructive, I think there is something to their general concept of Society Of Mind, that multiple sub-calculating 'agents' collude to actually produce real cognition.

We aren't doing conscious reasoning about the edges detected in the first couple layers of our visual cortex (which we can't really even access, 'tho I think Picasso maybe could). We're doing reasoning about the concepts of the people or objects or abstract concepts or whatever many layers up. The first layers are highly parallel - different parts of the retina connecting to different parts of the visual cortex, and then starting to abstract out edges, zones, motion, etc. and then synthesize objects, people, etc.

I think we need to take a GPT and a Stable Diffusion and some yet-to-be-built 3D spatial machine learning/reasoning engine, and start combining them, then adding more layer(s) synthesizing about that, and maybe that'll get closer to reasoning...


> And then the slippery slope, “we can also improve the odds of higher intellect, being taller , thinner, etc”.

I would love to have been born in a world that had slid to the bottom of that particular slope.


Cross your fingers to be born rich!


I'm always confused. The NWO conspiracy people say that gene editing will be prohibitively expensive (like owning a yacht is), but the real life examples I see don't seem that way. What are some good examples of services that could be done cheaply but are intentionally made expensive for the sake of exclusion?


> but the real life examples I see don't seem that way.

In what situation would removing an egg, fertilizing it, altering its genetic sequence, and them implanting the embryo into a person be consider cheap?


Looks like IVF costs around $12,000 today. Considering that this is for your own child, I bet it would be one of the least expensive parts of raising them. I also expect it would become very cheap after it went mainstream.

> The Human Genome Project was the international research effort to determine the DNA sequence of the entire human genome. It took 13 years and was published in 2003, with an estimated cost of over $300 million. Today, a whole human genome can be sequenced in one day for under $1000.

https://frontlinegenomics.com/a-history-of-sequencing/


$12,000 is more than the average yearly income for a person on this planet. The USA has only 300 million or so people in it. What are the chances you are born in a place where $12,000 is considered 'the least expensive parts of raising a child'?


Would you have made the same argument for mobile phones back when they costed $12,000? I'm assuming that many people will want to use this technology.


A mobile phone can be mass produced based on one design. This is not the case with removing eggs and doing custom gene work on them, incubating them, and implanting them in a person. You are making an invalid comparison. There are no 'economies of scale' with individual procedures.


It doesn't seem like you're addressing the HGP example, which at the time, would have been an "individual procedure" as well.


Many species have already slid to the bottom of that slope. The phenomenon is called Fisherian Runaway, and it's not pretty.


> and it's not pretty.

It feels somewhat ironic that when I looked up "Fisherian Runaway" the first thing I saw was an image of a peacock.

Evolution by "persistent, directional female choice" seems like exactly what our species needs to unfuck itself.


> It requires a theory of mind, which means understanding human goals.

Like chess?


Chess doesn't require understanding human goals if you're an AI, just running through the space of winning moves that your training encountered. An AI doesn't think "ah, rook to E4, knowing Kasparov, he will then castle and I can do X". It'll just go "lol I encountered 3 486 864 games with this particular board pattern and 49 652 lead to a victory lemme pick one for the lulz".

Chess is really just about searching through as much of the space as possible, and you can take a few seconds. And believe me, there's a whole lot more stupid behaviours you can encounter on the road while you have to react in less than a second.


> It'll just go "lol I encountered 3 486 864 games with this particular board pattern and 49 652 lead to a victory lemme pick one for the lulz".

That isn't really an accurate description of how the modern DL bots work. They don't need to reference any database of past games while they play.


Chess is a perfect information game. Both opponents can see the entire board at all times. Reality, nor even driving, is not anywhere close to that. Weather, time of day, climate, seasonality, and other drivers throw constant and voluminous amounts of unpredictability into the mix.


> Reality, nor even driving, is not anywhere close to that.

This reminds me of the kinds of intuitions people had about Chess and Go. In retrospect they seem silly, but it made plenty of sense to them at the time. The fact was that there was a solution that machines could use that humans couldn't use. Naturally, a solution that humans couldn't use was hard for them to anticipate being effective.


Chess AIs don't need to develop an understanding of the goal, they're hard-wired with the rules of the game and its goal-states.

As an experiment, try making a chess AI without explicitly giving it the rules of the game and see how it performs.


I think that's already been done: https://en.wikipedia.org/wiki/MuZero


Almost, but not quite, exactly unlike chess.


So not like chess, but like how chess was supposed to be?


> endless progress or a good future

The optimist believes progress will halt at a convenient moment. The pessimist fears otherwise.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: