
GPT-5 simply sucks at some things. The very first thing I asked it to do was to give me an image of a knife with a spiral Damascus pattern; it gave me an image of such a knife, but with two handles at a right angle: https://chatgpt.com/share/689506a7-ada0-8012-a88f-fa5aa03474...

Then I asked it to give me the same image but with only one handle; as a result, it removed one of the pins from a handle, but the knife still had two handles.

It's not surprising that a new version of such a versatile tool has edge cases where it's worse than a previous version (though if it failed at the very first task I gave it, I wonder how edge that case really was). Which is why you shouldn't just switch everybody over without a grace period or any choice.

The old ChatGPT didn't have a problem with that prompt.

For something so complicated it doesn't surprise me that a major new version has some worse behaviors, which is why I wouldn't deprecate all the old models so quickly.





The image model (GPT-Image-1) hasn’t changed

Yep, GPT-5 doesn't output images: https://platform.openai.com/docs/models/gpt-5

Then why does it produce different output?

It works as a tool. The main model (GPT-4o or GPT-5 or o3 or whatever) composes a prompt and passes that to the image model.

This means different top level models will get different results.
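The flow described above can be sketched in a few lines. This is a toy illustration, not OpenAI's actual internals: the function names and the prompt-rewriting step are hypothetical stand-ins for the chat model and for `gpt-image-1`.

```python
# Toy sketch of the tool-calling arrangement: the top-level chat model
# composes a text prompt and passes it to a separate image model.
# All names here are hypothetical; the real internals are not public.

def image_gen(prompt: str) -> str:
    """Stand-in for the separate image model (gpt-image-1)."""
    return f"<image rendered from prompt: {prompt!r}>"

def top_level_model(user_message: str) -> str:
    """Stand-in for the chat model (GPT-4o, GPT-5, o3, ...).

    It never renders pixels itself; it writes a text prompt and hands
    that to the image tool. A different top-level model will write a
    different prompt, hence a different image from the same request.
    """
    rewritten = f"A photo of {user_message}, studio lighting"  # model-dependent rewrite
    return image_gen(rewritten)

print(top_level_model("a knife with a spiral Damascus pattern"))
```

The point of the sketch is the indirection: swapping `top_level_model` changes the prompt that reaches `image_gen`, even though `image_gen` itself never changed.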

You can ask the model to tell you the prompt that it used, and it will answer, but there is no way of being 100% sure it is telling you the truth!

My hunch is that it is telling the truth though, because models are generally very good at repeating text from earlier in their context.


Source for this? My understanding was that this was true for DALL·E 3, but that the autoregressive image generation just takes in the entire chat context, with no hidden prompt.

Look at the leaked system prompts and you'll see the tool definition used for image generation.

I stand corrected! Thanks.

You know that unless you control for seed and temperature, you always get a different output for the same prompts even with the model unchanged... right?

Somehow I copied your prompt and got a knife with a single handle on the first try: https://chatgpt.com/s/m_689647439a848191b69aab3ebd9bc56c

Edit: ChatGPT translated the prompt from English to Portuguese when I copied the share link.


I think that is one of the most frustrating issues I currently face when using LLMs. One can send the same prompt in two separate chats and receive two drastically different responses.

It is frustrating that it’ll still give a bad response sometimes, but I consider the variation in responses a feature. If it’s going down the wrong path, it’s nice to be able to roll the dice again and get it back on track.

I’ve noticed inconsistencies like this. Everyone said that it couldn’t count the b’s in blueberry, but it worked for me the first time, so I thought it was just haters; then I played with a few other variations and found flaws. (Famously, it didn’t get the r’s in strawberry.)

I guess we know it’s non-deterministic, but there must be some pretty basic randomization in there somewhere, maybe around tuning its creativity?


Temperature is a very basic concept that makes LLMs work as well as they do in the first place. That's just how it works, and it's how it's always been supposed to work.
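For anyone unfamiliar, temperature is exactly the "basic randomization" asked about above: the model's next-token scores are divided by a temperature before softmax sampling, so low temperature approaches greedy argmax while higher temperature flattens the distribution and gives varied output for the same prompt. A minimal self-contained sketch (toy logits, not a real model):

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Softmax sampling with temperature.

    T == 0 is treated as greedy argmax (deterministic); higher T
    flattens the distribution, producing more varied outputs.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)                                # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()                               # inverse-CDF sampling
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.5, 0.2]  # toy next-token scores
rng = random.Random(0)
greedy = [sample_with_temperature(logits, 0, rng) for _ in range(5)]
warm = [sample_with_temperature(logits, 1.0, rng) for _ in range(5)]
print(greedy)  # always token 0
print(warm)    # a mix of tokens: same "prompt", different outputs
```

This is why two chats with the same prompt diverge even when the model weights are identical, and why controlling seed and temperature (where the API allows it) is needed for reproducible comparisons.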

To ensure that GPT-5 funnels the image to the SOTA model `gpt-image-1`, click the plus sign and select "Create Image". Some inherent prompt enrichment is still likely happening, since GPT-5 is using `gpt-image-1` as a tool. Outside of using the API, I'm not sure there is a good way to prevent this.

Prompt: "A photo of a kitchen knife with the classic Damascus spiral metallic pattern on the blade itself, studio photography"

Image: https://imgur.com/a/Qe6VKrd


Yes, it sucks

But GPT-4 would have the same problems, since it uses the same image model


The image model is literally the same model

So there may be something weird going on with images in GPT-5, which OpenAI avoided any discussion about in the livestream. The artist for SMBC noted that GPT-5 was better at plagiarizing his style: https://bsky.app/profile/zachweinersmith.bsky.social/post/3l...

However, there have been no updates to the underlying image model (gpt-image-1). But due to the autoregressive nature of the image generation where GPT generates tokens which are then decoded by the image model (in contrast to diffusion models), it is possible for an update to the base LLM token generator to incorporate new images as training data without having to train the downstream image model on those images.
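The split described above can be made concrete with a toy sketch: an autoregressive generator emits discrete image tokens, and a frozen decoder maps tokens to pixels. Everything here is hypothetical stand-in code; the point is only that swapping the token generator changes the output while the decoder stays untouched.

```python
# Toy illustration of autoregressive image generation (all components
# hypothetical): an "LLM" emits discrete image tokens; a frozen decoder
# turns tokens into image content. Updating only the token generator
# changes the images you get without retraining the decoder.

FROZEN_CODEBOOK = {0: "sky", 1: "blade", 2: "handle"}  # stand-in decoder table

def decode(tokens):
    """Frozen image decoder: same mapping regardless of which LLM produced the tokens."""
    return [FROZEN_CODEBOOK[t] for t in tokens]

def old_llm(prompt):
    return [1, 2, 2]  # older generator tends to emit two handles

def new_llm(prompt):
    return [1, 2]     # updated generator, same frozen decoder

prompt = "knife with one handle"
print(decode(old_llm(prompt)))  # ['blade', 'handle', 'handle']
print(decode(new_llm(prompt)))  # ['blade', 'handle']
```

This is the contrast with diffusion models the comment draws: there, the image model itself denoises from the prompt, so style changes would require retraining it rather than just the upstream token generator.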


No, those changes are going to be caused by the top level models composing different prompts to the underlying image models. GPT-5 is not a multi-modal image output model and still uses the same image generation model that other ChatGPT models use, via tool calling.

GPT-4o was meant to be a multi-modal image-output model, but they ended up shipping that capability as a separate model rather than exposing it directly.


That may be a more precise interpretation given the leaked system prompt, as the schema for the tool there includes a prompt: https://news.ycombinator.com/item?id=44832990


