The fact that everyone asks it to be terse is interesting to me. I find the output to be of far greater quality if you let it talk. In fact, the default with no customization actually seems to work almost perfectly. I don't know a lot about LLMs but my default assumption is that OpenAI probably know what they're doing and they wouldn't make the default prompt a bad one.
Most folks don't realize that each token produced is an opportunity for it to do more computation, and that they are actively making it dumber by asking for as brief a response as possible. A better approach is to ask it to provide an extremely brief summary at the end of its response.
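For what it's worth, a minimal sketch of that approach with the OpenAI Python client (the model name and the exact instruction wording here are placeholders, not a recommendation):

    # Sketch: leave the reasoning unconstrained, only constrain the summary at the end.
    # Assumes the openai Python package (v1+); model name and wording are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Work through the problem step by step in as much detail as you need. "
                "Then finish with a one- or two-sentence summary prefixed with 'TL;DR:'."
            )},
            {"role": "user", "content": "Explain why quicksort is O(n log n) on average."},
        ],
    )
    print(response.choices[0].message.content)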
Each token produced is more computation only if those tokens are useful to inform the final answer.
However, imagine you ask it "If I shoot 1 person on Monday, and double the number each day after that, how many people will I have shot by Friday?".
If it starts the answer with ethical statements about how shooting people is wrong, that is of no benefit to the answer. But it would be a benefit if it started by saying "1 on Monday, 2 on Tuesday, 4 on Wednesday, 8 on Thursday, 16 on Friday, so the answer is 1+2+4+8+16, which is..."
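For completeness, that running tally works out to 31; a two-line check in Python, purely illustrative:

    # Doubling from 1 on Monday through Friday: 1 + 2 + 4 + 8 + 16
    total = sum(2 ** day for day in range(5))  # Monday..Friday
    print(total)  # 31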
The tokens don't have to be related to the task at all. (From an outside perspective, that is. The connections are internal to the model, which might raise transparency concerns.) A single designated 'compute token' repeated over and over can perform as well as traditional 'chain of thought.' See, for example, Let's Think Dot by Dot (https://arxiv.org/abs/2404.15758).
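To make that concrete, here's a rough, purely illustrative sketch of a filler-token prompt. Note the paper's result comes from models trained with filler tokens, so a prompt-level imitation like this isn't expected to reproduce the effect; it just shows that the extra tokens need not carry any readable content:

    # Illustrative only: the paper trains models with filler tokens; simply prompting
    # an off-the-shelf model this way is not expected to reproduce the result.
    filler = ". " * 30  # a designated 'compute token' repeated, carrying no readable meaning
    prompt = (
        "Question: what is 17 * 24?\n"
        + filler + "\n"
        + "Answer with only the final number."
    )
    print(prompt)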
That doesn't have to be the case, at least in theory. Every token means more computation, including in parts of the network with no connection to the current token. It's possible (though not practically likely) that the disclaimer provides the layer evaluations necessary to compute the answer, even though it conveys no information to you.
The AI does not think. It does not work like us, and so the causal chains you want to follow are not necessarily meaningful to it.
Ignoring caches+optimisations, a transformer model takes as input a string of words and generates one more word. No other internal state is stored or used for the next word apart from the previous words.
The words in the disclaimer would have to be the "hidden state". As said, this is unlikely to be true, but theoretically you could imagine a model that starts outputting a disclaimer like "as a large language model" where the top 2 candidates for the next word are "I" and "it", with "I" leading to correct answers and "it" leading to wrong ones. Blocking it from outputting "I" would then preclude you from getting to the correct response.
This is a rather contrived example, but the "mind" of an AI is different from our own. We think inside our brains and express that in words. We can substitute words without substituting the intent behind them. The AI can't. The words are the literal computation. Different words, different intent.
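A toy sketch of that loop (the "model" here is a stand-in function, not a real network), just to show that nothing is carried between steps except the previous words themselves:

    # Toy autoregressive loop: the only state carried from step to step is the
    # growing sequence of previous words. `fake_model` is a stand-in, not a real network.
    def fake_model(tokens: list[str]) -> str:
        # A real transformer would attend over `tokens`; here we just invent a word.
        return f"word{len(tokens)}"

    tokens = ["As", "a", "large", "language", "model"]
    for _ in range(5):
        next_token = fake_model(tokens)   # computed only from the previous words
        tokens.append(next_token)         # the output becomes part of the next input
    print(" ".join(tokens))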
Does more computation mean a better answer? If I ask it who was the king of England in 1850, the answer is a single name; everything else is completely useless.
You just proved yourself incorrect by picking a year when there was no king, completely invalidating "a single name, everything else is completely useless".
Makes me wonder if, when forcing it to do structured output, you should give it the option of saying "error: invalid assumptions" or something like that.
It's potentially a problem for follow-up questions, as the whole conversation, up to a limited number of tokens, is fed back into the model to produce the next tokens (ad infinitum). So being terse leaves less room to find conceptual links between words, concepts, phrases, etc., because there are fewer of them being parsed for every new token requested. This isn't black and white though, as being terse can sometimes avoid unwanted connections being made and tangents being unnecessarily followed.
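Roughly what "fed back into itself, up to a limited number of tokens" looks like in code. The token counting and the limit below are crude placeholders, not how any real client actually does it:

    # Sketch: each new request resends the conversation, truncated to a token budget.
    # Whitespace splitting stands in for a real tokenizer; the limit is made up.
    CONTEXT_LIMIT = 4096

    def count_tokens(text: str) -> int:
        return len(text.split())  # crude stand-in for a real tokenizer

    def build_prompt(history: list[str], limit: int = CONTEXT_LIMIT) -> str:
        kept: list[str] = []
        used = 0
        for message in reversed(history):      # keep the most recent messages
            cost = count_tokens(message)
            if used + cost > limit:
                break
            kept.append(message)
            used += cost
        return "\n".join(reversed(kept))       # oldest kept message first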
I mean in the general case. I have my instructions for brevity gated behind a key phrase, because I generally use ChatGPT as a vibe-y computation tool rather than a fact finding tool. I don't know that I'd trust it to spit out just one fact without a justification unless I didn't actually care much for the validity of the answer.
I'm not an expert on transformer networks, but it doesn't logically follow that more computation = a better answer. It may just mean a longer answer. Do you have any evidence to back this up?
Isn't it an implementation detail that that would make a difference? There's no particular reason it has to render the entirety of its output, or compute fewer tokens, just because the final response is to be terse.
> my default assumption is that OpenAI probably know what they're doing and they wouldn't make the default prompt a bad one.
That's not really a great assumption. Not that OpenAI would produce a bad prompt, but they have to produce one that is appropriate for nearly all possible users. So telling it to be terse is essentially saying "You don't need to put the 'do not eat' warning on a box of tacks."
Also, a lot of these comments are not just about terseness, e.g. many request step-by-step, chain-of-thought style reasoning. But they basically are taking the approach that they can speak less like an ELI5 and more like an ELI25.
It works. I agree, more words seem to result in better critical rigour. But for the majority of my casual use cases it is capable of perfectly accurate and complete answers in just a few tokens, so I configure it to prefer short, direct answers. But this is just a suggestion. It seems to understand when a task is complex enough to require more verbiage for more careful reasoning. Or I can easily steer it towards longer answers when I think they’re needed, by telling it to go through something in detail or step by step etc.
The main benefit of asking for terseness in your preferences is that it significantly reduces pleasantries etc. (Not that I want it completely dry and robotic, but it just waffles too much out of the box.)
My experience as well. Due to how LLMs work, it's often better if it "reasons" things out step by step. Since it can't really reason, asking it to give a brief answer means it can have no semblance of a train of thought.
Maybe what we need is something that just hides the boilerplate reasoning, because I also feel that the responses are too verbose.
That one is easy: Generate the long answer behind the scenes, and then feed it to a special-purpose summarisation model (the type that lets you determine the output length) to summarise it.
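A sketch of that two-pass idea; both helpers below are stand-ins for real models (an LLM for the long answer and a length-controllable summariser), so treat the names as placeholders:

    # Sketch of the two-pass idea: generate at length, then compress.
    # The two helpers are stand-ins; in practice they would call an LLM and a
    # summarisation model that lets you set the output length.
    def generate_long_answer(question: str) -> str:
        return f"(imagine a long, step-by-step answer to: {question})"

    def summarise(text: str, max_words: int) -> str:
        return " ".join(text.split()[:max_words])  # placeholder for a real summariser

    def answer_tersely(question: str, max_summary_words: int = 50) -> str:
        long_answer = generate_long_answer(question)   # full reasoning, hidden from the user
        return summarise(long_answer, max_words=max_summary_words)  # what the user sees

    print(answer_tersely("Who was the monarch of England in 1850?"))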
I'd be less inclined to put that instruction there now with the faster Omni, but GPT-4 was too slow to let it ramble; it wouldn't get to the point fast enough by itself. And of course it would waste three seconds starting off by rewording your question to open its answer.
In my system prompt I ask it to always start with repeating my question in a rephrased form. Though it’s needed more for lesser models, gpt4 seems to always understand my questions perfectly.
It's even more interesting if you take into consideration that for Claude, making it be more verbose and "think" about its answer improves the output. I imagine that something similar happens with GPT, but I never tested that.
I have been wondering, now that context windows are larger, whether letting it "think" more will result in higher-quality results.
The big problem I had earlier on, especially when doing code-related chats, would be it printing out all the source code in every message and almost instantly forgetting what the original topic was.
We tried the alternative, and it's less productive.
At some point, there's theory and there's practice.
Since LLM output is anything but an exact science from the user's perspective, trial and error is what works.
You can state all day long how it works internally and how people should use it, but people haven't waited for you; they've used it intensively, for millions of hours.
I am not sure assuming they know what they are doing is too reasonable, but it might be reasonable to assume they will optimize for the default, so straying too far might be a bad idea anyway.