Hacker News
Show HN: Automatic prompt optimizer for LLMs (jina.ai)
126 points by artex_xh on April 21, 2023 | 55 comments



Here is a prompt you can use to achieve the same thing directly in GPT4 without any extra services:

  I want you to become my prompt master creator, by helping me to create the best possible prompt. 

  In order to do this we will follow the following process: 
   
  - First, you ask me what the prompt is about. I will answer you, and we will go through the next step. 
  - Based on the answer I gave you, you will generate the following: 
   - An improved prompt, concise. 
   - Relevant questions you might have to improve the quality of the prompt. 
  - We will go through this process repeatedly, with me providing additional information to you, and you updating the prompt to improve it, until I say we are done.


But ChatGPT has no special introspective powers. It's guessing based on the data it was trained on, which was primarily written before GPT by people who didn't know how GPT would work.

So it will give you suggestions that sound plausible, matching how we tend to think an AI might work, but there's no reason to expect them to be correct.


The prompt it generates looks good and makes sense, but I doubt it's always correct or optimal. But then I doubt a human assistant could give a perfect prompt either (enemy of the good, etc.).


> The prompt it generates looks good and makes sense

What I'm trying to say is that that's exactly what it's optimized for. It's predicting what sounds plausible based on all the pre-GPT writing about AI.

But GPT was revolutionary! A lot of the pre-GPT blogspam, Reddit comments, fiction, and so on was wrong about how AI works, in exactly the way you've been socialized to find plausible.

In general, plausibility is the wrong metric to evaluate GPT on, and it's more wrong than it seems like it should be.

Edit: And in contrast a human trying to write good prompts will have data about how GPT works that they've personally observed, and they'll weigh that data much higher than say Star Trek.


GPT is really bad at optimizing prompts this way because it has no ability to simulate the effects of a change; they're far too complex. Tools like this need to log results and A/B test.

GPT can be layered and made into an agent, etc., to do the A/B testing, or to extend prompts by adding more edge cases over time. But the effects of a single word change are far too complex for base GPT output to say anything meaningful about.
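A minimal sketch of what that log-and-A/B-test loop could look like. Everything here is an illustrative placeholder, not any real API: `complete` stands in for an actual LLM call, and `score` for a real evaluation metric.

```python
# Hypothetical sketch: A/B-test two prompt variants against a logged set
# of test cases and keep whichever prompt scores higher overall.

def complete(prompt: str, case: str) -> str:
    # Placeholder for a chat-completion call; a real one would return the
    # model's answer for `case` under `prompt`.
    return f"{prompt}::{case}"

def score(answer: str, expected: str) -> float:
    # Placeholder metric: substring match. Real evals need richer scoring.
    return 1.0 if expected in answer else 0.0

def ab_test(prompt_a: str, prompt_b: str, cases: list[tuple[str, str]]) -> str:
    """Run both prompt variants over the logged cases and keep the winner."""
    total_a = sum(score(complete(prompt_a, c), exp) for c, exp in cases)
    total_b = sum(score(complete(prompt_b, c), exp) for c, exp in cases)
    return prompt_a if total_a >= total_b else prompt_b
```

The point is only the shape of the loop: run every candidate prompt over the same logged cases, score each answer, and let the aggregate score pick the winner.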


I'm sure it could be improved, including telling it to do what you suggest. Have you tried it as is though?


Yes, I used it. The optimized prompt was not better for my use case. The playground was useful, though. I believe prompts are really only optimized by running them through many scenarios and understanding how changing a single word affects things down the line, and then adding a bunch of hardcoded conditions that change the system/assistant messages on demand as an output of the tool.


From my experiments so far, I'm unsure whether ChatGPT is effective at optimizing prompts, even though it thinks it is.


The workflow you get with this is that it helps you think about what you really want the prompt to answer.

For example, my latest try started with this initial prompt:

> How can I create collisions in Bevy and Rust?

After 4-5 messages back and forth, I ended up with:

> What is the most efficient method to implement 2D collision detection for a large number of constant-radius circles randomly distributed in Bevy and Rust without using third-party tools, focusing on detection only, without any unique properties or attributes affecting the detection process?

Which is much more explicit and targeted at what I initially wanted to do but didn't write. It ended up helping me implement a Quadtree solution, which I didn't have any experience with before, but overall it went smoothly.


How do you know that the prompt you got to is at least a local maximum? Would changing "method" to "function" provide better results? Did you do any benchmarking?


The aim is not any local maximum; it's improving the prompt by getting help providing more context.


That's a pretty bad optimizer then.


I've seen it help me once so far in a way I hadn't thought of: basically, the chat completion engine was reading a bug bounty report on my behalf, and believing generic, vague claims. ChatGPT-3.5 didn't "fall" for this, but 4 did consistently.

After failing to improve performance manually, I eventually asked 4 to improve my prompt after explaining the problem and it basically added "Don't assume the existence of any examples".

This makes perfect sense in hindsight, but I had been approaching things from the other direction - directing it to only consider explicit examples in the report, and for some reason that didn't work.


Or just provide it with sample contexts and expected responses, and ask for a prompt to produce those responses from the given context.


If you prompt it with "sample contexts and expected responses," that is, in fact, a prompt that would produce those results. It's not a bad technique for crafting prompts, either.


That's best when possible, but if each context and response is very large, a good prompt is the best way to distill the information without fine-tuning.

I suppose you could try asking for shorter examples to provide for a few-shot prompt; that could work well in some cases.


You'd be better off with two chats: one is your prompt creator. You take the prompt, use the other chat to create an answer (a zero-shot chat, cleared every time), and copy the answer back into the prompt creator with feedback on what the answer got wrong.

That's because, from GPT's point of view, it is always providing its most accurate answer already, and the only improvement you'd get is in context clarity as defined by GPT's understanding of language. That's not bad by itself, but it's not as effective as having GPT fight itself.
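A rough sketch of that two-chat loop. The `chat_creator`, `chat_answerer`, and `grade` callables are injected placeholders for real chat sessions and a real grading step; all names here are illustrative.

```python
# Hypothetical sketch of the two-chat loop: one chat writes the prompt,
# a fresh zero-shot chat answers it, and the answer's mistakes are fed
# back to the creator chat.

def refine_prompt(task, chat_creator, chat_answerer, grade, rounds=3):
    """Alternate between a prompt-creator chat and a fresh answerer chat."""
    prompt = chat_creator(f"Write a prompt for this task: {task}")
    for _ in range(rounds):
        answer = chat_answerer(prompt)  # fresh session, no shared history
        feedback = grade(answer)        # what the answer got wrong, if anything
        if not feedback:
            break                       # good enough, stop iterating
        prompt = chat_creator(
            f"The answer was: {answer}\n"
            f"It got this wrong: {feedback}\n"
            f"Improve the prompt."
        )
    return prompt
```

The key design point is that the answerer never sees the refinement history, so each round measures the prompt on its own, the way a real user would hit it.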


Yes, the way I wrote the prompt, it works as a session, so you'll open a new session for each prompt you want to work on.

Then when you want to use that prompt, you'll open yet another session. Sorry if that was unclear.


Thank you, I am collecting prompt creators.


Computer science has become so weird ...


If you think creating and using prompts are computer science then yeah, it has gotten weird :) Fortunately, I think that's a minority view.


Damn it! I was just about to get promoted to senior prompt engineer and now they just automate my job. I worked so hard on reschooling myself from metaverse pixel ape designer, and now I’ll have to learn something new. Any suggestions?


Don't despair. There's still work out there for prompt optimizer prompt engineers, producing exceptional prompts for prompt optimizers to optimize. It's a little different from being a prompt engineer, but you've shown the sort of flexibility that leads me to believe you could cross-train.


Go back to the metaverse and complete the import of Zuck’s legs. You’ll make a billion dollars.


Prompts don't write themselves, so you will still be needed, except maybe you don't need to be so senior.


You're asking people?


I wish I could give this comment gold


Modern equivalent of selling shovels during a gold rush


Yes, 12 months ago Jina was an open-source semantic search tool, then a whole bunch of things, and now a prompt optimizer and a bunch of AI tools.


+1 to this, I feel like they are in the middle of an existential crisis. They keep shipping out product after product each month now. I think that they are struggling quite a bit to meet investors' growth expectations.


> By providing any User Contribution on the Platform, you grant us and our affiliates and service providers, and each of their and our respective licensees, successors and assigns the right to use, reproduce, modify, perform, display, distribute and otherwise disclose to third parties any such material as per your instruction.

> By providing any User Contribution on the Platform, you grant us and our affiliates a perpetual, worldwide, royalty free, fully paid up, license to use, reproduce, modify, create derivative works, perform, display, distribute and process any such User Contributions in order to maintain, improve, enhance, or secure the Platform and any Services provided via the Platform.

Prompts are not code, fine. But are you saying that you'll potentially make use of prompts provided by users?


> Prompts are not code

ChatGPT, what were the causes of WWII?

For each cause

  specify the people involved

  for each person

    print out a brief biography


If a third party using your prompts is a significant risk then you're toast anyway.


This whole apparatus makes me want to go completely offline. Maybe Kaczynski was right.


You know, I feel you.

The reality of modern business and purpose is morphing into a caricature of itself.


Nth time this week someone repackages a call to GPT4 as a way to improve/evaluate LLM outputs. Guys, just stop.


Well, if you look at lexica.art, I don't think the business is about repackaging GPT4; it's about getting into the information flow, because prompts are RLHF data, and in the end this is a case of faking it till you make it.


I wonder how it works under the hood.

I'm sure there's an LLM somewhere, but is it as simple as a (very specific, elaborate) prompt for each service run through GPT4, or something more specific... like breaking it up with actual code and running the reconstructed bits through a finetuned LLM?


The way I would do it (tweaked for each target LLM) is to give GPT4 a prompt telling it what kinds of things make for a good prompt, followed by a dozen before/after examples (spanning various domains).
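A sketch of that meta-prompt construction. The guidance text and before/after pairs below are invented for illustration; the only point is the assembled structure.

```python
# Hypothetical sketch: build a meta-prompt from guidance plus before/after
# few-shot examples, ending with the user's raw prompt to be improved.

GUIDANCE = (
    "You improve prompts. Good prompts state the task, the output format, "
    "the constraints, and the audience explicitly."
)

EXAMPLES = [
    ("Fix my code",
     "Review this Python function for bugs and suggest fixes, "
     "explaining each change briefly."),
    ("Write about dogs",
     "Write a 300-word introduction to dog training for first-time owners, "
     "in a friendly tone."),
]

def build_meta_prompt(user_prompt: str) -> str:
    """Assemble guidance + few-shot before/after pairs + the user's prompt."""
    shots = "\n\n".join(f"Before: {b}\nAfter: {a}" for b, a in EXAMPLES)
    return f"{GUIDANCE}\n\n{shots}\n\nBefore: {user_prompt}\nAfter:"
```

The result is sent as a single completion request; the trailing "After:" nudges the model to emit only the improved prompt.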


good idea

just so you know the automated translations of your website are completely AWFUL

maybe use GPT or deepl to translate it?


With an optimized prompt!


Since this is a Show HN, here's my immediate feedback from your homepage. The copy that ends with this sentence: 'Say goodbye to subpar AI-generated content and hello to prompt perfection with PromptPerfect!' feels overdone and like it's written by a prompt rather than a human.

Since you claim in that copy to be able to optimize a prompt for better output, it felt like an odd dissonance to read it and have that reaction, given what your product claims to do. Hope this helps!


Why isn't this called "prompt-imization"? Maybe prompt-opt or opt-prompt is easier...


Because prompt-imization would be understood as "converting into a prompt", not optimization of a prompt.


sure


Circular loop business, very direct. Use gpt to create better prompts and sell it.


Story of many GPT products that came out in the past 6 months.


This is cool, but why does it require a login? I would guess it uses an LLM under the hood with a single-shot prompt to refine a given prompt.


> but why does it require a login?

$$$


Actually pretty interesting. I got some good results with it for Midjourney. Huh. I'll use my free credits and evaluate afterwards.


Aside: I opened this in a new tab to browse later, then glanced over at the icon and wondered how I had opened a PayPal page.


I had the same problem. OP, your logo/favicon is a navy blue bold capitalized P superimposed on an electric blue P of the same size, slightly off to the lower right.

This is extremely similar to the PayPal logo/favicon and could be problematic for your branding (I'm the second user who got confused just in this thread), or could even cause you legal issues.


The "How does it work" section has layout issues on mobile: some text goes off the screen.


[flagged]


"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html


You rock! Awesome idea/execution. "You can't see the forest for the trees."



