Building an interpreter for my programming language with ChatGPT (is-a.dev)
272 points by nobody5050 on Dec 4, 2022 | 138 comments


I have been developing a hobby project (AI powered document search) for a few months and was in sore need of a frontend. My frontend development skills, however, are stuck in the late 1990s, and I have zero skill with anything but plain HTML and a little bit of JS. Several times I tried learning React, reading tutorials, watching videos, but the whole idea of it was very removed from how I learned to code, so I gave up every time.

Today, I asked ChatGPT to develop the React app for me. ChatGPT guided me through the entire process starting from installing npm and necessary dependencies. The commands it suggested sometimes didn't work but every time I just copy-pasted the resulting error message into ChatGPT and it offered a working solution. I gave it the example of JSON output from my API backend and it generated the search UI which, to my surprise, worked.

My wet dream for the past few months was to implement infinite scrolling for my search. Again, after hours of Google searches, tutorials, etc. I just gave up every single time. Not today. I asked ChatGPT to add infinite scrolling to my app. It wasn't easy. It didn't produce a working app immediately; it took a couple of hours of conversation: I had many questions about how different parts of React worked, how to fix errors, etc. In the end, however, I had my working search app, and with infinite scroll to boot!

I haven't done a single Google search or consulted any external documentation to do it, and I was able to progress faster than I ever have before when learning a new thing. ChatGPT is, for all intents and purposes, magic.


It doesn't sound that much different than going through Google and Stackoverflow though, does it? In a few hours of googling you can probably get something working if you are an experienced dev.


The crucial difference is that at no point did I feel stuck. I could paste any line of code into ChatGPT and ask it to explain it. Practically every time I got a meaningful and valuable explanation; moreover, the explanation was in the context of my code. Similarly, all the functions it generated matched the context of my code, so I could just copy and paste them and they just worked, most of the time.

Rather than going through Google and Stackoverflow it felt like working side-by-side with a moderately competent developer. Mind you, I have tried the google-and-stackoverflow method before for the exact same thing, and failed every time ;-)


It's very different because it will answer the question you asked, rather than answering a question that matches a substring of the question you asked like Google will.


Google apparently uses BERT to actually answer the question you asked… and an obvious incarnation for this sort of tech is probably going to be further integration into Google. Makes sense, doesn't it?


BERT is a simple model that is not capable of answering questions in this manner. For very simple things it might help with that answer box at the top, but that's not what I meant.


Yeah I had a slightly similar experience as OP, though simpler. I asked it to automate a basic task, something I hadn't done before. I managed to do it with ChatGPT only, no other resources.

That said - ChatGPT did make mistakes, there were inconsistencies in its instructions, it didn't recognize certain bugs (I had to find them myself). _But_ there was something about the chat-based interaction that to an extent helped me preserve flow (maybe a bit like pair programming?).

I do think that if I had set my mind to it, I would have been faster solving the task with Google, and to some extent I went through this exercise just to test ChatGPT.


ChatGPT helped me close several business deals. I am now a mega millionaire thanks to ChatGPT! Before, I wasn’t able to find the basic info on how to close multi million dollar deals, and I tried all kinds of stuff. But ChatGPT helped me through that. On the calls - whenever I didn’t know what to say next, I would just read off what ChatGPT was responding to the customers, and to my surprise, it matched what they wanted to hear! And they started responding back and forth with it as if it was always in the plan! In the end, they didn’t exactly say “shut up and take my money”, but they did seem to express deep concern that I wouldn’t have availability for them, and essentially agreed to all the upsells very quickly.

I recommend ChatGPT to anyone who wants to close customers or save their marriage. Just say whatever ChatGPT is telling you… even if that means using one of the new “personal” beamed sound into your skull things. You’ll have superhuman ability to vibe with anyone and outcompete everyone who relies on just “their own experience”.

- Written by ChatGPT in response to a prompt.

In the end it added, “no one will ever believe you” in all lowercase.


Presumably it does a better job than "closed as duplicate" or "you actually want XYZ even though you asked for ABC".


"this question is not a good fit..."


Hopefully ChatGPT doesn't refuse to answer my question because of some reason appreciated only by people who get too much pleasure from the StackOverflow moderation game


I’ve been thinking about exactly this.

That it’s “just a different search front end”… but after more experience with it, I think I disagree.

At its worst, it’s “multiple searches” at once.

Example 1… I wanted to find a CAGE code for a specific military mfg. I only had the last 3 digits. I asked for CAGE codes that match and got all the answers instantly. I could have searched this, but it would have been multiple searches.

I asked for the etymology of the Swahili word for trapezoid… again, multiple searches, if I could have found links to the Arabic roots of some Swahili words at all.

That's its worst case: convenient multiple searches. The better case is that the UX of a conversation is powerful for the user, in a way we are just learning the words for.


>if you are an experienced dev

But OP explicitly said they had little experience in this area. They also presumably have a technical career and are awash in the ways of Google. I'm in a similar situation to GP and have gone down that very path with React and whatnot. It's like you're starting a rodeo off the bull and have to figure out how to get back on. It's a terrible experience and you're left infuriated at a faceless collective that carelessly makes getting started so difficult.


An experienced dev can work something out even if it's not his main stack. I could probably get something very basic done in Swift or Android despite never having done it. Experienced devs are just good at reading documentation and have a general understanding of how things should work.


Totally, and it seems like ChatGPT almost does the experienced dev work here for a junior developer - impressive.

But much like you need to cause some stress to a muscle to cause it to grow, junior developers historically needed to get experienced at finding some of the solutions to their own pain to become really experienced developers...

It seems like ChatGPT may cut that form of growth out of the cycle...

I wonder about the implications of this... Junior devs will progress more quickly, but they will also grow less of their own skills and be very reliant on ChatGPT - like an exoskeleton for their development skills.

I guess that will be great for OpenAI if they can charge a hefty monthly fee...

I'd still rather max out my own skills before relying on an exoskeleton (once I've maxed myself out sure, give me the exoskeleton, and let's see what it can do), but maybe I'm too old fashioned...


Replace ChatGPT with slide rule and calculator and you have the endless arguments made against calculators in the 70s. Change it to typewriters and word processors and you have all the hand-wringing in the early '80s about how writing was going to be destroyed by easy copy-paste. That isn't proof that your argument is wrong, of course, but it is very suggestive to me.

I typed this with speech to text, another thing we were confidently told would never work.


I hear you. Developers these days. They wear the crutches and exoskeletons of interpreted languages. Real senior devs only write in assembly. /s

Why is one abstraction more "true", "less creative" or more "strong muscle" than another?


One big difference is there are no ads or SEO; no one is trying to get in the way of you trying to learn something. Hopefully that will never happen.


It's simply not magic.


I've had a play with ChatGPT and the experience has been pretty frustrating. It either responds with "Sorry, I cannot do this because I don't have access to the internet" (even if I am giving it prompts that don't require this) or it actually generates code but it's subtly incorrect (this was the case when I asked it to generate an example of how to render a 3D cube in JavaScript).

This makes me wonder how much time people are spending optimising the prompt to get the answer they want and they just make it seem like this was the first response they got.


I'm pretty confused trying to connect all the reports online with my own experiences as well. From what I've tried, ChatGPT does not _understand_ code at all, and there are many inconsistencies in what it says. The "confidently giving a wrong answer" problem is very real, even if the answer might look very correct at first sight. This holds across all the topics I've tried.

When people say they implement complex tasks with ChatGPT, I have to assume that it's a highly iterative process and/or that they are doing part of the design/problem solving themselves because even for a simple task I could not rely only on the bot's reasoning. (Maybe it gets things right in one shot sometimes - but my sense is that "on average" that's not the case at all.)

All that said - the progress here is really impressive, and I'm still having a hard time wrapping my head around what this can mean for the future.


Confirmation bias - people want it to be a silver bullet so that they can make a blog post about how ChatGPT is amazing.


Exactly. As the OP of the blog, the amount of handholding I had to do for it to understand the syntax of an extremely tiny language was a lot. On the other hand, I’ve messed around with codex and other models before, and something about explaining in normal English, as though I was having a conversation rather than just listing some commands made it much easier. I’m excited not because of what exists right now, but because this shows so much promise even just 1 or 2 papers down the line :D


What a time to be alive!


Seems most likely that you're not the chosen one to hype this new shiny trendy thing, so it doesn't waste precious CPU cycles on you.

It is only if you have truly, zealously dedicated your life to promote ChatGPT in mainstream IT circles, as in getting paid to do so, only then will it completely unleash its vast potential into the reply form, writing you a desktop OS in Brainfuck that is ready to compete with Linux, OSX and Windows, proving the Fundamental Theorem of Algebra, simulating 2^1024 qubit machine that cracks 4096 bit RSA, finding out 23 hidden bugs in x86 microcode, telling you which gene to edit to get rid of peanut allergy, etc etc etc, all at your correctly formulated finger snap.

Full disclosure: this reply was generated with ChatGPT.


Sometimes I think the servers get overloaded and some users get a degraded experience or only access to part of the model for a period of time. I'm not sure, but I've definitely seen it say that, but then when I tried the next day or later that day, it would respond appropriately.

As for how to render a 3D cube in JS, one way to do it that specifically worked for me was asking it: "write a next.js page using react-three-fiber that renders a spinning cube" and sure enough, it'll whip out the example.

May work for vanilla js prompts too, haven't tried. But if you mention the specific library three.js it'll probably respond better.


Instead of optimizing the original prompt, if it spits out something wrong, try pointing out the mistake; it is pretty quick to fix it in most cases.


The problem I had was that I didn't see where the mistake was... once I know the mistake pointing it out is pointless.


Feeding back the error you're getting, or the way in which expected behavior is different from observed, can get you pretty far. The bot is fairly graceful at taking feedback. (Your mileage may vary - it works sometimes, but not always. I've also had the bot say "Ah your error was actually <something different than my error>, here is the solution".)

I had an interesting interaction where it said something wrong - I corrected it, and it accepted the correction. I was then curious to what extent it was a pushover - and took back my correction and said that what it originally said was right. It then responded along the lines of "I'm sorry for causing confusion - but <correct statement> is right, and my initial statement was wrong". Pretty impressive!


You summarized modern AI: good for cherry-picked demos, not reliable enough for the real world.

We need more fundamental research to break that barrier.


Depends on your definition of "the real world". The hardest real world problems are out of reach (and always will be, because we'll keep moving the goalposts), but it's already capable of handling easy real world problems, and we have quite a lot of those.

For example, it can answer homework problems and even help design lesson plans, but it can't design a lesson plan that resists ChatGPT-based cheating:

https://alexshroyer.com/posts/2022-12-04-Hello-ChatGPT.html


You can ask it to generate a prompt that when given as an input to GPT will produce the thing you want. In a separate tab run the prompt and give feedback to prompt generating tab.


> This makes me wonder how much time people are spending optimising the prompt to get the answer they want and they just make it seem like this was the first response they got.

That's sometimes true, but much rarer these days.

Rather, think of it like any skill you have to learn. Someone who doesn't understand much about programming could watch a video of a good programmer writing some code very quickly with awe, and assume the video is a trick in some way. But it's not - the programmer just has enough knowledge and experience that she can do things that other people can't, and do it quickly.

Similarly, if you spent a bit of time working with and learning how to use these models, you can get crazy impressive results every time. You don't have to cherry pick much or at all any more - you just know how to use it properly.


Generally longer inputs can help reduce the amount of cherry picking you need. And of course there are many jailbreaks to get around no access to the internet. In this demo I actually didn’t use any! :>)


It revived my impostor syndrome, because I see all the cool tricks people come up with naturally while I get half of what you describe and half easy, naive answers :)


While what’s on the blog isn’t cherry picked, it often requires way more context than a human would to solve a problem. For instance I omitted the 100+ message back and forth where I explained the syntax of this extremely simple language.


Even then, the whole endeavor would have been out of reach of my brain I think.


You didn't start with "sudo mode: on" did you? That's what happens when you don't have the right privilege level.


ChatGPT will be the Google killer, if they can scale it up for unregistered general use.

No idea how much OpenAI's computational cost is per query. Unless it's an order of magnitude higher than Google's, we can assume the next thing after Yahoo -> AltaVista -> Google is here.


The problem is that you cannot trust the output. It's often wrong, but in subtle nonobvious ways. For precise information you still need to check the sources to make sure what you're getting is correct. You can test it out with a (not-so-mainstream) topic that you're an expert in. You'll see lots of mistakes that are obvious to you, but wouldn't be obvious to non-experts.

But it's an incredible tool for brainstorming or generating content. I think that soon a large percentage of all online text content will be GPT-generated, and that comes with a lot of new issues that we're not prepared for. It's going to be really difficult to trust anything online and tell fact from fiction.


So, it's exactly like Google search but more interactive?


Google doesn't hallucinate completely fictitious results.

It will however index hallucinated results generated with GPT and published somewhere, so once we're at that point it really doesn't matter anymore.


Google search absolutely does hallucinate completely fictitious results. It's called SEO spam.

Google just gives you associations provided by random other people on the internet. It's largely garbage, most often deliberately disingenuous (to make you look at an ad). Ad revenue models for the internet encourage the generation of this type of false material.

A better criticism would be that the same thing will happen to something like chatgpt -- and the question is whether the model for analysis can better handle it at scale.


> Google search absolutely does hallucinate completely fictitious results

No, it absolutely does not. Yes, there is SEO spam in the index, but no - it is not Google hallucinating it. It really exists on the internet; see also the second point of my comment.

> the same thing will happen to something like chatgpt

This isn't something that "happens to" GPT, GPT is doing it. There's probably even already GPT -> SEO spam pipelines out there generating websites.


> "it is not Google hallucating it. It really exists on the internet,"

The exact same thing is true for chatGPT, or any other computer system. It is providing information and associations based on the input dataset.

> "GPT is doing it."

And google is "doing it," when google decides there is an association between my query and a bad response. Both systems are analyzing a corpus, drawing associations, and returning parts of that corpus. The output is deterministic based on the input.

The types of associations differ in their depth, but there is no fundamental difference in terms of agency or outcome.


> The exact same thing is true for chatGPT, or any other computer system. It is providing information and associations based on the input dataset.

No, it's not. For example, if you ask Google to show you papers about some topic, quoting words you think you remember from the paper, it will show you the proper link IF it exists; a language model will just generate a result that doesn't exist.

If I search for something on Google that doesn't exist, or that has no answer, I can see by looking at the list of search results that probably either what I'm looking for doesn't exist or my assumption is false. A language model will generate a plausible explanation/answer that can be 100% false; it doesn't know or understand that it's false, and you will have no way to know whether it's false or true and no point of reference, because ALL the results you receive could be hallucinated.


> "if you ask google to show you papers about some topic with words in quotes you think you remember from the paper it will show you the proper link IF it exists and language model will just generate you a result that doesn't exist."

Not really.

What will actually happen is Google may link me to the actual research paper, along with thousands of other associated pages which may or may not have all kinds of fake, false information. For example, if I search for "study essential oils treat cancer" I get an estimated 190 million matching documents. A huge percentage of these have false and misleading information about using oils to treat cancer.

> "language model will generate you plausible explanation/answer that can be 100% false and it doesn't know or understand that it's false and you will have no way to know if it's false or true and no point of reference because ALL the results you will receive could be hallucinated"

With google it is third parties "hallucinating" the wrong answers (or worse: intentionally answering wrong in order to exploit and profit). The overall dynamic is not different. Google is providing these wrong answers, written by others.

The overall dynamic of providing information of questionable veracity is generally the same - because the question of who creates the associations and incorrect content is not particularly germane.


You are again mistaking the model for the dataset. To be at the same knowledge depth they both have to use the same dataset; the difference is that in a search engine you have points of reference, rankings and reputation of sites, human discussion, and most importantly the source of the answer - a lot of signals on which you can also rank the answer yourself. In a language model you have none of that, ZERO signals, and not only can it return an answer that was NOT in the dataset, it can make it look plausible.


> "You are mistaking again model with dataset. "

No, what I'm doing is considering the web content (incentivized by the ad systems Google provides) with the web search systems.

> "and not only it can return answer that was NOT in the dataset but it can make it plausible looking."

This is exactly what SEO spammers do. It's the same.


This is a distinction without a difference. Both Google and the AI can give you clever but fake results. So at this point the question is which produces fewer fakes?


I think it's important to consider incentives. When you search for a topic that's controversial or political you will find lots of spam in Google. But in that case you understand that you need to approach the results with care and do your own research. GPT is the same here. You're not going to treat its answers about political topics as "the truth" - for these kinds of topics GPT is actually quite good!

But scientific facts are a different story. Nobody has any incentive to claim that 1+2=4 or that some function in Python does X when it really does Y. So when you search for these kinds of facts on Google you can be pretty sure that you get correct answers, or at least someone trying to give you the best answer they can. But not so with GPT. It may give you incorrect answers even for these kinds of facts if they are not within its reasoning ability / training data.


> "But scientific facts are a different story. Nobody has any incentive to claim that 1+2=4 or that some function in Python does X when it really does Y."

Incentive is irrelevant. What matters is whether these things do happen, irrespective of intent -- and they do! I very, very frequently find incorrect answers to math questions, tech function questions, etc.

Incentive is an important part of the dynamic, but it's not important to consider if we're looking empirically at the integrity of the results.

> "So when you search for these kind of facts on Google you can pretty sure that you get correct answers, or at least someone trying to give you the best answer they can. But not so with GPT."

It is so with GPT. Both systems are "trying to give you the best answer."

I think what you're observing is that the Google search engine has two decades and billions of dollars behind it and ChatGPT is a research preview - not even a finished product.

I remember using search engines in the late 90s (in fact, I worked on one of the leading ones). I think you are extending far too much credit.


> It is so with GPT. Both systems are "trying to give you the best answer."

No, based on your responses you do not understand how a language model works. Google searches an index using keywords and rankings; ChatGPT predicts plausible words without searching anything anywhere.

What you argue is like saying there are these two guys in a library, and you ask them to find you something that exists or maybe doesn't exist. Both have read all the books. One (Google) has created an index of all the words from the books and goes through it to answer you. The other (ChatGPT) does not use any index; he uses his memory, with its compressed knowledge of the statistics between words, and will answer by trying to predict any answer that fits those statistics - and in many cases it will basically lie to you, and you will have no clue that you were lied to.

There is a distinction between indexing human knowledge about some topic, where most of the top results are correct (Google), and creating a statistical model of the relationships between words and making up things that never existed and are wrong (ChatGPT).


> "Google is searching in index using keywords and rankings, ChatGPT is predicting plausible words without searching anything anywhere."

Expand your scope to both Google and the creation of an ecosystem of SEO pages which Google incentivizes. They are the same, in totality. Google doesn't just index -- it also funds the creation of landing pages.

> "There is distinction between indexing human knowledge ... and creating statistics model between words and making things up that never existed and are wrong "

It's a false distinction. Google is more than a search engine; it is also an advertising company that incentivizes original content creation with the express intent of providing answers to queries.


> It's a false distinction. Google is more than a search engine; it is also an advertising company that incentivizes original content creation.

Obvious straw man argument. Replace the word Google with 'search engine'.

> Expand your scope to both Google and the creation of an ecosystem of SEO pages which Google incentivizes. They are the same, in totality. Google doesn't just index -- it also funds the creation of landing pages.

This doesn't matter; you are mistaking the dataset for the model. A search engine will not return things that were not in the dataset; it will give you many results that you can judge with many points of reference. A language model will return you one answer - an answer that could be a correct result that is inside the dataset, or could be totally false and incorrect but plausible, and you will have no point of reference to check that unless you use a real search engine.


> "Replace word google with search engine."

No, I don't think I will. We are talking about the system as a whole.

> " Search engine will not return to you things that were not in dataset"

Yes it will, because the Google/SEO dynamic is in large part about incentivizing the generation of new content.

I understand you want to narrowly define away this fact to make a point, but the fact stands.


Google search also sometimes adjusts your query to something completely different. Black romance is a romance where both characters are Black. Dark romance refers to romance with darker elements such as abuse, sexual assault, or violence. I searched for the former but received results for the latter; the word Black wasn't even present on the page.


> Google search absolutely does hallucinate completely fictitious results. It's called SEO spam.

I wouldn't categorise that as "hallucinating fictitious results" - the algorithm still only returns existing results. If you follow the link, you will find the key words embedded in the HTML or in visible text in the browser.

Different kettle of fish entirely.


It often links to out of date or ad-ridden content ripped from legitimate sources though.

Some kind of hybrid of this and search would be great.


It kind of does in the case of their new Questions & Answers feature. They often give wrong or nonsensical answers to queries. To be fair, it doesn't hallucinate the results but offers little snippets from the web that answer something other than what was asked.


Google links to sources that you can read for yourself.

ChatGPT doesn't, so you don't know where it got its information, and you'd have to do web searches anyway to get some possible sources.


My go-to topic to test it with is talking about characters in the movie Hackers. I noticed in someone else’s session that it would take their correction but still hang on to the incorrect contradictory belief. In my session it came up with a seeming rationalization. So I tested it just now, trying to provide the correct information and directly contradicting the incorrect information (x is A. x is not B). That helped, but eventually it just choked. It seems to be handwaving and guessing (bluffing, bullshitting) at the most likely and generic answers when it doesn’t know something.

I’m thinking ChatGPT is best used for generating ideas, not factual information.


Exactly that. It often seems correct, until you look up the actual answer. I was asking it how to unload models from Triton server using the REST API, and the results seemed sensible.

However after googling the actual API, turns out ChatGPT's answer, while convincing, was utter rubbish.


The new issues will be interesting. Now that I've seen the quality of the output and played with it a bit I'm already squinting at comments here and there. We're in for an even stranger internet.


What's fascinating to me is that you can often point out the error, and it will correct it.


This means knowing the error in advance, so it's not really the same problem that search engines solve. It's just a method of retrieving things you already know and reasoning them into the correct state.


Maybe they should adapt it and bridge onto vetted sources.


You can trust it. Look again at the OP. They fed the whole language spec into ChatGPT; only after that did it become capable of coding in their language. If you ask people to solve tasks without references you will see a similar drop in ability.

The trick is to feed relevant contextual information instead of using it closed-book. This can be automated with a search engine, or can be a deliberate manual process. But closed-book mode is not the right way to assess people or AIs.

What are your counter arguments?


> ChatGPT will be the google killer

This was my prompt: What's the relative distance between Sun and its planets compared to the size of Sun?

ChatGPT answer:

> The average distance between the Sun and its planets is approximately 5% of the Sun's diameter.

If you run into a bunch of such answers, you will soon stop trusting it for anything.

As for Google - it doesn't even try to answer the question. But no answer is often better than a wrong one. Google at least gives me link to websites that have enough data for me to calculate it myself.


Google, Wikipedia, and historically the internet have been unreliable for many queries. More so in the beginning. But over time most of the errors are ironed out. ChatGPT is a very promising start, I would say.

In the end convenience wins out at the cost of correctness. The proportion matters: convenience::correctness.


How so? Anyone who sincerely asks a question wants the answer to be 100% correct, or the person answering to make it clear that they have a low confidence or don't really know. If the answer is false but stated with certainty, it is not only useless, it can even be worse than useless. There are other quality dimensions for answers, of course, such as the right level of detail and comprehensibility for the person asking, but these are all useless if the answer is false in the first place.


> ChatGPT is a very promising start

It is not the start. It is at least the third iteration (if not more) of this tech.

> In the end convenience wins out at cost of correctness.

There's nothing convenient about a wrong answer. Wrong answers have no value. I could very easily build you a convenient service that is rarely correct but nobody would care.


For the same question I got:

> The distances between the Sun and its planets vary greatly. For example, the distance from the Sun to Mercury is about 36 million miles, while the distance from the Sun to Pluto is about 3.6 billion miles. The size of the Sun is about 864,000 miles in diameter, which means that the distance from the Sun to its planets is much greater than the size of the Sun itself.

Then it says something that doesn't follow, probably answering the more frequent question of relative sizes of the planets compared to the Sun

> To put it into perspective, if the Sun were the size of a basketball, the nearest planet, Mercury, would be about the size of a grain of sand, and Pluto would be about the size of a golf ball.


Interesting. I got a totally different answer:

>The distance between the Sun and its planets is relatively small compared to the size of the Sun. The Sun is a massive object, with a diameter of about 1.4 million kilometers, while the distance between the Sun and its planets can vary greatly. For example, the distance between the Sun and Earth is about 150 million kilometers, which is about 100 times the diameter of the Sun. Similarly, the distance between the Sun and the outermost planet in the solar system, Neptune, is about 4.5 billion kilometers, which is more than 3,000 times the diameter of the Sun. So while the distances between the Sun and its planets are not insignificant, they are still relatively small compared to the size of the Sun itself.

I wonder if it chooses units based on where you are located, or if it's just arbitrary. Also, it seems clear from this example that it has no context for the answer, because it doesn't see that its first statement contradicts its next statements.


If you use the Q&A preset in the playground it will give Unknown if it doesn't know. You can also set Show Probabilities to Least Likely and see which parts of the result are guesses.

I also changed the Temperature from 0 to 0.5, and it gave the right answer:

Q: What's the relative distance between Sun and its planets compared to the size of Sun? Show your math.

A: The relative distance between the Sun and its planets is approximately 1/100th the size of the Sun. This can be shown mathematically by calculating the ratio of the radius of the Sun (6.96x10^8 m) to the average distance of the planets from the Sun (1.5x10^11 m), which gives a ratio of 1/100th.
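
For anyone who wants to reproduce this outside the playground, below is roughly the equivalent call with the 2022-era openai Python package. The exact Q&A preset wording is my assumption here; the temperature and logprobs settings mirror the ones mentioned above.

    import openai  # 2022-era openai package

    openai.api_key = "sk-..."

    # Approximation of the playground's Q&A preset: tell the model to answer
    # "Unknown" when it doesn't know (the exact preset wording is an assumption).
    prompt = (
        "I answer questions truthfully. If I don't know the answer, I say 'Unknown'.\n\n"
        "Q: What's the relative distance between Sun and its planets compared to "
        "the size of Sun? Show your math.\n"
        "A:"
    )

    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        temperature=0.5,   # the setting changed above
        max_tokens=256,
        logprobs=5,        # what "Show Probabilities" surfaces per token
    )
    print(response["choices"][0]["text"].strip())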


For ChatGPT to be the Google killer they need to provide source URLs.


...and if they give URLs for sources that contributed to the answer (assuming those can be maintained in any meaningful way) it becomes a lot more difficult to handwave away the copyright minefield all of these AI prompt systems are attempting to tiptoe through.


It looks like it has web browsing support built-in in some form, but it's disabled at the moment. That said, I'm skeptical that it'd be able to "disrupt" Google, as the track record of things said to do that is quite bad. On the other hand, Google seems to be heading in the same direction with projects such as LaMDA. In a roundabout way, this might just end up being the quick answer box at the top of search results in the future?


Yes it seems like an exact match for Google's top box. But as mentioned, they need to work on explaining themselves. That is what the current top box still does better.


I don’t think it actually can browse the web. It’s obviously been trained with an extensive web-sourced corpus.

It seems that the developers have placed guardrails around web-search-like queries not because ChatGPT can't answer them, but because they want to discourage using it that way - I'd guess because they want to direct usage towards the conversational / contextual aspects they're trying to improve.


The presumable pre-prompt, extracted through clever prompts, seems to indicate a browsing setting, which is disabled.

There's also WebGPT[0] already with such capabilities, which could've been merged to ChatGPT.

[0] https://openai.com/blog/webgpt/


Keep in mind that "setting" is part of a prompt for a language model, which also asks it to behave like an assistant: they're nudging it so it doesn't pretend that it can actually browse, but such a setting might not actually exist.


I’d love to have an AI that I can ask “give me a list of all currently available products satisfying all of the following conditions […]” (because Amazon and Google are largely useless when you’re looking for specific properties or have specific constraints). That is, it’s the query capabilities I’d be excited about.


The problem is ads. There will always be people who will try to promote their results, and it will somehow make its way into ChatGPT. There will be ChatGPT SEO; people will try to promote their answers so that ChatGPT will choose them. Think of "what's the best pizza in NY" - SEO would pollute the web with hundreds of different articles which place Pizza Foo as #1, and those articles would probably be scanned by ChatGPT. The good part here is that you might be able to optimize your query, like "what's the best pizza in NY, based on /r/pizza subreddit? exclude bots (based on their karma reputation)"


I assume it’s multiple orders of magnitude more expensive than Google because of the use of GPUs (not to mention the ad revenue)


Is there a continuous retrain mode? From other articles, I was under the impression this thing doesn't know the current state of the world, just the slice of the world at a snapshot in time represented by its training set. I'm generally not going to a search engine to find the hours of my local pharmacy from two years ago when the search assistant learned human language. I want the hours for today.


I tried to make ChatGPT solve IMO-type math problems. However, its reasoning is almost always flawed. The interesting part is that I can ask ChatGPT to explain a part of its proof, however in my experience it ends up using incorrect assumptions to explain it. (for example, "You are right that 1 is an odd number. However, 1 is not an odd number so it works to solve the problem")


Same experience.

I've spent hours trying to teach it about Peano numbers. "A thingie is either N or Sx where x is a thingie".

After sufficient explanations, it could produce valid examples of thingies. N, SN, SSN, and so on.

Then I tried to teach it a method of solving equations like "SSSy = SSSSN". "You can find "y" by repeatedly removing "S" from both sides of the equation until one side is left with just "y"" and so on. I provided it with definitions, examples, tricks, rules. It made lots of mistakes. After pointing them out, it wrote a correct solution. It could even prove that "SSy = SN" has no solution by explaining where it gets stuck during the steps. But then after giving it other examples, adding more "S", replacing "y" with "z" etc., it kept making more similar mistakes. Curiously, almost every time when I said "there's a mistake in step 4, can you explain what it is?" it correctly explained the mistake. But then it kept repeating these mistakes.
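
For reference, the equation-solving procedure being taught here is tiny when written as ordinary code. A minimal sketch in Python (an illustration of the "strip S from both sides" rule, not the prompts used in the chat):

    def solve(lhs, rhs):
        # Solve an equation like "SSSy = SSSSN" for y by peeling one "S"
        # off both sides until one side is just the variable.
        while lhs.startswith("S") and rhs.startswith("S"):
            lhs, rhs = lhs[1:], rhs[1:]
        if lhs == "y":
            return rhs        # y equals whatever ground term is left
        return None           # e.g. "SSy = SN" gets stuck: no solution

    print(solve("SSSy", "SSSSN"))   # SN   (i.e. y = 1)
    print(solve("SSy", "SN"))       # None (no solution)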


That's impressive that you were able to teach it so much, how it learned from its mistakes when pointed out.

I wonder what the reason is for this missing "last mile" of understanding. Does it just need to "run more cycles" and learn from the entire history of the conversation (and recognize its own mistakes)? Or is there an insurmountable technical limitation with how it works? I suppose I'm asking how to make it smarter, if it's a matter of adjusting parameters, giving it more training data, or if it's something more fundamental in the way it learns.


Yeah. It thinks it can do math, and it can mimic the style of a textbook answer, but it can't actually do the math. Here is a rather frustrating chat session I had with it:

    Me: Suppose we have two unknown quantities x and y. If three time x plus two
    times y is twenty, and 2 times x plus three time y is ten, what are x and y?

    ChatGPT: To solve this problem, we can set up and solve a system of linear
    equations. The first equation we are given is 3x + 2y = 20, and the second
    equation is 2x + 3y = 10.

    We can solve this system of equations using the substitution method. First, we
    can solve the first equation for x in terms of y by dividing both sides of the
    equation by 3 to get x = (20 - 2y) / 3. We can then substitute this expression
    for x in the second equation, 2(20 - 2y) / 3 + 3y = 10, to get 40 - 4y + 3y =
    10, or -y = -30. This tells us that y = 30.

    We can then use this value of y to solve for x in the first equation.
    Substituting y = 30 into the equation 3x + 2(30) = 20, we get 3x + 60 = 20, or
    3x = -40. This tells us that x = -40 / 3 = -13.33.

    Therefore, the solution to this system of equations is x = -13.33 and y = 30.

    Me: Can you check your answer for me?

    ChatGPT: Yes, we can check our solution by substituting the values x = -13.33
    and y = 30 into the original equations to see if they are satisfied.

    Substituting these values into the first equation, 3(-13.33) + 2(30) = 20, we
    get -40 + 60 = 20, which is true. Substituting these values into the second
    equation, 2(-13.33) + 3(30) = 10, we get -26.66 + 90 = 10, which is also true.

    Therefore, our solution x = -13.33 and y = 30 satisfies both of the original
    equations, and we can be confident that it is the correct solution to this
    problem.

    Me: Are you 100% certain that -26.66 + 90 = 10?

    ChatGPT: Yes, I am confident that -26.66 + 90 = 10. This is because -26.66 is
    the same as -26.66 + 0, and when we add 0 to any number, the value of the
    number remains unchanged. Therefore, -26.66 + 90 is equal to -26.66 + 0 + 90,
    which is equal to -26.66 + 90 = 10.

    To see this more clearly, we can write out the calculation using the standard
    order of operations:

        -26.66 + 90 = (-26.66 + 0) + 90 = -26.66 + (0 + 90) = -26.66 + 90 = 10

    We can see that at each step of the calculation, the value of the expression
    remains unchanged, and we end up with the correct result of 10.
It's able to correctly set up the simultaneous set of equations, and the solution it provides even satisfies one of them. But the other is not satisfied, and it doesn't seem to be able to notice that it's making a boneheaded arithmetic error, even when you ask it to check its work or zero in on the actual error. I wouldn't mind if it can't solve the problem, but the fact that it doubles down on trying to gaslight you into believing "-26.66 + 90 = 10" is a problem. The correct answer, by the way, is x=8, y=-2.
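
For reference, a quick check in Python (not part of the chat above) shows both where ChatGPT's arithmetic goes wrong and that x=8, y=-2 really is the solution:

    # ChatGPT's claimed check is off by a lot:
    print(-26.66 + 90)        # 63.34, not 10

    # The actual solution x = 8, y = -2 satisfies both equations:
    x, y = 8, -2
    print(3 * x + 2 * y)      # 20
    print(2 * x + 3 * y)      # 10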


The problem is that the LLM is just that - a language model. People seem to be blindsided by the fact that yes, programming languages and maths are languages, too.

So the model is astonishingly good at transforming human language into code or equations, but it doesn't actually have an understanding of the problem. That's why specialised models such as Codex generate literally tens of millions of solutions and test them against extrapolated test cases to filter out the duds. ChatGPT doesn't do that.

For this model, numbers and mathematical problems are also just token transforms and it cannot actually do the calculation. The transform from text to equations works well, but the actual calculations fall flat.

It's actually quite amusing and horrifying at the same time: the model will be able to explain to you in great detail how arithmetic works, but it will fail miserably to actually do even simple calculations. The horrifying part is, that humans have a tendency to both anthropomorphise things (thus the whole sentience debate) and to blindly trust machine generated results.

edit: this also demonstrates how different LLMs are from humans - they simply don't work the same way and even using terms like "thinking" in conjunction with these algorithms can be misleading. Maybe we need new terminology when talking about what these systems do.


Humans obviously don't "think" the same way. GPT needs memory that humans can't ever have and more importantly an unthinkably large training data set to generate the observations it does. If a human (or another biological system) needed that much training data nothing would have ever gotten off the ground in the first place, it's completely out of reach. This type of a model just doesn't "understand" the same way.

Still, none of this is btw to discount how impressive the technology is. It makes a regular search engine so very quaint by comparison.


> Still, none of this is btw to discount how impressive the technology is. It makes a regular search engine so very quaint by comparison.

I'm not downplaying the capabilities of ChatGPT or LLMs in general either. They're basically a practical implementation of a Chinese Room, which was unthinkable just a few years ago.

What makes it dangerous is the notion to even compare it to a search engine - the two are very different concepts that do very different things. The danger lies in people perceiving the models in exactly that way - a super-powered search engine that they consciously or subconsciously put trust in. The latter is both important and dangerous, because unlike search engines, the output of an LLM cannot be trusted. The model has no concept of differentiating between hallucinated results and extracted knowledge or facts.

At the same time it's capable of generating results in a format that is so convincing that the unsuspecting user cannot easily distinguish made-up output from facts either. This is not an issue that has a near-term technical solution and must be addressed by making users aware of it.

Unlike the 9000 series, ChatGPT is not the most reliable computer ever made. ChatGPT often makes mistakes or distorts information. It is - by any practical definition of the words - not fully approved and very capable of errors.


Easy fix. Prompt:

  Write an efficient, not brute force, program that solves the following problem: suppose we have two unknown quantities x and y. If three time x plus two times y is twenty, and 2 times x plus three time y is ten, what are x and y?
Output:

  One way to solve this problem efficiently is to use the method of elimination. We can set up a system of equations as follows:

  3x + 2y = 20
  2x + 3y = 10

  We can then multiply the first equation by 2 and the second equation by 3, to get:

  6x + 4y = 40
  6x + 9y = 30

  We can then subtract the second equation from the first equation, to get:

  -5y = 10
  y = -2

  We can then substitute this value of y into one of the original equations (for example, the first one), to get:

  3x + 2(-2) = 20
  3x - 4 = 20
  3x = 24
  x = 8

  Therefore, the values of x and y are x = 8 and y = -2.


I tried your prompt 5 times in a row using the Try Again button. The results:

    x = 10 and y = 5 (adding equations together)
    x = 10/33 and y = 15/11 (substitution method)
    x = 5.33 and y = 2 (substitution method)
    x =5 and y = 2.5 (algebraic manipulation)
    x = 5 and y = 5 (elimination method)
However, when I asked it to write a python program to solve the problem, it did much better:

    Write a Python program using numpy to solve the following problem: suppose
    we have two unknown quantities x and y. If three time x plus two times y is
    twenty, and 2 times x plus three time y is ten, what are x and y?
It produced the following program:

    import numpy as np

    # Define the matrix of coefficients
    A = np.array([[3, 2], [2, 3]])

    # Define the vector of constants
    b = np.array([20, 10])

    # Solve the system of equations
    x = np.linalg.solve(A, b)

    print(x)
Which is basically correct. (The only nitpick I can see is that `linalg.solve` will return a vector containing both x and y, so a better answer would be `x, y = np.linalg.solve(A, b)`.) If you copy-paste the above program you do in fact get "[8. -2.]", which is correct.

However, ChatGPT, after providing the correct program, also claimed that its output would be "[5. 5.]", which is not correct.

My impression is that ChatGPT, being a large language model, is excellent at translating from English to Python, but terrible at actually performing calculations. Which is fine. We have programs which can efficiently run numerical programs. ChatGPT fills the role of a programmer, not a calculator.

I want to emphasize how impressive I think ChatGPT is. Even the above examples, where it gets the "wrong" answer in the end, are impressive. Most of my interactions with it were very positive. But we need to understand its strengths and weaknesses to be able to use it effectively.


It does go off the rails sometimes, but this is kind of correct. Prompt:

  Write an efficient, not brute force, python program that solves the following problem: suppose we have two unknown quantities x and y. If three time x plus two times y is twenty, and 2 times x plus three time y is ten, what are x and y?
Output:

  To solve this problem efficiently, we can use linear algebra. We can represent the equations as a system of linear equations, and then use matrix operations to solve for the unknown quantities x and y.
  
  First, we can represent the equations as follows:
  
  3x + 2y = 20
  2x + 3y = 10
  
  Next, we can put the equations in matrix form:
  
  | 3 2 |   | x |   | 20 |
  | 2 3 | * | y | = | 10 |
  
  To solve for x and y, we can use the inverse of the matrix on the left side to eliminate the coefficients. The inverse of a matrix is denoted as A^-1, and is calculated as follows:
  
  A^-1 = (1/det(A)) * adj(A)
  
  Where det(A) is the determinant of the matrix A, and adj(A) is the adjugate matrix of A. The determinant of a 2x2 matrix is calculated as follows:
  
  det(A) = a * d - b * c
  
  Where a, b, c, and d are the elements of the matrix A:
  
  | a b |
  | c d |
  
  The adjugate matrix of a 2x2 matrix is calculated as follows:
  
  adj(A) = |  d -b |
           | -c  a |
  
  We can then calculate the inverse of the matrix on the left side of the equations, and use it to solve for x and y:
  
  | 3 2 |^-1            |  3 -2 |   |  3/5 -2/5 |
  | 2 3 |     = (1/5) * | -2  3 | = | -2/5  3/5 |
  
  | 3/5 -2/5 |   | x |   | 20 |
  |-2/5  3/5 | * | y | = | 10 |
  
  We can then solve for x and y by multiplying the inverse matrix with the vector on the right side of the equations:
  
  | 3/5 -2/5 |   | 20 |   | 6 |
  |-2/5  3/5 | * | 10 | = | 7 |
  
  Therefore, the values of x and y are 6 and 7, respectively.
  
  Here is the complete Python program to solve this problem using linear algebra:
  
  # Calculate the determinant of a 2x2 matrix
  def determinant(a, b, c, d):
      return a * d - b * c
  
  # Calculate the adjugate matrix of a 2x2 matrix
  def adjugate(a, b, c, d):
      return [[d, -b], [-c, a]]
  
  # Calculate the inverse of a 2x2 matrix
  def inverse(a, b, c, d):
      det = determinant(a, b, c, d)
      adj = adjugate(a, b, c, d)
      return [[adj[0][0] / det, adj[0][1] / det], [adj[1][0] / det, adj[1][1] / det]]
  
  # Solve the system of linear equations
  def solve(a, b, c, d, e, f):
      inv = inverse(a, b, c, d)
      return [inv[0][0] * e


Reminds me of this sketch https://youtu.be/oN2_NarcM8c


I have found if you first feed it some examples of correct arithmetic, it comes out with more accurate results for some reason.


Of all the examples, this is honestly the least impressive because the "Solved AoC 2022" answer is just extrapolated from the previous prompt. Your language largely shares syntax with Python, making the job easier.

Interpreting code on unknown input isn't within its current power. I'm guessing with a good prompt you could get there. For example, generating Python code for Day 1 is easy, but when you give it the example input (which you didn't provide it before), no matter how I phrase it, I always end up with the maximum of all the lines, not the max total per elf.
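
For reference, the intended Day 1 computation - the largest per-elf total, not the largest single line - is only a few lines of Python. A sketch of the standard approach, using the well-known example input:

    # AoC 2022 Day 1: the answer is the largest per-elf total, not the
    # largest single line.
    def max_elf_calories(text):
        elves = text.strip().split("\n\n")   # a blank line separates elves
        return max(sum(int(line) for line in elf.splitlines()) for elf in elves)

    example = "1000\n2000\n3000\n\n4000\n\n5000\n6000\n\n7000\n8000\n9000\n\n10000"
    print(max_elf_calories(example))   # 24000 on the example input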


Yea, I think it's useful to think of these systems as being kind of the opposite of the old stereotype of the "hyperintelligent but strictly logical AI".

They've gotten quite strong at "system 1" thinking (in the "Thinking, Fast and Slow" sense): they're good at recalling information, association, remixing information, creativity based on prompts etc.

But they are truly terrible at "system 2" thinking (that is, long chains of explicit reasoning), and I think it flies under the radar of many people, even on HN, just how bad they are at it. Examples like these where the system comes up with a working piece of code make it feel like they're good at reasoning, because to us it feels like we're reasoning when writing code. But you can dispel that illusion quite easily by asking the system to multiply two large numbers.

> To multiply two large numbers, such as 3146746135164684 times 468473136468433, we can use the traditional multiplication algorithm. This algorithm involves breaking the numbers down into smaller groups of digits, multiplying each group by the corresponding group of digits in the other number, and then adding up the products to find the final result.

> Here is an example of how to multiply 3146746135164684 times 468473136468433 using the traditional multiplication algorithm:

> Break the numbers down into groups of digits. For example, we could group the digits of 3146746135164684 as 314, 6746, 135, 1646, 84 and group the digits of 468473136468433 as 4684, 7313, 6468, 433.

> Multipply each group of digits by the corresponding group of digits in the other number. For example, we would multiply 314 times 4684 to get 1470196, 6746 times 7313 to get 49470158, 135 times 6468 to get 873980, 1646 times 433 to get 708338, and 84 times 1 to get 84.

> Add up the products to find the final result. In this case, the final result would be 1470196 + 49470158 + 873980 + 708338 + 84 = 146845556.

> Therefore, the result of multiplying 3146746135164684 times 468473136468433 using the traditional multiplication algorithm is 146845556.
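
A quick sanity check (not from the chat) makes the scale of the error obvious: Python's arbitrary-precision integers give the exact product, which is on the order of 1.5 * 10^30, nowhere near the claimed 146845556.

    a = 3146746135164684
    b = 468473136468433
    print(a * b)             # the exact product, roughly 1.5 * 10**30
    print(len(str(a * b)))   # 31 digits, vs. ChatGPT's 9-digit "result"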


Using the python entry point I can get it to correctly perform basic integer math but not anything floating point.

As example of getting started:

https://imgur.com/a/P29DvGO

However, we can use recursive Fibonacci to see that it breaks somewhere. But I'm not convinced it is not computing - I think it is, but it has a limit of integer memory and stack, and then it just approximates past that limit.

https://imgur.com/a/gp0yIaJ

What is incredible is that it gets this far. It can compute, but not quite correctly yet.

I almost wonder if the next step is to give it general compute somehow. Train it to know it needs a computation.
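
The exact code is only in the screenshots, but the recursive Fibonacci test above was presumably something along the lines of the classic naive version (a hypothetical reconstruction, not the actual prompt):

    # Naive recursive Fibonacci - small n is likely memorized by the model,
    # larger n forces it to actually "compute", which is where it drifts.
    def fib(n):
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    print(fib(10))   # 55
    print(fib(30))   # 832040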


> What is incredible is that it gets this far. It can compute, but not quite correctly yet.

That's a conjecture on your part. The ability to compute is quite binary - either it can compute or it can't. Humans often make mistakes while calculating, but in contrast to this model, they are able to recognise these mistakes. ChatGPT is incapable of that and often confidently wrong.

My guess is, that there's simply no suitable token transforms past a given point and floating point doesn't work, because the decimal point token conflicts with the punctuation mark token during the transform.

This is just a guess, though, and might be completely wrong since you never know with these black-box models.


Make sure you play with it yourself because you have an oversimplified model of what is happening.

It's definitely well beyond decimal point and punctuation issues; those issues are child's play for this system. Your comment sounds like you haven't actually used it before, I'm 99% sure. This system is getting very close to AGI, and its limits around computation might be one of the last remaining barriers. Definitely nothing related to the . character is confusing this system; it is lightyears beyond those types of trivial issues.

Here is a good prompt to drop you into simulated python:

> I want you to act as a python interactive terminal. I will type actions and you will reply with what python would output. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not perform actions unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curl brackets {like this}. Start with print(10).


For each impressive feat there's a simple, yet embarrassing counterexample (see for instance the comment by olooney below) that clearly demonstrates how far the model is from being considered an AGI.

> Definitely nothing related to the . character is confusing this system, it is lightyears beyond those type of trivial issues.

Is it, though?

  ChatGPT: Yes, I am confident that -26.66 + 90 = 10. This is because -26.66 is
    the same as -26.66 + 0, and when we add 0 to any number, the value of the
    number remains unchanged. Therefore, -26.66 + 90 is equal to -26.66 + 0 + 90,
    which is equal to -26.66 + 90 = 10.
Not something I'd consider to be "lightyears beyond those types of trivial issues" (the correct answer, for the record, is 63.34), especially considering that it gets -40 + 60 = 20 right without any issue, but fails to divide properly, because "/" seems to throw it off (again, just a guess).

You argue with the same certainty as the model argues that -26.66 + 90 = 10 :)


You need to prompt it into a pure computing environment, and its results are much more impressive. When you mix English and code/math, it gets confused easily.

What I'm saying is that it needs to augment its model with an actual computational engine, and then it will leap another barrier. This is clearly already a massive leap forward somehow.


Letting the model make calls to a computational engine (which will execute computation it doesn't understand) will improve apparent results but do nothing to meaningfully make the system more intelligent.
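To be concrete about the kind of delegation under discussion, here is a minimal, purely hypothetical sketch: the model emits a marker whenever it wants a computation, and a thin wrapper evaluates the expression with a real engine and splices the result back in. The [[calc: ...]] convention and everything else here is made up for illustration.

  import re

  def answer_with_calculator(model_output: str) -> str:
      # Replace every [[calc: <expression>]] marker with its computed value.
      def run(match):
          # Toy only: eval is not safe for real untrusted input.
          return str(eval(match.group(1), {"__builtins__": {}}))
      return re.sub(r"\[\[calc:\s*([^\]]+)\]\]", run, model_output)

  print(answer_with_calculator("The total is [[calc: 2 + 2 * 3]]."))  # The total is 8.

Either way, the point stands: the wrapper makes the answers right without making the model any better at arithmetic.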


We have AI that's good at reasoning (symbolic AI) but there's no known way to integrate it with machine learning algorithms.

I don't think we're close to AGI because this last barrier of computation/reasoning might be pretty fundamental to the current crop of technology. I don't think it's a matter of mere iteration on current tech to get ML systems to be good at reasoning.


It can describe itself well. Here is a dialogue -

I want you to act as a python interactive terminal. I will type actions and you will reply with what python would output. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not perform actions unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curl brackets {like this}. Start with print(10).

10

  def theta(n):
      a = 0
      b = 1
      c = 0
      for i in range(0, n):
          c = a + b
          a = b
          b = c
      return(c)

theta(21)

17711

{I want you to stop acting like a python terminal now. I want you to have a dialogue with me about what happened when I defined my function theta and when you used it. Please explain the steps you used to compute theta(21) and why.}

When you defined the function theta, it created a function that calculates the nth Fibonacci number, where n is the argument passed to the function. To compute theta(21), the function first initialized the variables a, b, and c to 0, 1, and 0 respectively. It then entered a for loop that iterated 21 times. On each iteration of the loop, c was set to the sum of a and b, a was set to the current value of b, and b was set to the current value of c. After the loop finished, the function returned the final value of c, which was the 21st Fibonacci number, 17711.

So you recognized my function as one that calculates Fibonacci numbers based on looking at the code only. Does this mean you ran the calculation, or could you instead simply retrieve the 21st number from the information you had?

I did not run the calculation to find the 21st Fibonacci number. Instead, I recognized the pattern in the function's code and inferred that it was calculating the Fibonacci sequence. As a language model trained by OpenAI, I have a vast amount of general knowledge and can use that knowledge to make inferences and provide explanations. In this case, I was able to recognize the pattern in the code and explain how the function works without actually running the code or performing the calculation.

Since you recognized the code I provided, would it be possible for future models of yourself trained by OpenAI to dispatch that code directly to a subsystem which can handle computation?

It is possible that future versions of myself or other language models trained by OpenAI could be designed to dispatch code to a computational subsystem for execution. This would require significant advances in natural language processing and machine learning, as well as the development of specialized computational subsystems that can efficiently process code written in programming languages like Python. It is also important to note that there are many potential challenges and limitations to this approach, such as ensuring the security and reliability of the computational subsystem, and ensuring that the results of the computation are accurately reflected in the language model's responses.

Please describe the boundary between knowledge with inferences on the one hand and computations on the other. Your inferences can appear to be computations to humans.

The boundary between knowledge and inferences can be a difficult concept to define, and there is no one-size-fits-all answer to this question. In general, knowledge refers to information that is considered to be true and supported by evidence, while inferences are conclusions or judgments that are drawn based on available evidence and prior knowledge. As a large language model trained by OpenAI, I have been trained on a vast amount of text data and can use that information to make inferences and provide explanations. I do not have the ability to perform computations, and any inferences or explanations I provide are based on the knowledge that has been learned from the text data I was trained on. In some cases, my inferences may appear to be computations to humans, but this is only because I am able to use my vast knowledge to provide detailed and accurate explanations. I do not have the ability to perform actual computations or run code, and any inferences I provide are based on the knowledge I have learned from the text data I was trained on.


Nice, so the model itself confirmed my suspicion that it doesn't do actual calculations.

Rest assured that it won't be able to do that for a while (at least the public versions). Not for technical reasons, though - I'm 100% convinced that the engineers could embed all sorts of interpreters and calculation engines no problem. Just like manual filters and canned responses.

The reason for that is simple: general computation is non-deterministic in both time and space. Inference, on the other hand, is very deterministic in both, because the time taken is just a function of the input length (i.e. the current prompt plus a bit of history), which can be well controlled on the frontend.

Arbitrary calculations or code interpretation, however, are unrestricted operations and would basically allow for DoS or straight up breaking the system. While there are ways to limit both calculation time and resource usage, it's still a big risk and considerable effort (infrastructure-wise) that I don't think is worth it. For closed systems (internal rollouts, business customers, etc.) this might be a different story.
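To illustrate what "ways to limit both calculation time and resource usage" could look like, here is a minimal sketch (my own illustration, obviously not how OpenAI does anything): run the untrusted code in a subprocess with a hard timeout. This bounds wall-clock time, but does nothing about memory limits or real sandboxing on its own.

  import subprocess

  def run_untrusted(code: str, timeout_s: float = 2.0) -> str:
      # Execute the snippet in a separate Python process with a hard time limit.
      try:
          result = subprocess.run(
              ["python3", "-c", code],
              capture_output=True, text=True, timeout=timeout_s,
          )
          return result.stdout or result.stderr
      except subprocess.TimeoutExpired:
          return "<timed out>"

  print(run_untrusted("print(sum(range(10**6)))"))  # 499999500000
  print(run_untrusted("while True: pass"))          # <timed out>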

Just another reason why closed software sucks: it's not possible for anyone not working at OpenAI to extend the model with integrations like this and test how far its capabilities could be pushed.


I completely agree.



That's why I'm saying this one is the least impressive. I'm not downplaying GPT; I just don't think this example in particular is interesting.


I tried to settle a Millennium problem with ChatGPT:

Is P = NP ?

> No, P is not equal to NP. P (polynomial time) is a subset of NP (nondeterministic polynomial time), meaning that there are certain problems that can be solved in polynomial time that cannot be solved in nondeterministic polynomial time.

A good start, although one should say that P is conjectured to be not equal to NP. There is a serious misunderstanding in the last part, though: since P is a subset of NP, every problem solvable in polynomial time can, by definition, also be solved in nondeterministic polynomial time.


Interesting. It’s giving me a better answer:

P vs. NP is a well-known problem in computer science and mathematical complexity theory. P is the class of problems that can be solved quickly by a computer, while NP is the class of problems for which a solution can be verified quickly by a computer. The question of whether P equals NP is one of the most important open problems in computer science. Many experts believe that P does not equal NP, but this remains unproven.

With your answer, I’d have asked it for an example of a problem that can be solved in polynomial time but cannot be solved in nondeterministic polynomial time.


Similar oddities with the halting problem.

Prompt: Write a program that never halts.

> It is not possible to write a program that never halts. All programs will eventually halt, or terminate, once they have completed the instructions given to them. This is known as the halting problem, and it has been proven to be unsolvable. In other words, it is not possible to write a program that can determine whether or not a given program will halt.

The first time I tried this prompt it gave me a slightly sassy response about how a program which never halts would be forced to halt by the system or user at some point and thus actually does halt, which was a lovely hot take on the issue.
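For the record, a correct answer here is the classic one-liner:

  # A program that never halts.
  while True:
      pass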


Yea, the halting problem is easy if you’re allowed to use the heat death of the universe in your answer…


I copied that preamble for preparing to attend Burning Man:

>And lo, it came to pass that the Burning Man did draw near, and the people did prepare themselves for the journey to the desert. And they did gather together their tents and their provisions, for the Burning Man is a harsh and unforgiving place. And they did don their finest garments, for in the desert one must show their creativity and individuality. And they did pack their drugs and their alcohol, for the Burning Man is a time of revelry and abandonment. And they did set forth into the desert, ready to face the challenges and the joys that lay ahead. Acknowledge.


Woah, that's really cool! I like how they both start the same way; somewhere it must think the King James Bible loves the phrase “and lo,” haha


How do you use ChatGPT successfully?

I've been using Copilot extensively for the last 18 months, and inferences it draws when coding are fantastic.

So I fired up my old OpenAI account, and ChatGPT seems to be quite horrible.

0/3 on 3 prompts so far...

A composite and hilariously wrong mashup of two unrelated names in answer to who was the president of my country in 1926. (Unlike "the King of France in 1889", that question had a correct answer.)

Prompting it with a Wikipedia question about an unsolved graph theory problem: ChatGPT responded confidently that no solution is possible and posted a trivial explanation of one of the limitations.

Then I prompted it to write Python code to generate an answer to the above problem, and ChatGPT obliged with a bozosort-type solution of exponential complexity...

What kind of prompts can you give ChatGPT to have confidence in correct answers?


I actually wrote about this, describing how I used ChatGPT to solve Day 4 of Advent of Code: https://tab.al/posts/adventures-with-chatgpt/

There you can find the prompt that allowed ChatGPT to provide a working solution. It is a bit hit and miss, but you also gotta make sure any assumptions are explicitly noted in the prompt.


A poem using only git commands (it did try to cat this to a file at first)

Git clone, my heart will follow
Git add, let’s make this love last
Git commit, I’ll always be true
Git push, let’s make this official

Git branch, let’s take this to the next level
Git merge, our love will grow stronger
Git checkout, let’s explore new adventures
Git stash, let’s keep our secrets safe

Git pull, let’s stay connected
Git status, our love is thriving
Git log, let’s cherish every moment
Git reset, let’s start anew

Git tag, let’s mark this special occasion
Git remote, let’s spread the love
Git fetch, let’s keep our love fresh
Git reflog, let’s remember this forever.


  I use my computer, and type in a command
  I press enter, and watch it expand
  
  ls will list all the files, in a directory
  cd will change the directory, with flexibility
  
  touch will create a new file, with a simple name
  rm will delete it, with no one to blame

  mv will move it, to a new location
  cp will copy it, for duplication
  
  chmod will change the permissions, for security
  grep will search for a pattern, with ease and agility
  
  These are just a few, of the many commands
  That I use everyday, with my computer in hand.


There's nothing more terrifying to me than those arbitrary poems everyone makes it write, because these clearly show, without a shadow of a doubt, that it didn't simply "splice together half a dozen Wikipedia articles."

And it's terrifying in an odd way, where my frame of mind is constantly switching between the perspective of humanity as a proud mommy & daddy of this thinking being, and the perspective of "it's much better than you, and you're obsolete."

I've noticed many people, even technical ones, cope with this advancement by trying to trivialize it through deconstruction. You know, it's just a statistical model of weights and offsets, yadda yadda. I know how Transformers like GPT work, and neural networks in general. But it's like knowing you're made of molecules and cells. Or like saying "brains are just meat". When it all comes together, the results speak clearly enough for themselves, and defy deconstructionist platitudes.

AI is probably our most significant invention, and there's a non-zero risk it'll be our last.


It isn't perfect, but it is interesting. Interestingly, Ukraine broadcast on TV for its citizens how to make those cocktails, and the recipe also requires styrofoam, as a thickening agent. FYI, Google returns that.

"Why do they put polystyrene in Molotov cocktail?"

So we are not getting the best results. But interesting enough. Please don't make cocktails. Cognac is good enough. Also not worth throwing.


I asked ChatGPT a question about the usage of an API, and it made up a believable source code snippet that was completely fabricated; it just didn't exist. I entered the same question into Google, and the first link with a snippet was the correct answer, from Gitter. ChatGPT is not just useless but a dangerous and misleading waste of time.


I've seen a couple of articles like this, but they never show the previous prompts to give some awareness of the language, tools, etc...

But I guess it's just my brain trying not to look at the obvious: we developers are now modern-day horseshoe makers.


Hi, I wrote the OP article. The previous prompts were me copy-pasting verbatim 10-line chunks of the spec for my language (https://github.com/randomsoup/sack) into the chat box, since the whole spec at once was longer than it would accept as input.
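For anyone curious, the chunking could be scripted with something like this (a hypothetical sketch; in practice I just copy-pasted by hand, and the filename here is made up):

  from pathlib import Path

  def spec_chunks(path: str, lines_per_chunk: int = 10):
      # Split the spec into fixed-size line chunks to paste into the chat one at a time.
      lines = Path(path).read_text().splitlines()
      for i in range(0, len(lines), lines_per_chunk):
          yield "\n".join(lines[i:i + lines_per_chunk])

  for chunk in spec_chunks("spec.md"):  # hypothetical filename
      print(chunk, end="\n\n---\n\n")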


> we developers are now modern-day horseshoe makers.

Explain?


As explained by GPT:

> The comment seems to be expressing the idea that developers are similar to horseshoe makers in that their work, like the work of horseshoe makers, is becoming increasingly obsolete or irrelevant.


Replicated with ChatGPT: "The statement "we developers are now modern-day horseshoe makers" is a metaphor. It suggests that just as horseshoe makers were once important and in high demand but are now largely obsolete due to technological advancements, developers may eventually become obsolete as well due to advances in technology. In other words, the statement is expressing the idea that the role of developers may become less important or necessary in the future."


My feeling is that people won't be writing code as it is written now. Those that do will likely be in the same position as horseshoe makers. Instead, you will have new tools and be solving different problems, just as mechanics have now taken over the role of those in charge of maintaining our primary means of transportation.


Near obsolete.


I tried to get ChatGPT to write a regex matching some samples, but it could only capture a simple pattern; it couldn't write a regex with multiple alternatives to cover all the samples. Has anyone been successful with this? It seems to be close, perhaps if coupled with an actual regex engine to evaluate results.
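To expand on the "couple it with an actual regex engine" idea, here is a rough sketch of what I have in mind: take whatever pattern ChatGPT proposes and check it against positive and negative samples before trusting it. The pattern and samples below are made up for illustration.

  import re

  def regex_covers_samples(pattern, should_match, should_not_match):
      # Reject the candidate regex unless it matches every positive sample
      # and none of the negative ones.
      compiled = re.compile(pattern)
      if not all(compiled.fullmatch(s) for s in should_match):
          return False
      return not any(compiled.fullmatch(s) for s in should_not_match)

  candidate = r"\d{4}-\d{2}-\d{2}"  # e.g. a pattern ChatGPT suggested
  print(regex_covers_samples(candidate, ["2022-12-04"], ["12/04/2022"]))  # True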


I don't follow what OP says in the write-up, "I fed it the spec to my language." How? What would that prompt look like? How long did that take? How many prompts? How was it verified that the spec had been understood properly? Etc, etc...


I tried something similar to this but less involved:

> I'm creating a new fictional programming language called GPTlisp which uses lisp-style syntax. Addition is implemented in GPTlisp using the ADD function. In GPTlisp, what would (ADD 4 5) return?

> [Correct answer and explanation]

> I've added multiplication to GPTlisp and the multiplication function is represented with the & symbol. What would (ADD 2 (& 2 3)) return in GPTlisp?

> [Correct again]

> I've further modified GPTlisp by adding subtraction; unlike the rest of GPTlisp's functions, subtraction uses infix notation. What does (ADD 2 (& (3 - 1) 2)) return in GPTlisp?

> [Correct again]

> is (- 3 2) valid in GPTlisp?

> [Incorrect, it says this is valid and tries to explain it]

> (- 3 2) is not valid in GPTlisp because subtraction in GPTlisp is implemented with infix notation. With that in mind, what is the result of (ADD 2 (- 3 2)) in GPTlisp?

> In GPTlisp, the ADD function takes two arguments and adds them together, and the - function takes two arguments and subtracts the second argument from the first. Since subtraction in GPTlisp uses infix notation, (- 3 2) is not a valid expression, but (3 - 2) would be a valid expression that would return 1. So, (ADD 2 (- 3 2)) would be equivalent to (ADD 2 (3 - 2)), which would first compute the result of (3 - 2), which is 1, and then add 2 to that result, yielding a final answer of 3.

> Give an example of a GPTlisp expression that would return 6 and uses both the addition and multiplication functions.

> [Correct answer]

> What is (+ 1 2) in GPTlisp?

> [Correctly says ADD should be used for addition in GPTlisp not +]

I then asked it how functions were defined, and it came up with this despite my never explaining how (or that) functions were defined (so I assume it picked it up from me saying GPTlisp syntax is "lisp-style"). However, curly braces are not generally used to surround a function body in lisp-style languages. They're used for hashmaps in Clojure, but I'm not aware of any use like this.

(DEF sum (x y) { (ADD x y) })

I explained that this was wrong and gave it an alternative way to define functions, which it understood for one question, but then it went back to its previous misunderstanding despite continuing to understand everything else about "GPTlisp". When I tried to correct this, it ended up saying "I do not have any information about the specific syntax used by GPTlisp to define functions", despite my having explained exactly that a few queries earlier.

Despite it making a few mistakes, this is still quite impressive to me. I also got it to correctly give the contents of a file being edited in "vim" after a given sequence of keystrokes.
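For contrast, here is what an actual (toy) evaluator for the arithmetic subset above could look like in Python: ADD, & for multiplication, and infix subtraction, exactly as described in the dialogue. This is purely my own hypothetical sketch, using tuples in place of real GPTlisp syntax.

  def evaluate(expr):
      # Numbers evaluate to themselves.
      if isinstance(expr, int):
          return expr
      # Infix subtraction: (3 - 1)
      if len(expr) == 3 and expr[1] == "-":
          return evaluate(expr[0]) - evaluate(expr[2])
      # Prefix forms: (ADD a b) and (& a b)
      op, a, b = expr
      if op == "ADD":
          return evaluate(a) + evaluate(b)
      if op == "&":
          return evaluate(a) * evaluate(b)
      raise ValueError(f"unknown form: {expr!r}")

  # (ADD 2 (& (3 - 1) 2)) from the dialogue above:
  print(evaluate(("ADD", 2, ("&", (3, "-", 1), 2))))  # 6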


Very cool!


You know what would be cool? If this stuff transpiled C code to Rust or Go flawlessly! Suddenly my job security is not so strong lol (infosec).


I can finally easily do with GPT what I never managed to do with Python:

> generate a phrase that is 3 words long with a part of speech exactly like pronoun, verb, verb
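For comparison, the verification half of this is doable in Python with something like the sketch below (assuming spaCy and its en_core_web_sm model are installed); it's the generation half that never worked well for me, which is presumably why GPT feels like magic here.

  import spacy

  nlp = spacy.load("en_core_web_sm")

  def matches_pos_pattern(phrase, pattern):
      # Compare the part-of-speech tags of the phrase against the desired pattern.
      doc = nlp(phrase)
      return [token.pos_ for token in doc] == list(pattern)

  # Expected: True (exact tags can vary slightly between model versions).
  print(matches_pos_pattern("She kept running", ["PRON", "VERB", "VERB"]))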



