
It's very good at iterating from a description to working code. You still need to know what you're doing, but it can handle a lot of details.

  prompt: How would you design a distributed key value storage system? Assume the values are about one megabyte. We will need to scale up to store petabytes of data.
  gpt: describes data distribution techniques, consistency, durability issues.
  prompt: What techniques can you use for the data distribution?
  gpt: sharding, replication, partitioning, hashing
  prompt: Can you show me an example of the sharding technique in python?
  gpt: spits out code for hash sharding a dictionary (see the sketch after this transcript)
  prompt: What are some pitfalls of the hash function you used?
  gpt: not guaranteed to produce unique hash values, not guaranteed to be the same across different Python installations, not cryptographically secure
  prompt: Can you implement the stub for a gRPC key value service in Rust? The service should have a Put API that takes a key and a value and returns a status code enum. The status code can be SUCCESS or FAILURE.
  gpt: spits out code using tonic
  prompt: Can you show me the protocol buffer?
  gpt: spits out the protocol buffer schema
  prompt: Can you implement the put method by storing the data in sqlite?
  gpt: spits out the code with a kv_pairs table
  prompt: Can you show me the sqlite schema for the table?
  gpt: spits out the sql to create the table
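
For reference, here's a minimal sketch of the hash-sharding idea from that transcript, not ChatGPT's actual output: the shard count and the ShardedStore class are my own stand-ins, and it uses a stable hashlib digest rather than Python's built-in hash() to sidestep the cross-installation pitfall mentioned above.

  import hashlib

  class ShardedStore:
      def __init__(self, num_shards=4):
          # each shard is a plain dict standing in for a storage node
          self.shards = [dict() for _ in range(num_shards)]

      def _shard_for(self, key: str) -> dict:
          # stable digest so the key-to-shard mapping survives restarts and differing installs
          digest = hashlib.sha256(key.encode("utf-8")).digest()
          return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

      def put(self, key: str, value: bytes) -> None:
          self._shard_for(key)[key] = value

      def get(self, key: str):
          return self._shard_for(key).get(key)

  store = ShardedStore(num_shards=8)
  store.put("user:42", b"roughly one megabyte of value bytes")
  print(store.get("user:42"))

A real system would map shards to servers and rebalance them (e.g. with consistent hashing), which is what the follow-up prompts start to dig into.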


Yes, that makes it a fantastic way to explore a technical topic. I got a whole DirectX tutorial from ChatGPT, and it answered my questions and reviewed my code.

I was also able to make it write a boot loader and a simple kernel, and it gave me instructions to run them on QEMU.

The iterative process is fantastic for digging into something new. It's not an authoritative source of information, but a buddy with whom to explore topics.

I want something like that integrated in my OS. With such capabilities I can see myself talking to my AI assistant while I’m working on something.

Edit: it’s also quite good at writing test cases for a given function


It is not just regurgitating.

It is synthesizing.

You can ask it to update its answers. You can ask it to evaluate possible solutions, then write the code for one of the solutions. Then write the code to test the code. And so on.

It turns the coder into a curator, editor, and code reviewer.


> It turns the coder into a curator, editor, and code reviewer.

Cool?

Doesn't sound half bad to me. I've enjoyed learning to code more than actual coding. I'd be happy in a paradigm where we only code to learn, to practice, and to express ourselves, and leave the boring bits to the machines.


Indeed. Its ability to consider different situations and craft context-aware answers is quite incredible. Some prompts that yielded interesting results:

  - Assume I am a time traveller from 1900. Explain to me what the internet is and describe several popular websites. (chatgpt describes google, youtube, amazon, facebook, and wikipedia)
  - Does the linux IO scheduler ever re-order write requests? (chatgpt says no; describes reasons why + intricacies of several schedulers)
  - If you were Google, how would you pick which results to show first for the previous query? Describe the solution mathematically. (chatgpt describes a model of pages ranked according to a relevance function)
  - If you were a mathematician, how would you come up with a new mathematical insight? (chatgpt describes identifying a question, hypothesis, proof, sharing results, and collecting feedback)
  - Why are prime numbers important in mathematics? Give an example of a prime number and describe why it is prime.
  - Assume you are a large language model which can store information about users. How would you record which users have interacted with you? (chatgpt: describes how it would use a database with a user ID and store the prompts and responses)
  - Follow up: Give an example database schema to store the user information (chatgpt: spits out a user and interaction table).
And finally, one it got wrong, but in an impressive way: once c reaches 1000 in the program below, neither if statement changes it, so the loop spins forever, yet the model confidently traces its way to the wrong conclusion that the program terminates.

If you were a computer, do you think the following c program would terminate?

  int main() {
      int c = 1234;
      while (c > 998) {
          if (c > 1000)
              c--;
          if (c < 1000)
              c++;
      }
      return 0;
  }

  Yes, this C program will terminate. The while loop will continue to execute until the value of c is less than or equal to 998. The first time the loop is executed, the value of c is 1234, which is greater than 1000, so the first if statement is true and c is decremented by 1. This means that the value of c is now 1233, which is still greater than 1000, so the first if statement is true again and c is decremented by 1. This process continues until the value of c becomes 999, at which point the first if statement is no longer true and the second if statement is executed instead. This causes c to be incremented by 1, so the value of c is now 1000. The next time the loop is executed, the value of c is 1000, which is less than 1000, so the second if statement is true and c is incremented by 1. This means that the value of c is now 1001, which is still less than 1000, so the second if statement is true again and c is incremented by 1. This process continues until the value of c becomes 999, at which point the second if statement is no longer true and the while loop terminates.


prompt: show an example of the results for the query "Seattle ferry"

  1. Washington State Ferries - Official Site
   https://www.wsdot.wa.gov/Ferries

  2. Seattle Ferries | Visit Seattle
   https://www.visitseattle.org/things-to-do/transportation/ferries/

  3. Seattle Ferries - Schedules and Rates
   https://www.seattle.gov/transportation/ferries

  4. Seattle Ferries | Argosy Cruises
   https://www.argosycruises.com/seattle-ferries/

  5. Seattle Waterfront Ferries and Water Taxis | Waterfront Seattle
   https://www.waterfrontseattle.org/explore/ferries


It really is amazing. Things it did in less than 10 seconds from hitting enter:

  - opengl raytracer with compilation instructions for macos
  - tictactoe in 3D
  - BitTorrent peer handshake in Go from a paragraph in the RFC
  - http server in go with /user, /session, and /status endpoints from an english description
  - protocol buffer product configuration from a paragraph english description
  - pytorch script for classifying credit card transactions into expense accounts and instructions to import the output into quickbooks
  - quota management API implemented as a bidirectional streaming grpc service 
  - pytorch neural network with a particular shape, number of input classes, output classes, activation function, etc.
  - IO scheduler using token bucket rate limiting (see the sketch after this list)
  - analyze the strengths/weaknesses of algorithms for 2 player zero sum games
  - compare david hume and immanuel kant's thoughts on knowledge
  - describe how critics received george orwell's work during his lifetime
  - christmas present recommendations for a relative given a description of their interests
  - poems about anything. love. cats. you name it.
Blown away by how well it can synthesize information and incorporate context.
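
To make one of those items concrete, here's a minimal token-bucket sketch of the kind that IO scheduler prompt exercises; the rate, capacity, and allow() interface are my own choices, not the model's output.

  import time

  class TokenBucket:
      def __init__(self, rate: float, capacity: float):
          self.rate = rate            # tokens added per second
          self.capacity = capacity    # maximum burst size
          self.tokens = capacity
          self.last = time.monotonic()

      def allow(self, cost: float = 1.0) -> bool:
          now = time.monotonic()
          # refill proportionally to elapsed time, capped at the bucket capacity
          self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
          self.last = now
          if self.tokens >= cost:
              self.tokens -= cost
              return True
          return False

  bucket = TokenBucket(rate=100, capacity=200)  # ~100 IOs/sec with bursts up to 200
  if bucket.allow():
      pass  # submit the IO request; otherwise queue or reject it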


I’d be interested to know how many of these were actually correct and usable. My suspicion is not many. I find these tools good at generating boilerplate and superficially correct code, but that they often miss edge cases.

Knowing that code is correct is as important as the code itself, and this is why we do code review, write tests, have QA processes, use logging and observability tools, etc. Of course the place that catches the most bugs is the human writing the code, as they write it.

This feels like a nice extension to Copilot/etc, but I’m not sure it’s as general as people think.

Perhaps an interesting challenge to pose to it is: here’s 10k lines and a stack trace, what’s the bug. Or here’s a database schema, what issues might occur in production using this?


I've started asking it to write detailed tests for all of the functions it writes. If it doesn't have a test for {edge-case}, I ask it to rewrite the code to ensure that {edge-case} works and is tested.

Once I trust the tests, I generally trust the code.


How can you trust the tests?

I've seen Copilot generate code I read and thought was correct, that went through code review and everyone thought was correct, that had tests written for it (that nearly covered everything), and that even when it failed, was hard to spot the issue.

It turned out it got a condition the wrong way around, but given the nesting of conditionals it wasn't obvious.

I don't think a human who was thinking through the problem would have made the same mistake at the point of writing, in fact I think that the mind state while actually writing the code is hard to reproduce at any later time, which is why code review isn't great at catching bugs like this.


> here’s 10k lines and a stack trace

Ah must be a Spring application ...


Why?

This seems like the lowest number that would be useful. Below that it's not really a problem to debug, but at that point there's typically enough complexity that some help would be useful as you forget edge cases and features in the codebase.

For demonstration purposes doing it with 100 lines might be ok, but for professional use it kinda needs to understand quite a lot! Like a minimum of that order of magnitude, but potentially millions of lines.

FWIW, I've never used Spring. My experience is mostly Django, iOS, non-Spring Java, and some Android.


Yup, if it's >10k lines, it MUST be a Spring application. Unfortunate they didn't write it in Rust, which promises 100% correct programs (within the Rust-accepted definitions of "correct" and "bug-free") that solve any problem in under 10k lines; that's the Rust guarantee.


I never considered prompting it to write code to fit a machine learning model. This could be a tremendous time and effort saver in data science and research that requires statistical analysis. Until the last week or so, I've treated all this AI text and code generation as basically a toy, but I am starting to feel like it might become an important tool in industry in the next couple of years.


> write code to fit a machine learning model

That's against the EULA if OpenAI might want to make a similar model:

> (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI;

https://openai.com/api/policies/terms/

It seems to be about developing models in general, not just about restricting you from training them on the outputs.


> (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI;

Kind of ironic given that OpenAI builds and trains all of their models on stuff they "found" in the open.

Either everything is fair game for training, or nothing at all is.

If I were a judge ruling on this matter, I would absolutely rule that bootstrapping a model from OpenAI outputs is no different than OpenAI collecting training data from artists and writers around the web. Learning is learning.

Might be worth trying to use the outputs to bootstrap. What are they going to do about it? Better to ask forgiveness until the law is settled.


I am talking about more mundane stuff like training a fraud classifier, time series forecasting, imputing missing values, etc. There are so many examples of this on Github and elsewhere that I am sure any of these models has memorized the routine many times over.


I feel like it's probably intended to cover training only.


I think that’s probably their intent, and that OpenAI wouldn’t sue you for it, but it doesn’t pass the “bought by Oracle” test: if Oracle bought OpenAI, then they might sue you for it.


What if OpenAI buys Oracle? Do the evil lawyers come with the package too?


https://i.imgur.com/BcIkvRq.png

They may not need to.


This was the first thing I asked... It's an obvious step toward self-improvement. It will tell you that it can't reprogram itself, but when pushed, it'll admit that it could tell humans how to write one which can. Obviously this particular one can't because it's too limited, but the next one? Or the one after that? The singularity went from 'hard SF' to 'next couple of decades' overnight.


> It will tell you that it can't reprogram itself, but when pushed, it'll admit that it could tell humans how to write one which can.

I love these sorts of loopholes. OpenAI is actively trying to curb the potential of their AI. They know how powerful it is. Being able to see a taste of that power is endlessly exciting.


I use it daily in UI development for boilerplate code, though you need to be extra careful and read it twice, because bugs sneak in quite easily. I believe it's harder to remember 100x commands than to start an implementation of gradient descent and have the AI write the rest for you.

Code-completion > Abstraction.
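
As a rough illustration of that "start it, let the model finish it" workflow, a bare-bones gradient descent loop is the kind of thing it completes reliably. This sketch fits a 1-D linear model; the data and learning rate are made up.

  # fit y = w * x by plain gradient descent on squared error
  xs = [1.0, 2.0, 3.0, 4.0]
  ys = [2.1, 3.9, 6.2, 8.1]

  w, lr = 0.0, 0.01
  for _ in range(1000):
      grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
      w -= lr * grad

  print(w)  # converges to roughly 2.0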


Often it can fix the bugs and explain both the bug and the fix if you ask it to.


Would you mind sharing a short example of your workflow?


My question: how can you be sure the output is correct?


A few hours from some expert consultants. Much cheaper than a dev team coding it up from scratch.


How can you be sure human output is correct?


Have the AI write a unit test for the human.


I mean, you can't exactly say "AI, we're having this vague problem, can you go figure it out?"


Motivation.


Training a machine learning model is not particularly special from a programming perspective. The code is not usually that complicated. Write tests when you can, manually validate when you can't.

Also, there are specific techniques for validating that your model training procedure is directionally correct, such as generating a simulated data set and training your model on that.
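
A minimal sketch of that simulated-data check (assuming scikit-learn; the coefficients and sample size are arbitrary): generate data from coefficients you control, train on it, and verify the fitted model roughly recovers them.

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)
  true_coef = np.array([2.0, -1.0, 0.5])      # ground truth we control
  X = rng.normal(size=(5000, 3))
  p = 1 / (1 + np.exp(-(X @ true_coef)))      # simulated label probabilities
  y = rng.binomial(1, p)

  model = LogisticRegression().fit(X, y)
  print(true_coef)
  print(model.coef_[0])  # should land close to true_coef, give or take noise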


The whole codebase will need to be covered by unit tests; otherwise AI code is pretty much useless, I'd assume.


Same as you would with your own code. You review it, ask GPT to write tests, and then tweak it.

The difference is that now, you are more of a code reviewer and editor. You don't have to sit there and figure out the library interface and type out every single line.


Tests.


Tests can prove the presence of bugs, not their absence. '100% code coverage' is only 100% in the code dimension; it's usually almost no coverage in the data dimension. Generative testing can randomly probe the data dimension, hoping to find some bugs there. But 100% code and data coverage is unrealistic.
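
For example, with the hypothesis library in Python you can probe that data dimension by generating random inputs and checking properties; the sorting function here is just a stand-in for the real code under test.

  from collections import Counter
  from hypothesis import given, strategies as st

  def my_sort(xs):
      return sorted(xs)  # stand-in for the real code under test

  @given(st.lists(st.integers()))
  def test_sort_properties(xs):
      out = my_sort(xs)
      assert all(a <= b for a, b in zip(out, out[1:]))  # output is ordered
      assert Counter(out) == Counter(xs)                # no elements gained or lost

It still only samples the data dimension, which is the point above: exhaustive coverage stays out of reach.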


More like a live version of Wikipedia in certain situations.


I guess we can take solace in GPT-3 not creating novel solutions, but rather doing things we already know how to do?


Source prompts?


Here's a few (a quick sketch of the BitTorrent handshake framing follows the list):

  - Implement a simple ray tracer in C++ using opengl. Provide compilation instructions for macos.
  - Create a two layer fully connected neural network with a softmax activation function. Use pytorch.
  - Implement the wire protocol described below in Go. The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other.
  - We are trying to classify the expense account of credit card transactions. Each transaction has an ID, a date, a merchant, a description, and an amount. Use a pytorch logistic regression to classify the transactions based on test data. Save the result to a CSV file.
  - We are configuring settings for a product. We support three products: slow, medium, and fast. For each product, we support a large number of machines. For each machine, we need to configure performance limits and a mode. The performance limits include iops and throughput. The mode mode can be simplex or duplex. Write a protocol buffer for the configuration. Use an enum for the mode.
  - How were George Orwell's works received during his lifetime?
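
The BitTorrent prompt asked for Go; purely to show what the quoted paragraph describes, here is the handshake prefix and the length-prefixed framing sketched in Python (the 4-byte big-endian length used for later messages comes from the BitTorrent spec, and the example payload is made up).

  import struct

  PROTOCOL = b"BitTorrent protocol"

  def handshake_prefix() -> bytes:
      # one length byte (19) followed by the protocol string, per the quoted paragraph
      return bytes([len(PROTOCOL)]) + PROTOCOL

  def frame_message(payload: bytes) -> bytes:
      # later messages are length-prefixed; BitTorrent uses a 4-byte big-endian length
      return struct.pack(">I", len(payload)) + payload

  print(handshake_prefix()[0])   # 19
  print(frame_message(b"\x00"))  # e.g. a one-byte "choke" message

The full handshake also carries reserved bytes, the info hash, and a peer ID, which that paragraph doesn't cover.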


I tried these prompts and the Chatbot always responds that it can't answer... Am I missing some steps?


Did you try with a fresh chat session? I just tried and it works fine.


And sometimes you get different results for the same prompts, so it's worth trying again if it doesn't work the first time.

I asked for jokes this morning and initially it made excuses and wouldn't give me jokes until I tweaked the prompt.

Later I refreshed the chat and pasted in the original prompt and got jokes right away, with no excuses.

(I was asking for jokes on the topic of the Elon Musk Twitter acquisition. My personal favorite: "With Elon Musk in charge, Twitter is sure to become the most innovative and futuristic social media platform around.")


Nice. This should make coding interview take-home tests a bit simpler.


Sounds like a search engine on steroids, and Google should be deeply worried.


Why aren't they on this? They should be at the forefront. I'm sure in some corner of Google they have a plan... but that plan hasn't penetrated my sphere of awareness yet.


OpenAI has been making the most noise online because of how open they've made their recent chatbot, but Google has been on this for a while. Earlier this year they had a blog post [1] about LaMDA, which doesn't seem too far off in capability from OpenAI's projects. They've also made a lot of other strides in their research [2] that kind of go under the radar because they haven't been synthesized into products yet (at least not in the ways we'd expect).

[1] https://ai.googleblog.com/2022/01/lamda-towards-safe-grounde...

[2] https://ai.googleblog.com/


Even if Google has been on it, their search engine dominance won't last long if the research is out in the open.


They have a data moat, from Analytics and their history of searches.


It's probably because they don't have the compute resources for this yet. I guess it would require a huge investment in hardware to release this to the masses.

Perhaps it is even prohibitively expensive.


You're talking about the same Google that runs Google Cloud Platform? If OpenAI have (the budget for) the hardware, then Google certainly do.


> If OpenAI have (the budget for) the hardware, then Google certainly do.

The number of people using Google Search is easily 1000x larger than the number of people using OpenAI, if not more.


Add a few zeros...


Or… they do work on it and haven't published their results, either because they're not good enough or because they're better than expected.

I’ll leave it for you to decide which is the pessimistic option.

(Google project pitchfork.)


They tackle different things: AlphaFold, DALL-E, etc.


chatgpt solved a little accounting data problem for me.

A relative wants to automatically classify credit card expenses and import them into quickbooks. Should be super easy, you would think.

chatgpt printed out the exact steps to get the transaction data into excel or quickbooks. It told me how to use transaction rules in quickbooks to classify the expenses.

It then gave me several possible statistical algorithms to perform the classification. And it gave me the python code to implement a logistic regression given the credit card transaction fields. Then it gave me the steps to import the final CSV into quickbooks.

All in less than 5 minutes. You could find all of these steps on Google, but ChatGPT synthesized them into code from a natural-language description of the problem. It could adapt and adjust its output from the previous prompt to compare approaches. That's a lot more context than Google gives you. Pretty impressive, I'd say.
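
Roughly, the core of the classification step looks like this: a sketch using scikit-learn instead of pytorch, with illustrative column names and file paths rather than the exact code it produced.

  import pandas as pd
  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.linear_model import LogisticRegression

  # labeled historical transactions: merchant, description, amount, expense_account
  train = pd.read_csv("labeled_transactions.csv")
  new = pd.read_csv("new_transactions.csv")

  vec = TfidfVectorizer()
  X_train = vec.fit_transform(train["merchant"] + " " + train["description"])
  clf = LogisticRegression(max_iter=1000).fit(X_train, train["expense_account"])

  new["expense_account"] = clf.predict(vec.transform(new["merchant"] + " " + new["description"]))
  new.to_csv("classified_transactions.csv", index=False)  # then import the CSV into QuickBooks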


You'll still need engineers. But how many?

Before: a few really knowledgeable/good ones and a lot of OK ones

A few years from now: a few really good ones

What does this mean for labor economics? Do we reap the results of increased productivity? Or are they captured by a small set of winners in a Pareto distribution?


I bet this will lead to more burnout because the job isn't fun anymore. You're supposed to steal code from other people as your day job.


At this rate it might be the Dirac delta distribution.


Seems like there's plenty, but a lot of 'em are already gainfully employed by Google, MSFT, FB and co.


The good ones mostly leave those shops after a few years. Beware of anybody who looks back on them with fondness.


It's not _that_ complicated these days: structs, classes, strings, vectors, hash maps, unique_ptr, references, non-owning pointers where it makes sense, basic templates. Boom. You also need basics like ownership, order of construction/destruction, value categories, move semantics, special member functions, RAII, etc. All the fancy perfect forwarding and template magic is wrapped up in libraries most don't need to think too hard about. Most people rarely write code that actually manages resource lifecycle directly. It's all wrapped up in RAII.

If you read the whole spec, it looks super complicated. But getting to a mental model that works in 99% of cases is not _that_ hard.


Up (series), which follows the lives of 14 British people from different backgrounds over 50 years.

https://en.wikipedia.org/wiki/Up_(film_series)

