Hacker News
Devon: An open-source pair programmer (github.com/entropy-research)
54 points by lawrencechen on May 19, 2024 | 27 comments


The demo video shows it making a game of life.

People really should start using examples that don't have literally thousands of step by step tutorials all over the web.

That includes clones of Wordle, Flappy Bird, generic todo lists, etc. With those, I can't tell how well the abilities would generalize to real world projects.


> I can't tell how well the abilities would generalize to real world projects.

The secret is that it doesn’t ;)

I doubt our current statistics based AI methods ever will.

Programming requires precision. Code is an exact specification for how a machine should operate.

I don’t see how we can get perfectly precise responses from a nondeterministic process.


Maybe if you're inventing a new compiler that hasn't existed before, but if all you're doing is yet another CRUD app for a newly discovered niche, then it doesn't have to generalize beyond what already exists. The fact that it's nondeterministic is irrelevant: I can write the same function a bunch of different ways just by choosing different names for variables. Does that stop the code from working?
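
To make the variable-naming point concrete, here is a minimal sketch. Both function names and the pricing example are made up for illustration; the only claim is that two differently spelled versions of the same logic behave identically.

```python
# Two spellings of the same function: only the identifiers differ.
# (Hypothetical example, not taken from Devon or any real project.)

def total_price(prices, tax_rate):
    """Sum the prices and apply a tax rate, e.g. 0.5 for 50%."""
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)


def f(xs, r):
    """Same logic as total_price, with terse names."""
    s = sum(xs)
    return s * (1 + r)


# Different text, same behavior.
assert total_price([10, 20], 0.5) == f([10, 20], 0.5)
```

Whether an output is acceptable depends on what it does when run, not on which of the many equivalent spellings was produced.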


The code assistance tools right now are still a bit crude compared to what they will be in a year or two even if capabilities do not increase. I think it will be very useful to have "someone" read along as I code and answer questions I ask, or pose questions on why I do things a certain way. Honestly, even if it is just stuff like "What would be a good variable name for this?" or "What was the map()-syntax like again?", this would already facilitate flow quite a bit.

Not having to context switch as much is a real blessing, and I've only reached it in one language in specific contexts. If I can transfer this to many languages and many contexts, my productivity will rise a lot.


So what you’re looking for is basically a proactive, context-aware Google search.

LLMs are great for that! Sometimes they’re good at translating from one language to another. They miss idioms from time to time, but they’re usually pretty good with that task.

Not so much for writing software themselves.


Well, it's not like I'm just going to trust a comment like this either :)

I would like to see it in action, and then I'll form an opinion.


LLMs are much more deterministic than humans


It generally doesn’t work for those either. Even with RAG, large context windows, and clever prompting, it does poorly, because each task has to be simple enough to get right in one go. So you need a programmer or logical thinker to cut things small enough for the AI to handle, which is simply not compatible with this "just tell me what to build" approach. The greatest programming minds on earth have issues with composing complex ideas from simple ideas; now we expect the AI to write the simple ideas and then compose them. Or just one-shot them, which is really far beyond what they can do.

My team and I have been working on a solution for this for decades, and it has nothing to do with AI. We have to solve it for ourselves first, and then it will work for AI; or maybe we will never solve it, and then AI won’t be able to piece together anything complex that it hasn’t seen before.


What demos would you like to see? I have my own agent running on its own Linux system. I reckon it's capable of doing most office work in theory, if you have the budget. On my budget it's a lot more constrained, and often needs a human in the loop to achieve more complex goals. But I'd like to canvass for suggestions for tests I can throw at it.

If I get an interesting request, I will try to record its attempt. But it's expensive: on the order of $1 a prompt.


One random idea:

Wordle clone tutorials are of course all over the web (just google make wordle clone and see the results), but what about a browser extension + Discord bot, that lets groups of friends compare their scores for the official Wordle, in Discord.

It would automatically post people's results to a Discord server - both the score, and a picture of the full guesses under a spoiler tag. And maybe during playing, the extension would show status like "if you solve it now, you'd be better than 50% of people in the server today". And post a weekly top 3 at the end of each week.

Something like it has probably been done before, but nowhere near as much as the examples I mentioned in my previous comment. Might be a bit too big at this cost though.
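
As a rough idea of the smallest piece of this, the "better than 50% of people" status could be computed as below. This is only a sketch under assumptions: scores are guess counts from 1 to 6, a failed puzzle is recorded as 7, and `share_beaten` and `status_line` are hypothetical names. The Discord and browser-extension parts are not shown.

```python
def share_beaten(my_guesses: int, server_guesses: list[int]) -> float:
    """Fraction of today's server results that took more guesses than mine.

    Assumption: each result is a guess count (1-6), with 7 meaning a fail.
    """
    if not server_guesses:
        return 0.0  # nobody has posted yet, so there is no one to beat
    beaten = sum(1 for g in server_guesses if g > my_guesses)
    return beaten / len(server_guesses)


def status_line(my_guesses: int, server_guesses: list[int]) -> str:
    """Build the in-game status message from the current guess count."""
    pct = round(100 * share_beaten(my_guesses, server_guesses))
    return (
        f"If you solve it now, you'd be better than {pct}% "
        "of people in the server today"
    )


# Solving in 3 beats the two players who needed 4 and 5 guesses.
print(status_line(3, [2, 4, 5, 3]))
```

Ties count as "not beaten" here; a real version would have to decide how to treat them.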


Why are they manually implementing API interfaces for various companies when something like OpenRouter exists? OpenRouter provides a unified API for commercial and open-source models. Seems like the obvious answer for something like this.


Lots of reasons!

- They may not know this library exists.

- They may not think the library is actually suitable for their use case.

- They may not want the dependency included.


OpenRouter packages APIs, and most companies prefer having individual relationships with AI vendors. Choosing an AI gateway might be another way to go.


Unclear about the details of this project. Is there a related overview or paper?

Superficially it seems to be an interface for ChatGPT or other similar generative LLM service.


I can't imagine coding with someone else smartassing over what I do. Never tried it. Does anyone actually like pair programming?


Some love it, some hate it. If you work with a bunch of smart asses in a toxic culture and you can't actually stand your coworkers, I can see why you'd have such an instinctively negative reaction. But if there's a safe culture of mutual respect and you don't work with asshats it can result in greater productivity and fewer bugs and you'll learn things to help you be a better programmer.

But also human psychology - if it's forced on you from above then you'll hate it, if it's your idea then you'll love it.


Hmm. Thanks for feedback.

Yeah I guess it all depends on the attitude and if you click.

It could be fun if you get along well with your partner.

But I have a thought and want to make it reality, so I focus on that idea, that thought in order to finish it. Then another voice enters that thought process. Back in the day I would write code and when I had a problem, I would chat up my online contacts on ICQ, and just by explaining the problem I would find a solution, the rubber duck method. However constantly having someone giving their opinion on things... I imagine that being super annoying, when you're in the process of shaping that feature you have laid out in your head.


Hi! Why does it work best with Python? Should I even bother if it’s for a non python project? (In this case, a WordPress plugin).

Thank you!


There's just so much Python code up on the web to train from that LLMs are really good at it, relative to something with fewer examples. However, WordPress uses PHP, and there's also plenty of PHP available online, so it's pretty decent at that too. I just used Devon to create a trivial WordPress plugin, so you can give it a shot. However, because it won't be able to run that code, you can't tell it to test the code.

This is a huge shortcoming. When asking ChatGPT to generate Python code, it won't always get it right, but you can ask it to keep trying until the code works. Since it can't do that in PHP, it'll be a bit more work. Though, depending on how well you're able to take the output and fix it yourself, it could be enough to get the plugin written.

The value isn't in having everything done for you - the technology isn't there yet, imo. It's in making you more effective. If it generates a page of code and you have to tweak a bunch of it to get it to work right, you still come out ahead. For other work, it'll write a bunch of useless code and you're better off without it, so you have to know when to use it and when not to.
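
For what it's worth, the "keep trying until the code works" loop mentioned above looks roughly like this. It is a sketch under assumptions, not how Devon or ChatGPT actually implement it: `generate` is a hypothetical stand-in for whatever model call you use, and "works" here only means the script exits with status 0.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_candidate(source: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run Python source in a fresh interpreter; return (succeeded, stderr)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stderr


def retry_until_it_runs(
    generate: Callable[[str, Optional[str]], str],  # hypothetical model call
    task: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for code, run it, and feed any error back into the next attempt."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        source = generate(task, error)
        ok, error = run_candidate(source)
        if ok:
            return source
    return None  # gave up; a human has to take over from here
```

The loop only works because the generated code can be executed and its error output read back, which is exactly what is missing when the target is PHP and the tool can only run Python.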


Yup. I was going to ask the same. If there was wide language support I'd love to try that, but as a non-Python programmer the use case is limited for me.

However, I might use it in places where Python might actually be the best way to go for a script, yet I'd have picked another language as I don't know Python. Then I could simply ask it to create whatever I need, and read over the code to actually learn some Python perhaps.


It looks like the reason is that it’s trying to run the code it generates.


Why choose this name - seems like a cease and desist waiting to happen.


Especially given all the negative sentiment in the developer community towards Devin. You would think they would want to distance themselves as much as possible


I thought it was a county in England.


Agreed. In fact, I thought this was an iteration on the previously announced "Devin" project until I realised that they have different spelling.

https://www.cognition.ai/introducing-devin


Perhaps because of the "dev in chat" meme reference? When (game) developers would show up in the public chat channel.


Might “Theone” perhaps be a better one to use? Still stays somewhat true to the pronunciation of Devin, but with an emphasis on being The One that might actually work /s



