
I tried Jules multiple times during the preview, roughly once a week, and it's pretty terrible. Out of all the cloud coding assistants, it's the worst. I honestly thought it was just an experiment that got abandoned and never expected it to actually become a real product, similar to how GH Copilot Spaces was an experiment and turned into Copilot agent.

It does what it wants: it often just “finishes” a task prematurely, and asking follow-ups does nothing besides making it ramble for a bit. The environment sometimes doesn't persist and things just stop working. For a while it simply failed instantly every time, because the model was dead or something.

Out of the dozen times I tried it, I think I merged maybe one of its PRs. The rest I trashed and reassigned to a different agent.

My ranking

- Claude Code (through the GH Action), no surprise there

- ChatGPT Codex

- GitHub Copilot Agent

- Jules

I will try it again today to see if the full release changed anything (they're giving previous testers a 3-month free trial), but if it's the same, I wouldn't use or pay for Jules. Just use Codex or the GitHub agent. Sorry for the harsh words.



Alright, I wanted to give Jules another fair try to see if it improved, but it's still terrible.

- It proposed a plan. I gave it some feedback on the plan, then clicked approve. I came back a few minutes later to "jules is waiting for input from you", but there was nothing to approve or click; it just asked "let me know if you're happy with the plan and I'll get started". I told Jules "I already approved the plan, get started" and it finally started.

- I have `bun` installed through the environment config. The "validate config" button successfully printed the bun version, so it's installed. But when Jules tries to use bun, I get `-bash: bun: command not found` and it wastes a ton of time reinstalling bun. After that, bun stayed available until it asked me for feedback; when I replied, bun went missing again. Now, for whatever reason, it prefixes every command with "chmod +x install_bun.sh && ./install_bun.sh", so every single step starts by reinstalling bun (a guard in the setup script would avoid this, see the sketch below).

- It did what I asked, then saw that the tests were breaking (none were breaking beforehand; our main branch is stable), and instead of fixing them it told me "they're unrelated to our changes". I told it to fix everything; it was unable to. I'm using the exact same setup instructions as with Copilot Agent, Codex and Claude Code. Only Jules is failing.

- I thought I'd take over and see what it did, but because it didn't "finish", it didn't publish a branch. I asked it to push a branch; it started doing something and sat in "Thinking" for a while, apparently running the failing tests and lint again, but it eventually published the branch.

At this point I gave up. I don't have time to debug why bun goes missing when the env configuration clearly has it available, why it vanishes between steps, or why only Jules can't properly run our test suite. It took forever for a relatively small change, and each feedback iteration takes just as long.
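(For anyone hitting the same thing: the obvious workaround would be a guard at the top of the env setup script, so bun only gets installed when it's actually missing. This is just a sketch, assuming the official installer's default location of ~/.bun/bin; whatever Jules does to PATH between steps might still defeat it.)

    # Sketch of a setup-script guard: reinstall bun only when it's not on PATH.
    # Assumes the official installer's default location of ~/.bun/bin.
    export PATH="$HOME/.bun/bin:$PATH"
    if ! command -v bun >/dev/null 2>&1; then
      curl -fsSL https://bun.sh/install | bash
    fi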

I'm sure it'll be great one day, and I'll continue to revisit it, but for now I'll stick with the other 3 when I need an async agent.


Similar experience. I would personally put Codex over Claude due to the better rate limits (which I haven't hit once yet, even on heavy days), but Jules was not very good: too messy, and I prefer alternative outputs to creating a pull request. In Codex, for example, you can copy a git patch, which is incredibly useful for adding personal tweaks before committing.
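The whole loop is just something like this, assuming the copied patch is a standard unified diff (pbpaste is macOS; use xclip -o or similar on Linux):

    # Pipe the copied patch into the working tree, then tweak before committing.
    pbpaste | git apply
    git add -p    # stage hunks selectively, adjusting as you go
    git commit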


I still need to try them, but I'm having a hard time envisioning async agents being nearly as useful to me as something local like Claude Code because of how often I need to intervene and ensure it is working correctly.

Won't the feedback loop be pretty long if you're using async agents? Like, don't you have to pull the code, then go through a whole build/run/test cycle? Seems really tedious vs. live-coding locally, where I have a hot environment running and can immediately see if the agent goes off the rails.
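As far as I can tell, the minimal loop per PR would be something like this (assuming the gh CLI and an npm-style test script; the PR number is a placeholder):

    gh pr checkout 123   # fetch and switch to the agent's PR branch
    npm install          # rebuild dependencies for that branch
    npm test             # re-run the suite locally

And repeating that for every round of feedback sounds tedious.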


We use async agents heavily. The key is to have proper validation loops, tests, and strong, well-phrased system prompts, so the agent can quickly see if something is broken or if it broke a convention.

We have proper issue descriptions that go into detail about what needs to be done, where the changes need to be made, and why. We break epics/stories down into smaller issues that can be knocked out easily. It's not really different from a normal, clean project workflow.

Now, for most tickets, we just assign them to agents, and 10 minutes later a pull request appears. The pull requests get screened with Gemini Code Assist or Copilot Agent to catch obvious issues, and GitHub Actions check lint, formatting, tests, etc. Each branch gets pushed to its own separate test environment.
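The checks themselves are nothing exotic. A minimal sketch of the kind of validation script both the agents and CI run (the script names are placeholders, assuming npm-style scripts):

    #!/usr/bin/env bash
    # Fail fast so the agent (and CI) gets a clear signal at the first break.
    set -euo pipefail
    npm run lint
    npm run format:check
    npm test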

We review the code, test the implementation, and when everything checks out, click merge. Finished.

I can focus on bigger, more complex things while the agents fix bugs, handle small features, do refactorings, and so on in the background. It's very liberating. I'm convinced that most companies/environments will end up with a similar setup and that this will become the norm. There really isn't a reason not to use async agents.

Yeah, sure, if you give a giant epic to an agent it will probably act out of line, but you don't really have these issues when following a proper project management flow.


"I honestly thought it was just an experiment that got abandoned and never expected it to actually become a real product, similar to how GH Copilot Spaces was an experiment and turned into Copilot agent."

My guess is that this is a play for the future. They know that current-day AIs can't really handle this in general... but if you wait for the AI that can and only then try to implement this functionality, you might get scooped. Why not implement it now and wait for the AIs to catch up? That is probably what they are thinking.

I'm skeptical LLMs can ever achieve this no matter how much we pour into them, but I don't expect LLMs to be the last word in AI either.



