Hacker News | sqs's comments

I don't think it takes care of tone transformation (eg 他是 ni3shi4 -> ni2shi4). Or if it does, my tones are just off. But it's a really cool idea!

他是 is tāshì which doesn't transform I think. Did you mean to write 你是 nǐshì? I think that transforms differently though. With the half 3rd tone only dropping.

The classic example is 4/4 不是, which goes bùshì -> búshì.

Or 3/3 that becomes 2/3. E.g. 你好 nǐhǎo becoming níhǎo.

The 1/4 -> 2/4 transformation is, I think, specific to 一 (one): 一个 yīgè becomes yígè.
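
The rules mentioned in this subthread can be sketched in a few lines of Python. This is only a rough illustration, assuming numbered-tone pinyin (e.g. `bu4`) as input. `apply_sandhi` and its helpers are hypothetical names, and the sketch covers only the three cases discussed here, not full Mandarin tone sandhi (longer runs of third tones, for instance, depend on phrasing).

```python
def tone_of(syllable: str) -> int:
    """Return the tone number at the end of a numbered-pinyin syllable."""
    return int(syllable[-1])


def with_tone(syllable: str, tone: int) -> str:
    """Return the syllable with its trailing tone number replaced."""
    return syllable[:-1] + str(tone)


def apply_sandhi(hanzi: str, pinyin: list[str]) -> list[str]:
    """Apply the three sandhi rules from the thread to adjacent syllable pairs.

    hanzi:  the characters, one per syllable (e.g. "不是")
    pinyin: citation-tone pinyin with tone numbers (e.g. ["bu4", "shi4"])
    """
    out = list(pinyin)
    for i in range(len(pinyin) - 1):
        char = hanzi[i]
        cur, nxt = tone_of(pinyin[i]), tone_of(pinyin[i + 1])
        if char == "不" and cur == 4 and nxt == 4:
            out[i] = with_tone(pinyin[i], 2)   # bu4 shi4 -> bu2 shi4
        elif char == "一" and cur == 1 and nxt == 4:
            out[i] = with_tone(pinyin[i], 2)   # yi1 ge4 -> yi2 ge4
        elif cur == 3 and nxt == 3:
            out[i] = with_tone(pinyin[i], 2)   # ni3 hao3 -> ni2 hao3
    return out


print(apply_sandhi("不是", ["bu4", "shi4"]))
print(apply_sandhi("你好", ["ni3", "hao3"]))
print(apply_sandhi("一个", ["yi1", "ge4"]))
print(apply_sandhi("他是", ["ta1", "shi4"]))
```

Under these rules 他是 (ta1 shi4) comes back unchanged, which matches the point made above that it does not transform.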


The tone sandhi example you just gave looks incorrect to me

Well, OP wrote "he is" but then wrote "you are" in pinyin for one, and that's a bit hard to reconcile.

What's super interesting is that Opus is cheaper all-in than Sonnet for many usage patterns.

Here are some early rough numbers from our own internal usage on the Amp team (avg cost $ per thread):

- Sonnet 4.5: $1.83

- Opus 4.5: $1.30 (earlier checkpoint last week was $1.55)

- Gemini 3 Pro: $1.21

Cost per token is not the right way to look at this. A bit more intelligence means mistakes (and wasted tokens) avoided.


Totally agree with this. I have seen many cases where a dumber model gets trapped in a local minimum and burns a ton of tokens to escape from it (sometimes unsuccessfully). In a toy example (30-minute agentic coding session - create a markdown -> html compiler using a subset of the commonmark test suite to hill climb on), dumber models would cost $18 (at retail token prices) to complete the task. Smarter models would see the trap and take only $3 to complete the task. YMMV.

Much better to look at cost per task - and good to see some benchmarks reporting this now.


For me this is sub agent usage. If I ask Claude Code to use 1-3 subagents for a task, the 5 hour limit is gone in one or two rounds. Weekly limit shortly after. They just keep producing more and more documentation about each individual intermediate step to talk to each other no matter how I edit the sub agent definitions.


Care to share some of your sub-agent usage? I've always intended to really make use of them, but with skills, I don't know how I'd separate these in many use cases.


I just grabbed a few from here: https://github.com/VoltAgent/awesome-claude-code-subagents

Had to modify them a bit, mostly taking out the parts I didn’t want them doing instead of me. Sometimes they produced good results but mostly I found that they did just as well as the main agent while being way more verbose. A task to do a bug hunt or to add a backend and frontend feature using two agents at once could result in 6-8 sizable Markdown documents.

Typically I find that just adding “act as a Senior Python engineer with experience in asyncio” or some such to be nearly as good.


They're useful for context management. I use them frequently for research in a codebase, looking for specific behavior, patterns, etc. That type of thing eats a lot of context because a lot of data needs to be ingested and analyzed.

If you delegate that work to a sub-agent, it does all the heavy lifting, then passes the results to the main agent. The sub-agent's context is used for all the work, not the main agent's.


Hard agree. The hidden cost of 'cheap' models is the complexity of the retry logic you have to write around them.

If a cheaper model hallucinates halfway through a multi-step agent workflow, I burn more tokens on verification and error correction loops than if I just used the smart model upfront. 'Cost per successful task' is the only metric that matters in production.
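
To make "cost per successful task" concrete, here is a rough sketch. It assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts is 1 / success_rate. The function name and every number below are made up for illustration and are not measurements from any real model.

```python
def cost_per_successful_task(
    tokens_per_attempt: int,
    price_per_million_tokens: float,
    success_rate: float,
) -> float:
    """Expected dollars spent per successfully completed task.

    Assumes independent retries until success, so the expected number
    of attempts is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    cost_per_attempt = tokens_per_attempt / 1_000_000 * price_per_million_tokens
    return cost_per_attempt / success_rate


# Hypothetical numbers: a cheap model that burns more tokens and fails
# more often, versus a pricier model that needs fewer tokens.
cheap = cost_per_successful_task(2_000_000, 3.0, 0.5)
smart = cost_per_successful_task(400_000, 15.0, 0.9)

print(f"cheap model: ${cheap:.2f} per successful task")
print(f"smart model: ${smart:.2f} per successful task")
```

With these invented inputs the model that is 5x more expensive per token still comes out cheaper per finished task, which is the effect described above.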


Yeah, that's a great point.

ArtificialAnalysis has an "intelligence per token" metric on which all of Anthropic's models are outliers.

For some reason, they need way fewer output tokens than everyone else's models to pass the benchmarks.

(There are of course many issues with benchmarks, but I thought that was really interesting.)


what is the typical usage pattern that would result in these cost figures?


Using small threads (see https://ampcode.com/@sqs for some of my public threads).

If you use very long threads and treat it as a long-and-winding conversation, you will get worse results and pay a lot more.


The context usage awareness is a big boost for this in my experience. I use speckit and have it set up to wrap up tasks while at least 20% of context remains, with a summary of progress, followed by /clear, insert summary and continue. This has reduced compacts almost entirely.
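
The wrap-up rule described here boils down to a threshold check. A minimal sketch, assuming you can read the tokens used and the context window size from your tool; `next_action` and its return values are hypothetical and are not part of speckit or Claude Code.

```python
def next_action(
    tokens_used: int,
    context_window: int,
    min_remaining_fraction: float = 0.20,
) -> str:
    """Decide whether to keep working or wrap up the current thread.

    Returns "summarize_and_clear" once the remaining context drops to
    the threshold (20% by default), otherwise "continue".
    """
    remaining_tokens = context_window - tokens_used
    if remaining_tokens <= min_remaining_fraction * context_window:
        return "summarize_and_clear"
    return "continue"


# Hypothetical 200k-token window.
print(next_action(100_000, 200_000))
print(next_action(170_000, 200_000))
```

On "summarize_and_clear" the workflow would write the progress summary, run /clear, and start the next step from that summary.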


Sorry we missed that email! I don’t know what went wrong there, but I just replied and will figure it out. This is definitely not the norm (and Build Crew is a small fraction of our users).


(I can't edit my old post, but it turned out to be a Discord issue, not an issue with the amp link. Oops!)


Yeah, I said about coding agents, “it’s obviously the future, but it’s not there yet”. That talk was from the AI Engineer conference in June 2024 (16 months ago). Coding agents have come a long way since then!


It's a big organization of teen coders who build really cool things together. Instead of coding alone, they get to hack on software and hardware projects in person and online with other smart teens all around the world.

You can see full financial and donor information at https://hackclub.com/philanthropy/ as well. Check it out. It's an organization that lots of HN folks would support (and many do). (I am on the board of Hack Club.)


Sounds like a great project! Sorry you had to deal with this headache.


I'm a Hack Clubber who is extremely active and has sent over 55K messages in the Slack (talk about insanity!). I've been part of Hack Club for about 3 years now, and it's changed my life in ways you couldn't have imagined. Porting over from Slack is super stressful for me + all of the HC staff having to pull all-nighters for the next week :). Hopefully this can all be figured out, and we can finally have proper FOSS software to allow for lots of additions via PRs! Also, all the finances are available too at hcb.hackclub.com/hq (guess what, this is 99% coded by teenagers too, and open source... woah).


Buildkite is awesome, by far the best CI product and with an amazing team.


That draft RFC you mention is superseded by https://agents.md. Now that Amp uses AGENTS.md (https://x.com/sqs/status/1957945824404729997), I made all the former agent file stuff on https://ampcode.com just redirect to https://agents.md.


Gotcha - this is what I had in my history: https://ampcode.com/AGENT.md

For me, that gives a 404 with no obvious way to get to https://agents.md. I think either a hyperlink or redirect would be nice to have as well.


Thank you for pointing that out. Just pushed a fix, will be live in ~5-10min.


Works now, thank you! :)


Just the filename is standardized. The contents aren't, which is exactly right. From the site:

> Are there required fields?

> No. AGENTS.md is just standard Markdown. Use any headings you like; the agent simply parses the text you provide.


Amp has only been out for a couple of months, and it's growing fast. Cursor and Claude Code have been around for longer, but obviously that's not always a good thing since everything is constantly changing. Check on X, like at https://x.com/sawyerhood/status/1945865702159925734 or https://x.com/search?q=%40AmpCode&src=typed_query&f=top.


This is really cool. I used Vibe Kanban with Amp to update some of our docs and UI components, and it was great.


I would say conservatively that 80% of Vibe Kanban has been built by Amp

