Hacker News | all2's comments

Dice the mayo and sticks of RAM and place in a cast iron skillet over medium heat. Turn it every two or three minutes. Remove when you can smell the magic smoke.


I'm on my 5th draft of an essentially vibe-coded project. Maybe it's because I'm using not-frontier models to do the coding, but it takes me two or three tries to get the shape of a thing just right. Drafting like this is something I do when I code by hand, as well. I have to implement a thing a few times before I begin to understand the domain I'm working in. Once I understand the domain, the separation of concerns follows naturally, and so do the component APIs (and how those APIs hook together).


My suggestions:

- like the sister comment says, use the best model available. For me that has been Opus, but YMMV. Some of my colleagues prefer the OpenAI models.

- iterate on the plan until it looks solid. This is where you should invest your time.

- Watch the model closely and make sure it writes tests first, checks that they fail, and only then proceeds to implementation

- the model should add pieces one by one, ensuring each step works before proceeding. Commit each step so you can easily retry if you need to. Each addition will involve a new plan that you go back and forth on until you're happy with it. The planning usually gets easier as the project moves along.

- this is sometimes controversial, but use the best language you can target. That can be Rust, Haskell, or Erlang, depending on the context. Strong types make a big difference: they catch the silly mistakes models are liable to make.
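The tests-first step above can be sketched in Go. This is purely illustrative: Slugify and its cases are made-up names, not anything from a real project. The point is that the assertions in main are written and run to failure before the function body exists.

```go
package main

import "fmt"

// Slugify is a hypothetical function under development. Per the
// tests-first workflow, the assertions in main are written first,
// run to confirm they fail, and only then is this body filled in.
func Slugify(s string) string {
	out := []rune{}
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
			out = append(out, r) // keep lowercase letters and digits
		case r >= 'A' && r <= 'Z':
			out = append(out, r+('a'-'A')) // lowercase uppercase letters
		case r == ' ':
			out = append(out, '-') // spaces become hyphens
		}
		// everything else (punctuation, etc.) is dropped
	}
	return string(out)
}

func main() {
	// The "test": written before the implementation, committed with it.
	cases := map[string]string{
		"Hello World": "hello-world",
		"Go 1.22":     "go-122",
	}
	for in, want := range cases {
		if got := Slugify(in); got != want {
			panic(fmt.Sprintf("Slugify(%q) = %q, want %q", in, got, want))
		}
	}
	fmt.Println("all tests pass")
}
```

Each green run like this is a natural point for one of the small commits mentioned above.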

Cursor is great for trying out different models. If Opus is what you like, I've found Claude Code to be better value, and personally I prefer the CLI to the VS Code UI Cursor builds on. It's not a panacea, though. The CLI has its own issues, like occasionally slowing to a crawl, but it still gets the work done.


My options are 1) pay about a dollar per query from a frontier model, or 2) pay a fraction of that for a not-so-great model that makes my token spend last days/weeks instead of hours.

I spend a lot of time on plans, but unfortunately the gotchas are in the weeds, especially when it comes to complex systems. I don't trust these models with even marginally complex, non-standard architectures (my projects center around statecharts right now, and the semantics around those can get hairy).

I git commit after each feature/bugfix, so we're on the same page here. If a feature is too big, or is made up of more than one "big" change, I chunk up the work and commit in small batches until the feature is complete.

I'm running golang for my projects right now. I can try a more strongly typed language, but that means learning a whole new language and its gotchas and architectural constraints.

Right now I use claude-code-router and Claude Code on top of openrouter, so swapping models is trivial. I use mostly Grok-4.1 Fast or Kimi 2.5. Both of these choke less than Anthropic's own Sonnet (which is still more expensive than the two alternatives).


> and personally I prefer the CLI to the vscode UI cursor builds on

So do I, but I also quite like Cursor's harness/approach to things.

Which is why their `agent` CLI is so handy! You can use Cursor in any IDE/system now, exactly like Claude Code or Codex CLI.


I tried it when it first came out and it was lacking then. Perhaps it's better now--will give it a shot when I sign up for cursor again.

Thank you for sharing that!


When you say “iterate on the plan” are you suggesting to do that with the AI or on your own? For the former, have any tips/patterns to suggest?


With the AI. I read the whole thing and correct the model where it makes mistakes, fill the gaps where I find them.

I also always check that it explicitly states my rules (some from the global rules, some from the session up until that moment) so they're followed at implementation time.

In my experience opus is great at understanding what you want and putting it in a plan, and it's also great at sticking to the plan. So just read through the entire thing and make sure it's a plan that you feel confident about.

There will be some trial and error before you notice the kind of things the model gets wrong, and that will guide what you look for in the plan that it spits out.


> Maybe its because I'm using not-frontier models to do the coding

IMO it’s probably that. The difference between where this was a year ago and now is night and day, and not using frontier models is roughly like stepping back in time 6-12 months.


I'm assuming GP means 'run inference locally on GPU or RAM'. You can run really big LLMs on local infra, they just do a fraction of a token per second, so it might take all night to get a paragraph or two of text. Mix in things like thinking and tool calls, and it will take a long, long time to get anything useful out of it.


I’ve been experimenting with this today. I still don’t think AI is a very good use of my programming time… but it’s a pretty good use of my non-programming time.

I ran OpenCode with some 30B local models today and it got some useful stuff done while I was doing my budget, folding laundry, etc.

It’s less likely to “one shot” apples to apples compared to the big cloud models; Gemini 3 Pro can one shot reasonably complex coding problems through the chat interface. But through the agent interface where it can run tests, linters, etc. it does a pretty good job for the size of task I find reasonable to outsource to AI.

This is with a high-end but not specifically AI-focused desktop that I built some three years ago, mostly with VMs, code compilation, and gaming in mind.


Yes, this is what I meant. People are running huge models at home now, so I assumed a business could do it on premises or in a data center, presumably faster... but yeah, it definitely depends on what time scales we're talking about.


I'd love to know what kind of hardware it would take to do inference at the speed provided by the frontier model providers (assuming their models were available for local use).

10k worth of hardware? 50k? 100k?

Assuming a single user.


Huge models? First you have to spend $5k-$10k or more on hardware. Maybe $3k for something extremely slow (<1 tok/sec) that is disk-bound. So that's not a great deal over batch API pricing for a long, long time.

Also you still wouldn't be able to run "huge" models at a decent quantization and token speed. Kimi K2.5 (1T params) with a very aggressive quantization level might run on one Mac Studio with 512GB RAM at a few tokens per second.

To run Kimi K2.5 at an acceptable quantization and speed, you'd need to spend $15k+ on 2 Mac Studios with 512GB RAM and cluster them. Then you'll maybe get 10-15 tok/sec.
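The back-of-envelope math behind that 512GB figure can be checked directly, assuming memory is dominated by bits-per-param times parameter count (this ignores KV cache, activations, and runtime overhead, which only push the number up):

```go
package main

import "fmt"

func main() {
	// Assumed inputs: ~1T parameters (Kimi K2.5 class) at an
	// aggressive 4-bit quantization, weights only.
	params := 1e12
	bitsPerParam := 4.0
	bytes := params * bitsPerParam / 8
	fmt.Printf("~%.0f GB of weights\n", bytes/1e9) // ~500 GB
}
```

So even before cache and overhead, a 4-bit 1T-param model nearly fills a single 512GB machine, which is why acceptable speeds push you toward clustering.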


Hello all,

My name is Albert and I built this statechart engine to scratch my own itch. I work with embedded systems and in an environment where process is paramount. My current employer has a number of certifications that require semi-annual audits of process, paperwork, etc.

I kept reaching for tools I knew (python-statemachine in particular) to solve process problems, but nothing really scratched the itch; I needed an engine that took declarative, versionable workflows that were, more importantly, not part of the core runtime. I want to be able to push/pull/up/down workflows the same way I do Docker containers. I want those workflows to expose API endpoints and webhooks so I can interface with them in a variety of ways (UI, test completions, ECO/ECR flows involving people, and so on).

This library is a foundational piece of that puzzle; it gives me a relatively performant statechart execution engine that I can build my dream app on top of.

It consists of two runtimes: an event-driven runtime and a tick-based runtime. The event-driven runtime does not guarantee event execution order for parallel states, whereas the tick-based runtime does. The tick-based runtime is single-threaded on purpose, to guarantee event evaluation order, which makes it deterministic.
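A minimal sketch of what "single-threaded tick-based" means here (the names are hypothetical, not StatechartX's actual API): events accumulate between ticks and are drained in FIFO order on a single goroutine, so evaluation order is deterministic.

```go
package main

import "fmt"

// Event and TickRuntime are illustrative stand-ins, not the real API.
type Event struct{ Name string }

type TickRuntime struct {
	queue []Event
	state string
}

// Submit enqueues an event; nothing executes until the next tick.
func (r *TickRuntime) Submit(e Event) { r.queue = append(r.queue, e) }

// Tick drains the queue in submission order on one goroutine,
// so two runs with the same inputs always produce the same result.
func (r *TickRuntime) Tick() {
	for _, e := range r.queue {
		r.state = e.Name // placeholder for real transition logic
	}
	r.queue = r.queue[:0]
}

func main() {
	rt := &TickRuntime{state: "idle"}
	rt.Submit(Event{"start"})
	rt.Submit(Event{"stop"})
	rt.Tick()
	fmt.Println(rt.state) // "stop": the last submitted event wins, deterministically
}
```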

As an aside, I plan on building out a number of demos using this engine in the coming weeks. Among the demo ideas are: an AI agentic pipeline doing something arbitrary, like research (because why not); a simple game engine integration where the tick-based runtime takes the place of the 'system' in an entity-component-system architecture; and probably an HTTP server implementation, just for kicks.

Thanks for taking a look!

---

Oh, and here are some benchmarks for the alpha release:

StatechartX Benchmarks (i5-5300U, 4 cores, Linux)

  Runtime              Burst      Sustained   Latency   Queue   Allocs
  ---------------------------------------------------------------------
  Realtime (1000Hz)    15.05M/s   ~6.1M/s     279 µs    10K     0
  Core - Simple        8.86M/s    8.86M/s     ~83 ns    100K    0
  Core - Internal      12.08M/s   12.08M/s    ~83 ns    1K      0
  Core - Concurrent    4.00M/s    4.00M/s     ~83 ns    10K     0
  Core - Parallel      3.69M/s    3.69M/s     ~83 ns    100K    0
  Core - Hierarchical  3.41M/s    3.41M/s     ~83 ns    100K    0
  Core - Guarded       3.01M/s    3.01M/s     ~83 ns    100K    0

  Memory: 32KB/machine | Zero allocations under load | Graceful backpressure

  Event-driven: burst = sustained (constant, predictable)
  Tick-based: burst > sustained (15M submit, 6.1M process)


It seems like everything converges on either LISP or emacs.


It's actually the opposite, I think. Because of how industrialized the lumber/paper industries have gotten, stewardship of forests has improved over time. This includes replanting in harvested areas.


Says anyone who has tried to do anything requiring the smallest amount of computer science or computer engineering. These models are really great at boilerplate and simple web apps. As soon as you get beyond that, it gets hairy. For example, I have a clone of HN I've been working on that adds subscriptions and ad slot bidding. Just those two features required a lot of hand holding. Figma Design nailed the UX, but the actual guts/business logic I had to spend time on.

I expect that this will get easier as agentic flows get more mature, though.

Then the only place that novelty will occur is in the actual study of computer science. And even then, a well contexted agentic pipeline will speed even R&D development to a great degree.

One very bad thing about these models is the embedded dogma. With AI ruling the roost in terms of generation (basically an advanced and opinionated typewriter, let's be honest), breaking away from the standards in any field will become increasingly difficult. Just try to talk to any frontier model about physics that goes against what is currently accepted and it'll put up a lot of resistance.


I’ve been pleasantly surprised how useful it is for writing low-level stuff like peripheral drivers on embedded platforms. It’s actually simple stuff, but exactingly technical and detail-oriented. It’s interesting that it can work so well, then go wildly off the rails and be impossible to wrestle back unless you go way back in the context, or even start a completely new context and feed in only what is currently relevant (this has become my main strategy).

Still, it’s amazingly good at wrangling a bunch of technical details into harmony and applying them to a tried-and-true design paradigm to create an API for new devices, or to handle tricky timing, things like that. Until it isn’t, and you have to abort the session and build a new one because it has worked itself into some kind of context corner where it obsesses over something that is just wrong or irrelevant.

Still, it’s a solid 2x on production, and my code is arguably more maintainable because I don’t get tempted to be clever or skip clarifying patterns.

There is a level of holistic complexity that kills it, though. The trick is dividing the structure and tasks into self-contained components that keep any relevant state within their confines to the maximum practical extent, even if there is a lot of interdependent state going on inside. It’s sort of a meta-functional paradigm working with inherently state-centric modules.


> a clone of HN I've been working on that adds subscriptions and ad slot bidding

Wut, what's the purpose of that? Is this just a toy learning project? Would it be to make money off of people who don't know that an ad-free version of HN exists at news.ycombinator.com? Will you try to sell it to Ycombinator?


I am hoping they are developing it as a satirical art project, otherwise... yikes; needing a credit card and an ad blocker to use HN would be very depressing and is counter to everything I enjoy about this forum.


A "clone" usually doesn't mean that they'll copy the content, but the idea, like Ola is a clone of Uber.

(Though they probably should've said a link aggregator instead of HN clone.)


> (Though they probably should've said a link aggregator instead of HN clone.)

That's fair. It is a feature-to-feature clone, though.


Mostly just learning, to be honest. I'm not trying to replace HN, I'm just fiddling around and seeing what I can do and what I can't.

My long-term purpose is to provide the source code for communities/creators that want something simple to set up, and specifically to allow creators to gate content behind a paywall. I'm sure stuff like that exists, but I hope what I build will be at least somewhat usable.


I'm glad you're not competent enough to create a paid version of HN with the help of AI.


> Poorly engineered

How so? As a pedagogic tool it seems adequate. How would you change this to be better (either from a teaching standpoint or from a SWE standpoint)?


I'm curious what the alternative would be? Python's threads/sub-processing almost requires IO queues to function well, and are nearly the same semantically as channels...

I'm not saying these are "good", just wondering what alternatives look like?
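For reference, the queue-per-worker pattern in question looks roughly like this with Go channels (an illustrative sketch of the "queues are nearly channels" point, not a claim about any particular library):

```go
package main

import "fmt"

// worker consumes jobs from one channel and sends results on another,
// the same shape as a Python thread reading from a Queue.
func worker(jobs <-chan int, results chan<- int) {
	for j := range jobs {
		results <- j * j // stand-in for real work
	}
}

func main() {
	jobs := make(chan int, 4)
	results := make(chan int, 4)
	go worker(jobs, results)

	for i := 1; i <= 4; i++ {
		jobs <- i
	}
	close(jobs) // signals the worker to exit its range loop

	sum := 0
	for i := 0; i < 4; i++ {
		sum += <-results
	}
	fmt.Println(sum) // 1 + 4 + 9 + 16 = 30
}
```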


One alternative is STM (software transactional memory), which is modular and composable. I think Haskell was first to implement it, but it's also in Clojure and some Scala libs.

This is what ZIO's type-safe version looks like: https://zio.dev/reference/stm/ Scala's for-comprehension is syntactic sugar for calls to flatMap, map, and withFilter, similar to Haskell's do-notation.


Meh. The US has a history of installing totalitarian or terrorist governments. The Middle East is a lovely example; the US was responsible for the likes of Osama bin Laden (a CIA asset), for the installation of Saddam Hussein, and many, many more.

