Using models to go from spec to program is one use case, but it's not the whole story. I'm not hand-writing specs; I use LLMs to iteratively develop the spec, the validation harness, and then the implementation. I'm hands-on with the agents, and hands-off otherwise, thanks to the workflow style we call Attractor.
In practice, we try to close the loop with agents: plan -> generate -> run tests/validators -> fix -> repeat. What I mainly contribute is taste and deciding what to do next: what to build, what "done" means, and how to decompose the work so models can execute. With a strong definition of done and a good harness, the system can often converge with minimal human input. For debugging, we also have a system that ingests app logs plus agent traces (via CXDB).
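Roughly, the loop has this shape (a minimal sketch in Python; the agent and validator calls are hypothetical stubs, not our actual harness or CXDB):

```python
# Minimal sketch of the plan -> generate -> run validators -> fix -> repeat loop.
import subprocess
from dataclasses import dataclass

@dataclass
class Report:
    passed: bool
    failures: str

def run_validators() -> Report:
    # The "definition of done" here is just the test suite; in practice it
    # would also include linters, schema checks, and deploy validations.
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return Report(passed=result.returncode == 0,
                  failures=result.stdout + result.stderr)

def ask_agent(prompt: str) -> None:
    # Hypothetical stand-in: hand the prompt to whatever coding agent you
    # use and let it edit the working tree.
    raise NotImplementedError("wire up your coding agent here")

def converge(spec: str, max_rounds: int = 10) -> None:
    ask_agent(f"Plan and implement this spec:\n{spec}")   # plan -> generate
    for _ in range(max_rounds):
        report = run_validators()                         # run tests/validators
        if report.passed:
            return                                        # converged
        ask_agent(f"These checks failed, fix them:\n{report.failures}")  # fix -> repeat
    raise RuntimeError("did not converge; time for a human to step in")
```

The human contribution lives mostly outside this loop: writing the spec, deciding what "passed" should actually require, and stepping in when it fails to converge.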
The more reps you get, the better your intuition for where models work and where you need tighter specs. You also have to keep updating your priors with each new model release or harness change.
This might not have been a clear answer, but I am happy to keep clarifying as needed!
> What do you show to new folks when they join your team?
I think this is an interesting question because we have not fully figured out the best way to onboard people to our codebases. Each person is responsible for multiple codebases (yay microservices!), and no one else commits to a repository while they have dibs. We also have conventions for how agents write documentation around deployments and validations.
In theory, when a new person joins the team or is handed a repository, they can throw some tokens at the codebase, interrogate it, and ask questions about how things are implemented.
> But what is the result of your work?
The end result is a final, working codebase. The specs and sprint plans are also committed to the repository for posterity, so agents in a fresh session can see what work has been completed and the trajectory we are moving toward.
>But it does reduce by an order of magnitude the amount of money you need to spend on programming a solution that would work better
Could you share any data on this? Are there any case studies you could reference or at least personal experience? One order of magnitude is 10x improvement in cost, right?
I'm not sure it's a perfect example, but it is at least a very realistic one from a company that really doesn't have time or energy for hype or fluff:
We are currently sunsetting our use of Webflow for content management and hosting, and are replacing it with our own solution which Cursor & Claude Opus helped us build in around 10 days:
And the big advantage for us is twofold. First, our content marketers now have a "Cursor-light" experience when creating landing pages: from their point of view it is a "text-to-landing-page" LLM-powered tool with a chat interface, so no more fumbling around in the Webflow WYSIWYG interface.
Second, from the software engineering department's point of view, the results of the content marketers' work are simply changes/PRs in a git repository, which we can work on in the IDE of our choice; again, no fumbling around in the Webflow WYSIWYG interface.
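To make the "changes land as PRs" part concrete, the glue on the engineering side is roughly this shape (a hypothetical sketch using git and the GitHub CLI; the paths, branch names, and helper are assumptions, not our actual tool):

```python
# Hypothetical sketch: write LLM-generated landing page HTML into the repo
# and open a PR for the engineering team to review. Assumes git and the
# GitHub CLI (gh) are installed and the repo is already cloned.
import pathlib
import subprocess

def open_landing_page_pr(slug: str, html: str) -> None:
    branch = f"landing/{slug}"
    path = pathlib.Path("pages") / f"{slug}.html"   # assumed repo layout

    subprocess.run(["git", "checkout", "-b", branch], check=True)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    subprocess.run(["git", "add", str(path)], check=True)
    subprocess.run(["git", "commit", "-m", f"Add landing page: {slug}"], check=True)
    subprocess.run(["git", "push", "-u", "origin", branch], check=True)
    subprocess.run(
        ["gh", "pr", "create",
         "--title", f"Landing page: {slug}",
         "--body", "Generated from the marketers' chat interface."],
        check=True,
    )
```

The point is less the specific commands than the design choice: the marketers' output is just files in a branch, so the usual review, CI, and rollback machinery applies.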
Looks like the same framework they used to build the ChatGPT desktop app (Electron).
edit - from another comment:
> Hi! Romain here, I work on Codex at OpenAI. We totally hear you. The team actually built the app in Electron specifically so we can support Windows and Linux as well. We shipped macOS first, but Windows is coming very soon. Appreciate you calling this out. Stay tuned!
>I think that really high quality code can be created via coding agents. Not in one prompt, but instead an orchestration of planning, implementing, validating, and reviewing.
Do you have any advice to share (or resources)? Have you experienced it yourself?
>This project leverages the Gemini APIs to provide AI capabilities. For details on the terms of service governing the Gemini API, please refer to the terms for the access mechanism you are using:
Click "Gemini API", then scroll down:
>When you use Unpaid Services, including, for example, Google AI Studio and the unpaid quota on Gemini API, Google uses the content you submit to the Services and any generated responses to provide, improve, and develop Google products and services and machine learning technologies, including Google's enterprise features, products, and services, consistent with our Privacy Policy.
>To help with quality and improve our products, human reviewers may read, annotate, and process your API input and output. Google takes steps to protect your privacy as part of this process. This includes disconnecting this data from your Google Account, API key, and Cloud project before reviewers see or annotate it. Do not submit sensitive, confidential, or personal information to the Unpaid Services.
From that document I read that it in fact does, but it doesn't release memory if the app starts consuming less. It does memory ballooning though, so the VM only consumes as much RAM as the maximum amount the app has requested.
It's a bit more complicated. For the purposes of the GDPR, legal obligations within the EU (where we might assume relevant protections are in place) might be considered differently than, e.g., legal obligations towards the Chinese Communist Party or the NSA.