> it's becoming more about organizing tasks and letting ai agents figure out the implementation details ... different agents pick up different pieces and coordinate with each other
This is exactly what I have been working on for the past year and a half: a system for managing agents where you work at a higher abstraction level, explaining concepts (literally with your voice) and providing feedback. All the agent-agent-human communication happens on a shared markdown tree.
I haven't posted it anywhere yet, but your comment describes the vision so well that I guess it's time to start sharing it :D See https://voicetree.io for a demo video. I have been using it every day for engineering work, and it really does feel like what you describe; my job is now more about organizing tasks, explaining them well, and providing critique, just by talking to the computer. For example, when going through the git diffs of what the agents wrote, I speak out loud any problems I notice. That flows through voice -> text -> markdown tree updates, and those updates send hook notifications to Claude Code so the agents automatically address the feedback.
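To make the loop concrete, here's a rough stand-in sketch in Python. To be clear about what's assumed: the real system isn't public, so the one-.md-file-per-node layout, the polling watcher, and the `notify_agent` function are all placeholders for whatever the actual mechanism is, not the implementation.

```python
# Rough sketch of the feedback loop: watch the shared markdown tree for
# edits and ping the coding agent when a node changes.
# Assumptions (mine, not from the post): one .md file per tree node under
# TREE_DIR, and notify_agent() stands in for the real hook into Claude Code.
import hashlib
import time
from pathlib import Path

TREE_DIR = Path("markdown_tree")  # hypothetical location of the shared tree


def snapshot(tree_dir: Path) -> dict:
    """Hash every node file so changed nodes can be detected by diffing."""
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in tree_dir.rglob("*.md")
    }


def notify_agent(node_path: str) -> None:
    """Placeholder hook: tell the agent this node has new feedback."""
    print(f"hook -> agent: feedback updated in {node_path}")


def watch(tree_dir: Path, interval: float = 2.0) -> None:
    """Poll the tree and fire the hook for any new or edited node."""
    seen = snapshot(tree_dir)
    while True:
        time.sleep(interval)
        current = snapshot(tree_dir)
        for path, digest in current.items():
            if seen.get(path) != digest:  # new node, or contents changed
                notify_agent(path)
        seen = current


if __name__ == "__main__":
    TREE_DIR.mkdir(exist_ok=True)
    watch(TREE_DIR)
```

Polling keeps the sketch stdlib-only; a real version would presumably use filesystem events instead of a hash-and-diff loop.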
Cool demo!
The first thing that sprang to mind after seeing it was an image of a busy office floor filled with people talking into their headsets, not buying or selling stocks, but actually programming. Whether that’s a blessed or cursed image I’ll let you decide.
Haha, blursed, one might say. In seriousness though, the social awkwardness of talking to a computer around other people will likely be the biggest bottleneck to adoption for this sort of tech. It may need to initially be framed as a tool for work-from-home engineers.
Luckily the other side of this project doesn't require any behavioural change from users. The idea is to convert chat histories into a tree format with the same core algorithm, then send only the relevant sub-tree to the LLM, reducing input tokens and context bloat and thereby improving accuracy. That would also unlock almost unlimited-length LLM chats. I have been running this context retrieval algorithm against a few benchmarks (GSM-Infinite, NoLiMa, and LongBench-v2), and the early results are very promising: ~60-90% fewer input tokens with increased accuracy versus SOTA, though so far only on a subset of each benchmark's full dataset.
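As a toy illustration of the retrieval idea (the actual scoring algorithm is the part I haven't shared, so the word-overlap scorer and the 0.3 threshold below are illustrative stand-ins only), the shape of it is: score nodes against the query, keep any node that clears the threshold or has a relevant descendant, and serialize only that pruned tree into the prompt.

```python
# Toy sketch of sub-tree context retrieval. The real scoring algorithm
# isn't public; the word-overlap scorer and 0.3 threshold here are
# illustrative stand-ins, not the actual method.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    text: str
    children: List["Node"] = field(default_factory=list)


def relevance(node: Node, query: str) -> float:
    """Toy scorer: fraction of query words that appear in the node."""
    q = set(query.lower().split())
    n = set(node.text.lower().split())
    return len(q & n) / (len(q) or 1)


def relevant_subtree(node: Node, query: str,
                     threshold: float = 0.3) -> Optional[Node]:
    """Keep a node if it clears the threshold or any descendant does,
    so ancestors survive and the LLM still sees the tree structure."""
    kept = []
    for child in node.children:
        sub = relevant_subtree(child, query, threshold)
        if sub is not None:
            kept.append(sub)
    if kept or relevance(node, query) >= threshold:
        return Node(node.text, kept)
    return None


def to_context(node: Node, depth: int = 0) -> str:
    """Serialize the pruned tree into a markdown-ish prompt fragment."""
    lines = ["  " * depth + "- " + node.text]
    for child in node.children:
        lines.append(to_context(child, depth + 1))
    return "\n".join(lines)


if __name__ == "__main__":
    root = Node("project", [
        Node("auth bug: token refresh fails on retry"),
        Node("ui polish: pick dark mode colors"),
    ])
    pruned = relevant_subtree(root, "why does the token refresh fail on retry")
    print(to_context(pruned) if pruned else "(nothing relevant)")
```

The token savings come from the last step: the LLM only ever sees the pruned tree, never the full chat history.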