I'm thinking all the time about what the "best" way of using local AI agents like Claude / Codex / Gemini is. I'm trying to figure out the best UI/UX. There's so so so much that hasn't been explored yet.
Mainly I'm working on a task dispatch dashboard called Prompter Hawk that is designed to be the best UI for task management with agents. If you've been trying to parallelize by running multiple claude code terminals or codex terminals at once, this tool replaces those terminals and fits them all into one view with an AI task tracking board. It sounds more complicated than it is. It's a harness for Claude / Gemini / GPT models with a GUI that speeds up all your workflows. Rather than using sustained chat mode, all Prompter Hawk tasks are fire-and-forget. You just give the task description and come back when it's done. Parallelism first.
Some example highlight features:
-One dashboard view that shows all your parallel sessions and which tasks each agent has in progress and in their queue. Also shows recently completed tasks and outputs. This is my attempt at the ideal "pilot's cockpit view" for agentic development.
-Tasks are well tracked by the manager: see their status, file changes, and git commits. One click task retry. Get breakdowns on cost per run. Tasks can be set to automatically recur on a given schedule. Everything goes into a persistent local DB so you can easily pull up task data from months ago. Far far better user experience than trying to pull up old chat histories IMO.
-Timeline view and analytics views that give you hard stats on your velocity and how effectively your agents are using and updating your codebase. See unique stats like which of your files your agents read the most and how many daily LOC and commit changes you're doing. See how well you're parallelizing workloads at a simple glance.
-Automatic system diagram generation
-Task suggestion feature. If your agents are idle, they can draft tentative tasks to carry out next, based on the project history and your goals. This makes keeping multiple agents spinning actually much easier than you'd think. You don't need to be a multitasking context-switching god to do this.
I haven't shared it much (not even a Show HN) because the landing page isn't converting well at all yet, though I have some reddit ads doing well. I've had a bunch of free users sign up and a handful of paying users too. Looking for users or just feedback on anything! Sorry for wall of text.
With a bit of tuning, you can get models like Claude to output Mermaid-style diagrams. I built this as a feature into the tasks, so that you can hit a toggle which adds a prompt asking the agent to create a Mermaid diagram during or after the task execution. I pull this diagram back into the GUI and display it with the task information. So user flow is like:
-User creates task as usual but toggles the "mermaid diagram" option on
-Agent takes additional step during execution to create diagram
-User sees that diagram on the task details panel for that task
If you specify in your overall task prompt what kind of diagram you want or what you want it to show, it will take your specifications into account. It's just a prompt control + automatically pulling that diagram back into the task tracking.
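For anyone curious what "a prompt control + automatically pulling that diagram back" could look like, here's a rough sketch. To be clear, this is not Prompter Hawk's actual code: the suffix wording and the `build_prompt` / `extract_mermaid` names are made up for illustration, and it assumes the agent returns its diagram in a fenced block tagged `mermaid`.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built this way so the example nests cleanly

# Hypothetical prompt suffix; the real wording used by the tool is not given in the post.
MERMAID_SUFFIX = (
    "\n\nAfter completing the task, also output a Mermaid diagram of what you "
    "changed, inside a fenced block tagged 'mermaid'."
)

# Matches a fenced block tagged 'mermaid' and captures its body.
_MERMAID_BLOCK = re.compile(FENCE + r"mermaid\s*\n(.*?)" + FENCE, re.DOTALL)


def build_prompt(task_description: str, want_diagram: bool) -> str:
    """Append the diagram request only when the toggle is on."""
    return task_description + (MERMAID_SUFFIX if want_diagram else "")


def extract_mermaid(agent_output: str) -> Optional[str]:
    """Return the last Mermaid block in the agent's output, or None if absent."""
    matches = _MERMAID_BLOCK.findall(agent_output)
    return matches[-1].strip() if matches else None
```

Taking the last block rather than the first is a guess on my part: if the agent drafts a diagram mid-task and revises it at the end, the final one is probably the one worth showing on the task details panel.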
It's sad for me to admit this, but I find online ordering, especially for fast food in my area, is a much better experience because it standardizes the presentation of the order details for the cashier. When I speak to a human being, however, I often run into language barriers immediately, and as a result my orders get screwed up more often.
The other side of this coin is that many people take forever at the digital kiosk. If there aren’t enough kiosks and the location is a busy one, I’ll actively avoid that place.
I might as well figure out how to use their website/app at that point.
But all of this leads us further down the path of abstracting/hiding the reality of the purchasing decisions we make. I can’t help but feel this all chips away at our collective humanity.
Yeah, a new amazing matcha drink/ice cream place opened in my area. There is always a long line out the door, so I went there to check it out. It was amazing, but the real reason for the long line is that the kiosk wanted an SMS confirmation to place the order.
There was a very tiny "skip" button but most people ahead of me probably didn't see it.
But the average time taken by each and every person at self serve lines seems to be higher, because not only does the person need to decide what they want to order, they also need to figure out the UI at the same time. And there seems to be more choice paralysis when the primary way the customer consumes the menu is the moment they walk up to the kiosk.
In a traditional line, half of the ordering process is typically handled by an individual already trained to rapidly enter the order details, and in most cases, the person ordering knows what they want by the time it’s their turn.
These kiosks are about optimizing cost, and just about everything about the end user experience seems worse.
I can (usually) manage self-checkout at a grocery store, but I find self-service kiosks for ordering fast food to be paralyzing.
There's no way to ask questions. It's not always clear what's on the ingredients list, or how big a particular menu item is, or if a side of rice/bread is included. If I want to ask them not to put something on an order, or to add something extra, I have to go through the entire workflow and then sometimes backtrack. I'm semi-regularly surprised by what I get, and they also usually don't accept cash. The instructions for how to pick up an order and what identifying information to provide are also often unclear.
Sure, I sometimes get something slightly different from what I wanted when I order from a human, but I usually get something at least close. I think I've had a serious mix-up once in my entire life, when ordering in German from someone else who spoke German as a second language. Even then, the server looked confused and double-checked; it was only my own stubbornness regarding my language proficiency that caused a problem.
Interesting, almost every case I know of where a fire alarm has gone off, no one really cared. I even remember a fire alarm going off in an actual movie theater and no one budged.
I'm going to maintain my position that a precondition for something always being a big deal is that it must always be meaningful. Fire alarms are sometimes a big deal, but sometimes they are just noise. I think the CFO of Tesla stepping down is just noise as well, but you are welcome to think otherwise.
You're missing the point and willfully characterizing others as solely being concerned with making the AIs say slurs. That's not their concern. But you can win any imaginary argument you like.
I definitely appreciate the wish to treat people charitably!
In terms of the site guidelines, "You're missing the point" is kind of a swipe and so should probably be dropped; "willfully" should definitely have been dropped because it's making a claim about negative intent that one can't actually know and such claims always land as an attack on the other person; and the last sentence was snarky and should have been dropped.
If one makes a habit of editing such things out of one's comments, one's substantive point will come to the fore more clearly, which benefits everyone. But it's not always easy in the moment!
My favorite kind of comment: allude to a bigger point the op misses, but don't actually say the point.
I doubt I'm misrepresenting anybody. If it's not slurs, it's surely something about "wokeness."
You are not yet mature enough for this future if any of this is your concern. The world is going to pass you by while you're just stuck saying "there are only two genders" to all your comrades.
Don't let the politicians mobilize you like this, your time is worth more.
I think multi-currency systems can be used to juice microtransactions pretty hard still. For example, League of Legends gives players free "chests" and free "keys" as a slow drip, and you can open any chest with a key to get some random loot. If your chest count doesn't match your key count, you can buy more of either using real cash. So basically, multi-currency systems can be used to keep players intentionally in a state of imbalance. It also means their money doesn't go as far in the game if they have to buy separate blue, yellow, and red coins instead of just yellow coins, which is what it sounds like in your Hades example (just having 9).
Sure, but multi-currency systems juice anything. Control, for example, has tons of different materials used for upgrades and crafting. In order to get the right mix of upgrades, you have to travel to all different parts of the game world. Nier has something similar.
By comparison, there are plenty of games (especially older ones) with only one or two currencies. Maybe you just find the one place in the game world that lets you grind out those currencies the fastest, and you do that over and over until you’re sick of it.
Musk has said that one of his companies will give paraplegics the ability to walk again. He's overstated and overhyped the autonomous driving capabilities a million times as well.
It was cold enough in downtown Boston that a woman froze to death walking a few blocks home, to say nothing of all the homeless people whose deaths probably went unremarked in the media.
Matlab's focus on examples really shines. It's how people actually want to use docs: let me copy and paste something that works and then iterate from there.
"9s" are brought up in reliability discussions. They refer to the number of 9s in an uptime metric like 99%, 99.9%, 99.9999%, ...
See also "march of the 9s"
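To make the 9s concrete, here's a quick sketch that turns a count of nines into allowed downtime. The function name and the 365-day year are my own choices for illustration, and it only handles whole numbers of nines (2 for 99%, 3 for 99.9%, and so on).

```python
def allowed_downtime_minutes(nines: int, period_days: float = 365.0) -> float:
    """Downtime budget, in minutes, for an uptime target with `nines` 9s.

    nines=2 means 99%, nines=3 means 99.9%, and so on.
    """
    unavailability = 10 ** (-nines)  # e.g. 0.01 for two nines
    return period_days * 24 * 60 * unavailability


for n in (2, 3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Each extra 9 cuts the downtime budget by a factor of ten, which is why every additional 9 tends to be much harder to reach than the one before it.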
What do you mean by "however"? He built the entire FSD architecture - in less time and with fewer people than Google, Apple, and god knows who else. He's likely saved more lives than most doctors at this point...
No he didn't, a (large, from what I hear) team did. However, Karpathy is still a smart, well-accomplished guy, and he has a talent many academics don't: being able to write actual production-grade, high-quality software. And he's a good communicator.
What's ridiculous about stats? A Tesla with autopilot or FSD is more than 10x less likely to be in a collision, based on various US agencies stats. Average chance of collision is 1 in 366 for every 1000 miles driven. Something like 1 in 20000 of those result in fatalities. Now, take millions of Teslas equipped with any assisted-driving tech, multiply by miles driven, include the above averages, and divide by ten. Doctors would be jelly.
Perhaps you can elaborate on what strikes you as ridiculous about fairly straightforward stats? Surprised to see you on a ML thread...
Can you cite "various US agencies stats"? I say this as someone who used FSD Beta for a full year and Autopilot (with Navigate on Autopilot) for over 3 years now. It would actually help the conversation here to cite it, though I never knew third-parties evaluated Tesla's claims/statistics.
There's also a strong selection bias in a bulk stat like the one parent mentioned. Teslas are expensive cars, and I'm guessing their drivers are likely to be older, more affluent, and more educated than average. This all correlates with lower accidents/fatalities regardless of FSD[1][2].
Millions of Teslas equipped with ‘any assisted-driving tech’ != autopilot on FSD. We have to compare deaths in Teslas with deaths in other vehicles too, as that’s what people would use if not a Tesla.
The chance of fatality per collision is way off.
If you’re going to go down a rabbit hole we need to look at lives impacted by lithium mining compared to regular combustion engine vehicles.
While I appreciate your point I’d be surprised if the number of QALYs is higher from working on FSD at Tesla compared to being a doctor.
You're probably right that fatalities per collision is off but plug in your own numbers and then divide by ten... the point still stands. Autopilot has saved countless lives. And it was architected by a very small team, led by this one man. But I'll admit, the comparison with doctors is a little hyperbolic. Although in a few years of growth and global scale, who knows...?
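Since "plug in your own numbers" came up, here's a small sketch of the back-of-envelope arithmetic being argued about. The defaults are just the figures quoted upthread (1 collision in 366 per 1000 miles, 1 fatality in 20000 collisions, a 10x reduction), which other commenters dispute, so treat them as placeholders. The function names are made up, and reading "divide by ten" as "fatalities drop to a tenth of baseline" is my assumption. It also ignores the selection-bias objection entirely.

```python
def expected_fatalities(
    total_miles: float,
    collisions_per_1000_miles: float = 1 / 366,   # figure quoted upthread, disputed
    fatalities_per_collision: float = 1 / 20000,  # figure quoted upthread, disputed
) -> float:
    """Baseline expected fatalities over `total_miles` of driving."""
    return total_miles / 1000 * collisions_per_1000_miles * fatalities_per_collision


def fatalities_avoided(
    total_miles: float,
    risk_reduction_factor: float = 10.0,  # the claimed "10x less likely", unverified
    **rates: float,
) -> float:
    """Baseline fatalities minus fatalities at the reduced risk level."""
    baseline = expected_fatalities(total_miles, **rates)
    return baseline - baseline / risk_reduction_factor
```

The result scales linearly with every input, so swapping in a different fatality rate or fleet mileage moves the answer by exactly that ratio. That's why the disagreement above is really about which rates are credible, not about the multiplication.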
[1] Prompter Hawk: https://prompterhawk.dev/