Hacker News | killerstorm's comments

I think the "internet" needs a shared reputation & identity layer, i.e. if somebody offers a comment/review/contribution/etc., it should be easy to check what else they are contributing, who can vouch for them, etc.

Most innovation came from web startups that are just not interested in "shared" anything: they want to be a monopoly, "own" users, etc. So this area has been neglected, and then people got used to the status quo.

PGP / GPG used to have a web of trust (WoT), but that sort of just died.

People either need to resurrect WoT, updated for the modern era, or just accept the fact that everything is spammed into smithereens. Blaming AI and social media does not help.
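
To make the "who can vouch for them" check concrete, here is a minimal sketch of a web-of-trust lookup. It is only an illustration under stated assumptions: the `vouches` graph, the identity names, and the `max_hops` cutoff are all hypothetical, and a real system would verify signatures on each vouch rather than trusting a plain dictionary.

```python
from collections import deque


def trust_distance(vouches, me, target, max_hops=3):
    """Return the number of vouch hops from `me` to `target`, or None.

    `vouches` maps an identity to the set of identities it vouches for.
    This is a plain breadth-first search; signature checks are omitted.
    """
    if me == target:
        return 0
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        who, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look further than max_hops away
        for vouched in vouches.get(who, ()):
            if vouched == target:
                return hops + 1
            if vouched not in seen:
                seen.add(vouched)
                queue.append((vouched, hops + 1))
    return None


# Hypothetical vouch graph, for illustration only.
vouches = {
    "alice": {"bob"},
    "bob": {"carol"},
    "carol": {"dave"},
    "mallory": {"mallory2"},
}

print(trust_distance(vouches, "alice", "carol"))     # 2
print(trust_distance(vouches, "alice", "mallory2"))  # None
```

A spam account that only other spam accounts vouch for stays unreachable from your own starting point, which is roughly the property a shared reputation layer would need.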


Well, obviously, `npm` has the same destructive power: a package might include a script which steals secrets or wipes a hard drive. But people just assume that they usually don't.
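
As a rough illustration of where such a script would live: npm runs the `preinstall`, `install` and `postinstall` entries from a package's `scripts` section during installation. The sketch below just lists those entries from a manifest; the sample manifest and the `install_hooks` helper are made up for the example.

```python
import json

# Lifecycle scripts that npm runs when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text):
    """Return the install-time scripts declared in a package.json string."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}


# Hypothetical manifest, for illustration only.
sample = json.dumps({
    "name": "some-package",
    "version": "1.0.0",
    "scripts": {
        "test": "node test.js",
        "postinstall": "node setup.js",
    },
})

print(install_hooks(sample))  # {'postinstall': 'node setup.js'}
```

Running `npm install --ignore-scripts` skips these hooks, though the package's code can still do anything once you actually import and run it.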

I don't believe this would be more efficient.

Use of common tools like `ls` and file patching is already baked into the model's weights; it can do that with a minimal amount of effort, leaving more room for actually thinking about the app's code.

If you force it to wrap these actions into non-standard tools, you're basically distracting the model: it has to think about app code and tool code in the same context.

In some cases it does make sense to encourage the model to create utilities for itself - but you can do that without enforcing code-only.


It doesn’t matter if it’s less efficient; what matters is that it has more chances to verify and get it right. It’s hard to roll back a series of tool calls. It’s easier to revert state and rerun a complete piece of code until you get the desired result.
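
A minimal sketch of the "revert state and rerun" idea, assuming the state is something you can snapshot cheaply. The `run_with_rollback`, `flaky_step` and `compiles` names are hypothetical, and `flaky_step` stands in for model-generated code that only gets it right on the second try.

```python
import copy


def run_with_rollback(state, step, verify, max_attempts=3):
    """Run `step` on a copy of `state` until `verify` accepts the result.

    Returns (new_state, attempt_number) on success, or (state, None) if
    every attempt failed. The original `state` is never modified.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = copy.deepcopy(state)  # snapshot: reverting is just discarding
        try:
            step(candidate, attempt)
        except Exception:
            continue  # the step crashed; drop the snapshot and retry
        if verify(candidate):
            return candidate, attempt
    return state, None


def flaky_step(files, attempt):
    # Stand-in for generated code: broken on attempt 1, fixed afterwards.
    files["app.py"] = "print('hello')" if attempt >= 2 else "print('hello'"


def compiles(files):
    # Verification here is only "does the file parse as Python".
    try:
        compile(files["app.py"], "app.py", "exec")
        return True
    except SyntaxError:
        return False


original = {"app.py": ""}
result, attempt = run_with_rollback(original, flaky_step, compiles)
print(attempt)             # 2
print(result["app.py"])    # print('hello')
print(original["app.py"])  # still empty
```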

I don't think "efficiency" is at all the point? At all?

It's safety, reliability, and human understanding -- which, like OOP, for example, are often directly at odds with "efficiency."


We should differentiate AI models from AI apps.

Models just generate text. Apps are supposed to make that text useful.

An app can run various kinds of verification. But would you pay extra for that?

Nobody can make a text generator output text which is 100% correct. That's just not a thing people can do now.


True.

Also true that most tech writers are bad. And companies aren't going to spend >$200k/year on a tech writer until they hit tens of millions in revenue. So AI fills the gap.

As a horror story, our docs team didn't understand that having correct installation links should be one of their top priorities. Obviously, if a potential customer can't install the product, they'd assume it's bs and try to find an alternative. It's so much more important than e.g. grammar in the middle of some guide.


Consider a hypothetical scenario: some toxin present in the environment is causing migraine symptoms.

A doctor following diagnostic criteria might assign a "migraine" diagnosis and provide standard recommendations for migraine management.

Another doctor, seeing a quick uptick in patients with migraine symptoms, will try to investigate toxins and infections.

Which doctor is doing something useful here?


OK but why not just go back to Balsamiq and make it 'executable'?

You might believe that a TUI is neutral, but it really isn't - there are a bajillion different ways to make a TUI / CLI.


Weird title. Obviously, early AI agents were clumsy, and we should expect more mature performance in the future.

Leopold Aschenbrenner was talking about "unhobbling" as an ongoing process. That's what we are seeing here. Not unexpected.


This is just a more elaborate form of an escrow contract.

There's absolutely no need to make a new L1 for that: you can use existing smart contract/dapp platforms, plug into existing stablecoin rails, etc.
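
For reference, the core of an escrow contract is a tiny state machine. The sketch below models it in plain Python rather than an actual smart contract language; the roles (`buyer`, `seller`, `arbiter`), method names and rules are assumptions picked for illustration, not a description of any particular platform.

```python
from enum import Enum


class State(Enum):
    AWAITING_DEPOSIT = 1
    FUNDED = 2
    RELEASED = 3
    REFUNDED = 4


class Escrow:
    """Toy escrow: the buyer deposits, then funds go to the seller or back."""

    def __init__(self, buyer, seller, arbiter, amount):
        self.buyer, self.seller, self.arbiter = buyer, seller, arbiter
        self.amount = amount
        self.state = State.AWAITING_DEPOSIT

    def deposit(self, sender, amount):
        if self.state is not State.AWAITING_DEPOSIT:
            raise ValueError("already funded or settled")
        if sender != self.buyer or amount != self.amount:
            raise ValueError("only the buyer can deposit the exact amount")
        self.state = State.FUNDED

    def release(self, sender):
        """Buyer (or arbiter) confirms delivery; returns (recipient, amount)."""
        if self.state is not State.FUNDED:
            raise ValueError("nothing to release")
        if sender not in (self.buyer, self.arbiter):
            raise ValueError("not allowed to release")
        self.state = State.RELEASED
        return self.seller, self.amount

    def refund(self, sender):
        """Seller (or arbiter) cancels the deal; returns (recipient, amount)."""
        if self.state is not State.FUNDED:
            raise ValueError("nothing to refund")
        if sender not in (self.seller, self.arbiter):
            raise ValueError("not allowed to refund")
        self.state = State.REFUNDED
        return self.buyer, self.amount


deal = Escrow("buyer", "seller", "arbiter", 100)
deal.deposit("buyer", 100)
print(deal.release("buyer"))  # ('seller', 100)
```

Anything more elaborate is mostly extra states and extra rules about who may trigger which transition, which is what existing contract platforms already handle.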


Well, somehow, most short-form content on YouTube doesn't have this problem. Perfectly clear dialogue.

I think the main problem is that producers and audio people are stupid, pompous wankers. And I guess it doesn't help that some people go to the cinema for vibrations and don't care about the content.

