More

mhog_hn · 2025-07-20T12:05:44 1753013144

An agent is an LLM + a tool call loop - it is quite a step up in terms of value in my experience

jsemrau · 2025-07-20T13:32:10 1753018330

Agents are more than that.

Agents, besides tool use, also have memory, can plan work towards a goal, and can, through an iterative process (Reflect - Act), validate if they are on the right track.

ivape · 2025-07-20T14:45:24 1753022724

If an agent takes a Topic A and goes down a rabbit hole all the way to Topic Z, you'll see that it won't be able to incorporate or backtrack back to Topic A without losing a lot of detail from the trek down to Topic Z. It's a serious limitation right now from the application development side of things, but I'm just reiterating what the article pointed out, which is that you need to work with fewer step workflows that isn't as ambitious as covering all things from A-Z.

jsemrau · 2025-07-20T15:58:39 1753027119

Yes, that's commonly referred to as the Exploration-Exploitation Dilemma. Should the agent go deep or wide?

https://en.wikipedia.org/wiki/Exploration%E2%80%93exploitati...

infecto · 2025-07-20T12:14:50 1753013690

Not a disagreement with you but wanted to further clarify.

I do think it’s a step up when done correctly. Thinking of tools like Cursor. Most of my concern and issue comes from the amount of folks I have seen trying to great a system that solves everything. I know in my org people were working on Agents without even a problem they were solving for. They are effectively trying to recreate ChatGPT which to me is a fools errand.

ethbr1 · 2025-07-20T14:21:46 1753021306

I’d boil it down thusly:

What do agents provide? Asynchronous work output, decoupled from human time.

That’s super valuable in a lot of use cases! Especially because it’s a prerequisite for parallelizing “AI” use (1 human : many AI).

But the key insight from TFA (which I 100% agree with) is that the tyranny of sub-100% reliability compounded across multiple independent steps is brutal.

Practical agent folks should be engineering risk / reliability, instead of happy path.

And there are patterns and approaches to do that (bounded inputs, pre-classification into workable / not-workable, human in the loop), but many teams aren’t looking at the right problem (risk/reliability) and therefore aren’t architecting to those methods.

And there’s fundamentally no way to compose 2 sequential 99% reliable steps into a 99% reliable system with a risk-naive approach.

johnisgood · 2025-07-20T12:09:57 1753013397

What is the use case? What does it solve exactly, or what practical value does it give you? I am not sure what a tool call loop is.

queenkjuul · 2025-07-20T12:58:33 1753016313

An example:

I updated a svelte component at work, and while i could test it in the browser and see it worked fine, the existing unit test suddenly started failing. I spent about an hour trying to figure out why the results logged in the test didn't match the results in the browser.

I got frustrated, gave in and asked Claude Code, an AI agent. The tool call loop is something like: it reads my code, then looks up the documentation, then proposed a change to the test which i approve, then it re-runs the test, feeds the output back into the AI, re-checks the documentation, and then proposes another change.

It's all quite impressive, or it would be if at one point it didn't randomly say "we fixed it! The first element is now active" -- except it wasn't, Claude thought the first element was element [1], when of course the first element in an array is [0]. The test hadn't even actually passed.

An hour and a few thousand Claude tokens my company paid for and got nothing back for lol.

apwell23 · 2025-07-20T14:04:29 1753020269

any examples outside of coding agents ?

Even in this example coding agent is short lived . I am curious about continuously running agents that are never done.

dceddia · 2025-07-20T17:10:15 1753031415

A friend of mine set up a cron job coupled with the Claude API to process his email inbox every 30 minutes and unsubscribe/archive/delete as necessary. It could also be expanded to draft replies (I forget if his does this) and even send them, if you’re feeling lucky. I’m pretty sure the AI (I’m guessing Claude Code in this case) wrote most or all of the code for the script that does the interaction with the email API.

An example of my own, not agentic or running in a loop, but might be an interesting example of a use case for this stuff: I had a CSV file of old coupon codes I needed to process. Everything would start in limbo, uncategorized. Then I wanted to be able to search for some common substrings and delete them, search for other common substrings and keep them. I described what I wanted to do with Claude 3.7 and it built out a ruby script that gave me an interactive menu of commands like search to select/show all/delete selected/keep selected. It was an awesome little throwaway script that would’ve taken me embarrassingly long to write, or I could’ve done it all by hand in Excel or at the command line with grep and stuff, but I think it would’ve taken longer.

Honestly one of the hard things about using AI for me is remembering to try to use it, or coming up with interesting things to try. Building up that new pattern recognition.

queenkjuul · 2025-07-20T14:14:58 1753020898

No, the fact Claude couldn't remember that JavaScript is zero-indexed for more than 20 minutes has not left me interested in letting it take on bigger tasks

kro · 2025-07-20T12:15:04 1753013704

The tools can be an editor/terminal/dev environment, automatically iterating to testing the changes and refining until a finished product, without a human developer, at least that is what some wish of it.

johnisgood · 2025-07-20T13:04:43 1753016683

Oh, okay, I understand it now, especially with the other comment that said Cursor is one. OK, makes sense. Seems like it "just" reduces friction (quite a lot).

csande17 · 2025-07-20T13:13:25 1753017205

Yeah, it's really just a user experience improvement. In particular, it makes AI look a lot better if it can internally retry a bunch of times until it comes up with valid code or whatever, instead of you having to see each error and prompt it to fix it. (Also, sometimes they can do fancy sampling tricks to force the AI to produce a syntactically valid result the first time. Mostly this is just used for simple JSON schemas though.)

johnisgood · 2025-07-20T14:07:24 1753020444

Thank you, that is what my initial thought was. I am still doing things the old-fashioned way, thankfully it has worked out for me (and learned a lot in the process), but perhaps this AI agent thing might speed things up a bit. :D Although then I will learn much less.

ghuntley · 2025-07-20T12:12:35 1753013555

> I am not sure what a tool call loop is.

See https://ampcode.com/how-to-build-an-agent

holler · 2025-07-20T17:47:10 1753033630

that was a great read, thanks! - agentic noob

infecto · 2025-07-20T12:16:15 1753013775

Cursor is my classic example. I don’t know exactly what tools are defined in their loop but you give the agent some code to write. It may search your code base, it may then search online for third party library docs. Then come back and write some code etc.

jsemrau · 2025-07-20T13:32:42 1753018362

If it were only tool use, then it would be the same as a lambda function.

mhog_hn · 2025-07-16T19:45:05 1752695105

Agents with their agent money - get ready for new legal structures and a bifurcation of the economy: agentic and human.

Who knows…

TeMPOraL · 2025-07-16T20:08:30 1752696510

A separate, self-contained economy.

A Disneyland with no children.

Moloch.

debarshri · 2025-07-16T21:34:38 1752701678

May be agents will reproduce small models

debarshri · 2025-07-16T21:34:09 1752701649

Agentic democracy

mhog_hn · 2025-07-16T11:58:27 1752667107

Anyone tried setting up a modestly sized tech company where employees are randomly placed into various seniority roles at the start of each year? Of course considering capabilities and some business continuity concerns…

Could work with a bunch of similarly skilled people in a narrow niche

mhog_hn · 2025-07-14T21:27:57 1752528477

Man, something something Data now being reality something something LLMs

mhog_hn · 2025-07-06T09:14:04 1751793244

What meta structure allows us to get rid of rent seeking behavior?

I don’t have an answer - is there scientific research on this?

Taxation? Loopholes will be found.

Lawfare against it? Lobbying will win.

I am amazed by capitalism, but at the same time it is a ruthless machine - and in democratic countries it is highly unlikely that a single political party can force the machine into a new direction. Perhaps that is a very nice feature, at the cost of also having to tolerate rent seeking, but it sure as hell sucks to see these downsides.

lmm · 2025-07-06T11:45:15 1751802315

Social cohesion. People are happy to rip off an outsider, a stranger, a schmuck. But people within a high-trust social group generally don't rip each other off - you still need to be on the look out for fraudsters, but you won't be doing it systematically and virtually openly.

It's not a coincidence that all this has happened as the US' national identity has gotten weaker and weaker. They're shifting from a cohesive nation to one of those "it's a single block on the map but it's actually 200 tribes who all hate each other" countries, and people's values and behaviour are shifting to match.

max_ · 2025-07-06T09:35:55 1751794555

The bitter truth is that Nirvana doesn't exist.

There is no perfect system. But we can choose the least detrimental.

poorlyknit · 2025-07-06T09:56:11 1751795771

What can we do to make rent-seeking hurt society less? Imo we should start by decoupling money from power. Right now, people are forced to participate in the rent-seekers game because his wealth implies power over them.

mhog_hn · 2025-06-03T12:58:58 1748955538

As agents obtain more tools who knows what will happen…

kordlessagain · 2025-06-03T13:07:07 1748956027

I think this is the key that most people don't realize is what makes the difference between something sitting around and talking (like a parrot does) and actually "doing" things (like a monkey does).

There is a huge difference in the mess it can make, for sure.

nisegami · 2025-06-03T13:07:48 1748956068

I'm so excited. I don't have any particular end state in mind, but I really want to see what the machine god will be like.

lucianbr · 2025-06-03T13:11:16 1748956276

> Machine god

Slightly overreacting, I'd say.

bix6 · 2025-06-03T13:10:18 1748956218

Hungry for bits!

zdragnar · 2025-06-03T13:35:02 1748957702

Probably one part skynet, one part matrix, 98 parts cat memes and shit posts.

Kelteseth · 2025-06-03T13:03:05 1748955785

It's like we _want_ to end like Terminator (/s?)

mhog_hn · 2025-06-01T09:51:09 1748771469

Find coworking spaces targeted at digital nomads. Pick spacious airbnb apartments - important to have this as a backup for calls when the coworking space does not work out. Stay somewhere for at least one month, ideally 3 months. Japan is very nice. Portugal is alright. More rural places can be nice. Avoid most coffee shops.. Time zone differences can be very painful.

mhog_hn · 2025-03-29T09:16:07 1743239767

Wealth sitting inside pension funds?

consp · 2025-03-29T09:17:40 1743239860

Looking at the amounts still lost and reported that is not the problem. There is plenty to go around directly available.

My best guess is English proficiency. While most people here speak it, they will never talk to a "bank" employee in English. Which is probably why the Microsoft scam does work as people expect some non native language there.

mhog_hn · 2025-03-09T14:13:41 1741529621

What is the long term effect on accuracy when a large mass simply moves XX% of their monthly paycheck into top ETFs?

I don’t disagree with what you said though, simply sharing thoughts - I think it leads to markets being more sensitive to macro strategies rather than actual fundamentals

gruez · 2025-03-09T15:01:14 1741532474

>What is the long term effect on accuracy when a large mass simply moves XX% of their monthly paycheck into top ETFs?

Such investments are typically market cap weighted, which means their effect on stock prices are neutral. Moreover there's still room for hedge funds (and other sophisticated) investors to engage in price discovery.

mhog_hn · 2025-03-09T11:51:34 1741521094

Good, can we skip the 5th gen and move towards autonomous aerial systems faster?