
I’d rename this “Show HN: The ChatGPT Canvas missing functionality in 10 minutes of JavaScript”


Careful what you wish for. There are likely billionaires considering this very thing as a method of dealing with “the AI impact”.

They will define what “…doing social or creative work…” entails, likely contractually, and then you’re right back where you started.

I think we need to rethink where this basic income originates.

    - Philanthropic individual billionaires?
        - Mythical creatures.
    - Philanthropic trillionaires (aka: large govts or corps)?
        - Mythical creatures.
    - Collective individuals (aka: you and me)?
        - Now you’re on to something.
Unfortunately, organizing humans is right up there with trench digging in terms of easy work.


I’ve been attempting to deploy a customized AnythingLLM instance within an enterprise environment. TimC (and presumably the dev crew) are top-notch and very responsive.

Waiting on EntraID integration. Once that lands, a customized version of AnythingLLM can tick the boxes for most of an org’s lowest-hanging use cases.

Thanks for the killer app TimC and crew!


Reflect on where I am in life, where I’ve been, where I’d like to go.

Write (paper and pen), draw diagrams, sometimes sketch.

If sick, watch fav movies and only do email via phone.


I’m not a “Tech Lead”, but as a Sr. who leads emerging GenAI tech at Corpo, here’s my personal approach:

1. Support the org’s journey - This could be everything from access requests to ad hoc governance queries.

2. Experiment independently - This is critical. If you don’t have the elbow room to explore, you’re likely taking directions from some turd.

3. Experiment with business stakeholders - This is how you learn what the biz (thinks it) wants. Also critical.

4. (Internally) Open-source our team’s findings - Try to generalize the technical components to a degree where others can utilize them.

5. Consult with AppDev teams - Sharing best practices and methods, and pointing them to the open-source components as needed.

6. Build community - We host bi-weekly Task Force Meetups focused on tech-minded business stakeholders. On the off-weeks, we host GenAI Developers Meetups (focused on the tech itself) and GenAI Office Hours.

Sprinkled in there are white-glove training and general consulting.


@YCombinator: let’s go with some simple markdown parsing, eh!?



The first thing I think of when anyone mentions agent-like “tool use” is:

- Is the environment that the tools are run from sandboxed?

I’m unclear on when/how/why you’d want an LLM executing code on your machine or in a non-sandboxed environment.

Anyone care to enlighten?


The LLM just returns a method name and arguments to pass to it. Your code is in charge of actually executing it and then replying with the answer.
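
To make that concrete, here is a minimal sketch of that dispatch loop in Python. The tool names and the shape of the tool_call dict are hypothetical stand-ins for whatever your LLM API actually returns; the division of labor is the point: the model proposes, your code disposes.

    import json

    # Hypothetical registry of tools the application exposes to the model.
    TOOLS = {
        "get_weather": lambda city: f"Sunny in {city}",  # toy implementation
        "add": lambda a, b: a + b,
    }

    def handle_tool_call(tool_call: dict) -> str:
        """Run a tool the model *proposed*; the model executes nothing itself."""
        func = TOOLS.get(tool_call["name"])
        if func is None:
            return f"Unknown tool: {tool_call['name']}"
        args = json.loads(tool_call["arguments"])  # arguments arrive as JSON text
        result = func(**args)
        # This string goes back to the model as a new message so it can
        # fold the result into its final answer.
        return str(result)

    print(handle_tool_call({"name": "add", "arguments": '{"a": 2, "b": 3}'}))  # 5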


Well, often the code at the end of the day just reads data from a database, or processes it in some way that relies on moving bits around: operations that the LLM on its own cannot do.

IMO "tool" is a bad word for the majority of the use cases ("calculator", "weather API"). It's more like giving the LLM an old-school calculator plus a constrained data retriever.

Because you, or somebody you trust, knows every line of code in the functions ultimately called (at a high-ish level), you can do it and know it is only really retrieving data, not taking arbitrary action.

Now, letting it rampantly run an arbitrary Python process, etc., would be different; I suppose that fits in too. But I think this is largely NOT how people are using tools, since if you do that, how do you ever usefully capture the output of running it and apply that output?


It's "function calling" that's the even worst naming IMHO, as the point is that the LLM is not actually calling a function, but just proposes a function call... Who will out themselves as having come up with this confusion?


You can use it to feed extra context in, similar to RAG but allowing the LLM to "decide" what information it needs. I think it's mostly useful in situations where you want to add content that isn't semantically related, and wouldn't RAG well.

E.g. if I were making an AI that could suggest restaurants, I could just say "find a Mexican restaurant that makes Horchata", have it translate that to a tool call to get a list of restaurants and their menus, and then run inference on that list.
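
For illustration, here is a hedged sketch of what that restaurant tool's definition might look like, in roughly the JSON-schema shape OpenAI-style chat APIs expect; the name find_restaurants and its parameters are invented for this example:

    # Hypothetical tool definition; given "find a Mexican restaurant that makes
    # Horchata", the model can propose a call like
    # find_restaurants(cuisine="Mexican", menu_item="Horchata").
    find_restaurants_tool = {
        "type": "function",
        "function": {
            "name": "find_restaurants",
            "description": "Return restaurants matching a cuisine, with menus.",
            "parameters": {
                "type": "object",
                "properties": {
                    "cuisine": {"type": "string", "description": "e.g. Mexican"},
                    "menu_item": {
                        "type": "string",
                        "description": "Optional dish to filter on, e.g. Horchata",
                    },
                },
                "required": ["cuisine"],
            },
        },
    }

Your code runs the actual search and feeds the resulting list back to the model for the final recommendation.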

I also tinkered with a Magic: The Gathering AI that used tool calling to get the text and rulings for cards so that I could ask it rules questions (it worked poorly). It saves the user from having to remember some kind of markup to denote card names so I can pre-process the query.


> Is the environment that the tools are run from sandboxed?

It is up to the person who implements the tool to sandbox it as appropriate.

> I’m unclear on when/how/why you’d want an LLM executing code on your machine or in a non-sandboxed environment.

The LLM does not execute code on your computer. It just signals that it would like to execute a tool with certain parameters. You should trust those parameters as much as you trust the prompt and the LLM itself, which in practice probably ends up being "not much".

The good news is that in your tool implementation you can (and should) apply all the appropriate checks using regular coding practices. This is nothing new; we do this all the time with web requests. You can check whether the prompt originates from an authenticated user and whether they have the necessary permissions for the action they are about to take. You can throttle the requests, check that the inputs are appropriate, etc.

If the tool is side-effect free and there are no access restrictions, you can just run it straight away. For example, imagine an LLM which can turn the household name of a plant into its Latin name. You would have a "look_up_latin_name" tool which searches a local database. You have to make sure to follow best practices to avoid an SQL injection attack, but otherwise this should be easy.
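
A minimal sketch of such a tool in Python, assuming a local SQLite database with a plants(household_name, latin_name) table; the parameterized query keeps the LLM-supplied argument out of the SQL text:

    import sqlite3

    def look_up_latin_name(household_name: str) -> str | None:
        """Side-effect-free tool: map a household plant name to its Latin name."""
        conn = sqlite3.connect("plants.db")  # hypothetical local database
        try:
            row = conn.execute(
                # The ? placeholder prevents SQL injection even if the model
                # hands us a hostile string.
                "SELECT latin_name FROM plants WHERE household_name = ?",
                (household_name,),
            ).fetchone()
            return row[0] if row else None
        finally:
            conn.close()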

Now imagine a more sensitive situation: a tool with difficult-to-undo side effects and strict access controls. For example, launching an ICBM attack. You would create a "launch_nukes" tool, but the tool wouldn't just launch willy-nilly. First of all, it would check that the prompt arrived directly from the president (how you do that is best discussed with your NSA rep in person). Then it would check that the parameter is one of the valid targets. But that is not enough yet. You want to make sure it is not the LLM hallucinating the action, so you would pop up a prompt directly in the UI to confirm the action, something like "Looks like you want to destroy <target>. Do you want to proceed? <yes> <no>", and only launch when the president clicks yes.
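
The same pattern as a Python sketch. Everything here is hypothetical (the allow-list, the requester check, the confirm_ui callable that shows a yes/no prompt to a human), but the point stands: every safeguard lives in your code, not in the model.

    VALID_TARGETS = {"test-range-1", "test-range-2"}  # hypothetical allow-list

    def launch_nukes(target: str, requester: str, confirm_ui) -> str:
        """High-stakes tool: authenticate, validate, then require human confirmation."""
        if requester != "president":        # authn/authz, however you implement it
            return "Refused: insufficient authority."
        if target not in VALID_TARGETS:     # validate the model-supplied argument
            return f"Refused: {target!r} is not a valid target."
        # Guard against a hallucinated call: a human must click yes.
        if not confirm_ui(f"Looks like you want to destroy {target}. Proceed?"):
            return "Cancelled by user."
        return f"Launching at {target}."    # the irreversible side effect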


It's up to the implementation to determine what running a tool actually means: "tool-use" means you can tell the LLM "you have these functions which take these options", and then it can output a magic stanza asking the code conversing with the LLM to invoke one of those functions with the given parameters.

You COULD do dangerous things, but it's not like the LLM is constructing code it runs on its own.


The given examples like checking weather or performing a nice clean mathematical operation seem more or less automatically safe. On the other hand, they talk about the ability to drive a web browser, which is decidedly less read-only and would also make me nervous.


Huge deal for humans moving forward. ~5-8 yrs too late for me (2-5 yrs for product dev, 3 yrs since a hemorrhagic stroke that was likely caused by seriously high blood pressure).

I was 40 at the time and had never measured my blood pressure (and certainly never while exercising). After the event I measured it all the time. The eighth time I sat in a chair and rolled up my sleeve, I thought: the Apple Watch has a BP sensor, right?

That question sent me on a quest, only to find that humans had not yet figured out a way to measure blood pressure on the go.

Congratulations on this effort!


Also curious about the data source.


NHTSA FARS (Fatality Analysis Reporting System)


This site is full of ads and typos. There has to be a more authoritative source.


Seems my ad blocker is doing its job; I saw like a dozen diagrams and no ads.

Maybe try https://www.weatherzone.com.au/news/sudden-stratospheric-war...


I use the latest Firefox (115.13.0 ESR) on a Win7 Pro desktop with uBlock Origin. I don't see any ads at all. The text appears to adequately describe the diagrams referenced, and I found it readable. They make a forecast based on their observations and support it using analogies from recent historical data.

Basically, this could spin up a situation like we had here in Texas back in 2021 where we had a long-duration cold spell. From experience, it was not as fun as that sounds.

Or maybe I'm tired and it only seems to make sense because I'm not thinking clearly at this late hour.


It's a very bizarre site, with the ads and rambling text, but I've always found that it has very good info on things like polar vortex formation/breakdown and summer heatwaves, and pretty accurate longer term (up to a few months ahead, say) forecasts. You never feel like linking any of the articles to anyone though, because it feels like a spam blog. Very strange.


  > I've always found that it has very good info
You've heard of this site before? Or are you affiliated with it?


It's been linked on HN before


Yeah the ads were too much...


Found the folk who don't use adblockers.


Or who do not use reader mode, which works perfectly here.


Nice work Simon!

