
Etymologically speaking, "scapegoat" does mean "escaped goat" so it's not a crazy mistake to make.

Yeah that's a good salary in Europe. It's only slightly less than I make in the UK as a senior.


Ditto. It seems like the graduate wage in the US is 2x my senior salary in the UK, which sounds very similar to yours. It seems massively inflated compared to other US jobs. Tech jobs in the UK seem to be more in line with other sectors.


The point is that you don't need an LLM to pilot the thing, even if you want to integrate an LLM interface to take a request in natural language.


That’s a pretty boring point for what looks like a fun project. Happy to see this project and know I am not the only one thinking about these kinds of applications.


An LLM that can't understand the environment properly can't properly reason about which command to give in response to a user's request. Even if the LLM is a very inefficient way to pilot the thing, being able to pilot it means the LLM has the reasoning abilities required to also translate a user's request into commands that make sense for the more efficient, lower-level piloting subsystem.


We don't need a lot of things, but new tech should also address what people want, not just what they need. I don't know how to pilot drones, nor do I care to learn, but I want to do things with drones. Does that qualify as a need? Tech is there to do the things for us that we're too lazy to do ourselves.


There are two different things:

1. a drone that you can talk to and that flies on its own

2. a drone where the flying is controlled by an LLM

(2) is a specific instance of the larger concept of (1).

You make an argument that (1) should be addressed, which no one is denying in this thread - people are arguing that (2) is a bad way to do (1).


You're considering "talking to" a separate thing; I consider it the same as reading street signs or using object recognition. My voice or text input is just one type of input. Can other ML solutions or algorithms detect a tree (the same as me telling it "there is a tree, yaw to the right")? Yes. Can LLMs detect a tree and determine what course of action to take? Also true. Which is better? I don't know, but I won't be quick to dismiss anyone attempting to use LLMs.


Definitely maybe - but then we are discussing (2), i.e. "what is the right technical solution to solve (1)".

Your previous comment was arguing that (1) is great (which no one denies in this thread, and which is a different discussion, about what products are desirable rather than how to build said product) in answer to someone arguing about (2).


I don't think you understand what an "LLM" is. They're text generators. We've had autopilots since the 1930s that rely on measurable things like PID loops and direct sensor input. You don't need the "language model" part to run an autopilot; that's just silly.
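
For a sense of what "measurable things" means here, a toy altitude-hold loop looks something like this (the gains and the sensor/actuator functions are made up for illustration):

    # Toy PID altitude hold, illustrative only; not a real flight controller.
    # read_altitude() and set_throttle() are placeholders for sensor/actuator I/O.
    KP, KI, KD = 0.8, 0.1, 0.3   # made-up gains
    DT = 0.02                    # 50 Hz loop

    def altitude_hold(target_m, read_altitude, set_throttle, steps=5000):
        integral, prev_error = 0.0, 0.0
        for _ in range(steps):
            error = target_m - read_altitude()
            integral += error * DT
            derivative = (error - prev_error) / DT
            set_throttle(KP * error + KI * integral + KD * derivative)
            prev_error = error

No language anywhere in that loop, just arithmetic over sensor readings.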


You seem to be talking past them and ignoring what they are actually saying.

LLMs are a higher level construct than PID loops. With things like autopilot I can give the controller a command like 'Go from A to B', and chain constructs like this to accomplish a task.

With an LLM I can give the drone/LLM system a complex command that I'd never be able to encode for a controller alone: "Fly a grid over my neighborhood, document the location of and take pictures of every flower garden".

And if an LLM is just a 'text generator' then it's a pretty damned spectacular one, as it can take free-form input and turn it into a set of useful commands.
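
As a rough sketch of the split I mean (llm_complete() is a stand-in for whatever model API you use, and the command vocabulary is invented for the example):

    import json

    # The LLM only plans; the existing autopilot stack does the actual flying.
    SYSTEM = ("Translate the user's request into a JSON list of commands. "
              'Allowed commands: {"cmd": "goto", "lat": ..., "lon": ..., "alt_m": ...}, '
              '{"cmd": "photo"}, {"cmd": "log", "note": "..."}.')

    def plan_mission(request, llm_complete):
        # e.g. "Fly a grid over my neighborhood..." -> list of waypoints and actions
        return json.loads(llm_complete(system=SYSTEM, user=request))

    def execute(plan, autopilot):
        for step in plan:   # the autopilot handles control loops, GPS, obstacle sensors
            autopilot.run(step)

The free-form sentence goes in, a structured plan comes out, and nothing below that layer needs to know language exists.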


They are text generators, and yes, they are pretty good, but that really is all they are: they don't actually learn, and they don't actually think. Every "intelligence" feature from every major AI company relies on semantic trickery and managing context windows. It even says it right on the tin: Large LANGUAGE Model.

Let me put it this way: What OP built is an airplane in which a pilot doesn't have a control stick, but they have a keyboard, and they type commands into the airplane to run it. It's a silly unnecessary step to involve language.

Now what you're describing is a language problem, which is orchestration, and that is more suited to an LLM.


"they don't actually learn"

Give the LLM agent write access to a text file to take notes and it can actually learn. Not really reliable, but some seem to get useful results. They ain't just text generators anymore.
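
Roughly what I mean, as a sketch (llm_complete() is a placeholder for any chat-completion call; the file name and note format are arbitrary):

    NOTES_FILE = "agent_notes.txt"

    def ask_with_memory(question, llm_complete):
        # Feed earlier notes back in, then append whatever the model wants to remember.
        try:
            notes = open(NOTES_FILE).read()
        except FileNotFoundError:
            notes = ""
        reply = llm_complete(
            "Notes from earlier sessions:\n" + notes +
            "\n\nQuestion: " + question +
            "\n\nAnswer, then add a final line starting with NOTE: containing one thing worth remembering."
        )
        answer, _, note = reply.partition("NOTE:")
        if note.strip():
            with open(NOTES_FILE, "a") as f:
                f.write(note.strip() + "\n")
        return answer.strip()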

(but I agree that controlling a plane through a keyboard does not seem the smartest approach)


If that's your definition of learning, my Casio FX has an "ans" feature that "learns" from earlier calculations!!


Can that "ans" variable influence the general way your Casio does future calculations?

I don't think so. But with an AI agent it can.

Sure, they still don't have real understanding, but calling this technology mere text generators in 2026 seems a bit out of the loop.


My confusion, maybe? Is this simulator just flying point A to B? It seems like it's handling collisions while trying to locate the targets and identify them. That seems quite a bit more complex than what you're describing as having been solved since the 1930s.


LLMs can do chat completion, but they don't do only chat completion. There are LLMs for image generation, voice generation, video generation, and possibly more. The drone's camera feeds images to the LLM, which then determines what action to take based on them. It's similar to asking ChatGPT "there is a tree in this picture; if you were operating a drone, what action would you take to avoid a collision?", except that the "there is a tree" part is done by the LLM's image recognition and the system prompt is "recognize objects and avoid collisions". Of course I'm simplifying a lot, but it is essentially generating navigational directions from visual context using image recognition.
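
Stripped down (vision_llm() stands in for any multimodal model call; the action names are invented for the example):

    ACTIONS = ["FORWARD", "YAW_LEFT", "YAW_RIGHT", "CLIMB", "HOVER"]

    SYSTEM = ("You are piloting a drone. Given a camera frame, recognize obstacles "
              "and reply with exactly one of: " + ", ".join(ACTIONS))

    def next_action(frame_jpeg, vision_llm):
        reply = vision_llm(system=SYSTEM, image=frame_jpeg).strip()
        return reply if reply in ACTIONS else "HOVER"   # hover if the output is unusable

    # loop sketch: drone.do(next_action(drone.camera.capture(), vision_llm))

The "there is a tree" step and the "what do I do about it" step happen inside the same model call.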


> There are LLMs for image generation,

That part isn’t handled by an LLM

> voice generation,

That part isn’t handled by an LLM

> video generation

That part isn’t handled by an LLM


Yes, it can be, and often is. Advanced voice mode in ChatGPT and the voice mode in Gemini are LLMs. So is the image generation in both ChatGPT and Gemini (Nano Banana).


What is it handled by, then? I'm honestly curious; there are models specifically labeled for those tasks.


"You don't need the "language model" part to run an autopilot, that's just silly."

I think most of us understood that reproducing what existing autopilots can do was not the goal. My inexpensive DJI quadcopter has impressive abilities in this area as well. But I cannot give it a mission in natural language and expect it to execute it. Not even close.


Says right underneath:

> Beyond adjusting parameters, phase8 invites physical interaction. Sculpt sound by touching, plucking, strumming, or tapping the resonators – or experiment by adding found objects for new textures.

Like prepared piano.


Yes, but it's published there under a restrictive license which doesn't allow sharing of derivative works.


To be fair, it didn't originally support it; a firmware update that enabled BTLE connections came out some time after release.


It's the only controller I use (bar the Steam Deck's built-in controls) despite owning plenty of other conventional controllers. Once you get used to it and make use of Steam Input's per-game customisation and mapping, it works really well, especially if you treat it as a mouse-like input rather than a conventional gamepad.

The only place it suffers for me is games that aren't coded to support simultaneous gamepad and mouse input, which you can work around by mapping the joystick as a keyboard input. Otherwise it's great.


The page shows, near the bottom, that the main output is Gaussian splats, but it can also generate triangular meshes (visual mesh + collider).

However, to my eye, the triangular meshes shown look pretty low quality compared to the splat: compare the triangulated books on the shelves, and the wooden chair by the door, as well as weird hole-like defects in the blanket by the fireplace.

It's also not clear if it's generating one mesh for the entire world; it looks like it is. That would make interactability and optimisation more difficult (no frustum culling etc., though you could feasibly chop the mesh up into smaller pieces, I suppose).


Seems to be referencing this Forbes article, specifically talking about The Wonderful Company:

https://www.forbes.com/sites/michelatindera/2021/11/21/how-m...


I suspected this was about a single case. A typical pistachio farm doesn't need anywhere near that kind of water.

Here we’d be talking about “the corporate name” in pistachios. That’s a very different thing from what “a” typical pistachio farmer needs. I submit the original comment is cherry-picking to make their argument.

A pistachio tree needs something like (on average) 60 gallons of water a day.

For one 365-day year that is 21,900 gallons/tree/year.

That’s a lot of water, of course. But most farms do not have in excess of 5.39 million pistachio trees (using the 130 billion gallon number).


WhoBird is much better for this, as it's realtime and processed on-device. It stores a log of each bird identified along with the percentage match, and can optionally store a clip of the audio that triggered the match.


Is WhoBird Android-only? Not finding it in the iPhone App Store?


Yes, it is an Android version of BirdNET, as stated on the GitHub page: https://github.com/woheller69/whoBIRD

