Claude is extremely poor at vision compared to Gemini and ChatGPT. I think Anthropic severely overfit their evals to coding/text use cases. Maybe naively adding browser use would help, but I'm a bit skeptical.
I have a completely different experience. Pasting a screenshot into CC is my de facto go-to, and more often than not it leads to CC understanding what needs to be done.
The 1 million robot number that Amazon keeps using needs some nuance. It includes roughly 800K robots that just move stuff around in a 2D plane. I think the number of robots that actually manipulate things is far, far lower (probably less than 500), though, to be fair, no human really wants to just move things from A to B either.
Also, I completely agree with what you said. Cars (with no self-driving) can be thought of as primitive robots, just like the robots of today. For better or worse, we will keep moving towards more and more automation.
The simple Kiva mobile platforms make up most of the robot count, but they replaced large numbers of people who used to walk around warehouses moving stuff from A to B.
Ohh wow, that's bad. I just tried this with Gemini 2.5 Flash/Pro and it worked perfectly -- I'd assume all frontier models get this right (even simpler models should).
I'd be willing to bet a clearer prompt would've given a good answer. People tend to overlook the fact that these AIs aren't Google: they're not doing a pure keyword search, and they work best when you give them sensible sentence structure.
Maybe, but this sort of prompt structure doesn't bamboozle the better models at all. If anything, they're quite good at guessing what you mean even when your sentence structure is crap. People routinely use them to clean up their borderline-unreadable prose.
I'm all about clear prompting, but even using the verbatim prompt from the OP, "ffmpeg command to convert movie.mov into a reasonably sized mp4", the smallest current models from Google and OpenAI (gemini-2.5-flash-lite and gpt-4.1-nano) both produced a working command for me, with explanations of what each CLI arg does.
Hell, the Q4-quantized Mistral Small 3.1 model that runs on my 16GB desktop GPU handled it perfectly as well. All three tests produced a command using x264 with crf 23 that worked without edits, took a random .mov I had from 75 MB to 51 MB, and included an explanation of how to adjust the compression to make it smaller.
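For reference, all three gave something along these lines (the libx264 + crf 23 part is what they actually produced; the audio flag below is just my representative guess, not a verbatim quote of any model):

  ffmpeg -i movie.mov -c:v libx264 -crf 23 -c:a aac output.mp4
  # raise -crf (e.g. to 26-28) for a smaller file at lower quality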
There's as much variability among LLMs as there is in human intelligence. What I'm saying is that I bet if that guy wrote a better prompt, his "failing LLM" would be much more likely to stop failing, unless it's just completely incompetent.
What I always find hilarious, too, is when the AI skeptics try to parlay these kinds of "failures" into evidence that LLMs cannot reason. Of course they can reason.
Less clarity in a prompt _never_ results in better outputs. If the LLM has to "figure out" what your prompt even means, it's already wasted a lot of computation going down irrelevant branches that could've been spent solving the actual problem.
Sure, you can get creative, interesting results from something totally unclear like "dog park game run fun time", but if you're solving an actual problem that has an actual optimal answer, then clarity is _always_ better. The more info you supply about what you're doing, how, and even why, the better the results you'll get.
I disagree. Less clarity gives them more freedom to choose and use the practices they're better trained on, instead of being artificially restricted by constraints that might not be necessary.
The more info you give the AI, the more likely it is to apply the practices it was trained on to _your_ situation, as opposed to stereotypical situations that don't apply.
LLMs are like humans in this regard: you never get a human to follow instructions better by omitting parts of the instructions. Even if you just want the LLM to be creative and explore random ideas, you're _still_ better off _telling_ it that. lol.
Not true, and the trick to getting better results is letting go of that incorrect assumption. If a human is an expert in JavaScript and you tell them to use Rust for a task that could be done in JavaScript, the results will be worse than if you'd just let them use what they know.
The only way that analogy remotely maps onto reality in the world of LLMs would be a `Mixture of Experts`-style system where smaller LLMs have each been trained on a specific area like math or chemistry, and a router step before inference selects which model to send the prompt to. If such a system had a bug and routed to the wrong 'expert', then quality would drop.
However, _even_ in an MoE system, you _still_ get better outputs when your prompt is clear and includes as much relevant detail as you have. Models never do better because they're unconstrained, as you mistakenly believe.