Hacker News | mromanuk's comments

Probably another case of trying to measure something difficult: people tend to substitute an easier or more accessible question for the hard one. Checking whether a person can work under pressure, and sensing their emotions and ability to deliver, is easier to assess. This pattern comes straight from Thinking, Fast and Slow.

According to the Runpod pricing page, you can run an H100 for $2.39/h, so the whole run could go as low as $528,629.76.

WARNING: this is highly speculative napkin math.

H200 (141 GB HBM3, $3.99/h, ~1.4x perf): 216 cards x 24 h x 17 days = 88,128 GPU-hours = $351,630.72

B200 (192 GB HBM3e, $5.99/h, ~2.8x perf): 158 cards x 24 h x 9 days = 34,128 GPU-hours = $204,426.72

The math is probably wrong, and the real thing should be more efficient and cheaper. I also doubt they have 100-200 cards available for that long.
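
Spelling the same napkin math out as a quick script (hourly rates are the Runpod ones quoted above; the card and day counts are my own guesses, nothing official):

  # Napkin math only: rates from the Runpod pricing page, card/day counts guessed.
  def run_cost(cards: int, days: int, usd_per_hour: float) -> float:
      gpu_hours = cards * 24 * days
      return gpu_hours * usd_per_hour

  h100_hours = 528_629.76 / 2.39      # ~221,184 H100-hours implied by the quoted total
  h200 = run_cost(216, 17, 3.99)      # -> $351,630.72 for 88,128 GPU-hours
  b200 = run_cost(158, 9, 5.99)       # -> $204,426.72 for 34,128 GPU-hours
  print(f"{h100_hours:,.0f} H100-hours | H200 ${h200:,.2f} | B200 ${b200:,.2f}")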

Source: I've only trained using RTX4090 and stuff like that with 8 cards.

Not affiliated in any way with Runpod.


I definitely like the LLM in the middle; it's a nice way to circumvent the SEO machine and the way Google has shaped writing in recent years. Removing all the cruft from a recipe is a brilliant case for an LLM. And I suspect more of this is coming: LLMs as filters. I mean, it would be nice to just read the recipe from the HTML, but SEO has turned everything into an arms race.


> Removing all the cruft from a recipe is a brilliant case for an LLM

Is it though, when the LLM might mutate the recipe unpredictably? I can't believe people trust probabilistic software for cases that cannot tolerate error.


I agree with you in general, but recipes are not a case where precision matters. I sometimes ask LLMs to give me a recipe, and if they hallucinate something it will simply taste bad. Not much different from a human-written recipe where the human has drastically different tastes than I do. Also, you basically never apply a recipe blindly; you have intuition from years of cooking to know you need to adjust recipes to taste.


Hard disagree. I don’t have “years of cooking” experience to draw from necessarily. If I’m looking up a recipe it’s because I’m out of my comfort zone, and if the LLM version of the recipe says to add 1/2 cup of paprika I’m not gonna intuitively know that the right amount was actually 1 teaspoon. Well, at least until I eat the dish and realize it’s total garbage.

Also, like, forget amounts: cook times are super important and not always intuitive. If you screw them up, you have to throw out all your work and order takeout.


All I'm arguing is that you should have the intuition to know the difference between 1/2 cup of paprika and a teaspoon. Okay, maybe if you just graduated from college and haven't cooked much you could make such a mistake, but realistically, outside the tech bubble of HN, you won't find people confusing 1/2 cup with a teaspoon. It's just intuitively wrong. An entire bottle of paprika I recently bought holds only 60 grams.

And yes, cook times are important, but no, even for a human-written recipe you need the intuition to make adjustments. A recipe might be written assuming a powerful gas burner while you have a cheap, underpowered electric one. Or the recipe asks for a convection oven but yours doesn't have the feature. Or the recipe presumes a 1100 W microwave but you have a 1600 W one. You stand by the food while it cooks. You use a food thermometer if needed.


Huh? You don't care if an LLM switches pounds to kilograms because... recipes might taste bad anyway????


Switching pounds for kilograms puts you off by roughly a factor of two. Most people capable of cooking should have the intuition to know something is awfully wrong if you are off by a factor of two, especially since pounds and kilograms are fairly large units when it comes to cooking.


Not really an apt comparison.

For one, an AI-generated recipe could be something that no human could possibly like, whereas the human recipe comes with at least one recommendation (assuming good faith from the source, which you're doing anyway, LLM or not).

Also an LLM may generate things that are downright inedible or even toxic, though the latter is probably unlikely even if possible.

I personally would never want to spend roughly an hour making bad food from a hallucinated recipe, wasting my ingredients in the process, when I could have spent at most two extra minutes scrolling down to find the recommended recipe and avoided those issues. But to each their own, I guess.


There is a well-defined solution to this. Provide your recipes as a Recipe schema: https://schema.org/Recipe

Seems like most of the usual food blog plugins use it, because it allows search engines to report calories and star ratings without having to rely on a fuzzy parser. So while the experience sucks for users, search engines use the structured data to show carousels with overviews, calorie totals and stuff like that.

https://recipecard.io/blog/how-to-add-recipe-structured-data...

https://developers.google.com/search/docs/guides/intro-struc...

EDIT: Sure enough, if you look at the OP's recipe example, the schema is in the source. So for pages like this you would probably be better off having the LLM identify that it's a recipe website (or other semantic content), extract the schema from the header, and then parse/render it deterministically. This seems like one of those context-dependent things: getting an LLM to turn a bunch of JSON into markdown is fairly reliable, while getting it to extract that from an entire HTML page is liable to clutter the context. But you could separate the two and have one agent summarise any steps in the blog that might be pertinent.

    {"@context":"https://schema.org/","@type":"Recipe","name":"Slowly Braised Lamb Ragu ...


I foresaw this a couple of years ago. We already have web search tools in LLMs, and they are amazing when they chain multiple searches. But Spegel is a completely different take.

I think the ad blocker of the future will be a local LLM, small and efficient. Want to sort your timeline chronologically? Want a different UI? Want some things removed and others promoted? Hide low-quality comments in a thread? All of that is possible with an LLM in the middle, in either agent or proxy mode.

I bet this will be unpleasant for advertisers.
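
In proxy mode it doesn't need to be more than this, conceptually (a bare sketch; ask_local_llm is a hypothetical placeholder for whatever local runtime you'd actually plug in):

  # Sketch of the "LLM in the middle" proxy idea; ask_local_llm is a hypothetical
  # stand-in for a local model runtime (llama.cpp, Ollama, MLX, ...).
  FILTER_PROMPT = (
      "Rewrite the following page as plain text for the reader: keep the main "
      "content and comments, drop ads, trackers and engagement bait, and order "
      "comments by quality.\n\n"
  )

  def ask_local_llm(prompt: str) -> str:
      raise NotImplementedError("plug your local model in here")

  def filter_page(html: str) -> str:
      """What the proxy would hand the browser instead of the raw page."""
      return ask_local_llm(FILTER_PROMPT + html)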


LLM adds cruft, LLM removes cruft, never a miscommunication


Do you also like what it costs to browse the web through an LLM that's potentially swallowing millions of tokens per minute?


This seems like a suitable job for a small language model. Bit biased since I just read this paper[0]

[0] https://research.nvidia.com/labs/lpr/slm-agents/


I'm working on a meditation app that uses an LLM as a guide. It tracks your heart rate using the phone's main camera; breathing tracking will come later. Soon to be released.
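
The camera-based heart rate is the usual PPG idea: finger over the lens, average the red channel per frame, and pick the dominant frequency in the plausible heart-rate band. A stripped-down sketch of that general idea (not the app's actual code, which needs filtering and motion handling):

  import numpy as np

  def estimate_bpm(red_means: np.ndarray, fps: float) -> float:
      """Estimate heart rate from per-frame mean red-channel intensity (camera PPG)."""
      signal = red_means - red_means.mean()          # drop the DC component
      spectrum = np.abs(np.fft.rfft(signal))
      freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
      band = (freqs >= 0.7) & (freqs <= 3.5)         # ~42-210 BPM, plausible heart rates
      peak_hz = freqs[band][np.argmax(spectrum[band])]
      return peak_hz * 60.0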


The last animation is hilarious; it captures the AI hype cycle vs. reality very well.


I procrastinate all the time. Listening too much to your mind, or chasing only the fun stuff, won't get you traction; it's probably a distraction, because the mind is lazy. Most of our systems want to conserve energy, or spend as little of it as possible. Going to the gym on a cold morning is not something the mind or body seeks, so listening to the urge not to go would be bad for you. Muscles are lazy too; they just want to chill. But if you make them do a little work, they like it and ask for more. We are weird, and we need to force ourselves to do stuff. That's your job: you command your body.


Every time I ask an LLM to write some UI and model code for SwiftUI, I have to tell it to use the @Observable macro (the new way), which it normally does, once asked.

The LLM tells me it prefers the "older way" because it's more broadly compatible, which is fine if that's what you're aiming for. But if the programmer doesn't know about that, they'll be stuck with the LLM calling the shots for them.


You need to create your own preamble that you include with every request. I generally have one for each codebase, which includes a style guide, preferred practices & design (lots of 'best practices' are cargo culted and the LLM will push them on you even when it doesn't make sense - this helps eliminate those), and declarations of common utility functions that may need to be used.


Use always-enabled cursor (or your agentic editor of choice) rules.


A thing people miss is that there are many different right ways to solve a problem. A legacy system might need the compatibility, or it might be a greenfield project. If you leave a technical requirement out of the prompt, you are letting the LLM decide. Maybe that will agree with your nuanced view of things, but maybe not.

We're not yet at a point where LLM coders will learn all your idiosyncrasies automatically, but those feedback loops are well within our technical ability. LLMs are roughly a knowledgeable but naïve junior dev; you must train them!

Hint: add that requirement to your system/app prompt and be done with it.


It's just a higher level abstraction, subject to leaks as with all abstractions.

How many professional programmers don't have assemblers/compilers/interpreters "calling the shots" on arbitrary implementation details outside the problem domain?


But we trust those tools to do the job correctly. The compiler has considerable latitude in messing with the details, so long as the result is guaranteed to match what was asked for; when we find any deviation from that, even in an edge case, we consider it a bug. (Borland Pascal debugger, I'm looking at you: I wasted a *lot* of time on the fact that in single-step mode you peacefully "execute" an invalid segment register load!) LLMs lack this guarantee.


We trust those tools to do the job correctly now.

https://vivekhaldar.com/articles/when-compilers-were-the--ai...


Have you tried writing rules for how you want things done, instead of repeating the same things every time?


The trained behavior of attempting backward compatibility has never once been useful to me and is a constant irritation.

> Please write this thing

> Here it is

> That's asinine why would you write it that way, please do this

> I rewrote it and kept backward compatibility with the old approach!

:facepalm:


Sounds like an OK default, especially since the "better" (in your opinion) way can be achieved by just adding "Don't try to keep backwards compatibility with old code" somewhere in your reusable system prompt.

It's mostly useful when you work a lot with "legacy code" and can't just remove things willy nilly. Maybe that sort of coding is over-represented in the datasets, as it tends to be pretty common in (typically conservative) larger companies.


You will get better results if you reset the code changes, tweak the prompt with new guidelines (e.g. don’t do X), and then run it again in a fresh chat.

The less cruft and the fewer red herrings in the context, the better. Likewise with including key info, technical preferences, and guidelines. The model can't read our minds, although sometimes we wish it could :)

There are lots of simple tricks to make it easier for the model to provide a higher quality result.

Using these things effectively is definitely a complex skill set.


I didn’t take typing lessons, but I’ve been typing since the 1980s, probably since 1987. At some point, I discovered that people typed without looking, decided that using 10 fingers and typing without looking at the keyboard was better, so I started optimizing for it, and it worked.


I was surprised that he didn't try the in-flight compression provided by rsync:

  -z, --compress              compress file data during the transfer
      --compress-level=NUM    explicitly set compression level
It's probably faster to compress with gzip first and transfer afterwards, but it's nice to have the option of improving the transfer with a single flag.


Or better yet, since they cite corruption issues, sqlite3_rsync (https://sqlite.org/rsync.html) with -z

An sqlite transaction- and WAL-aware rsync with in-flight compression.


The main point is to skip the indices, which you have to do pre-compression.

When I do stuff like this, I stream the dump straight into gzip. (You can usually figure out a way to stream directly to the destination without an intermediate file at all.)

Plus, this way it stays stored compressed at its destination, if your purpose is backup rather than a poor man's replication.
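
A minimal sketch of that in Python's stdlib (same idea as piping the CLI's .dump output through gzip). In the text dump, indices are just CREATE INDEX statements, so the index pages never travel; they get rebuilt on restore:

  import gzip, sqlite3

  def dump_compressed(db_path: str, out_path: str) -> None:
      """Stream a text dump of the database straight into a gzip file."""
      conn = sqlite3.connect(db_path)
      with gzip.open(out_path, "wt", encoding="utf-8") as out:
          for statement in conn.iterdump():   # schema + data as SQL text
              out.write(statement + "\n")
      conn.close()

  # Restore on the other side (indices are rebuilt here):
  #   gunzip -c backup.sql.gz | sqlite3 restored.db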


The main point was decreasing the transfer time - if rsync -z makes it short enough, it doesn't matter if the indices are there or not, and you also skip the step of re-creating the DB from the text file.


The point of the article is that it does matter if the indices are there. And indices generally don't compress very well anyways. What compresses well are usually things like human-readable text fields or booleans/enums.


I believe compression is only worthwhile on slow networks.


It would have to be one really fast network... zstd compresses and decompresses at 5+ GB (bytes, not bits) per second.


I just tested on a ramdisk:

  tool  cspeed    size  dspeed
  zstd  361 MB/s  16%   1321 MB/s
  lzop  512 MB/s  29%    539 MB/s
  lz4   555 MB/s  29%   1190 MB/s
If working from files on disk that happen not to be cached, the speed differences are likely to disappear, even on many NVMe disks.

(It just so happens that the concatenation of all text-looking .tar files I happen to have on this machine is roughly a gigabyte (though I did the math for the actual size)).


Looks like it depends heavily on the choice of file, but I see good performance on both compressible and incompressible files. Small files tend to perform (relatively) badly, though. Here is a sample of three large files with different compression ratios:

  zstd -b1 --fast -i10 some-rpi-os-image-idk.img
  -1#-os-image-idk.img :2097152000 -> 226798302 (x9.247), 6765.0 MB/s, 5897.3 MB/s

  zstd -b1 --fast -i10 jdk-11.0.8+10.tar
  -1#jdk-11.0.8+10.tar : 688844800 -> 142114709 (x4.847), 2660.7 MB/s, 2506.8 MB/s

  zstd -b1 --fast -i10 archlinux-2025.04.01-x86_64.iso
  -1#.04.01-x86_64.iso :1236303872 ->1221271223 (x1.012), 3643.5 MB/s, 7836.6 MB/s


Ain't no way zstd compresses at 5+, even at -1. That's the sort of throughput you see from lz4 running on a bunch of cores (either half a dozen very fast ones, or 12-16 merely fast ones).


Where are you getting this performance? On the average computer that is nowhere near the speed you'll see.


Valve tends to take a different view...


Valve has different needs than most. Their files rarely change, so they only need to do the expensive compression once, and they save a ton in bandwidth/storage. On top of that, their users are more tolerant of slow downloads.


Is the network only doing an rsync? Then you are probably right.

For every other network, you should compress as you are likely dealing with multiple tenants that would all like a piece of your 40Gbps bandwidth.


By that logic, you should not compress, as multiple tenants would all like a piece of your CPU too.


This will always be something you have to determine for your own situation. At least at my work, CPU cores are plentiful, IO isn't. We rarely have apps that need more than a fraction of the CPU cores (barring garbage collection). Yet we are often serving fairly large chunks of data from those same apps.


Depends. Run a benchmark on your own hardware/network. ZFS uses in-flight compression because CPUs are generally faster than disks. That may or may not be the case for your setup.


What? Compression is absolutely essential throughout computing as a whole, especially as CPUs have gotten faster. If you have compressible data sent over the network (or even on disk / in RAM) there's a good chance you should be compressing it. Faster links have not undercut this reality in any significant way.


Whether or not to compress data before transfer is VERY situationally dependent. I have seen it go both ways, and the real-world results do not always match intuition. At the end of the day, if you care about performance, you still have to do proper testing.

(This is the same spiel I give whenever someone says swap on Linux is or is not always beneficial.)
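
A crude way to sanity-check it for your own data and link (assumes the third-party zstandard package; it's pessimistic for compression because real pipelines overlap compression with the transfer):

  import time
  import zstandard  # third-party: pip install zstandard

  def compare(path: str, level: int = 1, link_gbps: float = 10.0) -> None:
      """Crude check: raw transfer time vs compress-then-transfer on a given link."""
      data = open(path, "rb").read()
      t0 = time.perf_counter()
      compressed = zstandard.ZstdCompressor(level=level).compress(data)
      compress_s = time.perf_counter() - t0
      wire = link_gbps * 1e9 / 8                       # link speed in bytes per second
      raw_s = len(data) / wire
      zstd_s = compress_s + len(compressed) / wire     # no overlap: pessimistic for zstd
      print(f"ratio {len(data) / len(compressed):.2f}x  raw {raw_s:.2f}s  zstd {zstd_s:.2f}s")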


He absolutely should be doing this, because by running rsync on an already-compressed file he's throwing away the whole point of using rsync, which is the rolling-checksum algorithm that lets it transfer only the diffs.


or used --remove-source-files so they didn't have to ssh back to rm


I was expecting a different outcome: that you'd tell us Qwen3 nailed it on the first try.

