Hacker News | sigbottle's comments

[Autovectorization is not a programming model](https://pharr.org/matt/blog/2018/04/18/ispc-origins).

Sure, obviously, we will not understand every single little thing down to the tiniest atoms of our universe. There are philosophical assumptions underlying everything, and you can question them (quite validly!) if you so please.

However, there are plenty of intermediate mental models (or explicit contracts, like assembly, ELF, etc.) to open up, in both "engineering" land and "theory" land, if you so choose.

Part of good engineering is also deciding exactly where the boundary between "don't cares" and "cares" lies, and how you let people easily navigate the abstraction hierarchy.

That is my impression of what people mean when they don't like "magic".


> Then, when it fails [...], you can either poke it in the right ways or change your program in the right ways so that it works for you again. This is a horrible way to program; it’s all alchemy and guesswork and you need to become deeply specialized about the nuances of a single [...] implementation

In that post, the blanks reference a compiler’s autovectorizer. But you know what they could also reference? An aggressively opaque and undocumented, very complex CPU or GPU microarchitecture. (Cf. https://purplesyringa.moe/blog/why-performance-optimization-....)


A phrase I use is "spooky action at a distance". Quantum entanglement, but with software.

> And so now we have these “magic words” in our codebases. Spells, essentially. Spells that work sometimes. Spells that we cast with no practical way to measure their effectiveness. They are prayers as much as they are instructions.

Autovectorization is not a programming model. This still rings true day after day.


Are Mac kernels optimized compared to CUDA kernels? I know that the unified GPU approach is inherently slower, but I thought a ton of the optimizations were at the kernel level too (CUDA itself is a moat).

There’s this developer called nightmedia who converts a lot of models to Apple MLX. I can run Qwen3 Coder Next at 60 tps on my M4 Max. It works.

It depends on what you do. If you are doing token generation, compute-dense kernel optimization is less interesting (it is memory-bound) than latency optimizations elsewhere (data transfers, kernel invocations, etc.). And for those, Mac devices actually have a leg up on CUDA: Metal shader pipelines are optimized for latency (a.k.a. games), while CUDA pipelines were not (until the introduction of CUDA Graphs, and of course there are other issues).
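To see why decode is memory-bound, a back-of-the-envelope roofline helps. Both numbers below are assumptions for illustration, not measurements: ~546 GB/s is Apple's quoted M4 Max memory bandwidth, and we assume a model whose active weights occupy ~8 GB after quantization.

```python
# Rough decode-throughput ceiling: every generated token must stream all
# active weights from memory once, so bandwidth (not FLOPs) caps tokens/sec.
bandwidth_bytes_per_s = 546e9   # assumed M4 Max memory bandwidth
active_weight_bytes = 8e9       # assumed active weights after quantization

ceiling_tps = bandwidth_bytes_per_s / active_weight_bytes
print(f"~{ceiling_tps:.0f} tokens/s upper bound")   # ~68 tokens/s
```

Under these assumed numbers the bandwidth ceiling lands in the same ballpark as the ~60 tps figure mentioned above, which is why kernel-level FLOP tuning matters less here than overheads around the kernels.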

Mac kernels are almost always compute shaders written in Metal. That's the bare minimum of acceleration, done in a non-portable, proprietary graphics API. It's optimized in the loosest sense of the word, but extremely far from "optimal" relative to CUDA (or hell, even Vulkan Compute).

Most people will not choose Metal if they're picking between the two moats. CUDA is far-and-away the better hardware architecture, not to mention better-supported by the community.


That's actually a great point. I feel like unless you know for sure that you will never need something again, nothing is disposable. I find myself diving into places I thought I would never care about again ALL the time.

Every single time I have vibe coded a project I cared about, letting the AI rip with mild code review and rigorous testing has bitten me in the ass, without fail. It doesn't extend the code in the direction I want, things clearly spiral out of control, etc. Just satisfying some specs at the time of creation isn't enough. These things evolve; they're a living being.


In a simple text based game I'm vibe coding for fun, I created skills that help the specs evolve.

I started with chatgpt, I told it to make me a road map of game features.

Then I use that road map to guide my LLM (I use Codex 5.3), with the specification: when working on tasks, if you learn anything that may be out of scope, add it to the road map.

There's a bit more to it than that, but so far I've got a playable game, and at some point the requirement of adding an admin dashboard for experiments got added to the road map, and that got implemented pretty well too.

At first I did review a lot of its code, but now I just let it rip and I've been happy with it thus far.

At work I use AI heavily, but since I'm obviously responsible for whatever code I push, I do actually review, test, and understand it; mostly I just need to tweak some small things before it's good enough to ship.


https://learn.microsoft.com/en-us/shows/seth-juarez/anders-h...

https://news.ycombinator.com/item?id=11685317

https://lobste.rs/s/dwf2yn/sixten_s_query_based_compiler

https://ericlippert.com/2012/06/08/red-green-trees/

Rust's salsa, etc.

Related search terms are "incremental compilation" and "red-green trees". It's primarily an IDE-driven workflow (well, the original use case was driven by IDEs), but the principles behind it are very interesting.
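For flavor, here's a toy sketch of the red-green idea from the Roslyn post above. The class names are mine, not Roslyn's API: green nodes are immutable and position-free, so unedited subtrees can be shared across edits, while red nodes are thin wrappers computed on demand that carry absolute offsets and parent links.

```python
class Green:
    """Immutable syntax node: kind, children, and total text width."""
    def __init__(self, kind, children=(), width=0):
        self.kind = kind
        self.children = tuple(children)
        self.width = width if not self.children else sum(c.width for c in self.children)

    def replace_child(self, index, new_child):
        # Structural sharing: only the spine up to the root is rebuilt.
        kids = list(self.children)
        kids[index] = new_child
        return Green(self.kind, kids)

class Red:
    """Lazy facade over a green node, carrying absolute offset and parent."""
    def __init__(self, green, offset=0, parent=None):
        self.green, self.offset, self.parent = green, offset, parent

    def children(self):
        pos = self.offset
        for g in self.green.children:
            yield Red(g, pos, self)
            pos += g.width

# Build `a + b`, then "edit" the right operand without touching the left.
a, plus, b = Green("ident", width=1), Green("op", width=1), Green("ident", width=1)
expr = Green("binary", [a, plus, b])
edited = expr.replace_child(2, Green("ident", width=3))

assert edited.children[0] is a   # untouched subtree is shared, not copied
assert edited.width == 5         # widths recomputed only along the spine
```

The point of the split is that an edit reuses almost the whole old tree, and absolute positions (which an edit would otherwise invalidate everywhere) are only materialized lazily on the red side.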

You can grok the difference by thinking through, for example, the gap between invoking `g++` on the command line (include all headers, compile object files, re-do all template deduction, etc.) and a model where editing a single line in a single file barely changes the underlying data structure and doesn't force full recompilation. This doesn't require full ownership of editing via UI hooks or keylogging, either: a directory watcher can treat the file diff as a patch and send it to the server in patch form. The observation is that compiling an O(n)-size file is often far more expensive than a program that scans the file a few times and generates a patch.
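A minimal sketch of the query-based alternative, loosely in the spirit of salsa (all names here are hypothetical, not salsa's actual API): each query memoizes its result together with the revisions of the inputs it read, so editing one file only invalidates the queries that actually depended on it.

```python
class Db:
    def __init__(self, files):
        self.files = dict(files)          # input: file name -> source text
        self.rev = {f: 0 for f in files}  # per-input revision counter
        self.cache = {}                   # query key -> (value, {file: rev read})
        self.recomputes = 0

    def edit(self, name, text):
        self.files[name] = text
        self.rev[name] += 1               # bump revision; cache entries that read
                                          # the old revision become stale

    def line_count(self, name):
        return self._memo(("line_count", name),
                          lambda read: len(read(name).splitlines()))

    def _memo(self, key, compute):
        hit = self.cache.get(key)
        if hit and all(self.rev[f] == r for f, r in hit[1].items()):
            return hit[0]                 # all recorded deps unchanged: reuse
        deps = {}
        def read(f):                      # record which inputs the query touched
            deps[f] = self.rev[f]
            return self.files[f]
        self.recomputes += 1
        value = compute(read)
        self.cache[key] = (value, deps)
        return value

db = Db({"a.c": "int x;\nint y;", "b.c": "int z;"})
assert db.line_count("a.c") == 2 and db.recomputes == 1
db.edit("b.c", "int z;\nint w;")
assert db.line_count("a.c") == 2 and db.recomputes == 1  # b.c edit: no recompute
db.edit("a.c", "int x;")
assert db.line_count("a.c") == 1 and db.recomputes == 2  # a.c edit: recompute
```

Real systems (salsa, rustc's query system) add cycle detection, durability levels, and "backdating" when a recomputed value turns out unchanged, but the dependency-recording core is this shape.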

ASTs are similar to these kinds of trees only insofar as the underlying data structures for understanding programming languages are syntax trees.

I've always wanted to get into this stuff but it's hard!


OK, but that is distinctly NOT what clang does... incremental compilation with clang is handled at the build-system level. I can't speak for rustc, but I do know that it typically ends up going through LLVM, which, contrary to the author's claims, is exactly a pipeline.

I save out of a combination of buying into minimalism as a kid (though obviously it's nuanced) but also out of laziness, lol.

Whoa. Guess I have more literature reading to be doing! Always interesting to hear a counterargument along with future reading

Roko's basilisk attributes some kind of moral superiority to the AI, whether by its being much smarter than humans (whatever that even means), more compassionate, more rational, etc.

This is more like people in power dictating what does or doesn't matter simply because it's what they think. And that gets codified in reality.


In some sense, it's good to talk about what you aren't saying, to be more informative and precise.

But like, all of these statements are basically ampliative, making things grander and even more ambiguous.


Unironically, this is great training data for humans.

No sane person would say this kind of stuff out loud; it happens behind closed doors, if at all (because people don't or can't express their whole train of thought), and especially not on the internet.

Having AI write like this is pretty illustrative of what a self-consistent, narcissistic narrative looks like. I feel like many pop examples are caricatures, and of course clinical guidelines can be interpreted in so many ways.

