There was a company in the Netherlands, can't seem to find the name right now, that rented out GPU clusters as central heaters while using the GPUs to mine crypto. I believe they went bankrupt during the crypto crash and energy crisis.
I think they're referring to a lich. From Wikipedia: In fantasy fiction, a lich (/ˈlɪtʃ/;[1] from the Old English līċ, meaning "corpse") is a type of undead creature. https://en.m.wikipedia.org/wiki/Lich
That looks like a case where "analyse the AST after constant folding" might be a theoretical path if you had a language frontend that could emit the AST at that point.
I suspect that things like "these two functions both start with the same conditional+early return" would be more useful to -me- given the sort of things I tend to be working on. Also a 'fuzzy possible copy+paste detector' in general to help identify refactoring targets.
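The "two functions both start with the same conditional+early return" idea can be sketched with Python's ast module: anonymise variable names, then compare the structural dump of each function's first statement. The sample functions and the name-anonymising trick are illustrative assumptions, not a real tool.

```python
import ast

SRC = """
def f(x):
    if x is None:
        return None
    return x + 1

def g(y):
    if y is None:
        return None
    return y * 2
"""

class AnonymiseNames(ast.NodeTransformer):
    """Replace every variable name with '_' so the comparison is
    structural rather than name-sensitive."""
    def visit_Name(self, node):
        node.id = "_"
        return node

def first_stmt_fingerprint(fn):
    # Fingerprint = the AST dump of the function's first statement,
    # with names stripped out.
    stmt = AnonymiseNames().visit(fn.body[0])
    return ast.dump(stmt)

tree = ast.parse(SRC)
f, g = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
print(first_stmt_fingerprint(f) == first_stmt_fingerprint(g))  # True
```

A fuzzier copy+paste detector would presumably fingerprint every statement (or subtree) and look for near-matches, but the exact-prefix case above is the cheap starting point.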
It also strikes me that something that was mostly 'just' a structure-aware diff could be useful, so e.g. you'd get diffs within-if-body and similar, but I'm now into vigorous hand waving because it's been ages since I've thought about this and I probably need more coffee.
I -did- do a pure maths degree many years ago, but I don't generally seem to end up working on computational code.
to the downvoter: I thought this was a reasonable question? Semantic equivalence is IIRC undecidable in general. Some languages (Backus' FL?) try to deal with that but I dunno.
It's incredibly useful if you have many threads that produce a variable number of outputs. Imagine you're implementing some filtering operation on the GPU: many threads will each take on a fixed workload and then produce some number of outputs. Unless we take some precautions, we have a huge synchronization problem when all threads try to append their results to the output. Note that GPUs didn't have atomics for the first couple of generations that supported CUDA, so you couldn't just getAndIncrement an index and append to an array. We could store those outputs in a dense structure, allocating a fixed number of output slots per thread, but that would leave many blanks in between the results. Once we know the number of outputs per thread, we can use a prefix sum to let every thread know where it can write its results in the array.
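The offset computation described above can be sketched like this (in plain Python rather than CUDA, and with made-up per-thread counts), using an exclusive prefix sum:

```python
def exclusive_prefix_sum(counts):
    """Each thread's write offset is the sum of all earlier
    threads' output counts; the running total is also the
    size of the compacted output array."""
    offsets, total = [], 0
    for c in counts:
        offsets.append(total)
        total += c
    return offsets, total

# e.g. four threads produced 3, 0, 2 and 1 results respectively
counts = [3, 0, 2, 1]
offsets, total = exclusive_prefix_sum(counts)
# offsets == [0, 3, 3, 5]; thread i writes its results into
# output[offsets[i] : offsets[i] + counts[i]], total size == 6
```

On a GPU the scan itself runs in parallel (e.g. a work-efficient tree scan), but the contract is the same: every thread learns a non-overlapping slice of the output without any atomics.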
The outcome of a prefix sum corresponds exactly to the "row starts" part of the CSR sparse matrix notation. So they are also essential when creating sparse matrices.
I haven't tried WebGPU yet, is there an overall performance hit compared to direct CUDA programming?
AFAIK Thrust is intended to simplify GPU programming. It could well be that for specific use cases, in particular when it is possible to fuse multiple operations into single kernels, you could outperform Thrust.
There is definitely at least a performance hit in that wgpu (and I think WebGPU too) only supports a single queue. That means you can't asynchronously run compute tasks while running render tasks.
Additionally, wgpu (the library) will insert fences between any passes that have a read-write dependency on a binding, even if no fence is technically needed because the two passes might not access the same indices.
Finally, I know that there is an algorithm called decoupled look-back that can speed up prefix sums, but it requires a forward-progress guarantee. All recent NVIDIA cards can run it, but I don't think AMD can, so WebGPU can't in general. Raph Levien has a blog post on the subject https://raphlinus.github.io/gpu/2021/11/17/prefix-sum-portab...
I bought the first edition when it came out, and it was definitely a gold mine of information on the subject. I wonder though, is the fourth edition worth buying another copy? Nvidia has been advancing CUDA, in particular moving more towards C++ in the kernel language, but none of that was present when this book came out in 2007. Now more and more stuff is happening at thread block level with the cooperative groups C++ API and at warp level for tensor cores. It would be great if the authors revisited all the early chapters to modernize that content, but that's a lot of work, so I don't usually count on authors making such an effort for later editions.
I also read the older edition and got the 4th for the second read recently.
I felt that the updated coverage is more on the GPU side than the language side.
It covers new GPU features and architectures well. I don't think it covers Tensor core things. But I might be wrong.
So it's worth the update if you're interested in general NVIDIA GPU evolution.
If you want to go really in-depth I can recommend GTC on demand. It's Nvidia's streaming platform with videos from past GTC conferences. Tony Scuderio had a couple of videos on there called GPU memory bootcamp that are among the best advanced GPU programming learning material out there.
100% this. You can find all kinds of detailed topics, like CUDA graphs, memory layout optimization, optimizing storage access, etc. https://www.nvidia.com/en-us/on-demand/. They have "playlists" for things like HPC or development tools that collect the most popular videos on those topics.
Historically, how did people make higher ABV drinks? It's my understanding that most "wild yeast" will die off at around 5-6% ABV. Were people cultivating and sharing yeasts capable of surviving higher ABVs, or am I misunderstanding?
"Most yeast strains can tolerate an alcohol concentration of 10–15% before being killed. This is why the percentage of alcohol in wines and beers is typically in this concentration range."
I assume temperature and other factors play a role in how successful you are at keeping them alive that long though.
Brewers reused yeast skimmed from the top of fermenting beer ("barm"). Wikipedia's "History of beer" article quotes a 1557 source mentioning this, and it probably goes back much earlier. The motivation here is probably speed and reliability of fermentation, which are obvious benefits to people not aware that yeast is a microorganism, but it also incidentally breeds for alcohol tolerance (especially considering the popularity of strong beers). Reusing wooden brewing equipment without sanitizing between batches has similar effects.
It does. There's an overuse of graphs, and the lack of units and y-axis labels on some of them was especially annoying, but overall it's still a quite interesting and entertaining read in my opinion.
I live in the Netherlands. My own gas usage was 25% lower in 2022 compared to the average over 2019-2021. We didn't do anything drastic to reduce gas usage. The first 3 months of 2023 were somewhat colder, with gas usage much closer to Spring 2021 and nowhere near as low as Spring 2022. I wonder how accurate the model used in this article is. The article isn't very clear on what kind of temperature values are used as input to this model. Is it daily mean? Hourly? I can imagine that you want to somehow factor in the diurnal cycle, which would be obscured by daily means.
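To see why daily means can mislead here, a toy sketch: assume (hypothetically) that heating demand is proportional to max(0, base − T) with a base of 18 °C. Two days with the same daily mean can then have very different demand once the diurnal swing is included.

```python
BASE = 18.0  # hypothetical heating base temperature in °C

def heating_demand(temps_c):
    """Degree-hours below the base temperature."""
    return sum(max(0.0, BASE - t) for t in temps_c)

# Two days, both with a daily mean of 16 °C:
flat_day = [16.0] * 24                 # no diurnal swing
swing_day = [10.0] * 12 + [22.0] * 12  # cold nights, warm afternoons

print(heating_demand(flat_day))   # 48.0 degree-hours
print(heating_demand(swing_day))  # 96.0 degree-hours
```

Same daily mean, double the implied heating demand, because the warm afternoon hours above the base contribute nothing while the cold night hours count in full. A model fed only daily means can't see this.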