Do you have any Intel references for it? I mean, Rust has its own memory model and it will not always give the same guarantees as when writing assembler.
“Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with VMOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations”
That is something I can agree with, but I can't in good faith just let "it's just a hint, they don't have anything to do with correctness" stand unchallenged.
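For what it's worth, a minimal Rust sketch of the pattern the manual describes (assuming x86_64 and the core::arch intrinsics; the function name and fill value are mine): a run of non-temporal stores followed by an explicit SFENCE before any other agent reads the buffer.

    #[cfg(target_arch = "x86_64")]
    fn nt_fill(dst: &mut [u8], byte: i8) {
        use core::arch::x86_64::{__m128i, _mm_set1_epi8, _mm_sfence, _mm_stream_si128};
        // _mm_stream_si128 needs a 16-byte-aligned destination and whole 16-byte chunks.
        assert!(dst.as_ptr() as usize % 16 == 0 && dst.len() % 16 == 0);
        unsafe {
            let v: __m128i = _mm_set1_epi8(byte);
            let mut p = dst.as_mut_ptr() as *mut __m128i;
            for _ in 0..dst.len() / 16 {
                _mm_stream_si128(p, v); // weakly-ordered non-temporal store
                p = p.add(1);
            }
            _mm_sfence(); // drain the WC buffers before anyone else reads dst
        }
    }

Drop the _mm_sfence() and another core (or a DMA agent) can legally observe the writes out of order, or not at all for a while, which is exactly the correctness issue the quoted paragraph is about.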
IIRC they used the write-combining buffer, which was also a cache.
A common trick is to cache it, but insert it directly into the last or second-to-last position in your pseudo-LRU order, so it's in the cache like normal but gets evicted quickly when you need to cache a new line in the same set. Other solutions can lead to complicated situations when the user was wrong and the line gets immediately reused by normal instructions; this way it just sits in the cache like a normal line and gets promoted toward most-recently-used if that happens.
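Rough Rust sketch of that insertion policy, treating one set as a recency-ordered list (purely illustrative; the names are mine, and real pseudo-LRU is a tree of bits, not a list):

    // Toy model of the insertion trick, not how real hardware implements it.
    struct CacheSet {
        lines: Vec<u64>, // tags ordered by recency, index 0 = most recently used
        ways: usize,
    }

    impl CacheSet {
        // Any hit promotes the line to MRU, exactly like a normal access.
        fn touch(&mut self, tag: u64) {
            if let Some(i) = self.lines.iter().position(|&t| t == tag) {
                let t = self.lines.remove(i);
                self.lines.insert(0, t);
            }
        }

        // Normal fill: evict the LRU line, insert the new one at MRU.
        fn fill_normal(&mut self, tag: u64) {
            if self.lines.len() == self.ways {
                self.lines.pop();
            }
            self.lines.insert(0, tag);
        }

        // Non-temporal fill: evict as usual, but insert near the LRU end so the
        // line is next in line for eviction unless it actually gets reused.
        fn fill_nontemporal(&mut self, tag: u64) {
            if self.lines.len() == self.ways {
                self.lines.pop();
            }
            let pos = self.lines.len().saturating_sub(1);
            self.lines.insert(pos, tag); // second-to-last slot
        }
    }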
A source on what? The Intel optimization manuals explain what MOVNTQ is for. I don't think they explain in detail how it is implemented behind-the-scenes.
“The non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD) allow data to be moved from the processor’s registers directly into system memory without being also written into the L1, L2, and/or L3 caches. These instructions can be used to prevent cache pollution when operating on data that is going to be modified only once before being stored back into system memory. These instructions operate on data in the general-purpose, MMX, and XMM registers.”
I believe that non-temporal moves basically work similarly to memory marked as write-combining, which is explained in 13.1.1: “Writes to the WC memory type are not cached in the typical sense of the word cached. They are retained in an internal write combining buffer (WC buffer) that is separate from the internal L1, L2, and L3 caches and the store buffer. The WC buffer is not snooped and thus does not provide data coherency. Buffering of writes to WC memory is done to allow software a small window of time to supply more modified data to the WC buffer while remaining as non-intrusive to software as possible. The buffering of writes to WC memory also causes data to be collapsed; that is, multiple writes to the same memory location will leave the last data written in the location and the other writes will be lost.”
In the old days (Pentium Pro and the like), I think there was basically a 4- or 8-way associative cache, and non-temporal loads/stores would go into only one of the ways, so you could waste at most 1/4 (or 1/8) of your cache on them.
That’s what I hope for, but everything that isn’t bananas expensive with unified memory has very low memory bandwidth. DGX (Digits), Framework Desktop, and non-Ultra Macs are all around 128 GB/s, and will produce single-digit tokens per second for larger models: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inferen...
So there’s a fundamental tradeoff between cost, inference speed, and hostable model size for the foreseeable future.
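Back-of-envelope, assuming decoding is memory-bandwidth bound and each generated token needs roughly one pass over the weights (ballpark figures, not measurements):

    70B params at 4-bit quantization ≈ 35-40 GB of weights
    128 GB/s ÷ 40 GB ≈ ~3 tokens/s

which is why those 128 GB/s boxes land in single digits on the larger models.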
This is almost perfect for my needs (communication and location sharing in areas without phone signal.) But it's missing GPS/GNSS. I'm not sure how niche that use case is, so more broadly I wish Lilygo offered something akin to Raspberry Pi hats -- i.e. some way to add extensions.
I’ve heard Beaglebone Blacks are great (never used them), but have to say the experience with Blue was awful. I had a hardware defect that’s apparently commonplace. It gave me the feeling (maybe unjustified) that there’s a low bar for slapping the Beaglebone label on a product. In contrast, none of my several RPi’s have had any hardware issues, and I’m pretty brutal with them.
Having said that, apart from what you also flagged, it's also a bit bland. Like "there's some legal stuff here, exercise caution".
Instead the badge could have symbolic character, an emotive icon ... something. Something that strongly implies "Danger Will Robinson" without explicitly saying so. Something any company would want to avoid having show up next to their logo unless it was absolutely necessary.
As it is now, all I'm getting is a bland "huh, something legal must have happened here".
I hear what you're saying, and I think it would be good to have clearer language, particularly for people new to white-collar employment (which, I imagine, is a good portion of Glassdoor's audience). People with more established careers can check Glassdoor AND ask people in their network, whereas people new to the industry have less of a professional network to lean on.
That said, at least here in the US, a carefully-bland legal statement strongly implies what you're looking for. Like, the more bland, the bigger the warning sign :)
I think it should also be below the ZURU logo. Right now it looks like a generic warning for the site (e.g. "Glassdoor goes offline for maintenance in 5 mins")
I imagine Glassdoor is reacting to the situation and going with "better to get part of the solution out now than wait to get everything out perfectly", which I would agree with.
Not only is he one of the “heavyweights of the Rust community”, he was literally one of the main designers of the language at Mozilla (he's still contributor #6 by commits[1] despite not having worked on it for the past 7 years!)
This is becoming pointless meta-discussion, but the parent comment didn't indicate in any way that I was talking to a "professor". The comment said: great that there are more languages with GC. I disagree with that, no matter who says it.
I'm not a "professor" but as a software engineer with 35 years in this industry I can say that new languages should avoid GC's (as in, generational and related) and stick to either ARC or Rust-like compile-time memory management.
Just because the original comment is by, let's say, a prominent figure, doesn't make it right.
P.S. I rarely downvote out of disagreement, only for comment quality.
> I'm not a "professor" but as a software engineer with 35 years in this industry I can say that new languages should avoid GC's
With respect, and much less experience than you, I really don’t think so. I believe the majority of languages are better off being managed. Low-level languages do have their place, and I am very happy that Rust brings some novel ideas to the field. But that low-level detail is very much not needed for the majority of applications. Also, ARC is much, much slower than a decent GC, so from a performance perspective as well it would make sense to prefer GC'd runtimes.
ARC is in fact faster than GC, and even more so on M1/M2 chips with the Swift runtime. There were benchmarks circulating here on Hacker News; unfortunately I can't find those posts now. GC requires more memory (normally double that of an ARC runtime) and is slower even with that extra memory.
How can more work, and synchronized work at that, be faster than a plain old pointer bump plus some asynchronous, amortized work done on another thread? Sure, it does take more memory, but in most cases (OpenJDK for example) allocation is simply a thread-local arena allocation, where it is literally an integer increment and an eventual copy of live objects to another region. You couldn’t make it any faster; malloc and ARC are both orders of magnitude slower.
ARC, while it can elide counts in certain cases, will still in most cases have to issue atomic increments/decrements, which are among the slowest operations on modern processors. And on top of that it doesn’t even solve the problem completely (circular references), mandating a solution very similar to a tracing GC (ref counting is in fact a form of GC: tracing looks at live edges between objects, ref counting looks at dead edges).
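For illustration only, a tiny Rust sketch of the two costs being compared (the Tlab type and retain helper are mine, not how OpenJDK or Swift actually implement it): the arena allocation is a bounds check plus an integer bump, while an ARC retain is an atomic increment on a shared counter.

    use std::sync::Arc;

    // A TLAB-style arena: allocation is a bounds check plus an integer bump.
    struct Tlab {
        buf: Vec<u8>,
        top: usize,
    }

    impl Tlab {
        fn alloc(&mut self, size: usize) -> Option<&mut [u8]> {
            let start = self.top;
            let end = start.checked_add(size)?;
            if end > self.buf.len() {
                return None; // slow path: request a new TLAB / trigger a collection
            }
            self.top = end; // "literally an integer increment"
            Some(&mut self.buf[start..end])
        }
    }

    // ARC-style retain: an atomic increment on a shared counter, which is the
    // cost called out above.
    fn retain<T>(obj: &Arc<T>) -> Arc<T> {
        Arc::clone(obj)
    }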
I'm not familiar with the details, but it is said that Swift's ARC is several times faster than ObjC's; it somehow doesn't always require atomic inc/dec. It also got even better specifically on the M1 processors. As for GCs, each cycle carries the overhead of going over the same objects that are not yet disposable.
Someone also conducted tests: for the same tasks, and on equivalent CPUs, Android requires 30% more energy and 2x the RAM compared to iOS. Presumably the culprit is the GC.
That’s a very strong presumably, on a very niche use case of mobile devices.
It is not an accident that on powerful server machines all FAANG companies use managed languages for their critical web services, and there is no change on the horizon.
It might be because on the server side they usually don't care about energy or RAM much. The StackOverflow dev team has an interesting blog post somewhere, where they explain that they figured at one point C#'s GC was the bottleneck and they had to do a lot of optimizations at the expense of extra code complexity to minimize the GC overhead.
It is actually quite rare that companies think about their infrastructure costs; it's usually just taken for granted. Plus, there aren't many ARC languages around.
Anyway I'm now rewriting one of my server projects from PHP to Swift (on Linux) and there's already a world of difference in terms of performance. For multiple reasons of course, not just ARC vs. GC, but still.
With all due respect, (big) servers care about energy costs a lot, at least as much as mobile phones. By the way, out of the managed languages Java has the lowest energy consumption. RAM takes the same energy whether filled or not.
Just because GC can be a bottleneck doesn’t mean it is bad, or that the alternatives wouldn’t have an analogous bottleneck. Of course one should try to decrease the number of allocations (the same way you have to in the case of RC as well), but there are certain allocation patterns that simply have to be managed. For those, a modern GC is the best choice in most use cases.
It usually solves this in under 6 guesses. The guessing could be improved; at the moment it's random, but it could select for words with non-repeating letters to narrow down the search space faster.
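A minimal sketch of that heuristic in Rust (the function name is mine; it assumes lowercase ASCII candidate words): score each remaining candidate by how many distinct letters it contains and guess the highest-scoring one instead of a random one.

    // Prefer guesses that probe the most unique letters per turn.
    fn pick_guess<'a>(candidates: &[&'a str]) -> Option<&'a str> {
        candidates.iter().copied().max_by_key(|w| {
            let mut seen = 0u32; // one bit per letter 'a'..='z'
            let mut distinct = 0;
            for b in w.bytes() {
                let bit = 1u32 << (b - b'a');
                if seen & bit == 0 {
                    seen |= bit;
                    distinct += 1;
                }
            }
            distinct
        })
    }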