I was excited for this one to replace a bunch of single-type Box<dyn Trait> returns, but there are still quite a few limitations on using -> impl Trait in traits in general. It's still discouraged in public APIs: https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-trait...
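For anyone curious what this buys you, here's a minimal sketch (trait and type names are made up for illustration) of the kind of boxed return that return-position impl Trait in traits can replace:

```rust
// Before return-position impl Trait in traits, a trait method returning an
// iterator had to box it:
trait BoxedNumbers {
    fn numbers(&self) -> Box<dyn Iterator<Item = u32>>;
}

// With -> impl Trait in traits, the Box (and its allocation) can go away:
trait Numbers {
    fn numbers(&self) -> impl Iterator<Item = u32>;
}

struct Evens;

impl Numbers for Evens {
    fn numbers(&self) -> impl Iterator<Item = u32> {
        (0..10u32).filter(|n| n % 2 == 0)
    }
}

fn main() {
    let e = Evens;
    let total: u32 = e.numbers().sum();
    println!("{total}"); // 0 + 2 + 4 + 6 + 8 = 20
}
```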
I didn't read it, but someone on a forum said that Microsoft's patent is about dynamically updated rANS probabilities during decoding -- JPEG XL uses static rANS codes: the codes are never changed or updated during decoding, they are context modeled and already clustered at encoding time, which also makes them faster to decode than dynamically updated codes.
For those add-ons, don't you have to provide them with the prompts? You still need to actively _know_ how to do something to have it provide you with information.
You can use something like LunarVim, which has everything already set up.
It's not the highest-quality hash function (see the SMHasher benchmarks), but it is fast. A great alternative is XXH3 (https://cyan4973.github.io/xxHash/), which has seen far more usage in practice.
I'm using XXHash3 for verifying data integrity of large (10+MB) blobs. Very fast and appears to work very well in my testing--it's never missed any bit error I've thrown at it.
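As a rough illustration of that kind of integrity check, assuming the xxhash-rust crate with its xxh3 feature enabled (the file name and stored value are placeholders):

```rust
// Sketch only; Cargo.toml would need something like:
//   xxhash-rust = { version = "0.8", features = ["xxh3"] }
use xxhash_rust::xxh3::xxh3_64;

fn main() -> std::io::Result<()> {
    // Hash the whole blob in one call; for very large files a streaming
    // hasher fed in chunks avoids loading everything into memory.
    let blob = std::fs::read("big-asset.bin")?;
    let digest = xxh3_64(&blob);

    // Compare against a digest recorded when the blob was written.
    let expected: u64 = 0x0123_4567_89ab_cdef; // placeholder, not a real digest
    if digest != expected {
        eprintln!("integrity check failed: {digest:016x} != {expected:016x}");
    }
    Ok(())
}
```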
Aside: when storing hashes, be sure to store the hash type as well so that you can change it later if needed, e.g. "xxh3-[hash value]". RFC-6920 also has things to say about storing hash types and values, although I haven't seen its format in common use.
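A tiny sketch of that tagging idea; the "xxh3-<hex>" format and helper names here are purely illustrative, not any standard:

```rust
// Prefix the digest with the algorithm name so the scheme can change later.
fn tag_digest(digest: u64) -> String {
    format!("xxh3-{digest:016x}")
}

// Split a tagged value back into (algorithm, digest).
fn parse_tag(tag: &str) -> Option<(&str, u64)> {
    let (algo, hex) = tag.split_once('-')?;
    let value = u64::from_str_radix(hex, 16).ok()?;
    Some((algo, value))
}

fn main() {
    let tag = tag_digest(0xea79_e094);
    println!("{tag}"); // xxh3-00000000ea79e094
    assert_eq!(parse_tag(&tag), Some(("xxh3", 0xea79_e094)));
}
```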
> be sure to store the hash type as well so that you can change it later if needed
Thanks for sharing this, I'd been doing this on my own for my own stuff (e.g. foo.txt-xxh32-ea79e094), but it's good to know someone else has thought it through.
I ran into the problem once where someone had named some files foo-fb490c or something similar without any annotation, and when there was a problem, it took a while to figure out they were using truncated SHA-256 hashes.
If you had made it one section into the analysis, you would have seen that at the time MeowHash made certain cryptographic claims that the author set out to disprove.
The readme has since been updated. I didn't check whether any algorithmic changes were made on top, but the discussion of the analysis on GitHub didn't point to a lot of low-hanging fruit.
It's not useless analysis, because even for non-cryptographic hashes you want every output value to be roughly equally likely. A hash function which "prefers" certain outputs has a far higher probability of collision.
Don't you think asset planting is an attack against a game's pipeline?
The author of the article's page claims the hash is not cryptographic but then goes on to make security claims about the hash. People who do not understand cryptography should be careful about making such claims. The author appears to understand this more than your comment demonstrates.
For example, a claim about change detection is a cryptographic claim about resisting preimage attacks. In a threat model, a security professional would determine whether a first preimage or a second preimage attack is what should be guarded against in the attack scenarios. Then, the professional would help with analysis, determining mitigations, defense in depth, and prioritization of fixing the vulnerabilities exposed by how the hash is used.
A hash cannot be considered standalone. It is the architecture and use case in which the hash's security properties are relied upon that determine which security properties of the application are actually fulfilled.
So, if the author is correct, which seems to be the case, then MeowHash should not be used in a production environment outside of the simplest checks. It seems faster for its intended use case to simply check for a single bit difference between two images - no hash required.
Realistically, hashes are used during the development of a game to detect when a file or asset has changed, which then triggers regeneration of the assets that depend on it. For some long or slow build steps it is better to rely on a hash change than a timestamp change to trigger a build (roughly the pattern sketched below). It can also be used to fetch pre-built data from where it's cached on a server.
That's why it should be fast and doesn't have to be cryptographic.
Regarding security: if you're at the point where someone malicious is creating files on your internal systems, you've already lost the battle.
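Roughly the pattern described above, as a dependency-free sketch (std's DefaultHasher stands in only to keep it self-contained; a real pipeline would use a faster non-cryptographic hash, and the file names are made up):

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};

// Hash the contents of a source asset.
fn hash_file(path: &str) -> std::io::Result<u64> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    Ok(hasher.finish())
}

// Returns true (and records the new hash) when `source` differs from the last build.
fn needs_rebuild(source: &str, stamp: &str) -> std::io::Result<bool> {
    let current = hash_file(source)?;
    let previous = fs::read_to_string(stamp)
        .ok()
        .and_then(|s| s.trim().parse::<u64>().ok());
    if previous == Some(current) {
        return Ok(false); // unchanged: keep the cached output
    }
    fs::write(stamp, current.to_string())?;
    Ok(true)
}

fn main() -> std::io::Result<()> {
    if needs_rebuild("texture.png", "texture.png.stamp")? {
        println!("regenerating assets derived from texture.png");
    }
    Ok(())
}
```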
> But then you have to store the entire before & after locally?
Yes, there is comparing two copies for differences (as you say) and there is integrity (modification detection). In the case of comparing new assets in a pipeline to those that were created earlier, it sounds plausible that both copies would be present.
> That's the entire point of using a hash for change detection.
This is called integrity protection. Change detection is the incorrect term to use here. Please see what I referenced earlier for first and second preimage.
What determines whether a hash is "cryptographic"? What would make it suitable for change-detection but not be "cryptographic"? Is the claim here that it would not be suitable for detecting "malicious" changes, but is still suitable for detecting "natural" changes?
A couple of features are that it is hard for an attacker to find a collision, and that it is hard to recover the original data from the hash. Losing either can cause serious security problems, but neither is necessary in something like this, where most image changes are "natural" and you gain speed by weakening those constraints.
Isn't saying it's twice as fast rather misleading? They can both hash as fast as RAM bandwidth allows anyway, and if the data is in cache I doubt one is significantly better than the other.
No releases for 17 yet, but checking the site, I discovered that AdoptOpenJDK was moved to the Eclipse Foundation and renamed to Adoptium: https://adoptium.net/
AdoptOpenJDK offers two builds for each version, HotSpot and OpenJ9. Adoptium doesn't. Instead, there's only one download, which makes reference to "Temurin", whatever that is. A quick Google search lands me on an Eclipse project page which describes Temurin as follows:
"The Eclipse Temurin™ project provides code and processes that support the building of runtime binaries and associated technologies that are high performance, enterprise-caliber, cross-platform, open-source licensed, and Java SE TCK-tested for general use across the Java ecosystem."
With all these big institutions around maybe some of them will get together and push for namespaces in crates.io?
It would be great if e.g. everything under "amazon/" could be trusted to be an official Amazon crate so you don't have to vet every dependency from every tutorial, and this seems like a common need for Amazon, Google and Microsoft.
See e.g. kibwen's comment at https://news.ycombinator.com/item?id=24445788 "The problem is that crates.io is a free, volunteer-run project with zero full-time employees who could be tasked with the drudgery of intervening in naming disputes or managing an identity layer. [...] Solve the funding issue first, and then you can start solving the rest."
The namespace issue has been brought up with the crates.io team frequently, on Discord, GitHub, and in focus groups, at various points over the life of the service. The takeaway is that the lack of namespaces isn't a funding issue: the team did not, and still does not, believe namespaces are the right design. I personally don't think that's correct, and squatting is very prevalent right now. It's been a bit of a broken record at this point, but at the end of the day it's a volunteer-run project, and the volunteers willing to spend the time to maintain it don't want the feature. I think overall this is a pain point, but by no means a deal breaker.