I think the behavior is: if you have a Vec of u128 (say 1000 elements), filter it down to fewer elements (say 10), and then collect into a Vec of u32, you might expect the result to take around 40 bytes, but on beta Rust it takes 16,000 bytes. On current stable, the collect allocates a fresh 10-element Vec of u32; on beta, it reuses the original, much larger allocation. The author's code is doing a bit more than that, but essentially the new optimization caused memory usage to increase by a large multiple when moving to beta Rust.
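A minimal sketch of that scenario (the element counts and the modulus filter are made up for illustration; the capacity you actually observe depends on which toolchain and which in-place-collect behavior you're running):

    fn main() {
        // 1000 * 16 bytes = 16,000 bytes
        let big: Vec<u128> = (0..1000).collect();

        // Filter down to ~10 elements and narrow the element type.
        let small: Vec<u32> = big
            .into_iter()
            .filter(|x| x % 100 == 0)
            .map(|x| x as u32)
            .collect();

        // len is 10 either way. Without allocation reuse the capacity is
        // around 10 (~40 bytes); with in-place reuse the original
        // 16,000-byte block is kept, i.e. a capacity of ~4000 u32s.
        println!("len = {}, capacity = {}", small.len(), small.capacity());
    }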


Ok, I missed that there's a filter step that compounds the problem. The more I read, the less this sounds like a bug and the more it sounds like application code that's missing a shrink_to_fit and was relying on a pessimization.

That being said, it's also not an unreasonable expectation on the user's part that size and capacity won't diverge wildly in code as innocuous and idiomatic as this.
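The defensive fix being alluded to is a one-liner after the collect (a variant of the hypothetical snippet above with the workaround applied):

    let mut small: Vec<u32> = big
        .into_iter()
        .filter(|x| x % 100 == 0)
        .map(|x| x as u32)
        .collect();
    // Give back any excess capacity inherited from the source allocation.
    small.shrink_to_fit();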

I wonder how the open bug will end up being resolved.


Someone here linked the open ticket for this issue. In the comments, at least one person made basically the same argument: holding on to a potentially large fraction of unused memory is a surprising sharp edge, while shrinking the Vec (and perhaps reallocating) is unsurprising behavior. Requiring lots of additional defensive shrink_to_fit calls to avoid this seems like the wrong tradeoff, but I don't write enough Rust to have a strong opinion.


The question is how much overhead a check on every collect (shrink if more than some percentage of the capacity is unused) would impose on all the code that doesn't need it.

The reason it's important to consider is that I can always add a shrink_to_fit, even if it's a sharp edge, but I can't remove a conditional inside the standard library even when I know it doesn't apply to my code. Adding explicit APIs to control this nuance is a bit much (whether a dedicated collect_maybe_shrink function or a new argument to collect that controls shrinkage), and complicating the API has usability implications of its own. It may be that ultimately this should be fixed purely through documentation, even though defensive shrink_to_fit sucks. Not all technical problems can be solved; it's all tradeoffs.
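For comparison, the user-side version of that heuristic is trivial to write yourself (the helper name and the threshold parameter here are invented for illustration, not anything in std):

    /// Shrink `v` if less than `min_occupancy` of its capacity is in use.
    /// Purely illustrative; the name and threshold are made up.
    fn shrink_if_sparse<T>(v: &mut Vec<T>, min_occupancy: f64) {
        if (v.len() as f64) < (v.capacity() as f64) * min_occupancy {
            v.shrink_to_fit();
        }
    }

    // e.g. shrink_if_sparse(&mut small, 0.5);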


> The question is how much overhead a check on every collect (shrink if more than some percentage of the capacity is unused) would impose on all the code that doesn't need it.

Once per collect? Damn near nothing.


I'm guessing allocator time, memory usage, and cache residency are the major performance considerations. A Vec knows its own length and capacity, so the comparison is cheap, and in any case collect is already expected to allocate depending on what it's fed.


The comparison is "cheap" if the branch predictor can speculate through it, and even then it isn't free. If the branch is 50/50, you pay a misprediction penalty every time you hit that code path. It's entirely possible the map operation dominates the check; I'm just highlighting that it's not free, and it's a good idea to validate the cost somehow first (while recognizing there are applications you'd penalize without them realizing it).



