
Rust has also recognized the problem from a very early stage.

For example, this is why there was a `box` operator in early Rust.

And e.g. placement-in-like APIs had been in the works for years; it's just that no satisfying and sound solution has been found (though there were multiple solutions which initially seemed sound).

Which is why we are currently in a "hope the optimizer does the right thing" situation (though it is pretty much guaranteed to do the right thing in a lot of cases around Box).

But then it also isn't the highest priority, as it turns out a lot of "big" data structures (lists, maps, etc.) tend to grow on the heap anyway. So the situation where someone runs into a debug build crashing because of a big data structure is pretty rare, and a release build crashing as well is even rarer.

One of the most likely ways to end up with a too-big data structure on the stack is very deep type nesting. But such types are (in the Rust ecosystem) often seen as an abuse of the type system and an anti-pattern anyway. Though it can be a fun puzzle, and some people are obsessed with bending the type system to their will to create DSLs or encode everything possible in the type system. But I have yet to see a commercial project with mixed-skill-level team members where using such libraries didn't lead to a productivity reduction in the long run (independent of programming language).



It’s just a bit of a surprise, and Rust hasn’t ironed out some of these surprises. I’m sure it will get fixed eventually.

Yes, you can give examples of cases where unusual code (like deep type nesting) can create these large data structures, and you can call it an anti-pattern. But Rust is also pitched as a C++ replacement for greenfield projects, so you have all of these C++ programmers who are used to being able to “new” something into existence of any size, and then initialize it. A series of design decisions in Rust has broken that for objects which don’t fit on the stack.

I’m satisfied with the explanation that “no satisfying and sound solution has been found” and I’m also satisfied with “Rust developers haven’t gotten around to addressing this issue”. I’m not really interested in hearing why some people who run into the same issue are making bad decisions.


One piece of context I want to add: although there's no language construct for placement new, the unsafe `MaybeUninit` allows you to write partially to memory, and a macro[1] can be written to make it almost seamless to use.

[1]: https://crates.io/crates/place
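To sketch the `MaybeUninit` route without the macro (the helper name here is mine; `Box::new_uninit` and `Box::assume_init` are stable since Rust 1.82):

```rust
use std::mem::MaybeUninit;

// Illustrative helper: allocate a 1 MiB array directly on the heap and zero it
// in place, so the full-size array never has to exist on the stack.
fn big_zeroed() -> Box<[u8; 1024 * 1024]> {
    let mut buf: Box<MaybeUninit<[u8; 1024 * 1024]>> = Box::new_uninit();
    unsafe {
        // Zero the whole allocation in place (1 element of the array type)...
        buf.as_mut_ptr().write_bytes(0, 1);
        // ...then assert that it is now fully initialized.
        buf.assume_init()
    }
}
```

This avoids relying on the optimizer to elide a stack copy, at the cost of an `unsafe` block whose soundness you have to argue yourself.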


> But I have yet to see commercial projects with mixed skill level team members where using such libraries didn't lead to productivity reduction on the long run (independent of programming language).

Mixed skill team or not, I really don’t see why Box<[u8; 1024 * 1024]> should be something the language struggles with.


EDIT: I realized TryFrom is just implemented for Box<[T]>, not Vec<T>, but you can easily convert a Vec<T> to a Box<[T]>. I updated the code accordingly.

vec![0u8; 1024*1024].into_boxed_slice().try_into().unwrap()

isn't that terrible to use

here as a function:

  fn calloc_buffer<const N: usize>() -> Box<[u8; N]> {
      vec![0u8; N].into_boxed_slice().try_into().unwrap()
  }

If you want to rely a bit less on the optimizer, using `reserve_exact()` + `resize()` can be a good choice. I guess it could be worthwhile to add a helper method to std.
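That variant would look something like this (a sketch; the function name is mine). `reserve_exact` requests the single heap allocation up front and `resize` fills it, so no large value is ever built on the stack:

```rust
// Illustrative helper: allocate exactly N bytes on the heap, zero-fill them,
// and convert to a fixed-size boxed array.
fn zeroed_buffer<const N: usize>() -> Box<[u8; N]> {
    let mut v: Vec<u8> = Vec::new();
    v.reserve_exact(N); // requests exactly N bytes (the allocator may round up)
    v.resize(N, 0);     // fill without reallocating
    // len == N, so the TryFrom<Box<[u8]>> conversion cannot fail.
    v.into_boxed_slice().try_into().unwrap()
}
```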


Agreed – but why would you want to box an array instead of simply using a Vec?


You can save memory by having fewer fields. This can matter when you have lots of small arrays.

Vec<u8> has {usize length, usize capacity, void* data}. Box<[u8]> has {usize length, void* data}. Box<[u8;N]> has {void* data}.


For a typical use case that seems like a rather extreme optimization, no? If you have a lot of objects with many small arrays and you're keeping them in a Vec, they'll be on the heap. If you're dealing with a bunch of small parts of a big blob of binary data, you'd use slices and not create new arrays. If you're on an embedded system you're not likely to have an allocator anyways.

(Without trying to be too argumentative:) right? Or am I missing something?

Edit since I've been throttled:

  For example it can make a difference between passing values per register or per
  stack in some situations. … But then for some fields where C++ is currently very
  prevalent it might matter all the time.
That's an interesting one I hadn't thought about (and I didn't realize that the register keyword was deprecated in C++17). In a rather broad sense I hope Rust catches on in the kinda niche stuff where C++ is often popular. For example I've only done a little bit of dabbling with Rust in an embedded context but overall I thought it brought a lot to the table.


In a system at $WORK I recently optimized a structure from String to Box<str> (similar optimization to remove the 8 byte capacity field) and saved ~16 GB of memory. Granted, the program uses 100–200 GB of RAM at peak, but it still was a nice win for basically no work. It's also a semantic win, since it encodes "this string can't be mutated" into the type.
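It's the same header shrink as with boxed slices. Sketched (helper names are mine; the conversion itself is just `into_boxed_str`):

```rust
use std::mem::size_of;

// String is {ptr, len, capacity}; Box<str> is a fat pointer {ptr, len},
// so each converted string saves one usize (8 bytes on 64-bit).
fn saved_bytes_per_string() -> usize {
    size_of::<String>() - size_of::<Box<str>>()
}

// The conversion may reallocate once to shrink capacity down to len.
fn freeze(s: String) -> Box<str> {
    s.into_boxed_str()
}
```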


yes but also no,

In some situations "optimizing smart pointers" to just be a single pointer size (Box<[T; N]>) instead of two pointer sizes (Box<[T]>) or instead of three pointer sizes (Vec<T>) can make a difference.

For example it can make a difference between passing values per register or per stack in some situations.

Or it could make the difference of how many instances of the boxed slice fit efficiently into the stack part of a smallvec. Which if it's the difference between the most common number fitting or not fitting can make a noticeable difference.
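The smallvec point can be illustrated without pulling in the crate itself, just from the handle sizes (illustrative numbers; smallvec's actual layout differs):

```rust
use std::mem::size_of;

// How many handles of type T fit in a given inline buffer?
// With 24 bytes of inline storage, three thin Box<[u8; N]> pointers fit
// where only one fat Box<[u8]> would.
fn inline_fit<T>(inline_bytes: usize) -> usize {
    inline_bytes / size_of::<T>()
}
```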

Though for a lot of fields of programming you likely won't opt to do such optimizations, as there are more important things to do/better things to spend time optimizing. But then for some fields where C++ is currently very prevalent it might matter all the time.



