How would you have gone about making Strings be "as comfortable as integers"?
Arrays are indexed by usize, so if you're on a 64-bit machine, then you shouldn't need a cast. It's _unconstrained_ numbers that default to i32, not anything without a suffix.
> How would you have gone about making Strings be "as comfortable as integers"?
I would prefer a str be a str be a str, regardless of how you got it. Lowercase type-name and fundamental like an integer.
I'm fairly certain I understand why Rust made the choice they did. I've read the forum threads at HN, Reddit, and users.rust-lang, and I've seen previous replies by you and other Rusties, so I hope you won't try to educate me about the performance advantages of having slices as references and another string type as an ownership class or why we need OsStr and friends.
If strings are the fundamental processing concern in your application, then I think you should be able to opt-in to that kind of micro-optimization and complexity, but it would've been better to spare the rest of us who have different concerns. I don't want to become a string expert to build a filename, and the default implementation could (at least conceptually) be always on the heap for all I care. Go one step further and implement the "small string optimization" (Alexandrescu's fbstring), and you'd probably get back most of the performance without nearly the complexity.
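For anyone who hasn't run into the small string optimization, here is a rough sketch of the idea in Rust. It is not fbstring's actual layout and not anything in the standard library; `SmallString`, `INLINE_CAP`, and the 22-byte threshold are names and numbers made up for illustration.

```rust
// Assumed inline capacity, chosen arbitrarily for this sketch.
const INLINE_CAP: usize = 22;

// Hypothetical type: short strings live inline, long ones go to the heap.
enum SmallString {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(String),
}

impl SmallString {
    fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            // Copy the bytes into the fixed-size buffer; no allocation.
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallString::Inline { len: s.len() as u8, buf }
        } else {
            SmallString::Heap(s.to_owned())
        }
    }

    fn as_str(&self) -> &str {
        match self {
            SmallString::Inline { len, buf } => {
                // The bytes were copied from a valid &str, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize]).expect("valid UTF-8")
            }
            SmallString::Heap(s) => s.as_str(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SmallString::Inline { .. })
    }
}

fn main() {
    let short = SmallString::new("/tmp/file");
    let long = SmallString::new("/a/much/longer/path/than/the/inline/buffer");
    println!("{} inline: {}", short.as_str(), short.is_inline());
    println!("{} inline: {}", long.as_str(), long.is_inline());
}
```

The caller only ever sees `as_str()`, so the inline-versus-heap decision stays an implementation detail.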
> Arrays are indexed by usize
I've spent the last half hour trying to troubleshoot why this line hangs:
let aa = vec![1u8; 10e9 as usize];
However, I'm at home using the Ubuntu under Windows thing, so maybe there's some bad mojo between Rust and my less than usual setup. (If you're interested, I have 32 Gigs of memory, and the equivalent C malloc and memset code runs just fine, so I don't think Rust is doing the right thing here...).
Anyways, I wanted to test the commented out line, but I'm stuck for now. If that line would work, I'll admit I was wrong, but I think the uncommented line shows a similar complaint.
for ss in 0..63 {
//print!("{}\n", aa[1<<ss]);
print!("{}\n", 1<<ss);
}
> It's _unconstrained_ numbers that default to i32, not anything without a suffix.
Looking at the present and the future, why is that a sensible default? Both x64 and ARM are going to use a 64 bit integer register for the operations, and many of those operations are going to be 1-clock throughput. You can probably find a counter example, but 32 bit integers aren't generally faster than 64 bit ones.
> I would prefer a str be a str be a str, regardless of how you got it. Lowercase type-name and fundamental like an integer.
Lowercase type names are reserved for primitives. `String` is not a primitive but a full data structure, hence the capital S. It owns a heap buffer that can be borrowed as the `str` primitive, and it tracks length and capacity alongside it.
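In practice the two types mix without much ceremony. A minimal sketch, where `shout` is a hypothetical helper and not a standard library function:

```rust
// Hypothetical helper: accepts any borrowed string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let literal: &str = "tmp"; // borrowed slice, stored in the binary
    let owned: String = String::from("file"); // owned, heap-allocated, growable

    // `&String` coerces to `&str` automatically (deref coercion),
    // so one function signature covers both.
    println!("{} {}", shout(literal), shout(&owned));
}
```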
> I don't want to become a string expert to build a filename, and the default implementation could (at least conceptually) be always on the heap for all I care.
Is it that hard to understand that when you create a string, you will create it as either a `String` or `PathBuf`? File methods are designed to automatically convert input parameters into a `&Path` so it doesn't matter what string structure you provide.
There is also no way (currently) to create a stack-allocated string with the standard library out of the box. You can do this with crates like `arrayvec` though. It's very much opt-in for that performance.
use std::fs::File;
let path = String::from("/tmp/file");
let mut file = File::open(&path).unwrap();
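To show the "it doesn't matter what string structure you provide" part without touching the filesystem, here is a small sketch. `file_name_of` is a hypothetical helper; the only thing it borrows from the standard library is the `P: AsRef<Path>` bound, which is the same bound `File::open` declares.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper using the same generic bound as `File::open`.
fn file_name_of<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .file_name()
        .map(|name| name.to_string_lossy().into_owned())
}

fn main() {
    // A string literal, an owned String, and a PathBuf built by joining
    // components all go through the same call.
    let from_str = file_name_of("/tmp/file");
    let from_string = file_name_of(String::from("/tmp/file"));
    let from_pathbuf = file_name_of(PathBuf::from("/tmp").join("file"));

    println!("{:?} {:?} {:?}", from_str, from_string, from_pathbuf);
}
```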
> Looking at the present and the future, why is that a sensible default? Both x64 and ARM are going to use a 64 bit integer register for the operations, and many of those operations are going to be 1-clock throughput. You can probably find a counter example, but 32 bit integers aren't generally faster than 64 bit ones.
No need to use a 64-bit integer when you only need a 32-bit integer. You can fit two 32-bit integers into a single 64-bit integer and perform a calculation on both simultaneously with a single cycle, versus spending two cycles to calculate two 64-bit integers. There's also no need to pay that memory cost either.
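The memory half of that is easy to check. A quick sketch, with `bytes_for` as a hypothetical helper; it says nothing about throughput, only about footprint:

```rust
use std::mem::size_of;

// Hypothetical helper: bytes needed to store `n` values of type `T`
// in a contiguous array.
fn bytes_for<T>(n: usize) -> usize {
    n * size_of::<T>()
}

fn main() {
    // Same element count, half the bytes (and half the cache lines).
    println!("1024 x i32 = {} bytes", bytes_for::<i32>(1024));
    println!("1024 x i64 = {} bytes", bytes_for::<i64>(1024));
}
```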
No, I'm interested in what tradeoffs you would have made differently. I now understand. Thanks! (I disagree, but at least I understand.)
> why this line hangs:
It compiles and runs effectively instantaneously for me on Ubuntu under Windows as well, so that's very strange. Maybe file a bug?
> I think the uncommented line shows a similar complaint.
Yes, there's no constraint on that literal, so it's going to be an i32. When we made this decision, we did some analysis, basically no numbers in real world programs weren't constrained, it was often tests, toy programs, and documentation. It should be a rare thing. YMMV.
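Here's a small sketch of what "constrained" means in practice. The two function names are made up for the example; the only thing being demonstrated is which type inference picks for a suffix-free literal.

```rust
use std::mem::size_of_val;

// Size in bytes of a literal that nothing else constrains.
fn unconstrained_size() -> usize {
    let n = 1; // no other use pins the type, so it falls back to i32
    size_of_val(&n)
}

// Size in bytes of a literal that is used as an array index.
fn index_constrained_size() -> usize {
    let data = [10u8, 20, 30];
    let i = 1; // indexing below constrains this to usize
    let _ = data[i];
    size_of_val(&i)
}

fn main() {
    println!("unconstrained literal: {} bytes", unconstrained_size());
    println!("index literal: {} bytes", index_constrained_size());
}
```

On a 64-bit target the second line reports 8 bytes, with no suffix and no cast in sight.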
> why is that a sensible default?
Your assertion about the speed was the opposite of what was asserted while we had the discussion, basically. And not everybody is running on 64-bit hardware, so it's a broader default.
> It compiles and runs effectively instantaneously for me on Ubuntu under Windows as well, so that's very strange. Maybe file a bug?
Follow-up: I tried it with -O (don't know why I didn't think of that earlier), and it runs fine. So maybe the debug version is just generating terrible code, initializing by iterating through 10 billion bounds checks or something?
Anyways, more importantly, it works as I would like and does not behave as a 32 bit integer. I think I understand what you mean by "constrained" now. And clearly, I was wrong.
However, if most un-suffixed integers in real-world programs will become constrained (as you claimed), this further confuses me why i32 is the unconstrained choice. It doesn't seem like something so rare could be enough of a performance problem to justify being anything but the largest supported size.
I remember installing it by cut and pasting one of the "curl ... | sh" commands there.
> Nothing about that code should be doing bounds checks, as it's just allocating an array.
I didn't dive into the macro definition for vec!, but I assume there is a loop in there to Copy the initialization element 10 billion times. I think you guys do bounds checking on the lower level reference to a slice that Vec uses. But I really don't know. If it's not that, then it was hanging or spinning doing something else. (Debug version of your memory allocator?)
> basically no numbers in real world programs weren't constrained, it was often tests, toy programs
The fact that C and C++ compilers generally chose to leave int at 32 bits on 64-bit platforms, combined with the standards requiring the "usual promotions" for smaller types to go to int, bites me all the time. I'm very happy that Rust dodges the promotions problem altogether, and I'm sorry if I'm wrong about the array subscripting thing (does the snippet I provided panic at 1<<32 or 1<<34?).
> And not everybody is running on 64-bit hardware, so it's a broader default.
That argument could be used to justify 8 or 16 bit integers... :-)
> (does the snippet I provided panic at 1<<32 or 1<<34?).
Overflow is a "program error", and in debug builds it is required to panic. In other builds, if it does not panic, it is required to wrap using two's complement. Rustc currently just wraps, but in the future, we'll see.
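If you want behavior that doesn't depend on the build profile, the integer types have explicit methods for each policy. A minimal sketch; `shift_one` and `add_one_explicit` are hypothetical helpers, while the `checked_*`, `wrapping_*`, and `overflowing_*` methods are from the standard library.

```rust
// Hypothetical helper: shift 1 left by `bits` as a 32-bit and a 64-bit value.
// `checked_shl` returns None when the shift amount is >= the bit width.
fn shift_one(bits: u32) -> (Option<i32>, Option<u64>) {
    (1i32.checked_shl(bits), 1u64.checked_shl(bits))
}

// Hypothetical helper: add 1 under three explicit overflow policies.
fn add_one_explicit(x: i32) -> (Option<i32>, i32, (i32, bool)) {
    (
        x.checked_add(1),     // None on overflow
        x.wrapping_add(1),    // two's complement wraparound
        x.overflowing_add(1), // wrapped value plus an overflow flag
    )
}

fn main() {
    for bits in [31, 32, 34] {
        println!("1 << {}: {:?}", bits, shift_one(bits));
    }
    println!("i32::MAX + 1: {:?}", add_one_explicit(i32::MAX));
}
```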
That's true except our 16 bit support is nonexistent at the moment :)