Announcing the `http` crate

int_19h · on Aug 1, 2017

It's great to see this approach to core API design, and I wish other platforms adopted it. There are many advantages to having a rich ecosystem of third-party libraries, but composability becomes a problem when they define different types for the same concepts. This is the best of both worlds - you get multiple competing implementations of the interesting bits, but everything "just works" with any of them.

nly · on Aug 1, 2017

I couldn't agree more. Although, as someone mostly in to C++, I think data types like this should be standardised as concepts rather than added as concrete types. The HeaderMap described in this crate sounds like it is quite complex. I'd rather have a concept with some kind of trait mechanism for determining if e.g. the type preserved insertion order (which defaulted to false)

The Beast HTTP library that just got accepted in to Boost follows this approach. E.g. the following Fields concept specifies the requirements the library has on its any HeaderMap-like type you feed it. Writing an adapter for your own types then becomes straightforward

http://vinniefalco.github.io/beast/beast/concepts/Fields.htm...

derefr · on Aug 1, 2017

Defining the type as a trait would prevent link-time optimization of library-consumer-side manipulations of data of that type, no?

Really, you want a concept that's like a trait, but is explicit about the fact that there's exactly one implementation of the trait, and that that implementation is discoverable at compile time. Less like an interface, more like a C typedef in a third-party-library header file—just without the "header file" part.

mtanski · on Aug 1, 2017

Most Rust compilation is non-incremental but rather one shot with all type information available. Thus, because of LTO optimization either in Rust or LLVM the compiler should discover what the concrete type of the trait is and be able to optimize / inline accordingly.

cheez · on Aug 1, 2017

Can't wait for concepts...

barrkel · on Aug 1, 2017

Counterexamples: XML DOM APIs, as interfaced and implemented (separately) in, say, Java; or OpenGL, with its extension functions.

One problem is construction. When you need to interact with a library at one remove using interfaces everywhere, you can't simply new up instances; you have to go via factories. And that leads to ugly alien code.

Another is extension (OpenGL). If your super duper implementation has nifty new features, how do you expose them if you're living behind an interface? Does the interface enable extension in a usable way? Or is it indirect and awkward again?

In practice, you code and test against one or two implementations (bug compatibility), and other implementations are only supported by accident or effort by the implementors.

(I'm fully on board with the idea that a language ecosystem is constrained by the highest level of abstraction in the type system that's shared across all ecosystem libraries. It's the strongest argument for a large standard library.)

icebraining · on Aug 1, 2017

One problem is construction. When you need to interact with a library at one remove using interfaces everywhere, you can't simply new up instances; you have to go via factories. And that leads to ugly alien code.

That's mostly a Java problem, though, which lacks first-class types.

erikpukinskis · on Aug 1, 2017

I think there might be a typo in your second paragraph ("at one remove"), but this is very interesting to me, so I would love to pick your brain.

Why is a factory more alien than simply newing up an instance? Shouldn't they have basically the same interface? Is it just a matter of giving people rope to hang themselves... because they can do an arbitrary interface, someone inevitably will make a weird one, and then someone else will think it's cute, and the fashion gets perverted?

That does seem like a problem. It could be solved with culture, but solving problems with culture is hard so I understand your decision to write it off as a bad direction.

I am writing a lot of code with this kind of structure lately though, so you've got me a little worried. One of the things I dislike about a library is when it exposes raw data structures to the user, and then presumes other libraries will understand that structure. For example, if I have a client library that is sending HTTP requests, and then another separate library on the server that handles them, there is an opportunity there for miscommunication in that data layer.

On the other hand, if I have one library which gives me both a factory and an interface which consumes that factory object, then the library handles both sides of the data structure management. I am forced to only work through the interface, which means as long as my libs are in order, everything should be able to communicate.

Sorry if that's vague, but I'm programming in JavaScript and this post is about Rust and you're referencing Java and OpenGL, so I'm not sure exactly where the ground is. Maybe I only like my approach because it's JavaScript, so I can't ever assume the type of anything is correct. #stockholmsyndrome

barrkel · on Aug 1, 2017

No, I meant "at one remove". When you're interacting with a library using interfaces defined by a third party, you're intermediated; you're at one remove[1] from the library.

Needing to use a factory means you don't have ambient authority to create instances. Code that constructs needs to be parameterized by the factory, irrespective of how far down a call chain it is. Annotating the call stack with a handle to the library adds clutter and clumsiness throughout.

You can kind of get around this using module systems of various kinds, and the de facto module system in JS of putting your entire program in a function parameterized by the modules it uses - it's far from the worst way of doing things, and it's a lot less clumsy than many alternatives. There's a bonus in that it's the typical idiom in JS. But it isn't in many other languages, so the benefits of the API design pattern needs to be traded off against how it clashes with the language culture. It's not an unalloyed good.

[1] https://www.collinsdictionary.com/dictionary/english/at-one-...

MrBuddyCasino · on Aug 1, 2017

I agree, Rust leadership continues to make well thought out decisions. Coming from Java, the Servlet spec is comparable (though broader in scope) and was really successful.

laumars · on Aug 1, 2017

Go's interface{} gets heavy criticism, a lot of it deservingly too, but this is one area where I think Go works really well. You can have a standard API structure and use that across multiple domains.

I really do need to give Rust a try though. I'd been holding back because my impressions of it was the core libraries were still in flux so code that compiled today might fail 2 months down the line. Is this still the case now?

cytzol · on Aug 1, 2017

It's all good now! Rust has been stable with backwards-compatibility guarantees since 1.0 was released two years ago.

staticassertion · on Aug 1, 2017

Right, but I think when they say "core libraries" they probably mean things like hyper, which are still unstable (though getting closer to stability). That said, a lot of big rust crates have stabilized in the last few months.

am1988 · on Aug 1, 2017

yeah, but you still end up needing nightly all the time.

steveklabnik · on Aug 1, 2017

The vast majority of our users are on stable.

It's true that some cases still use nightly, as some people want to opt into cutting-edge stuff. But it's much much less than it was a while back.

am1988 · on Aug 2, 2017

I believe you, but at the same time it's an issue every time I go to use it. I want to use this crate, it has a feature flag and now I'm on nightly.

Now, I'm willing to use the bleeding edge, since I'm just just learning the language, but it still just bugs me in that way that code smells always do.

steveklabnik · on Aug 2, 2017

Which crates are you running into, out of curiosity?

ented1 · on Aug 2, 2017

No bullshit: interface{} is almost the same level hack as void*. And is not well, no matter how much Stockholm syndrome you have.

laumars · on Aug 2, 2017

Like I said, much of the criticism against interface{} is deserving, but genuinely it works really well for passing generic APIs like Reader / Writer.

The way how interface{} works there is you actually write your own named structure that is bootstrapped by interface{} rather than dealing with interface{} as a generic type that needs casting. So what you're actually working with is an io.Reader / io.Writer structure but which can be transferred to any domain providing your specific implementation of Reader / Writer supports the same methods (since it's the methods that define the interface you avoid all the horrible hacks than normally trouble interface{}). This means you can transfer data from a gzip archive to a base64 encoder or JSON marshaller to a OS STD* file or network device all as if they were the same logical interface and without having to write a lot of additional layers encoding / decoding the data nor describing the interface. It all works surprisingly painlessly. In fact it's literally the only time when working with interface{} that the process isn't painful.

So it's going nothing to do with fanboyism nor stockholm syndrome, interface{} just behaves quite differently to it's usual behavior and works really well in this specific situation in my personal opinion. If the Go developers left interface{} there instead of using it as a hacky alternative to generics then I doubt there would be the same backlash against it. But sadly they didn't.

amelius · on Aug 1, 2017

It's a lot like the Boost libraries are for C++. They are curated, and easily composable by design. I don't see the big leap here, but you are right, it is a very good way of working.

On the other hand, this is the Nth time that somebody implemented a HTTP library on some platform. It seems that something can be improved still, software-engineering wise.

tyingq · on Aug 1, 2017

One watch out here is patterns that encourage developers to do the wrong thing.

With golang, for example, it's common to see applications that don't set reasonable cache control headers, content type and length headers, handle head requests properly, compression, and so on.

It's not that golang can't do those things, but the libraries, documentation, and examples don't encourage it. So you see a lot of minimal http implementations that "work", but not well. It's like every (less experienced) end user of the library has to learn all the basics by trial, error, bug reports, etc.

grahameb · on Aug 1, 2017

I think this is mostly a layering issue. If everyone that just wants to scrape HTML or talk to an API is using the basic HTTP API, then you've got this problem. That's where Python was about ten years ago - using the stdlib's urllib, it was very easy to talk HTTP, but you'd probably do it badly.

Then requests came along, aimed to make it easy for anyone to consume HTTP, with sensible defaults -- and it rapidly became the defacto standard way to get things done. Hopefully the rust ecosystem will have similar libraries, built on top of this new 'http' base.

jonreem · on Aug 2, 2017

Good news! The `reqwest`[0] library already exists and serves exactly that purpose.

[0]: https://github.com/seanmonstar/reqwest

anp · on Aug 1, 2017

This particular announcement is about defining shared API types between libraries for which (I think) the things you've listed would be a concern. It's not a full HTTP implementation AFICT.

taeric · on Aug 1, 2017

The point is that pushing this as a concern for all implementing libraries is dangerous and likely not to end well. (At least, that is how I took the point.)

tadfisher · on Aug 1, 2017

I'd argue that all but the most extremely basic of HTTP options are far from universal, and defining sane defaults is the responsibility of frameworks and not a core library.

joosters · on Aug 1, 2017

As I understand it, this http library is meant to represent a 'low-level' representation of a HTTP request, and so I think it's right that it doesn't force lots of default HTTP headers or settings.

HTTP requests rarely live in isolation, you are probably going to be making several requests to one or more servers, so most language libraries have another layer on top of the raw HTTP objects. (e.g. in Perl, there's LWP::UserAgent, among others). Here is where the network code tends to exist, for instance. It's more appropriate to put some defaults at this layer - for example, keep-alives and connection re-use are vital concepts for efficient HTTP communication but apply across multiple requests. You can't sensibly control them at the base 'this is a single HTTP request' object layer.

tyingq · on Aug 1, 2017

That doesn't rule out good examples though. And some of these are close to universal, maybe not in set value, but at least setting them to some value, versus omitting them.

JoshTriplett · on Aug 1, 2017

Agreed. And in particular, these kinds of types would make it much easier to write a framework-agnostic library that helps you handle caching correctly, built on top of these types, like Request and Response.

taeric · on Aug 1, 2017

I'd argue that at the API level, these things should be required to think about. Any level of "help" so that I don't have to worry about cache headers should be from a framework that tells me what they think about them.

JoshTriplett · on Aug 1, 2017

> I'd argue that at the API level, these things should be required to think about. Any level of "help" so that I don't have to worry about cache headers should be from a framework that tells me what they think about them.

Agreed completely. The lowest-level API should expose them and every other bit of the standard, and higher-level APIs should handle them automatically in sensible ways.

RandomKid · on Aug 1, 2017

>> defining sane defaults is the responsibility of frameworks and not a core library

But...but.. Golang core team teaches us that "framework" is a 4 letter word and a core library is enough for everybody. No need to overcomplicate with extra abstractions, just use the standard library they say.

tyingq · on Aug 1, 2017

Not sure if that's a jab at my post. Not trying to be controversial. But, it doesn't appear that the example code shown for returning a json response even sets the Content-type header to application/json.

Since there's obviously some time lag before frameworks that use this core library will appear, good documentation on this sort of thing seems advisable.

RandomKid · on Aug 2, 2017

That's not a jab at your post. Look at Swift and the way they encourage creation of new frameworks and server applications [1].

Then look at the hostile atmosphere in the Go community and tweets of its core team members stating that "http" package is great for everything, no need to use anything else. The framework (or as they prefer to call it toolkit) authors are throwing shit at each other accusing their opponents of creating something not "in the spirit of go" / unconventional, usefulness for the end users is not taken into account. Some others are creating posts about how everybody should stop creating frameworks immediately because that makes them uncomfortable. People hesitating to open-source their code because they are afraid to become a victim of crusade. That's not exactly healthy athmosphere, and it is nourished by core team members.

[1]: https://swift.org/blog/server-api-workgroup/

the_common_man · on July 31, 2017

Would it not be better to call it http-types/http-base/http-common as this just provides common types and not the transport or security? Thanks and keep up the awesome work.

JoshTriplett · on Aug 1, 2017

I think "http" actually makes sense here, for two reasons.

First, this crate provides a common layer for the HTTP protocol; if you're interacting with HTTP, you'd use this crate and its types, often directly. Note, in particular, that this crate provides builders to construct HTTP requests and responses, which is often a large part of client and server libraries, respectively.

And second, I don't think any specific HTTP client or server software should claim the crate name "http". A common layer like this, built around the protocol itself, seems to have a much more reasonable claim to it.

tadfisher · on Aug 1, 2017

In addition, the "transport" and "security" portions are the least-HTTP-related parts of an HTTP stack. Far more than HTTP runs on TCP and TLS.

gregwebs · on Aug 1, 2017

In haskell it is called http-types

arianvanp · on Aug 1, 2017

Seems like they took the hint from haskell :) good job. We have WAI (https://www.stackage.org/lts-9.0/package/wai-3.2.1.1) a common interface for multiple HTTP Web Servers. It's very useful, and a rich ecosystem of middleware has flourished.

deathanatos · on Aug 1, 2017

I really wish I could start a rebellious movement in HTTP libraries to call the constant UNAUTHENTICATED and for it to return

    401 Unauthenticated

…and then once rough consensus switches, get an update in the RFC.

(I'd likely be horrified to learn that something out there critically depends on the Reason Phrase…)

JoshTriplett · on Aug 1, 2017

You're not supposed to depend on the specific reason phrase given with the code, but I wouldn't find it surprising if something does.

See https://hackernoon.com/three-bytes-and-a-space-8f9fbd1c669b for a related debugging adventure.

Also, technically, you can return 401 for either "unauthenticated" (you need to pass authentication information) or "unauthorized" (you passed authentication information but it wasn't acceptable), depending on the nature of what you need.

BillinghamJ · on Aug 1, 2017

401 Unauthorized means you aren’t authenticated, or your authentication is broken/expired/invalid.

403 Forbidden means that your authentication is valid, but that you don’t have access to the resource.

401 should not be used to indicate the latter.

JoshTriplett · on Aug 1, 2017

HTTP Authentication requires a lot of care to interact with the browser's authentication flow and UI.

I worked on a site for several years that used HTTP Digest authentication. We finally gave up on it and switched to the standard form-and-cookie approach, because the browser authentication flows had so many bugs, quirks, per-browser idiosyncrasies, and other issues to work around.

BillinghamJ · on Aug 2, 2017

I've never implemented the status codes for browser purposes - only for APIs.

HTTP Digest looks interesting, but I think I'd generally feel more comfortable just using HTTP Basic over HTTPS. Or better, of course, just doing it yourself with some signed cookies.

throwaway91111 · on Aug 1, 2017

Why not? 401 seems like a fine fit if your client doesn't need to differentiate between authentication and authorization; most don't need to at all.

Also, you can fail authorization without passing authentication. For instance, you could be authorized by ip range or something unrelated to any of the data in the http request.

erikpukinskis · on Aug 4, 2017

Wut. Most clients don't need to differentiate between "you're not allowed to do that" and "you're not logged in"? Those things require totally different reactions, no?

BillinghamJ · on Aug 2, 2017

> Why not?

Because those are the semantics as-per the spec.

In relation to your latter point, 403 also covers any other reasons your access is forbidden - e.g. IP ranges etc.

majewsky · on Aug 1, 2017

> "unauthorized" (you passed authentication information but it wasn't acceptable)

That's still "unauthenticated". "Unauthorized" means that you were authenticated (i.e. the server knows who you are), but you are not allowed (i.e. authorized) to execute the requested operation. So the correct names would be "401 Unauthenticated" and "403 Unauthorized".

awj · on Aug 1, 2017

Cf. "Windows 10", which had to a version number because people were looking for a "Windows 9" (i.e. Windows 98 & Windows 95) prefix in a version name string despite it being available as direct numeral values.

Feel free to be horrified, but you'd be a bit naive to be surprised.

azdle · on Aug 1, 2017

> (I'd likely be horrified to learn that something out there critically depends on the Reason Phrase…)

IIRC, the 8xxx RFCs have specified that you must read the reason and that the code itself does not fully specify the result because there have been some custom codes that overlap on number.

deathanatos · on Aug 1, 2017

the 8xxx RFCs? Did you mean the newer 7xxx ones? If so, they say,

> The reason-phrase element exists for the sole purpose of providing a textual description associated with the numeric status code, mostly out of deference to earlier Internet application protocols that were more frequently used with interactive text clients. A client SHOULD ignore the reason-phrase content.

I'm not aware of any highlevel HTTP RFCs in the 8xxx range.

mixedCase · on Aug 1, 2017

I can't seem to find a parser for HTTP requests in the repo. I assume this is out of scope for the crate?

connorcpu · on Aug 1, 2017

Looks like this crate is merely trying to provide a standard set of types used in an HTTP pipeline to make different crates that want to have a part in that pipeline able to be combined together, without having to convert your types all over the place

ivanbakel · on Aug 1, 2017

To quote the announcement:

>It does not cover transport, but is rather intended for use in libraries like Hyper, and to support an ecosystem of crates that can share HTTP-related types.

So I don't think they added any implementation details to the crate itself, to avoid people taking issue with something being too bloat-ey or not to their need, and rejecting the common types, which seem to be the real goal. It does seem like something you'd always need, though - I can only imagine that a proper parser would be pretty large, and they didn't want people to complain about it.

JoshTriplett · on Aug 1, 2017

I wouldn't be surprised if an "http-parser" crate pops up on top of this. But unlike the http crate, an http-parser crate would have to make at least one potentially controversial choice, namely "which parsing library".

steveklabnik · on Aug 1, 2017

I've been working with this crate for a while, and use https://crates.io/crates/httparse with it, it works pretty well.

tatterdemalion · on Aug 1, 2017

Parsing would get into policy choices this crate doesn't want to make, like DoS protection (max request size, mid-request timeouts, etc), and often is intertwined with reading data from IO, which this crate definitely doesn't want to get involved in.

gPH45 · on Aug 1, 2017

Interesting micro-optimizations... For example the use of lookup tables and nested switches in uri.rs and method.rs.

Is this common practice in Rust? Shouldn't the compiler be smart enough to do that for you? Such code seems quite sensitive to typos.

DiThi · on Aug 1, 2017

I don't think the lookup tables is a micro-optimization. They're both clear and fast in this case (I can't think of a more clear alternative, just of a less verbose one). The "switches" are actually pattern matching, a very powerful feature of Rust. I'm not sure about nesting them being a micro-optimization either. I think it's also a choice that is both readable and fast. Can a more experienced Rust code shed light?

thijsc · on Aug 1, 2017

This uses pattern matching, the compiler will verify that all options exists and that the coverage of the enum is complete.

pimeys · on Aug 1, 2017

This is good news. Waiting it to be merged with hyper including the http2 support. With both of them it will make Rust the perfect choice for all my current use-cases.