Crow – C++ Microframework for Web, inspired by Python Flask

ipkn · on July 8, 2014

I'm the author of this project, and it's not complete yet. I had planned to publish it after finishing the basic features and documentation.

jvickers · on July 8, 2014

It looks very convenient for making some relatively simple web services (based on me not knowing much C++, others would get more mileage probably). It looks like the kind of C++ project of a moderately small size that's implementing something that I'm very familiar with already, so I think it will be useful to help me learn C++.

I have a question: what is the reason for only having .h files, with no .cpp files, in the main part of the microframework? Did something force you to do it that way, or was it done like that because you like that code structure?

Mr_P · on July 8, 2014

Header-only libraries have several advantages (see http://en.wikipedia.org/wiki/Header-only). Most notably, they are dead simple to include in a project, as opposed to having to separately build and link to a shared library.

frankzinger · on July 8, 2014

Yeah but usually "header-only" implies templates, of which there seems to be little in this case. Without templates there's little point in putting everything in headers because all the code becomes inline.

Inlining everything is bad because:

  - it makes the binary much bigger.
  - the smallest code change forces library users (applications) to be recompiled.

For libraries it's usually better to go to the other extreme: hide as much code and data as possible in the source files. See, for example, https://en.wikipedia.org/wiki/Opaque_pointer. It greatly reduces the need for applications to be recompiled when the library is updated.
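
To make the opaque-pointer suggestion concrete, here is a minimal sketch, collapsed into one listing for brevity. The `Server` class and its members are hypothetical names invented for illustration, not Crow's API. In a real library the first part would live in the public header and the second part in a source file, so changes to `Impl` would not force applications to recompile.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- would live in server.h (hypothetical public header) ----
class Server {
public:
    explicit Server(std::string name);
    ~Server();                          // defined where Impl is complete
    Server(Server&&) noexcept;
    Server& operator=(Server&&) noexcept;
    const std::string& name() const;

private:
    struct Impl;                        // opaque: layout is hidden from users
    std::unique_ptr<Impl> impl_;
};

// ---- would live in server.cpp (hypothetical source file) ----
struct Server::Impl {
    std::string name;                   // fields can change without touching the header
};

Server::Server(std::string name) : impl_(new Impl{std::move(name)}) {}
Server::~Server() = default;
Server::Server(Server&&) noexcept = default;
Server& Server::operator=(Server&&) noexcept = default;
const std::string& Server::name() const { return impl_->name; }
```

The cost of this design is one heap allocation and one pointer indirection per object, which is the usual trade-off against a header-only layout.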

It's actually very unusual to put all function implementations inline (edit: as in, inside the class declaration) in C++. I wonder whether the author is a heavy Java user.

tpush · on July 8, 2014

I'd even argue that, with an LTO-optimizing compiler (e.g. clang -flto), writing functions inline is basically obsolete. In my case clang/llvm is very much able to inline code across object file boundaries, and is even capable of analyzing function-pointer assignments and inlining the respective functions.

geon · on July 8, 2014

> because all the code becomes inline.

The language says that's just a suggestion to the compiler, right? Does any compiler actually inline for non-trivial methods?

pbsd · on July 8, 2014

`inline` has a different meaning in C++. The compiler is free to inline the call if it wishes to, but `inline` means that the same function can be defined in multiple translation units without breaking the one definition rule. Example:

    // header.hpp
    inline int f(int x) { return x + 1; }

    // a.cpp
    #include "header.hpp"

    // b.cpp
    #include "header.hpp"

If f was not marked inline, linking a.cpp and b.cpp together would find two conflicting definitions of f, and the link would fail. `inline` lets the linker ignore this: it simply picks one of the multiple definitions as the 'real' one and moves on.

mtdewcmu · on July 8, 2014

What you are describing sounds a lot like 'static' in C, which marks a function to not export its name, so it can't be seen from outside the file.

'inline' in C is a hint to the compiler that you'd like the function inlined.

You can combine the two, and, in fact, it seems like a good idea IME to also use static if you're using inline.

Are you sure c++ is that different?

pbsd · on July 8, 2014

`static` will result in a copy of f for every translation unit (without LTO, at least). `inline` will not. `static inline` is effectively the same as `static`, with a slight hint to the compiler to inline the call.

`inline` is used extensively in C++ to make header-only libraries possible; otherwise you'd get constant symbol clashes during linking. With `static` you would get enormous size blowup. It has little to do with the actual inlining of the call, which is mostly up to the compiler.

In C, the situation is complicated. `inline` does not exist in C89. GCC has an interpretation of it for C89 (-std=gnu89), which differs from the C99 interpretation. The only safe way to use inline in C is usually to couple it with `static`, unless you know what you're doing. The C99 interpretation of inline is similar to C++, but once again not exactly. For example:

    // header.h
    // int f(int x);
    inline int f(int x) { return x + 1; }
    // a.h
    int a(int x);
    // a.c
    #include "header.h"
    #include "a.h"
    int a(int x) { return f(x); }

    // b.h
    int b(int x);
    // b.c
    #include "header.h"
    #include "b.h"
    int b(int x) { return f(x); }

    // main.c
    #include "a.h"
    #include "b.h"
    int main(int argc, char **argv) {
      return a(argc) + b(argc);
    }

This code compiles perfectly fine as C++, but fails as C99: an inline definition there does not provide an external definition, so when the compiler decides not to inline the calls to f, the linker has no f to resolve them against. But when one declares f without inline (by uncommenting that line in header.h), every translation unit emits an external definition, and we now get 'multiple definition' errors.

mtdewcmu · on July 8, 2014

Thanks for that. I always hoped that static C functions would at least not be generated if they are never called. I can't see anything that would prevent that.

It sounds like you've confirmed my intuition about inline in C, and I find inline to be only marginally-useful at best. inline functions are syntactically-prettier than macros, but they lose the other major benefit of macros, which is increased flexibility about typing and being able to interact with syntax in ways that functions can't. I get the impression that inline probably didn't need to be included in the standard, or, at least, somehow they blew the opportunity to add something more useful.

C's situation still seems less complicated than C++'s. I can't grasp exactly what C++ 'inline' actually tells the compiler to do, based on your description. It sounds like 'inline' in C++ is just a smarter 'static'. Why can't those smarts be implanted into 'static'?

pbsd · on July 8, 2014

`inline` indicates to the compiler: "this function has external linkage, and no matter how many times it's defined it is to be defined only once in the final linked output". It's the same as if there was no inline, but when the linker finds multiple definitions of the same function it is allowed to ignore them instead of failing. It also serves as an inlining hint to the compiler in its free time.

Note that you don't necessarily have to type `inline` to have inline functions. Methods defined in the declaration of a class are implicitly inline; so are template functions (but not explicit specializations).
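
A small sketch of the three cases just mentioned, as they might appear in a header. The names `Counter` and `twice` are made up for illustration.

```cpp
#include <cassert>

struct Counter {
    int value = 0;
    // Defined inside the class body: implicitly inline, no keyword needed.
    int next() { return ++value; }
};

// A function template: every translation unit may instantiate it, and the
// linker merges the duplicate instantiations.
template <typename T>
T twice(T x) { return x + x; }

// An explicit specialization is an ordinary function, so in a header it
// needs `inline` spelled out to avoid 'multiple definition' link errors.
template <>
inline int twice<int>(int x) { return 2 * x; }
```

Dropping the `inline` on the specialization would still compile in a single file; the problem only shows up once two source files include the header and are linked together.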

The reason it's called `inline` instead of something else probably has something to do with the committee's aversion to new keywords, and commitment to backwards compatibility. Changing `static` would probably break a lot of code: think what would happen to static variables inside static functions.

mtdewcmu · on July 8, 2014

'static' is old. It must have meant something to Kernighan and Ritchie.

I see no connection to the word inline in the C++ meaning. In C, at least, inline means inline.

My guess is that the C++ inline got its meaning from the winding path of c++ history, and only makes sense in the context of that history.

geon · on July 12, 2014

I guess it is that in C++, methods implemented in the class declaration are implicitly "inline". It could be done to avoid the problem outlined above.

cbsmith · on July 9, 2014

Note that in C++, if you really just want definitions in multiple translation units, you should just use an anonymous namespace...
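
For comparison, a minimal sketch of the anonymous-namespace approach; the function names are hypothetical. Names inside an unnamed namespace have internal linkage, so this behaves like `static`: every translation unit that includes the definition gets its own private copy rather than one merged copy as with `inline`.

```cpp
#include <cassert>

namespace {
// Internal linkage: invisible to other translation units, so identical
// definitions elsewhere cannot clash at link time.
int clamp_to_byte(int x) { return x < 0 ? 0 : (x > 255 ? 255 : x); }
}  // namespace

// An ordinary externally visible function that uses the private helper.
int brighten(int pixel) { return clamp_to_byte(pixel + 40); }
```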

frankzinger · on July 8, 2014

No, not everything will be inlined, but there will still be much more inlining than there should be.

And whether or not the compiler inlines the code, applications will still have to be recompiled whenever there's a code change in the library.

mattgreenrocks · on July 8, 2014

It may seem unusual because you haven't seen it, but header-only libraries are perfectly valid for small, focused libraries.

frankzinger · on July 8, 2014

How do you justify mass recompilations for every minor version bump or bugfix to your users?

mattgreenrocks · on July 8, 2014

This is only a problem if the user structured their code horribly. The library handles HTTP requests, it should be on the edge of the architecture.

Side note: C++ is fantastic in this regard because it makes you suffer every time for excessive coupling. The compile/link times act as a recognizable metric that devs have an interest in minimizing, and the process of doing so produces better code. I love that it is ruthless in punishing poor design.

frankzinger · on July 9, 2014

> This is only a problem if the user structured their code horribly. The library handles HTTP requests, it should be on the edge of the architecture.

Even then you will be rebuilding and redeploying the edge of your architecture every time this library gets a minor version number bump.

I would choose not to have to do that, every time.

Iftheshoefits · on July 8, 2014

Just a couple of points.

1. It is an error to couple this library closely enough to the rest of a project's code to cause the condition you note to exist. This kind of library is best used in a small project or as part of the implementation of a user-defined abstraction interface (an abstraction specific to his project's use cases that would not make sense being included in the library code). A small project will compile quickly anyway, and the second kind of project will only need to be compiled if the abstraction's interface changes.

2. C++ compile times aren't that bad. I won't argue that it's not bad in large code bases: it indeed becomes atrocious when the codebase becomes large and spread out over a large number of compilation units (or when coupling is excessive).

frankzinger · on July 8, 2014

> 1. It is an error to couple this library closely enough to the rest of a project's code to cause the condition you note to exist.

Any compilation unit (source file/module) which calls a function in this library will have to be recompiled if any of the called functions change which means that your application will also have to be re-linked. There's just no way of getting around that.

plorkyeran · on July 8, 2014

C++ makes maintaining ABI compatibility quite difficult, so in practice libraries with a C++ interface tend to default to requiring that anyway.

frankzinger · on July 8, 2014

Yes, it's difficult by default, but the solution is https://en.wikipedia.org/wiki/Opaque_pointer and it is quite well-known in the C++ world.

amatheus · on July 8, 2014

Looks really nice! To appease the people who will complain about there being no example in the readme, I would suggest you just copy the example.cpp file into the readme; that'll do until you have more time.

davvid · on July 8, 2014

This looks nice. I would suggest avoiding macros in the final release; it should be possible to implement CROW_ROUTE() using template meta-programming instead of #define's.

ipkn · on July 8, 2014

I also want to remove CROW_ROUTE, but with the current c++ standard, it cannot be avoided.

To check at compile time whether a handler is valid for a given URL, the `url' (string literal) argument is required both at compile time and at run time. A const char* value is invalid as a template argument, and an argument of a non-constexpr function cannot be a constexpr value. Thus I used a macro to provide the `url' argument twice: once as a template argument, through a constexpr function, and once as an ordinary argument.
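
A rough sketch of that constraint, using made-up names (`DEMO_ROUTE`, `count_params`, `arity`, `Route`) rather than Crow's real implementation, and assuming that every '<' in the URL starts one placeholder. The macro spells the literal once but expands it twice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts '<' characters, i.e. placeholders like "<int>".
constexpr std::size_t count_params(const char* s) {
    return *s == '\0' ? 0 : (*s == '<' ? 1 : 0) + count_params(s + 1);
}

// Hypothetical trait: number of parameters of a non-generic lambda.
template <typename F>
struct arity : arity<decltype(&F::operator())> {};
template <typename C, typename R, typename... Args>
struct arity<R (C::*)(Args...) const> {
    static constexpr std::size_t value = sizeof...(Args);
};

template <std::size_t N>
struct Route {
    std::string url;  // run-time copy, used for actual matching
    static constexpr std::size_t params() { return N; }

    template <typename F>
    Route& operator()(F) {
        // Compile-time check: the handler must take one argument per placeholder.
        static_assert(arity<F>::value == N,
                      "handler arity does not match the URL placeholders");
        return *this;
    }
};

template <std::size_t N>
Route<N> make_route(const char* url) { return Route<N>{url}; }

// The literal appears once in user code but twice after expansion: inside a
// constexpr call that yields a template argument, and as a run-time argument.
#define DEMO_ROUTE(url) make_route<count_params(url)>(url)
```

With this, `DEMO_ROUTE("/add/<int>/<int>")([](int a, int b) { return a + b; })` compiles, while passing a one-argument lambda for the same URL trips the static_assert.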

pmr_ · on July 8, 2014

You can still do metaprogramming on single character constants and with a bunch of really ugly hackery make it somewhat pretty.
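
As a minimal illustration of the character-pack idea (all names here are invented for the example, and this is not how metaparse itself is used): the string is passed as individual `char` template arguments, which makes it usable in templates at the cost of readability.

```cpp
#include <cassert>
#include <cstddef>

// Counts '<' characters in a pack of char template arguments.
template <char... Cs>
struct count_params;

template <>
struct count_params<> {
    static constexpr std::size_t value = 0;
};

template <char C, char... Rest>
struct count_params<C, Rest...> {
    static constexpr std::size_t value =
        (C == '<' ? 1 : 0) + count_params<Rest...>::value;
};

// The ugly part: "/u/<int>" has to be spelled one character at a time.
using user_route = count_params<'/', 'u', '/', '<', 'i', 'n', 't', '>'>;
static_assert(user_route::value == 1, "one placeholder expected");
```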

You might be interested in metaparse [1] which can greatly simplify compile time parsing of strings but has a very steep learning curve.

[1]: http://abel.web.elte.hu/mpllibs/metaparse/

ipkn · on July 8, 2014

I already considered using a template with single character constants, and I thought the technique didn't have much benefit over the macro version. Maybe compile-time generation of the routing function could be possible with it (and would be faster), but I think it requires a HUGE amount of work. I will try it and benchmark it later.

mtdewcmu · on July 8, 2014

I'd hope that in the end, the efficiency of the routing matters more than templates vs macros. You will never finish anything if you pay too much attention to all the purists.

Useful strings at compile time is a desirable feature beyond c++, though. A perfect hash could be a nice solution, I thought, but I got around to trying out gperf, and it was much slower than I expected. Probably too slow to use in ordinary situations. I guess gperf is for when (runtime) performance is incredibly important.

Another possible approach to strings at compile time is something like flex, or re2c. I haven't tested them in this type of scenario. But, apparently Zed Shaw used ragel to parse http in Mongrel to excellent effect. My problem with ragel is its complicated syntax.

_glsb · on July 8, 2014

Took a look at the source code. Noticed

  crow::black_magic::is_equ_p(...)

I haven't yet tried the framework, but I already like you.

acron0 · on July 8, 2014

So, you are alive ;)

fritz_vd · on July 8, 2014

very cool

mempko · on July 8, 2014

Love the black magic!

ipkn · on July 8, 2014

the power of `constexpr'!

mike-cardwell · on July 8, 2014

The way your routes work means that a response is expected to be generated and returned immediately. It would be much nicer if a response object was passed to the callback, so that you could return immediately from the callback but send a response independently, when you are ready. Kind of like this:

  CROW_ROUTE(app, "/about")
    ([](Response res){
        res.send("About Crow example");
    });

Why you might ask? So you can do this:

  CROW_ROUTE(app, "/about")
    ([](Response res){
        responses.push_back(res);
    });

And then some independent method could come along and do the res.send() when it is ready. The connection would hang until res.send() or similar is called on it. There would also be methods on the Response object so you can see if the connection is still alive etc, and maybe the ability to set timeouts directly on the Response object.

[edit] This would allow people using your framework to implement long polling without locking up an io_service thread for each connection. It would also make it easier to add support for web sockets etc at a later date.
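
A minimal sketch of what such a handle could look like, under stated assumptions: `Response`, `on_about` and `flush_pending` are hypothetical names, the "connection" is just a callback, and there is no locking, so this shows only the shape of the idea and not a thread-safe implementation.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical copyable handle; copies share one underlying connection state.
class Response {
public:
    explicit Response(std::function<void(const std::string&)> writer)
        : state_(std::make_shared<State>(State{std::move(writer), false})) {}

    // Sends the body once; later calls on any copy are ignored.
    void send(const std::string& body) {
        if (state_->sent) return;
        state_->sent = true;
        state_->writer(body);
    }

    bool sent() const { return state_->sent; }

private:
    struct State {
        std::function<void(const std::string&)> writer;
        bool sent;
    };
    std::shared_ptr<State> state_;
};

std::vector<Response> pending;  // responses parked for later completion

// The handler returns immediately without answering.
void on_about(Response res) { pending.push_back(res); }

// Some independent piece of code answers everyone when the data is ready.
void flush_pending(const std::string& body) {
    for (auto& res : pending) res.send(body);
    pending.clear();
}
```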

[edit2] This is how NodeJS works. Both a request and a response object are passed to the callback, then you can do for example:

  function callback (req, res) {
      setTimeout(function(){
          res.writeHead(200, {'Content-Type': 'text/plain'});
          res.end('Hello World\n');
      }, 5000);
  }

ipkn · on July 8, 2014

I agree that allowing long polling to be implemented with crow is important; I just didn't know a good way to do that. Your suggestion is a big help.

I think supporting both ways is better, as long as there is enough explanation. I don't want to drop the simpler way of doing the same job.

  CROW_ROUTE(app, "/about")
    ([](){
        return "About Crow example";
    });

  CROW_ROUTE(app, "/about")
    ([](Response res){
        res.send("About Crow example");
    });
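
One way both styles could coexist is plain overloading on the handler's signature. This is only a sketch with invented `Route` and `Response` types, not Crow's API; it relies on `std::function`'s constructor accepting only callables that match its signature, so each lambda is viable for exactly one overload.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

struct Response {  // hypothetical stand-in for a real response object
    std::string body;
    void send(const std::string& text) { body = text; }
};

class Route {
public:
    // Simple style: the handler returns the body and we send it for the user.
    void handle(std::function<std::string()> h) {
        handler_ = [h](Response& res) { res.send(h()); };
    }

    // Explicit style: the handler receives the response and decides when to send.
    void handle(std::function<void(Response&)> h) { handler_ = std::move(h); }

    void dispatch(Response& res) const { handler_(res); }

private:
    std::function<void(Response&)> handler_;
};
```

Internally everything is normalized to the explicit style, so the simple form stays a thin convenience wrapper.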

shanemhansen · on July 9, 2014

^this is exactly how twisted implements their web resources.

mox1 · on July 8, 2014

If all of this is happening in its own thread (and there's probably 1 thread per connection), why add the overhead and complexity of something like this?

mike-cardwell · on July 8, 2014

I've looked at his code and that is not how it works. It uses boost::asio. You specify how many io_service threads to run. It is not 1 thread per connection. You could easily have 10000 connections spread across 4 threads for example. The threads that handle the connections are the same threads which run the callbacks. You wouldn't want threads being blocked by callbacks that take a long time to run. You'd want to pass those operations off to a different thread pool. So you would want to be able to do what I have suggested.
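
To illustrate the hand-off without pulling in boost::asio, here is a hand-rolled sketch of a single background worker built on `std::thread`. The `Worker` class is hypothetical; a real server would more likely use a pool of these or a second io_service.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    // Finishes all queued jobs, then stops the thread.
    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Called from a connection thread; returns immediately.
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shutting down and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // slow work runs here, outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
    std::thread thread_;  // declared last so the members above exist before it starts
};
```

A route callback would `post` the slow operation together with a response handle and return, leaving the connection thread free to serve other requests.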

Also, you said this adds complexity and overhead. I dispute that it adds complexity. For most people: "return stuff" vs "res.send(stuff)". And I wouldn't assume that it would add any overhead either. If you disagree, let me know when you've read the code and understand how boost::asio works.

SomeCallMeTim · on July 8, 2014

It's important to note that requiring 10000+ real threads would be a huge limitation, performance-wise. At that scale you end up with a lot of overhead from process switching.

The difference is a big reason why Apache 2.2 would slow to a crawl and eat up gigabytes of RAM at 100% processor utilization on the same load that Nginx could handle with 10Mb of RAM and 20% processor utilization. [1] (I understand more recent versions of Apache now support polling [2].)

[1] See, for example, this explanation: http://stackoverflow.com/questions/2583350/is-epoll-the-esse...

[2] https://httpd.apache.org/docs/2.4/mod/event.html

cbsmith · on July 9, 2014

Seems like you could do something much cleaner by simply using Boost.Coroutine or some such...

ivoras · on July 8, 2014

I don't code much in C++ but recently I started a project just for fun and found it much easier to make a FastCGI app than I anticipated. Yes, there is still some exceedingly low-level stuff to write by yourself, but overall, it's just fine.

For example, this is how I handle routing: http://goo.gl/G3jvGp

A simple static map of regexes and pointers to methods will do the task nicely, and the principle is extendible if I need to create or compose complex apps.
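
Since the linked code may not be at hand, here is a generic sketch of that approach using `std::regex` and plain function pointers. The routes and handler names are invented for illustration and are not taken from the linked project.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Handler = std::string (*)(const std::smatch&);

std::string index_page(const std::smatch&) { return "index"; }
std::string show_user(const std::smatch& m) { return "user " + m[1].str(); }

// Static table of (pattern, handler) pairs, built once on first use.
const std::vector<std::pair<std::regex, Handler>>& routes() {
    static const std::vector<std::pair<std::regex, Handler>> table = {
        {std::regex("^/$"), index_page},
        {std::regex("^/user/([0-9]+)$"), show_user},
    };
    return table;
}

// Tries each pattern in order and calls the first handler that matches.
std::string dispatch(const std::string& path) {
    std::smatch match;
    for (const auto& route : routes()) {
        if (std::regex_match(path, match, route.first)) {
            return route.second(match);
        }
    }
    return "404";
}
```

Lookup is linear in the number of routes, which is usually fine for a handful of patterns.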

There are still things which I would need to encapsulate to make it a rapid development framework (such as QUERY_STRING handling in http://goo.gl/l7Ff75) but it's surprisingly non-painful.

rch · on July 8, 2014

I always wanted to give okws (the c++ server that ran/runs okcupid) a try sometime. It seemed like a great design, if a little tough to get going, because of some dependencies. The license made it a tough sell at the company I was with when it came out too.

mappu · on July 8, 2014

Today I learned constexpr, enum class, and operator "". Thank you.

The full trie implementation caught my eye, are there no suitable alternatives in std or boost?

Mixed tabs and spaces (?) cause strange display in github in json.h from 555 through 670 or so.

Also at the start of json.h it seems a shame to repeat __builtin_expect rules, how about #if defined() || ?
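
The suggestion might look roughly like the following. This is a guess at the intent, with made-up macro names (`DEMO_LIKELY`, `DEMO_UNLIKELY`) rather than the ones in json.h: one combined compiler check instead of a repeated block per compiler.

```cpp
#include <cassert>

// One combined test instead of separate, identical blocks per compiler.
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_LIKELY(x) __builtin_expect(!!(x), 1)
#define DEMO_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
// Fallback for compilers without the builtin: no hint, same value.
#define DEMO_LIKELY(x) (!!(x))
#define DEMO_UNLIKELY(x) (!!(x))
#endif

// Example use: the error path is marked as the unlikely branch.
int parse_digit(char c) {
    if (DEMO_UNLIKELY(c < '0' || c > '9')) return -1;
    return c - '0';
}
```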

sdab · on July 8, 2014

huh, never seen operator "". Seems to be a c++11 feature.

[1] helped me understand it a bit, but I'm not sure I see a compelling use case. Does it just provide a shorthand for calling functions on primitive types (and strings), or have I misunderstood? While requiring fewer characters, might it not decrease readability?
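
For readers who have not seen the feature, a tiny sketch of two user-defined literals. The suffixes `_kib` and `_len` are invented for this example; user suffixes have to start with an underscore because the others are reserved for the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Integer form: 4_kib calls this with n == 4.
constexpr unsigned long long operator""_kib(unsigned long long n) {
    return n * 1024;
}

// String form: receives the characters and their count (without the '\0').
constexpr std::size_t operator""_len(const char*, std::size_t n) {
    return n;
}

static_assert(4_kib == 4096, "integer literal suffix");
static_assert("crow"_len == 4, "string literal suffix");
```

The typical argument for them is units and small typed constants, where `4_kib` reads closer to the intent than a bare `4096`.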

1. http://en.cppreference.com/w/cpp/language/user_literal

innover · on July 8, 2014

I like the technique of generating a compile error when there is a mismatch between the format string and the actual argument list and types. This is something only a statically typed language can do; dynamically typed languages like Python can't.

Maybe this technique could be applied in printf or other C++ APIs using format string.

unwind · on July 8, 2014

Not sure if you're being ironic, but you're describing e.g. GCC's "format" function attribute (https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#...), it makes the compiler verify the arguments for printf() and other format-string functions.
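
For reference, applying the attribute to your own function looks roughly like this. `format_message` and the `PRINTF_LIKE` macro are hypothetical names; the attribute itself is a GCC/Clang extension, so it is guarded for other compilers.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

#if defined(__GNUC__)
// Argument 1 is the format string; checking starts at argument 2.
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

std::string format_message(const char* fmt, ...) PRINTF_LIKE(1, 2);

std::string format_message(const char* fmt, ...) {
    char buffer[256];  // fixed size keeps the sketch short; output is truncated
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buffer, sizeof buffer, fmt, args);
    va_end(args);
    return buffer;
}
```

With warnings enabled (for example `-Wall`), a call such as `format_message("%d", "text")` then gets the same diagnostic a mismatched printf() call would.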

It's not new, it has been in GCC for quite a number of years (not sure how to check this quickly).

UPDATE: I found a list (https://ohse.de/uwe/articles/gcc-attributes.html#func-format) that says format was added in GCC 2.3-3.4 (whatever the range means). GCC 3.4.0 was released on April 18 2004. Now I'm sad I didn't say "ten years" above, as my original hunch was. :)

nly · on July 8, 2014

Specific to printf and scanf format codes though, not under the control of the programmer.

nly · on July 8, 2014

boost::format does away with format codes altogether... you just use '%' place holders. Using constexpr string literal evaluation would be useful for counting them at compile time though.
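
A sketch of how that counting could work with a recursive constexpr function. Everything here is invented for illustration (`count_placeholders`, `DEMO_CHECK_FORMAT`), it does not use Boost, and it assumes each bare '%' is one placeholder, with no handling of escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Counts '%' characters in a string literal at compile time.
constexpr std::size_t count_placeholders(const char* s) {
    return *s == '\0' ? 0 : (*s == '%' ? 1 : 0) + count_placeholders(s + 1);
}

// Fails to compile when the number of '%' markers differs from the number of
// arguments. The arguments only appear inside decltype, so they are never
// evaluated by the check itself.
#define DEMO_CHECK_FORMAT(fmt, ...)                                         \
    static_assert(                                                          \
        count_placeholders(fmt) ==                                          \
            std::tuple_size<decltype(std::make_tuple(__VA_ARGS__))>::value, \
        "number of '%' placeholders does not match number of arguments")
```

For example, `DEMO_CHECK_FORMAT("% and %", a, b);` compiles, while `DEMO_CHECK_FORMAT("% and %", a);` is rejected by the static_assert.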

boatzart · on July 9, 2014

The way the crow::black_magic::get_parameter_tag works is very impressive. I don't think I ever would have thought to do this with a recursive constexpr like that. I'm a huge fan of providing compiler errors whenever possible, so I'm glad to be able to add this trick to my toolbox.

przemoc · on July 8, 2014

C++ _micro_framework using boost. Just couldn't resist pointing that out. ;) (I know that boost consists of many libraries, many of them header-only, etc., so using them doesn't necessarily result in terrible bloat.)

Will have a proper look at it later.

ascotan · on July 8, 2014

I noticed that too. For C++03 you might need it for scoped_ptr/shared_ptr, but not if you're going to require C++11. If you're going down the -lpthread route you should probably just use them rather than forcing the include on boost threads.

oscargrouch · on July 8, 2014

I'm not coding much C++ lately, but I think if someone just picked some basic parts of chromium (base, ipc, net, etc.) and created a redistributable library from them, it would be a very sane and powerful competitor to boost when you are coding in C++03.

ddevault · on July 8, 2014

Is this inviting remote code execution exploits on your websites? One of the reasons most people don't write C/C++ for websites is safety. It's a lot harder to introduce entire categories of bugs when you use safer languages.

neuromancer2701 · on July 8, 2014

It would be interesting to see how this would work with Cheerp (formerly duetto). Building a framework on top of that project is an idea I have been contemplating.

has_bin · on July 8, 2014

Much better!

daftshady · on July 8, 2014

It looks like a very interesting project.

WoodenChair · on July 8, 2014

This is going to sound overly critical and judgmental to many - but to me, I just don't understand how you can spend the time involved in developing an entire web framework, and don't have the time to write a couple paragraphs for a README on how to use it and what it's all about along with some examples. I'm not talking full documentation - I'm just talking a few sentences and an example or two. Why bother releasing it before then? Why let people unnecessarily struggle to use it? Maybe somebody posted it here before the author was ready (assuming they're different people).

jeremyjh · on July 8, 2014

Well the author did not submit this himself; I just stumbled on it and thought it looked cool. I think example.cpp gives you a pretty good idea for what it is about, but in retrospect I should have let him post it when he was ready.

thezilch · on July 8, 2014

Different authors -- post and repo -- it would appear. But between the example.cpp and unittest.cpp, there is quite a bit there; if you're not familiar with Flask, it is a very simple web "framework" that basically just provides request, routing, and response primitives. This library also includes a JSON lib, but Flask gets that from Python's batteries.

Also, Flask was created as a joke; it obviously took off; but there's something there about not taking it super super serious.

adamnemecek · on July 8, 2014

See example.cpp. Also the codebase is very small so not much documentation is necessary. Also I'm pretty sure it's more of an exercise than a framework that's intended to be actually used.

has_bin · on July 8, 2014

And that line you wrote would be a perfect example of what to write.

I agree, I find this annoying.

octo_t · on July 8, 2014

you're right, you do sound overly critical and judgmental.