It sounds to me like the scarce resource was found to be RAM, not CPU. If an "instance" uses a big piece of RAM to serve a request, and doesn't give it up while waiting for a backend, then scalability is RAM-bound, not CPU-bound, and they should probably charge that way.