I was just talking to someone about scaling, so I'm reusing what I said:
Scaling out technically and scaling out socially seem somewhat related. I want to scale out like this: a public search server (node) knows about other public nodes and the semantic topics their data covers. When a node cannot sufficiently answer a query, it can reach out to other nodes by looking them up in a map from topic to list of nodes. Sharding by table/collection can be solved the same way. That way, people who own public nodes can write queries that span tables they don't even host. They can build analytics using _their_ data _and_ the world's data. That's super-powerful.
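To make the lookup concrete, here is a minimal sketch of the topic map idea. Everything in it is an assumption for illustration: the `Node` class, the `topic_map` attribute, the `min_hits` threshold standing in for "sufficiently answer", and plain in-process method calls standing in for network requests between nodes.

```python
class Node:
    """Hypothetical search node: holds local data plus a topic -> peers map."""

    def __init__(self, name, documents):
        self.name = name
        # Local data, keyed by topic (a topic could equally be a table/collection).
        self.documents = documents
        # Map of topic -> list of other nodes known to carry that topic.
        self.topic_map = {}

    def search_local(self, topic, term):
        """Case-insensitive substring match over this node's own data."""
        docs = self.documents.get(topic, [])
        return [d for d in docs if term.lower() in d.lower()]

    def search(self, topic, term, min_hits=1):
        """Answer locally if possible, otherwise ask peers listed for the topic."""
        hits = self.search_local(topic, term)
        if len(hits) >= min_hits:
            return hits
        for peer in self.topic_map.get(topic, []):
            if peer is self:
                continue
            # In a real deployment this would be a remote call, not a method call.
            hits.extend(peer.search_local(topic, term))
            if len(hits) >= min_hits:
                break
        return hits


if __name__ == "__main__":
    a = Node("a", {"cooking": ["Pasta recipes"]})
    b = Node("b", {"astronomy": ["Mars rover images", "Mars orbit data"]})
    a.topic_map = {"astronomy": [b]}
    # Node a hosts no astronomy data, so the query is forwarded to node b.
    print(a.search("astronomy", "mars"))
```

In this sketch a node only asks peers one hop away; how the topic map itself gets populated and kept fresh is left open.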
As always, the question is how it scales.