Hi all, I'm not sure how responsive I'll be able to be on this thread since I'm in orientation today. :)
But, here's the things I expect to be repeating a lot:
- Sandstorm is still an independent company, under control of Jade and myself.
- Sandstorm is "my baby" and I'm not going to stop working on it just because I have a day job. Yes, it will slow down a bit -- but on the bright side, we were previously spending the vast majority of our time trying to figure out enterprise sales (rather than actually building cool features), and we don't have to do that anymore. So, the difference may not be as big as you'd think. See: https://sandstorm.io/news/2017-02-06-sandstorm-returning-to-...
- Nevertheless I'm pretty excited about the work I'll be doing at Cloudflare as well -- which includes Cap'n Proto and other fascinating infrastructure tech.
I'm pretty stoked about future Cap'n Proto developments. In the scenarios where I've brought up Cap'n Proto, all but the most senior/prolific developers have a near impossible time ascertaining why one might want to use capnp over competing approaches (e.g. Protobuf). I think, despite a pretty decent description on the website and my own skills in explaining/teaching, such protocols are nearly impossible for the majority of developers to intuit, so they're all perceived as being nearly identical, save for financial backing and other forms of social proof.
With some of the rough corners filed smooth (e.g. Windows support), a bit more marketing and promotion, and eventually more big names advertising their use of capnp, my job getting buy-in from my peers would be much easier.
1) Cap'n Proto doesn't encode/decode messages, so it's much cheaper for processing and memory management
2) protobuf in the proto3 design doesn't carry default values. So if you have a bool field and want to explicitly send false, you have to change it to some other type or use the default values all the time
3) protobuf generates incredibly large serialization/deserialization support code for each template. For some languages like Python it can be in the hundreds of kilobytes. Cap'n Proto messages are significantly smaller
There is more for CnP, but Protobuf has much better support and is used by default in projects like gRPC. Also, new development on CnP is slower than on Protobuf.
But I'm using it in one of my side projects and I'm very happy with it
(2) is explicitly one of the reasons to use proto3 over proto2, though :) It's weird seeing it listed as a disadvantage instead of the other way around.
Depends on the use case. For instance, if you're mapping the data to a database, you can't send null variables. Either you store the defaults (a waste of space) or you send it as another type.
In our case we now have over 1T rows. Storing default variables would be incredibly inefficient and expensive.
I'm not understanding your case much. Proto3 does not serialize zeros/false, precisely because it'd be a waste without a benefit. After deserialization, the value is as equal to zero as if the zero had been written on the wire or on disk. Why would you want to store that?
Is nullability of primitive types what you're missing? As in, having a boolean value that can be true, false, or null. If you need that, defining your field as google.protobuf.BoolValue instead of bool gives you that (same with the other primitive types: https://github.com/google/protobuf/blob/master/src/google/pr... ). But making primitive types never null aligns proto3 with most programming languages, making the generated code more idiomatic and performant.
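To make the distinction concrete, here is a toy model in Python. It is not protobuf's actual wire format or API: the dict-based "wire" and the encode/decode helper names are invented for illustration. A plain field equal to its default is dropped on encode, so the reader cannot tell false from unset; a wrapper-style field records presence, which roughly mirrors what google.protobuf.BoolValue gives you.

```python
from typing import Optional

# Toy stand-in for a wire format: a dict of field name -> value.
# This is NOT protobuf's real encoding; it only models "defaults are not written".


def encode_plain(flag: bool) -> dict:
    """proto3-style scalar: the default (False) is simply not written."""
    return {"flag": True} if flag else {}


def decode_plain(wire: dict) -> bool:
    """A missing field reads back as the default, so False and 'unset' look identical."""
    return wire.get("flag", False)


def encode_wrapped(flag: Optional[bool]) -> dict:
    """Wrapper-style field: the wrapper's presence is recorded even when the value is False."""
    return {} if flag is None else {"flag": {"value": flag}}


def decode_wrapped(wire: dict) -> Optional[bool]:
    """Absent wrapper -> None (null); present wrapper -> its value."""
    wrapper = wire.get("flag")
    return None if wrapper is None else wrapper.get("value", False)


# Round-trip each possible value through the wrapper-style field.
for value in (True, False, None):
    print(value, "->", encode_wrapped(value), "->", decode_wrapped(encode_wrapped(value)))
```

With the plain field, nothing is stored for false, which is the space saving described above; the wrapper trades a little space for the third "null" state.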
I use protobuf-generated RTTI for the purpose of doing endpoint SQL storage of the messages. (tables, columns, foreign keys, insert/select statements are all auto-generated from the protobuf message) using C++.
Does capnproto provide similar RTTI information to walk through properties or recurse into other messages?
Absolutely. I was the one who originally designed the Protobuf RTTI (aka "reflection") interface, after all... :)
It's actually considerably easier to implement for Cap'n Proto since the in-memory objects are actually backed by byte buffers containing the wire format. No need to compute offsets of class members (which technically violates the C++ standard though it works on all compilers).
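A rough Python sketch of why buffer-backed objects make reflection simple: if a schema is just a list of (name, type, offset) entries, generic code can read any field straight out of the bytes and, for example, derive SQL statements from it. The FIELDS schema, its byte layout, and the log_event table name are all hypothetical; this is not Cap'n Proto's actual layout or reflection API.

```python
import sqlite3
import struct

# Hypothetical schema: (field name, struct format, byte offset, SQL column type).
FIELDS = [
    ("user_id", "<q", 0, "INTEGER"),   # 64-bit signed int at offset 0
    ("status", "<i", 8, "INTEGER"),    # 32-bit signed int at offset 8
    ("score", "<d", 16, "REAL"),       # 64-bit float at offset 16
]


def read_fields(buf: bytes) -> dict:
    """Walk the schema and load each field directly from the buffer."""
    return {name: struct.unpack_from(fmt, buf, off)[0] for name, fmt, off, _ in FIELDS}


def create_table_sql(table: str) -> str:
    """Derive a CREATE TABLE statement from the schema alone."""
    cols = ", ".join(f"{name} {sql_type}" for name, _, _, sql_type in FIELDS)
    return f"CREATE TABLE {table} ({cols})"


def insert_sql(table: str) -> str:
    """Derive a parameterized INSERT statement from the schema alone."""
    names = ", ".join(name for name, *_ in FIELDS)
    marks = ", ".join("?" for _ in FIELDS)
    return f"INSERT INTO {table} ({names}) VALUES ({marks})"


# Build a 24-byte message by hand: int64, int32, 4 padding bytes, float64.
message = struct.pack("<qi4xd", 42, 7, 1.5)

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("log_event"))
row = read_fields(message)
conn.execute(insert_sql("log_event"), [row[name] for name, *_ in FIELDS])
rows = conn.execute("SELECT user_id, status, score FROM log_event").fetchall()
print(rows)
```

Nothing here depends on generated classes or member offsets in a C++ object; the schema table and the raw bytes are enough.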
Well, you could look at it that way. But the "encoding" is the same encoding that an "int" field in a struct uses, and "decoding" it is a trivial load instruction.
The format is little-endian, so it's necessary to use a little-endian load instruction, which almost all architectures (including most big-endian architectures) have.
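A small Python illustration of that point: reading a field is a fixed-offset little-endian load, with no parse step that materializes the whole message first. The offset below is made up for the example and is not taken from a real Cap'n Proto schema.

```python
import struct
import sys

# Hypothetical layout: a 32-bit "count" field lives at byte offset 4 of an 8-byte message.
COUNT_OFFSET = 4
buf = bytearray(8)

# "Encoding": store the int in place, little-endian, like a field in a struct.
struct.pack_into("<i", buf, COUNT_OFFSET, 1234)

# "Decoding": a single little-endian load at a known offset.
count = struct.unpack_from("<i", buf, COUNT_OFFSET)[0]

# "<" forces little-endian, so the stored bytes do not depend on the host byte order.
raw = bytes(buf[COUNT_OFFSET:COUNT_OFFSET + 4])
print(count, raw.hex(), "host byte order:", sys.byteorder)
```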
Historically, at least, such independence usually turns out to be the opposite. There are very few examples where, post acquisition, such independence turned out to be the real deal.
But (as I interpret the blog post, at least) this isn't an acquisition. Cloudflare didn't buy Sandstorm, it just hired all the Sandstorm developers. Those developers still retain ownership of Sandstorm-the-legal-entity and any intellectual property it owns. It's not an acqui-hire, in other words, just a hire.
Which, if that's the case, means that Sandstorm still is legally independent of Cloudflare, and the original Sandstorm devs can pretty much continue to do whatever they want with it. There would be some gray areas, like what will happen to future code or other IP they create for Sandstorm while employed by Cloudflare, or while using Cloudflare systems/resources; and of course their continued employment with Cloudflare may be contingent on their doing things with Sandstorm that Cloudflare likes, or at least doesn't disapprove of too strongly. But if the Sandstorm devs are just selling their future labor and not the company itself, they'll have more independence than they'd have gotten in an acqui-hire scenario.
We've worked closely with Kenton on this. In fact, our lawyers wanted us to add language to ensure that it was clear that we don't own Kenton's IP around this.
As far as I can tell, Sandstorm did not get acquired or acquihired - Kenton, Jade, and others (?) as individuals all accepted job offers from Cloudflare.
It seems almost universally better? I can't think of any acquirer that would be more successful with Sandstorm than the founders themselves. Nobody I can think of is aligned with "users own their data" and big enough / cares enough to help Sandstorm gain adoption.
Even Sandstorm not gaining traction with their B2B strategy is likely a blessing in disguise: with enterprise money coming in they would have felt a hard pull to do more things that enterprises like. No matter how much conviction you have, making payroll often trumps that.
Now, they've proven that the enterprise route doesn't work. My hope is that the project can continue to improve and gain traction to the point where it sets up or joins a foundation, then potentially try to restart a complementary commercial venture on top of that.
> Now, they've proven that the enterprise route doesn't work.
Well, no... I don't think we proved it didn't work, only that we lacked the expertise to make it work. With more time and some good hiring I think we could have fixed the problem, but the investor money ran out, so...
Ok, that might have been an oversimplification on my part, maybe "isn't a natural fit" would have been better. I'm sure there's enough value there that you could have eventually made a business out of it.
Hey Kenton, you replied previously (in another thread) that you will keep running Oasis since it's your baby. So, let's say Oasis goes down right now, while you are in the middle of your orientation... does it mean that nobody will look into it until later tonight (and even then only if you have time)? What is the agreement with Cloudflare?
If I get paged I'll step out to deal with it. Other team members can also handle pages, though I'm primary on-call.
Note that Oasis is extremely stable. We haven't had an incident requiring human response in weeks -- and the only kind of issue we regularly had before that is now automated.
I guess I'm confused by the carefully chosen wording of this post. Did Cloudflare acquire Sandstorm? Was this an acqui-hire? Appreciate the clarity and I understand if there are elements that can't be shared publicly.
No, Cloudflare did not acquire Sandstorm, but four of the seven team members (including both founders) are joining as individuals. This was our preference -- I love Cloudflare but I'd also like Sandstorm to remain independent.
Sorry, I meant for that to be clear in the post but I guess I failed.
It's ok. That's why I asked. I've seen acquisition announcements worded the same way so I just wanted to make sure I wasn't misunderstanding. Thanks for the clarity!
If you're running a founder-team-only private company without any real assets, the company only really "exists" for as long as the founders can be considered to be its employees. So, if another company comes along and makes an offer to the people to hire them as individual employees—not a company, not even a gelled team, just individual offers—then the company-that-was just sort of ceases to exist, because it now has nobody working for it. I think that's more what happened here.
If the company sells off all its other assets, yes. But in this case, Sandstorm Development Group, Inc. owns copyright in the Sandstorm open source project and (I'd assume) continues to operate the hosted service.
This matters because Sandstorm Development Group, Inc. (and not Cloudflare!) has the ability to grant proprietary licenses to the Sandstorm codebase.
I suspect they intend to keep working on Sandstorm in their spare time (like most of us on here and our OSS side projects), but may have explicitly had their work contracts worded in such a way that that's allowable (or maybe Cloudflare is giving them x% of time to dedicate to it?)
I'm currently in the process of leaving my full time job to work on my OSS stuff full time again. It's a tough move because you do eventually run out of money if you can't get enough grants, donations or a following on Patreon. That's how I ended up back in a full time gig myself.
Congratulations, I'm really happy for you, Jade, and the rest of the team.
I am sad about the projected lower development activity in Sandstorm in the coming months. You had a tremendous community/brand built up, which is very hard to do for decentralized web technology initiatives -- several independent chat groups I'm in are lamenting the news today. If you're aware of any groups discussing further development in the decentralization space, please let me know (email's in profile, or I will check this comment every hour or so).
The community (including me) is continuing to organize on the sandstorm-dev mailing list (https://groups.google.com/group/sandstorm-dev) and #sandstorm on Freenode IRC. Definitely drop by and say hello!
I've often found that my motivation to work on something decreases if I'm being paid to do it. Side projects are often much more fun and energizing than your day job. Hope you feel the same!
Are you going to integrate any of the LANGSEC principles or stuff like the formally-verified parsers at INRIA into Cap'n Proto? Or have you already? I've been especially curious to see what a combination of your excellent scheme with verified parsing/protocol tech would be like.
Congrats! Since your LAN House is in Palo Alto do you take the Caltrain up there? What do you think of the commute? I myself am between jobs now and trying to settle on a place - currently located in SF but thinking of moving to Palo Alto.
Yes, Jade and I are taking Caltrain. CF is right next to the SF station so it works. I have yet to find out if spending 1.5 to 2 hours on the train every day will drive me crazy. :)
I started commuting on Caltrain (Mountain View <-> SF) 3 months ago. For me it's quite easy to read a book on the onward trip and work on the return trip. The only time I'm annoyed is when I board in the morning, people never stand in a line (like BART). They just approach the coach from all directions and pile inside. But it's empty enough that I can wait till everybody gets in, and I will still get a free seat/comfortable corner.
I did Santa Clara -> 22nd a couple years ago when I had a cofounder who lived in the city. I found I ended up writing a significant amount of code for that startup on the train. It actually worked out pretty well; I'd put the finishing touches on features on the way in, then discuss with my cofounder, figure out what we needed to do next, maybe talk to some potential users in the city, and write the code for new features on the way back. If you've got most of the reference docs for the frameworks you're using downloaded or memorized, or are working on stuff that's mostly thinking and not looking stuff up (I'd imagine capnp fits in that category), you can get a significant amount of work done.
That's a very good point. Working from a train often means a spotty internet connection, so it's hard to instantly look something up on Google. Good offline documentation helps. For older languages and IDEs like C, C++, and Visual Studio 2005, like 10 years ago, there were CHM help files for everything. Nowadays almost everything is available only as web help, if there is even official full documentation at all. Also, like 10 years ago, books came with a CHM file on CD/download. Nowadays books are already outdated when they get published; things like app and web development are moving very fast. So starting a new project on the train is not a good idea, and fixing bugs from error/warning messages isn't a good idea either, but doing some maintenance/refactoring work is well suited for boring train rides.
It's not absurd to me anymore, I "get" how CA, and the bay area especially, operate now :) I've heard good things about Verizon, and I personally use a Project Fi phone as a hotspot. It's decent enough for hackernews/reddit/google/stackoverflow, except for the tunnels right outside the city.
Unrelated - I just read your `Owner of a LAN-party optimized house` and it seemed quite interesting, have you made any updates since to the architecture or the hardware? The last blog post was in 2011!
A couple months ago I updated the graphics cards to GeForce 1060s (from 560s) and doubled the RAM to 16 GB, which seems sufficient to run modern games at high graphics settings. Otherwise, no changes.
Are you planning to have a pure-Python implementation at some point? I understand about the performance... but there are many use cases where a pure-Python implementation is far more practical.
I don't have specific plans regarding language support -- mostly that's been up to contributors so far, with me focusing on the C++ "reference" implementation. Hopefully, though, at CF we can find more resources. (The author of pycapnp, Jason Paryani, also joined CF with us, FWIW.)
Your blog post doesn't seem to discuss their response very much.
What leads you to believe that CloudFlare was impressively fast and transparent in this case? Especially since statements from Project Zero seem to imply that they were anything but.
CF disabled the problematic feature within hours, on a Friday evening. After that, figuring out what private data was stuck in search engine caches was obviously going to take some time. It seems clear enough that they were working as fast as they could. Tavis is awesome but I think he was being unfairly hard on them in the project zero thread.
(Note: All of the above is based on my external observations, as I was not yet an employee nor did I have any internal access at the time.)
Even someone as popular as Steve Yegge that probably could have told Google to GFY basically switched to PR speak the next day:
> Yegge wrote a mea culpa the next day and praised Google PR for not coming down on him. He took the post offline but let others keep their copies. And then he stopped ranting and presumably went back to work. If only some politicians could learn from his example. When you make a mistake, the more you talk about it, the longer the story lives.
I don't think anything he says has been a PR disaster and I think his POV is genuinely correct (that could be just because he is persuasive :P).
I think the truth is people want the white lie of PR speak, because when you stop and try to be honest, many people will try to use it against you later, so it's safer to use the PR filter that is a white lie than to be 100% transparent.
Sad to hear sandstorm.io didn't make it into a viable business for you. I can only imagine just how much energy went into it (and hopefully continues to go).
When time allows, I'd very much appreciate a reflection on the economic viability for innovative infrastructure software in general (hosted vs enterprise sales, also taking RethinkDB's demise and that of other "new stack" startups into account, etc.), and I'm sure others here will as well.
My thesis is that the 2016 elections have graduated this question into a pertinent political problem. The Facebooks and Twitters of 2020 will have to be built on something, and the current model of "VCs/existing tech companies/etc" controlling larger and larger portions of the stack is simply inoperable, long-term, for a free society.
Every VC-funded startup is premised implicitly on "full monopoly" as the endgame for exits that demand extremely high returns from investors. This worked really well from 2002 to 2016, and is still a really good model for a lot of software innovation, but we need to start seriously thinking about other funding models -- grants, not-for-profits, institutions like Harvard/MIT/Stanford, etc., to be involved in funding the next generation of web software technology.
I know this isn't a super popular opinion here :). I'm pretty confident in the reasoning, though. The solution to the sociopolitical problems Facebook unearths isn't "another Facebook", it's rigorous rethinking of the relationship between users, companies, data, and new software applications. Sandstorm was exactly the right next step, but unfortunately kentonv & co had to spend most of their time on enterprise sales, because this is the tail end of what the VC-funding-only ideology expects you to do if you want to enact widespread software innovations.
I still think we had a reasonable business model. The problem is, it was enterprise-targeted, and none of us actually knew how to sell to (or even talk to) enterprise customers. It turns out this is a skill that is not easy to learn. We really should have hired for it, rather than trying to figure it out from first principles. But by the time we realized that, it was too late -- enterprise sales cycles are long so we needed a lot of lead time.
With more investment I think we could have made it work, but it turns out that once you run into any kind of "trouble" as a startup, getting further investment becomes incredibly hard -- and we were pretty bad at the fundraising game to start with (it is, essentially, sales, after all).
I don't think we have any particular evidence that Sandstorm wasn't viable as a technology or a business -- we just personally did not have the right skillset for the business model we chose. :/
"The problem is, it was enterprise-targeted, and none of us actually knew how to sell to (or even talk to) enterprise customers. It turns out this is a skill that is not easy to learn. We really should have hired for it, rather than trying to figure it out from first principles."
That's how I predicted it would fail, too. You were at least able to figure out the problem. Many people will blame everything or everyone else. Enterprise sales is its own beast that's much different from how things operate in the Valley or smaller shops. That's only half the problem. The other half is you usually have to integrate with systems designed not to integrate with 3rd parties easily. Lock-in loving systems. The healthcare startups are being hit particularly hard by this.
I keep thinking there's potential to get a business going that specializes in the sales and/or enterprise features at a reasonable price. A public benefit or nonprofit company that forces the charges to be limited to something reasonable. They can help all these new companies bootstrap the enterprise-specific aspects of their goods for a price. What do you think?
The investment market for infrastructure software is difficult now. Some might put it another way, but I think it's challenging given the way that current market model choice moves back and forth between a preference for private or public offerings. Consider the evolution steps from MSPs to SaaS.
Of late, offerings that provide a hybrid approach to software seem to be emerging. Gravitational and Replicated are examples of toolsets for these types of models. Take logging as a service and the more traditional self-hosted log solution. What would work best for users would be one piece of software, which could run both on-premise and in the cloud on someone else's servers, or even someone else's service. On-prem, it might be managed by those who wrote it. Hosted, it might represent a segment of a larger market play for intelligent time series analytics and be sold as a monthly revenue rollup.
Deploying these types of models will be interesting, but it's totally doable using some type of lightweight federation across services. Sandstorm represented a portion of the infrastructure needed to do simple deployments that felt like a hosted service. A service providing cryptocurrency payments for their APIs might be another piece of the puzzle.
I continue thinking about these new markets as new technology enables them.
When you say hybrid, do you mean hybrid for the same customer (i.e. parts or instances of the service running at the same time on-premise and in the cloud) or for different customers (i.e. a particular customer will either run on-premise or in the cloud, but not both)?
Both. I'm not a big fan of the term hybrid given the past use of it to describe cloud services. I think of it more like managed SaaS services, where a customer might buy software licensing by the month and run it just on-prem, on-prem and hosted, or just hosted. I can also see a third option where there's an MSP like entity running code on your hosted infrastructure, but licensing software both on-prem (running on your stuff) and then bits running in the cloud as a solution to burst-based problems. AI services will definitely manifest this way.
Nice to see a soft landing for this worthwhile project.
EDIT: disregard the above; this was a careless misunderstanding of the title. TFA says "Sandstorm, for its part, remains an independent entity, and I won’t be working on it during my day job at Cloudflare. However, I will be working on Cap’n Proto."
The people behind Sandstorm are joining Cloudflare and will now have less time to spend on Sandstorm. That's sad, because I was excited in general behind the idea of Sandstorm, and I hope it continues to be worked on, even if it's just during the weekend.
The really big announcement came about six weeks ago, when Kenton announced that Sandstorm for Work was open sourced and no longer a product for sale.
IMHO this is all really good news, because people who were on the fence about Sandstorm before can feel more free to commit resources to developing it, without worrying that Sandstorm the company is going to come along and make decisions against their interest, or try to make their work proprietary, or leverage it for the company to make more profit at their expense. (Not that we thought those things would happen...)
They would certainly have less time to work on Sandstorm if they were not able to go right out and get full-time jobs, once it was determined Sandstorm the company was out of money! IMHO there is nothing but good news in these announcements.
If anyone is looking for alternatives: http://alternativeto.net/software/sandstorm-io/ . I think Cozy, Yunohost, Cloudron, arkOS are the closest alternatives (Bitnami is about 1-click images and DO/linode are about cloud servers). Would be great to hear about other alternatives...
From the alternatives link "Sandstorm makes it easy to run your own server" and from their website "Sandstorm is a self-hostable web productivity suite". With that in mind, the alternatives seem reasonable.
AFAIK Sandstorm was way out ahead of its competitors in this space; it was the only one HN ever deigned to pay attention to, at least.
Given that, and given that it's FOSS, then as long as the community is still interested in the project, the project should keep seeing development, regardless of the amount of time the original developers can put in. Even if that means a LibreOffice- or MariaDB-like fork.
I also think the alternatives link is a bit misleading because those projects aren't really direct alternatives. I have investigated a bit in this space (for my personal hosting):
* Cozy is simply for a personal server for email/contacts. This is more like ownCloud. It's not meant for hosting internet facing websites.
* Sandstorm is sort of trying to build a Google Docs style portal. People share these docs with each other. This is why it has a "frame" around every doc/app. It's also not meant for hosting internet facing websites (by this i mean, it is not optimized for hosting your personal public blog).
* Bitnami is 1-click images for opensource apps. I am guessing their main user base is people who want to install a stack quickly. It's a glorified docker pull or curl.
* Cloudron is about running apps on a server. This is automation (ssl, backup, install/update etc) of what you would be doing today, if you have to setup a server and install apps like rocket.chat, gitlab etc. Cloudron uses docker for the apps. Yunohost does the exact same thing without docker.
> It's also not meant for hosting internet facing websites (by this i mean, it is not optimized for hosting your personal public blog).
It's an Intranet portal, essentially—like the one you get from setting up Apple's Server.app. But it could certainly be used to host an app that builds an Internet-facing website.
For example, you could launch an authenticated-users-only Wordpress instance on Sandstorm, to allow your team to collaboratively create a website. Then you'd install the Simply Static plugin in that Wordpress instance, and use it to shove a static-slug copy of the site out to a public web-enabled S3 bucket or wherever else you like.
In fact, there's probably a good nascent architecture involving Sandstorm + Minio + nginx where you use Sandstorm on the intranet-side, as a multi-product Content Management System for a public-facing site.
We use it for serializing log events between our Edge colos and our data processing datacenters. We chose it because it is a zero copy serialization protocol, as opposed to protobuf which doesn't do zero copy.
We don't currently use the RPC framework specifically, though, only the serialization.
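For readers unfamiliar with the term, here is a loose Python sketch of what "zero copy" buys: fields are read out of the received buffer through views and fixed offsets instead of being copied into freshly allocated objects. The record layout (a 4-byte length and 8-byte timestamp followed by the payload) is invented for illustration; it is neither Cloudflare's log format nor Cap'n Proto's encoding.

```python
import struct

# Hypothetical record header: payload length (u32) and timestamp (u64), little-endian.
HEADER = struct.Struct("<IQ")


def make_record(timestamp: int, payload: bytes) -> bytes:
    """Build one record: header followed by the raw payload bytes."""
    return HEADER.pack(len(payload), timestamp) + payload


def iter_records(buf: memoryview):
    """Yield (timestamp, payload_view) pairs; each payload_view aliases buf, nothing is copied."""
    pos = 0
    while pos < len(buf):
        length, timestamp = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        yield timestamp, buf[start:start + length]
        pos = start + length


batch = bytearray(make_record(1, b"GET /") + make_record(2, b"POST /login"))
records = list(iter_records(memoryview(batch)))

# Mutating the underlying buffer is visible through the view, which shows nothing was copied.
batch[HEADER.size] = ord("P")
first_payload = bytes(records[0][1])
print(first_payload)
```

A decode-style design would instead allocate a new object per event and copy each payload into it before any field could be read.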
At present they (uh, we, I suppose!) use the serialization only, for a variety of uses including (publicly disclosed) the log analysis system. I have some ideas where the RPC would be useful, though.
I would much rather use "self hosted" apps like Gitlab, sandstorm, graylog, and the like than SAAS offerings. Your data security and upgrade cycles are in your own hands.
That's part of the reason people use SAAS offerings in the first place. They don't want to have to be responsible for data security and upgrade cycles. These things are hard and when even the professionals get it wrong, what hope does a layman have?
Sandstorm takes apps that normally would need someone to run a server, breaks them down into the smallest shareable unit (a "grain"), and lets you host them all centrally, on the Sandstorm server. Whether that's the Oasis service, your own machines, or something in between.
So instead of running a whole Gitlab server, you'd run a Sandstorm server that hosts Gitlab grains, each consisting of a single repository. Gitlab itself is no longer responsible for sharing and authentication, the Gitlab grain lets Sandstorm handle that. Apps need to be rewired to fit into this paradigm, but once they do, it feels like one coherent platform rather than a bunch of separate individually hosted apps that maybe you glue together somehow.
For Etherpad, instead of running an Etherpad server that hosts many notepads, you create an Etherpad grain that has just one shared pad. For a drawing app, the grain would be just one drawing. For a photo gallery app, maybe it would be just one gallery. (These are all real examples, these grains all exist and work as I'm describing.)
> Whether that's the Oasis service, your own machines, or something in between.
To clarify: does a "grain" just run in one place, or is each "grain" a cluster of containers that can be spread-scheduled between disparate regions (i.e. "the Oasis service" and "your own machines" at the same time)? And, if the latter, do the cluster nodes have awareness of their locations, and can take advantage of that?
I have honestly no idea, other than I know that the grains are not permanent resident processes. IOW when they are not in use, they spin down and do not actively consume resources other than disk.
I think the grain runs in one place, I think to use the container analog it would be one container, but as a point of clarification I believe that it is not using one of the container runtimes like docker or rkt that you will be familiar with, but uses its own sandboxing instead.
I have not tried running the newly opened up Oasis backend so I can't really speak toward how it works. My experience with a single server running Sandstorm has been good, painless automatic upgrades, I would expect from what I know about this team, that since it's their dogfood they made it taste good.