Early adopter signup gives access to the source, which is licensed under GPLv3. However, there is a binding Beta agreement which states that I may not redistribute the code/application.
The VoltDB software is being published as open source software under the Gnu Public License V3. In other words, as Early Release program members, you now have access to the source files for VoltDB and the development tools for browsing the source code and the current issues lists.
Please note, however, that the Beta agreement is still in effect until VoltDB is officially available. So we ask that you not redistribute the source or binaries beyond your own use at this time.
All other non-permissive additional terms are considered “further restrictions” within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
Somewhat ironically, the reason for this is that the FSF holds the copyright to the GPL, so only they can change it. Well, you could probably change it, but you couldn't call it the GPL anymore... and it would probably be copyright infringement.
I wonder what would happen if the GPL itself was licensed under the GPL... I suspect it would start raining turtles. :)
From that FAQ: "If someone asks you to sign an NDA for receiving GPL-covered software copyrighted by the FSF, please inform us immediately by writing to license-violation@fsf.org."
Also from that FAQ: "If the violation involves GPL-covered code that has some other copyright holder, please inform that copyright holder, just as you would for any other kind of violation of the GPL."
Right.... So it's probably perfectly binding. If they hold the copyright outright, they can create such conundrums like conflicting licenses.
If it incorporates the work of others though, it's tricky.
Looks to me like it's carefully worded to be a non-binding NDA--they "ask" that you not redistribute the source, but they don't provide the code solely upon receipt of that promise, or threaten any penalties for it.
As far as I know, it's both legal and meaningless to ask for anything you like.
Legally, probably not. They aren't demanding that you not redistribute, just asking. And asking politely. They are banking on the good nature of people. Good for them! And us.
Exactly. The Condor Project did this for many years and, miraculously, every one of the hundreds of users to whom we gave the source respected the request and the source never leaked. (It is now fully open.)
I can't speak for them, but in our case we had major users who required open-source (mostly European govt. projects).
All our developers wanted to just post the source online and be done with it, but our boss had security concerns he wanted us to address first -- so in the meantime we gave people who needed it GPL'ed copies, and asked them politely not to redistribute until we were ready.
Probably not, but if people start disrespecting their wishes, and the code is all their own (that is, does not include other parties' GPL'd code), then they can simply change the license and get the same effect.
I don't understand how they could avoid locks. They say that each replica runs independently and hence that avoids locking; what about write updates to the same object from different clients? That would need locking of some form at some level.
They should also explicitly discuss the failure models for which they're guaranteeing Durability. If you're scaling to "web-scale" and running this db in a data-centre, then a single failure could wipe out a rack of machines. Coordinated failures are not uncommon in data-centres. What about those?
You avoid locks by serializing transactions at any site. Since you're not waiting on disk (in memory DB) and each partition runs on its own block of memory and has its own cpu/thread, you simply don't let two transactions on the same partition run concurrently.
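To make that concrete, here is a minimal sketch of the one-thread-per-partition idea. It is not VoltDB's code; `Partition`, `increment` and `run_demo` are made-up names for illustration. Each partition owns its data and a single worker thread that drains a queue of transactions, so the data itself never needs a lock. The only synchronized piece is the queue hand-off.

```python
import queue
import threading


class Partition:
    """One partition: private data plus a single worker thread (hypothetical)."""

    def __init__(self):
        self.data = {}
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Only this thread ever touches self.data, so transactions on this
        # partition run strictly one after another and never interleave.
        while True:
            txn = self._inbox.get()
            if txn is None:  # shutdown marker
                return
            txn(self.data)

    def submit(self, txn):
        """Queue a transaction (a callable taking the partition's data)."""
        self._inbox.put(txn)

    def close(self):
        """Finish everything already queued, then stop the worker."""
        self._inbox.put(None)
        self._worker.join()


def increment(key):
    """Build a read-modify-write transaction for one key."""
    def txn(data):
        data[key] = data.get(key, 0) + 1
    return txn


def run_demo(num_clients=8, per_client=500):
    partition = Partition()

    def client():
        for _ in range(per_client):
            partition.submit(increment("counter"))

    clients = [threading.Thread(target=client) for _ in range(num_clients)]
    for c in clients:
        c.start()
    for c in clients:
        c.join()
    partition.close()
    return partition.data["counter"]
```

Many clients hammer the same key, but since every read-modify-write runs on the partition's own thread, no update is lost. The cost is that one slow transaction stalls everything queued behind it on that partition.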
In that case there will need to be some kind of locking, which is probably the reason the whitepaper recommends avoiding transactions that span multiple servers: "For multi-partition transactions, one engine distributes and coordinates work plans for the other engines. VoltDB assumes that an application designer can construct a partitioning/cloning scheme and a transaction design that makes a large majority of the transactions local to a single virtual node. "
I had an idea for handling this situation that exploited the replication functionality. Basically, you prepare for a transaction by migrating the primary copy of all the objects involved to the same node. One would have to avoid deadlocks by sorting and serializing transaction-prep migration blocks. So transactions never span multiple servers, but you get this by possibly dramatically slowing down the time it takes to prepare for such transactions. (With the idea that the application designer is encouraged to avoid such transactions.)
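A rough sketch of that prep step, under some loud assumptions: a single coordinator handles prep blocks one at a time, the "migration" is just a dictionary update standing in for real data movement, and the target node is picked by an arbitrary rule (the home of the smallest key). `Cluster` and `prepare` are hypothetical names, not anything from VoltDB.

```python
class Cluster:
    """Tracks which node holds the primary copy of each key (hypothetical)."""

    def __init__(self, placement):
        # placement: key -> id of the node currently holding the primary copy
        self.owner = dict(placement)
        self.migrations = 0

    def prepare(self, keys):
        """Co-locate the primaries of `keys` and return the node to run on."""
        # Every prep block walks its keys in the same global order, so two
        # blocks can never each hold a key the other is waiting for.
        ordered = sorted(set(keys))
        target = self.owner[ordered[0]]  # arbitrary rule, assumed for the sketch
        for key in ordered:
            if self.owner[key] != target:
                self.owner[key] = target  # stands in for a real data migration
                self.migrations += 1
        return target
```

The `migrations` counter is where the slowdown shows up: a transaction whose objects already live together costs nothing to prepare, while one that spans many nodes pays for every object it has to move.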
If you do this for every transaction, you'll have more overhead than you'd have if you had just performed distributed consensus. But in principle you're right. It makes sense to put all the primary copies that are often used together on the same node. The problem is that it's hard to decide the ideal placement strategy. VoltDB puts this responsibility into the hands of the developers.
The point is to a) not have to implement distributed consensus and b) require the user to design the database for localized transactions if performance is desired.
"Conventional databases experience disk and user stalls within transactions. Rather than let the CPU be idle during the stalls, those DBMSs interleave SQL execution from multiple transactions during the waits so the CPU is always busy. This is what requires much of the complex latching and locking overhead.
VoltDB doesn’t experience user stalls (since transactions happen within stored procedures) or disk stalls (because VoltDB processes data in main memory). Therefore, it is able to eliminate the overhead associated with multi-threading (latching) and locking. Each VoltDB execution engine is single-threaded and contains a queue of transaction requests, which it executes sequentially—and exclusively—against its data. Elimination of stalls and the need for locking and latching overhead allows typical VoltDB SQL operations to complete in microseconds.
For single-partition transactions, each VoltDB engine operates autonomously. For multi-partition transactions, one engine distributes and coordinates work plans for the other engines. VoltDB assumes that an application designer can construct a partitioning/cloning scheme and a transaction design that makes a large majority of the transactions local to a single virtual node. Many common applications such as order-fulfillment, software as a service (SaaS), Web 2.0 and trading systems have this property."
That still doesn't completely answer the question of how it avoids race conditions and the like without locking. Assume I'm executing multiple long-running transactions, each hitting and modifying a lot (millions) of different objects. How are race conditions avoided in this case? One engine that distributes and coordinates work plans for the other engines sounds nice, but does this essentially mean that only one of these transactions is able to run at a time, even when they could be running concurrently when using locking?
From the FAQ: "Durability: VoltDB provides both replication of partitions (known as K-safety) and periodic database snapshots to ensure the availability of the data."
I expect there is a persistent store of some sort, but you'll have to register to find out. I'm also unable to find a license without registering.
It does appear from the FAQ that you build your transactions as Java stored procedures. Constraining a generic SQL database that way will make it much easier to distribute and scale.
Their FAQ states:
Durability: VoltDB provides both replication of partitions (known as K-safety) and periodic database snapshots to ensure the availability of the data.
I suppose it depends on your definition of "D". Seems like this might be acceptable if you had different replicas on different power sources, etc.
In-memory DBs can take a lot of excellent shortcuts but I have to wonder if their usability is limited. After all, I'd think most interesting data sets end up quickly eclipsing the available RAM in most off the shelf/affordable servers.
Keep in mind that VoltDB targets the OLTP market, where datasets tend to be not that large. If you build a small cluster of 50 nodes with 256GB RAM each, you have about 4.3TB of storage (with a replication factor of 3).
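For anyone who wants to plug in their own numbers, the arithmetic behind that estimate looks roughly like this. It assumes decimal terabytes and that every byte of RAM holds data, which ignores indexes, the OS and any other overhead. `usable_capacity_tb` is just an illustrative helper.

```python
def usable_capacity_tb(nodes, ram_gb_per_node, copies):
    """Rough usable capacity in TB when every row is stored `copies` times."""
    total_gb = nodes * ram_gb_per_node  # raw RAM across the whole cluster
    return total_gb / copies / 1000     # divide by copies, then GB -> TB


# 50 nodes * 256 GB = 12,800 GB raw; stored three times over, that leaves
# about 4,267 GB of distinct data.
print(round(usable_capacity_tb(50, 256, 3), 1))  # 4.3
```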
The VoltDB software is being published as open source software under the Gnu Public License V3. In other words, as Early Release program members, you now have access to the source files for VoltDB and the development tools for browsing the source code and the current issues lists.
Please note, however, that the Beta agreement is still in effect until VoltDB is officially available. So we ask that you not redistribute the source or binaries beyond your own use at this time.
Is this legally enforceable?