For the last few years, the world of NoSQL databases has been filled with exciting new projects, ambitious claims, and plenty of chest beating. The hypesters said the new NoSQL software packages offered tremendous performance gains by tossing away all of the structure and paranoid triple-checking that database creators had lovingly added over the years. Reliability? It's overrated, said the new programmers who didn't run serious business applications for Wall Street banks but trafficked in trivial, forgettable data about people's lives. Tabular structure? It's too hidebound and limiting. If we ignore these things, our databases will be free and insanely fast.
Alas, just as the summer of love ended and reality set in, the boundary-free experimentation with NoSQL databases is slowly being brought down to earth. Oracle, the developer of top-notch, bulletproof SQL databases, has arrived at the hippie fest with a solid, practical, and very Oracle-like NoSQL server. While the crazy dreamers can continue to craft NoSQL data stores, serious people will want to take a look at Oracle's version. It offers many of the features that make NoSQL fun but also the solid performance promises that tend to come from big, serious teams of engineers. NoSQL pioneers will want to tell themselves that imitation is the sincerest form of flattery.
The arrival of this product might be a surprise to NoSQL fans who have listened to old-school DBAs talk with pride about Oracle databases, but Oracle has been slowly moving down this path for some time. Five years ago, the company bought Sleepycat Software, the creators of the open source Berkeley DB, a tool with a long and rich tradition of flexible key-value storage for C and, lately, Java programmers. This same Berkeley DB technology is said to be at the core of Oracle NoSQL Database, although it seems to be a complete rewrite.
Oracle NoSQL: Practically ACID
The fun part of Oracle NoSQL is the key-value structure. You don't need to define a schema or lock yourself into a big tabular architecture. You just create keys and attach a bag of bits to them. You might link your key to a string or an image file or anything. The database accepts the bytes and doesn't think much about the contents.
Oracle breaks up the key into major and minor parts. You can think of the major part as the object pointer and the minor part as the fields in the record. So you might put a name and Social Security number into the major parts of the key and other strings like the street address and ZIP code into the minor parts. It's comparable to the way that some other NoSQL tools let you think of the value in the pair as being an object with multiple fields. Oracle just uses the term "minor key" for the names of the fields.
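To make the major/minor split concrete, here is a minimal in-memory sketch. It is not Oracle's API; the `ToyStore` class and its method names are hypothetical, and the only thing it borrows from the product is the idea that a major key names the record while minor keys name the fields, each pointing at an opaque bag of bytes.

```python
class ToyStore:
    """Hypothetical in-memory model: major key -> {minor key -> bytes}."""

    def __init__(self):
        self._records = {}

    def put(self, major, minor, value):
        # The store never inspects the value; it is just bytes.
        if not isinstance(value, bytes):
            raise TypeError("values are opaque bytes")
        self._records.setdefault(tuple(major), {})[tuple(minor)] = value

    def get(self, major, minor):
        return self._records.get(tuple(major), {}).get(tuple(minor))

    def fields(self, major):
        # All minor keys stored under one major key, i.e. one "record".
        return dict(self._records.get(tuple(major), {}))


store = ToyStore()
person = ("Smith", "123-45-6789")          # major key: name + SSN
store.put(person, ("address",), b"12 Elm St")
store.put(person, ("zip",), b"02139")
```

Reading `store.fields(person)` hands back both fields at once, which is the sense in which the major key behaves like an object pointer.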
The serious part of Oracle NoSQL is a practical approximation of ACID compliance, the standard that SQL databases like to offer. ACID stands for "Atomicity, Consistency, Isolation, Durability," and there's a robust debate about just what this translates to in excruciating detail. Most NoSQL systems promise a different acronym, BASE, which stands for "Basically Available, Soft State, Eventually Consistent." In other words, you'll probably get the right answer except when you don't.
There will be plenty of debate about whether Oracle NoSQL offers real ACID compliance. The promises aren't as all-encompassing as they are with SQL databases. You only get an ACID promise when you write data attached to the same major part of the key. For example, you could change the address and ZIP code of the same person and get an ACID guarantee because both parts are stored under the same major key. But you get no guarantee that changes to two separate people will remain consistent. In other words, a bank could use Oracle NoSQL to store personnel records, but not to safely transfer cash between accounts because there's no ACID guarantee that the money won't get lost along the way.
Oracle NoSQL is able to make this promise because it guarantees that one master machine will hold all of the minor keys associated with a major key. Attach any collection of fields to a major key defining a person, and all of this data will end up in the same node in the cluster. But the data from different major keys could end up on different machines, and Oracle NoSQL doesn't have a mechanism to ensure that the data will be written to both simultaneously.
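The same-major-key rule can be sketched as a batch that is applied all-or-nothing, but only when every write in it shares one major key. This is a toy model under stated assumptions, not Oracle's implementation: `apply_batch` and `MixedMajorKeyError` are hypothetical names, and a plain dictionary stands in for the single node that owns the record.

```python
class MixedMajorKeyError(ValueError):
    """Raised when a batch touches more than one major key."""


def apply_batch(records, ops):
    """Apply (major, minor, value) writes all-or-nothing.

    records maps major key -> {minor key -> bytes}.
    """
    majors = {major for major, _, _ in ops}
    if len(majors) != 1:
        # Different major keys may live on different machines,
        # so this toy refuses to promise atomicity across them.
        raise MixedMajorKeyError("a batch must share exactly one major key")
    (major,) = majors

    updated = dict(records.get(major, {}))   # work on a private copy
    for _, minor, value in ops:
        if not isinstance(value, bytes):
            raise TypeError("values must be bytes")
        updated[minor] = value
    records[major] = updated                 # one swap publishes every change


records = {}
apply_batch(records, [("p1", "address", b"12 Elm St"), ("p1", "zip", b"02139")])
```

A batch that moves money between `("acct-A", ...)` and `("acct-B", ...)` would be rejected here, which is the article's bank-transfer caveat in miniature.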
You can also add replication and sharding; Oracle calls the latter "partitioning." In essence, you arrange the nodes in a rectangle where the sharding occurs along one axis and the replication occurs along the other. If you want more reliability and faster reads, you add more machines along the replication axis. If you want less contention, you add more machines along the partitioning axis. Oracle NoSQL handles most of this configuration for you.
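The rectangle is easy to picture as a list of rows, one row per partition, with the replicas of that partition sitting side by side in the row. The sketch below assumes a simple hash of the major key picks the row; the review does not say how Oracle actually assigns keys to partitions, so `shard_for`, `nodes_for`, and the node names are all illustrative.

```python
import hashlib


def shard_for(major_key: str, num_shards: int) -> int:
    """Pick a partition from the major key alone (assumed hashing scheme)."""
    digest = hashlib.sha256(major_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def nodes_for(major_key: str, grid):
    """Return every replica that holds this major key.

    grid[i] is the list of replica nodes for partition i.
    """
    return grid[shard_for(major_key, len(grid))]


# Two partitions (rows), three replicas each (columns).
grid = [
    ["a1", "a2", "a3"],
    ["b1", "b2", "b3"],
]
replicas = nodes_for("Smith/123-45-6789", grid)
```

Because only the major key feeds the hash, every minor key under it lands in the same row. Adding a row spreads the keys over more partitions; adding a column gives each partition another copy.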
Again, this structure stores data with Oracle-grade seriousness. If you don't want the slacker-grade promise of eventual consistency offered by so many other NoSQL stores, Oracle NoSQL will deliver absolute consistency across all of the machines replicating a node. You'll pay for this in write performance, of course, but it's your choice.
This is more than a binary decision, by the way. You can tell Oracle NoSQL to sign off on the write after one, all, or a simple majority of the nodes have finished writing the data to disk. The documentation calls this feature a durability policy.
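The arithmetic behind the three choices is short enough to write down. In this sketch the policy labels "one", "majority", and "all" are taken from the description above rather than from Oracle's API, and `acks_required` and `write_committed` are hypothetical helpers.

```python
def acks_required(policy: str, replicas: int) -> int:
    """How many nodes must confirm the write before it is signed off."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    if policy == "one":
        return 1
    if policy == "majority":
        return replicas // 2 + 1    # strictly more than half
    if policy == "all":
        return replicas
    raise ValueError(f"unknown policy: {policy!r}")


def write_committed(policy: str, replicas: int, acks: int) -> bool:
    """True once enough nodes have confirmed the write."""
    return acks >= acks_required(policy, replicas)
```

With three replicas, a majority policy signs off after two confirmations, so one slow or dead node does not hold up the write. The "all" setting waits for the slowest node every time, which is where the write-performance cost comes from.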
Some of this flexibility is available to you, the programmer, if you have the time to worry about it. All of the key-value pairs come with a version number, which you can watch yourself if you want to play your own games with replication. This can be helpful if you're trying to goose performance when modifying records.
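One common way to use a per-record version is an optimistic, compare-and-set style update: read the value and its version, then write only if the version has not moved in the meantime. The sketch below is a toy under loud assumptions. It uses a plain integer counter as the version and a hypothetical `put_if_version` method; the review says only that each pair carries a version number, not what the version looks like or which calls expose it.

```python
class VersionedStore:
    """Hypothetical store where each key maps to (version, value)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns (version, value), or None if the key is absent.
        return self._data.get(key)

    def put_if_version(self, key, value, expected_version):
        """Write only if the stored version matches; 0 means 'not yet written'.

        Returns the new version on success, or None if someone else
        updated the record first.
        """
        current = self._data.get(key)
        current_version = current[0] if current else 0
        if current_version != expected_version:
            return None
        new_version = current_version + 1
        self._data[key] = (new_version, value)
        return new_version


store = VersionedStore()
v1 = store.put_if_version("smith/zip", b"02139", expected_version=0)
stale = store.put_if_version("smith/zip", b"10001", expected_version=0)
v2 = store.put_if_version("smith/zip", b"94105", expected_version=v1)
```

The second write loses because it was based on a version that no longer exists; the caller can re-read and retry instead of holding a lock while it thinks.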