Introduction
From our initial list of N candidates, we narrowed the field down to three:
- LightCloud
- Cassandra
- MongoDB
Please see the page on RDBMS Alternatives for a discussion of the other kv store candidates and why they were eliminated.
The Candidates
Light Cloud
Since Light Cloud is based on Tokyo Tyrant, it gets initially high marks -- prior to testing -- for scalability, performance, and concurrency. Likewise, we have reason to believe it will be strong on high availability. It also gets high marks for Python support since Light Cloud itself is written in Python. At this point, we're not sure how easy it will be to use nor what its back-up capabilities are. Although it is used by the micro-blogging site Plurk.com, we don't have the impression that it has a large, active user community behind it.
We see it as having the following advantages :
- Fact that it's built on top of Tokyo Tyrant is a plus
- It offers scripting in Lua
- It appears to offer the benefits of Tokyo Tyrant without needing to implement on our own hashing layer
and risks:
- Plurk is the only company using it
- It's been suggested that TT chokes at about 70GB of data, so we may want to test that.
Cassandra
Since Cassandra is a fully distributed database, we are optimistic about it scalability potential and its high availability. We don't yet know its performance characteristics nor its ability to handle many concurrent reads and writes. We know that Python bindings exist, but it appears harder to use than the other two candidates. It has an active user community.
Additionally, it has the advantage of being by far the most "battle tested" of any of the candidates with the following risks:
- Doesn't seem as easy to use
- Performance and concurrency need to be tested
- Still under heavy development, described as "not polished"
Mongo DB
Mongo is said to have good performance, but we have some concerns about scalability given that sharding is apparently an alpha feature. We're not sure of Mongo's support for high availability nor its ability to handle high concurrency. Python support appears strong and it appears easy to use. We're unsure of its back-up capabilities. The user community appears active and enthusiastic; there are not a lot of big production developments.
We see the following advantages:
- More features than Light Cloud
- Probably easier to use than Cassandra
- Lots of enthusiasm around Mongo suggests that it will continue to be used
- Very nice documentation
and risks:
- No huge production deployments
- Sharding is alpha
- Performance needs to be validated
Testing the Candidates
In order to fill out our understanding of each of the above candidates, we devised a series of tests to try to put them through their paces. The testing we did was designed to be as thorough as possible while being done in a limited amount of time. The tests below were conducted by my colleague, Matt Ernst, and the following write-up is based on the results he compiled.
Basic Installation/Read-Write Test
The first test was just to see how easy it was to get the candidate up and running and working with Python to do basic reads and writes. To this end, we did the following:
- Download the candidate software and Python bindings
- Install the software and bindings
- Get basic reads and writes working
Cassandra
- It was very simple to download a copy of Cassandra 0.4.2 and start it with the default configuration file.
- It took some effort to get the Lazyboy client working. There are several dependencies that need to be satisfied first (this was easier under Ubuntu than under a vm running CentOs 5).
- Adding or changing keyspaces and column families requires restarting the server after altering a configuration file. It was not clear how or if we could add a new column family to meet new data needs without service downtime.
Lightcloud
- It was straight forward to get LightCloud working, though a number of separate source packages had to be downloaded, compiled, and installed first.
- Setting up the client was easy.
- High-availability is available by default.
MongoDB
- Mongo is available in a single prebuilt package.
- The client can be added with easy_install.
Basic Performance Test
Next we did a basic performance test defined as follows:
- Write a Python script that generates performance numbers for the following (I would suggest using N ~ 100,000, but feel free to vary that somewhat as needed):
- Read/write N key/value pairs - keys are longs, values are integers
- Read/write N key/value pairs - keys are strings<10 chars, values are integers
- Read/write N key/value pairs - keys are longs, values are dicts (or dicts rendered as strings), values have byte-length ~ 100 bytes
- Read/write N key/value pairs - keys are longs, values are dicts (or dicts rendered as strings), values have byte-length ~ 10000 bytes
Basic Test Results
Read and write speeds seem to be in competition; no candidate won for both reads and writes. MongoDB won on combined read and write times.
Concurrency Test
After that we did a test of concurrency:
- Write a Python script that uses M (M < 100) clients to write to/read from the db.
- Perform the tests in the basic performance test using these clients (N can be smaller if need be)
Concurrency Results
All of the solutions had somewhat reduced performance with increasing numbers of concurrent clients, like due to the Python client global interpreter lock. If a storage backend was not limited by I/O or CPU with a single client connection it was not used any more fully by concurrent connections.
Failover Test
Next we tested failover:
- Research how each of the candidates handles failover.
- For each candidate, configure the software for failover.
- Using the basic performance test above, run the script and simulate the need to failover.
- Validate that failover works for each candidate.
Failover Results
Cassandra
Configuring replication on Cassandra was straight forward The Lazyboy client uses available Cassandra instances on a round-robin basis. It does not, however, gracefully handle failure if an instance is dead. Adding transparent failure recovery to the client would require some development effort, which we did not do.
Lightcloud
Lightcloud comes configured for high-availability by default. LightCloud's HA works fine when Tokyo Tyrant instances are killed; when instances are stopped instead of killed, performance becomes agonizingly slow.
MongoDB
MongoDB requires some simple effort to set up replication. The tests worked fine when one replica or the other was killed. When a replica was stopped instead of killed, performance was terrible as with Lightcloud. This apparently means it is better for a server or software instance to crash completely than to remain alive but unresponsive.
Final Recommendations
After my colleague, Matt Ernst, finished testing the three candidates, he made the following recommendations:
MongoDB
Mongo is the standout among the 3 options tried in depth. Its speed is very good, especially on writes. The community is active, software releases come at a quick pace (a sign of health if not exactly convenient for maintenance), and the documentation is better than that of the other options. MongoDB offers a structured approach to storage that is much more attractive than a plain persistent key/value store. I think it will be a good fit for many kinds of data that Loomia uses.
Cassandra
Cassandra seems like it might be very powerful, scalable, and resilient. Improved documentation and Python clients would make it much more attractive. It appears to be designed with the most thought toward distributed operation from the beginning. The enormous scale of Facebook's use speaks to the soundness of the design. If I could wait another year for features and clients to be polished, and for someone to write an O'Reilly book about it, it might be the best choice. Right now it has too many rough edges, not enough cohesive documentation.
LightCloud
LightCloud seems to be nice enough software, but it doesn't have enough standout points to triumph over MongoDB and Cassandra. It is even lighter on documentation than Cassandra and iterating over data without knowing keys in advance (important for testing and debugging) is very slow. There is no command line shell available. If it had been available a few years ago it might have been a nice alternative to memcache so we could implement a "bottomless cache," but for a broad SQL alternative it seems slight.
The Winner
We thus opted for MongoDB. We have not yet deployed Mongo in production, but are hoping to do so sometime in the next month.
