LightCloud benchmarks

I am developing a distributed hash table. The promises so far are the following:
  • The ring is created using consistent hashing (something that both Amazon Dynamo and Chord use). Finding a node for a key on the ring is O(log(n)) (binary search is used).
  • All nodes in the ring are replicated using master-master replication. This ensures high availability. Replication across data centers is also easily supported.
  • Every node can store 1 million records in 0.7 seconds for a hash database and 1.6 seconds for a B+ tree database [source].
  • Every node's database size can be up to 8EB (9.22e18 bytes) [source].
  • With a hash database, get, set and delete are O(1) - constant time! With a tree database they are O(log(n)).
  • Using only 20MB of RAM a node can easily handle 10 million records [source].
  • One can dynamically add nodes to the system to scale it upwards.
  • Nodes are expected to fail.

I.e. it's a pretty amazing deal :-)
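To make the first promise concrete, here is a minimal sketch of a consistent hashing ring with a binary-search lookup. It is not LightCloud's actual code: the class name HashRing, the MD5 hash, the node names and the number of virtual points per node (replicas=100) are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Illustrative consistent-hash ring (hypothetical, not LightCloud's code)."""

    def __init__(self, nodes, replicas=100):
        # Assumption: each node gets `replicas` virtual points on the ring
        # so that keys spread more evenly across nodes.
        self.replicas = replicas
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Any stable hash would do; MD5 is an arbitrary choice here.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}:{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key):
        if not self._points:
            raise ValueError("the ring has no nodes")
        # Binary search for the first point clockwise from the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_right(self._points, self._hash(key))
        if idx == len(self._points):
            idx = 0
        return self._owner[self._points[idx]]


# Usage: adding a node only moves the keys that the new node takes over.
ring = HashRing(["storage_A", "storage_B", "storage_C"])
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("storage_D")
moved = [k for k in keys if ring.get_node(k) != before[k]]
```

The point of the ring is visible in the usage lines: when storage_D joins, every key that changes owner moves to storage_D, and all other keys stay where they were.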

LightCloud vs. memcached

I have benchmarked LightCloud against memcached. It is not an entirely fair comparison, as memcached works only in memory (which should be much faster than working with disks!). Some info about the LightCloud setup:

  • There are 2 lookup nodes (2 master-master pairs - i.e. 4 nodes total)
  • There are 6 storage nodes (6 master-master pairs - i.e. 12 nodes total)

Set benchmark (10,000 set operations on some random words):

LightCloud: Time it took to set 10000 records: 16.6252090931
Memcached: Time it took to set 10000 records: 4.98504710197

So LightCloud is about 3.3 times slower than Memcached on sets (this is expected, as one set in LightCloud does one get and two sets).
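Why a single set costs three operations can be sketched as follows. This is only my reading of "one get and two sets": it assumes the get and the first set go to a lookup node (which records where a key lives) and the second set goes to a storage node. The names CountingNode, pick and cloud_set are hypothetical, and the nodes are plain in-memory dicts rather than real databases.

```python
import hashlib


class CountingNode:
    """In-memory stand-in for a node that counts its operations."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.gets = 0
        self.sets = 0

    def get(self, key):
        self.gets += 1
        return self.data.get(key)

    def set(self, key, value):
        self.sets += 1
        self.data[key] = value


def pick(nodes, key):
    # Hypothetical placement: plain modulo hashing keeps the sketch short.
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return nodes[digest % len(nodes)]


def cloud_set(lookup_nodes, storage_nodes, key, value):
    lookup = pick(lookup_nodes, key)
    by_name = {node.name: node for node in storage_nodes}
    # Get: ask the lookup node whether the key already has a storage node.
    name = lookup.get(key)
    storage = by_name[name] if name is not None else pick(storage_nodes, key)
    # First set: record (or refresh) the key's location on the lookup node.
    lookup.set(key, storage.name)
    # Second set: write the value itself to the storage node.
    storage.set(key, value)
    return storage


lookups = [CountingNode(f"lookup{i}") for i in range(2)]
storages = [CountingNode(f"storage{i}") for i in range(6)]
target = cloud_set(lookups, storages, "hello", "world")
total_gets = sum(n.gets for n in lookups + storages)
total_sets = sum(n.sets for n in lookups + storages)
```

With three round trips per set against one for memcached, a slowdown of roughly three times is about what one would expect.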

Get benchmark (10,000 get operations on some random words):

LightCloud: Time it took to get 10000 records: 4.50872015953
Memcached: Time it took to get 10000 records: 2.83214092255

LightCloud is only about 1.6 times slower than Memcached when doing get operations. Pretty nice :-)
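For anyone who wants to run a similar comparison, a timing loop could look roughly like the sketch below. It is not the script used for the numbers above: time_ops is a hypothetical helper, a plain dict stands in for the client, and the only real data are the four timings copied from the results.

```python
import time


def time_ops(label, op, keys):
    """Run op(key) for every key and report the elapsed wall-clock time."""
    start = time.perf_counter()
    for key in keys:
        op(key)
    elapsed = time.perf_counter() - start
    print(f"Time it took to {label} {len(keys)} records: {elapsed}")
    return elapsed


# Stand-in client: a dict instead of a LightCloud or memcached connection.
store = {}
words = [f"word{i}" for i in range(10000)]
set_time = time_ops("set", lambda k: store.__setitem__(k, k), words)
get_time = time_ops("get", store.get, words)

# Slowdown factors computed from the timings reported in this post.
set_ratio = 16.6252090931 / 4.98504710197
get_ratio = 4.50872015953 / 2.83214092255
print(f"set: {set_ratio:.1f}x slower, get: {get_ratio:.1f}x slower")
```

Swapping the two lambdas for real client calls is all it takes to time an actual setup; the keys should be the same for both systems so that the comparison stays fair.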

Announcements · Benchmarks · Code · Python 28. Nov 2008
© Amir Salihefendic.