LightCloud benchmarks

I am developing a distributed hash table. The promises so far are following:
  • The ring is created using consistent hashing (something that both Amazon Dynamo and Chord use). Finding a node for a key on the ring is O(log(n)) (binary search is used).
  • All nodes in the ring are replicated using master-master replication. This ensures high availability. Replication through data centers is also easily supported.
  • Every node can store 1 million records in 0.7 seconds for hash database and 1.6 seconds for B+ tree database [source].
  • Every node's database size can be up to 8EB (9.22e18 bytes) [source].
  • Using a hash database get, set, delete are O(1) - constant time! For a tree database they are O(log(n)).
  • Using only 20MB of RAM a node can easily handle 10 million records [source].
  • One can dynamically add nodes to the system to scale it upwards.
  • Nodes are expected to fail.

I.e. it's a pretty amazing deal :-)

LightCloud vs. memcached

I have benchmarked LightCloud vs. memcached - not that fair comparison as memcached only works with memory (which should be much faster than working with disks!) Some info about the LightCloud setup:

  • There are 2 lookup nodes (2 master-master pairs - i.e. 4 nodes total)
  • There are 6 storage nodes (6 master-master pairs - i.e. 12 nodes total)

Set benchmark (10.000 set operations on some random words):

LightCloud: Time it took to set 10000 records: 16.6252090931
Memcached: Time it took to set 10000 records: 4.98504710197

So LightCloud is about 3 times slower than Memcached (this is expected as a set does one get and two sets in LightCloud).

Get benchmark (10.000 get operations on some random words):

LightCloud: Time it took to get 10000 records: 4.50872015953
Memcached: Time it took to get 10000 records: 2.83214092255

LightCloud is only a little slower than Memcached when doing get operations. Pretty nice :-)

Announcements · Benchmarks · Code · Python 28. Nov 2008
6 comments so far

20MB won`t be enough for 10 million records,that means 2 byte per record and you have 16 byte overhead per record in Tokyo Cabinet,so if your record is for example 24 byte you can store only half a million records(as each record is 40 bytes).

Uriel:
Snippet from Tokyo Cabinet's specification page:

"""
... If it is made into a performance index, in order to handle a database containing one million of records, a bucket array with half a million of elements is needed. The size of each element is 4 bytes. That is, if 2M bytes of RAM is available, a database containing one million records can be handled.
"""

It should be noted that I haven't tested this claim.

i think he mean that each element is a pointer to the actual key-value pair,so the index is 2MB for 1 million records + the size of your records

Uriel:
When you are fetching data, then it's the index that is important. E.g. with 10 million records your index size will be 20 MB in memory, your data size can be 50 gigabytes - - but lookups will still be O(1).

Amir,

I'm looking for a DHT with that kind of performance.. Are you planning to open-source it? If it's be able to do what you say it will, I'm extremely interested :).

I hope you are doing well! Your blog looks great (I now subscribe to your RSS feed).

-Ken Keiter

Ken:
It's already open sourced! Check it out here :)

Post a comment
Commenting on this post has expired.
© 2000-2009 amix. Powered by Skeletonz.