Friday, March 08, 2013

MongoDB - How to Easily Pre-Split Chunks

Okay, so I'm doing some performance testing on MongoDB, spinning up a new configuration, and then throwing lots of data at it to see what the characteristics are.

Early on, I discovered that database locking was severely limiting performance.  MongoDB (as of 2.2.3) takes a database-level lock during every write operation.  However, the lock only applies to the specific shard being written to, so massive numbers of writes dictate large numbers of shards, to minimize the lock/unlock bottleneck.  That's the theory.

In practice, this indeed proved correct.  We went from running 2 shards (on 4 boxes - 2 primaries and 2 replicas) to running 48 shards.  YES, they all magically cooperate nicely and share memory and CPU equally.  Granted, the boxes each had 196 GB of memory and 32 cores, so we weren't limited on machine capacity.  YMMV.

BTW, at peak operations, we're only seeing CPU of 50% per daemon, so we're not CPU limited. 

But during my tests, I had to spin up first 4 shards, then 8, then 24, then 48, and waiting for each test to settle into a steady state took time.  Each shard had to be active and equally used.  Likewise, there had to be enough data that each shard was reading and writing at the same time, with data being read back out to send to the replicas.

Startup was handicapped by all of the activity hitting one shard and none of the others.  I would start things off, and all the writes would land on a single shard.  Since I was maxing out that shard's capacity, there wasn't any spare I/O available to do the splits and balancing.

To fix this, I found that I could add chunk split-points ahead of time, a process called 'pre-splitting chunks'.  To do this:
  • Turn off the balancer.
  • Run the 'split' command with a 'middle' value, once per desired split point (roughly one per shard).
  • Turn on the balancer again.

I had 48 shards on a key with range 000-999, so I pretended I had 50 shards and gave each shard a range of 20 values.

No, you don't have to stop the database or do anything drastic like failing over to secondaries.
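As a quick check on that arithmetic, here's a small JavaScript sketch (runnable in node or pasted into the mongo shell) that computes those split points.  The `splitPoints` function name is mine, and it assumes the shard key is a zero-padded 3-digit string:

```javascript
// Sketch: compute the split points for a 000-999 key range divided
// into 50 chunks of width 20 (assumes a zero-padded 3-digit string key).
function splitPoints(numChunks, keySpan) {
  var width = Math.floor(keySpan / numChunks); // 1000 / 50 = 20
  var points = [];
  for (var i = 0; i < numChunks; i++) {
    points.push(("000" + i * width).slice(-3)); // zero-pad to 3 digits
  }
  return points;
}

var points = splitPoints(50, 1000);
console.log(points[0], points[1], points[points.length - 1]); // → 000 020 980
```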

So, my commands were:

$ mongo --port 99999999 --host myhost
> use config
> sh.stopBalancer();
> use admin
> db.runCommand({split: "myDb.myCollection", middle: {"key1": "000", "key2": 1}});

> db.runCommand({split: "myDb.myCollection", middle: {"key1": "020", "key2": 1}});
> ... (and so on, one per remaining split point: "040", "060", ..., "980")
> use config
> sh.startBalancer();
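Rather than typing ~48 of those by hand, you can generate them in a loop.  This sketch just builds the command documents (same "myDb.myCollection" namespace and key names as the session above; the zero-padded string key values are an assumption) - from the mongo shell, with the balancer stopped, you'd pass each one to db.getSiblingDB("admin").runCommand(...):

```javascript
// Sketch: build one 'split' command per 20-wide slice of the 000-999 key range.
var commands = [];
for (var i = 0; i < 1000; i += 20) {
  commands.push({
    split: "myDb.myCollection",
    middle: { key1: ("000" + i).slice(-3), key2: 1 } // zero-pad to 3 digits
  });
}

// In the mongo shell, run each one against the admin database:
// commands.forEach(function (cmd) {
//   printjson(db.getSiblingDB("admin").runCommand(cmd));
// });
console.log(commands.length + " split commands, first middle key1 = " +
            commands[0].middle.key1);
```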



NOTE:  Always leave the balancer off for more than 1 minute, or it doesn't count.  That is, if you're 'bouncing' the balancer, leave it off for a full 60+ seconds or the bounce doesn't do anything.
