Saturday, 26 November 2011

Overview of MongoDB Java Write Concern Options


I spent some time this week working with MongoDB write concerns in the Java driver. A write concern controls how far a write operation must progress (for example sent, acknowledged by the primary, replicated, or flushed to disk) before the call returns.

In this post I will attempt to give an overview of the basic write concern options available to you. The driver provides additional options that are not explained here.

Default Write Concern
Let's imagine you want to persist a simple document with the property 'name' set to the value 'test' in the collection 'customers'. Your first version may look something like this:

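// getCollection(...) is assumed to be your own helper that returns the DBCollection.
BasicDBObject dbObj = new BasicDBObject();
dbObj.put("name", "test");
DBCollection coll = getCollection("customers");
// No write concern is passed, so the default applies.
coll.save(dbObj);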


The question you should be asking is: where did that 'save' put the document? The answer is that it handed the write to the driver and returned immediately. By default the driver behaves like a write-behind: it sends the write to the MongoDB server but does not wait for the server to acknowledge it. This is very powerful, but remember the data may not be on the server when the method returns. For example, if you do a findOne(dbObj) immediately after the save it may return null, as the data may not yet have reached the server.

Write Safe
The Java driver allows you to specify the write concern that must be satisfied before the save method returns. For example, you may decide that instead of just handing the write to the driver you would like to wait for the server to receive the write operation before returning.
 
coll.save(dbObj,WriteConcern.SAFE);

Now when save is executed it blocks until the primary node acknowledges that it received the write operation. If you do a findOne(dbObj) immediately after the save, the primary will return the saved object, as the data must have reached the server in order for the save to have returned.
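To picture the difference between the default behaviour and SAFE, here is a small toy model. It does not use the MongoDB driver at all; the class ToyWriteModel and all of its methods are hypothetical names made up for this illustration, and the "server" is just an in-memory map.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model only: this is NOT the MongoDB driver, just an illustration of the two behaviours. */
final class ToyWriteModel {
    // Writes that have left the caller but have not yet been applied by the "server".
    private final Queue<String[]> inFlight = new ArrayDeque<>();
    // Data the "server" has actually applied and can return to readers.
    private final Map<String, String> serverData = new HashMap<>();

    /** Default-style write: hand the write over and return without waiting. */
    void saveUnacknowledged(String key, String value) {
        inFlight.add(new String[] {key, value});
    }

    /** SAFE-style write: return only once the "server" has applied the write. */
    void saveAcknowledged(String key, String value) {
        inFlight.add(new String[] {key, value});
        serverAppliesPendingWrites(); // stands in for blocking on the acknowledgement
    }

    /** Simulates the "server" catching up on everything sent so far. */
    void serverAppliesPendingWrites() {
        String[] write;
        while ((write = inFlight.poll()) != null) {
            serverData.put(write[0], write[1]);
        }
    }

    /** Reads straight from the "server"; null means the write has not arrived yet. */
    String findOne(String key) {
        return serverData.get(key);
    }

    public static void main(String[] args) {
        ToyWriteModel model = new ToyWriteModel();

        model.saveUnacknowledged("name", "test");
        System.out.println("after default-style save: " + model.findOne("name"));

        model.saveAcknowledged("name", "test");
        System.out.println("after SAFE-style save: " + model.findOne("name"));
    }
}
```

In the model, the read after the default-style save finds nothing because the write is still in flight, while the read after the SAFE-style save finds the value. A real driver and server involve sockets and connection pools, so treat this only as a mental picture.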

Write Majority
Ensuring the data has reached the primary is useful; however, your availability requirements may call for the data to have reached the majority of servers within your replica set. For example, let's imagine you have a five-server replica set: you may want the write to have been replicated to at least three servers before returning.

coll.save(dbObj,WriteConcern.MAJORITY);

One way to think about the MAJORITY option is that it extends the SAFE option by increasing the number of nodes that must acknowledge the write to a majority of the replica set. The benefit of this is that you know the write has reached many servers; the disadvantage is that the call blocks for longer, as it is now waiting for the majority of servers to acknowledge the write.
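The "three out of five" figure is simply the smallest number that is more than half of the set. A minimal sketch of that arithmetic, assuming every member of the replica set counts towards the majority (MajorityMath and its methods are hypothetical helper names, not part of the driver):

```java
/** Arithmetic only; no driver calls are made here. */
final class MajorityMath {
    /** Smallest number of members that is more than half of the replica set. */
    static int majority(int members) {
        if (members < 1) {
            throw new IllegalArgumentException("a replica set needs at least one member");
        }
        return members / 2 + 1;
    }

    /** How many members can be unavailable while a majority can still acknowledge. */
    static int tolerableFailures(int members) {
        return members - majority(members);
    }

    public static void main(String[] args) {
        for (int members = 1; members <= 7; members++) {
            System.out.printf("%d members -> majority %d, tolerates %d down%n",
                    members, majority(members), tolerableFailures(members));
        }
    }
}
```

For a five-server set this gives a majority of three with two servers allowed to be down. Note that a four-server set also needs three acknowledgements, so it tolerates only one server being down.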

Write FSYNC_SAFE
You can strengthen the write concern to ask the save operation to wait until the data has been flushed to the MongoDB server's data files on disk.

coll.save(dbObj,WriteConcern.FSYNC_SAFE);   
  
There are some use cases where it makes sense to wait for the server to flush the writes; however, this is not as common as you may initially think. When you horizontally and elastically scale your data nodes within the cloud, the fact that the write is on one of the disks does not mean that much, as the machine may be shut down or moved by the cloud platform at any time. If you have a replica set that is distributed over multiple data centres, you may achieve greater durability by replicating across data centres with a tagged write concern. I may cover MongoDB tagging in a future post.

Write JOURNAL_SAFE
MongoDB also supports a journal in both single-server and replica set environments. As you would expect, the journal records the operations that have been performed on the server. If required, you can specify a write concern that blocks until the write has been flushed to the journal.

coll.save(dbObj,WriteConcern.JOURNAL_SAFE);    

One use case where you may consider this option is a single-server environment, as a means of checking that the write has reached the journal file.

Summary
This post has only covered the basics of the MongoDB Java driver's write concern options. There are many additional options that you could explore.

The API is very powerful and gives you a range of write options; however, you should carefully consider your use case, data availability concerns and system setup when deciding which write concern makes sense for your write operations.
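As a rough way to summarise the options covered in this post, here is a hypothetical decision helper. The class WriteConcernChooser, its parameters and the priority order are my own assumptions for illustration and are not part of the driver; the enum names simply mirror the WriteConcern constants used above, with DRIVER_DEFAULT standing for "pass no write concern".

```java
/** Hypothetical helper: the rules are one reading of this post, not driver behaviour. */
final class WriteConcernChooser {
    /** Mirrors the options discussed in the post; DRIVER_DEFAULT means no explicit write concern. */
    enum Choice { DRIVER_DEFAULT, SAFE, MAJORITY, FSYNC_SAFE, JOURNAL_SAFE }

    /**
     * Picks the first option that satisfies the strongest stated requirement.
     * The priority order (replication, then journal, then fsync, then plain
     * acknowledgement) is an assumption made for this sketch.
     */
    static Choice choose(boolean needsAck, boolean needsReplication,
                         boolean needsJournal, boolean needsFsync) {
        if (needsReplication) {
            return Choice.MAJORITY;      // wait for most of the replica set
        }
        if (needsJournal) {
            return Choice.JOURNAL_SAFE;  // wait for the journal flush
        }
        if (needsFsync) {
            return Choice.FSYNC_SAFE;    // wait for the data file flush
        }
        if (needsAck) {
            return Choice.SAFE;          // wait for the primary to acknowledge
        }
        return Choice.DRIVER_DEFAULT;    // return without waiting
    }

    public static void main(String[] args) {
        System.out.println(choose(false, false, false, false));
        System.out.println(choose(true, false, false, false));
        System.out.println(choose(true, true, false, false));
    }
}
```

Your own rules will almost certainly differ, for example if you need both replication and journaling, which this sketch does not model.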

4 comments:

  1. Good stuff, and this is a great feature of MongoDB and the drivers. I went through this same survey with MongoDB about a year back. Note that FSYNC-safe adds an order of magnitude to the wait time, but it's appropriate for the occasional critical write (like creating a new user account, vs. just an update). Like most things in life, it's a tradeoff -- just always note the write concern when you're evaluating MongoDB performance.

  2. Thanks for this post Chris

  3. What kind of write concern do you suggest for logging?
    I usually have to write to a log collection. I would like to know what is recommended.

  4. If I am using w:majority, and the data is received by the primary node and the primary suddenly goes down, should there be a rollback or not?
