Replica set health

Replication lag

Typically, in our replica sets we monitor replication lag, which measures how far a secondary is behind the primary.

 

There are two types of replication:

Chained replication, which is enabled by default. Chained replication occurs when a secondary member replicates from another secondary member instead of from the primary. It is recommended for reducing the load on the primary.

Primary replication, which must be configured explicitly. Primary replication occurs when all secondary members replicate directly from the primary.
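As a minimal sketch of that explicit configuration: chaining is controlled by the settings.chainingAllowed field of the replica set configuration. The helper name withChainingDisabled and the sample config values are hypothetical; the helper only builds the new configuration object, and the commented line at the end assumes a shell session connected to the primary.

```javascript
// Hypothetical helper: returns a copy of a replica set config document
// with chained replication turned off (settings.chainingAllowed = false).
// The input object is not modified.
function withChainingDisabled(cfg) {
  return {
    ...cfg,
    settings: { ...(cfg.settings || {}), chainingAllowed: false },
  };
}

// Trimmed-down config object with illustrative values only.
const exampleCfg = { _id: "rs0", version: 3, settings: { chainingAllowed: true } };
const newCfg = withChainingDisabled(exampleCfg);
console.log(newCfg.settings.chainingAllowed); // false

// In a shell connected to the primary, this could be applied with:
//   rs.reconfig(withChainingDisabled(rs.conf()))
```

Building a new object instead of editing rs.conf() in place makes it easy to inspect the change before calling rs.reconfig().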

In the mongo shell, we get this information with rs.printSlaveReplicationInfo() (renamed rs.printSecondaryReplicationInfo() in newer shell versions).
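To make the measurement concrete, here is a rough sketch that derives the lag of each secondary from a document shaped like the output of rs.status(), using the stateStr and optimeDate fields of each member. The function name replicationLagSeconds and the sample hosts and times are made up for illustration.

```javascript
// Computes replication lag in seconds for each secondary, given a document
// shaped like the output of rs.status().
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary found in replica set status");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // optimeDate is the time of the last oplog entry applied by the member;
      // subtracting two Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Illustrative status document (hypothetical hosts and times).
const sampleStatus = {
  members: [
    { name: "db1:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T12:00:10Z") },
    { name: "db2:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T12:00:08Z") },
    { name: "db3:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T12:00:00Z") },
  ],
};
const lags = replicationLagSeconds(sampleStatus);
console.log(lags);

// In the shell, the same function could be called as:
//   replicationLagSeconds(rs.status())
```

With the sample data, db2 is 2 seconds behind the primary and db3 is 10 seconds behind.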

 

 

Oplog window

The other important aspect to keep in mind is the oplog window, which is the time difference between the oldest and newest entries in the oplog.
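The definition above comes down to a simple subtraction. The sketch below assumes we already have the timestamps of the oldest and newest oplog entries as Date objects; the function name oplogWindowHours and the sample dates are hypothetical.

```javascript
// Oplog window in hours: time between the oldest and newest oplog entries.
function oplogWindowHours(oldestEntry, newestEntry) {
  const ms = newestEntry.getTime() - oldestEntry.getTime();
  return ms / (1000 * 60 * 60);
}

// Illustrative timestamps: the oplog covers a day and a half.
const windowHours = oplogWindowHours(
  new Date("2024-01-01T00:00:00Z"),
  new Date("2024-01-02T12:00:00Z")
);
console.log(windowHours); // 36
```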

 

 

The oplog window tells us how much time we have to work on secondary members before they can no longer catch up from the oplog, for example to:

  • Add a new secondary member
  • Install a new OS patch
  • Install a new MongoDB version
  • Perform a full resync of a secondary member
  • Recover a secondary member from a failure
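Before any of the tasks above, it can help to compare the expected downtime with the current oplog window. The following is only a sketch: the function name fitsInOplogWindow and the default safety factor of 0.5 (use at most half of the window) are assumptions, not MongoDB recommendations.

```javascript
// Hypothetical check: does a planned offline period fit inside the oplog
// window while leaving a safety margin? A safetyFactor of 0.5 means that
// at most half of the window may be used.
function fitsInOplogWindow(windowHours, plannedDowntimeHours, safetyFactor = 0.5) {
  return plannedDowntimeHours <= windowHours * safetyFactor;
}

// The window could come from db.getReplicationInfo().timeDiffHours in the shell;
// here it is hard-coded to 36 hours for illustration.
const patchFits = fitsInOplogWindow(36, 4);   // 4 h against an 18 h budget
const resyncFits = fitsInOplogWindow(36, 24); // 24 h against an 18 h budget
console.log(patchFits);  // true
console.log(resyncFits); // false
```

If the check fails, the usual options are to shorten the maintenance or to increase the oplog size before starting.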

To show oplog window information in the mongo shell, use rs.printReplicationInfo().

 

Conclusion

It's very important to monitor the health of our replica sets. The aspects to monitor are:

Replication lag

Oplog window

We can use monitoring tools such as Nagios, MongoDB Compass, or Atlas to automate the monitoring and alert us if something goes wrong.

 

Author: Daniel Cruz
