Fragmentation in MongoDB databases
When we remove a large amount of data from a collection and do not plan to replace it, the collection becomes fragmented. To deal with this, MongoDB provides the compact command.
The compact command rewrites and defragments all data and indexes in a collection. What it does exactly depends on the storage engine in use.
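As a minimal sketch of how the command is issued, the snippet below wraps `db.runCommand({ compact: ... })`. The helper names `buildCompactCommand` and `compactCollection` and the collection name `orders` are hypothetical and used only for illustration. The `db` argument is assumed to be the database handle available in the mongo shell, or any object exposing `runCommand()`.

```javascript
// Build the command document that asks mongod to compact one collection.
function buildCompactCommand(collectionName) {
  if (typeof collectionName !== "string" || collectionName.length === 0) {
    throw new TypeError("collectionName must be a non-empty string");
  }
  return { compact: collectionName };
}

// Send the compact command through a database handle.
// `db` is assumed to expose runCommand(), as the shell's `db` object does.
function compactCollection(db, collectionName) {
  return db.runCommand(buildCompactCommand(collectionName));
}

// Example call in the shell (hypothetical collection name):
//   compactCollection(db, "orders")
```

Keeping the command document in its own function makes it easy to inspect what will be sent before running it against a real server.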
Differences between Storage Engines (included in Community Edition)
| Command | MMAPv1 | WiredTiger |
|---|---|---|
| COMPACT | On MMAPv1, compact defragments the collection’s data files and recreates its indexes. Unused disk space is not released to the system but instead retained for future data. | On WiredTiger, compact attempts to reduce the required storage space for data and indexes in a collection, releasing unneeded disk space to the operating system. |
| COMPACT | Compact requires up to 2 gigabytes of additional disk space to run on MMAPv1 databases. | Compact does not require any additional disk space to run on WiredTiger databases. |
| COMPACT | If you wish to reclaim disk space from an MMAPv1 database, you should perform an initial sync. | – |
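Since the outcome depends on the storage engine, it can help to check which engine a mongod is running before compacting. The sketch below reads the engine name from `db.serverStatus().storageEngine.name` and maps it to the behavior described in the table. The helper names `storageEngineName` and `compactBehavior` and the shape of the returned object are assumptions made for this example.

```javascript
// Read the storage engine name from the server status document.
// `db` is assumed to be the shell's database handle.
function storageEngineName(db) {
  return db.serverStatus().storageEngine.name;
}

// Summarize what compact does for the two engines compared in the table.
// Returns null for any other engine, since the table does not cover it.
function compactBehavior(engineName) {
  switch (engineName) {
    case "mmapv1":
      return { releasesDiskSpace: false, extraDiskSpace: "up to 2 GB" };
    case "wiredTiger":
      // WiredTiger only *attempts* to release space to the operating system.
      return { releasesDiskSpace: true, extraDiskSpace: "none" };
    default:
      return null;
  }
}

// Example call in the shell:
//   compactBehavior(storageEngineName(db))
```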
- Always have an up-to-date backup before performing server maintenance such as the compact operation.
- Compact only blocks operations for the database it is currently operating on. Only use compact during scheduled maintenance periods.
- The compact command does not replicate to secondaries in a replica set.
- Compact each member separately.
- Ideally, run compact on a secondary. The command forces the member into the RECOVERING state.
- Sharded Clusters
- Compact only applies to a mongod instance. In a sharded environment, run compact on each shard separately as a maintenance operation.
- Index Building
- mongod rebuilds all indexes in parallel following the compact operation
- I recommend running compact if the result of the following formula is greater than 15%:
- 100% × (1 - (Data + Indexes) / Storage) > 15%
- On databases larger than a few gigabytes, compact can take a while to complete.
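The 15% rule of thumb from the list above can be sketched as a small function. Mapping "Data", "Indexes" and "Storage" to the `dataSize`, `indexSize` and `storageSize` fields returned by `db.stats()` is an assumption on my part, as are the helper names `fragmentationPercent` and `shouldCompact`. The fields are also assumed to arrive as plain numbers.

```javascript
// Share of allocated storage not accounted for by data and indexes:
//   100% - (Data + Indexes) / Storage
// `stats` is assumed to look like the output of db.stats().
function fragmentationPercent(stats) {
  const { dataSize, indexSize, storageSize } = stats;
  if (!(storageSize > 0)) {
    return 0; // nothing allocated, so nothing to reclaim
  }
  return 100 - ((dataSize + indexSize) * 100) / storageSize;
}

// Apply the threshold from the text (15% by default).
function shouldCompact(stats, thresholdPercent = 15) {
  return fragmentationPercent(stats) > thresholdPercent;
}

// Example call in the shell:
//   shouldCompact(db.stats())
```

For instance, with 60 units of data, 10 units of indexes and 100 units of storage, the formula gives 100 - 70 = 30%, which is above the threshold, so compacting would be worth considering.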
Compacting and compressing are two entirely different concepts in MongoDB; please don't confuse them.