Fragmentation in MongoDB databases
When we remove a large amount of data from a collection and do not plan to replace it, the collection becomes fragmented. To deal with this, MongoDB provides the compact command.
The compact command rewrites and defragments all data and indexes in a collection. What exactly it does depends on the storage engine in use.
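As a rough illustration, compact is an ordinary database command, so it can also be issued from a driver. The sketch below assumes Python with PyMongo, whose `Database.command` method sends a command to the server. The helper name `compact_collection` and the database and collection names in the usage comment are hypothetical.

```python
def compact_collection(db, collection_name):
    """Run the compact command against a single collection.

    `db` is assumed to be a pymongo.database.Database, or any object
    that exposes a compatible `command` method.
    """
    # Equivalent to the shell form: db.runCommand({ compact: "<collection>" })
    return db.command("compact", collection_name)


# Usage sketch (assumes PyMongo is installed and a mongod is reachable):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   print(compact_collection(client["mydb"], "mycollection"))
```

The helper only takes the database handle and a collection name, so it can be called once per collection and once per member that needs compacting.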
Differences between storage engines (included in the Community Edition)
COMMAND | MMAPV1 | WIREDTIGER |
COMPACT | On MMAPv1, compact defragments the collection’s data files and recreates its indexes. Unused disk space is not released to the system but instead retained for future data. | On WiredTiger, compact attempts to reduce the required storage space for data and indexes in a collection, releasing unneeded disk space to the operating system. |
COMPACT | Compact requires up to 2 gigabytes of additional disk space to run on MMAPv1 databases. | Compact does not require any additional disk space to run on WiredTiger databases. |
COMPACT | If you wish to reclaim disk space from an MMAPv1 database, you should perform an initial sync. | – |
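Because the behaviour in the table depends on the storage engine, it helps to check which engine a mongod is running before planning the maintenance. The following is a minimal sketch, assuming the result of the `serverStatus` command is available as a Python dictionary; the function names are hypothetical and the second helper simply restates the first row of the table.

```python
def storage_engine_name(server_status):
    """Return the storage engine name from a serverStatus result document."""
    # serverStatus reports the active engine under storageEngine.name,
    # for example "wiredTiger" or "mmapv1".
    return server_status["storageEngine"]["name"]


def compact_releases_disk_space(engine_name):
    """Restate the table: does compact hand unused space back to the OS?"""
    # WiredTiger: compact attempts to release unneeded space to the OS.
    # MMAPv1: space is retained for future data; an initial sync reclaims it.
    return engine_name.lower() == "wiredtiger"


# Usage sketch (assumes PyMongo and a reachable mongod):
#   status = client.admin.command("serverStatus")
#   print(compact_releases_disk_space(storage_engine_name(status)))
```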
Considerations:
- Always have an up-to-date backup before performing server maintenance such as the compact operation.
- Compact only blocks operations for the database it is currently operating on. Only use compact during scheduled maintenance periods.
- The compact command does not replicate to secondaries in a replica set.
  - Compact each member separately.
  - Ideally, run compact on a secondary; the command forces the secondary into the RECOVERING state.
- Sharded clusters
  - Compact only applies to a mongod instance. In a sharded environment, run compact on each shard separately as a maintenance operation.
- Index building
  - mongod rebuilds all indexes in parallel following the compact operation.
- I recommend running compact when the result of the following formula is greater than 15%, that is, when more than 15% of the allocated storage is not occupied by data or indexes:
  - 100% - ((Data + Indexes) / Storage) × 100% > 15%
- On databases larger than a few gigabytes, compact can take a while to complete.
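The 15% rule of thumb from the list above can be sketched as a small calculation. The three sizes are passed in as plain numbers. Which statistics they come from (for example the sizes reported by the `dbStats` command) is left as an assumption, and the function names and sample sizes are made up for illustration.

```python
def unused_space_percent(data_bytes, index_bytes, storage_bytes):
    """Percentage of allocated storage not occupied by data or indexes."""
    if storage_bytes <= 0:
        raise ValueError("storage_bytes must be positive")
    unused_bytes = storage_bytes - (data_bytes + index_bytes)
    return 100.0 * unused_bytes / storage_bytes


def should_compact(data_bytes, index_bytes, storage_bytes, threshold_percent=15.0):
    """Apply the rule of thumb: compact when unused space exceeds the threshold."""
    unused = unused_space_percent(data_bytes, index_bytes, storage_bytes)
    return unused > threshold_percent


GIB = 1024 ** 3

# Made-up sizes: 60 GiB of data and 10 GiB of indexes in 100 GiB of storage.
print(unused_space_percent(60 * GIB, 10 * GIB, 100 * GIB))  # 30.0
print(should_compact(60 * GIB, 10 * GIB, 100 * GIB))        # True
```

With these sample numbers 30% of the storage is unused, which is above the 15% threshold, so the collection would be a candidate for compact.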
Compacting and compressing are two completely different concepts in MongoDB; please do not confuse them.