
Understanding MongoDB space usage

September 18, 2014

For those of you new to MongoDB, its space usage can seem quite confusing. In this article, I will explain how MongoDB allocates space and how to interpret the space usage information in our ObjectRocket dashboard, so you can judge when you need to compact your instance or add a shard to grow its available space.

First, let’s start off with a brand new Medium instance consisting of a single 5GB shard. I am going to populate this instance with some test data in a database named “ocean”. Here’s what the space usage for this instance looks like after adding some test data and creating a few indexes (for the purposes of this article, I deliberately added additional indexes that I knew would be fairly large relative to the test data set): the dashboard’s space usage bar reports 315MiB of data and 254MiB of indexes, with 2.1 GiB of the 5 GiB shard in use.

How do 315MiB of data and 254MiB of indexes add up to 2.1 GiB of usage on our 5 GiB shard? To explain, let’s begin with how MongoDB stores data on disk as a series of extents. Because our ObjectRocket instances run with the smallfiles option, the first extent is allocated as 16MB. These extents double in size until they reach 512MB, after which every extent is allocated as a 512MB file. So our example “ocean” database has a file structure as follows:

$ ls -lh ocean/

total 1.5G

-rw------- 1 mongodb mongodb  16M Aug 20 22:30 ocean.0
-rw------- 1 mongodb mongodb  32M Aug 20 20:44 ocean.1
-rw------- 1 mongodb mongodb  64M Aug 20 22:23 ocean.2
-rw------- 1 mongodb mongodb 128M Aug 20 22:30 ocean.3
-rw------- 1 mongodb mongodb 256M Aug 20 22:30 ocean.4
-rw------- 1 mongodb mongodb 512M Aug 20 22:30 ocean.5
-rw------- 1 mongodb mongodb 512M Aug 20 22:30 ocean.6
-rw------- 1 mongodb mongodb  16M Aug 20 22:30 ocean.ns
drwxr-xr-x 2 mongodb mongodb 4.0K Aug 20 22:30 _tmp

These extents store both the data and indexes for our database. With MongoDB, as soon as any data is written to an extent, the next logical extent is allocated. Thus, with the above structure, ocean.6 likely has no data at the moment, but has been pre-allocated for when ocean.5 becomes full. As soon as any data is written to ocean.6, a new 512MB extent, ocean.7, will again be pre-allocated. When data is deleted from a MongoDB database, the space is not released until you compact — so over time, these data files can become fragmented as data is deleted (or if a document outgrows its original storage location because additional keys are added). A compaction defragments these data files because during a compaction, the data is replicated from another member of the replica set and the data files are recreated from scratch.
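You can see these numbers for yourself with db.stats(). A quick look at our example database (the scale argument of 1024 * 1024 reports the sizes in MiB; the field names below are the MMAPv1-era ones this article describes):

> use ocean
> db.stats(1024 * 1024)

In the output, dataSize and indexSize are the sizes of the documents and indexes themselves, storageSize is the space allocated to the collections’ data extents (it does not shrink when documents are deleted), numExtents counts the allocated extents, fileSize is the total size of the ocean.* data files on disk including preallocation, and nsSizeMB is the 16MB namespace file.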

An additional 16MB file, ocean.ns, stores the namespace. This same pattern occurs for each database on a MongoDB instance. Besides our “ocean” database, there are two additional system databases on our shard: “admin” and “local”. The “admin” database stores the user information for all database users (prior to 2.6.x, this database was used only for admin users). Even though the admin database is small, we still have a 16MB extent, a pre-allocated 32MB extent, and a 16MB namespace file for this database.
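If you want a quick per-database view of disk usage that includes these system databases, the listDatabases command reports a sizeOnDisk figure for each one (a rough check; sizeOnDisk reflects each database’s files on disk):

> db.adminCommand({ listDatabases: 1 })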

The second system database is the “local” database. Each shard we offer at ObjectRocket is a three-member replica set. In order to keep these replicas in sync, MongoDB maintains a log, called the oplog, of each update. This is kept in sync on each replica and is used to track the changes that need to be made on the secondary replicas. This oplog exists as a capped collection within the “local” database. At ObjectRocket we configure the size of the oplog to generally be 10% of shard size — in the case of our 5GB shard, the oplog is configured as 500MB. Thus the “local” database consists of a 16MB extent, a 512MB extent, and a 16MB namespace file.
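You can confirm the oplog configuration from the shell. Assuming your user can read the “local” database, db.printReplicationInfo() reports the configured oplog size and the window of time the oplog currently covers:

> use local
> db.printReplicationInfo()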

Finally, our example shard contains one more housekeeping area, the journal. The journal is a set of one to three files, each approximately 128MB in size. Whenever a write occurs, MongoDB first writes the update sequentially to the journal. A background thread then periodically flushes these updates to the actual data files (the extents I mentioned previously), typically once every 60 seconds. The reason for this double-write is that writing sequentially to the journal is often much, much faster than the seeking necessary to write to the actual data files. By writing the changes immediately to the journal, MongoDB can ensure data recovery in the event of a crash without requiring every write to wait until the change has been written to the data files. In the case of our current primary replica, I see we have two journal files active:

$ ls -lh journal/

total 273M

-rw------- 1 mongodb mongodb 149M Aug 20 22:26 j._1
-rw------- 1 mongodb mongodb 124M Aug 20 22:30 j._2

MongoDB rotates these files automatically depending on the frequency of updates versus the frequency of background flushes to disk.
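On MMAPv1-era builds with journaling enabled, the dur section of serverStatus gives a rough view of this activity, including how many megabytes were written to the journal and flushed out to the data files over the most recent sample interval:

> db.serverStatus().dur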

So now that I’ve covered how MongoDB uses disk space, how does this correspond to what is shown in the space usage bar from the ObjectRocket dashboard that I showed earlier?

  • NS value, 48MB — the sum of the three 16MB namespace files for the three databases mentioned above: ocean, admin, and local.
  • Data value, 315MiB — the sum of the dataSize value reported by db.stats() for all databases (including system databases).
  • Index value, 253.9MiB — the sum of the indexSize value reported by db.stats() for all databases (including system databases).
  • Storage value, 687.2MiB — the sum of data plus indexes for all databases, plus any unreclaimed space from deletes.
  • Total File value, 2.0 GiB — how much disk we are using in total on the primary replica. Beyond the space covered by the Storage and NS values, this also includes any preallocated extents, but not the space used by the journal.
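These dashboard values can be approximated from the shell by summing db.stats() across every database on the shard. The sketch below is an illustration rather than an exact reproduction of the dashboard (which reads from the primary and does its own rounding); it treats Storage as data extents plus index extents, matching the description above, and reports everything in MiB:

var totals = { data: 0, index: 0, storage: 0, file: 0 };
db.adminCommand({ listDatabases: 1 }).databases.forEach(function(d) {
    var s = db.getSiblingDB(d.name).stats(1024 * 1024);   // per-database stats, scaled to MiB
    totals.data    += s.dataSize;
    totals.index   += s.indexSize;
    totals.storage += s.storageSize + s.indexSize;         // data extents plus index extents
    totals.file    += s.fileSize;                          // data files on disk, incl. preallocation
});
totals

The Total File value on the dashboard additionally counts the 16MB .ns files, which db.stats() reports separately as nsSizeMB.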

Given these metrics, we can make some simple calculations to determine whether this instance is fragmented enough to need a compaction. To calculate the space likely lost due to fragmentation, use the following:

100% – (Data + Indexes) / Storage

In the case of our example instance, this works out to 17% (100% – (315MiB Data + 253.9MiB Index) / 687.2MiB Storage = 17%). I would recommend compacting your instance when the fragmentation approaches 20%.
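Using the example instance’s numbers, the same check is a one-liner in the shell:

> 100 * (1 - (315 + 253.9) / 687.2)   // roughly 17% lost to fragmentation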

Another calculation tells us whether we need to add a shard to this instance based on our overall space usage. To calculate your overall space usage, do the following:

(Total File / (Plan Size * number of shards)) * 100%

For our example instance, this works out to 40% ((2 GiB / (5 GiB * 1 shard)) * 100% = 40%). We generally recommend adding a shard when overall space usage approaches 80%. If you notice your space usage reaching 80%, contact support and we can help you add a shard to your instance.
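And the same arithmetic in the shell, again with the example instance’s numbers (a 2 GiB Total File value on a single 5 GiB shard):

> 100 * 2.0 / (5 * 1)   // 40% of overall capacity; plan for a new shard as this approaches 80%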