
MongoDB and Piles of SSDs

January 28, 2013

In mid-2011 we started playing around with MongoDB on solid state disks (SSDs).

We played with various configurations: flashcache, SSD for just the journal, just the oplog, both, or even the entire database on flash storage. We tried different RAID controllers, software RAID, different RAID configurations, I/O schedulers, filesystems, and kernels.
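For example, carving just the journal out onto flash is mostly a filesystem exercise: point the journal directory under dbpath at an SSD-backed mount. Here’s a rough sketch of that idea in Python; the paths are made-up assumptions for illustration, not our actual layout, and mongod needs to be stopped first.

```python
# Rough sketch: put just the MongoDB journal on SSD by replacing the
# dbpath/journal directory with a symlink to an SSD-backed mount.
# Paths are illustrative assumptions; stop mongod before doing this.
import os
import shutil

DBPATH = "/data/db"                 # assumed mongod dbpath on spinning disk
SSD_JOURNAL = "/ssd/mongo/journal"  # assumed SSD-backed directory

journal_dir = os.path.join(DBPATH, "journal")
os.makedirs(SSD_JOURNAL, exist_ok=True)

# Move any existing journal files over, then swap the directory for a symlink.
if os.path.isdir(journal_dir) and not os.path.islink(journal_dir):
    for name in os.listdir(journal_dir):
        shutil.move(os.path.join(journal_dir, name), SSD_JOURNAL)
    os.rmdir(journal_dir)

if not os.path.lexists(journal_dir):
    os.symlink(SSD_JOURNAL, journal_dir)  # mongod journals to flash on next start
```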

We started feeling pretty good about the performance we could get out of MongoDB on SSDs. MongoDB is very sensitive to physical I/O, especially then, when the database both had a single instance-wide lock and didn’t yield on fault. If the disk was slow, locks would pile up and overall database response time would skyrocket even with modest concurrency. On SSDs, however, disk access was so fast that the lock just mattered a whole lot less. We could run highly randomized workloads with lots of concurrency without any problems. Sure, if you could keep a database entirely in memory then this wasn’t as much of an issue, but that wasn’t very practical in the real world.
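To give a flavor of the kind of test we leaned on, here’s a hedged sketch of a randomized, concurrent point-read workload using pymongo. The host, database, collection, thread counts, and document counts are illustrative assumptions, not our actual harness, and it assumes a collection pre-loaded with integer _ids.

```python
# Sketch of a randomized, concurrent point-read benchmark against MongoDB.
# Assumes a local mongod and a "bench.docs" collection pre-loaded with
# integer _ids in [0, DOC_COUNT). All names and sizes are illustrative.
import random
import threading
import time

from pymongo import MongoClient

NUM_THREADS = 32          # modest concurrency, enough to pile up behind a slow disk
OPS_PER_THREAD = 1000
DOC_COUNT = 10_000_000    # assumed to be far larger than RAM so reads fault to disk

latencies = []
lock = threading.Lock()

def worker():
    client = MongoClient("mongodb://localhost:27017")  # assumed local mongod
    coll = client["bench"]["docs"]
    local = []
    for _ in range(OPS_PER_THREAD):
        start = time.perf_counter()
        coll.find_one({"_id": random.randrange(DOC_COUNT)})  # random point read
        local.append(time.perf_counter() - start)
    with lock:
        latencies.extend(local)

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

latencies.sort()
print("p50 %.2f ms  p99 %.2f ms" % (
    latencies[len(latencies) // 2] * 1000,
    latencies[int(len(latencies) * 0.99)] * 1000,
))
```

On a slow disk, a workload like this is exactly where the lock pile-up shows: tail latency blows out as threads queue behind faulting reads. On SSD the same randomized workload stayed flat for us.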

We felt like we might be onto something when we started to drive >500MB/s to disk with MongoDB.

In 2012 we founded ObjectRocket. We wanted to take what we had learned about MongoDB and SSDs and build it into a kick-ass Database as a Service. We decided to build our own infrastructure from the ground up rather than use one of the existing infrastructure providers. So we had our shot.

We started building prototypes of the hardware stack we wanted to use. We knew it had to be fast as hell to leapfrog the (then) yet-to-be-released Amazon SSD offering. Our preference has always been to use commodity gear, and as a startup on angel cash we had no budget for fancy card-based SSD offerings anyway. That meant making do with commodity SSDs in a 2.5″ disk form factor (as opposed to card-style storage). We tested early OCZ Vertex 2s and 3s, and finally settled on Intel 320s. These things were fast as hell, cheap as dirt (for SSDs), and had power-fault protection that flushed temporary buffers to NAND on unexpected shutdown. Perfect. We needed the power protection because we weren’t going with a typical RAID controller with battery-backed cache, but rather software RAID. We put it on the credit card and used Amazon Prime for our first shipments. Ironic, eh?

I remember getting the shipment of drives and thinking, “Holy s$%#, I’ve never seen this many SSDs in one place!”. They were literally in piles, well, ok, stacks.

We made the call to put each customer database entirely on SSD. We would simply use SSD everywhere. We started building up a hardware profile we called a ‘brick’: a simple piece of commodity gear with lots of SSDs and a bunch of other tricks to get MongoDB to haul ass. We settled on RAID10 because we wanted the ability to pull drives without affecting the MongoDB replica set, whether we were swapping out failed drives or hitting wear limits sooner than we had calculated. This meant burning more disk than we wanted to, but we wanted things to be solid. Each customer got every disk block on a mirror, then on two other replicas, each with its own mirrored disk set. All SSD.
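For a sense of what that redundancy costs, here’s a quick back-of-the-envelope sketch. The drive count and size are illustrative assumptions, not our actual brick spec.

```python
# Back-of-the-envelope math for "all SSD, RAID10, three-member replica set".
# Drive count and size are illustrative assumptions, not the real brick spec.
DRIVES_PER_NODE = 8
DRIVE_SIZE_GB = 300          # e.g. an Intel 320-class 2.5" SSD
REPLICA_SET_MEMBERS = 3      # primary plus two secondaries, each with its own RAID10

raw_per_node = DRIVES_PER_NODE * DRIVE_SIZE_GB
usable_per_node = raw_per_node / 2        # RAID10 mirrors every block
raw_total = raw_per_node * REPLICA_SET_MEMBERS

print(f"raw flash per node:    {raw_per_node} GB")
print(f"usable per node:       {usable_per_node:.0f} GB")
print(f"raw flash in the set:  {raw_total} GB")
print(f"physical copies of each logical block: {2 * REPLICA_SET_MEMBERS}")
```

Six physical copies of every logical block is why we were burning more disk than we wanted, but it’s also what let us pull drives without touching the replica set.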

We started to get really, really fast performance with very low latencies and tolerance for quite high levels of concurrency. We also started to see that the locking issue, not the disk, was now what was holding the database back on SSD. Upgrading to MongoDB 2.2 helped some, with database-level locking and some yield-on-fault improvements. We were coming out of our beta period with customers and things were looking good.

With a couple of tweaks we called that version 1.0 of our hardware profile and went into production. Whew. We were blowing past 500MB/s and nearly hit 1GB/s through MongoDB. ObjectRocket was born.