
Using a polyglot database strategy to stay nimble in the shifting open source landscape

November 14, 2018 (updated August 18th, 2022)

Database technology moves pretty fast.

Percona recently published a blog post that lists many alternatives to MongoDB for different use cases. We love talking about options and helping organizations think about polyglot persistence.

A polyglot success story

One of our partners, Untappd, uses multiple data stores for their unique use cases. They found Elasticsearch allowed them more control over their large data set, enabling them to enhance their activity feed for users with 30 days of history instead of just 10 days. With Mongo, they were inserting thousands of documents, one for each of a user’s friends. With Elasticsearch, they only needed one document per check-in, making their app much faster and more efficient. They still use Mongo for parts of their workload, but having a polyglot database approach has been a huge improvement for Untappd’s users and their business objectives.
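
The difference between the two write patterns can be sketched in a few lines of plain Python. This is only an illustration of the general idea (copying a check-in into every friend's feed versus storing it once and assembling the feed at read time), not Untappd's actual implementation. The `FRIENDS` data and all function names are made up for the example.

```python
from collections import defaultdict

# Hypothetical social graph: user -> list of friends (illustrative data only).
FRIENDS = {
    "alice": ["bob", "carol", "dave"],
    "bob": ["alice"],
    "carol": ["alice"],
    "dave": ["alice"],
}


def fan_out_on_write(feeds, user, checkin):
    """Copy the check-in into every friend's feed; returns documents written."""
    for friend in FRIENDS[user]:
        feeds[friend].append({"user": user, "checkin": checkin})
    return len(FRIENDS[user])


def single_document_write(checkins, user, checkin):
    """Store the check-in once; returns documents written."""
    checkins.append({"user": user, "checkin": checkin})
    return 1


def feed_at_read_time(checkins, reader):
    """Build the reader's feed by filtering on their friend list at query time."""
    friends = set(FRIENDS[reader])
    return [doc for doc in checkins if doc["user"] in friends]


feeds = defaultdict(list)
checkins = []
writes_fan_out = fan_out_on_write(feeds, "alice", "Pale Ale")
writes_single = single_document_write(checkins, "alice", "Pale Ale")
print(writes_fan_out, writes_single)  # 3 1
print(feed_at_read_time(checkins, "bob") == feeds["bob"])  # True
```

With three friends the fan-out approach writes three documents per check-in; with thousands of friends it writes thousands, while the single-document approach always writes one and shifts the work to the query.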

Knowing the alternatives

You’ve probably heard the saying, “When you have a hammer, everything looks like a nail.” This is very true with databases. Many organizations just use what they’re familiar with. That’s okay, but it’s not great. In order to win in your market, you need the right tool for the job.

This is where the value of database as a service really shines. It’s really hard to keep up with the latest and greatest databases for a given use case. DBaaS allows developers to leverage database expertise so that they no longer have to figure that out on their own. That way, engineers can focus on developing features.

To add some more color to a few of the options that Percona listed in their Mongo alternatives post, below are several open source data stores your organization might want to take a closer look at. The goal is an application that is fast and efficient without putting all your eggs in one basket.

A primer on the CAP theorem

In theoretical computer science, the CAP theorem states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

Consistency: Every read receives the most recent write or an error
Availability: Every request receives a (non-error) response – without the guarantee that it contains the most recent write
Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes

The CAP theorem implies that in the presence of a network partition, one has to choose between consistency and availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.*
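
To make the trade-off concrete, here is a toy model of a two-replica store during a partition. It is a deliberately simplified sketch, not a real replication protocol: the `TinyStore` class, its `mode` flag, and the `demo` helper are all invented for illustration. In "CP" mode the store returns an error rather than risk an inconsistent answer; in "AP" mode it keeps answering but may return a stale value.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0


class TinyStore:
    """Two replicas; `mode` is "CP" or "AP". A toy, not a real protocol."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, value):
        # In this sketch, writes always arrive at replica A.
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: cannot reach replica B")
        self.a.value, self.a.version = value, self.a.version + 1
        if not self.partitioned:
            # Replicate to B only while the network link is healthy.
            self.b.value, self.b.version = self.a.value, self.a.version

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: cannot confirm latest write")
        return self.b.value  # may be stale in AP mode


def demo(mode):
    store = TinyStore(mode)
    store.write("v1")
    store.partitioned = True  # the network between A and B fails
    try:
        store.write("v2")
        return store.read_from_b()
    except RuntimeError as err:
        return f"error: {err}"


print(demo("AP"))  # v1
print(demo("CP"))  # error: unavailable: cannot reach replica B
```

The AP run answers with `v1` even though `v2` was accepted on the other side of the partition, while the CP run refuses the write outright.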

Cassandra

Apache Cassandra is well suited to building scalable distributed clusters. It’s popular among companies that work with very large, active data sets. Since it’s a distributed system, there is no single point of failure. A common use case for Cassandra is time series data. The first cloud-native version of Netflix’s viewing history storage architecture used Cassandra for its good support for modeling time series data, in which each row can have a dynamic number of columns.
Pros: Fast writes, point queries
Cons: Data duplication required to support multiple query patterns, thus more storage
CAP theorem: AP
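
The "data duplication" trade-off is easier to see with a small model. The sketch below uses plain Python dictionaries to loosely mimic partitions whose entries are kept sorted by timestamp. It is not Cassandra code or Netflix's schema, and names such as `views_by_member` and `record_view` are hypothetical. Each query pattern gets its own "table", so every event is written twice.

```python
from bisect import insort
from collections import defaultdict

# Each "table" maps a partition key to a list kept sorted by timestamp,
# loosely mimicking a partition with a timestamp clustering column.
views_by_member = defaultdict(list)
views_by_title = defaultdict(list)


def record_view(member_id, title, ts):
    """Write the same event twice, once per query pattern."""
    insort(views_by_member[member_id], (ts, title))
    insort(views_by_title[title], (ts, member_id))


def recent_views(member_id, since_ts):
    """Point lookup on one partition, then a range filter on the timestamp."""
    return [(ts, title) for ts, title in views_by_member[member_id] if ts >= since_ts]


record_view("m1", "Show A", 100)
record_view("m1", "Show B", 300)
record_view("m2", "Show A", 200)

print(recent_views("m1", 150))   # [(300, 'Show B')]
print(views_by_title["Show A"])  # [(100, 'm1'), (200, 'm2')]
```

Both lookups touch a single partition, which is what keeps them fast; the price is storing each view in two places.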

Couchbase

Couchbase is a multi-model NoSQL database. It’s a hybrid of a key-value store and a key-document store (from its combination of CouchDB and Membase). Couchbase has its own query language called N1QL, a SQL-like query language for JSON.
Pros: Multi-model serves different workloads in the same database.
Cons: May require more horsepower.
CAP theorem: CP

CockroachDB

If you’re looking for a (mostly) SQL-compliant database with ACID transactions that also scales like a NoSQL database, check out CockroachDB. Inspired by Google’s Spanner, CockroachDB aims to provide open source, highly available, massively scalable, and globally distributed databases that offer SQL compatibility. Since it follows the PostgreSQL wire protocol and dialect, it’s also highly compatible with Postgres clients.
Pros: SQL standard and ACID transactions with the scalability of a NoSQL database
Cons: Not fully SQL compliant, still a “young” technology
CAP theorem: CP

Elasticsearch

Yes, Elasticsearch can add lightning fast text search to your site or app, but it’s so much more than search! Elasticsearch has features and capabilities that enable it to act as a document store, time series database, visualization platform, and more. Many of our MongoDB customers have benefitted from moving some of their workload to Elasticsearch. For certain use cases, it is much faster and more cost-effective.
Pros: Fast reads and full-text search. Elastic Stack and broad use cases
Cons: Writes are expensive, sacrifices accuracy for velocity
CAP theorem: AP
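
Elasticsearch is built on Apache Lucene, whose core data structure is an inverted index: a map from each term to the documents that contain it. The following is a minimal sketch of that idea in plain Python, with made-up documents and helper names (`tokenize`, `build_index`, `search`). It leaves out everything that makes a real search engine useful, such as relevance scoring, analyzers, and sharding.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Illustrative documents only.
DOCS = {
    1: "Hazy IPA with citrus notes",
    2: "Crisp pilsner, light citrus finish",
    3: "Imperial stout aged in bourbon barrels",
}
INDEX = build_index(DOCS)
print(sorted(search(INDEX, "citrus")))      # [1, 2]
print(sorted(search(INDEX, "Citrus IPA")))  # [1]
```

The index is built at write time, which hints at why reads are fast and writes are comparatively expensive.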


InfluxDB

An open-source time series database developed by InfluxData, InfluxDB is written in Go. It was built to handle time-stamped data such as Internet of Things (IoT) sensor readings, operations monitoring, application metrics, and real-time analytics. It’s a good choice if you need to log events across multiple applications. InfluxDB addresses the challenge of ingesting millions of incoming events that need to be filtered and analyzed.
Pros: High write rates and efficient storage utilization for time series data
Cons: Limited use cases as it is purely a time series database, no “out-of-the-box” security
CAP theorem: AP (short answer)

MongoDB

The number one document store database is MongoDB. Mongo meets the need for flexible data models that serve unstructured, semi-structured, and structured data types while supporting more programming languages (27) than Cassandra (13) and Couchbase (14).
Pros: Secondary indexes, point queries
Cons: Writes slow down when the B-tree doesn’t fit in RAM; read/write locks
CAP theorem: CP


MySQL

Almost every customer we have also has some relational/SQL data. Some use cases just require relational data and ACID transactions. Any application that requires multi-row transactions, such as an accounting system, is best suited for a relational database like MySQL. MySQL 8 has great support for JSON and it continues to improve with every maintenance release.
Pros: Secondary indexes, point queries, transactions
Cons: Scaling can be difficult
CAP theorem: AC
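
As a rough illustration of why multi-row transactions matter for something like an accounting system, the sketch below moves money between two rows and relies on the database to undo the first update if the second one fails. To keep it runnable without a server, it uses SQLite from Python's standard library as a stand-in for MySQL. The `accounts` table and the `transfer` helper are hypothetical, and MySQL syntax and driver calls would differ in the details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("checking", 100), ("savings", 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` between two rows atomically; returns True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    rows = conn.execute("SELECT name, balance FROM accounts ORDER BY name")
    return dict(rows)


print(transfer(conn, "checking", "savings", 30))   # True
print(transfer(conn, "checking", "savings", 500))  # False
print(balances(conn))  # {'checking': 70, 'savings': 80}
```

In the failed transfer the credit to `savings` has already been applied when the debit is rejected, and the rollback removes it again, so the two rows never disagree.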

Neo4j

If what you’re interested in is the connections and links between your various documents, you should look at a graph database like Neo4j. Though a number of databases offer graph-like capabilities, Neo4j is built from the ground up to perform as a graph database. Graph databases make it easy and fast to analyze how the members of your dataset are connected and ultimately help provide actionable insights on those connections.
Pros: Fast recommendation engine and fraud detection
Cons: Scaling out is a challenge
CAP theorem: AC
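
The kind of question a graph database answers well is "who is two hops away from this user?", which is the basis of a simple friend recommendation. In Neo4j this would be written as a query in its Cypher language; the sketch below only shows the underlying traversal in plain Python over a small, made-up adjacency list, with hypothetical helper names.

```python
from collections import deque

# Hypothetical "who knows whom" edges, stored as an adjacency list.
GRAPH = {
    "ann": {"ben", "cat"},
    "ben": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"ben", "cat", "eve"},
    "eve": {"dan"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each reachable node to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


def suggest_friends(graph, user):
    """Friends-of-friends who are not already direct friends."""
    return sorted(n for n, d in within_hops(graph, user, 2).items() if d == 2)


print(suggest_friends(GRAPH, "ann"))        # ['dan']
print(within_hops(GRAPH, "ann", 3)["eve"])  # 3
```

Doing the same thing with joins or nested document lookups gets more awkward with every extra hop, which is where a purpose-built graph store tends to pay off.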

Redis

Redis is an in-memory database that is also used as a cache and message broker thanks to its high performance. Cache and session stores are a must for apps, which makes Redis a great complement to a main document store to ensure you don’t lose session data if your application goes down. Session data may include user profile information, messages, personalized data and themes, recommendations, targeted promotions, and more.
Pros: Variety of data structures
Cons: Ephemeral data store
CAP theorem: AC
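
A session cache usually follows a read-through pattern: look in the cache first, fall back to the primary store on a miss, and let entries expire after a while. Redis supports per-key expiry (for example via its `EXPIRE` command). The sketch below imitates that behavior with an in-process Python class so it runs without a Redis server; `SessionCache`, `load_profile`, and the 1800-second TTL are assumptions made for the example. Unlike this stand-in, a real Redis instance runs as a separate process, which is what lets session data outlive an application restart.

```python
import time


class SessionCache:
    """In-process stand-in for a key-value cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


def load_profile(cache, user_id, fetch_from_primary):
    """Try the cache first, then fall back to the primary store."""
    key = f"session:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = fetch_from_primary(user_id)
        cache.set(key, profile, ttl_seconds=1800)
    return profile


# Usage with a fake clock so expiry can be shown without waiting.
now = [0.0]
cache = SessionCache(clock=lambda: now[0])
calls = []


def fetch(user_id):
    calls.append(user_id)  # record each trip to the "primary store"
    return {"id": user_id, "theme": "dark"}


load_profile(cache, 42, fetch)  # miss: fetched from the primary store
load_profile(cache, 42, fetch)  # hit: served from the cache
now[0] = 1801.0
load_profile(cache, 42, fetch)  # expired: fetched again
print(len(calls))  # 2
```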


I only covered a handful of open source data stores above. DB-Engines lists 348 different database management systems. Approximately half of those (173) are open source. There are a lot of options out there. The most important thing to keep in mind is flexibility. Don’t be afraid to experiment with new ways of handling your data layer.

Our customers have seen a lot of success with trying something new. Their app works faster, their users are happy, and they even save money. Using a database management company can help your team stay nimble and allow you to experiment. If you’re interested in giving us a shot for 30 days to check out what we have to offer, contact us. We’ll set up some time to discuss your specific use case and let you know if we’re a good fit for your needs.

*https://en.wikipedia.org/wiki/CAP_theorem