
On the second day of Christmas, ObjectRocket gave to me… two HA Postgres replicas!

December 12, 2019

We’re going to keep the holiday theme going and introduce another great new feature on the ObjectRocket service, and that’s High Availability (HA) for our PostgreSQL service. Every datastore we offer on ObjectRocket is built for production workloads, which generally requires HA, so we’ve been working hard over the past few months to deliver PostgreSQL HA just in time for the holidays.

Why High Availability Is Important

If the terms ‘High Availability’ or ‘HA’ are unfamiliar to you, let’s do a quick review of why HA matters. The three main benefits of HA are:

  • Zero or greatly reduced downtime
  • Protection against data loss
  • Increased database performance

There are a number of methods for implementing High Availability across datastores; even on a single datastore like PostgreSQL there are numerous technologies available. However, a key component of almost any HA solution is a replica of your data. What this means is that you only see one dataset/database, but behind the scenes, there are one or more exact copies (replicas) of that data. In the event that the main database (called the “master” in most replication schemes) encounters an issue like hardware failure, software failure, or corruption, a replica can then be used to replace the master.

That last point touches on the second main component of most HA systems: an automated failover mechanism (called promotion or election in other schemes). Replication, as described above, ensures that you always have multiple healthy copies of the data, but you need something else to do the following (roughly sketched in code after the list):

  1. Detect that a problem has occurred on the master
  2. Select an appropriate replica to promote to master
  3. Repair the failed master and/or create a new replica (to replace the one that has been promoted)
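
To make those three jobs concrete, here's a rough, hypothetical sketch of the loop an automated failover manager runs. None of these helper functions or data structures are real Patroni or ObjectRocket APIs; they simply stand in for the steps above.

```python
# Hypothetical sketch of an automated failover loop. The cluster dict and
# helper functions are illustrative stand-ins, not a real Patroni or
# ObjectRocket API.
import time

def master_is_healthy(cluster):
    # 1. Detect that a problem has occurred on the master (in real systems:
    #    failed heartbeats, a lost lease, or an unreachable node).
    return cluster["master"]["healthy"]

def pick_replacement(cluster):
    # 2. Select an appropriate replica to promote, e.g. the one that has
    #    replayed the most WAL (the least replication lag).
    return max(cluster["replicas"], key=lambda r: r["replayed_wal"])

def provision_replica():
    # 3. Create a new replica to replace the one that was promoted.
    return {"healthy": True, "replayed_wal": 0}

def failover(cluster):
    new_master = pick_replacement(cluster)
    cluster["replicas"].remove(new_master)
    cluster["master"] = new_master
    cluster["replicas"].append(provision_replica())

cluster = {
    "master": {"healthy": True},
    "replicas": [{"healthy": True, "replayed_wal": 200},
                 {"healthy": True, "replayed_wal": 180}],
}

while True:                       # the failover manager runs continuously
    if not master_is_healthy(cluster):
        failover(cluster)
    time.sleep(5)
```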

The final component (which is sometimes combined with the second) is a mechanism to route requests to the right node. If your application points to the master for writing data (since writing to a replica is a no-no), but that master fails, how does your application know to point to the newly promoted master? Once again, there are various ways to solve this, but the most popular are proxies or load balancers; rather than point your application directly to the database server, you point it to the proxy/load balancer and it determines the right place to send your traffic.

To tie it all together, the automated failover system and proxy/load balancer work together when a failover occurs. When a new master is promoted, the proxy/load balancer is informed to direct traffic to the new master. Nothing changes in your application, and besides a potential blip in responses during promotion, the application doesn’t even need to know a promotion has occurred.
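
From the application's point of view, the whole arrangement boils down to connecting to one stable endpoint and riding out the blip. Here's a minimal sketch of that idea; the endpoint name, database, and credentials are hypothetical placeholders, and the retry loop uses psycopg2.

```python
# Minimal sketch: the application always connects to the same (hypothetical)
# endpoint that fronts the proxy/load balancer; a short retry loop absorbs
# the brief blip while a replica is being promoted.
import time
import psycopg2

def run_write(sql, params, retries=5):
    for attempt in range(retries):
        try:
            conn = psycopg2.connect(
                host="my-postgres.example.com",  # hypothetical proxy/service endpoint
                dbname="app", user="app", password="secret",  # placeholder credentials
                connect_timeout=3,
            )
            with conn, conn.cursor() as cur:  # commits on success
                cur.execute(sql, params)
            conn.close()
            return
        except psycopg2.OperationalError:
            # The master may be mid-failover; back off and retry the same endpoint.
            time.sleep(2 ** attempt)
    raise RuntimeError("database still unavailable after retries")

run_write("INSERT INTO events (payload) VALUES (%s)", ("hello",))
```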

This is a greatly simplified overview of the process, but it covers the fundamental components. Now let’s dive into the technologies that we used for each of those components on our solution.

The Technologies We Used

Now that we’ve reviewed the key components, let’s look at how we provide each of them.

Replication

This one was easy, because Postgres supports a number of replication schemes natively, so no new tools were required. We support a configurable number of replicas: either 1 or 2 today, and we’ll be expanding the options in the future.
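
As a quick illustration, here's a minimal sketch of listing the replicas attached to the master through PostgreSQL's built-in pg_stat_replication view; the connection details are placeholders, and this assumes your user is allowed to read that view.

```python
# Minimal sketch: list the replicas streaming from the master via the
# built-in pg_stat_replication view (connection details are placeholders).
import psycopg2

conn = psycopg2.connect(host="my-postgres.example.com", dbname="app",
                        user="app", password="secret")
with conn.cursor() as cur:
    # One row per connected replica, including whether it is currently
    # replicating synchronously or asynchronously.
    cur.execute("SELECT application_name, state, sync_state FROM pg_stat_replication")
    for name, state, sync_state in cur.fetchall():
        print(f"replica={name} state={state} sync_state={sync_state}")
conn.close()
```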

One finer point of replication is the distinction between synchronous and asynchronous replication. It can get pretty in-depth, but the key point is that with synchronous replication, the master waits for the replica(s) to confirm that a write has completed before the master considers the write complete. With asynchronous replication, the master sends writes to the replicas but doesn’t wait for them to complete before confirming the write to the application/client.

The solution we use (which we’ll get to shortly) lets us support both synchronous and asynchronous replication. By default we enable synchronous replication (there are even more settings down this rabbit hole, but we configure replication in our environment to confirm that a write has been written to the Write-Ahead Log (WAL) on the master and on at least one replica), but you can alter the settings per transaction or per session.
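
For example, here's a minimal sketch of relaxing that behavior per session or per transaction with PostgreSQL's standard synchronous_commit setting; the connection details are placeholders.

```python
# Minimal sketch: relax the durability level per session or per transaction
# using PostgreSQL's synchronous_commit setting (connection details are placeholders).
import psycopg2

conn = psycopg2.connect(host="my-postgres.example.com", dbname="app",
                        user="app", password="secret")
conn.autocommit = True  # let us issue explicit BEGIN/COMMIT below

with conn.cursor() as cur:
    # Per-session: this connection no longer waits for replica confirmation
    # (writes are still durably logged in the master's WAL).
    cur.execute("SET synchronous_commit TO local")

    # Per-transaction: relax durability only for this one transaction,
    # e.g. a bulk load where losing the last few rows on failover is acceptable.
    cur.execute("BEGIN")
    cur.execute("SET LOCAL synchronous_commit TO off")
    cur.execute("INSERT INTO events (payload) VALUES (%s)", ("bulk row",))
    cur.execute("COMMIT")

conn.close()
```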

Failover and Promotion

There are a number of open source and third-party tools that provide failover and promotion functionality, but the one we landed on is Patroni. Without going into an in-depth analysis of the whole array of options, I’ll summarize why Patroni worked for us:

  • Native Kubernetes support: Our new platform is based on Kubernetes, so the ability to adopt a tool that just plugs into Kubernetes, instead of requiring another state or consensus mechanism, was key.
  • Active development and community: The community that has grown up around Patroni is extremely active and gives us the opportunity to collaborate and contribute our own additions as we add functionality. In addition, there are plenty of resources to help us learn the technology, from conference talks and documentation to an operator example from Zalando (though we didn’t end up using it).
  • Simple Architecture: Many of the other tools out there require dedicated resources outside of the Postgres instances themselves to handle load balancing and master promotion. Patroni wraps around Postgres itself and uses native Kubernetes components to handle the other functions, so we weren’t required to spin up additional resources to add HA.

Though your mileage may vary, we’ve found Patroni to be an excellent fit in our environment and easy to configure and maintain. You basically tell Patroni which HA group a node is part of and it does the rest: it handles configuring replication, detecting failed masters, promoting replicas, and creating new replicas, and it even works with Kubernetes on the last piece of the puzzle, request routing.

Request Routing

The final piece of the puzzle is handled by a combination of a native Kubernetes construct called a “service” and Patroni itself. Put very simply, a service in Kubernetes operates like a proxy that routes traffic based on labels on a pod (another Kubernetes term for a group of containers). 

In the case of Patroni, it labels the pod of the active master with a master label, and Kubernetes takes care of routing database traffic only to the pods with that label. There’s a lot of detail removed from that description, but in practice that’s how it works. This process can also be extended to provide a secondary port that routes database read requests to the replicas (to reduce load on the master).
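
As a rough illustration of that label-based routing, here's a minimal sketch that inspects which pods a master-only (or replica-only) selector would match, using the official Kubernetes Python client. The role=master and role=replica labels are assumptions; the exact label keys and values depend on how Patroni is configured.

```python
# Minimal sketch: see which pods a master-only (and replica-only) selector
# would match. The role=master / role=replica labels are assumptions; the
# actual label keys and values depend on the Patroni configuration.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running in-cluster
v1 = client.CoreV1Api()

# The write Service's selector targets the pod carrying the master label,
# so database writes only reach the current primary.
masters = v1.list_namespaced_pod("default", label_selector="role=master")
for pod in masters.items:
    print("writes routed to:", pod.metadata.name)

# A second selector can back a separate read-only port for the replicas.
replicas = v1.list_namespaced_pod("default", label_selector="role=replica")
for pod in replicas.items:
    print("reads can be routed to:", pod.metadata.name)
```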

That’s a quick rundown of the technologies in place, but let us know if you’d like to hear more. We love to talk shop with our customers.

Try It Out Now

I’ll close out this blog by mentioning that you can use this feature now. Our PostgreSQL service is still in Beta (GA coming soon!), so this feature is currently in Beta as well. The Create an Instance screen in Mission Control has a new section under Step 2 called “Customize Your Instance” (in our API, we call these add-ons). There you can click the box to turn the arrow green, which will add HA replicas to your PostgreSQL instance. You can also select the number of replicas (1 or 2) by clicking the ellipsis (…) in the upper right.

We default to and recommend 2 replicas, so that you still have redundancy even during an outage, but you can select 1 if you wish.

Check it out and let us know what you think!

Steve Croce

Steve Croce is currently a Senior Product Manager and Head of User Experience at ObjectRocket. Today, Steve leads the UX/UI team through rebuilding the platform’s user interface, scopes the company’s product and feature roadmap, and oversees the day-to-day development of ObjectRocket's Elasticsearch and PostgreSQL offerings. A product manager by day, he still likes to embrace his engineering roots by night and develop with Elasticsearch, SQL, Kubernetes, and web application stacks. He's spoken at KubeCon + CloudNativeCon, OpenStack Summit, Percona Live, and various ObjectRocket events.