Anybody who looks at the ObjectRocket for Elasticsearch page should (hopefully) notice that we mention dedicated containers and our high-performance hardware environment a number of times. Last week we posted a performance comparison between ourselves and a couple of the other Elasticsearch services to give you some insight into the results of our design. Now that you've seen the results, I thought it was about time that we walked through the architecture, where we designed for performance, and the other considerations we had to keep in mind.
In this blog I'll provide a quick overview of the resources you get with every ObjectRocket for Elasticsearch cluster, the function of those resources, and why we've designed things the way we have. First up are the different roles that make up an Elasticsearch cluster.
If you've only run Elasticsearch on a local machine for testing purposes, you may not have realized that it has the built-in ability to split out various roles to independent hosts. By default, Elasticsearch runs every role on each host, but this can be configured per host as you grow. This is covered in good detail on the Node page in the official Elasticsearch docs, but here is a summary of the roles we use on ObjectRocket:
- Data node: When this role is enabled, the node will store data and perform data-related operations.
- Master-eligible node: Nodes with this role enabled are eligible to become master. A master node mainly performs cluster-wide actions and coordinating tasks, like keeping track of which nodes are in a cluster and where shards get allocated.
- Ingest node: This role is new to Elasticsearch 5.x and enables nodes to process incoming data with ingest pipelines.
- Coordinating only node: Coordinating only nodes do just that. They accept requests and route them to the appropriate places. These are beneficial in larger clusters as they take load off of the data and master nodes.
There's definitely a lot more to those roles, but the information above should give you enough basis to understand our architecture and how we use each node type.
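As a rough illustration of how these roles combine, the sketch below maps the three boolean node settings used in Elasticsearch 5.x (`node.master`, `node.data`, and `node.ingest`, each defaulting to true) to the roles listed above. The function name `describe_roles` is made up for this example and is not part of Elasticsearch.

```python
def describe_roles(settings=None):
    """Return the roles a node would hold for a given set of node flags.

    Assumes the Elasticsearch 5.x boolean settings, each defaulting to true.
    """
    settings = settings or {}
    flags = {
        "master-eligible": settings.get("node.master", True),
        "data": settings.get("node.data", True),
        "ingest": settings.get("node.ingest", True),
    }
    roles = [name for name, enabled in flags.items() if enabled]
    # A node with every role switched off still accepts and routes requests,
    # which is what makes it "coordinating only".
    return roles or ["coordinating only"]


# Default configuration: every role on the same host.
print(describe_roles())
# All three flags disabled: a coordinating only node.
print(describe_roles({"node.master": False, "node.data": False, "node.ingest": False}))
```

A dedicated master, in this sketch, is simply a node with `node.data` and `node.ingest` set to false.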
What's in an ObjectRocket for Elasticsearch Instance?
When you create an "Instance" in the ObjectRocket UI, what you're actually getting is an 11-node cluster.
That's right: 11 nodes minimum, for every plan size, each in its own container on a different host. We split these nodes up four ways:
- 4 Client nodes: In Elasticsearch terms, these are technically coordinating only nodes, but in our architecture they do a lot more. Each one of these nodes acts as a coordinating only node, a layer of security (ACL enforcement and user authentication), and an ingest node. These are the only nodes in your Elasticsearch cluster with internet, Rackspace ServiceNet, and AWS Direct Connect access.
- 3 Master nodes: These three nodes are dedicated to performing the master role in the Elasticsearch cluster.
- 2 Data nodes: To start, every cluster has a minimum of two data nodes. When you pick a "Plan" for your ObjectRocket instance, these are the nodes that get those specs. As noted above, these nodes store the actual data for your instance and perform the majority of the data-related operations of your cluster. When you need to grow your cluster, these are the nodes that scale out and up.
- 2 Kibana nodes: To support Kibana, we also include two dedicated Kibana nodes, which run Kibana in an active-passive redundant setup. Like the client nodes, these nodes also serve as a layer of security by enforcing ACLs and user authentication for the Kibana nodes themselves.
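To make the arithmetic concrete, here is a small sketch that tallies the layout above and shows what changes when a cluster grows. The dictionary keys and the `scale_out` helper are illustrative names only, not part of any ObjectRocket API.

```python
# Node counts per group, taken from the list above.
LAYOUT = {"client": 4, "master": 3, "data": 2, "kibana": 2}


def total_nodes(layout):
    """Total number of nodes across every group."""
    return sum(layout.values())


def scale_out(layout, extra_data_nodes):
    """Return a new layout with more data nodes; the other groups stay put."""
    grown = dict(layout)
    grown["data"] += extra_data_nodes
    return grown


print(total_nodes(LAYOUT))                # 11
print(total_nodes(scale_out(LAYOUT, 2)))  # 13
```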
With all of that being said, I'd like to address a few questions about why we designed the service this way:
Why so many nodes?
The big reasons are performance, scale, and stability. When you pick your plan size, we want to make sure that all of that RAM and CPU go straight to search performance. If we had left coordinating or master workload on the data nodes, you could end up with resource contention in your cluster that could drive down overall performance while still leaving some nodes underutilized.
This also helps the cluster scale better, because the dedicated client and master nodes are sized to support many more data nodes than you may start with. This makes scaling as easy as adding a data node.
Finally, dedicated nodes for these different traffic types lead to a more stable cluster. You've got HA for every function of the Elasticsearch and Kibana cluster. You also don't have to worry about a search traffic spike on an extra busy node choking out master functions. This is best practice in many production Elasticsearch clusters, and that's our target at ObjectRocket.
Why do I need three dedicated master nodes?
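When you're talking clustering, three's the magic number. One leaves you with a single point of failure. Two leaves you at risk of split brain: if the two masters lose contact with each other, each one can decide it is in charge. Three is the smallest number that makes sense for a stable cluster, because two of the three can still form a majority when one is lost.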
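The reasoning comes down to majority arithmetic, sketched below. It uses the usual majority rule of (master-eligible nodes / 2) + 1, which is also the value the Elasticsearch documentation recommends for `discovery.zen.minimum_master_nodes` in 5.x. The helper names are ours, for illustration only.

```python
def quorum(master_eligible):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many masters can be lost while a majority can still be formed."""
    return master_eligible - quorum(master_eligible)


for n in (1, 2, 3):
    print(f"{n} master(s): quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

With one or two masters, no failure can be tolerated without giving up the majority requirement. Three is the first count where losing a node still leaves a quorum of two.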
What about the four clients?
Back when we created this layout, ingest nodes did not exist, so these nodes managed security and acted as coordinating nodes. We start with so many for redundancy, and to ensure that in ingest scenarios you have plenty of endpoints to spread your traffic across. There's very little queueing on each coordinating node, so having multiple nodes to spread traffic to helps avoid the problem of a coordinating node backing up.
This also helps in use cases where you want a dedicated endpoint for one part of your application to avoid congestion, while the rest of the application uses the other endpoints. Most Elasticsearch clients handle multiple endpoints out of the box, so the only configuration required on your end is dropping all of the client hostnames into your connection code.
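If you are curious what spreading traffic across endpoints looks like, here is a minimal standard-library sketch of round-robin rotation. Real Elasticsearch clients do this for you, usually with failure handling on top. The hostnames and port below are placeholders, not real ObjectRocket endpoints.

```python
from itertools import cycle

# Hypothetical hostnames; substitute the client endpoints for your instance.
CLIENT_ENDPOINTS = [
    "https://client-1.example.com:9200",
    "https://client-2.example.com:9200",
    "https://client-3.example.com:9200",
    "https://client-4.example.com:9200",
]


def endpoint_rotation(endpoints):
    """Return an iterator that walks the endpoints in round-robin order."""
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    return cycle(endpoints)


rotation = endpoint_rotation(CLIENT_ENDPOINTS)
# Five requests touch all four endpoints, then wrap back to the first.
first_five = [next(rotation) for _ in range(5)]
print(first_five)
```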
Now, with the addition of ingest nodes in Elasticsearch 5.x, there are new ways to split up these clients. All client nodes are capable of accepting ingest pipelines and traffic, but you can choose to send ingest traffic to only a subset of them. This gives you a nice, easy way to dedicate some horsepower to your ingest pipelines without bogging down the rest of your traffic.
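One simple way to picture that split is to carve the client endpoints into two pools on the application side, as in the sketch below. The `split_endpoints` helper and the short hostnames are hypothetical; how you divide the clients is entirely up to your application.

```python
def split_endpoints(endpoints, ingest_count):
    """Reserve the first `ingest_count` endpoints for ingest; the rest serve search."""
    if not 0 < ingest_count < len(endpoints):
        raise ValueError("both pools need at least one endpoint")
    return {
        "ingest": endpoints[:ingest_count],
        "search": endpoints[ingest_count:],
    }


# Hypothetical client hostnames: one reserved for ingest, three left for search.
pools = split_endpoints(["client-1", "client-2", "client-3", "client-4"], ingest_count=1)
print(pools["ingest"])
print(pools["search"])
```

Your indexing code would then connect only to the ingest pool, while search traffic uses the remaining endpoints.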
How does this setup impact security?
One of the other advantages of this arrangement is that we're able to keep all of the data on internal private networks. Only the client nodes and Kibana nodes are accessible from the public internet, AWS Direct Connect, or Rackspace ServiceNet. Both the Kibana and client nodes manage their own set of access control lists (ACLs) and user authentication to ensure that our service is secure at the edge. All of the data nodes and master nodes are isolated on private internal networks.
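To give a feel for what an ACL check at the edge involves, here is a small sketch that assumes an ACL is a list of allowed CIDR ranges. That format is an assumption made for illustration; this post does not describe how our ACLs are stored or evaluated, and the addresses below come from ranges reserved for documentation.

```python
import ipaddress


def is_allowed(client_ip, acl):
    """Return True if client_ip falls inside any CIDR range in the ACL."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in acl)


# Hypothetical ACL: one /24 network plus a single host.
ACL = ["203.0.113.0/24", "198.51.100.7/32"]

print(is_allowed("203.0.113.42", ACL))  # True
print(is_allowed("192.0.2.10", ACL))    # False
```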
If you've read the whole blog, I applaud you and thank you for your time. The bottom line is that our goal at ObjectRocket is for everyone to have a fast, stable cluster that can scale. Elasticsearch has a lot of the tools to enable that baked in, and we've tweaked where we can by putting together a cluster design that makes Elasticsearch and Kibana fly for you and scale easily as you grow. You focus on your app, we've got the data.