Since all ObjectRocket for Elasticsearch clusters come with multiple client nodes, one of the common questions we get from our customers is why we don’t provide a load balancer to connect to all of them.
Many see a single connection to a single node as a single point of failure for the cluster, so they’ve requested some kind of load balancer to manage a pool of connections.
The simple answer is that Elasticsearch is designed specifically to not need a load balancer. In this blog, let’s walk through all of the ways that Elasticsearch allows you to use all of the clients we provide.
Load balancing is built into the client
Almost every major Elasticsearch client allows you to provide multiple client endpoints and handles the load balancing internally. The method used can differ from client to client, but for the most part, they will use a round-robin scheme to loop through the available client nodes.
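To make the round-robin idea concrete, here is a minimal sketch in Python. The `RoundRobinPool` class and its `next_host` method are hypothetical names used only for illustration; they are not part of any Elasticsearch client, and real clients typically layer more on top of this (retries, for example).

```python
from itertools import cycle


class RoundRobinPool:
    """Toy host pool that hands out client nodes in round-robin order.

    Hypothetical illustration only, not an Elasticsearch client API.
    """

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        # cycle() repeats the list endlessly, which gives the wrap-around.
        self._hosts = cycle(list(hosts))

    def next_host(self):
        """Return the next host, wrapping around at the end of the list."""
        return next(self._hosts)


pool = RoundRobinPool([
    "dc-port-0.es.objectrocket.com",
    "dc-port-1.es.objectrocket.com",
    "dc-port-2.es.objectrocket.com",
    "dc-port-3.es.objectrocket.com",
])

# Five requests against four nodes: the fifth goes back to the first node.
picks = [pool.next_host() for _ in range(5)]
```

Each call to `next_host` moves one step through the list, so requests are spread evenly across the client nodes without any external load balancer.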
Let’s look at how you set this up in Python first:
    from elasticsearch import Elasticsearch
    import certifi

    # Pass every client node; the client balances requests across them.
    es = Elasticsearch(
        [
            'dc-port-0.es.objectrocket.com',
            'dc-port-1.es.objectrocket.com',
            'dc-port-2.es.objectrocket.com',
            'dc-port-3.es.objectrocket.com',
        ],
        http_auth=('YOUR_USERNAME', 'YOUR_PASSWORD'),
        port=12345,
        use_ssl=True,
        verify_certs=True,
        ca_certs=certifi.where(),  # CA bundle used to verify the server certificate
    )

You can see that the first argument accepts a list of hosts. That’s it; the client takes it from there. The setup is similar for other tools, like Beats:
    output:
      elasticsearch:
        hosts: ["https://dc-port-0.es.objectrocket.com:port", "https://dc-port-1.es.objectrocket.com:port", "https://dc-port-2.es.objectrocket.com:port", "https://dc-port-3.es.objectrocket.com:port"]

        # HTTP basic auth
        username: "YOUR_USERNAME"
        password: "YOUR_PASSWORD"
Alternatives for applications that won’t take multiple hosts
Though it’s uncommon, there are a few applications that don’t take a list of hosts. The most notable is Kibana, which can only accept a single host. To work around this, there are a couple of alternatives.
Point each part of your app at a different client
For non-mission-critical uses and applications, it is perfectly acceptable to point at a single client node. Unless the application has a super-high request rate, a single client should be able to easily manage the load. If you have a number of these types of applications, just point each one at a different client to balance the load.
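One way to spread several single-host applications across the client nodes is to hand them out in order, wrapping around when you run out of nodes. The sketch below assumes a hypothetical `assign_clients` helper and made-up application names; it only illustrates the bookkeeping, and the resulting host would still need to go into each application's own configuration.

```python
def assign_clients(apps, clients):
    """Map each application to one client node, wrapping around the node list.

    Hypothetical helper for illustration only.
    """
    if not clients:
        raise ValueError("at least one client node is required")
    return {app: clients[i % len(clients)] for i, app in enumerate(apps)}


clients = [
    "dc-port-0.es.objectrocket.com",
    "dc-port-1.es.objectrocket.com",
    "dc-port-2.es.objectrocket.com",
    "dc-port-3.es.objectrocket.com",
]

# Application names are placeholders, not real services.
apps = ["dashboard", "reporting-job", "log-viewer", "admin-tool", "nightly-export"]

assignments = assign_clients(apps, clients)
for app, host in assignments.items():
    print(f"{app} -> {host}")
```

With five applications and four nodes, the fifth application shares the first node, which is fine as long as neither has a very high request rate.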
Load balance locally
In the few cases where the client or application doesn’t support a list of hosts and you really need redundancy in client connections, another option is to set up some local load balancing. You can do this with something like nginx or HAProxy, or you can just set up a local hostname in DNS that will round-robin between the Elasticsearch clients. Once again, we see very few cases where this is absolutely needed, but there are solutions available when it does come up.
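The redundancy a local load balancer adds comes down to trying another node when one is unreachable. The Python sketch below shows that failover logic only; it is not a replacement for nginx or HAProxy. The `request_with_failover` function and the `send` callable are hypothetical, and the sketch assumes `send` raises `ConnectionError` when a node cannot be reached.

```python
def request_with_failover(hosts, send):
    """Try each host in order and return the first successful response.

    `send` is a caller-supplied function (an assumption of this sketch)
    that takes a host and raises ConnectionError if the node is unreachable.
    """
    last_error = None
    for host in hosts:
        try:
            return send(host)
        except ConnectionError as exc:
            # Remember the failure and move on to the next client node.
            last_error = exc
    raise ConnectionError("no client node reachable") from last_error


# Fake transport for demonstration: the first node is "down".
def fake_send(host):
    if host == "dc-port-0.es.objectrocket.com":
        raise ConnectionError(f"{host} is unreachable")
    return f"response from {host}"


hosts = [
    "dc-port-0.es.objectrocket.com",
    "dc-port-1.es.objectrocket.com",
]
result = request_with_failover(hosts, fake_send)
```

A proxy such as nginx or HAProxy does this on your application's behalf, so the application itself only ever sees a single local endpoint.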
The bottom line is that almost every client you come across will accept a list of hosts and handle the balancing for you, but in a pinch, there are examples out there to help you load balance locally when you need to.
Contact us to set up a chat about your data infrastructure and how we can help you get the most out of Elasticsearch.