
Analyzing Redis performance results


In the previous installment, I discussed approaches for preventing your Redis instance from becoming slow. Now it’s time to examine ways of measuring performance.

What To Measure

For this installment, we’re looking at command latency and its components. Why? Because the number of commands you can push through a Redis server/library is a result of the speed of each command.

Quick and Easy: The CLI

The first way to check your command latency is to use the command line client redis-cli. It’s quick and gives you an immediate checkpoint to start from. The examples here will be using localhost for simplicity, but in your setup you should be using the -h <host> option and, if needed, -a <auth> and -p <port>.
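For example, a remote, password-protected instance would be reached like this (the hostname and password are placeholders):

redis-cli -h redis.example.com -p 6379 -a yourpassword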

Basic Latency

To get the basic latency results run redis-cli --latency. This will output a set of numbers indicating min, max, avg, and the number of iterations. These iterations consist of running the Redis command PING against an open connection. Consequently it does not include any time to connect to the server. If your code connects on every command you’ll see much worse performance than this command will indicate.

This command will run until you kill it—easily done via CTRL-C. The latency numbers are in milliseconds. Thus a result of:

min: 0, max: 1, avg: 0.11 (11369 samples)

means the minimum latency was less than 1 millisecond, the maximum was 1 millisecond, with an average per-PING latency of 0.11 milliseconds. Yes, this is a localhost connection. These numbers are a result computed from 11,369 “samples”, or PING commands.
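To get a feel for how much the connect-per-command pattern costs, here is a minimal sketch of the comparison in Go. It is not part of redis-cli or any tool mentioned here; it assumes the third-party go-redis client (github.com/redis/go-redis/v9) and a server on localhost:6379.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	addr := "localhost:6379" // adjust for your server

	// One persistent connection: this is what redis-cli --latency measures.
	client := redis.NewClient(&redis.Options{Addr: addr})
	defer client.Close()
	start := time.Now()
	for i := 0; i < 1000; i++ {
		if err := client.Ping(ctx).Err(); err != nil {
			panic(err)
		}
	}
	fmt.Printf("persistent connection: %v per PING\n", time.Since(start)/1000)

	// A fresh connection per command: the pattern to avoid.
	start = time.Now()
	for i := 0; i < 1000; i++ {
		c := redis.NewClient(&redis.Options{Addr: addr})
		if err := c.Ping(ctx).Err(); err != nil {
			panic(err)
		}
		c.Close()
	}
	fmt.Printf("connection per command: %v per PING\n", time.Since(start)/1000)
}

Even on localhost the gap is noticeable; over a real network the TCP handshake (plus any AUTH round trip) dominates.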

Alternatively, you could use --latency-history instead. This option allows you to break it out into temporal slices. Running redis-cli --latency-history will use a default window of 15 seconds. The result will look something like this:

min: 0, max: 1, avg: 0.11 (1475 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.10 (1474 samples) -- 15.01 seconds range

Like --latency, this command will run until you kill it. If you want to run with a different interval you can pass -i <interval>, with the interval given in seconds. For example:

redis-cli --latency-history -i 10
min: 0, max: 1, avg: 0.11 (984 samples) -- 10.01 seconds range
min: 0, max: 1, avg: 0.11 (983 samples) -- 10.01 seconds range
min: 0, max: 1, avg: 0.11 (983 samples) -- 10.01 seconds range

Since these operations simply use the PING command this should work against any version of Redis.

Intrinsic Latency

At first glance, the name might lead you to believe intrinsic latency mode is measuring the intrinsic latency of the server. It isn’t. If you look at the redis-cli source on GitHub, you’ll find it doesn’t even talk to the server:

start = ustime();         /* timestamp in microseconds */
compute_something_fast(); /* a tiny fixed computation, no syscalls */
end = ustime();
latency = end-start;      /* spikes here mean the process was preempted */

According to the source comments on intrinsic latency mode:

Measure max latency of a running process that does not result from syscalls. Basically this software should provide a hint about how much time the kernel leaves the process without a chance to run.

What this is doing is testing intrinsic latency on your client host. This is useful for knowing if the problem actually lies on the client side rather than the server itself.

So, while useful, it should probably not be among your first steps when checking your latency. Update: after speaking with Salvatore, he has confirmed the intent is for this command to be run from the server. This means it can be useful if you have shell access to your Redis server, but not otherwise, which is commonly the case if you use a Redis service provider.
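If you do have shell access to the server, run it there, passing the length of the test in seconds:

redis-cli --intrinsic-latency 100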

Using Commissar

Commissar is a small suite I am working on for tracking and testing Redis latency and overall performance. Warning: it is early in development and much is likely to change. For this reason, only use it if you feel comfortable using non-production-quality software. Commissar is available in the Commissar repository on GitHub.

Specifically, for this article we will be looking at the latency tool/directory.

This tool, configured via environment variables, will connect to a given Redis instance and run the same “time the PING command” mechanism as redis-cli does. I did this to maintain parity between the tests. Where it differs is in its ability to output more information and (currently) store it in MongoDB, with more stores to come.

A run of latency might look like this, though the output is likely to have changed since this article was written:

./latency 
Connected to <host:ip>

100000 iterations over 53698us, average 536us/operation

Percentile breakout:
====================
99.00%: 3,876.99us
95.00%: 640.00us
90.00%: 514.00us
75.00%: 452.00us
50.00%: 414.00us

Min: 243us
Max: 44,686us
Mean: 536.98us
Jitter: 764.37us

Notice the temporal unit is ‘us’ (i.e. microseconds) in this run. Also, you can see it will give you the percentile breakout. This lets you get a better picture of what the latency looks like compared to a simple min/max/avg. In this context ‘Jitter’ is the standard deviation of the sample.
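For the curious, here is a minimal sketch in Go (not the tool’s actual code) of how a percentile breakout and the jitter figure can be computed from raw samples, using the nearest-rank method and a population standard deviation:

package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

func main() {
	// Stand-in samples; in practice there is one per PING iteration.
	samples := []time.Duration{
		243 * time.Microsecond, 414 * time.Microsecond,
		452 * time.Microsecond, 514 * time.Microsecond,
		640 * time.Microsecond, 44686 * time.Microsecond,
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })

	// Nearest-rank percentile over the sorted samples.
	pct := func(p float64) time.Duration {
		idx := int(math.Ceil(p/100*float64(len(samples)))) - 1
		return samples[idx]
	}
	for _, p := range []float64{99, 95, 90, 75, 50} {
		fmt.Printf("%.2f%%: %v\n", p, pct(p))
	}

	// Jitter: the standard deviation of the samples.
	var sum time.Duration
	for _, s := range samples {
		sum += s
	}
	mean := float64(sum) / float64(len(samples))
	var variance float64
	for _, s := range samples {
		d := float64(s) - mean
		variance += d * d
	}
	variance /= float64(len(samples))
	fmt.Printf("Mean: %v  Jitter: %v\n",
		time.Duration(mean), time.Duration(math.Sqrt(variance)))
}

With only six stand-in samples the upper percentiles all land on the maximum; the real tool computes these over the full run (100,000 samples above).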

This output tells us that overall we’re looking pretty good. This particular example is over a private network. We can see the peak is much higher than the mean, but 95% of the iterations come in at or below 640 microseconds. What does this tell us?

It could tell us there is a sporadic network issue, a sporadic server-side delay, or even something like garbage collection in the test client affecting the output. This is why the distribution of the requests is made available: so you can see how common a “high” result is.

In order to determine where the problem lies, you will want to first look at how long the commands on the server are taking. This will tell you whether the server is the problem.

In this case, for example, you might want to check the Redis INFO output, looking specifically at the commandstats section:

redis-cli info commandstats
# Commandstats
cmdstat_ping:calls=32497193,usec=22213723,usec_per_call=0.68

Here we can see that the average time to execute PING on the server is 0.68 microseconds. Clearly the PING command itself is unlikely to be the culprit. However, other commands executed during the test could have backed up the Redis process, causing the delay.
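One way to narrow this down is to reset the counters immediately before a test run, so that the calls and usec_per_call figures reflect only the test window, and then check what else executed:

redis-cli config resetstat
./latency
redis-cli info commandstats

Keep in mind that CONFIG RESETSTAT clears the server’s accumulated INFO statistics, so avoid it on servers where you rely on those counters for long-term monitoring.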

A recent addition to the latency tool is the ability to run concurrent latency tests. By setting the LATENCY_CLIENTCOUNT environment variable, you can open multiple connections to your Redis server and see how much concurrency affects your latency results. For example, you can see the difference a simple Redis PING command sees when you have 10 clients versus 100, or even 1,000, concurrent clients. This can be useful for seeing the difference you might experience between development/staging and production.
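For example, assuming the connection settings are already exported as described in the repo’s README:

LATENCY_CLIENTCOUNT=10 ./latency
LATENCY_CLIENTCOUNT=100 ./latency

Comparing the two percentile breakouts can show how much of your latency is simple queuing behind concurrent clients.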

Testing from Containers

Prefer to run your tests from Docker containers? I now push built Docker images containing just the golatency tool to the public Docker registry. If you do a docker pull therealbill/golatency you will get the image, ready to run in a matter of seconds (the image currently weighs in at less than 4MB).

Simply call docker run with the environment variables to connect and configure, as the README indicates. Then pull the output from stdout or, if run as a daemon, via docker logs; you don’t have to build anything. Whenever I make significant changes to the tool, the Docker repo is updated with a new image. Alternatively, there is a Dockerfile in the repo, allowing you to customize it as you desire.
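For example, a one-off run with 10 concurrent clients might look like this (connection variables as described in the README):

docker pull therealbill/golatency
docker run --rm -e LATENCY_CLIENTCOUNT=10 therealbill/golatency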

Final Thoughts

Analyzing your performance with Redis can be tricky. Fortunately, with an understanding of latency and throughput and how they affect your application, you can focus more on analyzing your application’s use of Redis rather than Redis itself. Keep the Zen of Redis phrase “The key to performance is not slowing Redis down” in mind and you’ll improve your usage and understanding of Redis, while isolating network issues and weighing data structure alternatives will help you produce smarter, more efficient applications with better performance.