No such thing as average latency

Latency is the time taken for a message to travel from one system to another. Consequently, the average latency is the sum of all latencies divided by the total number of messages processed, i.e. the inverse of the throughput (the total number of messages processed divided by the total time taken to process them).

…Right?

Wrong, in most cases. The above reasoning does not take into account the distribution of the latencies. The arithmetic mean (the average), when applied to a skewed distribution, can be meaningless at best and misleading at worst.

EXAMPLE

Two competing systems each process 200 messages in 1000 ms.

It takes 5 ms for System A to process each of the 200 messages.
Throughput = 200/1000 = 0.2 msg/msec
Latency = 1/Throughput = 5 msec

It takes 1 ms for System B to process each of the first 199 messages, and a further 801 ms to process the 200th message.
Throughput = 200/1000 = 0.2 msg/msec
Latency = 1/Throughput = 5 msec

…which system is “better” depends on the expectations of the client, but clearly the average latency and throughput are identical even though the two systems exhibit significantly different performance characteristics.
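To make the arithmetic concrete, here is a minimal sketch that rebuilds both latency samples from the example and computes their arithmetic mean. The class and method names (MeanLatency, systemA, systemB, mean) are made up for illustration and are not part of any library.

```java
import java.util.Arrays;

class MeanLatency {
    // System A from the example: 5 ms for each of the 200 messages.
    static long[] systemA() {
        long[] latencies = new long[200];
        Arrays.fill(latencies, 5L);
        return latencies;
    }

    // System B from the example: 1 ms for 199 messages, 801 ms for the last one.
    static long[] systemB() {
        long[] latencies = new long[200];
        Arrays.fill(latencies, 1L);
        latencies[199] = 801L;
        return latencies;
    }

    // Arithmetic mean: sum of all latencies over the number of messages.
    static double mean(long[] latencies) {
        return (double) Arrays.stream(latencies).sum() / latencies.length;
    }

    public static void main(String[] args) {
        System.out.println("System A mean: " + mean(systemA()) + " ms");
        System.out.println("System B mean: " + mean(systemB()) + " ms");
    }
}
```

Both lines report 5.0 ms, so the mean alone cannot tell the two systems apart.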

HOW TO FIX

Method 1)

The simplest and quickest way to get a more accurate picture of the “typical” latency is to take the median, which is better suited than the arithmetic mean to skewed distributions.

median for System A = 5 ms
median for System B = 1 ms
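The medians can be checked with a few lines of plain Java. This is only a sketch: MedianLatency and its helpers are hypothetical names, and the code assumes a non-empty sample.

```java
import java.util.Arrays;

class MedianLatency {
    // Median: the middle value of the sorted sample, or the mean of the
    // two middle values when the sample size is even. Assumes length > 0.
    static double median(long[] latencies) {
        long[] sorted = latencies.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Builds a sample of `count` latencies that are all `value` ms.
    static long[] uniform(int count, long value) {
        long[] latencies = new long[count];
        Arrays.fill(latencies, value);
        return latencies;
    }

    public static void main(String[] args) {
        long[] systemA = uniform(200, 5L);
        long[] systemB = uniform(200, 1L);
        systemB[199] = 801L; // the single slow message

        System.out.println("System A median: " + median(systemA) + " ms");
        System.out.println("System B median: " + median(systemB) + " ms");
    }
}
```

Unlike the mean, the median is unaffected by the single 801 ms outlier, which is why it separates the two systems.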

Method 2)

Use a histogram. You can build your own or reuse an existing one. The code below uses the Histogram class, which is part of the Disruptor package, to print the upper bound within which 99% of observations fall.

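// Histogram ships with Disruptor 2.x in this package
import com.lmax.disruptor.collections.Histogram;

// upper bounds of the histogram buckets, in ms
final long[] intervals = new long[] {1, 2, 5, 10, 50, 100, 1000};
Histogram h = new Histogram(intervals);
for (int i = 0; i < 200; i++) {
   h.addObservation(5);   // System A: every message takes 5 ms
}
System.out.println("System A:" + h.getUpperBoundForFactor(0.99d) + " ms");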

prints “System A:5 ms”

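// upper bounds of the histogram buckets, in ms
final long[] intervals = new long[] {1, 2, 5, 10, 50, 100, 1000};
Histogram h = new Histogram(intervals);
for (int i = 0; i < 199; i++) {
   h.addObservation(1);   // System B: 199 messages take 1 ms each
}
h.addObservation(801);    // ...and the 200th takes 801 ms
System.out.println("System B:" + h.getUpperBoundForFactor(0.99d) + " ms");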

prints “System B:1 ms”
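If you would rather build your own, the idea fits in a few dozen lines. The sketch below is not the Disruptor implementation: BucketHistogram, add and upperBoundFor are hypothetical names, and it assumes the bucket upper bounds are passed in ascending order.

```java
class BucketHistogram {
    private final long[] upperBounds; // ascending bucket upper bounds, in ms
    private final long[] counts;      // observations recorded per bucket
    private long total;               // observations recorded overall

    BucketHistogram(long[] upperBounds) {
        this.upperBounds = upperBounds.clone();
        this.counts = new long[upperBounds.length];
    }

    // Records the value in the first bucket whose upper bound is >= value.
    // Values above the last bound are not recorded and return false.
    boolean add(long value) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (value <= upperBounds[i]) {
                counts[i]++;
                total++;
                return true;
            }
        }
        return false;
    }

    // Smallest bucket upper bound within which at least `factor`
    // (e.g. 0.99) of the recorded observations fall.
    long upperBoundFor(double factor) {
        long needed = (long) Math.ceil(total * factor);
        long seen = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            seen += counts[i];
            if (seen >= needed) {
                return upperBounds[i];
            }
        }
        return upperBounds[upperBounds.length - 1];
    }

    // System B from the example: 199 messages at 1 ms, one at 801 ms.
    static BucketHistogram systemB() {
        BucketHistogram h = new BucketHistogram(new long[] {1, 2, 5, 10, 50, 100, 1000});
        for (int i = 0; i < 199; i++) {
            h.add(1);
        }
        h.add(801);
        return h;
    }

    public static void main(String[] args) {
        BucketHistogram b = systemB();
        System.out.println("System B 99%: " + b.upperBoundFor(0.99) + " ms");
        System.out.println("System B 100%: " + b.upperBoundFor(1.0) + " ms");
    }
}
```

For System B the 99% bound is 1 ms because the single slow message falls in the last 1% of observations; asking for a factor of 1.0 reports the 1000 ms bucket instead. Which percentile to report is therefore part of the measurement, not an afterthought.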

Update:
——–
Method 3) Even better than methods 1 and 2 above: use the Codahale Metrics library to get access to meters, histograms, timers (and more) without reinventing the wheel.
