PERFORM!

Troubleshooting performance issues can be fun (not)! Effective troubleshooting requires an environment that you can scale up and down as needed. It is important to avoid red tape; otherwise, it can take a long time to identify the source of your issue. The hardware you use must also be able to perform without interference from other sources.

As an example, network latency can cause huge ripples in your performance data, so it's best to run tests over a wired LAN rather than Wi-Fi. CPU utilization from another application can skew throughput rates, so it's best to run tests in a dedicated environment that is unaffected by other applications. My number one priority is almost always to replicate the performance issue in an environment that is unaffected by external factors.

This approach served me well while trying to reproduce a customer’s performance issue. Read on to see why!

The customer has 4 GB of RAM per Docker instance, with 2 GB assigned to the max heap. Each Docker instance has 2 CPU cores. We expect the platform to handle a maximum throughput of 150 transactions per second.

The customer’s measured transaction throughput was:

  • Test 1
    • Average of 3 tests
      • 15 Docker instances – 1.6 transactions per second
  • Test 2
    • Average of 3 tests
      • 1 Docker instance – 1.0 transactions per second
  • Test 3
    • Average of 3 tests
      • 2 Docker instances – 1.1 transactions per second
  • Test 4
    • Average of 3 tests
      • 4 Docker instances – 1.6 transactions per second
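
To put these numbers in perspective, here is a minimal sketch that compares each result with ideal linear scaling, using the single-instance run as the baseline. The function name `scaling_efficiency` is my own, and perfect linear scaling is an idealised yardstick rather than something any real cluster achieves.

```python
# Customer results from the tests above: instance count -> average transactions per second.
results = {1: 1.0, 2: 1.1, 4: 1.6, 15: 1.6}


def scaling_efficiency(results):
    """Return {instances: fraction of ideal linear throughput achieved}."""
    baseline = results[1]  # throughput of a single instance
    return {n: tps / (baseline * n) for n, tps in sorted(results.items())}


efficiency = scaling_efficiency(results)
for n, eff in efficiency.items():
    print(f"{n:>2} instances: {results[n]:.1f} tps, {eff:.0%} of linear scaling")
```

With 15 instances the platform delivers roughly 11% of what linear scaling from one instance would give, and going from 4 to 15 instances adds nothing at all.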

Based on these results, something is most definitely wrong: adding instances barely moves the throughput, and every test is far below the expected 150 transactions per second. I decided to replicate the customer’s environment in AWS, using an EC2 instance for each Docker instance, EFS for shared cache items, and a Classic ELB to distribute the work.

My results were all over the show! I could not reproduce a result from one test to the next, let alone come to any conclusion. Looking at the performance of my home network (from where I was running all these tests), I noticed that the network speed was inconsistent: it kept fluctuating between 10 Mbps and 20 Mbps.

I suspected that this fluctuation was causing my inconsistent results. The next step was therefore to eliminate my poor network, so I moved my testing application (JMeter) to a Windows desktop on EC2 and ran the same tests. My testing results were spot on! Every prediction was almost 100% accurate. It seems network latency and other network-related issues played a BIG role in the performance of my clustered, networked, REST API-dependent application. My tests using only 2 Docker instances produced 22 transactions per minute.

The AWS platform was so consistent that in one test the throughput was 13.88888889 transactions per second, and the next was again 13.88888889 transactions per second. Another test gave 12.82051282, and then 12.82051282 again. It was almost as if the AWS hosts did not even notice us… Everything was set up and performing as it should, in a highly predictable, almost scientific manner. AMAZING! I landed on a minimum throughput for 2 Docker instances of 22.72727273 and a maximum of 27.39726027.
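
If you want to put a number on "repeatable", one option is the coefficient of variation (standard deviation divided by mean) across repeated runs. This is a minimal sketch rather than the method I used at the time; the helper names and the 5% tolerance are assumptions you should tune for your own tests.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Relative spread of repeated runs: standard deviation divided by mean."""
    return pstdev(samples) / mean(samples)


def is_repeatable(samples, tolerance=0.05):
    """True when repeated runs agree within the tolerance (5% is an assumed default)."""
    return coefficient_of_variation(samples) <= tolerance


# Back-to-back EC2 runs reported above.
ec2_runs_a = [13.88888889, 13.88888889]
ec2_runs_b = [12.82051282, 12.82051282]

for runs in (ec2_runs_a, ec2_runs_b):
    print(runs, "repeatable:", is_repeatable(runs))
```

Feeding the same function the results of a few runs from a noisy network is a quick way to decide whether the test environment itself needs fixing before you trust any measurement.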

Figure: Clumsy in action

Next, I wanted to manually control the network performance by causing some latency issues, drops, and so forth.

With some searching, I found an amazing tool called Clumsy. I installed Clumsy and ran the same tests, and this time my performance dropped. My tests using only 2 Docker instances produced 3.9 transactions per minute, where previously the figure was in the 20s.
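
One way to reason about why added latency hurts this much is Little's Law: in a closed-loop load test, throughput is roughly the number of concurrent threads divided by the response time. The sketch below is a simplified model, not a description of how JMeter or Clumsy work internally. The thread count, service time, added latency and number of network round trips per transaction are all hypothetical values chosen for illustration; only the 22 and 3.9 figures come from my tests.

```python
def throughput(threads, service_time_s, added_latency_s, round_trips=1):
    """Little's Law for a closed loop: throughput = concurrency / response time."""
    response_time = service_time_s + round_trips * added_latency_s
    return threads / response_time


def relative_drop(before, after):
    """Fraction of throughput lost between two measurements."""
    return 1 - after / before


# Measured figures from the tests above (2 Docker instances).
print(f"measured drop with Clumsy: {relative_drop(22, 3.9):.0%}")

# Hypothetical model inputs: 10 threads, 0.4 s of server-side work per transaction,
# then 0.2 s of injected latency on each of 5 round trips.
baseline = throughput(threads=10, service_time_s=0.4, added_latency_s=0.0)
degraded = throughput(threads=10, service_time_s=0.4, added_latency_s=0.2, round_trips=5)
print(f"modelled drop: {relative_drop(baseline, degraded):.0%}")
```

The point of the model is that a chatty application pays the injected latency once per round trip, so a modest delay per packet can turn into a large drop per transaction.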

Conclusions:

  • I proved the appliances can scale and perform.
  • The network can be a huge factor in performance.
    • It is worth noting that the CPU will not reach high utilisation if there is a bottleneck elsewhere, such as networking, memory or disk I/O.

Figure: CPU performance using htop on the Linux Docker instance.
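
If you prefer numbers over a live htop view, the aggregate `cpu` line of `/proc/stat` on Linux exposes the same counters. The following is a minimal sketch that compares two samples of that line and reports how much of the interval was busy, idle or waiting on I/O. The two sample lines are hypothetical values, not readings from my test hosts, and the helper names are my own.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    fields = line.split()
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    return dict(zip(names, map(int, fields[1:9])))


def cpu_shares(before, after):
    """Fraction of elapsed CPU time spent busy, idle and waiting on I/O."""
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    idle, iowait = delta["idle"], delta["iowait"]
    return {
        "busy": (total - idle - iowait) / total,
        "idle": idle / total,
        "iowait": iowait / total,
    }


# Two hypothetical samples taken some seconds apart (made-up values).
first = parse_cpu_line("cpu  1000 0 500 8000 500 0 0 0 0 0")
second = parse_cpu_line("cpu  1100 0 550 8700 650 0 0 0 0 0")

shares = cpu_shares(first, second)
print({name: f"{value:.0%}" for name, value in shares.items()})
```

On a real host you would read the first line of `/proc/stat` twice with a pause in between. Keep in mind that time spent waiting on a remote network call generally shows up as idle rather than iowait, so a low busy share together with poor throughput is a hint to look beyond the CPU.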

Happy troubleshooting!

About Anto

Hi, my name is Anto! I am a cloud computing hobbyist! Give me anything to do with the cloud, and I am interested. I work for a Cloud computing company by day and as a Cloud computing hobbyist by night! My projects use PHP, NodeJs, Ubuntu, MySQL and of course Amazon Web Services. Hopefully, my blog aids your cloud journey! Feel free to post a comment and share your thoughts.
