Server Performance Metrics: 8 You Should Be Considering


With the DevOps movement entering the spotlight, more and more developers concern themselves with the end-to-end delivery of web applications. This includes the deployment, performance, and maintenance of the application.

As an application gains more users in a production environment, it’s increasingly critical that you understand the role of the server. To determine the health of your applications, you may find it useful to gather performance metrics for the servers running your web applications.

All different types of web servers (like Apache, IIS, Azure, AWS, and NGINX, for example) have similar server performance metrics. Most of my experience in this realm lies in Microsoft Azure, which provides an easy-to-use interface for finding and collecting data. Working with Microsoft Azure gives the capability to host applications in either Azure App Services (PaaS), or Azure Virtual Machines (IaaS). This setup gets you a view of the different metrics for the application or server running.

Because of all this experience I’ve had in the last few months, I’ve found what I think is eight of the most useful server performance metrics. These metrics can be divided into two categories: **app performance metrics and user experience metrics.

Let’s start by looking at the metrics under the app performance umbrella.

App performance metrics

App performance metrics are specific to the speed of the web applications that are running. If you’re having issues with an application performing slowly, these metrics are a good place to start.

Metric 1: Requests per second

Requests per second (also called throughput) is just like it sounds—it’s the number of requests your server receives every second. This is a fundamental metric that measures the main purpose of a web server, which is receiving and processing requests. Large-scale applications can reach up to about 2,000 requests per second.

Given enough load, any server can fall. When considering the impact, remember that requests are just that: a single request to the server. This metric doesn’t consider what’s happening in each of these requests.

This brings us to our next metric.

Metric 2: Data in and data out

The next metric I suggest you look at is your data in and data out. The data in metric is the size of the request payload going to the web server. For this metric, a lower rate is better (lower meaning that small payloads being sent into the server). A high data-in measurement can indicate the application is requesting more information than it needs.

Data out is the response payload being sent to clients. As websites have been getting larger over time, this causes an issue especially for those with slower network connections. Bloated response payloads lead to slow websites, and slow websites will dissatisfy your users. With enough slowness, these users abandon the website and move on. Google suggests pages that take three or more seconds for mobile users to load have about a 53% chance of users abandoning before load completion.

Metric 3: Average response time

Defined directly, the average response time (ART) is the average time the server takes to respond to all requests given to it. This metric is a strong indicator of the overall performance of the application, giving an impression of the application usability. In general, the lower this number is, the better. But there are studies showing that the ceiling limit for a user navigating through an application is around one second.

When considering ART, remember what the acronym stands for—it’s just an average. Like all metrics determined with an average, high outliers can throw the number off completely and make the system seem slower than is. ART is most helpful when used alongside our next metric on the list.

Metric 4: Peak response time

Similar to the average response time, the peak response time (PRT) is the measurement of the longest responses for all requests coming through the server. This is a good indicator of performance pain points in the application.

PRT will not only give you an idea of which portions of your applications are causing hangups; it will also help you find the root cause of these hangups. For example, if there’s a certain slow web page or a particularly slow call, this metric can give you an idea of where to look.

Metric 5: Hardware utilization

Next, let’s talk about overall hardware utilization. Any application or server running is limited by the resources allocated to it. Therefore, keeping track of the utilization of resources is key, primarily to determine if a resource bottleneck exists. You have three major aspects of a server to consider:

When considering these, you’re looking for what can become a bottleneck for the whole system. As any physical (or virtual!) computer running with these components will show, performance is only as strong as its weakest link. This metric can tell you what the bottleneck is and what physical component can be updated to improve performance.

For example, you may run into issues when trying to render data from a physical hard drive. That will cause a bottleneck in the I/O interactions between gathering files and presenting them to the user. While the hard drive spins and gathers data, the other physical components do nothing. An upgrade to a solid-state drive would improve the performance of the entire application because the bottleneck will disappear.

Metric 6: Thread count

The next metric—the thread count of a server—tells you how many concurrent requests are happening in the server at a particular time. This metric will help you understand what the general load of a server looks like from a request level. It will also give you an idea of the load placed on the server when running multiple threads.

A server can generally be configured with a maximum thread count allowed. By doing this, you’re setting a max limit of requests that can happen at one time. If the thread count passes this maximum value, all remaining requests will be deferred until there’s space available in the queue to process them. If these deferred requests take too long, they’ll generally time out.

It’s worth noting that increasing the max thread count generally relies on having the appropriate resources available for use.

User experience metrics

Now that we’ve covered the app performance metrics, let’s discuss a few that are user experience centered. These server performance metrics can measure your users’ overall satisfaction when using your web applications.

Metric 7: Uptime

Although not directly related to its performance, the uptime of the server is a critical metric. Uptime is the percentage that the server is available for use. Ideally, you’re aiming for 100% uptime, and you’ll see many cases of 99.9% uptime (or more) when looking at web hosting packages. It’s not uncommon for software projects to abide by a service level agreement that dictates a particular server uptime rate.

If uptime metrics checking isn’t something your server can provide built in, there are plenty of third-party services, such as, that can do it for you. These services can even give you a visual depiction of their report:

And here’s an interesting fact. Calculating the monthly allowed downtime shows

Metric 8: HTTP server error rate

The HTTP server error rate is a performance metric that doesn’t directly relate to application performance, but it’s a very critical one. It returns the count of internal server errors (or HTTP 5xx codes) being returned to clients. These errors are returned from malfunctioning applications when you have an exception or other error not being handled correctly.

A good practice is to set up an alert whenever these kinds of errors occur. Because 500 errors are almost completely preventable, you can be certain you have a robust application. Being notified of all HTTP server errors allows you to stay on top of any errors occurring. This prevents the issue of having errors build up in the application over time.

How to measure server performance

Measuring server performance with an Application Performance Monitoring (APM) tool like [Raygun APM]( is the easiest and most accurate way of measuring the health of your software. APM should be, giving your team greater context and diagnostic tools into your biggest application performance questions. Discover and pinpoint the root cause of performance issues with greater speed and accuracy than traditional APM solutions.

Keep your finger on the pulse

These are the server performance metrics I’ve personally found to be the most valuable. If you collect and monitor this kind data on both your users’ experience and your app performance, very little will fall between the cracks.

Did I mention any metrics that you’re not currently using? Consider trying them out. After all, metrics are your best way to keep your eye on your server performance—and, by extension, your application’s health.