So you are on a performance test engagement and your boss asks how many people are concurrently executing certain transactions, like buying a book or doing a search. What he wants is a measure of active concurrency: how many people are doing a certain transaction. This should not be confused with passive concurrency, such as how many people are logged in. Before we go any further, let’s clarify that in this example a transaction is a request to the test system and the response back; it does not include any think time. Now, before you start getting out the virtual terminal server and incrementing counters at the start of the transaction and decrementing them at the end, there is an easier way.
You can work this all out from your performance test results, without the need for code, using a mathematical formula (it’s very simple, so don’t panic) called Little’s Law. Little’s Law is named after John Little, who published a proof of it in 1961.
Little’s Law allows us to relate the mean number of items in the system (in our case, concurrent users) to the arrival rate and the mean time in the system (response time) as follows:
Number of Items in the system = Arrival Rate x Response Time
There is one rule to remember before you use Little’s Law: you must make sure the system is balanced. That is, the arrival rate into the system is the same as the exit rate.
I will begin with a non-computer example. The “Black Horse Pub” has a mean arrival rate of 5 customers per hour, and customers stay for half an hour on average. Using Little’s Law we can calculate the mean number of customers in the pub as Arrival Rate x Response Time = 5 x 0.5 = 2.5 customers.
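The pub arithmetic is simple enough for the back of a beer mat, but here it is as a minimal Python sketch. The function name `items_in_system` is just an illustrative choice; the only thing to watch is that both inputs use the same time unit (hours here).

```python
def items_in_system(arrival_rate, time_in_system):
    """Little's Law: mean items in system = arrival rate x mean time in system.

    Both arguments must be expressed in the same time unit.
    """
    return arrival_rate * time_in_system


# Black Horse Pub: 5 customers per hour, each staying 0.5 hours on average.
customers_in_pub = items_in_system(arrival_rate=5, time_in_system=0.5)
print(customers_in_pub)  # 2.5
```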
To apply Little’s Law to a performance test we must first make sure that we are taking measurements from a period when the system under test is balanced. Remember, in a balanced system the rate of work entering the system matches the rate of work leaving it. For a typical load testing tool this is after the ramp-up period, when the number of virtual users remains constant, response times have stabilised and the transactions per second graph is level. To capture this period in LoadRunner, for example, you would need to select the time period in the Summary report filter or under Tools -> Options.
So record the average response time for the transaction of interest and the number of times per second the transaction is executed.
So from the example above the response time is 43.613 seconds. The arrival rate is the number of transactions executed divided by the duration. The duration for this example was a 10 minute period, as can be confirmed by the LoadRunner summary below.
This gives you an arrival rate of 2.005 transactions per second, calculated by dividing the count (1203) by the duration (600 seconds).
So the concurrent number of users waiting for a search to return is 2.005 x 43.613 = 87.44.
There you go: from your performance test results you can easily calculate the concurrency for a particular transaction.
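If you want to repeat this for several transactions, the same steps can be scripted. This is only a sketch using the figures quoted above (1203 transactions, 600 seconds, 43.613 seconds mean response time); the function name `concurrency_from_results` is made up for illustration and is not part of LoadRunner.

```python
def concurrency_from_results(transaction_count, duration_seconds, mean_response_seconds):
    """Return (arrival rate per second, mean concurrency) for one transaction.

    Assumes the figures come from a balanced (steady-state) period of the test.
    """
    arrival_rate = transaction_count / duration_seconds
    concurrency = arrival_rate * mean_response_seconds  # Little's Law
    return arrival_rate, concurrency


arrival_rate, concurrent_users = concurrency_from_results(1203, 600, 43.613)
print(f"{arrival_rate:.3f} transactions/s, {concurrent_users:.2f} concurrent users")
# 2.005 transactions/s, 87.44 concurrent users
```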
The bottleneck in a system may not be obvious. (Life would be easier but less fun if they were always easy to find.) This is because there are two types: “hard” and “soft”. Hard bottlenecks are the ones where a resource such as a CPU is working flat out, which limits the ability of the system to process more transactions. A soft bottleneck is some internal limit, such as the number of threads or connections, that once fully used limits the ability to process more transactions. So how do you know if you have a bottleneck? If you are looking at the results from a single load test you may not know. You will need to run multiple load tests at different numbers of virtual users and then see whether your number of transactions per second increases with each increase in virtual users. The results can be seen in the two graphs below. The first shows how the throughput (transactions per second) increases and levels off when saturated, and the second shows the response time. You will probably have heard the expression “below the knee of the curve”; this is the point to the left of the bend in the response time graph.
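As a rough illustration of comparing several load tests, the sketch below looks for the first load level after which throughput stops growing. Everything in it is an assumption made for the example: the function name, the 5% gain threshold, and the sample figures, which are invented and not taken from any real test.

```python
def find_saturation(results, min_gain=0.05):
    """Return the virtual-user level at which throughput stops growing.

    results: list of (virtual_users, transactions_per_second), sorted by users.
    min_gain: smallest relative throughput increase still counted as growth
              (an arbitrary threshold chosen for this sketch).
    Returns None if throughput grew at every step.
    """
    for (users, tps), (_, next_tps) in zip(results, results[1:]):
        if (next_tps - tps) / tps < min_gain:
            return users
    return None


# Invented figures for illustration only.
sample = [(10, 20.0), (20, 39.0), (40, 75.0), (80, 98.0), (160, 100.0)]
print(find_saturation(sample))  # 80
```

A result like this only tells you that a bottleneck exists somewhere at or before that load; it does not say whether it is a hard or a soft one.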
The graphs above were actually generated using a spreadsheet model for the performance of a closed loop model. This is like LoadRunner and other testing tools, where there are a fixed number of users that use the system, then wait and return to the system. The reality is that the performance graphs may look different from the expected norm. An example is shown below from a LoadRunner test: the first graph shows how the number of VUsers was increased during the test and the second graph shows the increase in response times. In this case the jump in response time is dramatic. However, in some cases the increase in response time will be less dramatic, as the system will start to error at high loads, which will distort the response time figures.
Having discovered there is a bottleneck in the system, you then have to start looking for it.
Scalability can be defined in many ways. However, in general it is the relationship of how an output increases with a change in input. Typically we may think of how throughput changes as we increase the number of CPUs. In a perfect world we would like to have linear scaling. I came across a good example of non-linear scaling in a presentation by Peter Hughes. You are having a dinner party and have 1 meter square tables, each of which seats 4 people. As it is a dinner party you want to have everybody facing each other as much as possible. So with one table you can seat 4 people.
To increase the number of guests you need 4 tables, but you can now only seat 8 people.
To increase the number of guests again you need 9 tables, but you can now only seat 12 people.
If you plot the relationship between guests and tables on a graph it looks like the one below.
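The numbers in the dinner party example fit a simple rule if you assume the tables are pushed together into one large square and guests sit only around the outside edge, one per meter. That arrangement is my reading of the example rather than something stated in the presentation. A small sketch:

```python
import math


def seats_for_tables(tables):
    """Seats around a square block of 1 meter tables, one guest per meter of edge.

    Assumes the tables form an n x n square, so `tables` must be a perfect square.
    """
    side = math.isqrt(tables)
    if side * side != tables:
        raise ValueError("tables must be a perfect square (1, 4, 9, ...)")
    return 4 * side


for tables in (1, 4, 9):
    seats = seats_for_tables(tables)
    print(f"{tables} tables -> {seats} seats ({seats / tables:.2f} per table)")
```

Seats grow with the square root of the number of tables, so each extra table buys you less than the one before.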
It is a common problem that performance testing is carried out on smaller scale test environments, but project managers want to know that the system will scale and response times will not be degraded. So can the performance test results be extrapolated? My view on extrapolation is that it is a great technique when used properly, but it does not guarantee that the system you tested will work well on the full sized production environment. The two main reasons for failure are:
1) You have made a mistake in the creation of your model. These mistakes could be simply a poorly built model or a bad assumption. However, with plenty of time and expertise you can overcome some of these limitations by building a good model.
2) There are “soft” bottlenecks in the system that are only detected at high load. A common example is a piece of software that is limited to a certain number of threads which, once all used, limit scalability. Some of these “soft” limits might be known by developers beforehand and can be investigated with the model and the test environment, but it is the unknown unknowns that will be the problem on go-live day.
However, this does not mean that extrapolation is bad or should be avoided. Whereas it cannot guarantee that the system will work in production, it can be used to show that the system will fail, and as we all know, avoiding a costly failure is often worth the effort. Using modelling techniques you can estimate the hardware configuration needed for the production system, which can be compared to what is expected to be deployed, and if the deployed hardware is undersized you have made a friend of the project manager.
Hi, this is a blog that I have started for “fun” about my work as a performance engineer. For some, a performance engineer is a performance tester who can help fix performance problems. A wider definition of a performance engineer is one who can help achieve the performance goals of a project throughout its lifecycle, through development and into production. I suppose I like to feel I am more the latter type of performance engineer. I particularly like performance modelling and prediction. However, we must remember “performance prediction is easy, it’s getting it right that’s the hard part” (thanks to Dr Ed Upchurch for that quote).