Seven Stages of Performance Testing Denial

As you may know, many of the ancient religions have such doctrines as “the 5 pillars of wisdom” or the “4 noble truths” that lead humble pilgrims to true enlightenment. Although I am not suggesting we start a new Performance Test religion that perhaps worships the god “Mercury”, I have noticed that there are 7 levels of denial that developers/system architects/managers (aka pilgrims) seem to have to go through before they realise or admit that they have performance problems (i.e. true enlightenment).

Like following a religion, this is a personal journey and a unique path is followed by each pilgrim, with no two making the same realisations or decisions at the same point in the project lifecycle, some having to repeat parts of the journey several times (as I am writing this, Mike is explaining again to another set of pilgrims how LoadRunner works).
 
My experience suggests there are 7 levels of denial, as follows:

1) The load test tool must be wrong – you may be using the industry-standard performance test tool costing £100K, but the quick test the pilgrims did with the free tool downloaded from the internet was better. When you ask whether HTTP 500 status messages were trapped or whether the data returned was validated, the pilgrims look confused. So, take a deep breath, explain the benefits of a proper tool and move on.
 
2) The performance scripts or workload model must be wrong – you have only been a performance tester for 5 years and worked on countless projects, so it’s nice to be told you are stupid. Take a deep breath, walk them through the code and enjoy their look of surprise as they suddenly realise what correlation is.
 
3) The system is not finished so doesn’t need to be tested – pilgrims can believe that the last 5% of functionality will increase performance, so that a poorly performing system will obviously get better in a minute. Explain politely how adding new code won’t magically improve the performance of the old.
 
4) It’s not our system, it’s the network, etc. – blaming the supplier of the components is a common area of denial. To correct this misconception, feign a look of surprise and then arrange a test to show that the offending component runs super fast.

5) We JUST need to configure parameter X – the pilgrim often believes that the correct setting of a single magic parameter will solve any problem. What is annoying is the condescending tone in which the pilgrim often states that this is surely the problem and that you, the tester, must be a complete donkey for not setting it. Of course you smile politely, say “let’s give it a go” and, when nothing changes, are dutifully diplomatic. Often you will iterate on “number 5” as several pilgrims in the development team search for the “silver bullet” (or is it “rocking horse pooh”?). An obvious attraction of this approach to pilgrims is that the solution is “only a test cycle away”. Many a manager pilgrim has followed this route.
 
6) Throw hardware at the problem – although this does often have an effect, adding an extra processor to a DB server that is crippled by too many stored procedures doing table scans is as useful as a chocolate teapot. Take another breath, particularly when you are told the lead time to order the components and re-install the software. Just stay alert, because the pilgrim will be very happy believing they are solving the problem and can relax while they await the new hardware.
 
7) We JUST need to tune a small part of the system – there is often a hope that a single small stored procedure or code element, once tuned, will yield the magical performance improvement, or that all the performance problems can be found in one part of the architecture. At last the pilgrims are getting somewhere on their journey, allowing you to progress too – some measurements at least have to be taken to identify the bad boy item. You can smile now; the journey’s nearly over.

Journey’s End – Wow! we do have a performance problem – at last your bunch of pilgrims have made the journey to true enlightenment. Like any religious journey it is full of self-doubt and distractions along the way, but at last you can finally start to solve the problem. Just hope now that a new project manager doesn’t get involved and you have to start at step 1 again.

Some projects have fewer potential areas for deviating from the true path, and some have more, but each pilgrim has to find his own way. Your job is to guide and educate!

Small vs Large Scale Performance Test Environments

I have just added to the website a presentation that looks at sizing and extrapolation techniques for people considering building a small scale performance test environment instead of a large full scale performance test environment. In the presentation several approaches are considered.

Factoring – This is where the architecture is easily scaled and therefore the performance test can be undertaken on a subset of the hardware.

Dimensioning – The architecture has known bottlenecks that drive the performance such as a central DB. The performance test environment must contain the bottleneck component but other components may not need to be representative of a full sized environment.

Modelling – This examines the use of modelling to take results from a small scale environment and predict the results for a larger scale environment.

Flipping – This looks at creating a test environment that can have the correct amount of resources allocated to it for a “full scale” performance test, for example during off hours, and then revert to a smaller scale performance test environment at other times.

Full Scale – Finally the advantages and disadvantages of a full scale performance test environment are discussed.

Finally, the caveat for these techniques is that testing on a small scale performance test environment does not guarantee that all performance problems will be discovered, because some application/scalability constraints may only appear in the full sized environment!

You can download the presentation from here.

How do you know if your Load Test has a bottleneck?

The bottleneck in a system may not be obvious. (Life would be easier, but less fun, if they were always easy to find.) This is because there are two types: “hard” and “soft”. A hard bottleneck is one where a resource such as a CPU is working flat out, which limits the ability of the system to process more transactions. A soft bottleneck is some internal limit, such as a number of threads or connections, that once fully used limits the ability to process more transactions.

So how do you know if you have a bottleneck? If you are looking at the results from a single load test you may not be able to tell. You will need to run multiple load tests at different numbers of virtual users and then see whether the number of transactions per second increases with each increase in virtual users. The results can be seen in the two graphs below. The first shows how the throughput (transactions per second) increases and then levels off when the system is saturated, and the second shows the response time. You will probably have heard the expression “below the knee of the curve”; this is the region to the left of the bend in the response time graph.

Throughput Graph

Response Time Graph
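The comparison across several load tests described above can be sketched in a few lines of Python. This is only an illustration: the function name `find_saturation`, the `min_gain` threshold of 5% and the sample figures are assumptions of mine, not measurements from a real test.

```python
def find_saturation(results, min_gain=0.05):
    """Return the virtual-user level beyond which throughput stopped growing.

    results  -- list of (virtual_users, transactions_per_second), one per load test
    min_gain -- hypothetical threshold: a step that adds less than this fraction
                of extra throughput is treated as "levelled off"
    Returns None if throughput was still growing at every step.
    """
    ordered = sorted(results)
    for (prev_users, prev_tps), (users, tps) in zip(ordered, ordered[1:]):
        if prev_tps <= 0:
            continue  # cannot compute a relative gain from a zero baseline
        gain = (tps - prev_tps) / prev_tps
        if gain < min_gain:
            # Adding users beyond prev_users bought (almost) no extra throughput.
            return prev_users
    return None


# Illustrative figures only (not from a real test run).
sample_runs = [(10, 50.0), (20, 98.0), (40, 180.0), (80, 200.0), (160, 202.0)]
saturated_at = find_saturation(sample_runs)
```

With the sample figures, throughput barely moves between 80 and 160 virtual users, so the function reports 80 as the level at which the system saturated. It says nothing about whether the bottleneck is hard or soft; that still needs resource monitoring.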

The graphs above were actually generated using a spreadsheet model of the performance of a closed loop system. This is like LoadRunner and other testing tools, where there is a fixed number of users that use the system, then wait, and then return to the system. In reality the performance graphs may look different from the expected norm. An example from a LoadRunner test is shown below: the first graph shows how the number of VUsers was increased during the test and the second graph shows the increase in response times. In this case the jump in response time is dramatic. However, in some cases the increase in response time will be less dramatic, as the system will start to error at high loads, which will distort the response time figures.

Example LoadRunner VUser Graph

Example LoadRunner Graph Showing Increasing Response Times
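The spreadsheet model itself is not reproduced in this post. As a rough sketch of how a closed loop model produces the “levels off” throughput curve and the “knee” in response time, here is exact Mean Value Analysis for the simplest possible case: one queueing resource plus user think time. The function name and the parameter values (50 ms service time, 5 s think time) are my own assumptions for illustration.

```python
def closed_loop_model(max_users, service_time, think_time):
    """Exact Mean Value Analysis for one queueing resource plus think time.

    Returns a list of (users, throughput_tps, response_time_s) tuples,
    one for each population from 1 to max_users.
    """
    rows = []
    queue_length = 0.0  # mean requests at the resource with one fewer user
    for users in range(1, max_users + 1):
        # An arriving request waits for everything already queued, then itself.
        response_time = service_time * (1.0 + queue_length)
        # Each user cycles through the system once per (response + think) time.
        throughput = users / (response_time + think_time)
        # Little's Law gives the queue length seen at this population.
        queue_length = throughput * response_time
        rows.append((users, throughput, response_time))
    return rows


# Assumed values: 50 ms of service per transaction, 5 s of think time.
curve = closed_loop_model(max_users=200, service_time=0.05, think_time=5.0)
```

With these assumed values throughput can never exceed 1 / 0.05 = 20 transactions per second, and the knee sits at roughly (0.05 + 5.0) / 0.05 = 101 users. A real system has many resources and soft limits, so measured graphs will rarely be this clean.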

Having discovered that there is a bottleneck in the system, you then have to start looking for it.

Scalability

Scalability can be defined in many ways. However, in general it is the relationship between a change in input and the resulting change in output. Typically we may think of how throughput changes as we increase the number of CPUs. In a perfect world we would like to have linear scaling. I came across a good example of non-linear scaling in a presentation by Peter Hughes. You are having a dinner party and have 1 metre square tables, each of which seats 4 people. As it is a dinner party you want to have everybody facing each other as much as possible. So with one table you can seat 4 people.

[Figure: one table seating 4 guests]

To increase the number of guests you need 4 tables, but you can now only seat 8 people.

 

[Figure: four tables seating 8 guests]

To increase the number of guests again you need 9 tables, but you can now only seat 12 people.

  

[Figure: nine tables seating 12 guests]

 

If you plot the relationship between guests and tables on a graph, it looks like the one below.

[Figure: graph of guests seated against number of tables]
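The numbers in the dinner party example can be reproduced with a small calculation. The sketch below assumes the tables are pushed together into a solid square block and that one guest sits at each metre of the outer edge, which matches the 4, 8 and 12 guests quoted above; the function name is my own.

```python
def guests_for_square(side):
    """Tables used and guests seated for a side x side block of 1 m tables."""
    tables = side * side   # tables grow with the area of the block
    guests = 4 * side      # seats grow only with its perimeter
    return tables, guests


# Number of tables, guests seated, and guests per table for blocks 1..5 wide.
seating = [guests_for_square(side) for side in range(1, 6)]
guests_per_table = [guests / tables for tables, guests in seating]
```

Tables grow with the square of the side while seats grow only linearly, so the guests seated per table fall from 4 with one table to 2 with four tables and about 1.3 with nine. That falling ratio is the non-linear scaling in the example.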

 

 

 

Extrapolation of Load Test Results: is it worth it?

It is a common problem that performance testing is carried out on smaller scale test environments, but project managers want to know that the system will scale and that response times will not be degraded. Therefore, can the performance test results be extrapolated? My view on extrapolation is that it is a great technique when used properly, but it does not guarantee that the system you tested will work well on the full sized production environment. The two main reasons for failure are:

1) You have made a mistake in the creation of your model. The mistake could simply be a poorly built model or a bad assumption. However, with plenty of time and expertise you can overcome some of these limitations by building a good model.

2) There are “soft” bottlenecks in the system that are only detected at high load. A common example is a piece of software that is limited to a certain number of threads which, once all used, limit scalability. Some of these “soft” limits might be known by developers beforehand and can be investigated with the model and the test environment, but it is the unknown unknowns that will be the problem on go-live day.

However, this does not mean that extrapolation is bad or should be avoided. Whereas it cannot guarantee that the system will work in production, it can be used to show that the system will fail, and as we all know, avoiding a costly failure is often worth the effort. Using modelling techniques you can estimate the hardware configuration needed for the production system and compare it with what is expected to be deployed; if the deployed hardware is undersized, you have made a friend of the project manager.
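As a very simple illustration of that kind of sizing estimate, the sketch below uses the Utilisation Law (busy CPUs = throughput x CPU seconds per transaction). It is not the model from the presentation. The function names, the 70% utilisation ceiling and the sample figures are all assumptions of mine, and it assumes that the CPU cost per transaction stays constant as load grows, which is exactly the kind of assumption that can be wrong.

```python
import math


def cpu_demand(measured_utilisation, cpus, measured_tps):
    """CPU seconds consumed per transaction, from a small-scale test run."""
    return measured_utilisation * cpus / measured_tps


def cpus_needed(target_tps, cpu_seconds_per_txn, max_utilisation=0.7):
    """CPUs required to carry target_tps without exceeding max_utilisation."""
    busy_cpus = target_tps * cpu_seconds_per_txn  # Utilisation Law
    return math.ceil(busy_cpus / max_utilisation)


# Assumed test result: 4 CPUs at 60% busy while handling 120 transactions/s.
demand = cpu_demand(measured_utilisation=0.6, cpus=4, measured_tps=120.0)
# Assumed production target: 200 transactions/s.
required = cpus_needed(target_tps=200.0, cpu_seconds_per_txn=demand)
```

If the planned production server has fewer CPUs than `required`, the extrapolation has done its job by predicting a failure. If it has more, that is encouraging but not a guarantee, because soft bottlenecks do not appear in a calculation like this.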

Welcome to my Performance Engineering Blog

Hi, this is a blog that I have started for “fun” about my work as a performance engineer. For some, a performance engineer is a performance tester who can help fix performance problems. A wider definition of a performance engineer is one who can help achieve the performance goals of a project throughout its lifecycle, through development and into production. I suppose I like to feel I am more the latter type of performance engineer. I particularly like performance modelling and prediction. However, we must remember “performance prediction is easy, it’s getting it right that’s the hard part” (thanks to Dr Ed Upchurch for that quote).