Correlation in performance testing is used to account for dynamic values. Many web applications have dynamic data that changes every time a user runs the application. Web applications often need to track users’ interactions as they navigate through the website while preserving state between navigations. Session IDs, for example, are used by server engines such as ASP, ASP.NET, JSP and PHP to manage sessions. These session IDs typically change with each new session, so a value captured during recording is no longer valid when the script is replayed.
Users typically use recording mechanisms to create their scripts. In some cases, they will not be able to replay the recorded script without modifications. No matter how good or sophisticated the load testing tool is, it is impossible to create a script recorder that can accurately record and replay every scenario. This does not mean that you should avoid the recording capabilities of these load testing tools: recording is a great way to create a starting script, and it considerably reduces the amount of work needed to arrive at a working one.
Here is an example to illustrate correlation and dynamic data. Consider a web page that contains a single link.
The link (‘a’ tag) has an href value that is dynamic, i.e. it changes every time the page is loaded. The next time the page is loaded, the HTML contains a different href value.
If you record this page and replay the script without modifying it, the replay will fail. The href values of the link (‘a’ tag) are dynamic and need correlation to work correctly. The only way to make this script work is to edit it so that it dynamically extracts the href value (e.g. /784768 or /792187) from the response and uses that value to request the subsequent page.
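The extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular load testing tool implements correlation: the two HTML snippets, the helper names (FirstLinkExtractor, extract_dynamic_href, next_request_url) and the base URL are hypothetical, and only the href values /784768 and /792187 come from the example above.

```python
from html.parser import HTMLParser


class FirstLinkExtractor(HTMLParser):
    """Remembers the href of the first <a> tag found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def extract_dynamic_href(html):
    """Correlation step: pull the dynamic href out of the server response."""
    parser = FirstLinkExtractor()
    parser.feed(html)
    return parser.href


def next_request_url(base_url, html):
    """Build the follow-up request from the extracted value, not a recorded one."""
    href = extract_dynamic_href(html)
    if href is None:
        raise ValueError("no link found in response; correlation failed")
    return base_url + href


# Hypothetical responses from two separate loads of the same page.
first_load = '<html><body><a href="/784768">Continue</a></body></html>'
second_load = '<html><body><a href="/792187">Continue</a></body></html>'

# Hypothetical host, used only to show how the next request is formed.
print(next_request_url("http://example.test", first_load))
print(next_request_url("http://example.test", second_load))
```

A recorded script would hard-code the first value and request it again on every replay. The correlated version reads the value from each response, so it follows whatever link the server actually returned.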
Load Testing Tools
But hey, you might say, what about all these load testing tools that record actual browser interactions? Their users don’t need to worry about correlation and dynamically changing data, since they are replaying actual user interactions (in a manner similar to functional testing tools). The problem with these kinds of load testing tools is that they will have a tough time running more than around 10 or so virtual users on one machine. These tools instantiate actual running instances of a web browser. This means that they also load the huge footprint of a normal web browser to accomplish a task that typically requires only a small fraction of that functionality. It is thus easy to see why the machine will be completely maxed out once you run more than 10 or so virtual users. At that rate, a load of 1000 virtual users will require around 100 computers, which is excessive by any standard (ordinarily, you would be able to run 1000 virtual users on a single computer). Keep in mind that automated web load tools can typically generate loads of thousands of virtual users and sustain them for hours, if not days. Moreover, using techniques like sync points, which allow virtual users to perform the same task simultaneously, you can stress the web server to an even higher degree.
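To make the idea of a sync point concrete, here is a minimal sketch using a barrier from Python's standard threading module. It is an assumption about how such a mechanism could be modeled, not a description of any specific tool: the function run_with_sync_point and the placeholder action are hypothetical, and a real load testing tool would issue HTTP requests where the placeholder runs.

```python
import threading


def run_with_sync_point(num_users, action):
    """Start num_users virtual users and release them into action together."""
    barrier = threading.Barrier(num_users)  # the sync point
    results = [None] * num_users

    def virtual_user(index):
        # Per-user setup (login, navigation) would happen here, at each
        # user's own pace.
        barrier.wait()  # block until every virtual user has arrived
        # All users are released at the same moment and run the task together.
        results[index] = action(index)

    threads = [
        threading.Thread(target=virtual_user, args=(i,))
        for i in range(num_users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Placeholder task; a real script would send a request to the server here.
outcome = run_with_sync_point(5, lambda index: index * 2)
print(outcome)
```

Because every virtual user waits at the barrier, the requests that follow it reach the server in a burst rather than spread out over time, which is what raises the stress on the server.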
Dynamic data in load testing requires correlation to enable load testing scripts to run accurately. It is important to keep this in mind when you use recording to generate a script. If the script is unable to replay accurately, there is a good chance that this is because dynamic data is involved. The way to address this is to use correlation techniques.