Tuesday, December 2, 2008

Chapter 7: Application Benchmarking

Software Performance Testing Handbook

A Comprehensive Guide for Beginners



What is benchmarking?

In general, benchmarking is the process of determining who is the very best, who sets the standard, and what that standard is. In performance testing, the word benchmark comes up often. Here, benchmarking is the process of determining the relative performance of a software program by running a standard set of tests. We can benchmark hardware or software by comparing relative performance figures. For example, we can benchmark a software application across application servers by deploying it on WebSphere, WebLogic and JBoss and comparing how the application performs on each. We can benchmark the hardware requirement for the software by running the same tests on Dell, HP ProLiant and IBM servers and comparing the CPU, disk and memory utilization.
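The comparison described above can be sketched in a few lines of Python. This is only an illustration, not a real benchmarking tool: the names measure_throughput and relative_performance, the server labels and the throughput figures are all hypothetical. The idea is to run the same workload against each candidate, record a common measure (here, operations per second), and express each result relative to the best one.

```python
import time
from typing import Callable, Dict


def measure_throughput(workload: Callable[[], None], repetitions: int = 100) -> float:
    """Run the same workload `repetitions` times and return operations per second."""
    start = time.perf_counter()
    for _ in range(repetitions):
        workload()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return repetitions / elapsed if elapsed > 0 else float("inf")


def relative_performance(throughputs: Dict[str, float]) -> Dict[str, float]:
    """Express each candidate's throughput as a fraction of the best candidate's."""
    best = max(throughputs.values())
    return {name: value / best for name, value in throughputs.items()}


# Hypothetical figures (operations per second) from running one identical
# test suite against three candidate deployments.
measured = {"server_a": 400.0, "server_b": 500.0, "server_c": 250.0}
ranking = relative_performance(measured)

for name, score in sorted(ranking.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f} of the best result")
```

In practice the figures would come from calling measure_throughput (or a proper load testing tool) with an identical workload against each deployment, since the comparison is only meaningful when every candidate runs the same tests.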

Why should we benchmark the applications?

Unless we compare ourselves with others against a common measure that applies to both parties (height, for example), we cannot say who is best. Comparison becomes very difficult if we don’t know which standard measure to use. The same applies to software systems. One always needs to know where one’s competitors stand. To compare the performance of different software or hardware, we need common standard measurements.

Industry Standards for Benchmarking


Because of the cost involved, it is usually impractical for application owners to test application performance on the variety of server machines available in the market and then choose among them. There are organizations that do this benchmarking. Application owners can refer to these industry benchmarks when deciding on their infrastructure requirements. The Transaction Processing Performance Council (TPC), the Standard Performance Evaluation Corporation (SPEC), Synchromesh benchmarks, etc. are the industry standard benchmarks available. There are also other open source and vendor-specific benchmarks. These organizations run tests on different servers with varied hardware configurations and publish the performance figures. Since we need a common measure to compare ourselves with competitors, these industry standards provide a list of measures that is common across server platforms.


Transaction Processing Performance Council (TPC)

The TPC is a non-profit corporation founded in the 1980s to define transaction processing and database benchmarks and to disseminate objective, verifiable TPC performance data to the industry. The TPC benchmarks are widely used today in evaluating the performance of computer systems.

The TPC benchmarks involve the measurement and evaluation of computer functions and operations through transactions, as the term is commonly understood in the business world. A typical transaction, as defined by the TPC, would include an update to a database system and a set of operations such as disk reads and writes, operating system calls, or some form of data transfer from one subsystem to another. The TPC offers several types of benchmarks, including TPC-App, TPC-C, TPC-E and TPC-H. This information is taken from the TPC site - http://www.tpc.org/.

Standard Performance Evaluation Corporation (SPEC)

SPEC is a non-profit organization that aims to produce fair, impartial and meaningful benchmarks for computers. SPEC was founded in 1988, and its goal is to ensure that the marketplace has a fair and useful set of metrics to differentiate candidate systems. Its member organizations include leading computer and software manufacturers. SPEC benchmarks are widely used today in evaluating the performance of computer systems; the results are published on the SPEC web site - http://www.spec.org/ .

Mistakes in Benchmarking

One needs to understand that the benchmarks published by organizations like TPC and SPEC are based on the workload created by a typical application chosen for the performance tests. If application owners develop an application whose workload is very different from the typical workload used for benchmarking, then it is not an apples-to-apples comparison. Often, though, there is little choice, because it is impractical to test our product across every candidate configuration to find the best one. Hence we can refer to the benchmarks and add an appropriate risk buffer if the workload of the application under test is completely different from the one used for benchmarking.
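As a rough illustration of adding a risk buffer, the sketch below derates a published benchmark figure by a chosen fraction before sizing the infrastructure. The helper names derate and servers_needed, the choice of a simple percentage reduction, and all the numbers are assumptions made for this example; the right buffer depends on how far the real workload is from the benchmark workload.

```python
import math


def derate(published_figure: float, risk_buffer: float) -> float:
    """Reduce a published benchmark figure by a risk buffer given as a fraction (0.25 = 25%)."""
    if not 0.0 <= risk_buffer < 1.0:
        raise ValueError("risk_buffer must be in the range [0, 1)")
    return published_figure * (1.0 - risk_buffer)


def servers_needed(required_throughput: float, published_figure: float, risk_buffer: float) -> int:
    """Estimate how many servers are needed once the published figure is derated."""
    usable = derate(published_figure, risk_buffer)
    return math.ceil(required_throughput / usable)


# Hypothetical numbers: a server is published at 1000 transactions per second,
# we assume a 25% risk buffer, and our application must sustain 2000 per second.
usable_per_server = derate(1000.0, 0.25)
server_count = servers_needed(2000.0, 1000.0, 0.25)

print(f"Usable throughput per server: {usable_per_server:.0f}")
print(f"Servers needed: {server_count}")
```

Without the buffer, two servers would appear to be enough; with a 25% buffer the estimate rises to three, which is the kind of margin the risk buffer is meant to provide.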

1 comment:

Anonymous said...

Hi Ramya

Really good book for a beginner, but certain information is inaccurate. For example, on page 91: Pages per second (Memory: Pages/sec) – A consistent value of more than 5 indicates the memory problem.

Pages/sec at 5 is not an issue, and page-in, page-out and swap-in should be checked before we conclude there is any memory issue.

Anyhow, the book is good and informative.

Regards
Aravind.S
Sr.Software Engineer