SrvLB-L4-4

Here is SrvLB-L4-4 at a glance.

Workload Name: SrvLB-L4-4
Polygraph Version: 2.6
Configuration: workloads/srvlb-l4-4.pg (under development)
Parameters: peak request rate
How-Tos: Available
Results: TBA
Synopsis: workload for testing layer 4 server load balancers.

Table of Contents

1. Background
2. Feature overview
3. Details
    3.1 Phase schedule
    3.2 Servers configuration
    3.3 Robots configuration
    3.4 Content types
    3.5 WAN latency and packet loss
4. Parameters
    4.1 Peak Request Rate
    4.2 Bench IP Addresses
    4.3 Bench Address Masks

1. Background

The SrvLB-L4-4 workload is designed to test devices such as Layer 4 server load-balancing (SLB) switches. Many known load-balancer benchmarks are designed to test the performance of a production SLB solution involving 'real' servers. While such a test is of great value in determining the fitness of a specific setup (including the actual servers' abilities), it is of little use when comparing different SLB solutions, since the results include the unknown factor of the 'real' servers' performance.

SrvLB-L4 is our attempt to provide the load-balancing community with a quality, high-performance benchmark capable of comparing different load-balancing solutions effectively, as well as analyzing switch performance under conditions that are difficult to test in a lab with real servers.

A predecessor to this workload is WebAxe-1. The WebAxe workload was designed to test reverse-proxy (a.k.a. server accelerator) devices. The SrvLB-L4-4 workload tests load-balancing devices rather than accelerators.

2. Feature overview

The SrvLB-L4 environment models many key Web traffic characteristics, including the following.

3. Details

3.1 Phase schedule

The workload schedule consists of 3 phases with time-based goals. The total test duration (based on those goals) is about 2.5 hours.

Phase   Factors (%)                                   Other
        Populus       Recurrence    Special Msgs
        beg    end    beg    end    beg    end
ramp      1    100    100    100    100    100
plat    100    100    100    100    100    100
exit    100      1    100    100    100    100

Phase "ramp" lasts for 30 minutes. During this phase, the robot population size increases from 1.00% to 100.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "plat" lasts for 1.5 hours. During this phase, the robot population size remains stable at 100.00% of its peak level. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "exit" lasts for 30 minutes. During this phase, the robot population size decreases from 100.00% to 1.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.
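As a quick check of the schedule above, the following Python sketch sums the phase durations and estimates the robot population at a given moment. The linear change of the populus factor within a phase is an assumption made for illustration, and the names PHASES, total_duration_hours, and populus_percent are hypothetical; none of them is part of Polygraph.

```python
# Phase name, duration in minutes, and populus factor (%) at the
# beginning and end of the phase, copied from the schedule above.
PHASES = [
    ("ramp", 30, 1.0, 100.0),
    ("plat", 90, 100.0, 100.0),
    ("exit", 30, 100.0, 1.0),
]


def total_duration_hours(phases):
    """Sum the time-based phase goals and convert minutes to hours."""
    return sum(minutes for _, minutes, _, _ in phases) / 60


def populus_percent(phases, elapsed_min):
    """Estimate the populus factor at elapsed_min minutes into the test.

    Assumption: the factor changes linearly between the beginning and
    ending values of each phase.
    """
    start = 0
    for _, minutes, beg, end in phases:
        if elapsed_min <= start + minutes:
            fraction = (elapsed_min - start) / minutes
            return beg + (end - beg) * fraction
        start += minutes
    raise ValueError("elapsed time is past the end of the schedule")


print(total_duration_hours(PHASES))  # 2.5
print(populus_percent(PHASES, 15))   # 50.5 (halfway through "ramp")
```

Under the linearity assumption, half of the ramp phase corresponds to roughly half of the peak robot population.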

3.2 Servers configuration

The workload defines 1 server type.

Server "SrvLb-L4-4-Srv" hosts the following 4 content types: "image" (65.00% of all hosted content), "HTML" (15.00%), "download" (0.50%), and "other" (19.50%). The following 3 content types can be accessed directly: "HTML", "download", and "other". The server "think time" distribution is set to norm(150msec, 50msec). This server uses persistent connections. The number of transactions per connection is distributed as zipf(64). Idle persistent connections are closed after a 15-second timeout. Only basic reply types are used.

3.3 Robots configuration

The workload defines 1 robot type.

Robot "SrvLb-L4-4-Clt" is a "constant request rate" robot with a request rate of 0.70 requests per second. About 95.00% of requests refer to URLs in a globally shared, public working set. This robot revisits 95.00% of previously requested URLs (offering a hit when a URL is cachable). About 100.00% of embedded objects will be loaded.

This robot is not allowed to open more than 4 connections at any given time, even if that limit causes a request rate decrease or memory exhaustion. Moreover, the queue of waiting transactions can grow without bound. The robot's private cache is limited to 1000 entries. This robot uses persistent connections. The number of transactions per connection is distributed as zipf(16). Idle persistent connections are never closed by this robot. The following 3 request types are used: "IMS" (10.00% of all possible request types), "Reload" (5.00%), and "Basic" (85.00%).
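Because each robot offers a constant 0.70 requests per second and may hold at most 4 connections, the robot population and the upper bound on concurrent client connections follow from the peak request rate. The Python sketch below illustrates that arithmetic. It assumes the peak rate is simply the sum of per-robot rates; the function names are hypothetical and not part of Polygraph.

```python
import math
from fractions import Fraction

# Per-robot settings from the robot description above.
ROBOT_REQ_RATE = Fraction("0.70")  # requests per second per robot
ROBOT_CONN_LIMIT = 4               # maximum open connections per robot


def robots_needed(peak_req_rate):
    """Smallest whole number of robots whose combined rate reaches the peak.

    Fractions avoid floating-point rounding (700 / 0.7 is not exactly
    1000 in binary floating point).
    """
    return math.ceil(Fraction(peak_req_rate) / ROBOT_REQ_RATE)


def max_open_connections(robots):
    """Upper bound on simultaneously open client-side connections."""
    return robots * ROBOT_CONN_LIMIT


robots = robots_needed(700)
print(robots)                        # 1000
print(max_open_connections(robots))  # 4000
```

The connection figure is only a ceiling: it says how many connections the device under test may see at once, not how many it will see.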

"SrvLb-L4-4-Clt" robots direct 10.00% of all requests to 1.00% of the working set, using popUnif() popularity distribution.

3.4 Content types

The workload uses 4 unique content types.

Type      Reply Size                  Cachability  Extensions
image     exp(4.500KB)                80.00%       .gif, .jpeg, and .png
HTML      exp(8.500KB)                90.00%       .html and .htm
download  logn(300.000KB, 300.000KB)  95.00%       .exe, .zip, and .gz
other     logn(25.000KB, 10.000KB)    72.00%

The size distribution for the "image" content type is exp(4.500KB). About 80.00% of "image" objects are cachable. This content type does not contain other types. The following 3 extensions may appear at the end of URLs: ".gif", ".jpeg", and ".png".

The size distribution for the "HTML" content type is exp(8.500KB). About 90.00% of "HTML" objects are cachable. This content type is a container. Objects may contain (embed) the following content type: "image" (100.00% of all embedded content). The number of embedded objects per container is distributed as zipf(13). The following 2 extensions may appear at the end of URLs: ".html" and ".htm".

The size distribution for the "download" content type is logn(300.000KB, 300.000KB). About 95.00% of "download" objects are cachable. This content type does not contain other types. The following 3 extensions may appear at the end of URLs: ".exe", ".zip", and ".gz".

The size distribution for the "other" content type is logn(25.000KB, 10.000KB). About 72.00% of "other" objects are cachable. This content type does not contain other types.
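To get a feel for the content mix, the Python sketch below estimates the mean size of a hosted object by weighting each type's mean reply size by its share of hosted content. It assumes that the parameter of exp() and the first parameter of logn() are the distribution means. The weights are the hosted-content percentages from the server description, not the mix of requests seen on the wire. The names used are hypothetical.

```python
# Content type -> (share of hosted content in %, assumed mean size in KB).
CONTENT_TYPES = {
    "image": (65.0, 4.5),
    "HTML": (15.0, 8.5),
    "download": (0.5, 300.0),
    "other": (19.5, 25.0),
}


def mean_hosted_object_size_kb(types):
    """Share-weighted mean of the per-type mean reply sizes."""
    total_share = sum(share for share, _ in types.values())
    weighted = sum(share * size for share, size in types.values())
    return weighted / total_share


print(round(mean_hosted_object_size_kb(CONTENT_TYPES), 3))  # 10.575
```

Under these assumptions the mean is about 10.6 KB, and the rare "download" type contributes about 1.5 KB of it despite accounting for only 0.50% of hosted content.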

3.5 WAN latency and packet loss

For SrvLB-L4 tests, Polygraph client machines are configured to use FreeBSD's DummyNet feature.

We configure Polygraph clients with 100 millisecond delays (per packet, incoming and outgoing) and a 0.1% probability of dropping a packet. Server think times are normally distributed with a 150msec mean and a 50msec standard deviation. Note that the server think time does not depend on the object ID (oid). Instead, it is chosen randomly for every request.

We do not add packet delays or packet loss on Polygraph servers.
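The delay settings above put a floor under typical response times. The Python sketch below gives a rough estimate, assuming one round trip for the request and reply, one extra round trip for the TCP handshake on a new connection, and the mean think time. It ignores transmission time, packet loss, retransmissions, and any delay added by the device under test. The function name is hypothetical.

```python
ONE_WAY_DELAY_MS = 100    # client-side delay per packet, each direction
THINK_TIME_MEAN_MS = 150  # mean of the server think time distribution


def typical_response_time_ms(new_connection):
    """Rough response time estimate for a small reply, in milliseconds."""
    round_trip = 2 * ONE_WAY_DELAY_MS
    # Assumption: a new connection costs one extra round trip (SYN/SYN-ACK)
    # before the request can be sent.
    handshake = round_trip if new_connection else 0
    return handshake + round_trip + THINK_TIME_MEAN_MS


print(typical_response_time_ms(new_connection=False))  # 350
print(typical_response_time_ms(new_connection=True))   # 550
```

Measured response times well below these figures would suggest that the delays are not being applied as configured.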

4. Parameters

Here is an explanation of some of the workload parameters.

4.1 Peak Request Rate

This parameter specifies the request rate for the plateau ("plat") phase of the test.

4.2 Bench IP Addresses

The primary IP addresses of physical hosts running Polygraph are set in the workload file. The number of machines determines the maximum request rate that a test can generate.

4.3 Bench Address Masks

The address mask settings affect the aliases created by Polygraph. Polygraph robots bind to the created aliases at the beginning of the test. Default address masks are just fine unless they conflict with other bench IP addresses.

Here is a sample workload configuration.

Bench TheBench = {
    client_side = {
        addr_mask = 'lo0::10.44.0.0';  // aliases use 10.X/16 network
        hosts = [ '172.16.44.61-70' ]; // real IPs use 172.16/16
    };
    server_side = {
        addr_mask = '172.16.44.0:80';
        hosts = ['172.16.44.191-200'];
    };
    // maximum rate given the number of hosts above
    peak_req_rate =
        client_side.max_host_load * count(client_side.hosts);
};
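
The peak_req_rate expression in the sample configuration can be mirrored in Python to see what rate a given bench yields. The sketch below expands a host range written in the style used above and multiplies the host count by a per-host load. The value of MAX_HOST_LOAD is purely hypothetical (the sample does not state it), the helper handles only ranges in the last octet, and it is not a reimplementation of Polygraph's address parsing.

```python
def expand_last_octet_range(spec):
    """Expand 'a.b.c.lo-hi' (or a single 'a.b.c.d') into a list of addresses."""
    prefix, _, last = spec.rpartition(".")
    lo, _, hi = last.partition("-")
    hi = hi or lo  # a single address has no '-hi' part
    return [f"{prefix}.{n}" for n in range(int(lo), int(hi) + 1)]


# Hypothetical per-host capacity in requests per second.
MAX_HOST_LOAD = 500

client_hosts = expand_last_octet_range("172.16.44.61-70")
peak_req_rate = MAX_HOST_LOAD * len(client_hosts)

print(len(client_hosts))  # 10
print(peak_req_rate)      # 5000
```

With ten client hosts, the peak request rate scales linearly with whatever per-host load the bench can sustain.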