WebAxe-4

Here is WebAxe-4 at a glance.

Workload Name: WebAxe-4

Polygraph Version: 2.7

Configuration: workloads/webaxe-4.pg

Parameters: peak request rate, fill rate, cache size, working set size

How-Tos: available

Results: available

Synopsis: workload for testing surrogates (aka Web accelerators or reverse proxies), fourth generation.

1. Background
2. Feature overview
3. Details
    3.1 Phase schedule
    3.2 Servers configuration
    3.3 Robots configuration
    3.4 Content types
    3.5 WAN latency and packet loss
4. Parameters
    4.1 Peak request rate
    4.2 Fill request rate
    4.3 Proxy cache size
    4.4 Working set size
5. Addresses
    5.1 Robot addresses
    5.2 Server addresses
    5.3 Proxy address

1. Background

WebAxe-4 is designed for the fourth TMF cache-off, where it will be offered along with PolyMix-4 tests. Earlier generations of WebAxe workloads are usable but unpolished because of a lack of interest.

2. Feature overview

The WebAxe environment models many key Web traffic characteristics, including the following.

a mixture of content types

varying offered load, depending on the test phase

a working set of URLs that changes its content with time but can preserve its size

all distributed clients can share information about the global URL set

hot subsets simulating flash crowds

virtually infinite number of different objects that are added to the working set as needed

object life-cycles (expiration and last-modification times)

persistent connections

network packet loss

reply sizes

server-side latencies

a mixture of cache hits and cache misses

a mixture of cachable and uncachable responses

object popularity (recurrence)

request rates and interarrival times

embedded objects and browser behavior

a mixture of request types and methods (IMS, reloads, HEAD, POST, etc.)

3. Details

This section describes individual components of the WebAxe-4 workload and is mostly auto-generated from PGL configuration files. The configuration files should be consulted whenever a conflict in documentation is suspected.

3.1 Phase schedule

The workload schedule consists of 10 phases: 9 phases with time-based goals and 1 phase ("fill") with a different goal. The total test duration (based on the time-based goals) is about 10.33 hours.

Phase        Populus (%)     Recurrence (%)   Special Msgs (%)   Other
             beg      end      beg      end      beg      end
framp       0.04    50.00     9.09     9.09    10.00    10.00
fill       50.00    50.00     9.09     9.09    10.00    10.00   wait for WSS to freeze
fexit      50.00     0.04     9.09   100.00    10.00   100.00
inc1        0.04   100.00   100.00   100.00   100.00   100.00
top1      100.00   100.00   100.00   100.00   100.00   100.00
dec1      100.00    10.00   100.00   100.00   100.00   100.00
idle       10.00    10.00   100.00   100.00   100.00   100.00
inc2       10.00   100.00   100.00   100.00   100.00   100.00
top2      100.00   100.00   100.00   100.00   100.00   100.00
dec2      100.00     0.04   100.00   100.00   100.00   100.00

Phase "framp" lasts for 20.00min. During this phase, the robot population size increases from 0.04% to 50.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 9.09% of robot recurrence ratios. The portion of special messages remains stable at 10.00%.

Phase "fill" does not have a time-based duration configured. During this phase, the robot population size remains stable at 50.00% of its peak level. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 9.09% of robot recurrence ratios. The portion of special messages remains stable at 10.00%. 1 sample of per-transaction statistics is collected. The phase will continue until the working set size is frozen.

Phase "fexit" lasts for 20.00min. During this phase, the robot population size decreases from 50.00% to 0.04%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level increases from 9.09% to 100.00% of robot recurrence ratios. The portion of special messages increases from 10.00% to 100.00%.

Phase "inc1" lasts for 20.00min. During this phase, the robot population size increases from 0.04% to 100.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "top1" lasts for 4.00 hours. During this phase, the robot population size remains stable at 100.00% of its peak level. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%. 1 sample of per-transaction statistics is collected.

Phase "dec1" lasts for 20.00min. During this phase, the robot population size decreases from 100.00% to 10.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "idle" lasts for 20.00min. During this phase, the robot population size remains stable at 10.00% of its peak level. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "inc2" lasts for 20.00min. During this phase, the robot population size increases from 10.00% to 100.00%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.

Phase "top2" lasts for 4.00 hours. During this phase, the robot population size remains stable at 100.00% of its peak level. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%. 1 sample of per-transaction statistics is collected.

Phase "dec2" lasts for 20.00min. During this phase, the robot population size decreases from 100.00% to 0.04%. The offered per-robot load remains stable at 100.00% of its peak level. The recurrence level remains stable at 100.00% of robot recurrence ratios. The portion of special messages remains stable at 100.00%.
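The total duration quoted at the start of this section can be checked by summing the time-based goals. The following is an illustrative Python sketch, not part of the workload files; the PHASES list simply restates the durations listed above, with None marking the "fill" phase, which has no time-based goal.

```python
# Phase durations in minutes, copied from the phase descriptions above.
# None marks a phase without a time-based goal ("fill").
PHASES = [
    ("framp", 20), ("fill", None), ("fexit", 20), ("inc1", 20),
    ("top1", 240), ("dec1", 20), ("idle", 20), ("inc2", 20),
    ("top2", 240), ("dec2", 20),
]

# Keep only the phases that have a time-based goal.
timed = [minutes for _, minutes in PHASES if minutes is not None]

total_minutes = sum(timed)
total_hours = total_minutes / 60

print(f"{len(timed)} timed phases, {total_minutes} min, {total_hours:.2f} hours")
```

The actual test runs longer than this sum because the "fill" phase lasts until the working set size is frozen.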

3.2 Servers configuration

The workload defines 1 server type.

Server "PolyMix-4-srv" hosts the following 4 content types: "image" (65.00% of all hosted content), "HTML" (15.00%), "download" (0.50%), and "other" (19.50%). The following 3 content types can be accessed directly: "HTML", "download", and "other". Server "think time" distribution is set to norm(2.50sec, 1.00sec). This server uses persistent connections. The number of transactions per connection is distributed as zipf(16). Idle persistent connections are closed after a 15.00sec timeout. Only basic reply types are used.

3.3 Robots configuration

The workload defines 1 robot type.

Robot "PolyMix-4-rbt" is a "constant request rate" robot with request rate of 0.40 requests per second. About 50.00% of requests refer to URLs in a globally shared, public working set. This robot revisits 91.67% of previously requested URLs (offering a hit when a URL is cachable). About 100.00% of embedded objects will be loaded.

This robot is not allowed to open more than 4 connections at any given time, even if that limit causes a decrease in request rate or memory exhaustion. Moreover, the queue of waiting transactions can grow without bounds. The robot's private cache is limited to 1000 entries. This robot uses persistent connections. The number of transactions per connection is distributed as zipf(64). Idle persistent connections are never closed by this robot. The following 3 request types are used: "IMS" (20.00% of all possible request types), "Reload" (5.00%), and "Basic" (75.00%).

"PolyMix-4-rbt" robots direct 10.00% of all requests to 1.00% of the working set, using popUnif() popularity distribution.
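To get a feel for how concentrated the hot subset is, the sketch below compares per-URL request intensity inside and outside the hot set. It is illustrative Python that takes the sentence above at face value (10% of requests spread evenly over 1% of the working set, the rest spread evenly over the remaining 99%); the even spread over the cold part and the function name are assumptions made for illustration only.

```python
from fractions import Fraction

HOT_REQUEST_SHARE = Fraction(10, 100)  # share of requests sent to the hot set
HOT_SET_SHARE = Fraction(1, 100)       # share of the working set that is hot


def relative_intensity(request_share, set_share):
    """Requests per URL, relative to an even spread over the whole working set."""
    return request_share / set_share


hot = relative_intensity(HOT_REQUEST_SHARE, HOT_SET_SHARE)
cold = relative_intensity(1 - HOT_REQUEST_SHARE, 1 - HOT_SET_SHARE)

print(hot, cold, hot / cold)
```

Under these assumptions, a URL in the hot set is requested 11 times as often as a URL outside it.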

3.4 Content types

The workload uses 4 unique content types.

Type       Reply Size                    Cachability   Extensions
image      exp(4.500KB)                  80.00%        .gif, .jpeg, and .png
HTML       exp(8.500KB)                  90.00%        .html and .htm
download   logn(300.000KB, 300.000KB)    95.00%        .exe, .zip, and .gz
other      logn(25.000KB, 10.000KB)      72.00%        (none)

The size distribution for "image" content type is exp(4.500KB). About 80.00% of "image" objects are cachable. This content type does not contain other types. The following 3 extensions may appear at the end of URLs: ".gif", ".jpeg", and ".png".

The size distribution for "HTML" content type is exp(8.500KB). About 90.00% of "HTML" objects are cachable. This content type is a container. Objects may contain (embed) the following 1 content type: "image" (100.00% of all embedded content). The number of embedded objects per container is distributed as zipf(13). The following 2 extensions may appear at the end of URLs: ".html" and ".htm".

The size distribution for "download" content type is logn(300.000KB, 300.000KB). About 95.00% of "download" objects are cachable. This content type does not contain other types. The following 3 extensions may appear at the end of URLs: ".exe", ".zip", and ".gz".

The size distribution for "other" content type is logn(25.000KB, 10.000KB). About 72.00% of "other" objects are cachable. This content type does not contain other types.
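The per-type figures can be combined with the hosted-content shares from section 3.2 to estimate averages over hosted content. The Python sketch below is illustrative only. It assumes that the argument of exp() and the first argument of logn() are distribution means, and it weights by the share of hosted content, which is not necessarily the share of requests (images, for example, are only reached as embedded objects).

```python
# name: (share of hosted content, assumed mean reply size in KB, cachable share)
CONTENT_TYPES = {
    "image":    (0.650, 4.5,   0.80),
    "HTML":     (0.150, 8.5,   0.90),
    "download": (0.005, 300.0, 0.95),
    "other":    (0.195, 25.0,  0.72),
}

# Weighted averages over hosted content.
mean_size_kb = sum(share * size for share, size, _ in CONTENT_TYPES.values())
cachable_share = sum(share * cach for share, _, cach in CONTENT_TYPES.values())

print(f"mean hosted object size: about {mean_size_kb:.2f} KB")
print(f"cachable hosted objects: about {cachable_share:.1%}")
```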

3.5 WAN latency and packet loss

Polygraph client and server machines are configured to use FreeBSD's DummyNet feature.

We configure Polygraph clients with 40 millisecond delays (per packet, incoming or outgoing), and with a 0.05% probability of dropping a packet (incoming or outgoing). Server think times are normally distributed with a 300 millisecond mean and a 100 millisecond standard deviation. Note that the server think time does not depend on the object ID. Instead, it is randomly chosen for every request.

We do not use packet delays or packet loss on Polygraph servers.
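As a rough illustration of what these settings mean for a single exchange, the Python sketch below computes the delay added to one round trip and the chance that a given number of packets all get through. It assumes that each packet is delayed and dropped independently and only on the client side, as described above; the function names are made up for this example.

```python
CLIENT_DELAY_MS = 40    # per packet, incoming or outgoing
CLIENT_LOSS = 0.0005    # 0.05% drop probability per packet


def added_round_trip_ms():
    """One outgoing plus one incoming packet each cross the client-side delay."""
    return 2 * CLIENT_DELAY_MS


def delivery_probability(packets):
    """Probability that none of `packets` independent packets is dropped."""
    return (1 - CLIENT_LOSS) ** packets


print(added_round_trip_ms())
print(delivery_probability(2))
```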

4. Parameters

Here is an explanation of some of the workload parameters.

4.1 Peak request rate

This parameter specifies the request rate during the plateau ("top1" and "top2") phases of the test. The minimum request rate is 0.4 req/sec, the rate of a single robot. The maximum request rate (given the WebAxe-4 address allocation rules) is probably around 15500 req/sec.
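One plausible way to arrive at the 15500 req/sec figure from the address rules in section 5.1 is sketched below in Python. It assumes that each client-side host ends up with its own /22 subnet and runs at the 500 req/sec host limit; this is a reading of those rules, not something the workload files state directly.

```python
THIRD_OCTET_VALUES = 124   # 10.X.0-123.*: third octet ranges over 0..123
OCTETS_PER_SLASH_22 = 4    # a /22 spans four consecutive third-octet values
MAX_HOST_RATE = 500        # req/sec per client-side host

# Each host is assumed to occupy one /22 subnet of the address space.
max_hosts = THIRD_OCTET_VALUES // OCTETS_PER_SLASH_22
max_rate = max_hosts * MAX_HOST_RATE

print(max_hosts, max_rate)
```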

4.2 Fill request rate

This parameter specifies the request rate for the "fill" phase of the test. The fill rate must be between 10% and 100% of the peak request rate.
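The rule can be expressed as a simple range check. The helper below is an illustrative Python sketch; the function name is hypothetical and is not part of Polygraph.

```python
def is_valid_fill_rate(fill_rate, peak_rate):
    """Return True if fill_rate lies within 10%..100% of peak_rate (req/sec)."""
    return peak_rate / 10 <= fill_rate <= peak_rate


print(is_valid_fill_rate(400, 1000))   # inside the allowed range
print(is_valid_fill_rate(50, 1000))    # below 10% of the peak rate
```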

4.3 Proxy cache size

Proxy cache size is the configured cache size plus the total amount of RAM that the proxy box has. Configured cache size is whatever is specified in the proxy configuration file or the best approximation of that. High/low water marks for garbage collection and other proxy-specific settings and algorithms should not affect this parameter. Proxy cache size is used to determine the duration of the fill phase and does not have a direct effect on other phases (though there may be performance side effects, of course).

4.4 Working set size

Working set size (WSS) should be set to 1GB. Other settings will not break the workload, but they violate the rules and lead to results that cannot be compared with others. Larger working set sizes require large caches to maintain a close-to-ideal hit ratio. For custom tests, the configured size should approximate the typical size of the real Web sites that the product under test accelerates. It is believed that the vast majority of Web sites have less than 100MB of content, and even the most popular and largest Web sites rarely exceed 1GB in size.

5. Addresses

This section describes algorithms and rules used to allocate domain names and IP addresses used in WebAxe-4. Most of the addresses are computed automatically based on request rates and address space parameters specified in the workload file.

5.1 Robot addresses

The number of WebAxe-4 robots is determined by peak request rate. Each robot is capable of producing 0.4 req/sec load. The total number of robots is adjusted so that every client-side host has the same number of robots (other similar minor adjustments are also made). The number of hosts is determined based on the maximum host load of 500 req/sec.
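The sizing rules above can be sketched in Python as follows. This is an approximation for illustration: it assumes that the robot count is rounded up so that hosts carry equal numbers of robots, and it ignores the "other similar minor adjustments" that Polygraph makes. The function name is hypothetical. Pass the peak rate as an integer (or a string such as "0.4") so that the arithmetic stays exact.

```python
import math
from fractions import Fraction

ROBOT_RATE = Fraction(2, 5)   # 0.4 req/sec per robot
MAX_HOST_RATE = 500           # req/sec per client-side host


def plan_robots(peak_rate):
    """Return (hosts, robots_per_host, total_robots) for a peak rate in req/sec."""
    rate = Fraction(peak_rate)
    if rate <= 0:
        raise ValueError("peak request rate must be positive")
    hosts = math.ceil(rate / MAX_HOST_RATE)
    robots = math.ceil(rate / ROBOT_RATE)
    # Assumption: round up so that every host gets the same number of robots.
    robots_per_host = math.ceil(Fraction(robots, hosts))
    return hosts, robots_per_host, hosts * robots_per_host


print(plan_robots(1000))
print(plan_robots(501))
```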

Two robots share the same IP address. All IP addresses use a /22 subnet.

To allocate IP addresses for robot pairs, Polygraph iterates through the client-side addr_space array and gives the next robot pair the next IP address, until enough IP addresses are allocated for a host. Polygraph then skips remaining IP addresses that belong to the same /22 subnet (if any), and starts allocation for the next host (if any).

The above scheme ensures that individual IPs do not "migrate" from one host to another when request rate changes. Instead, only the number of IPs "enabled" on each host changes.

Robot addresses are bound to loopback interfaces. The bench setup must provide appropriate routes for robots to be able to communicate with the world.

WebAxe-4 uses lo0::10.X.0-123.1-250/22 client-side address space, where X is the bench ID that can vary from 100 to 199.
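The allocation procedure described in this section can be sketched in Python as follows. This is an illustration of the described rules, not Polygraph's actual code; the function name and its parameters are assumptions, and the number of IP addresses per host is taken as an input (one IP address per robot pair).

```python
from itertools import product


def allocate(hosts, ips_per_host, bench_id=100):
    """Return a list of per-host IP lists drawn from 10.X.0-123.1-250."""
    if not 100 <= bench_id <= 199:
        raise ValueError("bench ID must be in 100..199")
    if hosts < 1 or ips_per_host < 1:
        raise ValueError("need at least one host and one IP per host")

    # Walk the address space in order: third octet 0..123, fourth octet 1..250.
    space = product(range(0, 124), range(1, 251))
    current = next(space, None)
    plan = []
    for _ in range(hosts):
        host_ips = []
        while current is not None and len(host_ips) < ips_per_host:
            host_ips.append(current)
            current = next(space, None)
        if len(host_ips) < ips_per_host:
            raise ValueError("client-side address space exhausted")
        # Skip what is left of this host's /22 (four consecutive third octets).
        last_subnet = host_ips[-1][0] // 4
        while current is not None and current[0] // 4 == last_subnet:
            current = next(space, None)
        plan.append([f"10.{bench_id}.{third}.{fourth}" for third, fourth in host_ips])
    return plan


small = allocate(hosts=2, ips_per_host=300)
large = allocate(hosts=2, ips_per_host=625)
print(small[1][0], large[1][0])
```

In this sketch the second host starts at the same address whether each host needs 300 or 625 addresses, which is the "no migration" property described above: a change in request rate only changes how many addresses are enabled on each host.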

5.2 Server addresses

Server-side IP addresses are set to the real addresses of the server-side PCs. The number of WebAxe servers must be 4 or the number of client-side machines, whichever is larger.

5.3 Proxy address

The surrogate address must be set to 172.16.X.32, where X is the bench ID. Robots will contact that address as if it were an origin server. The surrogate is required to communicate with the actual origins.
