The PolyMix environment has been modeling the following Web traffic
characteristics since PolyMix-2.
- a mixture of content types
- varying offered load, depending on the test phase
- a working set of URLs that changes its content with time but can preserve its size
- all distributed clients can share information about the global URL set
- object life-cycles (expiration and last-modification times)
- persistent connections
- network packet loss
- reply sizes
- server-side latencies
- a mixture of cache hits and cache misses
- a mixture of cachable and uncachable responses
- object popularity (recurrence)
- request rates and interarrival times
- embedded objects and browser behavior
- a virtually infinite number of different objects that are added to the working set as needed
These features were added for PolyMix-3.
- integrated fill and measurement phases into a single run
- cache validation (IMS requests)
- forced cache validations (reloads)
- hot subsets simulating flash crowds
- improved URL working set handling
Still absent from the cache-off workload are the following.
- DNS-lookup latencies
- aborted requests
- real content (HTML, images, etc.)
- client-side latencies, bandwidth limits
- non-HTTP traffic
- different popularity characteristics among servers
While the last four features are already supported in a Polygraph
environment, they are prohibitively CPU intensive or require
further improvement.
1.1 Phase schedule
The following table describes all the important phases in a PolyMix-3
test. Not counting the fill phase, the test takes about 12 hours.
Filling the cache usually takes an additional 3-12 hours, depending on the
product.
Phase Name | Duration | Activity |
framp | 30 min | The load is increased from zero to the peak fill rate. |
fill | variable | The cache is filled twice, and the working set size is frozen. |
fexit | 30 min | The load is decreased to 10% of the peak fill rate. At the same time, recurrence is increased from 5% DHR to its maximum level. |
inc1 | 30 min | The load is increased during the first hour to reach its peak level. |
top1 | 4 hours | The period of peak ``daily'' load. |
dec1 | 30 min | The load steadily goes down, reaching a period of relatively low load. |
idle | 30 min | The ``idle'' period with load level around 10% of the peak request rate. |
inc2 | 30 min | The load is increased to reach its peak level again. |
top2 | 4 hours | The second period of peak ``daily'' load. |
dec2 | 30 min | The load steadily goes down to zero. |
The most reliable and interesting measurements are usually taken from the
top2 phase, when the proxy is more likely to be in a steady
state.
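As a quick cross-check of the "about 12 hours" figure, the fixed-length phases from the table can be summed. This is only a sketch: the dictionary name and helper function are illustrative and are not part of Polygraph.

```python
# Phase durations in minutes, copied from the schedule table.
# The fill phase has a variable length and is therefore left out.
PHASE_MINUTES = {
    "framp": 30, "fexit": 30, "inc1": 30, "top1": 240, "dec1": 30,
    "idle": 30, "inc2": 30, "top2": 240, "dec2": 30,
}


def scheduled_hours(phases=PHASE_MINUTES):
    """Return the total length of the fixed-duration phases, in hours."""
    return sum(phases.values()) / 60


print(scheduled_hours())
```

The fixed phases add up to 11.5 hours, which is consistent with the rounded figure quoted above.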
1.2 Fill phase caveats
As mentioned previously, PolyMix-3 combines the fill and
measurement phases into a single workload. The benefit of this
approach is that the device under test is more likely to reach steady-state
conditions during the measurement phases. Also, a larger URL
working set can now be formed without increasing the duration of a
test. Under PolyMix-2, the fill phase was an isolated test. That
meant that the measurement phase could not request objects used during
the fill phase.
A downside to integrating the fill phase is that it is now difficult to
skip the fill phase and go right to measuring. For some products, half of
the testing time is spent in the fill phase. The total duration of the
test remains similar to the PolyFill-2 plus PolyMix-2 sequence, decreasing
for some products.
The old PolyFill-2 workload used Polygraph's best-effort request
submission model, and vendors could choose how many robots to use for the
fill. Some participants apparently found that a small number of robots
left the disk system in a higher performing state than did a larger
number. Now, PolyMix-3 uses the same number of robots during all of its
phases, and the participants can specify fill rate directly just as they
specify the peak request rate.
One of the rules of PolyMix-3 is that the request rate during the
fill phase must not be greater than the peak rate (as used in
top1 and top2). Otherwise, users are free to choose
virtually any fill rate they like. Usually, the selected fill rate is at
least 50% of the peak request rate.
PolyMix-3 limits the fill request rate to the peak request rate to prevent
test participants from specifying a very high fill rate that causes the
cache to reject or bypass some of the incoming fill traffic,
effectively reducing the amount of content stored by the cache. Some
products used this (now illegal) trick with PolyFill-2 to cache less
data during the fill and optimize their data placement layout for the
measurement phases. Ideally, Polygraph should check how much data is
actually cached instead of imposing artificial request rate
limits.
1.3 Reply sizes
Object reply size distributions are different for different content
types (see the table below). Reply sizes
range from 300 bytes to 5 MB with an overall mean of about
11 KB and a median of 5 KB. The reply size depends only on
the object ID (oid). Thus, the same object always has the same reply size,
regardless of the number of requests for that object.
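One way to make a reply size depend only on the oid is to seed a random generator with the oid and draw the size from the content type's distribution. The sketch below assumes an exponential distribution with a 4.5 KB mean (the Image row of the content types table) and assumes that out-of-range draws are clamped to the 300 byte and 5 MB bounds. Polygraph's actual mechanism is not described here, and the function and constant names are hypothetical.

```python
import random

MIN_SIZE = 300               # bytes, lower bound quoted in the text
MAX_SIZE = 5 * 1024 * 1024   # bytes, the "5 MB" upper bound quoted in the text


def reply_size(oid, mean_bytes=4.5 * 1024):
    """Return a reply size that is a pure function of the object ID.

    Assumption: seeding a private generator with the oid stands in for
    whatever oid-to-size mapping Polygraph really uses.
    """
    rng = random.Random(oid)
    size = int(rng.expovariate(1.0 / mean_bytes))
    # Assumption: clamp to the documented range instead of redrawing.
    return max(MIN_SIZE, min(MAX_SIZE, size))
```

Because the generator is re-seeded on every call, repeated requests for the same oid always see the same size, no matter how many other objects were requested in between.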
1.4 Cachable and uncachable replies
Polygraph servers mark some of their responses as uncachable.
The particular probability varies with content types (see the table below). Overall, the workload
results in about 80% of all responses being cachable.
The real world cachability varies from location to location.
We have chosen 80% as a typical value that is close to many
common environments.
A cachable response includes the following HTTP header field.
Cache-Control: public
An uncachable response includes the following HTTP header fields.
Cache-Control: private,no-cache
Pragma: no-cache
Object cachability depends only on the oid. The same oid is always
cachable, or always uncachable.
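The header fields above, together with the rule that cachability is fixed per oid, can be sketched as follows. The per-oid seeding and the flat 80% share are illustrative assumptions (the real share varies by content type), and the function names are hypothetical.

```python
import random

CACHABLE_SHARE = 0.80  # overall share of cachable responses quoted in the text


def is_cachable(oid, share=CACHABLE_SHARE):
    """Decide cachability from the oid alone, so the answer never changes."""
    # Assumption: a generator seeded with the oid stands in for the real mapping.
    return random.Random(oid).random() < share


def cache_headers(oid):
    """Return the cache-related header fields for a response to this oid."""
    if is_cachable(oid):
        return {"Cache-Control": "public"}
    return {"Cache-Control": "private,no-cache", "Pragma": "no-cache"}
```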
1.5 Life-cycle model
Web Polygraph is capable of simulating realistic (complex)
object expiration and modification conditions using
Expires: and Last-Modified: HTTP headers.
Each object is assigned a ``birthday'' time. An object goes
through modification cycles of a given length. Modification
and expiration times are randomly selected within each cycle.
The corresponding parameters for the model are drawn from the
user-specified distributions.
The Life-cycle model configuration in PolyMix-3 does not utilize all
the available features. We restrict the settings to reduce the possibility
that a cache serves a stale response. While stale objects are common in
real traffic, caching vendors strongly believe that allowing them into the
benchmark sends the wrong message to buyers.
Consequently, all Polygraph responses in PolyMix-3 carry
modification and expiration information, and that information
is correct. The real-world settings would be significantly
different, but it is difficult to accurately estimate the
influence of these settings on cache performance.
1.6 Content types
PolyMix-3 defines a mixture of content types. Each content type has
the following properties.
- popularity
- content size distribution
- cachability percentage
- life-cycle parameters
- file name extensions distribution
The approximate parameters for the first four properties are given in the
table below. For exact definitions, see the workload files.
Type | Portion | Reply Size | Cachability | Expiration |
Image | 65.0% | exp(4.5KB) | 80% | logn(30day, 7day) |
HTML | 15.0% | exp(8.5KB) | 90% | logn(7day, 1day) |
Download | 0.5% | logn(300KB,300KB) | 95% | logn(0.5year, 30day) |
Other | 19.5% | logn(25KB,10KB) | 72% | unif(1day, 1year) |
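The per-type figures in the table can be combined to cross-check the overall numbers quoted earlier (a mean reply size of about 11 KB and about 80% cachable responses). The sketch below assumes that the first parameter of each exp() and logn() entry is the distribution mean and ignores the 300 byte and 5 MB bounds. The variable names are illustrative.

```python
# (portion, mean reply size in KB, cachable share) per content type,
# copied from the table above.
CONTENT_TYPES = {
    "Image":    (0.650, 4.5,   0.80),
    "HTML":     (0.150, 8.5,   0.90),
    "Download": (0.005, 300.0, 0.95),
    "Other":    (0.195, 25.0,  0.72),
}


def weighted(index):
    """Portion-weighted average of one column of CONTENT_TYPES."""
    return sum(row[0] * row[index] for row in CONTENT_TYPES.values())


mean_size_kb = weighted(1)     # overall mean reply size in KB
cachable_share = weighted(2)   # overall fraction of cachable responses

print(f"{mean_size_kb:.1f} KB, {cachable_share:.1%}")
```

Under these assumptions the weighted mean size comes out near 10.6 KB and the cachable share near 80%, in line with the approximate values given in the text.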
1.7 Latency and packet loss
PolyMix-3 uses the same latency and packet loss parameters that we used
for PolyMix-2. The Polygraph client and server machines are configured to
use FreeBSD's DummyNet feature.
We configure Polygraph servers with 40 millisecond delays (per
packet, incoming and outgoing), and with a 0.05% probability of dropping a
packet. Server think times are normally distributed with a
2.5 second mean and a 1 second standard deviation. Note that
the server think time does not depend on the oid. Instead, it is randomly
chosen for every request.
We do not use packet delays or packet loss on Polygraph clients.
1.8 If-Modified-Since requests
About 20% of PolyMix-3 requests contain an ``If-Modified-Since'' HTTP
header. Polygraph robots cache object timestamps to generate those
headers. When an IMS request has to be made but the corresponding
timestamp is not available, the value of ``Thu Jan 1 00:00:00 UTC 1970''
is used. In fact, the majority of requests end up using that value so that
the percentage of ``304 Not Modified'' responses from an ideal cache is
only around 5%, far less than 20% of IMS requests.
1.9 Cache hits and misses
The PolyMix-3 workload has a 58% offered hit ratio. In the
workload definition, this is actually specified through the recurrence
ratio (i.e., the probability of revisiting a Web object). The
recurrence ratio must account for uncachable responses and special
requests. In PolyMix-3, a recurrence ratio of 72% yields an offered hit
ratio of 58%. Note that to simplify analysis, only ``basic'' requests are
counted when hit ratio is computed; special requests (If-Modified-Since
and Reload) are ignored because in many cases there is no reliable way to
detect whether the response was served as a cache hit.
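To a first approximation, the relation between the two ratios can be reproduced by multiplying the recurrence ratio by the share of cachable responses. This is a simplified sketch, not Polygraph's exact computation, which also has to account for special requests. The function name is illustrative.

```python
def offered_hit_ratio(recurrence, cachable_share):
    """Approximate offered hit ratio: a revisit can hit only if cachable."""
    return recurrence * cachable_share


# 72% recurrence and about 80% cachable responses, as quoted in the text.
print(round(offered_hit_ratio(0.72, 0.80), 3))
```

The product is 0.576, which rounds to the 58% offered hit ratio quoted above.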
Polygraph enforces the desired hit ratio by requesting objects that
have been requested before, and should have been cached. There
is no guarantee, however, that the object is in the cache. Thus, our
parameter (58%) is an upper limit. The hit ratio achieved by a proxy may
be lower if a proxy does not cache some cachable objects, or purges
previously cached objects before the latter are revisited. Various HTTP
race conditions also make it difficult, if not impractical, to achieve
ideal hit ratios.
1.10 Object popularity
PolyMix-3 introduces a ``hot subset'' simulation into the
popularity model. At any given time, a 1% subset of the URL working
set is dedicated to receive 10% of all requests. As the working set
slides with time, the hot subset may jump to a new location so that
all hot objects stay within the working set. This model is designed
to simulate realistic Internet conditions, including ``flash
crowds.'' We have not yet fully analyzed the effect of this
hot subset model.
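The hot subset selection can be sketched as a two-step choice: decide whether the request goes to the hot subset, then pick an object inside or outside it. The sketch assumes uniform selection within each group and draws the remaining 90% of requests from outside the hot subset. The real popularity model is more involved, and all names here are hypothetical.

```python
import random


def pick_object(rng, working_set_size, hot_start, hot_share=0.01, hot_prob=0.10):
    """Pick a position in [0, working_set_size) with a hot subset.

    The hot subset covers hot_share of the working set, starting at
    hot_start, and receives hot_prob of all requests.
    """
    hot_size = max(1, int(working_set_size * hot_share))
    if rng.random() < hot_prob:
        # Hot request: uniform over the hot subset (assumption).
        return hot_start + rng.randrange(hot_size)
    # Other request: uniform over positions outside the hot subset (assumption).
    pos = rng.randrange(working_set_size - hot_size)
    return pos if pos < hot_start else pos + hot_size
```

Moving the hot subset as the working set slides then amounts to changing `hot_start` so that the whole subset stays inside the current working set.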
1.11 Simulated robots and servers
A single Polygraph client machine supports many simulated
robots. A robot can emulate various types of Web clients, from a
human surfer to a busy peer cache. All robots in PolyMix-3 are configured
identically, except that each has its own IP address. We limit the number
of robots (and hence IP aliases) to 1000 per client machine.
A PolyMix-3 robot requests objects using a Poisson-like stream, except
for embedded objects (images on HTML pages) that are requested simulating
cache-less browser behavior. A limit on the number of simultaneously open
connections is also supported, and may affect the request stream.
PolyMix-3 servers are configured identically, except that each has its
own IP address.
1.12 Persistent connections
Polygraph supports persistent connections on both client and server
sides. PolyMix-3 robots close an ``active'' persistent connection right
after receiving the N-th reply, where N is drawn from a
Zipf(64) distribution. The robots will close an ``idle''
persistent connection if the per-robot connection limit has been reached
and connections to other servers must be opened. The latter mimics browser
behavior.
PolyMix-3 servers use a Zipf(16) distribution to close
active connections. The servers also time out idle persistent connections
after 15 sec of inactivity, just like many real servers would
do.
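The per-connection reply limit can be illustrated with a simple Zipf-like draw. The sketch assumes that Zipf(n) means a distribution over 1..n with probability proportional to 1/k. Polygraph's own Zipf implementation may be defined differently, so treat the helper below as a stand-in rather than the actual algorithm.

```python
import random


def zipf_like(rng, cap):
    """Draw k from 1..cap with probability proportional to 1/k (assumption)."""
    ranks = list(range(1, cap + 1))
    weights = [1.0 / k for k in ranks]
    return rng.choices(ranks, weights=weights, k=1)[0]


rng = random.Random(0)
robot_limit = zipf_like(rng, 64)    # robot closes after this many replies
server_limit = zipf_like(rng, 16)   # server closes after this many replies
```

With such a distribution most connections carry only a few requests, while a few stay open for many more.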
In theory, the algorithm for assigning IP addresses to servers and robots
should not affect the results of the tests. However, knowing the IP allocation
scheme may be important for those who rely on IP-based redirection
capabilities of their network gear. Reachability of servers and robots is also
an issue.
The IP allocation scheme for PolyMix-3 is the same as for PolyMix-2.
However, Polygraph is now capable of automatically computing required IP
addresses based on the bench configuration specified using PGL.
2.1 Allocation scheme
The following allocation scheme is used for the tests. Each robot or server
within a testing Cluster is assigned a 10.C.x.y/N IP address.
The number of IP addresses per testing Cluster (and hence the number of
robots and servers) is proportional to the maximum requested load. Enough
physical hosts are provided to ensure Polygraph is not a bottleneck. At the
time of writing, we expect to be able to handle at least 1,000 IP
addresses per host.
The values of C, x, y, and N are defined
below.
C is the testing Cluster identifier (constant for all
IPs within a cluster).
For robots, x is in the [1,127] range.
For servers, x is in the [129,250] range.
For robots and servers, y is in the [1,250] range.
10.C.0.1 is the proxy address known to robots (if robots
are configured to talk to a proxy). Participants can use other
10.C.0.0/24 and 10.C.128.0/24 addresses if needed.
Polyteam will use other IP addresses as needed for monitoring
and other purposes.
Moreover, exactly two schemes are supported.
Switched network: For robots, servers, and a known
proxy address (if any), the subnet /N is set to /16 (a class
B network, 255.255.0.0 netmask).
Routed network: For robots, servers, and a known
proxy address (if any), the subnet /N is set to /17 (a
255.255.128.0 netmask). Client side machines point their
default routes to 10.C.0.253.
The routed network configuration may be useful for router-based redirection
techniques such as WCCP.
2.2 Number of clients and servers
Given the total request rate of RR req/sec, we allocate
R = RR/0.4 = 2.5*RR robot IP addresses (0.4 req/sec is the request
rate of an individual robot). The number of server IP addresses is then
0.1*R+500. Robot and server IPs are distributed evenly and
sequentially across Polygraph machines (see the configuration example).
We place a limit of 1000 robots per client machine. The number of
server machines is equal to the number of client machines.
When the calculations produce a non-integer value V, we round
up to the closest integer greater than V.
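The sizing rules in this section can be written out as a small calculation. This is a sketch under stated assumptions: the function and constant names are illustrative, and rounding is applied to the robot count before the server count is derived, which the text does not spell out. Exact fractions are used to avoid floating-point surprises when rounding up.

```python
from fractions import Fraction
from math import ceil

ROBOT_RATE = Fraction(2, 5)      # 0.4 req/sec per robot
MAX_ROBOTS_PER_CLIENT = 1000     # limit per client machine


def bench_size(peak_rate):
    """Return (robots, servers, client_hosts, server_hosts) for a peak rate."""
    robots = ceil(Fraction(peak_rate) / ROBOT_RATE)        # R = RR / 0.4
    servers = ceil(Fraction(robots, 10)) + 500             # 0.1 * R + 500
    client_hosts = ceil(Fraction(robots, MAX_ROBOTS_PER_CLIENT))
    server_hosts = client_hosts                            # same number of machines
    return robots, servers, client_hosts, server_hosts


print(bench_size(800))
```

For an 800 req/sec peak load this gives 2,000 robots, 700 servers, and two machines on each side.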
2.3 Configuration example
Here is an example of a possible configuration. Let's assume we want to test
a product under 800 req/sec peak load.
An 800 req/sec setup requires 2,000 robots and 700
servers. We must utilize two machines running polyclt and two
machines running polysrv. The robot and server IP addresses are
allocated as follows (assuming test Cluster id 100).
Host | Switched network | Routed network |
First client side host: | 10.100.1-4.1-250/16 | 10.100.1-4.1-250/17 |
Second client side host: | 10.100.5-8.1-250/16 | 10.100.5-8.1-250/17 |
First server side host: | 10.100.129-130.1-175/16 | 10.100.129-130.1-175/17 |
Second server side host: | 10.100.131-132.1-175/16 | 10.100.131-132.1-175/17 |
Thus, each client-side host is assigned 1,000 IP addresses while
each server host gets 350 IPs.
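The range notation used in the table can be expanded to confirm the address counts and to check that, in the routed configuration, robots and servers fall into different /17 subnets. The `expand` helper below is a hypothetical illustration, not a Polygraph tool. It only handles ranges in the last two octets, which is all this example needs.

```python
import ipaddress


def expand(spec):
    """Expand a spec such as '10.100.1-4.1-250' into IPv4Address objects."""
    def span(part):
        # 'lo-hi' becomes an inclusive range; a single number becomes itself.
        lo, _, hi = part.partition("-")
        return range(int(lo), int(hi or lo) + 1)

    a, b, xs, ys = spec.split(".")
    return [ipaddress.ip_address(f"{a}.{b}.{x}.{y}")
            for x in span(xs) for y in span(ys)]


robots = expand("10.100.1-4.1-250") + expand("10.100.5-8.1-250")
servers = expand("10.100.129-130.1-175") + expand("10.100.131-132.1-175")

# The two /17 halves of the cluster's /16, as used by the routed scheme.
lower = ipaddress.ip_network("10.100.0.0/17")
upper = ipaddress.ip_network("10.100.128.0/17")

print(len(robots), len(servers))
print(all(ip in lower for ip in robots), all(ip in upper for ip in servers))
```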
2.4 Other provisions
All robots must be able to ``talk'' HTTP to all servers at all times,
even if a proxy is not in the loop (a no-proxy test). No changes in
robot or server configuration are allowed for a no-proxy test. Only
unplugging the proxy cable is allowed. Consequently, if a proxy relies on
robots and servers being on different subnets during performance tests, a
no-proxy run must be feasible without changing the subnets of the robots and
servers. Providing for intra-robot or intra-server communication is not
required.
Participants must provide unlimited TCP/UDP connectivity from a single
dedicated host (a monitoring station maintained by Polyteam, one per testing
Cluster) to all robots and servers. The monitoring station has a single
network interface.
There are no DNS servers or other global services reachable from a
testing Cluster. There are no permanent inter-cluster connections.
Not all operating systems can efficiently support a large number of IP
addresses per host. We patch the kernel to ensure that FreeBSD can support
thousands of addresses.