Simple tests

1. For the impatient
polysrv \
    --config /usr/local/polygraph/workloads/simple.pg \
    --verb_lvl 10
polyclt \
    --config /usr/local/polygraph/workloads/simple.pg \
    --verb_lvl 10
# watch console output then kill polysrv and polyclt

2. Introduction
The Polygraph distribution includes client and server simulators called
polyclt and polysrv. You need to run both programs to
simulate the desired workload. The server(s) should be launched first. We will
show the command line for polysrv (polyclt) followed by the
polysrv (polyclt) output generated in our environment.

For simplicity, we will start both simulators on one machine. To run the
tests below, your machine must be allowed to connect to itself via the
127.0.0.1 IP address. If you want to start Polygraph on a different
port or host, you must adjust the configuration files accordingly.

Polygraph workloads are specified using command line options and
a configuration file written in Polygraph Language (PGL).

The tests and workloads described here are not meant to be used for
production benchmarking! They only illustrate basic Polygraph
usage.

3. Hello, world
This simple run will test basic Polygraph functionality.

The Polygraph distribution already contains a simple workload specification.
The specs can be found in simple.pg. Other
workload examples can be found in the polygraph/workloads/ directory
as well.

Normally, the workload description includes several phases.
Polygraph stops when all phases reach their goals. For these simple tests, we
will use the default phase that does not have any goal. Hence, we will have to
terminate polyclt and polysrv manually by pressing
^C or sending an interrupt signal.

Let's start the server.

> polysrv \
    --config /usr/local/polygraph/workloads/simple.pg \
    --verb_lvl 10
000.00| Content distribution on server S101:
        content        planned%         likely%          error%
   some-content          100.00          100.00            0.00
expected average cachability: 80.00%
expected average object size: 13331.30Bytes
bin/polysrv: warning: no run phases were specified; ...
000.00| fyi: no bench selected with use(); ...
000.01| Command: polysrv 
    --config /usr/local/polygraph/workloads/simple.pg --verb_lvl 10
000.01| Configuration:
    version:            2.6.0b5
    host_type:          i386-unknown-freebsd3.4
    verb_lvl:           10
    dump:               <none>
    dump_size:          1.000KB
    notify:             <none>
    label:              <none>
    fd_limit:           15878
    config:             /usr/local/polygraph/workloads/simple.pg
    cfg_dirs:           
    console:            -
    log:                <none>
    log_size:           -1
    sample_log:         <none>
    sample_log_size:    -1
    stats_cycle:        5.00sec
    sync_phases:        on
    file_scan:          poll
    priority_sched:     5
    new_oids_per_msg_max:16
    fake_hosts:         
    idle_tout:          5.00min
    rng_seed:           1
    unique_world:       on
    new_oids_history:   2048
    ign_urls:           off
000.01| Phases:
     phase load_beg load_end  rec_beg  rec_end smsg_beg smsg_end     goal
      dflt     1.00     1.00     1.00     1.00     1.00     1.00 <none>    
000.01| StatsSamples:
    static stats samples:   0
    dynamic stats samples:  0
000.01| FDs: 16384 out of 16384 FDs can be used; safeguard limit: 15878
000.01| resource usage: 
    CPU Usage: 7msec sys + 1.52sec user = 1.53sec
    Maximum Resident Size: 7.922MB
    Page faults with physical i/o: 0
000.01| group-id: 0480bcf4.2e7d019d:00000002 pid: 413
000.01| current time: 987891522.111529 or Sat, 21 Apr 2001 22:18:42 GMT
000.01| fyi: PGL configuration stored (3054bytes)
000.01| fyi: current state (1) stored
000.01| starting 1 HTTP agents...
000.01| starting S101[1 / 0480bcf4.2e7d019d:00000004] 
    on 127.0.0.1:9090
000.10| i-dflt      0   0.00     -1  -1.00   0    1
000.18| i-dflt      0   0.00     -1  -1.00   0    1
000.26| i-dflt      0   0.00     -1  -1.00   0    1
000.35| i-dflt      0   0.00     -1  -1.00   0    1
...
As you can see, the polysrv process created one server agent bound to
localhost (127.0.0.1), port 9090. Polysrv complained
about no phases found in simple.pg and generated a ``default''
infinite phase. Since there is no client running yet, the run-time statistics
show no transactions. For details about the console output format, look elsewhere.

Polysrv will continue to do nothing until we start the client.
You may need to open a new window or virtual terminal to get a second
command line prompt.

> polyclt \
    --config /usr/local/polygraph/workloads/simple.pg \
    --verb_lvl 10
000.00| Content distribution on server S101:
        content        planned%         likely%          error%
   some-content          100.00          100.00            0.00
expected average cachability: 80.00%
expected average object size: 13331.30Bytes
bin/polyclt: warning: no run phases were specified; ...
000.00| fyi: no bench selected with use(); ...
000.01| Command: bin/polyclt --config workloads/simple.pg --verb_lvl 10
000.01| Configuration:
    version:            2.6.0b5
    host_type:          i386-unknown-freebsd3.4
    verb_lvl:           10
    dump:               <none>
    dump_size:          1.000KB
    notify:             <none>
    label:              <none>
    fd_limit:           15878
    config:             /usr/local/polygraph/workloads/simple.pg
    cfg_dirs:           
    console:            -
    log:                <none>
    log_size:           -1
    sample_log:         <none>
    sample_log_size:    -1
    stats_cycle:        5.00sec
    sync_phases:        on
    file_scan:          poll
    priority_sched:     5
    new_oids_per_msg_max:16
    fake_hosts:         
    idle_tout:          <none>
    rng_seed:           1
    unique_world:       on
    proxy:              <none>
    ports:              <none>
    icp_tout:           2.00sec
    new_oids_prefetch:  256
    ign_false_hits:     on
    ign_bad_cont_tags:  off
    prn_false_misses:   off
000.01| Phases:
     phase load_beg load_end  rec_beg  rec_end smsg_beg smsg_end     goal
      dflt     1.00     1.00     1.00     1.00     1.00     1.00 <none>    
000.01| StatsSamples:
    static stats samples:   0
    dynamic stats samples:  0
000.01| FDs: 16384 out of 16384 FDs can be used; safeguard limit: 15878
000.01| resource usage: 
    CPU Usage: 23msec sys + 1.51sec user = 1.53sec
    Maximum Resident Size: 8.203MB
    Page faults with physical i/o: 18
000.01| group-id: 0480bd09.3a2501a0:00000002 pid: 416
000.01| current time: 987891543.872993 or Sat, 21 Apr 2001 22:19:03 GMT
000.01| fyi: PGL configuration stored (3054bytes)
000.01| fyi: current state (1) stored
000.01| starting 1 HTTP agents...
000.01| starting R101[1 / 0480bd09.3a2501a0:00000004] on 127.0.0.1
000.01| fyi: server scan completed with all local robots ready ...
000.10| i-dflt   3296 659.19      1   0.00   0    2
000.18| i-dflt   6585 657.79      1   0.00   0    2
000.26| i-dflt   9891 661.19      1   0.00   0    2
000.35| i-dflt  13109 643.54      1   0.00   0    2
000.43| i-dflt  16380 654.05      1   0.00   0    2
000.51| i-dflt  19667 657.29      1   0.00   0    2
000.60| i-dflt  22972 660.94      1   0.00   0    2
000.68| i-dflt  26228 651.18      1   0.00   0    2
000.76| i-dflt  29500 654.33      1   0.00   0    2
000.85| i-dflt  32775 654.91      1   0.00   0    2
000.93| i-dflt  35992 643.38      1   0.00   0    2
001.01| i-dflt  39223 646.19      1   0.00   0    2
001.10| i-dflt  42479 651.19      1   0.00   0    2
001.18| i-dflt  45687 641.55      1   0.00   0    2
001.26| i-dflt  48954 653.37      1   0.00   0    2
001.35| i-dflt  52189 646.95      1   0.00   0    2
001.43| i-dflt  55418 645.79      1   0.00   0    2
001.51| i-dflt  58691 654.51      1   0.00   0    2
001.60| i-dflt  61955 652.77      1   0.00   0    2
001.68| i-dflt  65304 669.72      1   0.00   0    2
001.76| i-dflt  68593 657.71      1   0.00   0    2
...
Now we can see some traffic on both the client side (above) and the server side
(below). The console output tells us that Polygraph is doing around 650
requests per second with response times of 1msec, and that there are no
hits or errors. Note that we are running a very simple back-to-back workload.
Your numbers will differ depending on how powerful your OS and hardware are.

We will kill the experiment now by pressing Control+C in the client
and server windows. Here is the rest of the server output.

...
000.51| i-dflt   5436 656.53      0   0.00   0    2
000.60| i-dflt   8724 657.52      0   0.00   0    2
000.68| i-dflt  11985 652.20      0   0.00   0    2
000.76| i-dflt  15224 647.73      0   0.00   0    2
000.85| i-dflt  18499 654.98      0   0.00   0    2
000.93| i-dflt  21818 663.77      0   0.00   0    2
001.01| i-dflt  25075 651.38      0   0.00   0    2
001.10| i-dflt  28370 658.96      0   0.00   0    2
001.18| i-dflt  31631 652.18      0   0.00   0    1
001.26| i-dflt  34865 646.75      0   0.00   0    2
001.35| i-dflt  38092 645.31      0   0.00   0    2
001.43| i-dflt  41333 648.19      0   0.00   0    2
001.51| i-dflt  44551 643.51      0   0.00   0    2
001.60| i-dflt  47814 652.55      0   0.00   0    2
001.68| i-dflt  51047 646.56      0   0.00   0    2
001.76| i-dflt  54281 646.80      0   0.00   0    2
001.85| i-dflt  57526 648.96      0   0.00   0    2
001.93| i-dflt  60774 649.50      0   0.00   0    2
002.01| i-dflt  64103 665.70      0   0.00   0    1
002.10| i-dflt  67429 665.18      0   0.00   0    2
002.18| i-dflt  70705 655.17      0   0.00   0    2
002.25| SrvConnMgr.cc:77: error: 1/1 (c16) connection closed ...
002.26| i-dflt  73421 543.14      0   0.00   0    1
002.35| i-dflt  73421   0.00     -1  -1.00   0    1
002.43| i-dflt  73421   0.00     -1  -1.00   0    1
^Cgot shutdown signal (2)
002.44| noticed shutdown signal (2)
002.44| resource usage: 
    CPU Usage: 24.75sec sys + 29.63sec user = 54.37sec
    Maximum Resident Size: 8.824MB
    Page faults with physical i/o: 0
002.44| fyi: current state (2) stored
002.44| server 127.0.0.1:9090 is closing listen socket 3 
    after 73421 xactions
002.44| got 73421 xactions and 0 errors
002.44| shutdown reason: got shutdown signal
And the rest of the client output.

...
001.85| i-dflt  71839 649.16      1   0.00   0    2
^Cgot shutdown signal (2)
001.89| noticed shutdown signal (2)
001.89| resource usage: 
    CPU Usage: 25.18sec sys + 34.89sec user = 1.00min
    Maximum Resident Size: 8.879MB
    Page faults with physical i/o: 18
001.89| fyi: current state (2) stored
001.89| got 73421 xactions and 0 errors
001.89| shutdown reason: got shutdown signal
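
If you want to post-process the console output rather than read it by eye, the interval lines shown above can be split into fields with a few lines of Python. The sketch below is not part of Polygraph. The field names are our own labels: the transaction count, request rate, response time, hit ratio, and error columns follow the interpretation used in this section, while the time unit of the first column and the meaning of the last column are assumptions.

```python
from typing import NamedTuple, Optional


class IntervalSample(NamedTuple):
    minutes: float        # value before '|'; ASSUMPTION: minutes since start
    phase: str            # phase label, e.g. "i-dflt"
    xactions: int         # cumulative transaction count
    req_rate: float       # requests per second
    resp_time_msec: int   # response time in msec; -1 when there is no data
    hit_ratio: float      # hit ratio in percent; -1.00 when there is no data
    errors: int           # error count
    open_conns: int       # ASSUMPTION: number of open connections


def parse_interval_line(line: str) -> Optional[IntervalSample]:
    """Return the parsed sample, or None for any other kind of console line."""
    stamp, sep, rest = line.partition("|")
    fields = rest.split()
    if not sep or len(fields) != 7:
        return None
    try:
        return IntervalSample(
            float(stamp), fields[0], int(fields[1]), float(fields[2]),
            int(fields[3]), float(fields[4]), int(fields[5]), int(fields[6]),
        )
    except ValueError:
        # Seven words, but not seven numeric columns (e.g. an error message).
        return None


sample = parse_interval_line("000.10| i-dflt   3296 659.19      1   0.00   0    2")
print(sample.req_rate, sample.resp_time_msec)
```

Lines that are not interval samples (configuration dumps, warnings, shutdown messages) come back as None, so the function can be applied to the whole console log.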
Now is a good time for you to look through the simple.pg file
and PGL documentation to see what kind of workload we were using during this
simple test.

4. Adding a proxy
In the previous test, polyclt was talking directly to
polysrv running on port 9090. Now we want to introduce a proxy
into the setup.

If your proxy runs in a transparent mode, you will probably need to run
polyclt and polysrv on different hosts and move
polysrv to port 80 so that Polygraph traffic will get
automagically redirected to the proxy. We will not demonstrate a transparent
setup here.

Our proxy is running on host 10.44.0.100 and listening for HTTP
queries on port 3128. We need to tell the polyclt process which
proxy to connect to using the --proxy command line option.

Since our proxy is not running on the same machine as polysrv, we
can no longer use the loopback interface and have to move our server agent to an
address that the proxy can connect to. The IP address and port number (currently
'127.0.0.1:9090') for the server agent are specified in the simple.pg
configuration file. Below are the relevant lines.

Server S = {
    kind = "S101"; 
    contents = [ SimpleContent ];
    direct_access = contents;
    addresses = ['127.0.0.1:9090' ]; // where to create these server agents
};
You will need to edit those lines to change the 127.0.0.1 address to the IP of
your machine. Our machine has the 10.44.128.61 address. We recommend that you do
not edit simple.pg but create and edit a copy instead. Let's call
that copy my-simple.pg. Here is how the modified part of
my-simple.pg looks in our case.

Server S = {
    kind = "S101";
    contents = [ SimpleContent ];
    direct_access = contents;
    addresses = ['10.44.128.61:9090' ]; // new server address
};
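
The copy-and-edit step above can also be scripted. The following Python sketch is not a Polygraph tool; the function name retarget_addresses is hypothetical, and it assumes that addresses appear in the workload as single-quoted literals such as '127.0.0.1:9090', as in the lines shown above.

```python
import re


def retarget_addresses(pgl_text: str, old_ip: str, new_ip: str) -> str:
    """Swap old_ip for new_ip inside single-quoted address literals.

    Only literals that start with old_ip followed by ':' or the closing
    quote are touched, so '127.0.0.10' is left alone when old_ip is
    '127.0.0.1'.
    """
    pattern = re.compile("'" + re.escape(old_ip) + "(?=[:'])")
    return pattern.sub(lambda match: "'" + new_ip, pgl_text)


original = "    addresses = ['127.0.0.1:9090' ]; // where to create these server agents"
print(retarget_addresses(original, "127.0.0.1", "10.44.128.61"))
```

To apply it to a file, read simple.pg, pass its text through the function, and write the result to my-simple.pg, leaving the distributed file untouched.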
With recent versions of Polygraph, a similar change of IP address is
required for the robot as well. This is because Polygraph robots now always
bind to the specified address. If a robot remains bound to 127.0.0.1, it will
not be able to receive responses from the proxy without special routes or NAT.
Change the addresses field of your robot specification to contain the
primary address of the polyclt machine. In our case, that address is
10.44.128.61 because we use the same machine for both client- and server-side
processes (which is a bad idea for production tests!).

Finally, we want to log detailed run-time statistics into the
/tmp/clt.log and /tmp/srv.log files using the --log
command line option. A sample from the server-side output is below.

...
000.01| Command: bin/polysrv 
    --config /tmp/my-simple.pg --verb_lvl 10 
    --log /tmp/srv.log
000.01| Configuration:
    config:             /tmp/my-simple.pg
    log:                /tmp/srv.log
    ...
000.01| starting 1 HTTP agents...
000.01| starting S101[1 / 0480cdae.630701cc:00000004] 
    on 10.44.128.61:9090
000.10| i-dflt      0   0.00     -1  -1.00   0    1
000.18| i-dflt     33   6.60      0   0.00   0    1
000.26| i-dflt     58   5.00      0   0.00   0    1
000.35| i-dflt     97   7.80     40   0.00   0    1
000.43| i-dflt    101   0.80    374   0.00   0    1
000.51| i-dflt    123   4.40     57   0.00   0    1
000.60| i-dflt    139   3.20      0   0.00   0    1
000.68| i-dflt    179   8.00     33   0.00   0    2
000.76| i-dflt    182   0.60    442   0.00   0    1
000.85| i-dflt    190   1.60      5   0.00   0    1
000.93| i-dflt    194   0.80    344   0.00   0    1
001.01| i-dflt    207   2.60    114   0.00   0    2
001.10| i-dflt    227   4.00    237   0.00   0    1
001.18| i-dflt    238   2.20      0   0.00   0    1
001.26| i-dflt    259   4.20      0   0.00   0    1
001.35| i-dflt    263   0.80      3   0.00   0    1
001.43| i-dflt    263   0.00     -1  -1.00   0    1
001.51| i-dflt    263   0.00     -1  -1.00   0    1
^Cgot shutdown signal (2)
001.52| noticed shutdown signal (2)
001.52| resource usage: 
    CPU Usage: 118msec sys + 1.67sec user = 1.79sec
    Maximum Resident Size: 8.848MB
    Page faults with physical i/o: 0
001.52| fyi: current state (2) stored
001.52| server 10.44.128.61:9090 is closing listen socket 4 
    after 263 xactions
001.52| got 263 xactions and 0 errors
001.52| shutdown reason: got shutdown signal
And here is the polyclt output.

...
000.01| Command: bin/polyclt 
    --config /tmp/my-simple.pg --verb_lvl 10 
    --proxy 10.44.0.100:3128 --log /tmp/clt.log
000.01| Configuration:
    config:             /tmp/my-simple.pg
    log:                /tmp/clt.log
    ...
000.01| starting 1 HTTP agents...
000.01| starting R101[1 / 0480cdb5.5c1701cd:00000004] 
    on 10.44.128.61
000.01| fyi: server scan completed with all local robots ready ...
000.10| i-dflt     98  19.60     50  58.16   0    2
000.18| i-dflt    163  13.00     69  49.23   0    2
000.26| i-dflt    221  11.60     77  56.90   0    2
000.35| i-dflt    272  10.20    117  60.78   0    2
000.43| i-dflt    288   3.20    281  62.50   0    2
000.51| i-dflt    353  13.00     82  47.69   0    2
000.60| i-dflt    392   7.80    106  46.15   0    2
000.68| i-dflt    419   5.40    221  62.96   0    2
000.76| i-dflt    420   0.20   3111   0.00   0    2
000.85| i-dflt    433   2.60    454  61.54   0    2
000.93| i-dflt    466   6.60    137  48.48   0    2
001.01| i-dflt    519  10.60    121  54.72   0    2
001.10| i-dflt    555   7.20    128  58.33   0    2
001.18| i-dflt    586   6.20    140  67.74   0    2
^Cgot shutdown signal (2)
001.20| noticed shutdown signal (2)
001.20| resource usage: 
    CPU Usage: 331msec sys + 2.00sec user = 2.33sec
    Maximum Resident Size: 8.902MB
    Page faults with physical i/o: 0
001.20| fyi: current state (2) stored
001.20| got 588 xactions and 0 errors
001.20| shutdown reason: got shutdown signal
Let's concentrate on the client-side console output. Note that various proxy
and network overheads increased transaction response time to more than
100msec, causing the request rate to drop to less than 10 req/sec (the correlation is
due to the best-effort mode of simple robots; production workloads virtually
never use best-effort robots). Also, polyclt is now getting some hits
(hit ratio is about 55%). The measurements reported on the console are unstable
due to the relatively low request rate (not enough sample data in a 5-second stats
window).

If you are not getting any hits from a proxy while everything else works as
expected, it is possible that the proxy under test is picky about object
expiration time and other freshness info. Some proxies would not cache an
object without certain HTTP header fields. The simple.pg workload
does not have an ``Object Life Cycle'' model configured, and servers generate no
freshness headers. To get hits with picky proxies, you can either use an
advanced workload such as PolyMix or modify your workload specs to
include an Object Life Cycle model.

Here is a simple modification to our workload that makes Squid cache
objects. We added an olcStatic object of type ObjLifeCycle
and used that object in the SimpleContent configuration. No other
changes were made.

ObjLifeCycle olcStatic = {
    birthday = now + const(-1year); // born a year ago
    length = const(2year);          // two year cycle
    variance = 0%;                  // no variance
    with_lmt = 100%;                // all responses have LMT
    expires = [nmt + const(0sec)];  // everything expires when modified
};
// we start with defining content properties for our server to generate
Content SimpleContent = {
    size = exp(13KB); // response sizes distributed exponentially
    cachable = 80%;   // 20% of content is uncachable
    obj_life_cycle = olcStatic;
};
The olcStatic definition was borrowed from the
contents.pg file distributed with Polygraph. Instead of copying the
definition into your workload, you can simply #include that file like
most standard workloads do. The contents.pg file contains other, more
sophisticated Object Life Cycle configurations.

4.1 Looking at binary logs

The binary logs created during the last test can be analyzed with the
lr (``Log Reader'') and lx (``Log Extractor'') tools
included in the Polygraph distribution. For example, let's get the response time
histogram and mean on the client side (after the experiment is over).

> lx --objects rep.rptm.hist /tmp/clt.log
rep.rptm.hist:
# bin   min   max   count     %   acc% 
    3     2     2      26  4.42   4.42
    4     3     3     160 27.21  31.63
    5     4     4      89 15.14  46.77
    6     5     5      81 13.78  60.54
    7     6     6      18  3.06  63.61
    8     7     7      21  3.57  67.18
    9     8     8      30  5.10  72.28
   10     9     9      20  3.40  75.68
   11    10    10      25  4.25  79.93
   12    11    11      21  3.57  83.50
   13    12    12      17  2.89  86.39
   14    13    13      10  1.70  88.10
   15    14    14      10  1.70  89.80
   16    15    15       3  0.51  90.31
   17    16    16       5  0.85  91.16
   18    17    21       5  0.85  92.01
   23    22    58       6  1.02  93.03
   72    71   234       6  1.02  94.05
  365   364  1267       7  1.19  95.24
  830  1268  1351       6  1.02  96.26
  854  1364  1475       5  0.85  97.11
  882  1476  2807       6  1.02  98.13
 1123  2832  2975       6  1.02  99.15
 1142  2984  3431       5  0.85 100.00
> lx --objects rep.rptm.mean /tmp/clt.log
rep.rptm.mean:           119.35
As you can see, 79.93% of responses had a response time of less than 11msec, but
about 5% of transactions took more than a second, increasing the mean response
time to 119msec.

You can get most of the aggregate stats collected during the experiment by
running lx with no --objects option.

> lx /tmp/clt.log
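
The percentages quoted from the histogram above can be rechecked from its count column. The short Python sketch below is independent of lx: it simply re-enters the min, max, and count columns printed above and sums them. The helper name share_at_most is our own.

```python
# (min_msec, max_msec, count) rows copied from the rep.rptm.hist output above
HIST = [
    (2, 2, 26), (3, 3, 160), (4, 4, 89), (5, 5, 81), (6, 6, 18), (7, 7, 21),
    (8, 8, 30), (9, 9, 20), (10, 10, 25), (11, 11, 21), (12, 12, 17),
    (13, 13, 10), (14, 14, 10), (15, 15, 3), (16, 16, 5), (17, 21, 5),
    (22, 58, 6), (71, 234, 6), (364, 1267, 7), (1268, 1351, 6),
    (1364, 1475, 5), (1476, 2807, 6), (2832, 2975, 6), (2984, 3431, 5),
]


def share_at_most(hist, limit_msec):
    """Percentage of responses in bins that end at or below limit_msec."""
    total = sum(count for _, _, count in hist)
    covered = sum(count for _, high, count in hist if high <= limit_msec)
    return 100.0 * covered / total


total_responses = sum(count for _, _, count in HIST)
fast_share = share_at_most(HIST, 10)            # under 11msec
slow_share = 100.0 - share_at_most(HIST, 1267)  # above 1267msec

print("responses in histogram: %d" % total_responses)
print("under 11msec: %.2f%%" % fast_share)
print("above 1267msec: %.2f%%" % slow_share)
```

The total matches the transaction count reported by polyclt at shutdown, and the two shares correspond to the acc% values the text refers to.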
Finally, you can generate a full-blown report using the binary log and the
Report Generator tools that come with Polygraph. You will probably want to run
a longer test to get better graphs though.

5. Specifying request rate
Simple robots are best-effort robots. A best-effort robot submits
the next request right after receiving a response to the previous one.
Best-effort robots are useless for most benchmarking tasks because you do not
want the request rate to be tied to transaction response time. In real traffic,
the two are usually orthogonal characteristics.

The following instructions will require modifying the workload file. We
strongly recommend that you copy simple.pg to a different file and
modify only that copy. Always keep distributed workload files unmodified. It
may not matter in this simple case, but it is a pain to spend hours debugging
a workload only to find out that you "temporarily" modified a file that the workload
is using but never reverted the changes.

It is simple to tell the robot to emit a realistic Poisson request stream
with a given mean rate. All you need to do is add a req_rate setting
to the robot configuration. Let's use a load of 1 request per second (per
robot).

We will also increase the number of robots to 10 by cloning the robot's address
10 times. Here is the new robot configuration.

Robot R = {
    kind = "R101";
    public_interest = 50%;
    pop_model = { pop_distr = popUnif(); };
    recurrence = 55% / SimpleContent.cachable; // adjusted to get 55% DHR
    req_rate = 1/sec;
    origins = S.addresses;      // where the origin servers are
    addresses = ['10.44.128.61' ** 10 ]; // use clone operator
};
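
To see why a fixed req_rate decouples load from response time, it may help to compare the two robot types numerically. The Python sketch below is only a model, not Polygraph code. It assumes that a best-effort robot keeps exactly one request outstanding and has no overhead of its own (so its rate is the inverse of the response time), and that a Poisson stream is produced from exponentially distributed gaps between requests. All function names are hypothetical.

```python
import random


def best_effort_rate(resp_time_sec: float) -> float:
    """Requests/sec of one robot that waits for each response before sending."""
    return 1.0 / resp_time_sec


def poisson_arrivals(rate_per_sec: float, duration_sec: float,
                     rng: random.Random) -> list:
    """Request times of a Poisson stream: exponential gaps with mean 1/rate."""
    times = []
    now = rng.expovariate(rate_per_sec)
    while now < duration_sec:
        times.append(now)
        now += rng.expovariate(rate_per_sec)
    return times


rng = random.Random(1)  # fixed seed so that reruns are repeatable
robots, rate, duration = 10, 1.0, 600.0
total = sum(len(poisson_arrivals(rate, duration, rng)) for _ in range(robots))
aggregate_rate = total / duration

print("best-effort at 100msec responses: %.1f req/sec" % best_effort_rate(0.100))
print("10 Poisson robots at 1/sec each: %.2f req/sec" % aggregate_rate)
```

In this model the first number depends entirely on the response time, while the second stays near 10 req/sec no matter how quickly responses arrive.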
Try using this new robot and see how the console output changes. You should see
a cumulative request rate of about 10 requests per second. There should be
more concurrent connections now because each robot can open several
connections (if response time is more than one second), and there are ten
robots. The response time may change as well.

If your device under test cannot handle a 10 req/sec load, decrease the per-robot
request rate or decrease the number of robots.

Most production Polygraph workloads use thousands of robots with very low
individual request rates (e.g., 0.4/sec) to simulate large end-user
populations. However, as the above examples demonstrate, you can create a
workload that matches your testing needs. We still recommend starting with
standard workloads so that you gain experience using what has been proven to
work before experimenting with custom designs.