The aliases option specifies the alias or aliases you
want to have on the given interface. Aka recognizes IP
addresses in PGL address format, including IP ranges. Aka
will try to guess the subnet, or you can use an explicit subnet
specification.
The number of aliases you can set depends on your OS. Moreover,
some OSes may support a large number of aliases (more than 1000)
but with a significant performance penalty. In our FreeBSD
environment, 500 aliases seem to be the limit after which
noticeable network performance degradation occurs.
Note that you can just put alias specs after all other options
and the interface name (see aka's usage line).
Aka will delete all old aliases before setting new ones.
If you do not specify the new aliases, the old ones will still be
deleted (handy for cleaning up after yourself).
polyclt --cfg_dirs <directory_names> |
polysrv --cfg_dirs <directory_names> |
polypxy --cfg_dirs <directory_names> |
The --cfg_dirs option specifies the list of directories
that are searched for the root configuration file as well as for PGL
#include files.
polyclt --config <filename> |
polysrv --config <filename> |
polypxy --config <filename> |
The --config option specifies the configuration file for
Polygraph clients and/or servers.
polyclt --console <filename> |
polysrv --console <filename> |
polypxy --console <filename> |
The --console option redirects console output to the
specified file.
The --count option specifies how many samples Polygraph should
take before producing the final histogram. Highly skewed or
heavy-tailed distributions usually require more samples to get a nice
histogram.
distr_test --distr <numDistr> |
The --distr option specifies the distribution to sample.
The syntax is identical to any other option that requires a
distribution value. You must use plain numbers (i.e., no time or size
scale) to specify distribution parameters (if any).
Note that drawn random values are truncated to integers before
being counted in a histogram. This approach mimics the usual
Polygraph run-time behavior. Because of the truncation, special care
should be taken to specify large enough numbers as the parameters for
the distribution.
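The effect of truncation can be seen in a minimal Python sketch. This is not Polygraph code: the exponential distribution, the sample count, and the helper name truncated_histogram are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def truncated_histogram(mean, count, seed=1):
    """Draw `count` exponential samples and truncate each one to an integer."""
    rng = random.Random(seed)
    return Counter(int(rng.expovariate(1.0 / mean)) for _ in range(count))

# With a small mean, nearly all samples collapse into a handful of integers.
small = truncated_histogram(mean=0.5, count=10000)
# With a large mean, truncation loses very little detail.
large = truncated_histogram(mean=500.0, count=10000)

print(len(small), "distinct values with mean 0.5")
print(len(large), "distinct values with mean 500")
```

The first histogram has only a few distinct values, so most of the distribution shape is lost. Scaling the parameters up avoids that.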
polyclt --dump <strings> |
polysrv --dump <strings> |
polypxy --dump <strings> |
The --dump option controls what messages and what parts of
messages should be printed on the console. Possible message types are:
req[uest], rep[ly], and err[or]. Possible
message parts are hdr (for ``header'') and body.
Dumping of message bodies is not supported at the time of writing.
Here are some examples.
Dump requirement | Dump option |
all request headers | --dump req-hdr |
all requests and the headers of erroneous messages | --dump req-hdrs,err-hdrs |
everything about errors | --dump errs |
all headers | --dump hdrs |
Note that sometimes Polygraph does not have the required data at
the time of dump. Polygraph will try to at least provide some meta
information about the message then.
If a message part matches both negative (error) and positive (reply
or request) type masks, the part may be printed twice.
polyclt --dump_size <size> |
polysrv --dump_size <size> |
polypxy --dump_size <size> |
The --dump_size option limits the size of an individual
message dump. It is particularly useful when dumping message bodies.
polyclt --fake_hosts <IPs> |
polysrv --fake_hosts <IPs> |
polypxy --fake_hosts <IPs> |
The --fake_hosts option instructs Polygraph to use the given
addresses instead of looking up real network interfaces for available
local addresses.
A Polygraph configuration file binds robots and servers to specific
IP addresses. Normally, Polygraph scans the list of network interfaces
available on the host to determine which robots and servers to start.
However, the default scan relies on semi-portable system calls and may
not work correctly (or at all) on some platforms.
To disable the network interface scan, use the
--fake_hosts option. The specified list of IP addresses will
be used instead of the one obtained from the operating system.
polyclt --fd_limit <int> |
polysrv --fd_limit <int> |
polypxy --fd_limit <int> |
The --fd_limit option decreases the default file descriptor
limit.
Polygraph determines the maximum number of available file
descriptors using the getrlimit(RLIMIT_NOFILE) system call.
It then attempts to raise the current limit to that maximum using the
setrlimit(RLIMIT_NOFILE) system call. The resulting limit
(actually about 97% of it) is then used as the
Polygraph internal FD limit.
Polygraph will not attempt to create more TCP sockets than its
internal limit. However, some OSes are known to be unhappy when a
process is close to the limit. In a non-production benchmarking
environment, there may also be competition for file descriptors with
other processes. The fd_limit option can be used to
lower the internal limit even further.
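The limit calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Polygraph implementation: the function name internal_fd_limit and its parameters are hypothetical, and the 97% figure is taken from the text. The sketch only reads the OS limit and does not call setrlimit.

```python
import resource

def internal_fd_limit(os_limit, requested=None, usable_percent=97):
    """Return the internal FD limit derived from the OS limit.

    `requested` models a user-supplied value: it may lower the
    limit but can never raise it above what the OS allows.
    """
    limit = os_limit * usable_percent // 100
    if requested is not None:
        limit = min(limit, requested)
    return limit

# Read the limits the OS reports for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# If the hard limit is unlimited, fall back to the soft limit.
os_limit = soft if hard == resource.RLIM_INFINITY else hard
print("internal limit:", internal_fd_limit(os_limit))
print("lowered limit:", internal_fd_limit(os_limit, requested=256))
```

Note that a requested value larger than the OS limit has no effect in this sketch, which mirrors the rule that the option cannot raise the limit.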
Polygraph should stop opening new connections if the internal FD
limit is reached.
One cannot raise the FD limit using the fd_limit
option! The original limit is reported by the operating system and
must be changed first. Different OSes will require different
techniques for raising the file descriptor limits. Some well-known
hacks can be found elsewhere.
Remember to reconfigure and recompile Polygraph from scratch if you
change OS limits. At configuration time, Polygraph will try to open as
many files as possible to find out the actual OS limits.
polyclt --file_scan <string> |
polysrv --file_scan <string> |
polypxy --file_scan <string> |
The --file_scan option selects the system call to use for
scanning ready files. The two valid values are select and
poll. Poll is used by default, if available.
Most Unix operating systems have at least two system calls to
detect ``ready'' file descriptors: poll(2) and
select(2). See manual pages for your OS for details about
these system calls.
The file scanning method may affect the performance of Polygraph under
heavy loads or when working with a large number of file descriptors.
The effect is probably limited to how fast Polygraph can scan all
ready files. The ``best'' system call to use depends on the OS and
the load on Polygraph. We suspect that performance differences are
marginal in many general cases. Experiment if you want to double check
your environment.
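For readers unfamiliar with the two system calls, here is a small Python sketch that asks both select(2) and poll(2) which sockets are ready for reading. It uses Python's standard select module rather than anything from Polygraph, the helper names are hypothetical, and select.poll is not available on every platform (for example, it is missing on Windows).

```python
import select
import socket

def ready_with_select(socks):
    """Return the file descriptors select(2) reports as readable."""
    readable, _, _ = select.select(socks, [], [], 0)
    return {s.fileno() for s in readable}

def ready_with_poll(socks):
    """Return the file descriptors poll(2) reports as readable."""
    poller = select.poll()
    for s in socks:
        poller.register(s, select.POLLIN)
    return {fd for fd, event in poller.poll(0) if event & select.POLLIN}

# One socket pair carries data, the other stays idle.
a, b = socket.socketpair()
idle_a, idle_b = socket.socketpair()
a.sendall(b"x")  # makes b readable

expected = {b.fileno()}
select_ready = ready_with_select([b, idle_b])
poll_ready = ready_with_poll([b, idle_b])
print("same answer:", select_ready == poll_ready)

for s in (a, b, idle_a, idle_b):
    s.close()
```

Both calls answer the same question. They differ in how the set of descriptors is passed to the kernel, which is where any performance difference would come from.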
polyclt --help |
polysrv --help |
polypxy --help |
lr --help |
lx --help |
ltrace --help |
distr_test --help |
The --help option displays a command line usage summary.
distr_test --hist_step <%> |
The --hist_step option tells distr_test the size
of a histogram bin (as a percentage of the total contribution). For
example, a value of 1% would lead to 100 lines per
histogram while a 5% step results in 20 lines.
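The relationship between step size and histogram length can be sketched in Python. This sketch assumes that each bin holds the given percentage of all samples. That is one reading of "total contribution" and may not match what distr_test actually does. The function name histogram_bins is hypothetical.

```python
def histogram_bins(samples, step_percent):
    """Split sorted samples into bins that each hold step_percent of them.

    Returns a list of (smallest value, largest value, sample count) tuples,
    one per histogram line.
    """
    ordered = sorted(samples)
    per_bin = max(1, len(ordered) * step_percent // 100)
    bins = []
    for start in range(0, len(ordered), per_bin):
        chunk = ordered[start:start + per_bin]
        bins.append((chunk[0], chunk[-1], len(chunk)))
    return bins

samples = list(range(1000))
fine = histogram_bins(samples, 1)    # 1% step
coarse = histogram_bins(samples, 5)  # 5% step
print(len(fine), "lines versus", len(coarse), "lines")
```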
polyclt --host_type |
polysrv --host_type |
polypxy --host_type |
The --host_type option displays information about the build
environment of the executable.
polyclt --icp_tout <time> |
The --icp_tout option specifies how long to wait for an
ICP_HIT reply before declaring an ICP miss condition.
polyclt --idle_tout <time> |
polysrv --idle_tout <time> |
polypxy --idle_tout <time> |
The --idle_tout option specifies a finite time a Polygraph
process should wait for some network activity. If no network activity
happens within the specified time, Polygraph will stop the simulation
with an ``inactivity timeout'' message.
Polygraph processes will usually stop simulation when all phases
reach their goals. Sometimes a phase has a goal that cannot be
reached. Sometimes network or other external conditions stall all
pending transactions. In these and similar situations it is often
desirable to stop the simulation even if not all phases are
completed.
Specifying an idle timeout on the client side is usually not a good
idea because robots create their own traffic, never allowing the
timeout to happen, regardless of the network conditions.
Starting with version 2.6, polysrv uses an idle timeout of
5 minutes by default.
The --if option specifies the name of the network
interface (e.g., fxp0 or eth1). On many operating
systems, you can get a list of all available interfaces by running
the ifconfig -a command.
The interface name must be specified for aka to work.
polyclt --ign_bad_cont_tags <bool> |
The --ign_bad_cont_tags option tells robots to ignore bad
content tags that they may find inside response bodies. Polygraph
uses semi-custom markup tags to identify embedded objects (similar to
<img> tags in HTML). When the content contains tags that confuse
Polygraph (e.g., realistic content simulation is enabled on the server
side), you might want to use this option.
polyclt --ign_false_hits <bool> |
polypxy --ign_false_hits <bool> |
The --ign_false_hits option instructs robots to ignore
false hits.
Polygraph knows what objects it has requested during the current
test. If a robot detects a hit on an object that was requested only
once (the current request), it can complain about a ``false
hit'' and register a transaction error.
However, there are many situations when false hits are not really
``false''. For example, two requests for a previously unseen object
may be submitted very close to each other. If a proxy or server
reorders the replies, Polygraph will think that the first reply is a
false hit.
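A simplified model of false hit detection is sketched below in Python. It is based only on per-object request counts, which is an assumption made for illustration. The class and method names are hypothetical, and the real bookkeeping in Polygraph is likely more involved (this sketch does not model reply reordering, for instance).

```python
from collections import Counter

class FalseHitDetector:
    """Flags hits on objects that have been requested only once."""

    def __init__(self):
        self.request_counts = Counter()

    def on_request(self, object_id):
        self.request_counts[object_id] += 1

    def is_false_hit(self, object_id, reply_is_hit):
        # A hit on an object whose only request is the current one
        # cannot come from an earlier request in this test.
        return reply_is_hit and self.request_counts[object_id] == 1

detector = FalseHitDetector()

detector.on_request("obj1")
lone_hit = detector.is_false_hit("obj1", reply_is_hit=True)

detector.on_request("obj2")
detector.on_request("obj2")
repeat_hit = detector.is_false_hit("obj2", reply_is_hit=True)

print("first request hit flagged:", lone_hit)
print("repeated request hit flagged:", repeat_hit)
```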
By default, Polygraph will not complain about false hits.
polysrv --ign_urls <bool> |
The --ign_urls option instructs polysrv to
generate content regardless of the URLs in the requests
polysrv receives. This mode is useful when URLs are not
generated by Polygraph robots or are otherwise inappropriate for the
server to interpret.
polyclt --label <string> |
polysrv --label <string> |
polypxy --label <string> |
The --label option allows you to assign a string label to
a run. The label gets logged (as any other option value) and is also
included in notification messages. The latter is useful if you are
running several experiments and want polymon to distinguish
them by a short ``name''.
Note that notification messages may truncate labels.
polyclt --log <filename> |
polysrv --log <filename> |
polypxy --log <filename> |
The --log option tells Polygraph to preserve detailed
measurements and various messages in a binary log file. The file can
then be analyzed by tools such as lx and
ltrace.
The log file size depends primarily on the duration of the test and
on the number of simulated agents in the test. Long, large-scale tests
may easily produce logs of 5 to 10MB in size (per Polygraph
process).
polyclt --log_buf_size <size> |
polysrv --log_buf_size <size> |
polypxy --log_buf_size <size> |
The --log_buf_size option specifies the buffer size for the
binary log. Polygraph periodically flushes logged data to disk and can
resize logging buffers on demand, so large buffer sizes are not
needed.
The only known case when this option might be useful is when
Polygraph runs out of memory when trying to log phase statistics at
the end of the test (phase stats objects are large and may require
buffer resizing that may lead to insufficient memory).
polyclt --notify <addr> |
polysrv --notify <addr> |
polypxy --notify <addr> |
The --notify option instructs Polygraph to send status messages
to a daemon on the specified host. The messages are small (about 128
bytes) and are sent using UDP, ensuring negligible overhead.
The messages are emitted every stats cycle.
The Polygraph distribution comes with a listening daemon
(udp2tcpd) and an interactive monitoring program
(polymon).
Monitoring capabilities are very handy if you want to watch your
experiments closely, but do not want to create extra load on the
Polygraph machines. Polymon is also helpful in monitoring
several independent concurrent experiments.
The --objects option specifies the names of objects to
extract from the binary log file. Binary logs store a lot of
information. Lx calls a self-contained piece of info an
object. Objects may be as simple as a single integer
number or as complex as a distribution histogram.
To extract all known objects, omit the --objects
option or use the magic object name ``All''.
lr --out <filename> |
lx --out <filename> |
ltrace --out <filename> |
The --out option specifies a file where the results of the
program execution should be sent.
The --phases option selects which portion of the log
(corresponding to a phase in the PGL schedule) will be analyzed by
lx.
polyclt --ports <port_range> |
The --ports option instructs polyclt to
explicitly bind(2) a socket to a specific port before making
a connect(2) request. The actual port is selected from the
specified range, using an LRU approach. If the bind(2) call
fails, the port is marked as ``used'' and is never tried again unless
there are no ``unused'' ports left. Polygraph also keeps a map of the
ports it is currently using to avoid binding to the same port
twice.
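The port selection rules above can be sketched as a small Python class. This is an illustration of the described policy only. The class name PortPool, its methods, and the try_bind callback (which stands in for a real bind(2) call) are hypothetical.

```python
from collections import deque

class PortPool:
    """LRU port selection with a fallback list of ports whose bind failed."""

    def __init__(self, first, last):
        # Least recently used ports sit at the left end of the queue.
        self.unused = deque(range(first, last + 1))
        self.used = deque()   # ports marked ``used'' after a failed bind
        self.in_use = set()   # ports currently bound by this process

    def acquire(self, try_bind):
        # Try ``unused'' ports first; fall back to ``used'' ones only
        # when no ``unused'' ports are left.
        for queue in (self.unused, self.used):
            for _ in range(len(queue)):
                port = queue.popleft()
                if try_bind(port):
                    self.in_use.add(port)
                    return port
                self.used.append(port)
        return None

    def release(self, port):
        # A released port becomes the most recently used one.
        self.in_use.remove(port)
        self.unused.append(port)

pool = PortPool(5000, 5002)
blocked = {5000}  # pretend another process owns port 5000

def can_bind(port):
    return port not in blocked

first = pool.acquire(can_bind)
second = pool.acquire(can_bind)
pool.release(first)
third = pool.acquire(can_bind)
blocked.clear()   # port 5000 becomes available again
fourth = pool.acquire(can_bind)
print(first, second, third, fourth)
```

Ports that are currently bound are in neither queue, so the pool cannot hand out the same port twice.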
The default port range used by OSes for ephemeral ports is often
rather small. Thus, an application is likely to run out of available
ports if the request rate is high and explicit binding is not used.
To reduce the number of run-time conflicts, Polygraph pre-scans the
given port range to find invalid ports. The scan may add a couple of
seconds to polyclt startup time.
polyclt --priority_sched <int> |
polysrv --priority_sched <int> |
polypxy --priority_sched <int> |
The --priority_sched option specifies the priority level
for urgent socket operations. Higher levels allow Polygraph to
scan just the active sockets more often (at the expense of
potentially delaying processing for sockets that used to be
inactive but changed their status).
The default should be acceptable for most environments.
polyclt --prn_false_misses <bool> |
polysrv --prn_false_misses <bool> |
polypxy --prn_false_misses <bool> |
The --prn_false_misses option dumps reply headers of false
misses. Polygraph knows what objects it has requested during the
current test. If a robot detects a miss on an object that was requested
before, it marks the transaction as a ``false miss''. False
misses are not errors, but a possible indication that a proxy did not
cache an object when it had a chance to do it (or purged a cached
object).
False miss information is often helpful for debugging a proxy or
workload. However, because many false misses are a part of the normal
HTTP operation in a distributed environment, it may take some time to
find real proxy mistakes in a large trace.
The --proxy option instructs all robots of the
polyclt process to use proxy address as the next-hop address
of all requests and to use Robot::origins names in request
URLs.
When the --proxy option is not given, robots use
Robot::origins addresses as the next-hop addresses and use
only the path component in request URLs.
Note that the origin address is always copied to the
Host: HTTP header.
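The difference between the two modes can be illustrated with a short Python sketch. It assumes that "origin names in request URLs" means an absolute http:// URL on the request line. The function name build_request, the HTTP version, and the addresses are hypothetical and serve only as an example.

```python
def build_request(origin, path, proxy=None):
    """Return (next-hop address, request text) for one robot request."""
    if proxy:
        # Proxied: connect to the proxy and name the origin in the URL.
        next_hop = proxy
        target = "http://" + origin + path
    else:
        # Direct: connect to the origin and send the path only.
        next_hop = origin
        target = path
    # The origin address is copied to the Host header in both cases.
    request = "GET " + target + " HTTP/1.1\r\nHost: " + origin + "\r\n\r\n"
    return next_hop, request

# Hypothetical addresses used only for this example.
origin = "10.0.1.1:80"
proxy = "10.0.0.9:3128"

hop_proxied, req_proxied = build_request(origin, "/w1/obj", proxy)
hop_direct, req_direct = build_request(origin, "/w1/obj")
print(hop_proxied, req_proxied.splitlines()[0])
print(hop_direct, req_direct.splitlines()[0])
```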
The presence of the --proxy option is often the only
Polygraph-side configuration difference between ``forward proxying''
and ``transparent redirection'' bench setups. The option is also
useful for running no-proxy tests to verify bench setup.
polyclt --rng_seed <int> |
polysrv --rng_seed <int> |
polypxy --rng_seed <int> |
After version 2.6.2 was released, the --rng_seed option
was removed in favor of the --glb_rng_seed and
--lcl_rng_seed options.
The --rng_seed option initializes the general purpose r.n.g.
with a specified seed. By varying the seed, one can test how
susceptible to random noise the simulation is.
polyclt --glb_rng_seed <int> |
polysrv --glb_rng_seed <int> |
polypxy --glb_rng_seed <int> |
The --glb_rng_seed option initializes the ``global'' r.n.g.
with a specified seed. The global r.n.g. affects objects with global
scope (i.e., objects that have to be the same regardless of which
Polygraph process is generating them). For example, a URL extension,
while "random", must be the same for the same object ID regardless of
the process that generates the URL. Thus, all processes within a test
must have the same global r.n.g. seed.
The default seed value is 1. By varying the seed, one can test how
susceptible to random noise the simulation is.
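The idea of a "random" but reproducible property can be sketched in Python. The way the seed and the object ID are combined here, the extension list, and the function name url_extension are assumptions made for illustration and do not describe the Polygraph algorithm.

```python
import random

EXTENSIONS = [".html", ".gif", ".jpg", ".txt"]

def url_extension(global_seed, object_id):
    """Pick an extension that depends only on the seed and the object ID."""
    rng = random.Random(global_seed * 1_000_003 + object_id)
    return rng.choice(EXTENSIONS)

# Two "processes" with the same global seed visit objects in different orders.
client_view = [url_extension(1, oid) for oid in range(10)]
server_view = [url_extension(1, oid) for oid in reversed(range(10))][::-1]
print("views agree:", client_view == server_view)
```

Because the result depends only on the seed and the object ID, processes sharing the seed agree on every object regardless of the order in which they generate URLs.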
polyclt --lcl_rng_seed <int> |
polysrv --lcl_rng_seed <int> |
polypxy --lcl_rng_seed <int> |
The --lcl_rng_seed option initializes the ``local'' r.n.g.
with a specified seed. The local r.n.g. affects events with local scope
(i.e., events that should differ from one process to another). For
example, a "think time" delay after receiving the 100th request should
be different on different servers. Using equal seeds may lead to
lock-step behavior among cooperating processes. Thus, all processes
within a test should have different local r.n.g. seeds.
The default seed value is 1. By varying the seed, one can test how
susceptible to random noise the simulation is.
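The lock-step effect of equal local seeds is easy to reproduce with any seeded generator. The Python sketch below uses the standard random module. The exponential think time and the function name think_times are assumptions made for illustration.

```python
import random

def think_times(seed, count=5):
    """Generate a sequence of think time delays from a seeded generator."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) for _ in range(count)]

server_a = think_times(seed=1)
server_b = think_times(seed=1)  # same seed as server_a
server_c = think_times(seed=2)  # different seed

print("a and b in lock-step:", server_a == server_b)
print("a and c in lock-step:", server_a == server_c)
```

Servers a and b delay every reply by exactly the same amount, which is the behavior that distinct local seeds are meant to avoid.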
polyclt --sample_log <filename> |
polysrv --sample_log <filename> |
polypxy --sample_log <filename> |
The --sample_log option specifies the name for a
stand-alone binary log file that captures PGL-configured stat
samples.
polyclt --sample_log_buf_size <size> |
polysrv --sample_log_buf_size <size> |
polypxy --sample_log_buf_size <size> |
The --sample_log_buf_size option specifies the buffer size for
the sample log. See the --log_buf_size
description for related caveats.
lx --side <string> |
ltrace --side <string> |
The --side option specifies the name of the ``side'' to
extract. Valid values are ``clt'', ``srv'', and ``all''. This option
is only useful for polypxy logs because other logs have just
one ``side'', and log extracting tools can guess what that side
is.
ltrace --smooth_slide <bool> |
If set, the --smooth_slide option instructs
ltrace to use a sliding window (with a sliding step of one log
entry) for averaging log entries as opposed to jumping from one group
of entries to the next. This option is useful for building smooth,
albeit less precise, graphs.
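The two averaging modes can be compared with a short Python sketch. For simplicity, the window here is measured in log entries rather than in time, and the function names are hypothetical.

```python
def jumping_averages(values, window):
    """Average consecutive, non-overlapping groups of `window` entries."""
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, window)]

def sliding_averages(values, window):
    """Average a window that advances by one entry at a time."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

values = [1, 2, 3, 4, 5, 6]
jumping = jumping_averages(values, 3)
sliding = sliding_averages(values, 3)
print("jumping:", jumping)
print("sliding:", sliding)
```

The sliding variant produces more points and a smoother curve, but neighboring points share most of their entries and are therefore not independent.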
polyclt --stats_cycle <time> |
polysrv --stats_cycle <time> |
polypxy --stats_cycle <time> |
The --stats_cycle option specifies the duration of a
statistical interval cycle (5sec by default). Shorter cycles give more
precise statistics but result in larger binary logs.
polyclt --sync_phases <bool> |
polysrv --sync_phases <bool> |
polypxy --sync_phases <bool> |
The --sync_phases option instructs Polygraph to
synchronize phase schedules among remote processes. For
synchronization to make sense, all processes must use the same PGL
phase schedules and, ideally, the same configuration files.
Synchronization is implemented on top of HTTP transactions; it works
fine when all processes are running and when transactions have
reasonable response times. Polygraph is likely to get stuck in a phase
if one of the processes quits or experiences severe performance
problems.
Phase synchronization is on by default.
ltrace --sync_times <bool> |
The --sync_times option tells ltrace to adjust
local log time as if all logs started at once. The adjustment happens
as the logs are read and before stats are reported; no log
modification is performed.
This option is useful for processing logs from machines with
de-synchronized clocks.
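One way to picture the adjustment is the Python sketch below. It assumes that every log is shifted so that its first entry lines up with the earliest start among all logs. That is an assumption about the alignment rule, and the function name sync_times and the sample timestamps are hypothetical.

```python
def sync_times(logs):
    """Shift each log so that all logs appear to start at the same time.

    `logs` maps a log name to a list of timestamps in seconds.
    The input is left untouched; adjusted copies are returned.
    """
    common_start = min(stamps[0] for stamps in logs.values() if stamps)
    return {
        name: [t - stamps[0] + common_start for t in stamps]
        for name, stamps in logs.items()
    }

# The server clock in this example runs 30 seconds ahead of the client clock.
logs = {"clt": [100, 105, 110], "srv": [130, 135]}
adjusted = sync_times(logs)
print(adjusted)
```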
ltrace --time_unit <time> |
The --time_unit option has two effects: ltrace
reports timestamps as time since the test start, and that relative
time is reported in the specified units or scale. By default, absolute
time (seconds since the Unix epoch) is reported.
For example, to get ltrace to display relative timestamps
at one minute scale, use --time_unit 1min.
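The conversion amounts to simple arithmetic, shown here as a Python sketch. The function name relative_time and the sample timestamps are hypothetical.

```python
def relative_time(timestamp, test_start, unit_seconds):
    """Convert an absolute timestamp to time since test start, in units."""
    return (timestamp - test_start) / unit_seconds

test_start = 1000000000  # absolute time of the first log entry, in seconds
entry_time = 1000000090  # an entry logged 90 seconds into the test

minutes = relative_time(entry_time, test_start, unit_seconds=60)
print(minutes, "minutes since test start")
```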
polyclt --unique_world <bool> |
polysrv --unique_world <bool> |
polypxy --unique_world <bool> |
The --unique_world option instructs Polygraph to generate
URLs that are very unlikely to be used by other Polygraph invocations.
This option is on by default. The only known practical reason to
disable unique worlds is when two tests must produce the same set of
URLs. In the latter case, the random number generator seed
essentially identifies the set.
polyclt --verb_lvl <int> |
polysrv --verb_lvl <int> |
polypxy --verb_lvl <int> |
The --verb_lvl option specifies how much info will be
printed to the console during a test. Normally, level zero will have
only a couple of lines per run. Level one will allow for interval and
phase stats lines to be printed. A negative level will disable any
output.
Most errors are reported with level zero.
Most tests can be run with level 5 or lower verbosity, but it is a
good idea to raise the verbosity level to 10 if you are having
problems.
Regardless of the verb_lvl setting, all console messages
are duplicated in the binary log.
polyclt --version |
polysrv --version |
polypxy --version |
The --version option displays the Polygraph distribution
version.
The --win_len option specifies the length of an averaging
window (in terms of time) that ltrace is using with the
--smooth_slide option.