Types
PGL supports many generic and domain-specific types.
AddrMap, AddrScheme, Agent, Bench, BenchSide, Cache, ClientBehavior, Content, DnsResolver, DutState, Goal, Mime, ObjLifeCycle, Phase, PolyMix3As, PolyMix4As, PopDistr, PopModel, Proxy, Robot, Range, SingleRange, MultiRange, Rptmstat, Server, Session, SpreadAs, SrvLb4As, SslWrap, StatSample, TmSzStatSample, HrStatSample, KerberosWrap, AggrStatSample, LevelStatSample, StatsSample, WebAxe4As, addr, DynamicName, array, bool, bwidth, distr, float, int, list, rate, selector, size, Socket, string, time, uniq_id
Detailed descriptions for supported types are given below. Most types are
"structures" containing several fields. PGL has no facility to declare new
types.
AddrMap objects provide mapping of network addresses
(domain names or IPs) to IP addresses. The former are usually the
addresses that origin servers are visible as (e.g., a VIP address of a
L4 switch doing origin server load balancing). The latter are usually
IP addresses of simulated server agents.
AddrMap map1 = {
zone = "hosting.com";
addresses = '192.168.0.1-10:8080';
names = 'host[1-10].hosting.com:8080';
};
...
use(map1);
The zone field is not used by Polygraph run-time code, but
can be used by external programs such as dns_cfg to build
zone files based on a PGL configuration file.
Many non-overlapping maps can be use()d in one experiment.
The names field may contain IP addresses. The
addresses field must contain IP addresses only.
AddrMap vip1 = {
addresses = '192.168.1.1-10:8080';
names = '10.0.0.1:80';
};
AddrMap vip2 = {
addresses = '192.168.2.1-10:8080';
names = '10.0.0.2:80';
};
...
use(vip1, vip2);
Currently, only 1:1 and 1:N mappings are supported. An unmapped
name maps to itself by default (it has to be an IP address in that
case, of course).
Needless to say, your DNS server should be able to resolve the
names used in your PGL file.
More information about using domain names and configuring your DNS
server is available elsewhere.
AddrScheme is a base type for various algorithms that compute agent
addresses based on the workload type and bench configuration. There is
at least one *As addressing scheme type for each workload that supports
automatic address calculation (e.g., the PolyMix4As type for the
PolyMix-4 workload).
The kind field is used as a label to distinguish
addressing schemes when the exact scheme type is unknown.
The following addressing schemes are supported: SpreadAs, PolyMix4As, WebAxe4As, SrvLb4As, PolyMix3As.
Agent is a base type for PGL robots, servers, and proxies.
In other words, agents have properties common to those three
types. Usually, you will not use the agent type directly, but
knowing its properties helps in robot and server
manipulation.
The kind field is a label used for information
purposes only.
The xact_think field determines "transaction think
time". Servers "think" after accepting a connection and before
reading request headers. Client-side "think time" is not supported
in favor of request rate or request interarrival time settings.
The http_versions selector determines the
agent's HTTP version. Two versions are supported: "1.0" and "1.1".
The latter is the default. The selection is sticky for the lifetime
of an agent. The version affects protocol version in request-lines of
HTTP requests generated by Polygraph robots and status-lines of HTTP
responses generated by Polygraph servers. This knob has no effect on
other defaults. For example, you still need to explicitly enable
persistent connections, even if you are using HTTP/1.1 agents.
This knob is available starting with Polygraph version 2.8.0.
An HTTP connection will never carry more than pconn_use_lmt
requests. Persistent connections are disabled by default. To
explicitly disable persistent connections, set the use limit to
const(1). To place virtually no limit on the number of
requests per connection, set the use limit to const(2147483647).
Note that a connection may be closed for reasons other than
pconn_use_lmt.
The idle_pconn_tout field specifies the delay
after which an idle persistent connection (i.e., a
connection with no pending messages) will be closed.
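For example, a Robot (which inherits these Agent fields) could be
configured to keep connections persistent roughly as follows; the limit
and timeout values below are illustrative, not defaults:
Robot R = {
...
pconn_use_lmt = const(2147483647); // virtually unlimited requests per connection
idle_pconn_tout = 15sec; // close connections idle for 15 seconds
};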
The abort_prob field specifies the probability that an
HTTP transaction will be aborted. To abort a transaction, an agent
closes the corresponding HTTP connection. At the time of writing,
aborts are supported when handling HTTP message bodies only. If the
transaction is to be aborted, the agent selects abort offset using a
uniform distribution. The transaction is then aborted when at least
offset bytes of the message body are read and/or written (from the
application point of view). Aborts are not considered errors on
aborting side but are likely to look like ones for the agent on the
other side of the transaction.
Aborted client connections get into a TIME_WAIT state and may exhaust
TCP source ports and other resources on untuned client drones.
The addresses field tells Polygraph what IP
addresses the agent should bind itself to. Essentially, the
agent will duplicate itself to have one self-sustained clone
per IP address. An address may be repeated to start several
agents (agent clones) bound to the same address.
Pop_model affects various URL selection algorithms. For
example, Polygraph robots use this model to select an old URL that
should be repeated (to produce a hit). Servers use the model to select
old URLs to put in the Location: field of redirection
responses (e.g., "302 Found").
The socket field specifies TCP/IP socket options
for TCP sockets used by the agent.
The world identifier is used to mark agent-specific
URLs or content. Manually setting this field may help to
reproduce the exact conditions of past experiments, but there are
better ways to do that.
Cookie_sender probability determines the chances that a
given Polygraph agent sends cookies. The selection of a cookie-sending
status is done at agent start time and is sticky (does not change).
HTTP servers send cookies using the Set-Cookie header. HTTP
clients (Polygraph robots) send cookies using the Cookie
header. Both servers and robots have parameters that further affect
cookie handling, but the cookie-sending status is always checked
first.
If the cookie sending probability is zero, no agents within the given
configuration group will send cookies. If the cookie sending probability
is 50%, then roughly half of the agents will be sending
cookies (agent-specific parameters permitting).
Cookie sending functionality has been added to Polygraph version
3.0. The default value for Polygraph versions older than 4.0.7 is
zero. For Polygraph version 4.0.7 and newer, the default value of the
cookie_sender parameter depends on the agent type. For
robots, it is 100%. For servers, the default depends on the
cookie_set_prob parameter. If cookie_set_prob is set
and is positive, the default for cookie_sender is
100%. Otherwise it is zero.
Proxy agents currently ignore all but the addresses
field of their parent type.
The Bench maintains information about the benchmarking
environment (e.g., the number of physical hosts available for the test)
and test parameters such as peak request rate. Information is maintained
on a per-side basis.
As with any other PGL object, an object of type Bench must appear
(directly or indirectly) as an argument of a PGL function or procedure
call to have any effect.
BenchSide maintains configuration information about
client, server, or proxy side of the bench.
The max_host_load field specifies the maximum load
(requests/responses per unit of time) that a physical host should
generate/sustain. Given peak_req_rate of the bench, this field
determines the number of hosts required for the simulation (on one
"side" of the bench).
The max_agent_load field specifies the maximum load
(requests/responses per unit of time) that a simulated agent should
generate/sustain. Given max_host_load, this field determines the
maximum number of agents per host on one "side" of the bench. The actual
number of agents depends on the peak_req_rate.
Addr_space defines an array of addresses for various address
allocation schemes to pick agent addresses from. For example, a PolyMix-4
addressing scheme may pick the first 500 addresses from the provided space
to assign to agents on the first test box. The space addresses often
include interface names and subnet information to assist Polygraph in
creation of the corresponding IP aliases.
Addr_mask is used by various old address allocation schemes to
generate agent addresses. Only the first two octets (aka "network
number") of the mask are honored. Use addr_space instead if
possible.
The addresses field defines a list of IP aliases that
Polygraph should create. These aliases should have the interface name and
subnet information. In most cases, this field is not needed as Polygraph
can get the same information by concatenating the agent addresses fields.
See Run-time address creation section on the Addresses page for more
information.
The Cache type is used to configure a proxy cache.
The capacity field specifies the maximum size of the
cache. When the sum of content lengths of all cached objects exceeds
the configured capacity, some objects may be purged to free space for
the incoming traffic. Setting capacity to zero effectively disables
the cache.
When set, icp_port instructs the cache object to listen
for ICP queries on the specified port and reply to those queries
according to the cache contents. At the time of writing, misses are
replied with the miss-no-fetch ICP opcode.
The cache admission policy admits every cachable object that is at most
capacity in size. The replacement policy is LRU.
Polygraph allocates about 80 bytes of housekeeping
information per cache entry and assumes that average object size is
10KB. It is a good idea to make sure that your benchmarking
environment has more than enough memory for the configured cache
capacity.
The Polygraph cache does not store object content, of course. If
needed, "cached" content can be generated from scratch, using the
corresponding origin server configuration. This content regeneration
is the responsibility of the proxy's server side. If you are using the
cache, make sure that the origin servers in the PGL proxy
configuration file are exactly the same as the origin servers used in
the experiment!
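A minimal Cache configuration might look like the sketch below; the
capacity and port values are illustrative, not defaults:
Cache theCache = {
capacity = 4GB; // illustrative; setting this to zero would disable the cache
icp_port = 3130; // reply to ICP queries on this port
};
Such an object can then be referenced from a Proxy configuration.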
The ClientBehavior type is used by the
client_behavior field of Content objects to
configure content-driven Robot behavior. Workloads using
content-driven Robots are discussed elsewhere.
ClientBehavior fields are a subset of Robot
fields. Please refer to the Robot PGL type
reference for their documentation.
In the future, more Robot fields may be added to
ClientBehavior. Please submit patches or let developers know
if you are interested in particular ClientBehavior
properties.
This PGL type is available starting with Polygraph v4.3.0. Support
for content-driven recurrence is available since Polygraph
v4.4.0.
The Content type accumulates details about such Web object
properties as MIME type, size, cachability, etc.
The checksum field specifies probability that an entity
will have an MD5 checksum computed and attached to the response using
HTTP Content-MD5 header field. For all HTTP responses with Content-MD5
headers, Robots calculate an MD5 checksum from scratch and compare it
with the value in the header. Mismatches are reported as errors.
Since MD5 computation is CPU-intensive, setting the checksum
field to high values may slow down server and client processes.
Please note that the standard MD5 algorithm (no secret salt) is used and
that Robots trust the received Content-MD5 headers. Thus, an
intermediary can attach its own header to cause verification on the
client side or can alter the content and the header to avoid checksum
mismatch errors. Using checksum may be useful when a proxy
is suspected of accidentally (unknowingly) altering the content.
The recurrence field is ignored. Use
bhr_discrimination setting of the popularity model
instead.
The may_contain field specifies embedded types that the
content type may contain. For example, HTML objects may contain
various images and audio files.
The embedded_obj_cnt distribution is used to determine the
number of embedded objects in the container of the corresponding
content type.
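For example, an HTML content type containing embedded images might be
sketched as follows; the field values are illustrative, not recommended
settings:
Content cntImage = {
kind = "image";
mime = { type = "image/gif"; extensions = [ ".gif" ]; };
size = exp(4.5KB);
cachable = 80%;
};
Content cntHTML = {
kind = "HTML";
mime = { type = "text/html"; extensions = [ ".html" : 60%, ".htm" ]; };
size = exp(8.5KB);
cachable = 90%;
checksum = 1%;
may_contain = [ cntImage ];
embedded_obj_cnt = zipf(13);
};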
Several content options deal with simulating realistic content
using Polygraph's CSM model. The content_db field specifies
the filename of the content database (a file produced with the
cdb tool). Inject_db holds the name of the file
where the strings to be injected into the generated content are
stored. Individual injections appear approximately inject_gap
apart if possible. Infect_prob specifies probability that a
generated object will be infected (i.e., will contain at least one
injection).
The encodings strings specify supported content codings
and are used for enabling content compression features.
The client_behavior object specifies expected
Robot behavior when this content is selected. This field is
available starting with Polygraph v4.3.0.
When a Polygraph agent has to resolve a domain name, it contacts DNS
servers based on the DnsResolver information.
The servers field contains DNS servers to contact.
The timeout field specifies the maximum delay after which a
still unacknowledged DNS query is considered failed.
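For example (the resolver address below is illustrative):
DnsResolver dnsRes = {
servers = [ '10.0.80.1:53' ]; // illustrative DNS server address
timeout = 5sec;
};
Robot R = {
...
dns_resolver = dnsRes;
};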
The DutState objects are used as a part of conditional calls
in the Watchdog feature. The latter is described elsewhere.
The rptm_min and rptm_max fields contain minimum
and maximum levels for measured mean response time.
Fill_size_min and fill_size_max fields contain
minimum and maximum levels for cumulative fill size (volume).
Xactions_min and xactions_max fields contain minimum
and maximum levels for cumulative transaction counts.
Rep_rate_min and rep_rate_max fields contain minimum
and maximum levels for averaged measured response rate.
Errors_min and errors_max fields contain minimum
and maximum levels for cumulative number of errors.
Error_ratio_min and error_ratio_max fields contain
minimum and maximum levels for average error ratio.
Dhr_min and dhr_max fields contain minimum
and maximum levels for average document hit ratio.
Goal specifies one or more simulation goals for a given
phase. Individual sub-goals are ORed together. That is,
reaching one sub-goal is enough to reach the entire goal.
All sub-goals except errors are called "positive"
sub-goals. Specifying errors or a "negative" sub-goal is
somewhat tricky. If the errors value is less than 1.0, then
it is treated as an error ratio. Otherwise, it is treated as an
error count. For example, a value of 0.03 would mean
that getting at least 3% of errors is enough to reach the goal,
while a value of 3 would mean that at least 3 errors
are enough.
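For example, a goal that combines a duration limit with an error
sub-goal might look like this (the values are illustrative):
Goal g = {
duration = 30min; // stop after 30 minutes...
errors = 1%; // ...or once the error ratio reaches 1%, whichever comes first
};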
Mime type groups together Web object properties related to
MIME standard. Properties related to URL path generation are also
encapsulated in the Mime type, but that is likely to change.
The type field specifies the string to be used for the
Content-Type: HTTP header.
Strings from the prefixes array are appended (with a specified
probability) to the address part of the URL, before the start of
Polygraph-specific URL path. The prefix string is always prepended with a
slash character. However, no special delimiter is used between the prefix
and URL path; a delimiter (if any) must be a part of the prefix string
(e.g., "images/").
Strings from the extensions array are appended (with a
specified probability) to the Polygraph-specific URL path. No special
delimiter is used to append an extension; a delimiter (if any) must be a
part of the extension string (e.g., ".html").
ObjLifeCycle specifies the parameters for the Object Life Cycle model.
Here is a sample configuration.
ObjLifeCycle olc = {
length = logn(7day, 1day); // heavy tail, weekly updates
variance = 33%; // highly unpredictable updates
with_lmt = 100%; // all responses have LMT
expires = [nmt + const(0sec)]; // everything expires when modified
};
See the distribution
type for a list of supported qualifiers for time distributions
(lmt, now, nmt, etc.).
The birthday field is ignored in recent Polygraph
versions.
Most Polygraph measurements are collected on a per-phase
basis. Phases also allow you to vary the overall load and other "global"
characteristics to model complex workload patterns.
The phase name is used for informational purposes only. Do not
use the name "All", which is an lx macro that stands for "all
phases". Also, if you are going to make graphs based on console
output (rather than binary logs), avoid phase names with
whitespace. The latter will effectively change the number of columns
in console stats lines and confuse plotting tools.
Phase goal specifies the duration of the phase and/or
other phase termination conditions.
Populus factors affect the number of robots alive. Population size
can be varied from 0% to 100%, relative to the total
number of individual robots configured for the test. The latter is
determined as the total number of addresses of all use()d robots.
Note that a live robot can be idle or busy, depending on its session
configuration and state. Polygraph can vary population size starting
with version 2.7.0.
Load factors affect the load generated by Polygraph robots. Load
level can be varied from 0% to 100% and beyond, relative
to the load generated by an individual robot. In other words, load
factor tells each robot to adjust its activity accordingly. Varying
robot population size is preferred to varying robot load levels as it
produces more realistic workloads.
Other factors behave in a similar fashion. Recur_factor is
applied to the recurrence_ratio of a Robot.
Special_req_factor is applied to the portion of "special
requests" such as "IMS" or "Reload". The latter can be specified
using the "req_type" field of a robot.
If factor_beg is not equal to factor_end, then
the current factor is adjusted linearly during the phase. That is, the
factor is increased (or decreased) from factor_beg to
factor_end. Such adjustments require a positive phase
goal.
There are a couple of simple "factor preservation" rules that
make load factors easy to specify. All these rules apply only when a
factor is not explicitly defined.
- For undefined factor_beg, use factor_end of
the previous phase.
- For undefined factor_end, use factor_beg of
the current phase.
- If a factor is still undefined, it is set to 100%.
These rules eliminate repetitions of factor entries for consecutive
phases. Only changes in load levels have to be specified.
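For example, a ramp-up phase followed by a steady phase might be
sketched as follows; the names and values are illustrative, and the
steady phase inherits its starting factor from the ramp according to
the rules above:
Phase phRamp = {
name = "ramp";
goal.duration = 10min;
populus_factor_beg = 10%;
populus_factor_end = 100%;
};
Phase phTop = {
name = "top";
goal.duration = 1hour; // population stays at 100%, inherited from phRamp
};
schedule(phRamp, phTop);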
The log_stats flag tells Polygraph whether statistics
collected during the phase should be recorded in a log file. This flag
defaults to true.
The primary flag tells the Polygraph reporter tool whether the phase
should be used for the executive summary and the baseline report. If
any of the scheduled phases have this flag set, then those phases and
only those phases are used for the executive summary and the baseline
report. This flag defaults to false. It can be overridden by the
Reporter's --phases command-line option. The primary
flag has been supported since Polygraph version 4.0.6.
PolyMix3As type represents addressing scheme for PolyMix-3
workload.
PolyMix4As
| int | agents_per_addr |
PolyMix4As type represents addressing scheme for the PolyMix-4 workload.
The number of PolyMix-4 hosts and robots is determined by the peak
request rate. The total number of robots (servers) is adjusted so that every
client- (server-) side host has the same number of agents. Other minor
adjustments are also made.
To allocate IP addresses for robots, Polygraph iterates through the
client-side addr_space array and gives the next robot the next IP address,
until enough IP addresses are allocated for a host. Polygraph then skips
remaining IP addresses that belong to the same subnet (if any), and starts
allocation for the next host (if any).
The above scheme ensures that individual IPs do not "migrate" from one
host to another when the request rate changes. Instead, only the number of
IPs "enabled" on each host changes.
Server-side IP allocation algorithm is very similar to the client-side
algorithm described above. The only significant difference is that the total
number of server agents is computed as 500 + 0.1*R, where R is the total
number of robots.
The PopDistr type is similar to the distribution type. Popularity
distribution specifies how to select the next object to be requested
from a group of objects that were requested before. In other words, it
specifies which objects are more popular than others (i.e., requested
more often) within a certain group of objects.
PopModel R;
R.pop_distr = popZipf(0.6);
The following popularity distributions are supported.
- popUnif() -- Uniform: all objects have equal chance of being selected
- popZipf(skew_factor) -- Zipf: zipf-like power law with the specified skew
Popularity model specifies how to select the next object to be
requested among all objects that were requested before. In other
words, it specifies which objects are more popular than others (i.e.,
requested more often).
The selection of the object to be requested is done in three
stages. First, Polygraph determines whether the object should come
from a "hot set". That decision is positive with a probability
specified by the hot_set_prob field.
During the second step, the popularity distribution specified by
the pop_distr field is used to select a particular object. If
the object is selected among "hot" objects, the selection is limited
by the hot set size. Otherwise, the entire working set is used. The
hot set size is a fraction of the current working set size specified
by the hot_set_frac field.
Finally, a byte hit ratio (BHR) discrimination algorithm is applied
with bhr_discrimination probability. The algorithm selects
the object with the smallest size among at most nine objects centered
around the selection made at the second stage. Uncachable objects are
ignored during the selection. Moreover, the algorithm does nothing
when the second stage selects an uncachable object. Thus, configured
content type cachability ratio is not affected, and uncachable objects
should have the same recurrence ratio regardless of their size.
Without the discrimination algorithm, offered BHR would be about the
same as offered document hit ratio (DHR) while real BHR is usually
some 30%-40% lower than DHR. The BHR discrimination algorithm was
introduced in version 2.7.2 of Polygraph.
PopModel popModel = {
pop_distr = popUnif();
hot_set_frac = 1%; // hot set is 1/100th of the working set size
hot_set_prob = 10%; // every 10th object is requested from the hot set
bhr_discrimination = 90%; // revisit smaller files more often
};
Robot R;
R.pop_model = popModel;
Proxy agent simulates a proxy cache. The client side (i.e., the
side that sends requests to and receives replies from the servers) is
configured using a Robot agent. Similarly, the server side (i.e., the
side that receives requests from and sends replies to clients) is
configured using a Server agent. A proxy may also have a cache to
store some of the proxied traffic.
The client side attempts to cache every cachable object it fetches.
The server side attempts to resolve every request from the cache. See
the Cache type description for important caveats of using the
cache.
There is no direct connection between the ICP ports of the client side
and the cache (see the Robot and Cache types for descriptions of those
fields). However, in most cases, these two ports should be set to the
same value because a real proxy usually sends and receives ICP queries
using the same UDP port.
Note that the addresses field of the proxy agent overwrites
the addresses fields of client and server configurations. Other
fields inherited from the Agent type are currently ignored. The latter
is a bug.
Proxies are activated by the polypxy program.
Derived from the Agent type, robot (a.k.a. "user" or
"client") is the main logical thread of execution in
polyclt. Robots submit requests and receive replies. The
frequency and nature of the submissions depend on the workload.
The origins field lists names or addresses of origin
servers to be contacted.
The http_proxies and ftp_proxies fields list
addresses of proxies to send the requests through. Both domain names
and IP addresses are acceptable. A port number must be specified for
each proxy. These fields are available since Polygraph v4.0.4.
Earlier versions use the deprecated proxies field.
Polygraph supports HTTP proxies only. If a proxy is used for a
request, then robot-proxy communication uses the HTTP protocol. When
proxied, requests for FTP servers use an ftp:// Request-URI
scheme and are sent through ftp_proxies (if configured) or
http_proxies (default). Requests for HTTP servers use the http://
scheme and are always proxied through http_proxies.
The proxy selection algorithm mimics typical browser configuration
and behavior:
- Requests for HTTP origin servers:
- If http_proxies is set and is not empty (or
proxy command line option is given), then the
requests go through the specified proxies.
- Otherwise, the requests go direct.
- Requests for FTP origin servers:
- If ftp_proxies is set and is not empty, the
requests go through the specified proxies.
- If ftp_proxies is set but is empty, the requests
go direct (using FTP protocol, of course).
- If ftp_proxies is not set, then the above rules
for HTTP origin server requests apply.
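For example, a Robot that sends HTTP requests through a proxy but
fetches FTP objects directly might be configured like this (the proxy
address is illustrative):
Robot R = {
...
http_proxies = [ '10.0.1.100:3128' ]; // illustrative proxy address and port
ftp_proxies = []; // set but empty: FTP requests go direct
};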
For each Request-URI scheme (i.e., each origin server protocol), a
robot selects a random proxy at the configuration time and uses that
single proxy for the entire duration of the test (sticky proxy
assignment). Within a scheme, proxy addresses are evenly distributed
among all robots in the test (if possible). Individual groups of
robots (e.g., all robots on one host) may not get an even
distribution.
The http_proxies field is mutually exclusive with the proxy command-line
option. Both have the same semantics except the command-line option
cannot specify more than one proxy address.
The proxies field is deprecated in favor of (and has the
same semantics as) the http_proxies field (since Polygraph
v4.0.4).
The socks_proxies field lists addresses of SOCKS proxies to send
the requests through, similar to http_proxies and
ftp_proxies. Only SOCKS5 proxies are supported. SOCKS
proxies are supported for HTTP and passive FTP requests. Active FTP
data connections do not go through a SOCKS proxy at the moment. SOCKS
support is available since v4.1.0.
The socks_prob parameter specifies probability that a
Robot would use a SOCKS proxy. The decision to use SOCKS and the SOCKS
proxy selection are sticky. The parameter defaults to 1.0 for Robots
with non-empty socks_proxies array.
The socks_chaining_prob parameter specifies probability of
a SOCKS-using Robot also using an HTTP or FTP proxy. In this case,
Robot requests go through the selected SOCKS proxy first and then
through the selected HTTP or FTP proxy. As with other proxy-related
parameters the proxy chaining decision is sticky. By default, proxies
are not chained.
When req_rate is specified, a robot will emit a Poisson
request stream with the specified mean rate, subject to phase load
levels. The req_inter_arrival field can be used to specify
request arrival stream different from Poisson. Naturally, the two
fields are exclusive.
If neither req_rate nor req_inter_arrival is
set, a Robot will use the "best effort" approach, submitting the next
request immediately after the reply to the previous request has been
received.
Recurrence ratio is simply how often a robot should
re-visit a URL. In other words, how often a robot should request an
object that was accessed before (possibly by other robots). Note that
recurrence ratio is usually higher than hit ratio because
many objects are uncachable and repetitive requests to uncachable
objects do not result in a hit.
The embed_recur field specifies the probability of
requesting an embedded object when the reference to the latter is
found in the response.
Public_interest ratio specifies how often a robot would
request a URL that is "known" to (and can be requested by) other
robots. Robots are usually independent from each other in their
actions. However, they may access same objects on the same servers. If
public_interest is zero, a robot would request only
"private" objects from all origin servers, resulting in no overlap
of URL sets requested by individual robots. Note that both public and
private objects can be requested more than once and hence produce a
hit. This field has been removed starting with Polygraph version 2.8.0
in favor of a more general interests field documented
below.
Interests selector configures
Robot interest in URL worlds. Three kinds of worlds are supported:
private, public, and foreign. These kinds can be mixed freely, but
non-foreign interest is required for phase synchronization to
work. Public worlds interest specifies how often a robot would request
a Polygraph-generated URL that is "known" to (and can be requested
by) other robots. Robots are usually independent from each other in
their actions. However, they may access same objects on the same
servers. If private interest is 100% (which is the default), a robot
would request only "private" objects from all origin servers,
resulting in no overlap of generated URL sets requested by individual
robots. Finally, foreign interest specifies the portion of URLs that
should come from Robot's foreign_trace. Note that public,
private, and foreign objects can be requested more than once and hence
produce a hit. This field replaced less general
public_interest field starting with Polygraph version
2.8.0.
Robot R = {
...
// public_interest = 75%;
interests = [ "foreign": 1%, "public": 74%, "private" ];
foreign_trace = "/usr/local/traces/special_sites.urls";
};
The req_types array specifies what kind of requests the
robot should emit and with what probability. Several request types are
supported: "Basic" (a common GET request), "IMS" (a
request with an If-Modified-Since header field),
"Reload" (a request with a Pragma: no-cache and
Cache-Control: no-cache header fields), "Range" (a request
with a Range header field), and "Upload" (an HTTP PUT or an FTP
STOR request).
The req_methods array specifies HTTP request methods the
robot should use and with what probability. Several methods are
supported: "GET" (default), "HEAD", "POST", and "PUT". Request
methods are somewhat orthogonal to request types. For example, an IMS
request may be issued using HEAD request method. Polygraph may not
support all combinations though. Please see the "Request properties"
section in the "Traffic generation" reference page for more
information.
The private_cache_cap field specifies the size of the
robot cache. Robots do not cache object content, but remember URLs and
other object characteristics. For example, when IMS request is
generated, the IMS timestamp is taken from the robot cache if
possible.
Pop_model specifies which "popularity model" to use when
requesting an object that has already been requested before. You must
specify popularity model if you specify positive recurrence
ratio.
When the unique_urls flag is set, each request submitted by
Polygraph will be for a different URL. Note that this option is
applied last and changes a URL without affecting the object id part.
Object IDs are responsible for generating various object properties.
Thus, for filling-the-cache experiments, it may be a good idea to use
this option (in conjunction with other options like
recurrence and public_interest) to generate objects
similar to production tests (but with zero hit ratio).
The pipeline_depth distribution determines the maximum
number of concurrent outstanding requests on a persistent connection.
A request is considered outstanding until the corresponding response is
completely received. By default, requests are not pipelined, as if a
const(1) value was specified for the pipeline depth. The pipeline
depth knob has no effect on connection persistence, and the actual depth
depends on factors such as connection persistence and the presence of
embedded objects. See the traffic model for more details
about request pipelining. Pipelining is supported in Polygraph
starting with version 3.0.
Open_conn_lmt is the maximum number of open connections
(in any state, to any server) a robot may have at any given time. A
robot will postpone (queue) new transactions if the limit is reached.
This limit simulates typical behavior of browsers like Netscape
Communicator that have a hard limit on the total number of open
connections. See Pei Cao's experimental study
for more information.
Wait_xact_lmt is only useful when open_conn_lmt
is specified. If the robot reaches its open connections limit, it will
queue the extra transactions. When the queue length grows beyond
wait_xact_lmt, new transactions will be simply ignored (with
an appropriate error message).
Minimize_new_conn is the probability that a robot will
treat connections to substitute addresses as connections to the
same agent (and, hence, reuse them if needed). This is useful for
running various no-proxy or no-VIP tests while keeping the number of
persistent connections similar to a "proxied" environment.
The session field is useful for simulation of the
login/out behavior of many Web clients, including browsing humans. See
Session type for
more information.
User_names do not affect robot behavior but may be useful
for testing external accounting and authentication services. Each name
is just a string. A robot picks a new name at the start of the
session. Within one robot configuration, no two sessions share a name,
provided all configured names are unique and there are enough names
(i.e., the number of user names is at least the number of robot
addresses). Names are selected in random order, with equal
probability.
The peer_icp address enables ICP module of the robot; the
robot will send ICP queries for all to-be-requested objects from the
icp_port to that address. The peer_http address
specifies where to send HTTP queries after an ICP peer returns a
hit.
Note that if only peer_icp address is set, the robot will
send ICP queries to the specified address, but will not fetch objects
from a peer. Setting peer_http only may not be supported,
use the "--proxy" option instead. At most one ICP and at most one
HTTP peer can be configured. Using completely different addresses for
the two peers is allowed, but usually does not make sense.
The dns_resolver field specifies the DNS resolver for a
robot to use.
The foreign_trace specifies the name of a file that
contains absolute HTTP URLs to request when foreign interest is
selected according to the interests field. The trace file
must have one URL per line. HTML anchors (or #-comments) are stripped.
Whitespace at the beginning and at the end of a line is stripped.
Empty lines are ignored. All URLs are pre-loaded at the start of a
test. Thus, larger traces will require more RAM. Misses are generated
in trace order. Once all URLs in a trace are requested, the iteration
starts from the top of the trace. The trace order has no influence on hit
generation. However, Polygraph assumes and does not check for URL
uniqueness, and duplicate trace entries may cause unexpected (for
Polygraph) hits.
The cookies_keep_lmt distribution determines the maximum
number of origin and foreign server cookies that a robot can remember
and keep. When the number of incoming cookies exceeds the specified
limit, the Robot removes old cookies in a FIFO order. By default, 4
cookies will be kept for each server. A robot will send back all
cookies it remembers, if any (provided the robot is a cookie-sending
agent, of
course).
Prior to Polygraph version 4.0.7, all robots within a
polyclt process shared the cookie storage and the FIFO queue.
Since Polygraph version 4.0.7, each robot has a separate cookie
storage and queue. Cookie sending functionality has been added to
Polygraph version 3.0.
The accept_content_encodings strings specify content
codings to be listed in an HTTP Accept-Encoding request
header. This knob is used to trigger content compression at the
server.
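For example, content compression could be exercised roughly as
follows; this is only a sketch, and the content type and values are
illustrative:
Content cntCompressible = {
...
encodings = [ "gzip" ]; // the server may gzip responses of this type
};
Robot R = {
...
accept_content_encodings = [ "gzip" ]; // advertise gzip support in Accept-Encoding
};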
The spnego_auth_ratio controls the choice of the algorithm
for NTLM or Negotiate authentication. If unset or set to
zero, NTLMSSP algorithm will be used. Otherwise, the corresponding
portion of authentications will be done using SPNEGO (a.k.a., GSSAPI)
algorithm.
The kerberos_wrap field provides parameters for Kerberos
authentication. Please see the User
Manual for details.
The ranges selector specifies what ranges the robot should
use when generating a "Range" request and with what probability.
Please see the Ranges manual
for more information.
The req_body_pause_prob parameter specifies
the probability of a paused request. A paused request is
the request with an "Expect: 100-continue" header. After
sending a paused request, the robot waits for an HTTP
100 "Continue" control message from the server or the
final HTTP 417 "Expectation Failed" response. The
default is not to pause requests. This option is
mutually exclusive with the
req_body_pause_start option described below.
Please see the Request Bodies manual page for more
information.
The req_body_pause_start parameter specifies
the minimum size of a paused request (see
req_body_pause_prob above for terminology and
implications). Requests with bodies smaller than the
specified size are not paused. The default is not to
pause any requests. This option is mutually exclusive
with the req_body_pause_prob option described
above. Please see the Request Bodies manual page for more
information.
The passive_ftp parameter specifies the probability of the
Robot using passive FTP mode. In passive mode, a robot sends the FTP
PASV command and receives a server address; the robot connects to the
server address, establishing a data channel. In active mode, a robot
sends the FTP PORT command with the robot address for the data
channel; the server connects to the robot address. The FTP mode
selection is sticky for the lifetime of the robot. All robots use
passive FTP by default. Passive FTP support is available in Polygraph
since v4.0.0; active since v4.0.7.
Range is a base type for PGL SingleRange and
MultiRange
types. You should not use the Range type directly.
SingleRange
| size  | first_byte_pos_absolute |
| float | first_byte_pos_relative |
| size  | last_byte_pos_absolute |
| float | last_byte_pos_relative |
| size  | suffix_length_absolute |
| float | suffix_length_relative |
The SingleRange type is used to configure a single range
request. For more information, please see the ranges manual.
The first_byte_pos_absolute and
first_byte_pos_relative fields are absolute (in bytes) and
relative (in percentage of the whole entity size) positions of the first
range byte.
The last_byte_pos_absolute and last_byte_pos_relative
fields are absolute (in bytes) and relative (in percentage of the whole
entity size) positions of the last range byte.
The suffix_length_absolute and
suffix_length_relative fields are absolute (in bytes) and
relative (in percentage of whole entity size) sizes of the requested
range suffix.
The *_absolute fields are mutually exclusive with the
*_relative fields. The byte-fields are mutually exclusive with
the suffix-fields, just like in the RFC 2616 BNF.
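For example, a robot might occasionally request only the first
kilobyte of an object; the range boundaries and probabilities below are
illustrative:
SingleRange firstKB = {
first_byte_pos_absolute = 0Byte;
last_byte_pos_absolute = 1023Byte;
};
Robot R = {
...
req_types = [ "Range" : 10%, "Basic" ];
ranges = [ firstKB ];
};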
The MultiRange type is used to configure a request with a
multi-spec Range header. For more information, please see the ranges
manual.
The first_range_start_absolute and
first_range_start_relative fields are distributions of
absolute (in bytes) and relative (in percentage of whole entity
size) positions of the first byte of the first range spec. These
fields are optional.
The range_length_absolute and
range_length_relative fields are distributions of absolute
(in bytes) and relative (in percentage of whole entity) sizes of an
individual range spec.
The range_count distribution is used to determine the
number of individual range specs.
The *_absolute fields are mutually exclusive with the
*_relative fields.
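A multi-spec Range request could be sketched in a similar way (the
distributions are illustrative):
MultiRange someChunks = {
range_length_absolute = const(1KB); // each range spec covers 1KB
range_count = unif(2, 5); // two to five range specs per request
};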
Rptm-stat is to response time what thermo-stat is to temperature in the
room. Rptmstat specifies an "acceptable" response time range
(from rptm_min to rptm_max) and the factor change
percentage that should be applied to the current load factor if mean
response time in a sample is outside of the given range.
For "flat" phases (i.e., phases with load_factor_beg equal
to load_factor_end), the current load factor will be increased or
decreased by load_delta percentage depending whether response
time is lower or higher than acceptable.
For phases with variable configured load factor, the slope of the
factor curve will be increased or decreased by load_delta.
However, current load factor will never become lower than
load_factor_beg or exceed load_factor_end!
The sample_dur field sets the sample duration or "size".
Samples follow each other without overlaps.
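For example (the values are illustrative; the sketch assumes the
Rptmstat object is attached to a phase via the phase's rptmstat field):
Rptmstat dutLimits = {
sample_dur = 30sec;
rptm_min = 100msec;
rptm_max = 2sec;
load_delta = 5%; // adjust the load factor by 5% per out-of-range sample
};
Phase phPlateau = {
...
rptmstat = dutLimits;
};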
Derived from the Agent type, server is the main logical
thread of execution in polysrv that models an HTTP origin
server. Servers receive requests and send replies. The speed and
nature of the replies depend on the workload.
Accept_lmt specifies the limit on consecutive attempts to
accept(2) an incoming connection. The attempts are terminated
with the first unsuccessful accept(2) system call or when the limit
is reached. By default, there is no limit.
Contents field is a content selector. It specifies the
distribution (or relative popularity) of content types for the server.
Each content type must be "accessible". That is, each type must be
in the closure of the direct_access selector described
below.
Direct_access array specifies what content types can be
accessed directly (i.e., not as an embedded object) by a robot. The
configuration below describes a simplified relationship among the
three most popular content types.
#include "contents.pg"
Server S = {
contents = [ cntImage : 70%, cntHTML : 10%, cntOther ];
direct_access = [ cntHTML : 95%, cntOther ];
};
The rep_types array specifies what kind of replies the
server should emit and with what probability. Two reply types can be
specified: "Basic" and "302 Found". "Basic" corresponds to "200 OK" or
"304 Not Modified", as appropriate depending on the actual request.
The cookie_set_prob probability determines the portion of
HTTP responses for which the server will attempt to generate cookies
(provided the server is a cookie-sending agent, of course). If
cookies need to be generated, the cookie_set_count
distribution is used to determine the number of cookies in the
response, and the cookie_value_size distribution is used to
determine the sizes of individual cookie values. Each cookie gets its
own Set-Cookie header field. Cookie values are random quoted
strings with sessN cookie names. Cookies do not expire and do
not have explicit paths. Polygraph robots may return
cookies depending on client-side cookie-related options. Cookie
sending functionality has been added to Polygraph version 3.0.
The req_body_allowed parameter specifies the
probability that the server "allows" a "paused" request
by responding with an HTTP 100 "Continue" control
message to a request with an Expect: 100-continue
header. The default is 100% (i.e., allow all paused
requests). Please see the Request Bodies manual page and the Robot
req_body_pause_prob field for more
information.
Session objects are used to configure robot behavior. A
single session consists of two periods: busy and idle. During the busy
period, a robot behaves normally, as if no sessions were configured.
At the start of an idle period, a robot clears all request queues.
Robot does not emit new requests during the idle period, but may
finish some outstanding transactions.
Robot R = {
...
session.busy_period.duration = 7sec;
session.idle_period_duration = exp(3sec);
session.heartbeat_notif_rate = 1/2sec;
};
In the example above, the durations of busy and idle periods are
set to 7 seconds (constant) and 3sec (exponentially distributed; new
value is selected when a session starts). Thus, the total session
duration would be 10 seconds, on average.
Busy_period is of type Goal so that you can specify busy
period duration based on, say, the number of transactions and not just
time. Idle period duration is of type "time distribution". One cannot
use distributions with Goal members, but let us know if you need this
feature.
A non-idle session can be configured to emit "heartbeat"
notification events at a specified rate. The above example will emit
one heartbeat event every 2 seconds. These events have no effect on
robot behavior, but are useful for forwarding session events to
external remote programs via Polygraph Doorman feature.
Heartbeat_notif_rate field was named
heartbit_notif_rate in Polygraph version 2.7.0.
SpreadAs
| int | agents_per_addr |
SpreadAs type represents addressing scheme called
Spread. It is possibly the simplest addressing scheme that
distributes the load evenly across all bench hosts.
Spread takes H, the number of configured hosts for
the bench side, and divides the entire address space into H
partitions of equal size. Iterating over partitions, Spread takes one
remaining agent IP address from the current partition per iteration, until
the total accumulated request rate produced by the selected agents reaches the
configured total request rate for the bench (Bench::peak_req_rate).
For example, the following configuration will result in all three
client-side hosts utilized, each with 50 alias IP addresses and 100
robots, producing 300/sec total load:
Bench B = {
peak_req_rate = 300/sec;
client_side = {
max_agent_load = 1/sec; // estimated load produced by one Robot
addr_space = [ 'lo::10.0.1-6.1-250/32' ]; // 1500 IPs to partition
hosts = [ '172.16.0.1-3' ]; // three client-side hosts or partitions
};
server_side = { ... };
};
use(B);
SpreadAs asSpread = { agents_per_addr = 2; };
Robot R = {
...
};
R.addresses = robotAddrs(asSpread, B); // calculates Robot IP aliases
use(R);
In the above example, the first host (172.16.0.1) gets 10.0.1.1-50 IP
aliases created even though its address space partition contains 500 IPs
(10.0.1-2.1-250). The second host gets 10.0.3.1-50, and the third will see
10.0.5.1-50 IPs. There are only 50 IP aliases per host because
asSpread specifies two robots per IP address and only 50 addresses are
needed to support the 100/sec rate per host.
Keeping the number of hosts and the address space constant allows you
to set up stable routes for each host while varying the request rate from
nearly zero to the maximum level supported by the bench. A higher request
rate means more IPs selected from each address space partition, but the
partitions themselves remain constant.
Spread distributes Server addresses the same way as Robot
addresses, except that agents_per_addr is always assumed to be equal to 1
for the servers.
SrvLb4As
| int | agents_per_addr |
SrvLb4As type represents addressing scheme for SrvLB-L7-4 and SrvLB-L4-4 workloads.
Robot and server address allocation algorithm is the same as for WebAxe4As scheme.
An SslWrap object describes SSL connection
properties. SslWraps are used in Agent configurations to
indicate and fine-tune support for SSL. Please see the SSL
layer manual for more
information.
The protocols field specifies supported SSL
protocol names. An agent selects the protocol or protocol set
at startup time, and that selection is sticky. Default is
"any" which stands for "SSLv23" in OpenSSL terminology.
The root_certificate field specifies location of
the root (CA) certificate file. That file is needed for the
servers to generate their certificates and for the robots to
verify server certificates. If not defined, the servers will
generate self-signed certificates and the robots will not
check server certificates. Appending public certificates to
the root_certificate file allows robots to trust those
certificates (and/or certificates signed by them) as well;
any public certificates present do not affect certificate
generation by the server.
The ciphers field selects ciphers the agent will
use. The selection is sticky. By default all ciphers are
selected.
The rsa_key_sizes array specifies supported key
lengths to use when auto-generating a private server key.
Server's selection is sticky. Defaults to 1024 bits.
The session_resumption parameter enables and
controls the session caching and resumption algorithms. It
should be used together with session_cache. By
default session_resumption is 0%.
The session_cache parameter controls the size of
the session cache for the session caching and resumption feature.
By default, session_cache is 0.
The sharing_group enables and configures recycling
or sharing of SSL certificates. Certificates within the same
group will be shared (i.e., will only be generated once)
across SslWraps and agents if their OpenSSL generation
commands are exactly the same. Sharing hurts realism but
provides significant speedup in Polygraph start times when
hundreds of servers require certificate generation.
The ssl_config_file parameter sets the OpenSSL
configuration file. Relative file names are rooted in the
directory from where the Polygraph program was started. If no
parameter is given, the 'myssl.conf' file name is used, with a
warning that such usage is deprecated. The
ssl_config_file parameter is supported since
v3.6.0.
The verify_peer_certificate parameter controls whether
Robots do peer certificate verification. By default, Robots
verify certificates if and only if root_certificate
is specified. Note that servers do not currently verify client
certificates. This knob is available since Polygraph v4.0.10.
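A sketch of an SslWrap configuration shared by robots and servers
might look like the following; all file names and values are
illustrative, and the sketch assumes agents reference the wrap through
an ssl_wraps field as described in the SSL layer manual:
SslWrap wrap1 = {
ssl_config_file = "/usr/local/ssl/openssl.cnf"; // illustrative path
protocols = [ "any" ];
root_certificate = "/tmp/polygraph/root.cert"; // illustrative path
rsa_key_sizes = [ 1024bit ];
session_resumption = 40%;
session_cache = 100;
};
Server S = { ssl_wraps = [ wrap1 ]; ... };
Robot R = { ssl_wraps = [ wrap1 ]; ... };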
StatSample objects are useful in the context of the Polygraph Watchdog
feature. Each object provides read-only access to performance
measurements collected during a watchdog sampling period or a phase.
Dozens of measurements are available.
Most StatSample structure members are structures themselves.
See their corresponding types linked above for details on individual
members. The paragraphs below define the meaning of top-level members only.
Req.rate is the offered request rate.
Rep.rate is the measured response rate.
Rep is statistics collected for all kinds of HTTP
transactions.
Basic is statistics collected for basic HTTP transactions. A
basic HTTP transaction is a transaction for which the definition or
meaning of a hit is relatively obvious. This excludes
transactions with the following characteristics: non-GET request methods,
If-Modified-Since request headers, response status codes other than 200 or
304, reloads, and aborted I/Os.
Offered is hit/miss statistics for offered hits and misses. An HTTP
request "offers" a hit if an ideal cache would most likely return a cached
copy in response. Only basic transactions are used for these
statistics.
Real is hit/miss statistics for real (i.e., actual or
measured) hits and misses. These stats are based on a client-side guess
when a proxy did not contact a server to produce a response. A guess may
be inaccurate when the proxy contacts the server but uses the old response
headers instead of forwarding the new ones. Only basic transactions are
used for these statistics.
Cachable is cachability statistics for basic transactions.
Fill is statistics for cachable real misses.
Redired_req is statistics for HTTP transactions involving
redirected responses such as 302 (Found). Such transactions are not basic
transactions.
Rep_to_redir is statistics for transactions caused by
earlier redirected responses.
Ims is statistics for transactions involving an HTTP request
with an If-Modified-Since request header. Such transactions are not basic
transactions.
Reload is statistics for transactions involving client "reload"
requests (HTTP requests with a Cache-Control: no-cache directive). Such
transactions are not basic transactions.
Head is statistics for transactions involving a HEAD request.
Such transactions are not basic transactions.
Post is statistics for transactions involving a POST request.
Such transactions are not basic transactions.
Put is statistics for transactions involving a PUT request.
Such transactions are not basic transactions.
Abort is statistics for HTTP transactions where either request
or response was intentionally aborted prematurely, due to positive
abort_prob setting of an Agent. Such transactions
are not basic transactions.
Xact is concurrency level statistics for all HTTP transactions.
Populus is concurrency level statistics for robots.
Wait is concurrency level statistics for HTTP requests waiting
(for available connection slot) to be submitted. See
open_conn_lmt setting of a Robot.
Conn.open is concurrency level for open HTTP/TCP connections.
A connection is considered "open" from right after the corresponding
connect(2) or accept(2) system call and until the close(2) system
call.
Conn.estb is concurrency level for established HTTP/TCP
connections. A connection is considered "established" if it is open and
was marked as "ready for I/O" by an operating system. This usually means
that the TCP handshake has succeeded for the connection.
Conn.ttl is time-to-live statistics for open connections. That
is, it is the measure of how long connections stay open.
Conn.use statistics counts the number of HTTP transactions per
connection. If persistent connections are disabled, all connections will
have just one "use" count.
Ok_xact.count is the number of successful transactions.
Err_xact.ratio is the ratio of failed to successful transactions.
Err_xact.count is the number of failed transactions.
Retr_xact.count is the number of retried transactions.
Transactions are retried if the request is aborted due to a race conflict
with persistent HTTP connections.
Duration is the time it took to collect the sample, from the
first collected datapoint to the last.
Warning: Do not confuse StatSample with StatsSample. The
latter is likely to be removed from PGL.
TmSzStatSample objects encapsulate response time (rptm)- and
size-based statistics for a given measurement. They can only be used as a
part of a StatSample object.
HrStatSample objects encapsulate "hit" ratio statistics for a
given measurement. They can only be used as a part of a StatSample object. Note that the "hit" and "miss" terms
may be changed to names of some other disjoint classes, depending on
the measurement. For example, "yes" and "no" are used for cachability
statistics.
Ratio.obj is a count-based ratio for a given transaction or
content class. For example, the actual document hit ratio (DHR) is
real.ratio.obj.
Ratio.byte is a volume-based ratio for a given transaction or
content class. For example, the actual byte hit ratio (BHR) is
real.ratio.byte.
Hit is statistics for transactions that were classified as
those matching the HrStatSample criteria. For example, hit transactions
for the real hit ratio statistics.
Miss is statistics for transactions that were classified as
those not matching the HrStatSample criteria. For
example, miss transactions for the real hit ratio statistics.
KerberosWrap configures Kerberos authentication. Please see
the User Manual for details.
The realm string specifies the Kerberos realm part of the
service principal (i.e., "HTTP/<proxy-address>@realm"). It is
also used for the client principal if robot credentials do not specify a
realm. This field is required.
The servers field specifies KDC server addresses. At least
one address is required. If Polygraph fails to communicate with a KDC
server, it tries the next server address. Use this field if (and only
if) your robots should try using UDP first, and both UDP and TCP
listening addresses are the same across all KDC servers. Otherwise, use
servers_tcp and/or servers_udp instead. Please see the
User Manual for more
information.
The servers_tcp field specifies TCP-specific KDC addresses.
It is mutually exclusive with the servers field but has similar
semantics.
The servers_udp field specifies UDP-specific KDC addresses.
It is mutually exclusive with the servers field but has similar
semantics.
The timeout parameter limits the time spent waiting for a
KDC reply. After a timeout, the robot will usually try another KDC
server (if any). There is no wait limit by default.
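For example (the realm and KDC address below are illustrative):
KerberosWrap krbWrap = {
realm = "EXAMPLE.COM"; // illustrative Kerberos realm
servers = [ 'kdc1.example.com:88' ]; // illustrative KDC address
timeout = 10sec;
};
Robot R = {
...
kerberos_wrap = krbWrap;
};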
AggrStatSample
| int      | count |
| sometype | mean |
| sometype | min |
| sometype | max |
| float    | std_dev |
| float    | rel_dev |
| sometype | sum |
AggrStatSample objects contain aggregate statistics for a
given measurement. They can only be used as a part of a StatSample object.
Count is the number of measurements taken.
Mean is the arithmetic mean of all measurements taken
(i.e., sum/count).
Min is the value of the smallest measurement taken.
Max is the value of the largest measurement taken.
Std_dev is the standard deviation of all measurements taken.
Rel_dev is the relative deviation of all measurements taken (i.e.,
std_dev/mean).
Sum is the sum of all measurements taken.
LevelStatSample objects contain level statistics for a
given set of concurrent events. They can only be used as a part of a StatSample object.
Started is the number of started events (including those that
ended).
Finished is the number of finished events.
Level.mean is the mean level of started but not finished
(pending) events during the measurement period. This statistic is not
very reliable, probably due to problems with the way the level is
computed. Polygraph essentially integrates the pending-event count
over the measurement period and then divides the resulting area by
the duration of the period. Either the algorithm or its implementation
is buggy, leading to surprising results in some tests.
Level.last is the number of not yet finished events at the end
of the measurement period. For short periods, this statistic should be
used instead of level.mean until the latter is fixed.
Warning: Do not confuse StatsSample with StatSample. The
former is likely to be removed from PGL.
Use StatsSample objects to instruct Polygraph to collect
detailed samples of transactions.
The name field is just a label to identify a sample.
The start field specifies the delay since the beginning of the
test after which Polygraph will start collecting a sample.
Capacity determines the number of transactions in the
sample.
If samples overlap, the earlier sample(s) are forced to "close", and
the sample started last will get all the transactions.
At the time of writing, there are
no tools to extract collected samples from binary logs.
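A sketch of the fields described above; the name, start delay, and
capacity values are arbitrary.
StatsSample detailedStats = {
	name = "after_warmup"; // label identifying this sample
	start = 30min;         // begin collecting 30 minutes into the test
	capacity = 10000;      // number of transactions in the sample
};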
WebAxe4As fields:
Type | Field |
int | agents_per_addr |
The WebAxe4As type represents the addressing scheme for the WebAxe-4
workload. The robot and proxy address allocation algorithm is the same as
for the PolyMix4As scheme. Server-side IP addresses are set to the real
addresses of the server-side PCs.
Network addresses are represented using the addr type. The
addresses can store IPv4, IPv6, or FQDN information along with an
optional network interface name, port number, and subnet. Address
constants are usually specified using 'single quoted strings' as shown
below.
addr them = '204.71.200.245'; // no port number
addr theirServer = '204.71.200.245:80';
theirServer.host = '209.162.76.5'; // change host name only
theirServer.host = them; // error: type mismatch!
addr mask1 = '10.1.0.0/22'; // with a subnet
addr mask2 = 'fxp0::10.1.0.0:8080/22'; // more optional details
IPv6 addresses present a slight problem because common usage (e.g.,
in URLs) is to put a colon (":") between an address and a port number.
However, colons are used as delimiters within IPv6 addresses, the same
way that dots (".") are used for IPv4. So that PGL can tell an IPv6
address component apart from a port number, you must place IPv6
addresses inside square brackets, like this:
addr foo = '[1234::5:1:2]';
addr server = '[1234::5:1:2]:80';
addr masked = '[1234::5:1:2]/120';
addr theworks = 'lo0::[1234::5:1:2]:80/120';
Arrays of addresses can be formed using regular array
operations. To create an array with many "similar" addresses, a
handy address range notation can be used. The
a-b.c-d.e-f.g-h notation instructs PGL to produce an array of
IP addresses that belong to a range specification. At least two ranges
(or points) must be specified.
addr[] srv_ips = [ '10.100.1-2.1-250:8080' ]; // 500 unique IP addresses
addr[] rbt_ips = [
'10.100.0.1-250', '10.100.1.1-250' // 500 unique IP addresses
];
Or, for IPv6:
addr[] range = [ '[1234::5:1-10:1-250]' ];
addr[] space = [ '[1234::5:1-10:1-250]/120' ];
addr[] servers = [ 'lo0::[1234::5:1-10:1-250]:80/120' ];
Similar rules for forming address ranges apply to FQDNs. Use square
brackets to help Polygraph to identify which part of the address must
be "iterated".
R1.origins = [ 'www.1-15.company.com' ]; // 15 unique FQDNs
R2.origins = [ 'www.company[1-15].com' ]; // 15 unique FQDNs
While it is unlikely that you would want to do that, you can mix
IPv4 addresses, IPv6 addresses, and DNS names in a single array of
addresses because all these addresses are of the same addr
type. For example:
addr[] foo = [ '1.2.3.4', '[1234::1]', 'www.example.com' ];
A dynamic name is an address mask or pattern that generates new
static names as the test progresses. Dynamic names are represented by
the DynamicName PGL type. DynamicName objects are usually
created using the dynamicName
function:
DynamicName DN = dynamicName('*.example.com:9090', 10%);
PGL allows DynamicName to be used anywhere the addr type can be used,
but the address mask makes sense in Robot origins
and AddrMap
names contexts only. More information about dynamic domain
names is available elsewhere.
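For illustration, a hedged sketch of passing the DynamicName object
declared above to a Robot origins field; the robot declaration is
reduced to that single field.
Robot R = {
	origins = [ DN ]; // new origin server names are generated as the test progresses
};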
Array is simply a list of items of the same type. Polygraph extends
arrays dynamically to accommodate all items so no array size
specifications are supported. One cannot extract an element from an
array (such a capability seems unnecessary because PGL does not
support loops).
int[] numbers; // a declaration of an array of integers
time[] alarms = []; // an empty array of time values
addr[] ips = [ '10.0.1.1', '10.0.2.2' ]; // an array of two addresses
Arrays do automatic interpolation of sub-arrays. That is, when an
array A is evaluated, an item I of array type is
interpolated into A just as if each individual element of
I were a member of A. Thus, arrays lose their
identity in an array environment. (This feature and its explanation
were borrowed from the Perl language.)
// the following two arrays are equivalent
int[] A1 = [ 1, 2, 3, 4 ];
int[] A2 = [ 1, [2, [3]], 4];
// A1 becomes a concatenation of A1 and A2:
A1 = [ A1, A2 ];
Arrays that specify probabilities for their members are sometimes
called "selectors". Selectors are discussed elsewhere.
Array is not really a stand-alone type, just a notation.
Boolean type can take the following values, with obvious
interpretation: true, false, yes,
no, on, and off. Simply use whatever value
is appropriate for a given situation.
RampPhase.log_stats = yes;
The bwidth type is nothing else but a size/time
fraction.
bwidth bw = 100Mb/sec; // 100BaseTX (100 Mbit per second)
size sz = 500Kb;
time tm = 10sec;
bwidth bw2 = sz/tm; // 50Kbps, naturally
bwidth bw3 = 13/sec; // Error: type mismatch
Distr type allows you to specify a random distribution of a
well-known shape. In PGL, distributions are "typed". That is, you
must specify the type for values along with the shape of the
distribution. Polygraph is usually able to guess the value type by
examining the parameters of the distribution function.
size_distr repSize = exp(13KB); // exponential distribution of sizes
int_distr connLen = zipf(64); // Zipf-distributed connection lengths
The following distribution shapes are recognized.
- Constant: const(mean)
- Uniform: unif(min, max)
- Exponential: exp(mean)
- Normal: norm(mean, std_dev)
- Lognormal: logn(mean, std_dev)
- Zipf(1): zipf(world_size)
- Sequential: seq(max)
- Arbitrary/tabular: table(filename, type)
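A few of the shapes listed above in use; this is an illustrative
sketch, and the variable names and parameter values are arbitrary.
size_distr cntSize = norm(11KB, 1.5KB); // normal distribution of content sizes
time_distr idleTime = unif(0sec, 30sec); // uniform distribution of idle times
int_distr fixedLen = const(16);          // degenerate distribution: always 16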
When a time distribution is used to specify Object Life Cycle
parameters, it can be augmented by special qualifiers. The following
qualifiers are supported.
- now -- current time
- lmt -- last modification time
- nmt -- next modification time
The value of the nmt qualifier is what lmt would
read after the object is modified once. That is, it is the "next last
modified time". This qualifier is handy for specifying truthful
Expires header fields.
// object life cycle for "HTML" content
ObjLifeCycle olcHTML = {
length = logn(7day, 1day); // heavy tail, weekly updates
variance = 33%;
with_lmt = 100%; // all responses have LMT
expires = [nmt + const(0sec)]; // everything expires when modified
};
Arbitrary distributions can be specified using external
value:probability tables described elsewhere.
Floating point values are represented using float type.
Common arithmetic operations are supported. Integer values are
implicitly converted to floating point in a float context. There is no
implicit or default conversion from floating point values to integers.
Use the int() function call for an explicit cast.
float f = 5/10; // f is equal to 0.0
float f = 5.0/10; // f is equal to 0.5
int i = f; // Error: no default conversion from float to int
Internally, Polygraph stores floating point values using "double
precision" (usually 8 bytes per variable).
Integer values are represented using int type. Common
arithmetic operations are supported for integers. The important thing
to remember about integer arithmetic is that all calculations are done
with integer precision. For example, 3/2 yields 1
and 3*(2/3) yields zero.
There is no implicit or default conversion from floating point
values to integers. Use the int() function call for an
explicit cast.
An integer value of zero can be implicitly converted to many other
types, resulting in a "none" or "nil" value. Note that the latter
is not the same as an "undefined" value. Polygraph may replace
undefined values with appropriate defaults, but zero value cannot be
silently replaced or ignored.
int i = 5/10; // OK; i is equal to 0
int i = 5.0/10; // Error; no default conversion from float
int i = int(10*(5.0/10)); // OK; i is equal to 5
time_distr xactThinkTime = 0; // no delays
List is a comma-separated enumeration of items. List items can be of
different types. Lists are used in function and procedure calls, but
you should not attempt to declare a list variable.
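For example, the arguments of a use() call form a list; in the sketch
below, the AddrMap objects are assumed to be declared elsewhere in the
configuration.
use(mapA, mapB); // the two arguments form a list of AddrMap objects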
Rate type is nothing else but a float/time fraction.
rate req_rate = 10.1/sec; // about 10 requests per second
rate xact_rate = 3/5min; // 3 xactions in 5 min interval
rate rep_rate = 0; // no replies at all
float dummy = xact_rate * sec; // that many xactions each second
rate r = 13/5; // Error: type mismatch
Selector is an array with probabilities associated with every item.
By default, all probabilities are unknown. When actual probabilities
are needed, the items with unknown probabilities will absorb whatever
is left from 100%, in a fair fashion.
addr[] servers = [
'10.0.2.1:80' : 30%, // this server will be used in 30% of cases
'10.0.2.2:80' : 50%, // this server will be used in 50% of cases
'10.0.2.3:80' // 100-30-50 = 20% is everything that is left
// for the last server
];
If probabilities add up to less than 100%, they are adjusted
proportionally to their absolute values.
// the following two selectors are equivalent:
Phase[] scheduleA = [ ph1 : 20%, ph2 : 60% ];
Phase[] scheduleB = [ ph1 : 25%, ph2 : 75% ];
Note that Polygraph does not complain if you specify probabilities
in an array where none are expected. Such probabilities are silently
ignored.
Selector is not really a stand-alone type, just a
notation.
For size constants, Polygraph understands the following scales:
Suffix | Bytes |
Byte | 1 |
Kb | 128 |
KB | 1024 |
Mb | 131072 |
MB | 1048576 |
Gb | 134217728 |
GB | 1073741824 |
Scale suffixes can be shortened to the first two letters (e.g.,
5KB), except for the Bytes suffix, which cannot be
shortened.
A scale suffix can be applied to both integer and floating point numbers.
For floating point numbers, the final number of bytes is rounded
down to the nearest integer.
size s0 = 3KB + 1Mb;
size s1 = 2.5Bytes; // OK; truncated to 2 bytes
size s2 = 10 * s1; // s2 holds 20 bytes
size s3 = s0/s1; // Error: type mismatch
PGL can handle sizes up to 4611686016279904256 bytes on
machines with 8 byte integers, which is approximately 4
exabytes. However, Polygraph objects cannot handle sizes larger
than 2GB unless noted otherwise.
Socket objects can be used to specify socket(2) options
for HTTP connections. Polygraph defaults should do just fine
though.
String constants are specified using "double quoted" strings. At
the time of writing, no interesting operations on strings were
supported.
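For example (the variable name and value are arbitrary):
string label = "my experiment"; // a double quoted string constant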
For time constants, Polygraph understands the following scales:
Suffix | Abbreviation | Value |
msec | ms | millisecond (1/1000 second) |
sec | s | second |
min | | minute |
hour | hr | 60 minutes |
day | | 24 hours |
year | | 365 days |
A scale suffix can be applied to both integer and floating point numbers.
For floating point numbers, the closest approximation representable
with whole seconds and milliseconds is chosen.
time t0 = 5min + 1sec;
time t1 = 0.5sec; // OK; 500 milliseconds
time t3 = t0/t1; // Error: type mismatch
PGL also allows for "absolute time" constants. Absolute constants
are specified using single quoted strings and come in one of two
formats: 'YYYY/MM/DD' or 'YYYY/MM/DD HH:MM:SS'.
time today = '1999/08/23 13:10:30'; // an absolute date and time
Absolute times are assumed to represent Coordinated Universal Time
(UTC).
Identifiers are used by Polygraph agents to distinguish the URLs and
content they generate.
Polygraph generates unique identifiers internally. At the time of
writing, one cannot specify an arbitrary unique identifier; the only way
to get an object of the uniq_id type is to call the
uniqId() function.
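For example (the variable name is arbitrary):
uniq_id id = uniqId(); // the only way to obtain a uniq_id value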