Authentication

Many HTTP intermediaries must authenticate a user before proxying user requests. Access granting decisions are often made based on a complex request-dependent set of rules. Intermediaries may use external resources (e.g., an LDAP server) to check user credentials. Web Polygraph can be used to test the performance and correctness of proxy authentication in a variety of environments.

Basic authentication is supported starting with Polygraph version 2.8.0. NTLM/SSP and Negotiate authentication schemes are available starting with v3.1.2. NTLM/GSSAPI support was introduced in v3.1.4. Negotiate/Kerberos has been supported since v4.8.0.

Table of Contents

1. Credentials
    1.1 NTLM and Negotiate credentials
2. Authentication algorithm
    2.1 When credentials are sent
    2.2 Authentication schemes: How credentials are sent
    2.3 User typos or otherwise invalid credentials
    2.4 Custom authentication methods
    2.5 Authentication states and errors
3. Kerberos
4. LDAP and other external authentication
    4.1 Group membership configuration
    4.2 Generating LDIF and other external configurations
    4.3 The Plan: Putting it all together

1. Credentials

User credentials are the foundation of all user authentication schemes. A user credential is usually a (username, password) pair, with a username that is unique across all users in a given context or namespace. Polygraph robot agents can be assigned credentials (i.e., username:password pairs) using PGL:

Robot R1 = {
    credentials = [ "bob:secret" ];
    addresses = [ '10.0.0.1' ];
    ...
};

The above code sample configures a single robot R1, bound to the 10.0.0.1 IP address and having username "bob" and password "secret". If more than one credential is specified, a robot will select one at random at the beginning of each user session (or at program start if no user sessions are configured).

Robot R2 = {
    credentials = [ "bob:secret", "mary:letMeIn", "duane:p$987<4" ];
    addresses = [ '10.0.0.1' ];
    ...
};

Robot R2 will have one of the three credentials available. Usually, a workload uses multiple robots and at least the same number of credentials:

Robot R3 = {
    credentials = [ "bob:secret", "mary:letMeIn", "duane:p$987<4" ];
    addresses = [ '10.0.0.1', '10.0.0.2', '10.0.0.3' ];
    ...
};

The above code sample configures three R3 robots, one per IP address specified in the addresses field. Each robot will be assigned a username:password pair from the credentials array. Polygraph guarantees that a unique credentials member will be selected for each robot as long as there are enough members in the array (i.e., as long as the number of available credentials is no less than the number of addresses). The selection is sticky for the duration of a user session (or program lifetime if there are no sessions configured).
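The uniqueness guarantee can be illustrated with a short Python sketch. This is not Polygraph code: the assign_credentials helper is hypothetical, and its behavior when there are fewer credentials than addresses (random reuse) is an assumption, since the text above only describes the case with enough credentials.

```python
import random

def assign_credentials(addresses, credentials, rng=None):
    """Map each robot address to one credential string (hypothetical helper)."""
    rng = rng or random.Random()
    if len(credentials) >= len(addresses):
        # Enough credentials: sample without replacement so that
        # no two robots share a username:password pair.
        picks = rng.sample(credentials, len(addresses))
    else:
        # Assumption: with too few credentials, robots share them at random.
        picks = [rng.choice(credentials) for _ in addresses]
    return dict(zip(addresses, picks))

robots = assign_credentials(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    ["bob:secret", "mary:letMeIn", "duane:p$987<4"],
    random.Random(1),
)
```

With three addresses and three credentials, every robot in this sketch ends up with a different pair.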

Many workloads use thousands of robots. It would be a pain to enumerate thousands of credentials. Fortunately, PGL has a function call to generate a given number of credentials. The call below will produce 1000 unique username:password pairs in the "east-end" namespace and 30 unique username:password pairs in the "west-end" namespace:

string[] endEast = credentials(1000, "east-end");
string[] endWest = credentials(30, "west-end");

Credentials in different namespaces are guaranteed not to clash. At the time of writing, Polygraph generates credentials using the following pattern (without any white space between tokens):

"user" index "_" random_hex_number1 "@" namespace ":" 
    "pw" random_hex_number2 "x"
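For illustration only, the Python sketch below produces strings that follow this pattern and checks them with a regular expression. The make_credential helper and the widths of the random hexadecimal numbers (16 and 32 bits) are assumptions; Polygraph's actual generator may differ.

```python
import random
import re

def make_credential(index, namespace, rng):
    """Build one username:password string in the documented pattern."""
    hex1 = format(rng.getrandbits(16), "x")  # assumed width
    hex2 = format(rng.getrandbits(32), "x")  # assumed width
    return f"user{index}_{hex1}@{namespace}:pw{hex2}x"

# "user" index "_" hex "@" namespace ":" "pw" hex "x"
PATTERN = re.compile(r"^user\d+_[0-9a-f]+@[^:]+:pw[0-9a-f]+x$")

rng = random.Random(42)
creds = [make_credential(i, "east-end", rng) for i in range(1, 4)]
```

A pattern like PATTERN can be handy for sanity-checking exported credentials before loading them into an authentication server.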

Since the credentials() function uses random numbers, it is difficult to predict the exact credentials. If you need to know the credentials (and you probably do because you need to configure the proxy accordingly!), see the "LDAP" section below or use string ranges to form custom patterns. When string ranges are used to form credentials, note that if the password field contains ranges, you get more than one credential with the same username and different passwords:

// same password for all users
string[] endNorth = "my[1-1000]user@north:password";

// each user gets 10 passwords
string[] endNorth = "my[1-100]user@north:pass[1-10]word";
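The effect of ranges in the password field can be checked with a small Python sketch that expands every [first-last] range into all combinations. The expand_ranges helper is hypothetical, and treating multiple ranges as a simple cross product is an assumption about PGL string ranges, not a description of Polygraph's implementation.

```python
import itertools
import re

RANGE = re.compile(r"\[(\d+)-(\d+)\]")

def expand_ranges(pattern):
    """Expand every [first-last] range in pattern into all combinations."""
    # re.split with two groups yields: literal, first, last, literal, ...
    parts = RANGE.split(pattern)
    choices = []
    for i in range(0, len(parts), 3):
        choices.append([parts[i]])
        if i + 2 < len(parts):
            first, last = int(parts[i + 1]), int(parts[i + 2])
            choices.append([str(n) for n in range(first, last + 1)])
    return ["".join(combo) for combo in itertools.product(*choices)]

same_password = expand_ranges("my[1-1000]user@north:password")
ten_passwords = expand_ranges("my[1-100]user@north:pass[1-10]word")
```

Both patterns expand to 1000 strings in this sketch, but the second one covers only 100 distinct usernames, each paired with 10 passwords.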

Here is an example of a robot configuration using the PolyMix-4 addressing scheme and generated credentials. Note how the number of credentials depends on the number of robot addresses (which, in turn, depends on the configured peak request rate), avoiding hard-coded magic constants:

Robot R4 = {
    addresses = robotAddrs(asPolyMix4, TheBench);
    credentials = credentials(count(addresses), "authmix");
    ...
};

1.1 NTLM and Negotiate credentials

Credentials for NTLM or Negotiate authentication are specified using the generic mechanisms described above, but are further manipulated by Polygraph to match NTLM expectations.

For example, Polygraph splits a WINZONE/Ian.Eli@hq.example.com username into the WINZONE NTLM domain, the Ian.Eli NTLM account login, and the hq user host name. The example.com part is ignored when performing NTLM authentication.
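The split described above can be sketched in Python as follows. The split_ntlm_username helper is hypothetical and simply follows the example in the text (domain before the slash, login before the at sign, first label of the host name); how Polygraph handles usernames that lack some of these parts is not covered here.

```python
def split_ntlm_username(username):
    """Split 'DOMAIN/login@host.rest' into (domain, login, host)."""
    domain, _, rest = username.rpartition("/")
    login, _, fqdn = rest.partition("@")
    # Only the first label of the host name is kept; the rest is ignored.
    host = fqdn.split(".", 1)[0]
    return domain, login, host

parts = split_ntlm_username("WINZONE/Ian.Eli@hq.example.com")
# parts == ("WINZONE", "Ian.Eli", "hq")
```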

2. Authentication algorithm

This section explains how Polygraph uses credentials during a test.

2.1 When credentials are sent

As in real life, Polygraph robots do not volunteer their credentials unless the intermediary responds with a "407 Proxy Authentication Required" status code. Once a 407 response is received from the proxy, the robot will start sending its credentials with every request.
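A minimal Python sketch of this rule is shown below. The ChallengeTracker class and its method names are hypothetical and model only the decision of whether to attach credentials, not the actual header construction.

```python
class ChallengeTracker:
    """Remember whether the proxy has ever asked for credentials."""

    def __init__(self):
        self.challenged = False  # becomes True after the first 407

    def should_send_credentials(self):
        # Credentials are never volunteered before a 407 is seen.
        return self.challenged

    def on_response(self, status):
        if status == 407:
            self.challenged = True

robot = ChallengeTracker()
before = robot.should_send_credentials()  # False: nothing volunteered
robot.on_response(407)
after = robot.should_send_credentials()   # True: sent with every request now
```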

2.2 Authentication schemes: How credentials are sent

The first HTTP 407 "Proxy Authentication Required" response from the proxy determines the authentication scheme that a Polygraph Robot will use. The decision is based on the first Proxy-Authenticate header value recognized by Polygraph:

  1. A Proxy-Authenticate: NTLM header enables NTLM authentication. Polygraph versions prior to 3.1.0 skip this step.

  2. A Proxy-Authenticate: Negotiate header triggers Negotiate authentication, using NTLM authentication algorithms. Polygraph versions prior to 3.1.2 skip this step.

  3. Otherwise, Polygraph uses Basic authentication (RFC 2617): credentials are sent using Proxy-Authorization HTTP header field after being base-64 encoded.
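For reference, the Basic scheme encoding mentioned in the last item can be reproduced with Python's standard base64 module. The basic_proxy_authorization helper name is hypothetical; the encoding itself is the standard base-64 form of the username:password string.

```python
import base64

def basic_proxy_authorization(credentials):
    """Return the (name, value) header pair for 'username:password'."""
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Proxy-Authorization", f"Basic {token}"

name, value = basic_proxy_authorization("bob:secret")
# value == "Basic Ym9iOnNlY3JldA=="
```

Note that base-64 is an encoding, not encryption: anyone who sees the header can recover the password.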

HTTP requires the client to pick the best supported authentication method, but Polygraph picks the first one. This is consistent with behavior of some popular browsers. Future Polygraph versions may have knobs to control the choice of the HTTP authentication scheme.
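One reading of the "first recognized header wins" rule is sketched below in Python. The pick_scheme helper is hypothetical, and both the set of recognized scheme names and the fallback to Basic when nothing is recognized are assumptions drawn from the list above rather than from Polygraph sources.

```python
RECOGNIZED = ("ntlm", "negotiate", "basic")

def pick_scheme(proxy_authenticate_values):
    """Return the scheme of the first recognized Proxy-Authenticate value."""
    for value in proxy_authenticate_values:
        parts = value.split(None, 1)  # scheme token, then optional parameters
        if parts and parts[0].lower() in RECOGNIZED:
            return parts[0].lower()
    return "basic"  # assumed fallback when no value is recognized

first_wins = pick_scheme(['Basic realm="proxy"', "NTLM"])
```

In this sketch, a proxy that lists Basic before NTLM gets Basic, even though NTLM would be the stronger choice.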

For NTLM authentication, Polygraph uses either the NTLMSSP or the GSSAPI (a.k.a. SPNEGO) algorithm. The choice is controlled by the Robot::spnego_auth_ratio setting.

The Proxy-Authorization request header generated by Polygraph and other request aspects can be customized as described below.

NTLM and Negotiate authentication require client-side persistent connections, so make sure they are enabled in your tests! See Agent::pconn_use_lmt and related options for PGL knobs.

2.3 User typos or otherwise invalid credentials

Real users make typos when supplying their credentials in a browser window. Crackers may enumerate well-known passwords in an attempt to get authorization to use proxy services. Polygraph allows you to simulate authentication errors by specifying the probability that an incorrect password is generated for a given transaction. This probability is defined on a per-robot basis using the auth_error field:

Robot goodGuys = {
    // almost all transactions have correct credentials
    auth_error = 0.1%; 

    // these are correct credentials
    credentials = credentials(count(addresses), "good");
    ...
};

Robot crackers = {
    // almost all transactions have incorrect credentials
    auth_error = 99%;

    // it might be a good idea to use the same "good" credentials
    // here because crackers are trying to guess these
    // (and they will guess with 1% probability)
    credentials = goodGuys.credentials;
    ...
};

To generate invalid credentials for a transaction, Polygraph modifies the password field but leaves the username intact. This behavior may change in the future; let us know if you need more knobs.
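The per-transaction decision can be sketched in Python as follows. The credentials_for_transaction helper is hypothetical, and the way the password is altered (appending a suffix) is an assumption; the text above only states that the password changes and the username does not.

```python
import random

def credentials_for_transaction(credentials, auth_error, rng):
    """Return credentials to send, corrupted with probability auth_error."""
    username, _, password = credentials.partition(":")
    if rng.random() < auth_error:
        # Assumption: any change to the password makes it invalid.
        password += "-typo"
    return f"{username}:{password}"

rng = random.Random(0)
always_good = credentials_for_transaction("bob:secret", 0.0, rng)
always_bad = credentials_for_transaction("bob:secret", 1.0, rng)
```

With auth_error set to 0.0 the credentials pass through unchanged, and with 1.0 every transaction in this sketch carries a wrong password for the same username.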

2.4 Custom authentication methods

Web Polygraph supports custom authentication methods via a loadable module interface. The interface provides access to credential information and "407 Proxy Authentication Required" response headers. It is possible to modify transaction headers on the fly to supply different credentials or to supply credentials in a way that differs from the supported authentication schemes. The interface allows for efficient (fast) information exchange; good modules do not slow Polygraph down.

No public modules have been written for Polygraph at the time of writing, but Polygraph users have implemented their private authentication modules, including modules that support NTLM Authentication.

Details of the loadable module interface are documented elsewhere.

2.5 Authentication states and errors

HTTP authentication is often a complex, multi-step process with many opportunities for things to go wrong at each step. The table below shows the outcomes of HTTP response status codes received by a Robot in various authentication states.

Robot configuration | Robot state                | Robot sent          | Robot received               | Outcome
--------------------+----------------------------+---------------------+------------------------------+--------
                    |                            |                     | 401/407 without Authenticate | error: origin/proxy authentication without authenticate headers (errOriginAuthHeaders or errProxyAuthHeaders)
lacks credentials   |                            |                     | 401/407                      | error: origin/proxy authentication with anonymous robot (errOriginAuthWoutCreds or errProxyAuthWoutCreds)
lacks credentials   |                            |                     | 403                          | error: access forbidden to an anonymous robot
has credentials     | no authentication          |                     | 401/407                      | robot starts origin/proxy authentication
has credentials     | no authentication          |                     | 403                          | error: access forbidden before authentication was started
has credentials     | authentication completed   | valid credentials   | 401/407                      | error: origin/proxy re-authentication requested after authentication (errOriginAuthAfterAuth or errProxyAuthAfterAuth)
has credentials     | authentication completed   | valid credentials   | 403                          | error: access forbidden after authentication was completed
has credentials     | authentication completed   | invalid credentials | 401/407                      | success
has credentials     | authentication completed   | invalid credentials | 403                          | success
has credentials     | authentication in progress |                     | 401/407                      | robot continues authentication
has credentials     | authentication in progress |                     | 403                          | error: access forbidden while authentication was in progress

An empty cell means that its value does not affect the outcome.

HTTP 401 (407) responses have WWW-Authenticate (Proxy-Authenticate) header fields unless noted otherwise.

A Robot in the "authentication completed" state has sent credentials and is expecting a response with either the requested object or a permission denied error, depending on whether the sent credentials were valid.

A Robot in the "authentication in progress" state has started the authentication process but has not sent the credentials yet because not all preliminary authentication steps are completed. This state is possible during NTLM but not Basic authentication.
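The outcome rules in this section can also be written as a small decision function. The Python sketch below is only a restatement of the table: the auth_outcome function, its parameter names, and the state labels "none", "in_progress", and "completed" are hypothetical, and the returned strings are shortened versions of the documented outcomes.

```python
def auth_outcome(has_credentials, state, sent_valid, status, has_auth_header=True):
    """Return the outcome for a 401, 403, or 407 response."""
    if status not in (401, 403, 407):
        raise ValueError("only 401, 403, and 407 responses are covered")
    challenge = status in (401, 407)
    if challenge and not has_auth_header:
        return "error: authentication without authenticate headers"
    if not has_credentials:
        return ("error: authentication with anonymous robot" if challenge
                else "error: access forbidden to an anonymous robot")
    if state == "none":
        return ("robot starts authentication" if challenge
                else "error: access forbidden before authentication was started")
    if state == "in_progress":
        return ("robot continues authentication" if challenge
                else "error: access forbidden while authentication was in progress")
    if state != "completed":
        raise ValueError("unknown state: " + state)
    if not sent_valid:
        # A rejection is the expected result of deliberately bad credentials.
        return "success"
    return ("error: re-authentication requested after authentication" if challenge
            else "error: access forbidden after authentication was completed")

outcome = auth_outcome(True, "completed", False, 407)
```

The notable case is the one shown last: a 401/407 or 403 response to deliberately invalid credentials counts as success.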

3. Kerberos

Polygraph has supported Kerberos proxy authentication since v4.8.0. The PGL KerberosWrap type supplies Kerberos-specific options to robots via the Robot::kerberos_wrap field. A simple Kerberos workload is available.

Polygraph uses the HTTP/<proxy-address>@realm string as the service principal. If you want Polygraph to use a domain name in the service principal name, you need to configure DnsResolver and use a domain name to specify the proxy address, as demonstrated by the sample workload.

Currently, only proxy Kerberos authentication is supported: Robots cannot authenticate with origin servers using Kerberos.

Polygraph mimics some real-world client aspects with regard to Kerberos credentials management. Initial Kerberos credentials are acquired when a Robot session starts. Service tickets are acquired when they are needed for the first time. Kerberos credentials are not shared between different Robots, but they are shared between different transactions of the same Robot.

Polygraph supports both UDP and TCP transports for KDC communication. If possible, Polygraph tries UDP first. If no UDP addresses for KDC servers are configured (see the KerberosWrap::servers and servers_udp fields), or if a KDC-over-UDP response results in a KRB5KRB_ERR_RESPONSE_TOO_BIG error, then the Robot switches to TCP. Once a Robot switches to TCP, it will not try UDP again until the end of the session.
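The transport selection rules above can be modeled with a small per-robot state object. The Python sketch below is illustrative only: the KdcTransport class and its method names are hypothetical, and it models just the UDP-or-TCP decision, not any actual KDC communication.

```python
class KdcTransport:
    """Track which transport one robot uses to talk to the KDC."""

    def __init__(self, udp_servers):
        self.udp_servers = list(udp_servers)
        self.start_new_session()

    def start_new_session(self):
        # UDP is tried first whenever UDP KDC addresses are configured.
        self.use_tcp = not self.udp_servers

    def current(self):
        return "tcp" if self.use_tcp else "udp"

    def on_response_too_big(self):
        # Sticky switch: stay on TCP until the session ends.
        self.use_tcp = True

transport = KdcTransport(["10.0.0.9:88"])
first = transport.current()       # "udp"
transport.on_response_too_big()
second = transport.current()      # "tcp"
```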

Polygraph uses the krb5_init_creds_context and krb5_tkt_creds_context APIs to support asynchronous communication with the KDC. These APIs are available in MIT Kerberos v1.9 and newer. Polygraph cannot use the Heimdal Kerberos library because that library lacks the krb5_tkt_creds_context API.

4. LDAP and other external authentication

Polygraph makes it relatively easy to test complex authentication environments involving large group memberships, especially those based on LDAP (Lightweight Directory Access Protocol). This section provides the necessary details.

4.1 Group membership configuration

User group memberships can be expressed in PGL using the MembershipMap type. The membership map specifies available group names, available user credentials, and a distribution for the number of groups a user can join. Here is an almost complete example:

MembershipMap managers = {
    group_space = "mana[1-1000]gers[1-1000]"; // 1M
    member_space = credentials(10000, "mgr"); // 10K
    groups_per_member = exp(50);
};

MembershipMap employees = {
    group_space = "emplo[1-1000]yees[1-1000]"; // 1M
    member_space = credentials(100000, "emp"); // 100K
    groups_per_member = norm(500, 500/3);
};

Robot R = {
    ...
    credentials = select(
        [ managers.member_space, employees.member_space ], 
        1000); // just 1000 actual credentials

    // one robot per credentials pair
    addresses = [ '10.0.0.42' ** count(credentials) ];
};

use(managers, employees);
use(R);

Note that the managers and employees objects above are not groups. They are mappings that tell Polygraph how to generate user groups. The interface allows you to specify very large groups and a very large number of groups. The above PGL code defines one million manager group names, one million employee group names, and 110,000 group member credentials (with only 1000 actual workers simulated via robots).

It is also important to note how the Robot configuration uses credentials from membership maps so that mapping information and robot credentials are kept in sync. The select(array, n) PGL function selects n random entries from the array.
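As a rough analogy, the effect of select() on two member spaces can be sketched in Python. The select function below is a hypothetical stand-in: flattening nested arrays and sampling without replacement are assumptions about the PGL function, and the member lists are made-up placeholders.

```python
import random

def select(array, n, rng=None):
    """Pick n random entries from a possibly nested array."""
    rng = rng or random.Random()
    flat = []
    for item in array:
        if isinstance(item, list):
            flat.extend(item)   # nested member spaces are merged
        else:
            flat.append(item)
    return rng.sample(flat, n)  # assumption: no entry is picked twice

managers = [f"mgr{i}:pw{i}" for i in range(10)]      # placeholder data
employees = [f"emp{i}:pw{i}" for i in range(100)]    # placeholder data
chosen = select([managers, employees], 20, random.Random(7))
```

Because the robot credentials are drawn from the same member spaces that the membership maps use, every simulated user is known to the generated group configuration.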

MembershipMap objects must be use()d to become visible to Polygraph.

4.2 Generating LDIF and other external configurations

Since Polygraph generates credentials internally, one needs some way to export generated credentials so that the proxy (or an authentication server) can be configured using them. When a PGL configuration specifies millions of users and groups, configuring the proxy by hand becomes impossible even if Polygraph reported the credentials that will be used during the test. To address this problem, Polygraph includes a tool called pgl2ldif that can generate LDIF (and other text-based) configuration files based on PGL code. LDIF is the configuration format used by LDAP servers.

To generate a configuration file, one needs to create a template. Polygraph will substitute well-known keywords in the template file with credentials and other authentication-related information. Empty lines separate template "records". For each record, pgl2ldif will repeat substitutions (i.e., instantiate a template) to produce a configuration record for each group and/or for each user. Care is taken to produce a single copy of each record to avoid duplicate configuration entries.

The following keywords, if enclosed in curly braces, are recognized and substituted:

Keyword  | Meaning
---------+--------
group    | group name from the membership map
username | user name part of user credentials from robot configuration
password | password part of user credentials from robot configuration

For example, consider the following template (see the test.ldift file in the test_auth.tgz archive):

# Organization for Test Corporation
dn: dc=test,dc=com
objectClass: dcObject
objectClass: organization
dc: test
o: Test Corporation
description: The Test Corporation

# Organizational Role for Directory Manager in group {group}
dn: cn={group} Manager,dc=test,dc=com
objectClass: organizationalRole
cn: {group} Manager
description: Directory Manager of {group} group

# a {username} user
dn: cn={username},dc={group},dc=test,dc=com
objectClass: person
cn: {username}
sn: {username}
userPassword: {password}

... and the following PGL code (see auth.pg).

...
MembershipMap managers = {
    group_space = "mana[1-1000]gers[1-1000]";
    member_space = credentials(10000, "mgr");
    groups_per_member = exp(10);
};

MembershipMap employees = {
    group_space = "emplo[1-1000]yees[1-1000]";
    member_space = credentials(100000, "emp");
    groups_per_member = norm(100, 100/3);
};

Robot R = {
    ...
    credentials = select(
        [ managers.member_space, employees.member_space ],
        1000);
};

The pgl2ldif command will produce the following standard error output (useful statistics and such),

% pgl2ldif --template test.ldift --config auth.pg > test.ldif
fyi: templates:                               3
fyi: use()d MembershipMaps:                   2
fyi: possible user group names:         2000000
fyi: possible user credentials:          110000
fyi: user credentials:                     1000
fyi: user groups:                         45086
fyi: instantiated templates:             138216

while generating a 27MB LDIF file (test.ldif). The beginning and the end of test.ldif are quoted below.

# Organization for Test Corporation
dn: dc=test,dc=com
objectClass: dcObject
objectClass: organization
dc: test
o: Test Corporation
description: The Test Corporation


# Organizational Role for Directory Manager in group mana15gers3
dn: cn=mana15gers3 Manager,dc=test,dc=com
objectClass: organizationalRole
cn: mana15gers3 Manager
description: Directory Manager of mana15gers3 group


# Organizational Role for Directory Manager in group mana312gers817
dn: cn=mana312gers817 Manager,dc=test,dc=com
objectClass: organizationalRole
cn: mana312gers817 Manager
description: Directory Manager of mana312gers817 group


# Organizational Role for Directory Manager in group mana10gers189
dn: cn=mana10gers189 Manager,dc=test,dc=com
objectClass: organizationalRole
cn: mana10gers189 Manager
description: Directory Manager of mana10gers189 group


# Organizational Role for Directory Manager in group mana695gers597
dn: cn=mana695gers597 Manager,dc=test,dc=com
objectClass: organizationalRole
cn: mana695gers597 Manager
description: Directory Manager of mana695gers597 group


....


# a user40863_e555@emp user
dn: cn=user40863_e555@emp,dc=emplo88yees671,dc=test,dc=com
objectClass: person
cn: user40863_e555@emp
sn: user40863_e555@emp
userPassword: pw7134def4x

# a user40863_e555@emp user
dn: cn=user40863_e555@emp,dc=emplo188yees863,dc=test,dc=com
objectClass: person
cn: user40863_e555@emp
sn: user40863_e555@emp
userPassword: pw7134def4x

# a user40863_e555@emp user
dn: cn=user40863_e555@emp,dc=emplo56yees322,dc=test,dc=com
objectClass: person
cn: user40863_e555@emp
sn: user40863_e555@emp
userPassword: pw7134def4x

# a user40863_e555@emp user
dn: cn=user40863_e555@emp,dc=emplo965yees990,dc=test,dc=com
objectClass: person
cn: user40863_e555@emp
sn: user40863_e555@emp
userPassword: pw7134def4x
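The record instantiation and de-duplication described in this section can be approximated with a short Python sketch. This is not how pgl2ldif is implemented: the instantiate function and the (group, username, password) membership tuples are hypothetical, and the sketch assumes at least one membership exists.

```python
def instantiate(template, memberships):
    """Expand template records for (group, username, password) tuples."""
    # Empty lines separate template records.
    records = [r for r in template.split("\n\n") if r.strip()]
    seen = set()
    output = []
    for record in records:
        for group, username, password in memberships:
            text = (record.replace("{group}", group)
                          .replace("{username}", username)
                          .replace("{password}", password))
            # A keyword-free record expands to the same text every time,
            # so it is emitted once; the same holds per group or per user.
            if text not in seen:
                seen.add(text)
                output.append(text)
    return output

template = ("o: Test Corporation\n\n"
            "cn: {group} Manager\n\n"
            "cn: {username}\ndc: {group}\nuserPassword: {password}")
memberships = [("g1", "u1", "p1"), ("g2", "u1", "p1"), ("g1", "u2", "p2")]
ldif_records = instantiate(template, memberships)
```

For the three memberships above, the sketch yields one organization record, two group records, and three user records, in that order.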

Note that the same pgl2ldif tool can be used to enumerate all used groups and their members in a human-friendly format. For example, this one-line [non-LDIF] template

group: {group} member: {username} / {password}

... applied to auth.pg would yield the following output (88300 records total, separated by empty lines not shown below):

group: emplo7yees765 member: user78120_a985@emp / pw49ef0c4fx
group: emplo68yees33 member: user78120_a985@emp / pw49ef0c4fx
group: emplo1000yees198 member: user78120_a985@emp / pw49ef0c4fx
group: emplo819yees254 member: user78120_a985@emp / pw49ef0c4fx
group: emplo263yees684 member: user78120_a985@emp / pw49ef0c4fx
...
group: emplo513yees318 member: user18317_baa9@emp / pw4f8958dx
group: emplo927yees289 member: user18317_baa9@emp / pw4f8958dx
group: emplo806yees263 member: user18317_baa9@emp / pw4f8958dx
group: emplo184yees556 member: user18317_baa9@emp / pw4f8958dx
group: emplo32yees431 member: user18317_baa9@emp / pw4f8958dx

4.3 The Plan: Putting it all together

Follow these steps to create an external authentication test.

  1. Design the workload that mimics your authentication environment. Start with the simplest workload you can write and add more features/objects later, after the first successful tests.

  2. Write an LDIF or similar template file and produce an LDAP (or similar) configuration file using the pgl2ldif tool.

  3. Feed the generated configuration file to your [LDAP] authentication server. Make sure the server can parse it.

  4. Configure the device under test to use your authentication server. Make sure the device can talk to the server.

  5. Start the test using the original PGL configuration you developed in the first step. Watch out for authentication errors on the Polygraph client and the proxy. It may be a good idea to use the auth_error PGL setting or deliberately misconfigure one or the other to cause some errors, just to make sure that authentication is engaged and working.

  6. Scale the PGL configuration to include more users and groups, and experiment with other settings.