HTTP Range request support

This page describes Polygraph features related to HTTP Range requests. Range support is available starting with Polygraph version 3.2.

Table of Contents

1. For the impatient
2. Introduction
3. Server side
4. Client side
    4.1 Robot configuration
    4.2 Single-range configuration
    4.3 Multi-range configuration
5. Stats
6. Range generation limitations
    6.1 Single-range requests
    6.2 Multi-range requests

1. For the impatient

// All examples assume 1,000-byte entity

SingleRange range1 = {
    first_byte_pos_absolute = 30Byte;   // Range: bytes=30-300
    last_byte_pos_relative = 30%;
};

SingleRange range2 = {
    suffix_length_relative = 10%;       // Range: bytes=-100
};

SingleRange range3 = {
    suffix_length_absolute = 128Byte;   // Range: bytes=-128
};

// e.g., Range: bytes=28-175,382-399,510-541,644-744,977-980
MultiRange rangeM = {
    first_range_start_absolute = exp(15Byte); // very first octet pos
    range_length_relative = unif(1%, 10%);    // random spec length
    range_count = const(5);                   // number of range specs
};

// Half of generated requests will have a specified Range header.
Robot R = {
    req_types = ["Basic", "Range": 50% ];
    ranges = [ range1, range2: 10%, range3, rangeM: 20% ];
};

2. Introduction

An HTTP client may request partial entity content using a Range header. Range requests may significantly decrease bandwidth usage and response times, but they present serious protocol and performance challenges for caching proxies.

HTTP Range functionality is defined in sections 14.16, 14.35, and 19.2 of RFC 2616. The syntax of the Range header value is defined as follows:

byte-ranges-specifier = bytes-unit "=" byte-range-set
byte-range-set  = 1#( byte-range-spec | suffix-byte-range-spec )
byte-range-spec = first-byte-pos "-" [last-byte-pos]
suffix-byte-range-spec = "-" suffix-length

Polygraph supports both single- and multi-spec Range requests. Polygraph servers generate "206 Partial Content" or "416 Requested Range Not Satisfiable" responses, depending on whether the requested range is satisfiable. When a satisfiable multi-range request is received, the server replies with a conforming multipart/byteranges response, as described in RFC 2616, section 19.2.
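The grammar above can be read as a small parser. The following Python sketch is only an illustration of the BNF, not Polygraph code; the function name parse_range_header and the tuple layout are assumptions made for this example, and optional whitespace is handled loosely.

```python
import re

# One range spec: "a-b", "a-", or "-n" (ASCII digits only, per the BNF).
_SPEC_RE = re.compile(r"^([0-9]*)-([0-9]*)$")

def parse_range_header(value):
    """Parse a Range header value into (first, last, suffix) tuples.

    Hypothetical helper: a byte-range-spec yields (first, last_or_None, None),
    a suffix-byte-range-spec yields (None, None, suffix_length).
    Returns None when the value does not match the grammar.
    """
    unit, sep, range_set = value.partition("=")
    if sep != "=" or unit.strip() != "bytes":
        return None
    specs = []
    for item in range_set.split(","):
        match = _SPEC_RE.match(item.strip())
        if not match:
            return None
        first, last = match.group(1), match.group(2)
        if first == "" and last == "":
            return None  # a bare "-" is neither form
        if first == "":
            specs.append((None, None, int(last)))  # suffix-byte-range-spec
        else:
            specs.append((int(first), int(last) if last else None, None))
    return specs
```

Note that the grammar alone accepts "bytes=1-0"; whether such a value is treated as valid is a separate check on the parsed positions.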

3. Server side

No server-side configuration is required to enable support for Range requests. When a Polygraph server receives a request with a Range header, it replies with a proper 206 Partial Content or 416 Requested Range Not Satisfiable response.

If a server receives a syntactically invalid Range request (e.g., Range: bytes=1-0), the Range header is ignored and an ordinary full-entity response is generated. This conforms to RFC 2616, section 14.35.1.

If the last-byte-pos value is absent, or if the value is greater than or equal to the current length of the entity-body, then last-byte-pos is taken to be equal to one less than the length of the entity body in bytes.

If the entity is shorter than the specified suffix-length, the entire entity body is used.

If a syntactically valid byte-range-set includes at least one byte-range-spec whose first-byte-pos is less than the current length of the entity-body, or at least one suffix-byte-range-spec with a non-zero suffix-length, then the byte-range-set is satisfiable. Otherwise, the byte-range-set is unsatisfiable.

If byte-range-set is satisfiable, the Polygraph server sends a 206 Partial Content response. Otherwise, the server sends a 416 Requested Range Not Satisfiable response.
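The rules above can be summarized in a few lines of Python. This is a minimal sketch of the described behavior, not the Polygraph server implementation; the name resolve_ranges and the (first, last, suffix) tuple layout are assumptions made for this example, and the entity is assumed to be non-empty.

```python
def resolve_ranges(specs, entity_length):
    """Return (status, byte_ranges) for parsed range specs.

    specs: list of (first, last, suffix) tuples, where a byte-range-spec is
    (first, last_or_None, None) and a suffix spec is (None, None, suffix).
    status: 200 (Range header ignored), 206, or 416.
    Assumes entity_length > 0.
    """
    # Syntactically invalid set: some last-byte-pos is below its first-byte-pos.
    for first, last, suffix in specs:
        if first is not None and last is not None and last < first:
            return 200, []

    resolved = []
    for first, last, suffix in specs:
        if suffix is not None:
            if suffix == 0:
                continue  # a zero-length suffix is never satisfiable
            length = min(suffix, entity_length)  # short entity: use all of it
            resolved.append((entity_length - length, entity_length - 1))
        elif first < entity_length:
            if last is None or last >= entity_length:
                last = entity_length - 1  # clamp to the last entity byte
            resolved.append((first, last))

    if not resolved:
        return 416, []
    return 206, resolved
```

For a 1,000-byte entity, resolve_ranges([(500, 5000, None)], 1000) yields a 206 status with the single range 500-999, while resolve_ranges([(1000, None, None)], 1000) yields 416.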

Although RFC 2616 allows the server to merge overlapping byte-range-specs, Polygraph does not do it. Ranges are returned in the same order they are requested.

When serving ranges, the Polygraph server sends the correct content part(s), even for randomly generated response bodies. Thus, a proxy can merge and cache generated range responses without corrupting the content.
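For multi-range replies, each content part travels inside a multipart/byteranges body (RFC 2616, section 19.2). The Python sketch below shows the general shape of such a body. It is not Polygraph's actual output: the function name, the boundary string, and the per-part Content-Type are assumptions chosen for illustration.

```python
def build_byteranges_body(entity, ranges, boundary="THIS_STRING_SEPARATES"):
    """Build a multipart/byteranges body for already resolved ranges.

    entity: the full entity body as bytes.
    ranges: list of inclusive (first, last) byte positions, in request order.
    The per-part Content-Type is an assumption for this sketch.
    """
    total = len(entity)
    chunks = []
    for first, last in ranges:
        part_header = (
            f"--{boundary}\r\n"
            f"Content-Type: application/octet-stream\r\n"
            f"Content-Range: bytes {first}-{last}/{total}\r\n"
            f"\r\n"
        )
        chunks.append(part_header.encode("ascii"))
        chunks.append(entity[first:last + 1])  # last-byte-pos is inclusive
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))  # closing delimiter
    return b"".join(chunks)
```

A proxy that reassembles an entity from such parts relies on the Content-Range of every part being accurate, which is why the server must send the exact bytes at the requested positions.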

4. Client side

4.1 Robot configuration

Two Robot parameters define how Polygraph robots generate Range requests: req_types and ranges.

Robot R = {
    // 10% of requests have a Range header
    req_types = ["Basic", "IMS": 20%, "Range": 10% ]; 

    // 20% of Range requests will use multi-spec Range header
    ranges = [ range1, range2: 10%, range3, rangeM: 20% ];
    ...
};

When a Robot's req_types selector includes a "Range" type, Polygraph sends Range requests with the specified probability.

Once the decision to generate a Range header has been made, the ranges selector defines what kind of Range header value should be sent. The selector may contain SingleRange and MultiRange PGL objects, with selection probabilities.
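The two-step selection can be worked through numerically. The Python sketch below assumes that selector entries without an explicit probability share the remaining probability equally; that rule is an assumption of this example and is not stated on this page. The helper name fill_probabilities is hypothetical.

```python
def fill_probabilities(entries):
    """Assign percentages to selector entries that have none.

    entries: list of (name, percent_or_None) pairs.
    Assumption: entries without an explicit percentage split the
    remainder evenly.
    """
    explicit = sum(pct for _, pct in entries if pct is not None)
    unset = [name for name, pct in entries if pct is None]
    share = (100 - explicit) / len(unset) if unset else 0.0
    return {name: (share if pct is None else pct) for name, pct in entries}


# The ranges selector from the Robot example above.
ranges = [("range1", None), ("range2", 10), ("range3", None), ("rangeM", 20)]
range_probs = fill_probabilities(ranges)
```

Under that assumption, range1 and range3 are each used for 35% of Range requests. Independently of it, the Robot above sends a Range header in 10% of requests and makes 20% of those multi-range, so about 2% of all requests carry a multi-spec Range header.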

4.2 Single-range configuration

A single range specification is configured using the SingleRange PGL type. It follows the BNF from RFC 2616, section 14.35.1, and gives full control over generation of a single-spec byte-range-set. All types of single ranges can be generated. See above and below for examples.

If you want to generate tiny one-byte ranges, one of the following two configurations may work:

SingleRange TenthByteRange = {
    first_byte_pos_absolute = 9Byte;
    last_byte_pos_absolute = 9Byte;
};

SingleRange LastByteRange = {
    suffix_length_absolute = 1Byte;
};

The following PGL specs will generate large or full-entity ranges:

SingleRange AllBytesRange = {
    first_byte_pos_absolute = 0Byte;
};

// last 10MB of content (or full body for smaller entities)
SingleRange Last10MByteRange = {
    suffix_length_absolute = 10MB;
};

The first_byte_pos_absolute, last_byte_pos_absolute, and suffix_length_absolute parameters have relative counterparts. They work the same way as absolute values but are calculated at runtime as (relative_portion*entity_length). Absolute and relative parameters are mutually exclusive.

Some configurations may generate invalid ranges (e.g., when first_byte_pos exceeds last_byte_pos). In such cases, last_byte_pos is set to first_byte_pos and the "range_gen.first_last_swap" statistic counter is incremented.
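The runtime calculation can be sketched in Python. This is an illustration of the rules described above, not Polygraph source code; the function name, the use of integer percentages, and the rounding (integer division) are assumptions made for this example.

```python
def single_range_header(entity_length, first_abs=None, first_pct=None,
                        last_abs=None, last_pct=None,
                        suffix_abs=None, suffix_pct=None):
    """Return (header_value, swapped) for one SingleRange-like config.

    *_abs values are byte counts or positions; *_pct values are integer
    percentages of the entity length. swapped is True when the case
    counted by range_gen.first_last_swap occurs.
    """
    def pick(absolute, percent):
        if absolute is not None and percent is not None:
            raise ValueError("absolute and relative values are mutually exclusive")
        if percent is not None:
            # relative_portion * entity_length; truncation is an assumption
            return percent * entity_length // 100
        return absolute

    suffix = pick(suffix_abs, suffix_pct)
    if suffix is not None:
        return f"bytes=-{suffix}", False

    first = pick(first_abs, first_pct)
    if first is None:
        raise ValueError("need a first byte position or a suffix length")
    last = pick(last_abs, last_pct)
    if last is None:
        return f"bytes={first}-", False

    swapped = first > last
    if swapped:
        last = first  # last_byte_pos is set to first_byte_pos
    return f"bytes={first}-{last}", swapped
```

For a 1,000-byte entity, single_range_header(1000, first_abs=30, last_pct=30) gives "bytes=30-300", matching range1 from the first section, while first_abs=500 with last_pct=30 gives "bytes=500-500" and reports a swap.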

4.3 Multi-range configuration

A multiple range specification is configured using the MultiRange PGL type. Not all types of multi-range sets can be generated: only those of the form "bytes=a-b,c-d,..." are supported. See above and below for examples.

The gap between the ranges is calculated using a hard-coded exponential distribution. The mean for the range-spec length distribution is used to estimate the average gap between the range specs.

If the start of the first range-spec is not specified, its position is computed using the same random gap approach to achieve more-or-less even but random range positions throughout the entity body.

If you want to generate many small ranges, consider the following approach:

// Make 10-20 one-byte range-specs, starting from the first byte
MultiRange OneByteSpecsRange = {
    first_range_start_absolute = const(0Byte); // start from the 1st byte
    range_length_absolute = const(1Byte); // one byte per spec
    range_count = unif(10, 20); // generate 10 to 20 range-specs
};

The absolute first range start and the range length parameters have relative counterparts. They work in the same way as absolute values but are calculated at runtime as (relative_portion*entity_length). For example, the following range is similar to the OneByteSpecsRange above, but uses 1% of the entity length instead of one byte.

MultiRange OnePercentSpecsRange = {
    first_range_start_absolute = const(0Byte); // start from the 1st byte
    range_length_relative = const(1%); // one percent/spec
    range_count = unif(10, 20); // generate 10 to 20 range-specs
};

Absolute and relative parameters are mutually exclusive.

If the position of the first range set byte is larger than the entity length, then the range generation stops and the "range_gen.set_overflow" statistics counter is incremented.
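The generation procedure described in this section can be approximated as follows. This Python sketch is a loose model, not Polygraph's algorithm: the function and parameter names are hypothetical, the gap mean is simply set to the mean range length, and a start position equal to the entity length is treated as overflow.

```python
import random

def multi_range_header(entity_length, range_count, length_sampler,
                       mean_length, first_start=None, rng=random):
    """Return (header_value_or_None, overflow) for one multi-range set.

    length_sampler: callable returning the next range-spec length in bytes.
    mean_length: mean of the length distribution, reused as the mean gap.
    overflow mirrors the case counted by range_gen.set_overflow.
    """
    gap_rate = 1.0 / mean_length  # exponential gaps with mean mean_length
    if first_start is None:
        pos = int(rng.expovariate(gap_rate))  # random first position
    else:
        pos = first_start

    specs = []
    overflow = False
    for _ in range(range_count):
        if pos >= entity_length:  # start beyond the entity: stop generating
            overflow = True
            break
        length = max(1, int(length_sampler()))
        last = pos + length - 1
        specs.append(f"{pos}-{last}")
        pos = last + 1 + int(rng.expovariate(gap_rate))  # skip a random gap

    header = "bytes=" + ",".join(specs) if specs else None
    return header, overflow
```

Because every new start position lies past the previous last position, the generated specs always have ascending, non-overlapping positions, which matches the "bytes=a-b,c-d,..." form.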

5. Stats

Range generation statistics are collected in the range_gen object, which is accessible via the lx tool. The following measurements are recorded:

spec_size

Statistics for a single range spec. A Range request with multiple range specs will update this statistic multiple times.

For example, a "0-9" range request will add 10 bytes to the spec_size measurement, and a "0-4,10-19" multi-range request will record two values: 5 and 10 bytes.

set_size

Statistics for a single Range request header.

For example, a "0-9" range request will add 10 bytes to the set_size measurement, and a "0-4,10-19" multi-range request will contribute 15 bytes. Each request records one value.

specs_per_set

Mean number of byte range specs per Range request. The value is calculated as (spec_size.count/set_size.count).

first_last_swap

When a range spec is generated, the first byte position may exceed the last byte position. In that case, the last byte position is set to the first byte position and the swap counter is incremented. This can only happen during single-range generation.

spec_overflow

When a single-range request with an explicit absolute offset is generated, the first byte position may exceed the entity size. Polygraph does not adjust such unsatisfiable Range requests. This statistic counts the number of such cases.

set_overflow

When a multi-range request is generated, the first byte position of a range spec may exceed the entity size. In such cases, this counter is incremented and the range set generation stops.
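The size measurements can be illustrated with a short Python sketch. It only models how the recorded values relate to each other and is not how Polygraph or lx computes them; the function name range_size_stats is hypothetical, and only specs of the form "a-b" are handled.

```python
def range_size_stats(headers):
    """Collect spec_size and set_size samples from Range header values.

    headers: list of values such as "bytes=0-4,10-19" ("a-b" specs only).
    Returns the samples plus the mean number of specs per request.
    """
    spec_sizes = []  # one sample per range spec
    set_sizes = []   # one sample per Range request
    for header in headers:
        total = 0
        for spec in header.split("=", 1)[1].split(","):
            first, last = spec.split("-")
            size = int(last) - int(first) + 1  # both positions are inclusive
            spec_sizes.append(size)
            total += size
        set_sizes.append(total)
    return {
        "spec_size": spec_sizes,
        "set_size": set_sizes,
        "specs_per_set": len(spec_sizes) / len(set_sizes),
    }


stats = range_size_stats(["bytes=0-9", "bytes=0-4,10-19"])
```

For these two requests, spec_size receives the samples 10, 5, and 10, set_size receives 10 and 15, and the mean is 1.5 specs per request.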

6. Range generation limitations

This section documents what kind of Range requests cannot be generated by Polygraph (yet).

6.1 Single-range requests

It is possible to generate all types of syntactically valid single-range sets (e.g., "a-b", "a-", "-b"), including sets exceeding the entity size. It is also possible to generate an unsatisfiable suffix range: "-0".

However, there is no way to generate a syntactically invalid "a-b" range set, where "a" exceeds "b". If such a range is configured or generated internally, an "a-a" value is actually used.

6.2 Multi-range requests

Only valid "a-b,c-d,..." range specs with ascending first positions can be generated.

It is not possible to generate the following multi-range specs: