Rate limiting

To ensure the stability of the system and good performance for all users, we have to protect it against deliberate attacks and runaway scripts. Every request that reaches our system therefore goes through a rate limiting filter. As long as the agreed upon rate limiting policies are respected, responses from our services will be delivered in a timely fashion. Requests that violate any of the agreed upon policies will receive an HTTP 429 (Too Many Requests) response.

Rate limiting policies can be adjusted for each individual user, so contact our support if you have specific requirements.

Rate limiting policy

A rate limiting policy defines how many processing units or HTTP requests can be used, either per given time period or in total. Both processing units and requests are rate limited, and the level of rate limiting depends on your account (see pricing plans). You can see your current rate limiting policies in the Sentinel Hub Dashboard:

[Screenshot: rate limiting policies shown in the Sentinel Hub Dashboard]

It is possible to have multiple rate limiting policies. Our standard accounts include both a processing unit and a request rate limiting policy. To conform to our rate limiter, a request must pass each of your rate limiting policies. For example, let's say you have a policy of 20 requests per second and a policy of 10000 requests per day. If you issue 100 requests in one second, only 20 requests will pass. Even though you have a limit of 10000 requests per day, the remaining 80 requests violate the “per second” policy and are therefore rate limited.

Unused processing units and requests do not accumulate. If you have a rate limiting policy of 20 requests per second and you do not consume any requests for a longer period, you can still make only 20 requests in the next second.
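
The following sketch is only an illustrative client-side model of the behaviour described above (it is not our server-side implementation): each policy behaves like a token bucket capped at its capacity, and a request passes only if every policy still has budget left. The 20-requests-per-second and 10000-requests-per-day policies are the hypothetical ones from the example.

import time

class PolicyBucket:
    def __init__(self, capacity, period_seconds):
        self.capacity = capacity            # e.g. 20 requests or 10 processing units
        self.period = period_seconds        # e.g. 1 for "per second"
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def refill(self):
        now = time.monotonic()
        gained = (now - self.last_refill) / self.period * self.capacity
        # Capped at capacity: unused requests/processing units do not accumulate.
        self.tokens = min(self.capacity, self.tokens + gained)
        self.last_refill = now

# Hypothetical policies: 20 requests per second and 10000 requests per day.
policies = [PolicyBucket(20, 1), PolicyBucket(10000, 24 * 3600)]

def request_allowed(cost=1):
    for bucket in policies:
        bucket.refill()
    if any(bucket.tokens < cost for bucket in policies):
        return False                        # violating any single policy rate limits the request
    for bucket in policies:
        bucket.tokens -= cost
    return True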

Response Headers

All requests going through rate limiting include headers that allow clients to adapt to rate limiting programmatically:

  • Retry-After: Time until the next request is available, in milliseconds.
  • X-ProcessingUnits-Spent: The number of processing units that the request spent.
  • X-RateLimit-Remaining: The number of requests which can be used immediately.
  • X-ProcessingUnits-Remaining: The number of processing units remaining for usage.
  • X-ProcessingUnits-Retry-After: Time until enough processing units will be available to execute the request, in milliseconds.
  • X-RateLimit-ViolatedPolicy: The rate limiting policy which has been violated:
    • capacity: The number of processing units or requests agreed to in the violated rate limiting policy.
    • samplingPeriod: The time period agreed to in the violated rate limiting policy. Possible values are PT1S (second), PT1M (minute), PT1H (hour), PT1D (day) and P31D (month).

Note: The X-RateLimit-ViolatedPolicy header is only returned in rate limited (HTTP 429) responses.

Example:

curl -I http://services.sentinel-hub.com/ogc/wms/{instanceId}
HTTP/1.1 429 Too Many Requests
Date: Wed, 28 Aug 2019 07:49:49 GMT
Retry-After: 0
X-RateLimit-Remaining: 287.0
X-ProcessingUnits-Remaining: 14
X-ProcessingUnits-Retry-After: 593
X-RateLimit-ViolatedPolicy: {"samplingPeriod": "PT1M", "capacity": 1000}

The HTTP status code in this example is 429, meaning that the request was rate limited. The value of the Retry-After header is 0, which means that the next request is already available. There are in fact 287 remaining (available) requests, so no request-related rate limiting policy was violated by this request.

The value of the X-ProcessingUnits-Retry-After header is 593, meaning it will take 593 milliseconds to refill the processing units budget with enough processing units to execute the request. There are 14 processing units currently available, which were not enough to execute the request. We can see that the request violated the "1000 processing units per minute" rate limiting policy.
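
A minimal sketch of reading these headers programmatically, assuming the Python requests library; {instanceId} is the placeholder from the example above and must be replaced with your own instance id:

import requests

# Replace {instanceId} with your own WMS instance id.
response = requests.get("https://services.sentinel-hub.com/ogc/wms/{instanceId}")

if response.status_code == 429:
    # Both retry headers are reported in milliseconds; wait for the later of the two.
    wait_ms = max(
        float(response.headers.get("Retry-After", 0)),
        float(response.headers.get("X-ProcessingUnits-Retry-After", 0)),
    )
    violated = response.headers.get("X-RateLimit-ViolatedPolicy")  # present only on 429
    print(f"Rate limited by {violated}; retry in {wait_ms:.0f} ms")
else:
    print("OK,",
          response.headers.get("X-RateLimit-Remaining"), "requests and",
          response.headers.get("X-ProcessingUnits-Remaining"), "processing units remaining")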

Try it out

We have set up a test user with two very restrictive rate limiting policies:

  • 10 requests per minute and
  • 10 processing units per minute

You can use its instance (for OGC requests) or its OAuth client credentials (for API requests) to test how our rate limiting works and for integration purposes.

An example of a WMS request using the test user's instance:

https://services.sentinel-hub.com/ogc/wms/7702fda8-f583-4ae0-a581-1b34e7a6d350?

The test user's OAuth client credentials below can be used to get an access token, which can then be included in the header of Process API requests (for examples of requests see here):

Client id: fa02a066-fc80-4cb4-af26-aae0af26cbf1
Client secret: rate_limit_secret

Note that many people may be using this account at the same time, so there is a good chance that it will be over the limit more or less all the time. Its main purpose is to let you evaluate the response headers anyway.
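
As a rough sketch, obtaining a token with these credentials via the standard OAuth2 client-credentials flow could look like the snippet below. The token endpoint URL shown here is an assumption; refer to the authentication documentation for the authoritative URL and for Process API request examples.

import requests

# Assumed token endpoint for the client-credentials flow; check the authentication docs.
token_response = requests.post(
    "https://services.sentinel-hub.com/oauth/token",
    data={
        "grant_type": "client_credentials",
        "client_id": "fa02a066-fc80-4cb4-af26-aae0af26cbf1",
        "client_secret": "rate_limit_secret",
    },
)
access_token = token_response.json()["access_token"]

# Include the token in the Authorization header of your API requests.
headers = {"Authorization": f"Bearer {access_token}"}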

Tips to Avoid Being Rate Limited

Caching

Store API responses that you expect to use a lot. For example, don't repeat the same requests on every page load; store the responses in local storage instead.
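
A browser client would typically use local storage for this; a minimal in-memory sketch of the same idea, assuming the Python requests library, could look like this:

import requests

_cache = {}

def cached_get(url, params=None):
    # Identical requests are answered from the cache and cost no extra API calls.
    key = (url, tuple(sorted((params or {}).items())))
    if key not in _cache:
        _cache[key] = requests.get(url, params=params)
    return _cache[key]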

Request only what you need

Be defensive in fetching and try to request only the data that you actually need.

Exponential backoff

When your limits have been exceeded, we recommend implementing retries with an exponential backoff. Exponential backoff means that you wait for exponentially longer intervals between each retry of a single failing request.
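
A minimal sketch of such a retry loop, assuming the Python requests library; the retry count and base delay are illustrative, and the service's own Retry-After headers (in milliseconds) are respected when they ask for a longer wait:

import time
import requests

def get_with_backoff(url, max_retries=5, base_delay=1.0, **kwargs):
    delay = base_delay
    for _ in range(max_retries):
        response = requests.get(url, **kwargs)
        if response.status_code != 429:
            return response
        # Wait at least as long as the service asks for (headers are in milliseconds),
        # and never less than the current backoff delay.
        retry_after = max(
            float(response.headers.get("Retry-After", 0)),
            float(response.headers.get("X-ProcessingUnits-Retry-After", 0)),
        ) / 1000.0
        time.sleep(max(delay, retry_after))
        delay *= 2  # back off exponentially before the next retry
    return response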