Batch Processing API

The Batch Processing API is only available to enterprise users. If you don't have an enterprise account and would like to try it out, contact us for a custom offer. It is currently supported only on the EU-Central-1 (Frankfurt) deployment.

The Batch Processing API (or "batch API" for short) enables you to request data for large areas and/or longer time periods.

It is an asynchronous REST service. This means that data will not be returned immediately in the request response but will be delivered to your object storage, which needs to be specified in the request (e.g. an S3 bucket; see AWS S3 bucket settings below). The processing results will be divided into tiles as described below.

Workflow

The batch processing API comes with a set of REST endpoints which support the execution of various workflows. The diagram below shows all possible statuses of a batch processing request (CREATED, ANALYSING, ANALYSIS_DONE, PROCESSING, DONE, PARTIAL, FAILED, CANCELED) and the user actions (ANALYSE, START, CANCEL) which trigger transitions among them.

[Diagram: batch request statuses CREATED, ANALYSING, ANALYSIS_DONE, PROCESSING, DONE, PARTIAL, FAILED and CANCELED, with the user actions ANALYSE, START and CANCEL that trigger transitions between them.]

The workflow starts when a user posts a new batch processing request. In this step the system:

  • creates a new batch processing request with the status CREATED,
  • validates the user's inputs, and
  • returns an estimated number of output tiles that will be processed.
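
For illustration, posting such a request with Python could look roughly like the sketch below. It is a sketch only: it assumes an OAuth client (client_id, client_secret) as described in the note under Tiling grids, and the evalscript, geometry, time range, data collection and bucket name are placeholders; the exact body fields are those of the Batch API reference.

# Create an authenticated OAuth2 session; the same `oauth` object is reused
# in the other snippets in this document. client_id and client_secret belong
# to your OAuth client (see the note under Tiling grids).
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session

client = BackendApplicationClient(client_id=client_id)
oauth = OAuth2Session(client=client)
oauth.fetch_token(
    token_url="https://services.sentinel-hub.com/oauth/token",
    client_secret=client_secret,
)

# A placeholder evalscript with a single output.
evalscript = """
//VERSION=3
function setup() {
  return { input: ["B04"], output: { id: "default", bands: 1 } };
}
function evaluatePixel(sample) {
  return [sample.B04];
}
"""

# Post a new batch processing request; its status will be CREATED.
url = "https://services.sentinel-hub.com/api/v1/batch/process/"
payload = {
    "processRequest": {
        "input": {
            "bounds": {
                "geometry": {
                    "type": "Polygon",
                    "coordinates": [[[12.4, 41.8], [12.6, 41.8], [12.6, 42.0],
                                     [12.4, 42.0], [12.4, 41.8]]],
                }
            },
            "data": [{
                "type": "sentinel-2-l2a",
                "dataFilter": {"timeRange": {"from": "2020-06-01T00:00:00Z",
                                             "to": "2020-06-30T23:59:59Z"}},
            }],
        },
        "evalscript": evalscript,
        "output": {
            "responses": [{"identifier": "default",
                           "format": {"type": "image/tiff"}}],
        },
    },
    "tilingGrid": {"id": 1, "resolution": 10.0},
    "bucketName": "<bucket_name>",  # your S3 bucket, see the settings below
    "description": "example batch request",
}
response = oauth.request("POST", url, json=payload)
batch_request = response.json()
batch_request_id = batch_request["id"]  # used by the snippets that follow

The response body of this call also contains the estimated number of output tiles mentioned above.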

The user can then decide to either request an additional analysis of the request, start the processing, or cancel the request. When additional analysis is requested:

  • the status of the request changes to ANALYSING,
  • the evalscript is validated,
  • a list of required tiles is created, and
  • the request's cost is estimated (i.e. the estimated number of processing units (PU) needed for the requested processing).

After the analysis is finished, the status of the request changes to ANALYSIS_DONE.
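
For illustration, requesting this additional analysis could be sketched as follows, assuming the ANALYSE action is exposed as an endpoint under the batch request created above:

url = f"https://services.sentinel-hub.com/api/v1/batch/process/{batch_request_id}/analyse"
oauth.request("POST", url)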

If the user chooses to start the processing directly, the system still executes the analysis, but once the analysis is done it automatically continues with processing. This is not explicitly shown in the diagram in order to keep it simple.

The user can now request a list of tiles for their request, start the processing, or cancel the request. When the user starts the processing:

  • the estimated number of PUs is reserved,
  • the status of the request changes to PROCESSING (this may take a while),
  • the processing starts.
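
Starting the processing could be sketched analogously, again assuming an action endpoint under the request created above:

url = f"https://services.sentinel-hub.com/api/v1/batch/process/{batch_request_id}/start"
oauth.request("POST", url)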

When the processing finishes, the status of the request changes to:

  • FAILED when all tiles failed processing,
  • PARTIAL when some tiles were processed and some failed,
  • DONE when all tiles were processed.
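
Because the service is asynchronous, the status has to be polled; a minimal sketch, assuming the request object returned by the API carries a status field:

import time

url = f"https://services.sentinel-hub.com/api/v1/batch/process/{batch_request_id}"
while True:
    status = oauth.request("GET", url).json()["status"]
    if status in ("DONE", "PARTIAL", "FAILED"):
        break
    time.sleep(60)  # large requests can take a while to process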

Although the process has built-in fault tolerance, tile processing may occasionally fail. Such FAILED tiles can be re-processed by requesting per-tile reprocessing and then starting the processing of the request again.
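
The tiles listing mentioned earlier can be used to find such tiles; a hedged sketch, assuming a paginated response whose data field holds tile objects with id and status:

url = f"https://services.sentinel-hub.com/api/v1/batch/process/{batch_request_id}/tiles"
tiles = oauth.request("GET", url).json()["data"]
failed_tile_ids = [tile["id"] for tile in tiles if tile["status"] == "FAILED"]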

Canceling the request

The user may cancel the request at any time. However:

  • if the status is ANALYSING, the analysis will complete,
  • if the status is PROCESSING, all tiles that have been processed or are being processed at that moment are charged for. The remaining PUs are returned to the user.
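
A sketch of cancelling a request, assuming a cancel action endpoint analogous to analyse and start:

url = f"https://services.sentinel-hub.com/api/v1/batch/process/{batch_request_id}/cancel"
oauth.request("POST", url)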

Automatic deletion of stale data

Stale requests will be deleted after some time. Specifically, the following requests will be deleted:

  • failed requests (request status FAILED),
  • requests that were created but never started (request statuses CREATED, ANALYSIS_DONE),
  • successful requests (request statuses DONE and PARTIAL) for which adding the results to your collections was not requested (new feature - coming soon). Note that only the requests themselves will be deleted, while the requests' results (the created imagery) will remain under your control in your S3 bucket.

Tiling grids

For more efficient processing, we divide the area of interest into tiles and process each tile separately. While the Process API uses the grids that come with each data source, the batch API uses one of the predefined tiling grids. The predefined tiling grids are based on the Sentinel-2 tiling in UTM/WGS84 projection, with some adjustments:

  • The width and height of tiles in the original Sentinel-2 grid are 100 km, while the widths and heights of tiles in our grids are given in the table below.
  • All redundant tiles (i.e. fully overlapped tiles) are removed.

All available tiling grids can be requested with (NOTE: to run this example you first need to create an OAuth client, as explained here):

url = "https://services.sentinel-hub.com/api/v1/batch/tilinggrids/"
response = oauth.request("GET", url)
response.json()

This will return the list of available grids and information about tile size and available resolutions for each grid. Currently available grids are:

name                 | id | tile width | tile height | resolutions                | download the grid [zip with shp file]
---------------------|----|------------|-------------|----------------------------|--------------------------------------
s2gm grid            | 0  | 20040 m    | 20040 m     | 10 m, 20 m, 60 m           | s2gm grid
10km grid            | 1  | 10000 m    | 10000 m     | 10 m, 20 m                 | 10km grid
100.08km grid        | 2  | 100080 m   | 100080 m    | 60 m, 120 m, 240 m, 360 m  | 100.08km grid
WGS84 1 degree grid  | 3  | 1°         | 1°          | 0.0001°, 0.0002°           | WGS84 1 degree grid

To use the s2gm grid with 60 m resolution, for example, specify the id and resolution parameters of the tilingGrid object when creating a new batch request (see an example of a full request) as:

{
  ...
  "tilingGrid": {
    "id": 0,
    "resolution": 60.0
  },
  ...
}

Contact us if you would like to use any other grid for processing.

Processing results

The outputs of a batch processing will be stored in your object storage. By default, the results are organized in sub-folders, with one sub-folder created for each tile. Each sub-folder might contain one or more images, depending on how many outputs were defined in the evalscript of the request. For example: [image: example of batch sub-folders]

You can also customize the sub-folder structure as described under the output parameter in the Batch API reference.
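
As an illustration only (the exact field names and placeholder tokens are those of the Batch API reference), such a customization could take roughly this shape, with a path template controlling where each tile and output are written:

{
  ...
  "output": {
    "defaultTilePath": "s3://<bucket_name>/<requestId>/<tileName>/<outputId>.tif"
  },
  ...
}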

Currently supported output image formats are png, jpeg, and GeoTIFF.

The results of batch processing will be in UTM/WGS84 projection. Each part of the area of interest (AOI) is delivered in the UTM zone with which it intersects. In other words, if your AOI intersects with multiple UTM zones, the results will be delivered as tiles in different UTM zones (and thus different CRSs).

AWS S3 bucket settings

Bucket region

The bucket to which the results will be delivered needs to be in the eu-central-1 (Frankfurt) region.

Bucket settings

Sentinel Hub needs full access to the bucket. To grant it, update your bucket policy to include the following statement (don't forget to replace <bucket_name> with your actual bucket name):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Sentinel Hub permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::614251495211:root"
      },
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>",
        "arn:aws:s3:::<bucket_name>/*"
      ]
    }
  ]
}

Examples

Example of Batch Processing Workflow