Performance features

Starburst products are fast, but you can make them even faster with the performance features described in this topic.

Fault-tolerant execution

Fault-tolerant execution (FTE) allows a cluster to retry a query, or parts of its processing, after a failure without restarting the whole query from the beginning. This is especially useful for long-running queries, which are typical of batch processing and extract, transform, load (ETL) workloads.

In fault-tolerant execution mode, intermediate exchange data is spooled and can be reused by another worker. Queries that require more memory than is currently available in the cluster can still succeed. Multiple queries can share resources fairly and make steady progress.
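The retry behavior described above can be illustrated with a minimal sketch. This is a toy model, not the engine's implementation: the names `run_stage`, `make_flaky_task`, and `WorkerFailure` are hypothetical, and a plain dictionary stands in for the spooled exchange data. It shows only the core idea that finished task output is kept, so a failure triggers a retry of the affected task rather than of the whole query.

```python
from typing import Callable, Dict, List


class WorkerFailure(Exception):
    """Hypothetical error: simulates losing a worker while it runs a task."""


def make_flaky_task(values: List[int], failures: int) -> Callable[[], int]:
    """Build a task that fails `failures` times and then returns sum(values)."""
    remaining = {"failures": failures}

    def task() -> int:
        if remaining["failures"] > 0:
            remaining["failures"] -= 1
            raise WorkerFailure("worker lost")
        return sum(values)

    return task


def run_stage(
    tasks: Dict[str, Callable[[], int]],
    spool: Dict[str, int],
    max_attempts: int = 3,
) -> int:
    """Run every task, retrying only the ones that fail.

    Output of finished tasks stays in `spool` (a stand-in for spooled
    exchange data), so completed work is never repeated.
    Returns the total number of task attempts made.
    """
    attempts = 0
    for task_id, task in tasks.items():
        if task_id in spool:
            continue  # output already spooled, reuse it
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            try:
                spool[task_id] = task()
                break
            except WorkerFailure:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
    return attempts
```

In this sketch, a stage with three tasks where one task fails once needs four attempts in total, instead of rerunning all three tasks from scratch.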

In Starburst Galaxy, simply create your cluster with fault-tolerant execution mode selected. In SEP, enable fault-tolerant execution with a configuration property.

Warp speed

With Starburst Warp Speed you can use accelerated clusters to leverage smart indexing and caching. Starburst Warp Speed automatically creates and maintains these indexes and caches based on the characteristics of the processed queries. The index and cache data is stored on local storage attached to each worker node in the cluster. Because the data is available directly in the cluster and no longer must be retrieved from remote object storage, query processing is accelerated when accessing the same data.

When a query accesses data that is not accelerated, the system performs data and index materialization on the cluster to accelerate future access to that data. This process of creating the indexes and caches is also called warmup. Warmup is performed individually by each worker based on the processed splits and uses the local high-performance storage of the worker.

When new data is added to a table, or while index and cache creation is still in progress, the portions of the table that are not yet accelerated are served from object storage. After the asynchronous indexing and caching completes, queries that access the same data are accelerated, because the data is served from the indexes and caches in the cluster rather than retrieved from remote object storage.

This results in immediately improved performance for recently used datasets.
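The warmup flow can be sketched as a small model of a single worker. This is an illustration under stated assumptions, not how Starburst Warp Speed is implemented: the `Worker` class and its methods are hypothetical, dictionaries stand in for object storage and local storage, and an explicit `run_warmup` call stands in for the asynchronous indexing and caching.

```python
from typing import Dict, List, Tuple


class Worker:
    """Hypothetical worker that serves splits and warms them up per split."""

    def __init__(self, object_store: Dict[str, bytes]) -> None:
        self.object_store = object_store          # stand-in for remote object storage
        self.local_cache: Dict[str, bytes] = {}   # stand-in for local worker storage
        self.pending_warmup: List[str] = []       # splits queued for warmup

    def read_split(self, split_id: str) -> Tuple[bytes, str]:
        """Return the split data and where it was served from."""
        if split_id in self.local_cache:
            return self.local_cache[split_id], "local"
        # Not accelerated yet: serve from object storage and queue warmup.
        data = self.object_store[split_id]
        if split_id not in self.pending_warmup:
            self.pending_warmup.append(split_id)
        return data, "remote"

    def run_warmup(self) -> int:
        """Materialize queued splits locally; returns how many were warmed."""
        for split_id in self.pending_warmup:
            self.local_cache[split_id] = self.object_store[split_id]
        warmed = len(self.pending_warmup)
        self.pending_warmup.clear()
        return warmed
```

In this model, the first read of a split is served from object storage, reads after `run_warmup` are served locally, and a split added to the table later is served from object storage until it is warmed up in turn.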

In Starburst Galaxy, simply create your cluster with Warp Speed mode selected. To use Warp Speed with SEP, you must meet the cluster sizing, deployment, and configuration requirements and enable it in each catalog.