4.3. Configuring Presto

Presto has an extensive set of configuration switches that allow it to be tuned for specific requirements. The default values are chosen to give the best “out of the box” experience. If you need to fine-tune Presto’s behavior, you can do so using Starburst’s Script Actions for Azure HDInsight.

Default Configuration

The following configuration changes are applied automatically for you:

  • Java heap maximum memory (-Xmx) is set appropriately for the selected Azure node size
  • JVM’s JIT caches are set to 512 MiB
  • Java is configured to use the G1 garbage collector, which is the recommended garbage collector for running Presto

Custom Configuration

When using Starburst Presto on Azure HDInsight, you remain in full control of the Presto cluster being run. Using the Configure Presto Script Action, you can deploy a custom configuration. Before submitting the Script Action, you need to create a configuration package.

Creating the Custom Configuration Package

The configuration package is a ZIP file with the structure shown below. All files are optional except for the top-level etc/ directory entry.

etc/
        config.properties
        jvm.config
        catalog/
                hive.properties
                <catalog-name>.properties
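
As a minimal sketch, a package with the layout above could be assembled with Python's standard zipfile module. The helper name build_config_package, the catalog name example, and the property values are illustrative assumptions and are not part of the Starburst tooling.

```python
import io
import zipfile
from typing import Dict


def build_config_package(files: Dict[str, str]) -> bytes:
    """Build a configuration package ZIP in memory and return its bytes.

    `files` maps archive paths (all under etc/) to file contents.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # The top-level etc/ directory entry is the only mandatory member.
        package.writestr("etc/", "")
        for path, content in sorted(files.items()):
            if not path.startswith("etc/"):
                raise ValueError(f"{path!r} is outside the etc/ directory")
            package.writestr(path, content)
    return buffer.getvalue()


# Illustrative contents only; the property values are placeholders.
package_bytes = build_config_package({
    "etc/config.properties": "query.max-memory=50GB\n",
    "etc/catalog/example.properties": "connector.name=tpch\n",
})

# List the archive members to confirm the layout.
with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
    print(package.namelist())
```

Writing the etc/ entry explicitly keeps the package valid even when no optional files are supplied.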

etc/config.properties This file is optional. Please refer to Properties Reference for documentation of properties that can be set here.

etc/jvm.config This file is optional. Refer to Oracle’s documentation for the options that can be set here, and to Tuning Presto for information about JVM options that are often useful when troubleshooting performance issues. Certain options, including -Xmx and the garbage collection algorithm, are set by default.

etc/catalog/<catalog-name>.properties This file is optional. When such a file is placed in the configuration package, a catalog <catalog-name> will be created. The file must contain the following:

connector.name=<connector-name>

Here, <connector-name> is the name of the connector you are going to use. Refer to the Connector documentation for a list of supported connectors and their documentation. If the chosen connector has mandatory configuration parameters, they must be set in the <catalog-name>.properties file. The etc/catalog/ folder of the configuration package can contain more than one such file, which allows you to define multiple catalogs.
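
Before uploading a package, it can help to check that every catalog file sets connector.name. The following is a rough sketch, not part of the Starburst tooling: the function names are hypothetical, the parser handles only simple key=value lines, and it assumes that a hive.properties override may omit connector.name because that file is generated by default.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def catalog_connectors(package_bytes: bytes) -> Dict[str, str]:
    """Map each catalog name in the package to its connector.name value."""
    connectors = {}
    catalog_dir = PurePosixPath("etc/catalog")
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            path = PurePosixPath(name)
            if path.parent != catalog_dir or path.suffix != ".properties":
                continue
            properties = parse_properties(package.read(name).decode("utf-8"))
            connector = properties.get("connector.name")
            if not connector:
                # Assumption: a hive.properties override may omit connector.name.
                if path.stem == "hive":
                    continue
                raise ValueError(f"{name} does not set connector.name")
            connectors[path.stem] = connector
    return connectors


# Usage with an in-memory package; "tpch" is only an illustrative connector name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("etc/", "")
    package.writestr("etc/catalog/example.properties", "connector.name=tpch\n")
print(catalog_connectors(buffer.getvalue()))
```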

A hive.properties file will always be created for you. However, you can still include one in the configuration package to override properties or add new ones.

Please refer to Auxiliary Files for instructions on how to configure properties that refer to additional files.

Auxiliary Files

If a configuration property in any of the configuration files accepts a path to an additional file (e.g., Hive’s security.config-file), add the file to the configuration package and refer to it using a path relative to the top-level directory of the configuration package.

For example, if you are configuring the Hive connector to use hive.security=file, you also need to set security.config-file (see the Hive Security documentation for the meaning and structure of the file). To do so, add etc/catalog/hive-security.json to the configuration package and refer to it in etc/catalog/hive.properties using a relative path:

...
hive.security=file
security.config-file=etc/catalog/hive-security.json
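
A quick consistency check can catch a property that points at a file missing from the package. The sketch below is illustrative only: missing_auxiliary_files and PATH_PROPERTIES are hypothetical names, and the list of path-valued properties contains just security.config-file, the one named in this section.

```python
from typing import Dict, List, Sequence

# Properties whose value is a path into the package. Only security.config-file
# is named in this section; extend the tuple for other path-valued properties.
PATH_PROPERTIES = ("security.config-file",)


def missing_auxiliary_files(
    files: Dict[str, str],
    path_properties: Sequence[str] = PATH_PROPERTIES,
) -> List[str]:
    """Return one message per referenced path that is absent from `files`.

    `files` maps package-relative paths (e.g. etc/catalog/hive.properties)
    to file contents.
    """
    problems = []
    for name, text in sorted(files.items()):
        if not name.endswith(".properties"):
            continue
        for raw_line in text.splitlines():
            key, separator, value = raw_line.strip().partition("=")
            key, value = key.strip(), value.strip()
            if separator and key in path_properties and value not in files:
                problems.append(f"{name}: {key}={value} is not in the package")
    return problems


# Package contents mirroring the Hive security example above.
package_files = {
    "etc/catalog/hive.properties": (
        "hive.security=file\n"
        "security.config-file=etc/catalog/hive-security.json\n"
    ),
    "etc/catalog/hive-security.json": "{}",
}
print(missing_auxiliary_files(package_files))
```

An empty result means every referenced file is present at the relative path used in the properties file.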

Example Custom Configuration Package

You can download an example configuration package from Starburst here:

https://starburstdata.blob.core.windows.net/presto/test/presto-override.zip

Deploying the Custom Configuration Package

The custom configuration package is deployed via a Script Action. Refer to the Script Actions section for information on how to submit a Script Action. For this Script Action, enter the following:

Script Type             Custom
Script Name             A name for the script, e.g. “Custom Presto Configuration”
Bash script URI         https://starburstdata.blob.core.windows.net/presto/update-presto-config.sh
Node Types              Check Head and Worker
Parameters              -p <path_to_custom_configuration.zip>, e.g. -p https://starburstdata.blob.core.windows.net/presto/test/presto-override.zip
Persist Script Action   Optional. Consider checking this box if you plan to expand the cluster: when it is checked, this Script Action runs automatically when the cluster is expanded.

Note that the custom configuration package must be stored in a publicly accessible location.

[Image: configuration_script_action.png]

Interactions Between Default and Custom Configurations

Note that the “out of the box” default values are overridden only for the keys that have an entry in the custom configuration. If no custom entry is provided for a key, its default value remains in effect. In the case of jvm.config, however, the custom entries are appended to the default configuration.
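
The two merge behaviors can be modeled roughly as follows. This is a sketch of the semantics described above, not the actual deployment script; the function names and all property and option values are illustrative assumptions.

```python
from typing import Dict, List


def merge_properties(defaults: Dict[str, str], custom: Dict[str, str]) -> Dict[str, str]:
    """Custom values replace defaults key by key; other defaults are kept."""
    merged = dict(defaults)
    merged.update(custom)
    return merged


def merge_jvm_config(default_lines: List[str], custom_lines: List[str]) -> List[str]:
    """Custom jvm.config entries are appended after the default entries."""
    return list(default_lines) + list(custom_lines)


# Illustrative values only; the real defaults depend on the Azure node size.
default_properties = {"query.max-memory": "20GB", "http-server.http.port": "8080"}
custom_properties = {"query.max-memory": "50GB"}
print(merge_properties(default_properties, custom_properties))

default_jvm = ["-Xmx16G", "-XX:+UseG1GC"]
custom_jvm = ["-XX:+HeapDumpOnOutOfMemoryError"]
print(merge_jvm_config(default_jvm, custom_jvm))
```

In this model, only query.max-memory changes in the properties, while the JVM option list grows by one entry and keeps every default.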