5.3. Configuring Presto
Presto has an extensive set of configuration switches that allow it to be tuned for certain specific requirements. Default values are chosen for the best “out of the box” experience. However, if you need to fine-tune Presto behavior, you can do so when using Starburst’s Script Actions for Azure HDInsight.
The following configuration changes are applied automatically for you:
- Java heap maximum memory (-Xmx) is set appropriately for the selected Azure node size
- JVM’s JIT caches are set to 512 MiB
- Java is configured to use G1 garbage collector, this is the recommended garbage collector to use when running Presto
When using Starburst Presto on Azure HDInsight, you remain in full control of the Presto cluster being run. Using the Configure Presto Script Action, you can deploy a custom configuration. Before submitting the Script Action, you need to create a configuration package.
Creating the Custom Configuration Package
The configuration package is a ZIP file with the structure shown below. All files are optional except for top-level etc/ directory entry.
etc/ config.properties jvm.config catalog/ hive.properties <catalog-name>.properties
etc/config.properties This file is optional. Please refer to Properties Reference for documentation of properties that can be set here.
etc/jvm.config This file is optional. Please refer to Oracle’s documentation of options that can be set here. Please refer to Tuning Presto for information about JVM options that are often useful when troubleshooting performance issues. Certain options, including -Xmx and garbage collection algorithm selection are set by default.
etc/catalog/<catalog-name>.properties This file is optional. When such a file is placed in the configuration package, a catalog <catalog-name> will be created. The file must contain the following:
<connector-name> is the name of the connector you are going to use, please refer to the Connector documentation for a list of supported connectors and their documentation. If the chosen connector has some mandatory configuration parameters, they must be set in the
<catalog-name>.properties file. There can be more than one such file in the
etc/catalog/ folder of the configuration package. This allows you to define multiple catalogs.
hive.properties files will alway be created for you. However, you can still create one in the configuration package to override properties or add additional ones.
Please refer to Auxiliary Files for instructions on how to configure properties that refer to additional files.
If a configuration property in any of the configuration files accepts a path to an additional file (e.g., Hive’s
security.config-file), add the file in the configuration package and refer to it using a path that is relative, starting with the configuration package top-level directory.
For example, if you are configuring Hive connector to use
hive.security=file, you also need to set
security.config-file (see Hive Security documentation for the meaning and structure of the file). To do so, add
etc/catalog/hive-security.json in the configuration package and refer to
etc/catalog/hive.properties using a relative path:
... hive.security=file security.config-file=etc/catalog/hive-security.json
Example Custom Configuration Package
You can download an example configuration package from Starburst here:
Deploying the Custom Configuration Package
The custom configuration package is deployed via Script Action. Refer to the Script Actions section for information on how to submit a Script Action. For this Script Actions enter the following:
|Script Name||Name the script e.g. “Custom Presto Configuration”|
|Bash script URI.||
For Presto release 0.208-e the
|Node Types||Check Head and Worker|
|Parameters||-p <path_to_custom_configuration.zip> e.g -p https://starburstdata.blob.core.windows.net/presto/test/presto-override.zip|
|Persist Script Action||This is optional. But you should consider checking it if you plan to expand the cluster. When you expand, this Script Action will be run automatically if the box is checked.|
Note that the path to the custom configuration package must be in a publicly accessible location.
Interactions Between Default and Custom Configurations
It is important to note that the “out of the box” default values are overridden only for the keys where an advanced configuration entry exits. If no advanced configuration is entered, the default value will remain. However, in the case of jvm.config , additional configuration entries are appended to the default configuration.