4.2. Deploying Presto

Presto is deployed as an application on an Azure HDInsight Hadoop cluster. This section documents deploying the HDInsight cluster with Presto using the Azure Portal. Microsoft provides general instructions for deploying an HDInsight cluster, which you should also familiarize yourself with beforehand. Creating the HDInsight cluster takes between 20 and 30 minutes.

1. Select HDInsight

In the Azure portal, select Create a resource > Analytics > HDInsight.


2. Choose Custom Setup

Select Custom (size, settings, apps).


3. Select HDInsight Basics

In the Basics tab, enter the values as described below.

Cluster Name: Enter a name for your Presto cluster.
Subscription: Select your Azure subscription.
Cluster Type: Choose “Hadoop” version 2.7.3. Do not select Enterprise Security Package, as Presto does not currently support it.
Cluster login username: Enter the cluster login username. The default username is admin.
Cluster login password: Enter the cluster login password.
SSH username: Enter the SSH username. The default username is sshuser.
Use same password as cluster login: Select this checkbox to use the same password as the one used for the cluster login user.
Resource Group: Create a resource group or select an existing one.
Location: Select the Azure location in which to create the cluster.

Once you have entered the values in the Basics tab, click Next.
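The Basics values above can be written down and checked before you open the portal. The following is a minimal sketch in Python, not an Azure API: the ClusterBasics class, the check_basics function, and all sample values are hypothetical names introduced here for illustration. Only the constraints (Hadoop 2.7.3, no Enterprise Security Package) and the default usernames come from this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterBasics:
    """Values entered on the Basics tab (hypothetical container, not an Azure API)."""
    cluster_name: str
    subscription: str
    resource_group: str
    location: str
    cluster_type: str = "Hadoop"
    cluster_version: str = "2.7.3"
    enterprise_security_package: bool = False
    login_username: str = "admin"    # default cluster login username
    ssh_username: str = "sshuser"    # default SSH username


def check_basics(basics: ClusterBasics) -> List[str]:
    """Return a list of problems; an empty list means the values match this guide."""
    problems = []
    if not basics.cluster_name:
        problems.append("Cluster Name is empty")
    if basics.cluster_type != "Hadoop" or basics.cluster_version != "2.7.3":
        problems.append("Cluster Type must be Hadoop 2.7.3")
    if basics.enterprise_security_package:
        problems.append("Enterprise Security Package is not supported by Presto")
    return problems


# Placeholder values for illustration only.
basics = ClusterBasics(
    cluster_name="presto-demo",
    subscription="my-subscription",
    resource_group="presto-rg",
    location="eastus",
)
print(check_basics(basics))
```

An empty list means the planned values are consistent with the requirements listed in this step.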

4. Select Storage

In the Storage tab, enter the Storage Account Settings.

Primary Storage Type: Select Azure Storage.
Selection Method: Keep the default selection of “My Subscriptions” unless you need to access data from another subscription.
Select a Storage Account: Select an existing Storage Account or create a new one.
Default Container: Specify an existing Container or create a new one for the HDInsight cluster to use.
Additional Storage Accounts: Skip this option.
Data Lake Store Access: Skip this option.

5. Configuring the Metastore

Still in the Storage tab, you can configure a custom Metastore. This step is optional and assumes you already have a Metastore configured. Microsoft provides documentation for configuring a custom Metastore for HDInsight. The primary advantage is that the metadata persists even after a cluster is deleted and recreated.

Select a SQL database for Hive: Choose the external database that will be used.
Authenticate SQL Database: Enter the SQL Database username and password to authenticate.

Once you have entered the values in the Storage tab, click Next.

6. Select Starburst Presto Application

From the list of available applications, select Starburst Presto on Azure HDInsight. Click “Review Legal Terms” and then click Create to agree to those terms. Click OK and then click Next to complete the selection of Starburst Presto.


7. Configure Cluster Size

Starburst Presto is installed on a Hadoop cluster. The Presto Coordinator is installed on one of the two HDInsight Head Nodes, and the Presto Workers are installed on a variable number of HDInsight Worker Nodes.

The Presto command line interface (CLI) and Apache Superset are installed on an HDInsight Edge Node.

Number of Worker Nodes: Enter the number of Worker nodes for the cluster. This number usually depends on your workload size, and the cluster may need to be resized later.
Worker and Head node size: Select the node sizes. We recommend choosing the same size type for both. See Recommended Node sizes for more information.
Starburst Presto on Azure HDInsight node size: Keep the default Edge Node size selected for the Presto Application. This node does not run Presto itself, so it does not need to be the same size as the Head and Worker nodes.

Once you’ve made your selection, click Next.
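The placement of Presto components described in this step can be summarized in a short sketch. This is an illustration only: the presto_layout function and the node names (head-0, worker-0, edge-0) are hypothetical, and placing the Coordinator on the first head node is an arbitrary choice, since the text only says it runs on one of the two. The requirement of at least one worker is also an assumption of this sketch.

```python
from typing import Dict, List


def presto_layout(worker_count: int) -> Dict[str, List[str]]:
    """Map hypothetical node names to the Presto components installed on them."""
    if worker_count < 1:
        # Assumption of this sketch: a cluster needs at least one worker.
        raise ValueError("at least one worker node is required")
    layout = {
        "head-0": ["Presto Coordinator"],            # one of the two Head Nodes
        "head-1": [],                                # the other Head Node
        "edge-0": ["Presto CLI", "Apache Superset"],  # Edge Node, no Presto server
    }
    for i in range(worker_count):
        layout[f"worker-{i}"] = ["Presto Worker"]
    return layout


for node, components in presto_layout(3).items():
    print(node, components)
```

Changing the number of Worker Nodes changes only the worker entries; the Head and Edge Node roles stay the same.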

8. Advanced Settings

Starburst provides Script Actions for modifying the Presto cluster. However, these are applied after cluster creation, so skip this step for now. See Script Actions for more information.

9. Virtual Network

This step is optional. Microsoft provides documentation about extending the cluster with a virtual network.

Click Next.

10. Summary and Create

The Summary tab describes everything you entered in the previous steps. Review it and make sure that Starburst Presto is listed as an application. Once you are satisfied with the settings, click “Create”. Creating the HDInsight cluster takes between 20 and 30 minutes.
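If you script around the deployment, the 20 to 30 minute creation time suggests polling with a generous timeout rather than waiting indefinitely. The following is a minimal sketch under stated assumptions: wait_for_cluster is a hypothetical helper, get_state stands in for whatever mechanism you use to read the cluster status, and the state names ("Accepted", "InProgress", "Running", "Error") are placeholders that may not match what your tooling reports.

```python
import time
from typing import Callable


def wait_for_cluster(
    get_state: Callable[[], str],
    timeout_s: float = 45 * 60,      # margin above the expected 20 to 30 minutes
    poll_interval_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state until a terminal state is seen or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in ("Running", "Error"):   # assumed terminal state names
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still in state {state!r} after {timeout_s} s")
        sleep(poll_interval_s)


# Usage with a fake status source, so the sketch runs without any Azure access.
states = iter(["Accepted", "InProgress", "Running"])
print(wait_for_cluster(lambda: next(states), sleep=lambda _: None))
```

Passing sleep as a parameter keeps the example runnable instantly; in real use, leave the default so the loop waits between polls.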