5.4. Deploying Presto

Presto on AWS Marketplace is available as both an Amazon Machine Image (AMI) and a CloudFormation template. Launching the AMI provides a fully functional single-node Presto setup, suitable for a trial deployment of Presto in your development environment. The Starburst CloudFormation template is intended for production deployments: it configures multiple Presto AMIs to form a Presto cluster consisting of a Presto Coordinator and a configurable number of Presto Workers.

Deploying the Presto AMI Using AWS EC2 Console

1. Launching the AMI

After subscribing to the software, choose ‘Launch through EC2’ to launch through the AWS EC2 Console.

../_images/presto_ami_launch.png

This directs you to the ‘Choose an Instance Type’ step. Optionally, you can choose ‘Copy to Service Catalog’ if you use the AWS Service Catalog.

2. Choose an Instance Type
Choose an instance type that best suits your workload. The r4.4xlarge instance type is recommended by default and works well for most workloads. See Recommended Instance Types for help choosing an instance type. Note that a single-node Presto instance is typically used for trying Presto in a development environment.
3. Configure Instance Details

Configure your instance to fit your needs. Choose the existing VPC and Subnet you want to deploy to, and optionally choose an IAM Role. Refer to the Prerequisites section for more information on these fields.

../_images/presto_ami_configure.png
4. Add Storage
Manage your instance’s storage and add supplementary EBS and instance store volumes as needed. The defaults are generally OK.
5. Add Tags
Create and add one or more tags for your instance. Refer to the AWS documentation for more information on tags.
6. Configure Security Group

Create or select an existing security group to control traffic to your instance. Note that you are able to choose multiple security group IDs when selecting from the pool of existing security groups. For additional information regarding security groups, please refer to the Prerequisites section.

It’s recommended that ports 8080 and 8088 are accessible in order to access the Presto Web UI, submit queries from outside the cluster, and access Apache Superset. Additionally, it’s recommended that port 22 is accessible for SSH access.
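As a rough sketch of the port recommendation above, the same ingress rules could be added to an existing security group with the AWS CLI. The security group ID and the CIDR range are placeholder values (assumptions), and the `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set, so the commands can be reviewed first.

```shell
# Sketch: open the recommended ports on an existing security group.
# SG_ID and CIDR are placeholders (assumptions); replace them with your own.
SG_ID="${SG_ID:-sg-12e34aeb}"
CIDR="${CIDR:-203.0.113.0/24}"   # a trusted range, not 0.0.0.0/0

# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 8080 and 8088: Presto Web UI, external queries, Apache Superset; 22: SSH.
open_presto_ports() {
  for port in 8080 8088 22; do
    run aws ec2 authorize-security-group-ingress \
      --group-id "$SG_ID" --protocol tcp --port "$port" --cidr "$CIDR"
  done
}

open_presto_ports
```

Restricting the CIDR to a known range keeps the Presto endpoints from being reachable from the public internet.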

7. Review
Review the details of your instance. When you are satisfied, click ‘Launch’ to assign a key pair to the instance and complete the launch process.
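For readers who prefer to script the single-node launch, the console steps above map roughly onto one `aws ec2 run-instances` call. This is only a sketch: the AMI ID, subnet, security group, and key pair name are all placeholders (assumptions), and the real AMI ID comes from your AWS Marketplace subscription. The hypothetical `run` helper prints the command unless `DRY_RUN=0` is set.

```shell
# Sketch: CLI counterpart of the console launch steps.
# Every ID below is a placeholder (assumption), including the AMI ID.
AMI_ID="${AMI_ID:-ami-0123456789abcdef0}"
INSTANCE_TYPE="${INSTANCE_TYPE:-r4.4xlarge}"   # default recommended above
SUBNET_ID="${SUBNET_ID:-subnet-123abc2b}"
SG_ID="${SG_ID:-sg-12e34aeb}"
KEY_NAME="${KEY_NAME:-john.smith}"

# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

launch_presto_ami() {
  run aws ec2 run-instances \
    --image-id "$AMI_ID" \
    --instance-type "$INSTANCE_TYPE" \
    --subnet-id "$SUBNET_ID" \
    --security-group-ids "$SG_ID" \
    --key-name "$KEY_NAME" \
    --count 1
}

launch_presto_ami
```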

Deploying a Presto Cluster Using CloudFormation Template (Web Console)

1. Select Template

After you subscribe to the Presto offering on AWS Marketplace, you’ll be able to launch CloudFormation.

../_images/presto_cft_launch.png

This will direct you to the ‘Select Template’ step for creating a CloudFormation stack. You should find a pre-populated field under ‘Amazon S3 template URL’. This is the location of Starburst’s Presto CloudFormation template. Click ‘next’.

Optionally, you can choose ‘Copy to Service Catalog’ if you use the AWS Service Catalog.

../_images/presto_cft_select.png
2. Specify Details

Proceed by specifying the details of your Presto cluster (CloudFormation stack). This step includes network, EC2 and Presto configurations.

  • Preliminary Details: Name your Presto cluster.

    ../_images/presto_cft_stackname.png
  • Network Configuration: Specify your existing VPC, Subnet, and Security Groups. These are assumed to be preconfigured in your AWS account. See the Prerequisites section for more detail.

    ../_images/presto_cft_network.png
  • EC2 Configuration:

    Choose a CoordinatorInstanceType and WorkerInstanceType suitable for your workload. The r4.4xlarge instance types are chosen by default and work well for most workloads. See Recommended Instance Types for help choosing instance types.

    Choose a KeyName, which is the name of an EC2 KeyPair that enables SSH access to the instance. See SSH Keys for more detail.

    Set WorkersCount to the number of Presto worker nodes for the cluster. Presto worker nodes are added to an AWS AutoScaling Group. See Auto Scaling a Running Presto Cluster for more detail.

    Optionally specify the InstanceProfile to attach to Presto nodes. See Instance Profiles for more detail.

    ../_images/presto_cft_ec2.png
  • Presto Configuration: Indicate whether to allow external access over HTTP. This is disabled by default for security purposes. When disabled, clients (CLI, JDBC, etc.) can connect to Presto only from machines in the AWS Subnet specified above. When enabled and instances are assigned a public IP, Presto endpoints are available over HTTP without authorization (unless, e.g., LDAP is configured via “Additional Configuration”).

    Note

    It’s important to make sure your AWS infrastructure is set up in such a way that Presto is not publicly accessible.

    If you want to use the Hive connector to query data in HDFS or S3, you can include HiveMetastoreURI and the CloudFormation template will automatically configure the Hive connector for you.

    When the AWS Glue catalog option is enabled, an additional glue catalog is added to the list of catalogs within Presto. This is independent of the Hive connector with Hive Metastore, and both can be used on a single stack. Refer to AWS Glue Support for more details.

    Lastly, you can optionally specify an AdditionalConfigurationURI. This is the location of your custom Presto configuration, if you wish to override the default configuration. Typically this is used for advanced setups. See the section on Additional Configuration for more detail.

    ../_images/presto_cft_configuration.png
  • Other: Indicate whether you will be launching Apache Superset.

    ../_images/presto_cft_superset.png
3. Options
Enter any additional stack specifications as shown on the Options page. These options include adding tags to resources within your cluster, choosing IAM roles, and specifying the monitoring time for rollback triggers, among other advanced settings. For more information on these options, follow the link to the AWS CloudFormation documentation.
4. Review
Finally, review the details of your Presto cluster. When you are satisfied, click ‘Create’ to complete the creation.

Deploying a Presto Cluster Using CloudFormation Template (AWS CLI)

After subscribing to the software, you can optionally launch the Presto cluster using the AWS CLI instead of the AWS Web Console. This is useful for those who want to script control of the Presto cluster, or who are simply more comfortable at the command line.

1. Open a Terminal Window
Open a terminal window to access the command line.
2. Create Stack
Use the create-stack command to initiate the creation of your Presto cluster (CloudFormation stack).
create-stack
3. Name Stack
Specify the name that is to be associated with the cluster. The name must be unique in the region in which you are creating the cluster.
--stack-name (string)
4. Specify Template

Indicate the template you will be using to create your cluster.

URL: Specify the location of the file containing the template body. The URL must point to a template that is located in an Amazon S3 bucket.

--template-url (string)
5. Specify Parameters
Define a list of parameter structures that specify input parameters for the cluster. Refer to the following table of possible parameters for your cluster creation.

| Parameter Key | Description | Example Parameter Values |
|---|---|---|
| VPC | Virtual Private Cloud ID | vpc-4bd6ca11 |
| Subnet | Subnet to use for Presto nodes (must belong to the selected VPC) | subnet-123abc2b |
| SecurityGroups | Additional Security Groups for Presto nodes (e.g. allowing SSH access). Must select at least one. | sg-12e34aeb |
| CoordinatorInstanceType | EC2 instance type of the coordinator | r4.xlarge (For a full list see Available Instance Types under Supported Instances and Regions) |
| WorkerInstanceType | EC2 instance type of the workers | r4.xlarge (For a full list see Available Instance Types under Supported Instances and Regions) |
| KeyName | Name of an EC2 KeyPair to enable SSH access to the instance | john.smith |
| IamInstanceProfile | The name of an instance profile to attach to Presto nodes (Optional) | my-ec2-instance-profile |
| WorkersCount | Number of dedicated Presto worker nodes (apart from coordinator) to instantiate | 10 |
| LaunchSuperset | When enabled, Superset will be started on an EC2 instance | yes |
| AllowExternalAccessOnHttp | When enabled and instances are assigned a public IP, Presto endpoints will be available over HTTP without authorization | no |
| HiveMetastoreURI | URI of Hive Metastore (starting with: thrift://) (Optional) | thrift://172.31.6.18:9083 |
| DeployGlueConnector | When enabled, a Hive connector with AWS Glue integration will be added | yes |
| AdditionalConfigurationURI | URI of S3 zip file with additional Presto configuration (Optional) | s3://starburstdata/aws-marketplace/examples/cft/cloudformation-template-configuration/0.1/hive-allow-drop-table.zip |

For a more detailed description of the above parameters, refer to the Prerequisites section. Note that parameter values must be provided on the command line in a special form; refer to the Example below for guidance.
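To illustrate the "special form" mentioned above, here is a minimal sketch of a helper that formats one key and value in the AWS CLI shorthand syntax used by `--parameters`. The helper name `cfn_param` and the second security group ID are hypothetical. The sketch also assumes that SecurityGroups accepts a comma-separated list, in which case commas inside the value have to be escaped with a backslash so the CLI does not treat them as field separators.

```shell
# Sketch: format a CloudFormation parameter in AWS CLI shorthand syntax.
# cfn_param is a hypothetical helper name, not part of the AWS CLI.
cfn_param() {
  # $1 = parameter key, $2 = parameter value
  # Escape commas in the value so list values survive shorthand parsing.
  escaped=$(printf '%s' "$2" | sed 's/,/\\,/g')
  printf 'ParameterKey=%s,ParameterValue=%s\n' "$1" "$escaped"
}

cfn_param VPC vpc-4bd6ca11
# prints: ParameterKey=VPC,ParameterValue=vpc-4bd6ca11

# sg-0a1b2c3d is a made-up second group, used only to show the escaping.
cfn_param SecurityGroups "sg-12e34aeb,sg-0a1b2c3d"
# prints: ParameterKey=SecurityGroups,ParameterValue=sg-12e34aeb\,sg-0a1b2c3d
```

Each printed line can then be passed as one quoted argument after `--parameters`. The AWS CLI also accepts a JSON parameters file via `--parameters file://...`, which avoids shell escaping altogether.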

6. Options

Rollback: Specify --disable-rollback to disable rollback of the cluster if stack creation fails, or --no-disable-rollback to keep rollback enabled.
--disable-rollback
--no-disable-rollback
--rollback-configuration RollbackTriggers=[{Arn=string,Type=string},{Arn=string,Type=string}],MonitoringTimeInMinutes=integer
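As an illustration of the rollback trigger syntax, the sketch below assembles a value for the `--rollback-configuration` option with a single CloudWatch alarm as the trigger. The alarm ARN and the monitoring time are placeholder values (assumptions), and `rollback_config` is a hypothetical helper name.

```shell
# Sketch: build a --rollback-configuration value in AWS CLI shorthand syntax.
# The alarm ARN is a placeholder (assumption), not a real resource.
ALARM_ARN="${ALARM_ARN:-arn:aws:cloudwatch:us-east-1:123456789012:alarm:presto-cluster-alarm}"
MONITORING_MINUTES="${MONITORING_MINUTES:-10}"

rollback_config() {
  printf 'RollbackTriggers=[{Arn=%s,Type=AWS::CloudWatch::Alarm}],MonitoringTimeInMinutes=%s\n' \
    "$ALARM_ARN" "$MONITORING_MINUTES"
}

# Printed here for review. It could be passed to create-stack as:
#   aws cloudformation create-stack ... --rollback-configuration "$(rollback_config)"
rollback_config
```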
7. Review
Finally, review the details of your cluster and your command. When ready, press Enter to run the command and create your Presto cluster.

Example

See the following create-stack command as a reference for your Presto cluster deployment:

aws cloudformation create-stack \
--stack-name "Presto-cluster" \
--template-url "https://s3.amazonaws.com/awsmp-fulfillment-cf-templates-prod/PrestoCFT.template" \
--parameters \
"ParameterKey=VPC,ParameterValue=vpc-4bd6ca11" \
"ParameterKey=Subnet,ParameterValue=subnet-123abc2b" \
"ParameterKey=SecurityGroups,ParameterValue=sg-12e34aeb" \
"ParameterKey=CoordinatorInstanceType,ParameterValue=r4.xlarge" \
"ParameterKey=WorkersInstanceType,ParameterValue=r4.xlarge" \
"ParameterKey=KeyName,ParameterValue=john.smith" \
"ParameterKey=IamInstanceProfile,ParameterValue=my-ec2-instance-profile" \
"ParameterKey=WorkersCount,ParameterValue=2" \
"ParameterKey=LaunchSuperset,ParameterValue=yes" \
"ParameterKey=AllowExternalAccessOnHttp,ParameterValue=no" \
"ParameterKey=HiveMetastoreURI,ParameterValue=thrift://172.31.6.18:9083" \
"ParameterKey=AdditionalConfigurationURI,ParameterValue=s3://my_bucket/presto-additional-configuration.zip"

The above command yields output like the following:

{
"StackId":"arn:aws:cloudformation:us-east-1:123456789012:stack/myteststack/466df9e0-0dff-08e3-8e2f-5088487c4896"
}
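When scripting the deployment, the StackId in this response is usually what the next step needs. A minimal sketch, assuming the response has the shape shown above: it extracts the ID with `sed` and then shows, in comments only, how the ID could be passed to the CloudFormation waiter. `extract_stack_id` is a hypothetical helper name.

```shell
# Sketch: pull the StackId out of a create-stack JSON response.
# extract_stack_id is a hypothetical helper; it reads JSON on stdin.
extract_stack_id() {
  sed -n 's/.*"StackId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response copied from the example output above.
response='{
"StackId":"arn:aws:cloudformation:us-east-1:123456789012:stack/myteststack/466df9e0-0dff-08e3-8e2f-5088487c4896"
}'

stack_id=$(printf '%s\n' "$response" | extract_stack_id)
echo "$stack_id"

# With real credentials, the ID (or the stack name) can then be used to
# block until creation finishes and to check the final status:
#   aws cloudformation wait stack-create-complete --stack-name "$stack_id"
#   aws cloudformation describe-stacks --stack-name "$stack_id" \
#     --query 'Stacks[0].StackStatus' --output text
```

Alternatively, adding `--query StackId --output text` to the create-stack call makes the CLI print the bare ID, so no parsing is needed.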