5.10. AWS Glue Support
AWS Glue is a supported metadata catalog for Presto. It is intended to be used as a alternative to the Hive Metastore with the Presto Hive plugin to work with your S3 data.
AWS Glue with Starburst Presto AMI
When you deploy a Starburst Presto AMI from the AWS Marketplace, you need to configure the Hive connector to use Glue. The minimal setup is to do the following on all Presto nodes:
/etc/presto/catalogs/glue.propertiesfile with at least the contents below:
connector.name=hive-hadoop2 hive.metastore = glue
restart Presto with:
sudo service presto restart
AWS Glue with Starburst Presto Cloud Formation Template
When using the Starburst Cloud Formation template in AWS, you can leverage Glue by simply choosing
DeployGlueConnector field of the Stack creation form (
Presto Configuration section)
Starburst Presto with AWS Glue usage
When configured as above, the Glue catalog will be available as the
glue connector from within Presto CLI or any other Presto connection.
As usual remember to specify the location of the data on S3. Either for the entire schema or on the table level. For example to create a schema
foo in Glue, with the S3 base directory (root folder for per table subdirectories) pointing to the root of
my-bucket S3 bucket, you would write:
CREATE SCHEMA glue.foo WITH (location = 's3://my-bucket/')
You can also create and edit the schema and tables directly from AWS Glue. In AWS Glue terminology the schema is called “database”.
Starburst Presto with AWS Glue prerequisites
Both the AMI and Cloud Formation approach mentioned above require the Presto instances to have permissions to access both S3 and Glue AWS services.
When using Starburst Presto via our Cloud Formation template by default you do not need to provide anything, the template will create all necessary resources automatically.
If however you need to provide your own IAM Instance Profile for the Presto instances (
IamInstanceProfile field in the Stack creation form) please consult the AWS Security Prerequisites section. Same applies when launching the AMI manually, make sure you choose a IAM Role that satisfies the requirements.
Known limitations of AWS Glue support
There are a couple Presto features that are not yet supported with the Glue catalog:
Presto’s new Cost-Based Optimizer will not work with Glue due to current lack of Column statistics support with Glue.
Renaming tables from within AWS Glue is not supported.
Partition values containing quotes and apostrophes are not supported (for example,
Using Hive authorization is not supported.