4.6. Configuring Hive Metastore#
There are several ways to configure a Hive Metastore for a catalog that uses the Hive connector.
None#
Setting MetastoreType to None (the default) means that no Hive Metastore is configured.
Standalone (ephemeral)#
Setting MetastoreType to Standalone (ephemeral) makes the CFT create a separate EC2 instance that contains both the Hive Metastore and its underlying RDBMS.
Note that information stored in such a Metastore lives only as long as the Presto cluster. For that reason, avoid this configuration on production systems. It is, however, the best option for experimenting with Presto and the Hive connector.
AWS Glue Data Catalog#
Setting MetastoreType to AWS Glue Data Catalog makes the Hive catalog use the AWS Glue Data Catalog as its Metastore service.
External MySQL RDBMS#
Setting MetastoreType to External MySQL RDBMS makes the CFT create a separate EC2 instance that runs a Hive Metastore service backed by an external MySQL RDBMS. This new instance needs network access to the external MySQL system, so make sure to set up your networking and security groups appropriately. You can use a MySQL instance that you manage yourself. However, we recommend using AWS RDS.
This configuration requires the following properties to be set:
ExternalMetastoreHost: the host address where the MySQL service is running.
ExternalMetastorePort: the port number of the MySQL service. If 0 is set, then 3306 (the default MySQL port) is used.
ExternalRdbmsMetastoreUserName: the MySQL user name.
ExternalRdbmsMetastorePassword: the MySQL user password.
ExternalRdbmsMetastoreDatabaseName: the name of the MySQL database that is used for storing Hive Metastore data.
The RDBMS does not require any schema initialization other than database creation. This configuration is well suited to MySQL provisioned with the AWS RDS service.
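The properties above can be collected into a stack parameter list before launching the CFT. The following is a minimal sketch, not the product's own tooling: the helper name `build_metastore_parameters` and all sample values are hypothetical, and it assumes that the template parameter names match the property names listed above and that the MetastoreType value is the literal string shown in this section. The `ParameterKey`/`ParameterValue` shape is the one the CloudFormation API uses for stack parameters.

```python
def build_metastore_parameters(host, port, user, password, database):
    """Return CloudFormation-style stack parameters for an external MySQL metastore.

    Assumption: parameter names and the MetastoreType value are taken
    verbatim from this section; verify them against your template.
    """
    values = {
        "MetastoreType": "External MySQL RDBMS",
        "ExternalMetastoreHost": host,
        # CloudFormation parameter values are passed as strings.
        # A port of 0 asks for the default MySQL port (3306).
        "ExternalMetastorePort": str(port),
        "ExternalRdbmsMetastoreUserName": user,
        "ExternalRdbmsMetastorePassword": password,
        "ExternalRdbmsMetastoreDatabaseName": database,
    }
    # Fail early if a required property was left empty.
    missing = [key for key, value in values.items() if value == ""]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]
```

The returned list can be passed as the `Parameters` argument of a stack creation call. In practice the password would come from a secret store and would not be hard-coded.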
External PostgreSQL RDBMS#
Setting MetastoreType to External PostgreSQL RDBMS makes the CFT create a separate EC2 instance that runs a Hive Metastore service backed by an external PostgreSQL RDBMS. This new instance needs network access to the external PostgreSQL system, so make sure to set up your networking and security groups appropriately. You can use a PostgreSQL instance that you manage yourself. However, we recommend using AWS RDS.
This configuration requires the following properties to be set:
ExternalMetastoreHost: the host address where the PostgreSQL service is running.
ExternalMetastorePort: the port number of the PostgreSQL service. If 0 is set, then 5432 (the default PostgreSQL port) is used.
ExternalRdbmsMetastoreUserName: the PostgreSQL user name.
ExternalRdbmsMetastorePassword: the PostgreSQL user password.
ExternalRdbmsMetastoreDatabaseName: the name of the PostgreSQL database that is used for storing Hive Metastore data.
The RDBMS does not require any schema initialization other than database creation. This configuration is well suited to PostgreSQL provisioned with the AWS RDS service.
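Because the metastore instance needs network access to the external database, it can help to confirm that the database port accepts TCP connections before creating the stack. The following is a rough sketch using only the Python standard library; the function name `can_reach` is hypothetical. The result is only meaningful when the check runs from a host that is subject to the same networking and security group rules as the metastore instance, and a successful TCP connection does not verify credentials or database existence.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `can_reach("postgres.example.internal", 5432)` would test the default PostgreSQL port on a hypothetical host name.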
External Hive Metastore Service#
Setting MetastoreType to External Hive Metastore Service makes the Hive connector use an existing Hive Metastore Service.
This configuration requires the following properties to be set:
ExternalMetastoreHost: the host address where the external Hive Metastore Service is running.
ExternalMetastorePort: the port number of the external Hive Metastore Service. If 0 is set, then 9083 (the default Hive Metastore Service port) is used.
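The rule that a port of 0 selects the default applies to all three external options, each with its own default. The sketch below restates that rule in Python for illustration only. The names `DEFAULT_PORTS` and `effective_port` are hypothetical, and the dictionary keys assume the MetastoreType values are the literal strings used in this section.

```python
# Default ports as stated in this section, keyed by MetastoreType.
DEFAULT_PORTS = {
    "External MySQL RDBMS": 3306,
    "External PostgreSQL RDBMS": 5432,
    "External Hive Metastore Service": 9083,
}


def effective_port(metastore_type, configured_port):
    """Return the port that is actually used for the given ExternalMetastorePort."""
    if configured_port == 0:
        # 0 means "use the default port for this metastore type".
        return DEFAULT_PORTS[metastore_type]
    return configured_port
```

For instance, `effective_port("External Hive Metastore Service", 0)` yields 9083, while any non-zero value is returned unchanged.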