10.8. Cloudera Data Platform support

The Starburst Hive connector can be used to query Cloudera Data Platform (CDP) version 7.1.

Note

Cloudera Data Platform support requires a valid Starburst Enterprise Presto license.

Configuration

  • Edit the properties file of a catalog that uses the Hive connector.

  • Set the hive.metastore property to thrift-cdp7.

  • Set the hive.metastore.uri property to the URI of your Hive metastore Thrift service.

connector.name=hive-hadoop2
hive.metastore=thrift-cdp7
hive.metastore.uri=thrift://cdp-master:9083

Hive metastore and statistics

CDP support includes the improved thrift-cdp7 Hive metastore implementation. It supports the metastore Thrift communication protocol that CDP implements for table statistics management.

This allows Presto to handle the following kinds of statistics separately:

  • Column statistics

  • Partition statistics

  • Table statistics

When using CDP, all statistics handling is performed by the Hive connector and the thrift-cdp7 Hive metastore, and is therefore identical to standard Hive connector usage.
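
Because statistics handling matches standard Hive connector usage, the usual Presto statements for collecting and inspecting statistics should apply. The following is a minimal sketch, not a verified CDP procedure. The catalog name cdp, the schema default, and the table orders are hypothetical placeholders.

```sql
-- Assumption: a catalog named "cdp" is configured with hive.metastore=thrift-cdp7,
-- and a table default.orders exists. All three names are placeholders.

-- Collect table and column statistics and store them in the metastore.
ANALYZE cdp.default.orders;

-- Inspect the statistics that the query planner can see for this table.
SHOW STATS FOR cdp.default.orders;
```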

Reading data

CDP support includes read operations on the following types of tables:

  • compacted tables

  • bucketed tables

  • partitioned tables

  • unpartitioned tables

The following file formats can be read:

  • Avro

  • CSV

  • ORC ACID

  • Parquet

  • RCFile
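
As an illustration of a read operation, the following sketch queries a partitioned table. It assumes a hypothetical catalog cdp, a schema default, and a table orders that is partitioned by a ds column. None of these names come from the CDP documentation.

```sql
-- Assumption: cdp.default.orders is a placeholder table partitioned by "ds".
-- Filtering on the partition column lets the connector skip other partitions.
SELECT orderkey, totalprice
FROM cdp.default.orders
WHERE ds = '2020-01-01'
LIMIT 10;
```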

Writing data

Write operations, such as CREATE TABLE AS and CREATE VIEW, are generally supported.

Write operations on ORC ACID tables are not supported.
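
The following sketch shows what the two named write operations might look like. The catalog cdp, the schema default, and all table, view, and column names are hypothetical. The format and partitioned_by table properties are standard Hive connector properties. Parquet is chosen here only as an example of a non-ACID format.

```sql
-- Assumption: cdp.default.orders is a placeholder source table.
-- Create a new Parquet table from a query. Partition columns must come last.
CREATE TABLE cdp.default.orders_parquet
WITH (
    format = 'PARQUET',
    partitioned_by = ARRAY['ds']
)
AS
SELECT orderkey, totalprice, ds
FROM cdp.default.orders;

-- Create a view over the new table.
CREATE VIEW cdp.default.large_orders AS
SELECT orderkey, totalprice, ds
FROM cdp.default.orders_parquet
WHERE totalprice > 1000;
```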