Cloudera Data Platform support

The Starburst Hive connector can be used to query Cloudera Data Platform (CDP) version 7.1.

Compatibility information for older CDP versions is available in the compatibility matrix.

Note

Cloudera Data Platform support requires a valid Starburst Enterprise Presto license.

Configuration

  • Edit the properties file of the catalog that uses the Hive connector

  • Set the hive.metastore property to thrift-cdp7

  • Set the hive.metastore.uri property to the URI of your Hive metastore Thrift service

connector.name=hive-hadoop2
hive.metastore=thrift-cdp7
hive.metastore.uri=thrift://cdp-master:9083
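
As a quick check after restarting the cluster, the new catalog can be queried for its schemas. This is a minimal sketch that assumes the properties file is named cdp.properties, so that the catalog is exposed as cdp; the name is hypothetical and not part of the configuration above.

```sql
-- Assumes the catalog file is etc/catalog/cdp.properties (hypothetical name),
-- which makes the catalog available as "cdp".
-- A successful result indicates the metastore Thrift service is reachable.
SHOW SCHEMAS FROM cdp;
```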

Hive metastore and statistics

CDP support includes the improved thrift-cdp7 Hive metastore implementation. It supports the metastore Thrift communication protocol that CDP implements for table statistics management.

This allows Presto to handle the following kinds of statistics separately:

  • Column statistics

  • Partition statistics

  • Table statistics

When using CDP, all statistics handling is performed by the Hive connector and the thrift-cdp7 Hive metastore, and is therefore identical to standard Hive connector usage.
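
Because statistics handling matches standard Hive connector usage, the usual statements for collecting and inspecting statistics should apply. The following sketch uses a hypothetical catalog cdp, schema sales, and table orders; none of these names come from this page.

```sql
-- Hypothetical names: catalog "cdp", schema "sales", table "orders".
-- Collect table and column statistics and store them in the metastore.
ANALYZE cdp.sales.orders;

-- Display the statistics currently available to the query planner.
SHOW STATS FOR cdp.sales.orders;
```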

Reading data

CDP support includes read operations on the following tables:

  • compacted tables

  • bucketed tables

  • partitioned tables

  • unpartitioned tables

The following file formats can be read:

  • Avro

  • CSV

  • ORC ACID

  • Parquet

  • RCFile
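
Reads use ordinary SELECT statements regardless of table layout or file format. As an illustration, the query below assumes a hypothetical table cdp.sales.orders that is partitioned by order_date and has an order_status column; all of these names are assumptions for the example.

```sql
-- Hypothetical table partitioned by order_date.
-- Filtering on the partition column limits the partitions that are read.
SELECT order_status, count(*) AS order_count
FROM cdp.sales.orders
WHERE order_date = DATE '2020-01-01'
GROUP BY order_status;
```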

Writing data

Write operations, such as CREATE TABLE AS and CREATE VIEW, are generally supported.

Write operations on ORC ACID tables are not supported.
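
The following sketch shows both statement types. The catalog, schema, table, and column names are hypothetical, and the example writes Parquet on the assumption that the target is a regular, non-ACID table.

```sql
-- Hypothetical names throughout; adjust to your catalog and schema.
-- Create a new Parquet table from a query result.
CREATE TABLE cdp.sales.orders_summary
WITH (format = 'PARQUET')
AS
SELECT order_status, count(*) AS order_count
FROM cdp.sales.orders
GROUP BY order_status;

-- Create a view over an existing table.
CREATE VIEW cdp.sales.open_orders AS
SELECT *
FROM cdp.sales.orders
WHERE order_status = 'OPEN';
```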