13.9. Hive MapR Connector

Overview

The Hive MapR connector allows querying data stored in a Hive data warehouse on a MapR cluster. The connector is based on the Hive Connector, see Hive Connector documentation for information about supported file types and generic configuration properties.

Note

The Hive MapR connector is currently a beta preview functionality. This connector is bundled in Presto Enterprise and requires a license from Starburst. For more information about Presto Enterprise and the MapR connector or to obtain a free trial, please contact hello@starburstdata.com.

Configuration

The Hive MapR connector supports MapR 6 clusters, using MapR SASL authentication.

Create etc/catalog/hive.properties with the following contents to mount the hive-mapr connector as the hive catalog, replacing example.net:9083 with the correct host and port for your Hive metastore Thrift service:

connector.name=hive-mapr
hive.metastore.uri=thrift://example.net:9083

MapR client libraries need to be installed and configured separately on each node running Presto. Please refer to MapR documentation for MapR Client installation instructions specific for your environment.

MapR SASL Authentication

MapR SASL authentication is supported for both MapR Filesystem and the Hive metastore. The authentication is based on MapR ticket file and is enabled by default. MapR ticket search path is the following:

  • /opt/mapr/conf/maprserverticket (or ${MAPR_HOME}/conf/maprserverticket when MAPR_HOME environment variable is set),
  • location defined by MAPR_TICKETFILE_LOCATION environment variable (if it is set),
  • /tmp/maprticket_${UID} where ${UID} is replaced with user id of the OS user executing Presto process (applicable when MAPR_TICKETFILE_LOCATION environment variable is not set).

Impersonation

The connector supports impersonation for both MapR Filesystem and the Hive metastore. Impersonation is disabled by default. You can enable impersonation using hive.metastore.impersonation-enabled=true, hive.maprfs.impersonation.enabled=true properties.

When enabling impersonation make sure that the MapR ticket used by Presto allows impersonation. The ticket should be of servicewithimpersonation type. Ticket type can be verified using maprlogin print -ticketfile path/to/ticket/file command.

Additionally, Presto service user must be allowed to impersonate users in Hive Metastore. Typically, this is achieved by setting hadoop.proxyuser.mapr.groups and hadoop.proxyuser.mapr.hosts properties to appropriate values in core-site.xml on the node running Metastore. Please refer to MapR documentation for detailed instructions.

Volumes

Typically, your MapR Filesystem is divided into multiple volumes. Presto creates temporary files when writing data to Hive tables backed by the MapR Filesystem (this is similar to “scratch directory” in Hive). It is strongly recommended to use a temporary staging directory that is on the same volume as the tables being written to, otherwise data needs to be written twice (once to the staging directory and then copied to the destination directory).

The temporary staging directory location can be configured using <catalog>.temporary_staging_directory_path session property or hive.temporary-staging-directory-path configuration property. The ${USER} placeholder can be used to use a different location for each user.

If you enable the MapR Filesystem impersonation in Presto, make sure the impersonated user has read and write permission to the selected location. /user/${USER}/tmp/presto is the recommended staging directory location when using impersonation. If you do not enable MapR Filesystem impersonation in Presto, make sure the Presto service user has read and write permission to the selected location. When impersonation is disabled, /user/presto/tmp/presto-${USER} is the recommended staging directory location.

MapR-specific Configuration Properties

See Hive Configuration Properties for general Hive Connector configuration properties. Configuration properties specific to Hive MapR are documented below.

Property Name Description Default
hive.metastore.authentication.type Hive metastore authentication type. Possible values are NONE or MAPR_SASL. MAPR_SASL
hive.metastore.impersonation-enabled Enable Metastore end user impersonation. false
hive.maprfs.impersonation.enabled Enable MapR Filesystem end user impersonation. false

Unsupported Configuration Properties

Certain Hive Configuration Properties are not supported in Hive MapR connector. These are listed below.

  • Kerberos-specific properties,
  • hive.hdfs.*.

Limitations

The following functionalities have not yet been certified by Starburst when used in combination with the Hive MapR connector: * Apache Sentry integration, * Apache Ranger integration.