7.16. Hive level security with Apache Sentry#
Apache Sentry is a granular, role-based authorization module for Hadoop. Sentry provides the ability to control and enforce precise levels of privileges on data for authenticated users and applications such as Presto.
Hive level security with Apache Sentry requires a valid Starburst Enterprise Presto license.
SEP can use Apache Sentry for role-based access control, enforcing the existing privileges granted on Hive objects. Presto enforces privileges assigned to Hive databases, tables, columns, and views. If a user does not have the privilege to query an object, the query fails and an error is returned.
Hive level security with Apache Sentry is limited to the Hive connector. We suggest replacing it with the more powerful system level security using Apache Ranger, which can secure catalogs that use other connectors as well as many other aspects of the system.
Before You Begin#
Before you configure Presto with Apache Sentry, verify the following prerequisites:
Cloudera Enterprise 5.12+ with Apache Sentry and Hive installed.
Presto Coordinator and Presto Workers have the appropriate network access to communicate with the Apache Sentry Service. Typically this is port 8038.
If LDAP is used for user-to-group mapping, Presto Coordinator and Presto Workers have the appropriate network access to communicate with the LDAP server. Typically this is port 636 or 389.
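The network prerequisites above can be checked from each Presto node before any configuration is changed. The following is a minimal Python sketch and not part of SEP; the function names `can_connect` and `check_endpoints` are hypothetical, and only the port numbers come from the list above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each endpoint name to whether its (host, port) pair is reachable."""
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

For example, `check_endpoints({"sentry": ("sentry.example.com", 8038), "ldaps": ("ldap.example.com", 636)})` returns a dictionary of booleans; the host names here are placeholders. A successful TCP connection only shows that the port is open. It does not validate Kerberos or LDAP credentials.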
If you are new to Apache Sentry, Cloudera provides excellent documentation for installing and configuring Apache Sentry:
When a query is submitted to Presto, Presto parses and analyzes the query to understand the privileges required by the user to access a particular object. Presto communicates with the Apache Sentry Service to determine if the request is valid. If the request is valid, the query continues to execute. If the request is invalid, an error is returned to the user.
Caching is also used to improve performance and reduce the number of requests to the Sentry service.
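The check-then-execute flow with caching described above can be illustrated with a small model. This is a conceptual Python sketch under stated assumptions, not Presto's actual implementation: `CachedAuthorizer`, `AccessDenied`, and `run_query` are hypothetical names, and the `check` callable stands in for the remote call to the Sentry service.

```python
import time


class AccessDenied(Exception):
    """Raised when the user lacks the required privilege."""


class CachedAuthorizer:
    """Caches allow/deny decisions for a fixed time-to-live (TTL)."""

    def __init__(self, check, ttl_seconds, clock=time.monotonic):
        self._check = check      # stand-in for the remote Sentry request
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}         # (user, obj, privilege) -> (expiry, allowed)

    def is_allowed(self, user, obj, privilege):
        key = (user, obj, privilege)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]        # fresh entry: no remote call needed
        allowed = self._check(user, obj, privilege)
        self._cache[key] = (now + self._ttl, allowed)
        return allowed


def run_query(authorizer, user, obj, privilege="SELECT"):
    """Fail with an error if the request is invalid, otherwise continue."""
    if not authorizer.is_allowed(user, obj, privilege):
        raise AccessDenied(f"{user} lacks {privilege} on {obj}")
    return f"executing query on {obj} for {user}"
```

One consequence visible in the sketch is that a privilege change in Sentry is not seen until the cached entry expires, which is the trade-off behind tuning the TTL values.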
Configuring Presto with Apache Sentry#
Apache Sentry Configuration#
As with Hive, Impala, Spark, and Hue, you must create an admin group for Presto named
You can do this via the Cloudera Manager or manually by adding to the property,
sentry.service.admin.group in the
sentry-site.xml file. The user of the Presto process
should belong to this group. Additionally you must add Presto user (from
For additional information refer to the Cloudera documentation:
Presto Configuration for Apache Sentry#
SEP must be configured to enable Presto to communicate with the Apache Sentry service. To enable it, set the following property in the Hive connector configuration:
hive.security = sentry
Once Apache Sentry is enabled for Presto, additional required and optional properties are available for further configuration. The full list of configuration properties is in Apache Sentry Based Authorization.
Sentry manages role permissions and the associations between roles and user groups. Sentry does not manage the associations between users and user groups. For this reason, any application using Sentry must be configured so that it can determine a user’s groups. In Presto, the sentry.group-mapping property specifies how the user groups are determined. By default it is set to HADOOP_DEFAULT. See Apache Sentry Based Authorization for other possible values.
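To confirm what a Hive catalog file actually specifies, the two properties discussed so far can be read back programmatically. The following Python sketch is illustrative only: `parse_properties` and `sentry_settings` are hypothetical helpers, and the parser handles only simple `key = value` lines rather than the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def sentry_settings(props):
    """Return the effective group mapping, or fail if Sentry is not enabled."""
    if props.get("hive.security") != "sentry":
        raise ValueError("hive.security is not set to sentry")
    # HADOOP_DEFAULT is the documented default when the property is absent.
    return {"group_mapping": props.get("sentry.group-mapping", "HADOOP_DEFAULT")}
```

Passing the contents of a catalog properties file to `parse_properties` and the result to `sentry_settings` reports which group mapping is in effect, including the case where `sentry.group-mapping` is left unset.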
For more information from Cloudera’s documentation, refer to:
You may want to reuse your existing
sentry-site.xml configuration instead of setting new properties
in the Hive connector configuration. To have Presto use an XML configuration file, set
to the file location of a
sentry-site.xml configuration file.
HADOOP_DEFAULT group mapping and
sentry.config.resources is set,
and the provided file(s) contain a value for
the configured user group mapping will be used. If you do not set
Presto will use Hadoop’s default behavior, which is to retrieve user groups from the local operating system.
Similarly, when using
LDAP group mapping and you provide Hadoop configuration files using the
sentry.config.resources property, you can omit the LDAP group mapping properties in the Hive connector
There is some latency associated with making the remote procedure calls to Apache Sentry as well as syncing LDAP groups. Presto includes a caching mechanism so that subsequent calls can look at the cache before making the remote call.
See Apache Sentry Based Authorization for the cache properties along with their default values. Depending on your use case, you may want to increase or decrease the default TTL values.
If you get the exception
GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed) then you need to make sure you are using proper
If you get an
SentryAccessDeniedException exception, then make sure the user that you set for
sentry.admin-user belongs to any group listed by
If Presto cannot connect to Kerberized Sentry and you get the exception
Peer indicated failure: Problem with callback handler, make sure that you added the Presto user (from
sentry-site. Additionally, make sure the letter casing matches.
Make sure that your
sentry.server value is correct. It is not an IP address or hostname; it is the server object name in Sentry.
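Several of the checks above come down to whether a group appears in sentry-site.xml with exactly the expected casing. The following Python sketch shows one way to inspect that. It assumes the standard Hadoop-style layout of `<property>` elements with `<name>` and `<value>` children, and it assumes that sentry.service.admin.group holds a comma-separated list; `read_site_xml` and `admin_group_status` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def read_site_xml(xml_text):
    """Collect name/value pairs from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def admin_group_status(props, group):
    """Return 'listed', 'case-mismatch', or 'missing' for the given group."""
    raw = props.get("sentry.service.admin.group", "")
    listed = [g.strip() for g in raw.split(",") if g.strip()]
    if group in listed:
        return "listed"
    if group.lower() in (g.lower() for g in listed):
        return "case-mismatch"   # present, but letter casing differs
    return "missing"
```

A `case-mismatch` result points at the letter casing issue mentioned above, while `missing` suggests the group was never added to the property.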
Presto only enforces the Apache Sentry policies. Presto does not support any modification of authorization policies in Sentry, which includes commands such as CREATE ROLE, GRANT, and REVOKE. If you need to modify roles and privileges, use another tool such as Apache Hive or Hue. Sentry policy files are also not supported.