Hive level security with Apache Ranger#

Apache Ranger offers comprehensive security for Hadoop. Using the Apache Ranger web console, you can manage policies for access to data for authenticated users and applications such as Presto.

SEP can use Apache Ranger for role-base access control enforcing the same and existing privileges granted on Hive objects. Presto enforces privileges assigned to Hive databases, tables, and columns. If a user does not have a privilege to query an object, the query fails and an error is returned.

Note

Hive level security with Apache Ranger requires a valid Starburst Enterprise Presto license.

Warning

The Hive level security for Apache Ranger is limited to usage with the Hive connector only. We suggest to replace it with the more powerful system level security. It is capable of securing catalogs using other connectors and many other aspects.

Before you begin#

Before you configure Presto with Apache Ranger, verify the following prerequisites:

  • Hortonworks Data Platform (HDP) 2.6+ with Apache Ranger and Apache Hive installed.

  • If you’re not using HDP, then Apache Ranger 0.7.1+ must be installed.

  • Presto Coordinator and Presto Workers have the appropriate network access to communicate with the Apache Ranger Service. Typically this is port 6080 or 6182, if SSL is used.

If you are new to Apache Ranger, Hortonworks provides excellent documentation for installing and configuring Apache Ranger.

Architecture#

When a query is submitted to Presto, Presto parses and analyzes the query to understand the privileges required by the user to access a particular object. Presto communicates with the Apache Ranger Service to determine if the request is valid. If the request is valid, the query continues to execute. If the request is invalid, an error is returned to the user.

Caching is also used to improve performance and reduce the number of requests to the Ranger service.

Configuring Presto with Apache Ranger#

Apache Ranger configuration#

There needs to be sys admin Ranger user (user with role ROLE_SYS_ADMIN) that matches Presto Kerberos principal ranger.kerberos-principal or Ranger Presto plugin username ranger.presto-plugin-username and password ranger.presto-plugin-password, if BASIC auth is used.

Presto Kerberos principal is translated to Ranger user name via auth-to-local hadoop rules from core-site.xml.

Presto configuration for Apache Ranger#

SEP must be configured to enable Presto to communicate with the Apache Ranger service. To enable set the following property in the Hive connector configuration.

hive.security = ranger

Once Apache Ranger is enabled for Presto, there are additional required and optional properties to further configure.

Apache Ranger based authorization#

Find more information about the necessary configuration properties in Apache Ranger Based Authorization.

Ranger policy management with SQL#

SEP allows managing Hive policies using the SQL GRANT and REVOKE statements.

Ranger Hive service definitions use a limited access type model compared to Presto. This limits the granularity of grant and revoke operations.

SEP defines the following mapping from the Hive access type model to the Presto privileges model:

Hive access type

Presto privilege

SELECT

SELECT

UPDATE

INSERT and UPDATE and DELETE

and the other way around:

Presto privilege

Hive access type

SELECT

SELECT

INSERT

Unsupported, use ALL PRIVILEGES instead

UPDATE

Unsupported, use ALL PRIVILEGES instead

DELETE

Unsupported, use ALL PRIVILEGES instead

ALL PRIVILEGES

SELECT and UPDATE

Only managing policies for users is currently supported.

Resource and privilege relationship#

The integration of Ranger with the Hive connector does not support any kind of relation between a resource, such as a table, and the granted access. This is particularly evident when newly created resources are treated with the access rights of the existing policies.

For example, as user without full administrative access (ALL PRIVILEGES) you can end up in a situation where you create a new table, but can not grant access to that table to other users. As a workaround you can create global policies that use matching patterns and define the desired default access.

For example, if you want all tables ending with -reporting to be accessible by everyone, you can create a policy that uses a value for *-reporting for the table. If you use * for database and * for the Hive column, you can grant a specific access to such a reporting table and all its columns in any database.

Alternatively you can grant all privileges to anyone creating new objects and requiring privileges to grant access to others. This policy might be too open for your use case, depending on your security policies, the data, the number of users and other aspects.

Additional features#

In addition to enforcing the policies in Apache Ranger, Presto also integrates with:

  • Apache Ranger Key Management Service

  • row level filtering

  • column masking

  • tag based policies

  • support for AWS Glue Data Catalog

Limitations#

Authorization information cannot be accessed by querying the following tables such as information_schema.roles, information_schema.applicable_roles, information_schema.enabled_roles, and information_schema.table_privileges