14.32. Teradata Connector

The Teradata Database connector allows querying and creating tables in an external Teradata database. This can be used to join data between different systems like Teradata and Hive, or between different Teradata Database instances.

Presto Teradata Connector supports Teradata 16.

Note

This connector is bundled in Presto Enterprise and requires a license from Starburst. For more information about Presto Enterprise and Teradata connector or to obtain a free trial, please contact hello@starburstdata.com.

Installation and Configuration

Prerequisites

Any of the below installation and configuration methods require the following artifacts:

  • Teradata’s JDBC driver, may be obtained from Teradata’s downloads.teradata.com
  • Connection details to connect to the Teradata DB, in form of a regular Presto connector config (e.g. teradata.properties for a Presto catalog named teradata). File should contain the following contents, replacing the connection properties as appropriate for your setup:
connector.name=teradata
connection-url=jdbc:teradata://HOST
connection-user=USERNAME
connection-password=PASSWORD

Please refer to the Teradata documentation for more information about format and parameters of the JDBC URL supported by the Teradata JDBC Driver.

Plugin Configuration

To install the Teradata Database connector manually, use the following directions. You will need an installation of the Starburst Distribution of Presto and an Teradata JDBC Driver files (terajdbc4.jar and tdgssconfig.jar).

  1. Add the Teradata JDBC Driver JAR files to the Presto plugin/teradata directory.
  2. Add Presto Teradata catalog properties file (eg. teradata.properties for a Presto catalog named teradata).
  3. Perform the above steps on every Presto cluster node.
  4. Restart Presto on every node.

Kerberos Authentication

The Teradata connector supports Kerberos-based authentication with the following configuration:

teradata.authentication.type=KERBEROS
kerberos.client.principal=example@example.com
kerberos.client.keytab=etc/kerberos/example.keytab

With this configuration the user example@example.com, defined in the principal property, is used to connect to the database, and the related Kerberos service ticket is located in the example.keytab file.

Kerberos Credential Passthrough

The Teradata connector can be configured to pass through Kerberos credentials, received by the Presto server, to the Teradata database.

Configure Kerberos and the Presto server, following the instructions in Credential Passthrough.

Configure the Teradata connector to pass through the credentials from the server to the database:

teradata.authentication.type=PASS_THROUGH

Now any database access via Presto is subject to the data access restrictions and permissions of the user supplied via Kerberos.

Querying Teradata Databases

The Teradata connector provides a schema for each Teradata database’s. Assuming the catalog name is teradata, you can see the available schemas by running SHOW SCHEMAS:

SHOW SCHEMAS FROM teradata;

If you have an Teradata database named web, you can view the tables they own by running SHOW TABLES:

SHOW TABLES FROM teradata.web;

You can see a list of the columns in the clicks table in web’s schema using either of the following:

DESCRIBE teradata.web.clicks;
SHOW COLUMNS FROM teradata.web.clicks;

Finally, you can access the clicks table in web’s schema:

SELECT * FROM teradata.web.clicks;

Your privileges in these schemas are those of the user configured in the connection properties file. If the user does not have access to these tables, you will not be able to access them.

Configuring Concurrency

Presto is reading from Teradata Database using multiple parallel JDBC connections. By default, number of parallel reads is determined automatically based on number of AMPs available in Teradata. The parallel reads be configured by using below properties.

Teradata Concurrency Configuration Properties

Property Name Description Default
teradata.concurrency-type
Determines the parallelism method. Possible values are:
  • AMP_BASED - single JDBC connection per Teradata AMP
  • MANUAL - user configures number of parallel connections
AMP_BASED
teradata.connections-count Number of parallel connections, to be used when concurrency-type=MANUAL is used. Use 1 to disable parallel read. Number of AMP in Teradata Database

User Impersonation

Teradata connector supports user impersonation. Please see User Impersonation for more details.

User impersonation can be turned with:

teradata.impersonation-enabled=true

User impersonation in Teradata connector is based on GRANT CONNECT THROUGH. For more details, search for GRANT CONNECT THROUGH on the Teradata documentation.

Note

Option WITH TRUST_ONLY in GRANT CONNECT THROUGH is not yet supported in Terdata connector.

Table Statistics

Teradata connector supports table and column statistics. The statistics are collected by Teradata Database and retrieved by the connector. To collect statistics for a table, execute the following statements in Teradata Database. The second statement needs to be repeated for every column you want to collect statistics for.

COLLECT SUMMARY STATISTICS ON table_schema.table_name;
COLLECT STATISTICS COLUMN(column_name) ON table_schema.table_name;

The table and column statistics can be viewed in Presto using SHOW STATS and are used for Cost-Based Optimizations.

Table Statistics Configuration Properties

Property Name Description Default
statistics.enabled Enables table and column statistics. true
statistics.cache-ttl Duration for which table and column statistics are cached. 10m
statistics.cache-missing Cache the fact that table statistics were not available. false

Mapping Data Types Between Presto and Teradata

Presto and Teradata each support different data types for table columns and use different names for some of them. Some data types are not supported equally by both systems. The following tables display the mapping used by Presto when working with existing columns, and when creating tables in Teradata.

Mappings for working with existing fields in Teradata:

Teradata data type Presto data type
BYTEINT TINYINT
SMALLINT SMALLINT
INTEGER INTEGER
BIGINT BIGINT
REAL REAL
FLOAT REAL
DOUBLE DOUBLE
NUMERIC DECIMAL
DECIMAL DECIMAL
CHAR CHAR
NCHAR CHAR
VARCHAR VARCHAR
NVARCHAR VARCHAR
LONGVARCHAR VARCHAR
LONGNVARCHAR VARCHAR
DATE DATE
TIMESTAMP TIMESTAMP
TIMESTAMP WITH TIME ZONE not supported
JSON not supported
XML not supported

Any Teradata data types not listed above, such as JSON or XML and others, are not supported.

If a Teradata column uses a type not listed in the above table, any query accessing such a field fails.

Mappings for creating tables with specific fields in Teradata:

Presto data type Teradata data type
TINYINT BYTEINT
SMALLINT SMALLINT
INTEGER INTEGER
BIGINT BIGINT
REAL REAL
DOUBLE DOUBLE PRECISION
DATE DATE
TIMESTAMP TIMESTAMP
CHAR CHAR
VARCHAR VARCHAR

The mapping above applies to creating a table with CREATE TABLE and CREATE TABLE ... AS [SELECT] statements.

Any Presto data type not listed in the above table can not be created as a column in a new Teradata table.

Configuration

Configuration Property Name Session Property name Description Default
jdbc-types-mapped-to-varchar   Coma separated list of data types that will be converted to unbounded VARCHAR.  
unsupported-type.handling-strategy unsupported_type_handling_strategy
Configures how unsupported column data types are handled:
  • IGNORE - column is not accessible.
  • CONVERT_TO_VARCHAR - column is converted to unbounded VARCHAR.
IGNORE
teradata.number.default-scale number_default_scale Default Presto DECIMAL scale for Teradata NUMBER (without precision and scale) date type. When not set then such column is treated as not supported. 0
teradata.number.rounding-mode number_rounding_mode

Rounding mode for the Teradata NUMBER data type. Possible values are:

  • UNNECESSARY - Rounding mode to assert that the requested operation has an exact result, hence no rounding is necessary.
  • CEILING - Rounding mode to round towards positive infinity.
  • FLOOR - Rounding mode to round towards negative infinity.
  • HALF_DOWN - Rounding mode to round towards nearest neighbor unless both neighbors are equidistant, in which case rounding down is used.
  • HALF_EVEN - Rounding mode to round towards the nearest neighbor unless both neighbors are equidistant, in which case rounding towards the nearest neighbor is performed.
  • HALF_UP - Rounding mode to round towards nearest neighbor unless both neighbors are equidistant, in which case rounding up is used
  • UP - Rounding mode to round towards zero.
  • DOWN - Rounding mode to round towards zero.
UNNECESSARY

Lower case of Teradata NOT CASESPECIFIC char types in Presto

Presto does not support case insensitive comparision between values that of char-based type. However, it is possible to force Presto Teradata connector to lower case values of such types. This can be turned with:

teradata.type.not-case-specific.to-lower-case=true

or with catalog session property:

SET SESSION teradata.not_casespecific_to_lower_case = true;