Starburst MemSQL connector#

The Starburst MemSQL connector is an extended version of the MemSQL connector. It can be used to connect and query MemSQL databases. Configuration and usage is identical.

The following improvements are included:

Note

The additional features of the connector require a valid Starburst Enterprise Presto license, unless otherwise noted.

Performance#

The connector includes a number of performance improvements, detailed in the following sections.

Table statistics#

The MemSQL connector supports table and column statistics, as documented in Table Statistics. Table-level statistics are based on MemSQL’s INFORMATION_SCHEMA.TABLE_STATISTICS table.

MemSQL can automatically update its table and column statistics. In some cases, you may want to force a statistics update, for example after creating new columns or after changing data in the table. You can do that by executing the following statement in MemSQL Database:

ANALYZE TABLE table_name;

The table and column statistics can be viewed in Presto using SHOW STATS and are used for Cost based optimizations.

Note

MemSQL statistics are estimates, and Presto and MemSQL may use statistics information in different ways. For this reason, the accuracy of table and column statistics returned by the MemSQL connector might be lower than that of others connectors.

Improving statistics accuracy

You can improve statistics accuracy and access column-level statistics with histogram statistics. Column-level statistics are based on MemSQL’s column statistics INFORMATION_SCHEMA.ADVANCED_HISTOGRAMS table, if available. If that table is not available, the information is based on the INFORMATION_SCHEMA.OPTIMIZER_STATISTICS table instead.

The ADVANCED_HISTOGRAMS table includes additional stats such as the MIN and MAX values for a column, which are not available in the LEGACY_HISTOGRAMS table. MemSQL’s ADVANCED_HISTOGRAMS feature requires version 6.5 or later, and a cardinality_estimation_level >= 6.5.

To check for these conditions and determine whether ADVANCED_HISTOGRAMS are available to you, execute the following statement in MemSQL:

SELECT
   COLUMN_NAME,
   IF(RANGE_STATS=1, true, false) as histograms_available,
   IF(ADVANCED_HISTOGRAMS=1, 'Advanced', 'Legacy') as histogram_type
FROM INFORMATION_SCHEMA.OPTIMIZER_STATISTICS
WHERE DATABASE_NAME = 'db' AND TABLE_NAME = 'table';

If they are available in your MemSQL version, use the following statement to populate the ADVANCED_HISTOGRAMS table:

ANALYZE TABLE table_name COLUMNS ALL ENABLE;

Refer to the MemSQL documentation for information about options, limitations and additional considerations.

Table statistics configuration properties#

Property Name

Description

Default

statistics.enabled

Enables table and column statistics

true

statistics.cache-ttl

Duration for which table and column statistics are cached

0s

statistics.cache-missing

Cache the fact that table statistics are not available

false

Pushdown#

The connector supports pushdown for the following aggregate functions:

Dynamic filtering#

Dynamic filtering is enabled by default. You can configure the wait-for-dynamic-filters property in your catalog properties file:

wait-for-dynamic-filters=false

You can use the wait_for_dynamic_filters catalog session property in a specific session:

SET SESSION mycatalog.wait_for_dynamic_filters = true;