6.4. Presto Kubernetes Resource#
A Presto Kubernetes resource represents a specific Presto cluster. The YAML file allows you to specify various properties related to the coordinator, the workers, the catalogs, and the cluster overall.
Overview#
The following snippet shows all available properties with defaults. When defining your particular Presto cluster resource, you need to specify only the properties with non-default values.
apiVersion: starburstdata.com/v1
kind: Presto
metadata:
  name: presto-cluster-name
spec:
  nameOverride: ""
  clusterDomain: cluster.local
  environment: ""
  additionalJvmConfigProperties: ""
  additionalCatalogs: {}
  additionalEtcPrestoTextFiles: {}
  additionalEtcPrestoBinaryFiles: {}
  licenseSecretName: ""
  imageNamePrefix: ""
  additionalBootstrapScriptVolume: {}
  additionalBootstrapScriptVolumes: {}
  additionalVolumes: []
  prometheus:
    enabled: false
    additionalRules: {}
  service:
    type: ClusterIP
    name: ""
    additionalSpecProperties: {}
    nodePort: 31234
  image:
    name: starburstdata/presto:338-e.5-k8s-0.50
    pullPolicy: Always
  memory:
    nodeMemoryHeadroom: 2Gi
    xmxToTotalMemoryRatio: 0.9
    heapHeadroomPerNodeRatio: 0.3
    queryMaxMemory: 1Pi
    queryMaxTotalMemoryPerNodePoolFraction: 0.333
  coordinator:
    cpuLimit: ""
    cpuRequest: 16
    memoryAllocation: 60Gi
    nodeSelector: {}
    affinity: {}
    additionalProperties: ""
    additionalAnnotations: {}
  worker:
    count: 2
    autoscaling:
      enabled: false
      minReplicas: 1
      maxReplicas: 100
      targetCPUUtilizationPercentage: 80
    deploymentTerminationGracePeriodSeconds: 7200 # 2 hours
    prestoWorkerShutdownGracePeriodSeconds: 120
    cpuLimit: ""
    cpuRequest: 16
    memoryAllocation: 100Gi
    nodeSelector: {}
    affinity: {}
    additionalProperties: ""
    additionalAnnotations: {}
  readinessProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 15
  livenessProbe:
    initialDelaySeconds: 300
    periodSeconds: 300
    failureThreshold: 1
    timeoutSeconds: 15
  spilling:
    enabled: false
    volume:
      emptyDir: {}
  usageMetrics:
    enabled: true
    usageClient:
      initialDelay: 1m
      interval: 1m
  hive:
    metastoreUri: ""
    awsSecretName: ""
    googleServiceAccountKeySecretName: ""
    azureWasbSecretName: ""
    azureAbfsSecretName: ""
    azureAdlSecretName: ""
    additionalProperties: ""
    internalMetastore:
      mySql:
        jdbcUrl: ""
        username: ""
        password: ""
      postgreSql:
        jdbcUrl: ""
        username: ""
        password: ""
      internalPostgreSql:
        enabled: false
        image:
          name: postgres:9.6.10
          pullPolicy: IfNotPresent
        storage:
          className: ""
          size: 10Gi
          claimSelector: {}
        memory: 2Gi
        cpu: 2
        nodeSelector: {}
        affinity: {}
      s3Endpoint: ""
      image:
        name: starburstdata/hive-metastore:k8s-0.7
        pullPolicy: IfNotPresent
      memory: 6Gi
      cpu: 2
      nodeSelector: {}
      affinity: {}
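Because every property has a default, a working resource can be very small. The following sketch overrides only the worker count and adds one catalog; the cluster name and catalog contents are illustrative:

apiVersion: starburstdata.com/v1
kind: Presto
metadata:
  name: my-presto
spec:
  worker:
    count: 3
  additionalCatalogs:
    tpch: |
      connector.name=tpch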
General Properties#

nameOverride
    The Presto Operator by default assigns a unique name to various Kubernetes resources (e.g. Services). The name consists of the cluster name and a unique suffix. Use this property to assign a static name instead.

clusterDomain
    Domain of the Kubernetes cluster.

environment
    The name of the Presto cluster environment.

additionalJvmConfigProperties
    Specify additional properties for the configuration of the JVM running Presto. Properties are appended to the default configuration. Example:

    additionalJvmConfigProperties: |
      -XX:NewRatio=4
      -XX:SurvivorRatio=

additionalCatalogs
    Add one or more catalogs with the relevant properties files. The element name determines the name of the catalog and the multi-line text sets the content of the properties file. Example:

    additionalCatalogs:
      tpcds: |
        connector.name=tpcds
      jmx: |
        connector.name=jmx
      cms: |
        connector.name=postgresql
        connection-url=jdbc:postgresql://example.com:5432/cmsdb
        connection-user=myuser
        connection-password=mypassword

additionalEtcPrestoTextFiles
    Add one or more configuration text files to the Presto etc directory. The element name determines the file name and the multi-line text sets the file content. Example:

    additionalEtcPrestoTextFiles:
      access-control.properties: |
        access-control.name=read-only
      event-listener.properties: |
        event-listener.name=event-logger
        jdbc.url=jdbc:postgresql://example.com:5432/eventlog
        jdbc.user=myuser
        jdbc.password=mypassword

additionalEtcPrestoBinaryFiles
    Add one or more binary files (provided as base64-encoded values) to the Presto etc directory.

licenseSecretName
    Name of a Kubernetes Secret that contains an SEP license file. The license file within the secret should be named starburstdata.license.

imageNamePrefix
    Specifies the prefix of the Docker image names used by the Presto cluster. This property enables using a private Docker registry.

image
    Specifies a custom Docker image to be used by the cluster, with organization namespace, image name, and tag. Example:

    image:
      name: org/name:tag
      pullPolicy: IfNotPresent

additionalBootstrapScriptVolume
    Property of the coordinator and worker pods. Allows adding a custom bootstrap script. Example:

    additionalBootstrapScriptVolume:
      configMap:
        name: my-bootstrap-script

additionalBootstrapScriptVolumes
    Property of the coordinator and worker pods. Allows adding multiple custom bootstrap scripts. Example:

    additionalBootstrapScriptVolumes:
      - configMap:
          name: my-bootstrap-script-1
      - configMap:
          name: my-bootstrap-script-2

additionalVolumes
    Add one or more volumes supported by Kubernetes to all nodes in the cluster. Each volume can optionally be mounted at a specific path in the container image. You can use this feature to add one or more volumes and use them to cache distributed storage objects. Example:

    additionalVolumes:
      - path: /var/lib/presto/cache1
        volume:
          hostPath:
            path: /media/nv1/presto-cache
      - path: /var/lib/presto/cache2
        volume:
          hostPath:
            path: /media/nv2/presto-cache
Service Properties#

Access to the coordinator is possible via the Kubernetes Service. By default, the service is only accessible within the Kubernetes cluster at http://presto-coordinator-CLUSTER_NAME_UUID.NAMESPACE.svc.cluster.local:8080, where NAMESPACE is the Kubernetes namespace where the given Presto cluster is deployed and CLUSTER_NAME_UUID is the cluster name with a unique suffix appended.

Use the service.name Presto resource parameter to make the service name more predictable. For example, setting service.name=test-cluster causes the coordinator service address to be http://presto-coordinator-test-cluster.NAMESPACE.svc.cluster.local:8080. Use the nameOverride parameter to set CLUSTER_NAME_UUID to a different value.

You can also change the type of the coordinator Service using the service.type parameter. For more information, refer to the Kubernetes documentation on Service types.

You can add additional parameters to the spec section of the Service using the service.additionalSpecProperties parameter, e.g.

service:
  additionalSpecProperties:
    loadBalancerIP: 78.11.24.19
  type: LoadBalancer

Use the service.nodePort parameter to specify the port on which the Presto coordinator Service is exposed when service.type is set to NodePort, e.g.

service:
  type: NodePort
  nodePort: 3001
General Memory Properties#

The memory section specifies the general Presto memory configuration.

memory.nodeMemoryHeadroom
    Memory headroom that Presto leaves on a node when Presto pods are configured to use the entire node memory (empty memoryAllocation). Example:

    memory:
      nodeMemoryHeadroom: 2Gi

memory.xmxToTotalMemoryRatio
    Ratio between the Presto JVM heap size and the memory available for a Presto pod. Example:

    memory:
      xmxToTotalMemoryRatio: 0.9

memory.heapHeadroomPerNodeRatio
    Ratio between the heap headroom reserved for allocations not tracked by Presto and the Presto JVM heap size. Example:

    memory:
      heapHeadroomPerNodeRatio: 0.3

memory.queryMaxMemory
    Value of the query.max-memory Presto configuration property. Example:

    memory:
      queryMaxMemory: 1Pi

memory.queryMaxTotalMemoryPerNodePoolFraction
    Value of the query.max-total-memory-per-node Presto configuration property, expressed as a fraction of the memory pool size. Example:

    memory:
      queryMaxTotalMemoryPerNodePoolFraction: 0.333
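The interplay of these ratios can be illustrated with a small calculation. This is a sketch of the documented behavior, not the operator's actual code; the function and parameter names are illustrative:

```python
# Illustrative sketch of how the memory ratios above translate into concrete
# values for a pod that uses the entire node memory (empty memoryAllocation).
# This mirrors the documented behavior; it is not the operator's actual code.

def presto_memory_settings(node_memory_gib,
                           node_memory_headroom_gib=2,
                           xmx_to_total_memory_ratio=0.9,
                           heap_headroom_per_node_ratio=0.3):
    # Memory left for the Presto pod after the per-node headroom.
    available_gib = node_memory_gib - node_memory_headroom_gib
    # JVM heap size (-Xmx) as a ratio of the pod's available memory.
    xmx_gib = available_gib * xmx_to_total_memory_ratio
    # Heap headroom reserved for allocations not tracked by Presto.
    heap_headroom_gib = xmx_gib * heap_headroom_per_node_ratio
    return {"xmx_gib": xmx_gib, "heap_headroom_gib": heap_headroom_gib}

# A 64 GiB node with the default ratios:
print(presto_memory_settings(64))
```

With the defaults, a 64 GiB node leaves 2 GiB for the system, gives the JVM a 55.8 GiB heap, and reserves roughly 16.7 GiB of that heap as headroom.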
Coordinator Properties#

All coordinator properties are nested within coordinator.

cpuRequest and cpuLimit
    Specifies the coordinator pod's CPU request and limit. Example:

    coordinator:
      cpuRequest: 6
      cpuLimit: 32

memoryAllocation
    Specifies the coordinator pod's memory usage (both request and limit). If empty, the coordinator pod utilizes the entire memory available on the node. Example:

    coordinator:
      memoryAllocation: 60Gi

nodeSelector and affinity
    Specifies the coordinator pod's node selector and affinity. Example:

    coordinator:
      nodeSelector:
        role: "presto"
      affinity:
        podAffinity:
          ...

additionalProperties
    Specifies additional Presto configuration properties for the coordinator. Example:

    coordinator:
      additionalProperties: |
        resource-groups.config-file=filename

additionalAnnotations
    Specifies additional annotations for the coordinator pod.
Worker Properties#

All worker properties are nested within worker.

count
    Number of worker pods. Example:

    worker:
      count: 3

autoscaling
    Configuration of worker autoscaling. Example:

    worker:
      autoscaling:
        enabled: true
        minReplicas: 1
        maxReplicas: 100
        targetCPUUtilizationPercentage: 80

deploymentTerminationGracePeriodSeconds
    Specifies the termination grace period for worker pods. A worker pod is not terminated until the queries running on it finish or the grace period passes. Example:

    worker:
      deploymentTerminationGracePeriodSeconds: 7200

prestoWorkerShutdownGracePeriodSeconds
    Specifies the grace period used during the graceful shutdown of the Presto worker process.

cpuRequest and cpuLimit
    Specifies the worker pod's CPU request and limit. Example:

    worker:
      cpuRequest: 6
      cpuLimit: 32

memoryAllocation
    Specifies the worker pod's memory usage (both request and limit). If empty, the worker pod utilizes the entire memory available on the node. Example:

    worker:
      memoryAllocation: 100Gi

nodeSelector and affinity
    Specifies the worker pod's node selector and affinity. Example:

    worker:
      nodeSelector:
        role: "presto"
      affinity:
        podAffinity:
          ...

additionalProperties
    Specifies additional Presto configuration properties for the workers. Example:

    worker:
      additionalProperties: |
        resource-groups.config-file=filename

additionalAnnotations
    Specifies additional annotations for the worker pod.
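The worker properties above combine naturally: autoscaling adds and removes pods, while the grace periods let in-flight queries drain before a removed pod is terminated. A sketch with illustrative values:

worker:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 80
  deploymentTerminationGracePeriodSeconds: 7200
  prestoWorkerShutdownGracePeriodSeconds: 120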
Readiness and Liveness Probe#

You can configure Kubernetes probes with the readinessProbe and livenessProbe elements.

readinessProbe
    Properties of the coordinator and worker readiness probes. For more information on readiness probes, refer to the Kubernetes documentation on probes. Example:

    readinessProbe:
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 15

livenessProbe
    Properties of the coordinator and worker liveness probes. For more information on liveness probes, refer to the Kubernetes documentation on probes. Example:

    livenessProbe:
      initialDelaySeconds: 300
      periodSeconds: 300
      failureThreshold: 1
      timeoutSeconds: 15
Usage Metrics#
The defaults configure usage metrics correctly. No changes are necessary.
Spill to Disk Properties#

You can add an additional volume and enable optional disk spilling in the spilling section. Use any type of volume supported by Kubernetes, and configure it in spec.volumes. The volume needs to be defined, and then used for spilling:

spec:
  volumes:
    - name: someVolume
      emptyDir: {}
    - name: mySpillVolume
      hostPath:
        path: /opt/data
        type: Directory
  spilling:
    enabled: true
    volume: mySpillVolume

The default spilling configuration uses spec.volumes[0] set to emptyDir. We recommend using an emptyDir or hostPath volume for spilling. Ensure that each worker has exclusive access to its spilling volume, and that the path exists.
Hive Connector Properties#

SEP on Kubernetes provides automatic configuration of the Hive connector. The connector allows you to either access an external Metastore or use a built-in Metastore internal to the Presto cluster.
External Metastore#

You can configure Presto to use an external Hive Metastore by setting the hive.metastoreUri property, e.g.

hive:
  metastoreUri: thrift://hive-metastore:9083

awsSecretName
    Name of the AWS secret used to access the metastore.

googleServiceAccountKeySecretName
    Name of the secret with the Google service account key used to access the metastore.

azureWasbSecretName
    Name of the secret for Windows Azure Storage Blob (WASB) used to access the metastore.

azureAbfsSecretName
    Name of the secret for the Azure Blob Filesystem (ABFS) used to access the metastore.

azureAdlSecretName
    Name of the secret for Azure Data Lake (ADL) used to access the metastore.

additionalProperties
    Specifies additional properties to configure the Hive connector behavior. Example:

    hive:
      additionalProperties: |
        hive.allow-drop-table=true
Internal Metastore#

You can configure Presto to use an internal Hive Metastore by setting the hive.internalMetastore.mySql or hive.internalMetastore.postgreSql properties, e.g.

hive:
  internalMetastore:
    mySql:
      jdbcUrl: jdbc:mysql://mysql-server/metastore-database
      username: hive
      password: hivePassword

or

hive:
  internalMetastore:
    postgreSql:
      jdbcUrl: jdbc:postgresql://postgresql-server/metastore-database
      username: hive
      password: hivePassword

In such a case, an additional Hive Metastore pod is created. It stores the metadata in the provided PostgreSQL or MySQL database.

You can also make the internal Metastore use an internal ephemeral PostgreSQL database by setting hive.internalMetastore.internalPostgreSql.enabled to true, e.g.

hive:
  internalMetastore:
    internalPostgreSql:
      enabled: true

In this case, an additional PostgreSQL pod is created along with the Hive Metastore pod. The Hive Metastore uses the internal ephemeral PostgreSQL as its relational database.

Note: Using the internal ephemeral PostgreSQL is not recommended for production. Data stored within the internal PostgreSQL is lost when the Presto cluster is terminated.

Note: You cannot use an external and an internal Metastore at the same time. You also cannot use an external and an internal relational database for the Metastore at the same time.
Customizing Internal Hive Metastore Pod#

It is possible to customize the internal Hive Metastore pod's Docker image via the hive.internalMetastore.image property, e.g.

hive:
  internalMetastore:
    image:
      name: customized-internal-hive-metastore:0.1
      pullPolicy: IfNotPresent

It is possible to change the resources required by the Hive Metastore pod via the following properties:

hive.internalMetastore.cpu
hive.internalMetastore.memory
hive.internalMetastore.nodeSelector
hive.internalMetastore.affinity
Customizing Internal PostgreSQL Pod#

It is possible to customize the internal PostgreSQL pod's Docker image via the hive.internalMetastore.internalPostgreSql.image property, e.g.

hive:
  internalMetastore:
    internalPostgreSql:
      image:
        name: customized-internal-postgresql:0.1
        pullPolicy: IfNotPresent

It is possible to change the resources required by the PostgreSQL pod via the following properties:

hive.internalMetastore.internalPostgreSql.cpu
hive.internalMetastore.internalPostgreSql.memory
hive.internalMetastore.internalPostgreSql.nodeSelector
hive.internalMetastore.internalPostgreSql.affinity

It is possible to specify the storage used by the PostgreSQL pod via the hive.internalMetastore.internalPostgreSql.storage property, e.g.

hive:
  internalMetastore:
    internalPostgreSql:
      storage:
        className: db-class
        size: 20Gi
        claimSelector:
          matchLabels:
            release: "stable"
          matchExpressions:
            - {key: environment, operator: In, values: [dev]}
Prometheus Support#

Presto Kubernetes supports exposing Presto metrics to Prometheus. You can use the Prometheus Operator to set up Prometheus in your Kubernetes cluster and collect Presto metrics.

To enable the Presto Prometheus metric endpoints, set the prometheus.enabled property to true. This causes the following additional Kubernetes services to be created:

prometheus-coordinator-CLUSTER_NAME_UUID
prometheus-worker-CLUSTER_NAME_UUID

where CLUSTER_NAME_UUID is the cluster name with a unique suffix. Use the nameOverride property to set CLUSTER_NAME_UUID to a different value.

Those services expose Presto metrics in the Prometheus format. You can use ServiceMonitor resources from the Prometheus Operator to make Prometheus collect metrics from those endpoints. To match the Presto Prometheus endpoints, you can use labels, e.g.

matchLabels:
  instance: CLUSTER_NAME_UUID
  role: "prometheus-coordinator"
The following Presto and JVM metrics are exported by default:

running_queries - number of currently running queries
queued_queries - number of currently queued queries
failed_queries - total count of failed queries
jvm_gc_collection_seconds_sum - total GC (young and old) time in seconds
jvm_gc_collection_seconds_count - total GC (young and old) count
jvm_memory_bytes_committed, jvm_memory_bytes_init, jvm_memory_bytes_max, jvm_memory_bytes_used - JVM memory usage metrics (heap and non-heap). For more information, see MemoryMXBean.
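These counters can feed Prometheus alerting rules. A sketch using the failed_queries counter (the group name, threshold, and labels are illustrative, not part of the Presto Kubernetes distribution):

groups:
  - name: presto
    rules:
      - alert: PrestoQueryFailures
        expr: increase(failed_queries[5m]) > 10
        labels:
          severity: warning
        annotations:
          summary: "More than 10 Presto query failures in the last 5 minutes"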