Requirements#

Kubernetes (k8s) and its related tools are active open source projects. Their rapid development, together with commercial extensions and modifications, results in frequent change and a myriad of features and options.

Presto on Kubernetes cannot support all of these variations. The following sections detail the specific requirements for SEP deployments on k8s.

K8s cluster requirements#

The following k8s versions are supported:

  • 1.18

  • 1.17

  • 1.16

As a result, the following services can be used:

  • EKS

  • GKE

  • AKS

  • OpenShift 4.x or higher

In all cases, only standard k8s tools are supported for the clusters and for the usage and deployment described here.

The nodes in the k8s cluster need to fulfill the following requirements:

  • 64 to 256 GB RAM

  • 16 to 64 cores

  • all nodes are identical

  • each node is dedicated to one Presto worker or coordinator only

  • nodes are not shared with other applications running in the cluster

The recommended approach to achieve this is a dedicated cluster or namespace for all Presto nodes.
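The node requirements above can be checked against a running cluster. The following is a minimal sketch, assuming kubectl is configured for the cluster. The function names are hypothetical helpers and not part of SEP, and the comparison only looks at the CPU and memory capacity that the nodes report.

```shell
#!/usr/bin/env bash
# Sketch for checking node capacity; function names are hypothetical helpers.

# Print one line per node: name, CPU cores, memory capacity.
# Requires kubectl access to the cluster; defined here but not called.
list_node_capacity() {
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory'
}

# Read "name cpu memory" lines on stdin.
# Succeed (exit 0) only if every node reports the same CPU and memory values.
nodes_identical() {
  awk '{ seen[$2 " " $3] = 1 }
       END { n = 0; for (k in seen) n++; exit (n > 1) }'
}
```

A possible usage is `list_node_capacity | nodes_identical && echo "all nodes identical"`. The check is textual, so it does not verify that the values fall within the 64 to 256 GB and 16 to 64 core ranges.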

If your cluster must be shared with other applications, you need to guarantee exclusive node access for Presto. You can use nodegroups, taints and tolerations, or pod affinity and anti-affinity to achieve this. This approach is not recommended, since it is more complex to implement, but it can be used by experienced k8s administrators.
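As an illustration of the taint-based approach, the sketch below prints the kubectl commands that label and taint one node, so that they can be reviewed before they are run. The key and value `dedicated=presto`, the node name `node-1`, and the function name are assumptions chosen for this example and are not SEP defaults.

```shell
#!/usr/bin/env bash
# Sketch: reserve a node for Presto with a label and a taint.
# "dedicated=presto" and "node-1" are hypothetical values for illustration.

# Print the kubectl commands for one node instead of running them.
reserve_node_commands() {
  local node="$1"
  # The label lets Presto pods select the node (nodeSelector or node affinity).
  echo "kubectl label nodes ${node} dedicated=presto"
  # The taint keeps pods without a matching toleration off the node.
  echo "kubectl taint nodes ${node} dedicated=presto:NoSchedule"
}

reserve_node_commands "node-1"
```

The taint alone only keeps other workloads away. The Presto pods also need a matching toleration and a node selector, and how these are configured depends on the Helm chart in use.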

Scaling requirements#

If you plan to use automatic scaling of your Presto deployment, additional required components need to be installed:

Reference the documentation of the above tools for installation instructions.

Automatic scaling adds and removes worker nodes based on demand. This differs from the commonly used horizontal scaling, where new pods are started on existing nodes, because each Presto worker requires a full dedicated node. You need to ensure that your k8s cluster supports adding nodes and has access to the required resources.

Access requirements#

Access to Presto from outside the cluster using the Presto CLI, or any other application, requires the coordinator to be available via HTTPS and a DNS hostname.

This can be achieved with an external load balancer and DNS, where the load balancer terminates HTTPS and forwards the requests as HTTP inside the cluster.

Alternatively, you can configure a DNS service for your k8s cluster and configure ingress appropriately.

Installation tool requirements#

  • kubectl, version identical to the k8s cluster version

  • helm, version 3.2.4 or newer
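The version requirements above can be checked with a few small shell helpers. This is a sketch under assumptions: the function names are hypothetical, `sort -V` must be available, and `same_minor` compares only the major.minor part of two versions, which is a looser reading of "identical" than an exact match.

```shell
#!/usr/bin/env bash
# Sketch of version checks; all function names are hypothetical helpers.

# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reduce a version such as "v1.18.6" or "v1.18.9-eks-d1db3c" to "1.18".
major_minor() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+\.[0-9]+).*/\1/'
}

# Succeed if both versions share the same major.minor release.
same_minor() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Compare the installed helm against the 3.2.4 minimum.
# Needs helm on the PATH; defined here but not called.
check_helm() {
  local installed
  installed="$(helm version --short | sed -E 's/^v?([0-9]+\.[0-9]+\.[0-9]+).*/\1/')"
  version_ge "$installed" "3.2.4"
}
```

For example, `check_helm || echo "helm is older than 3.2.4"` reports an outdated helm, and `same_minor` can be called with the client and server versions reported by `kubectl version`.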

In addition we strongly recommend Octant to simplify cluster workload visualization and management. The Octant Helm plugin can simplify usage further.

Helm chart repository#

The Helm charts and Docker images required for deployment and operation are available in the Starburst Harbor instance at https://harbor.starburstdata.net.

Customer-specific user accounts to access Harbor are available from Starburst.

Installation and usage requires you to add the Helm repository on Harbor:

helm repo add \
  --username yourusername \
  --password yourpassword \
  starburstdata \
  https://harbor.starburstdata.net/chartrepo/starburstdata

Confirm success by listing the repository with the following command:

$ helm repo list
NAME           URL
starburstdata  https://harbor.starburstdata.net/chartrepo/starburstdata

If you search the repository, the available charts are listed:

$ helm search repo
NAME                                          CHART VERSION   APP VERSION     DESCRIPTION
starburstdata/starburst-hive                  338.0.0                         Helm chart for Apache Hive
starburstdata/starburst-presto                338.0.0         1.0             A Helm chart for Starburst Presto
starburstdata/starburst-ranger                338.0.0         2.0.21          Apache Ranger

After new releases from Starburst, you have to update the repository:

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "starburstdata" chart repository
Update Complete. ⎈ Happy Helming!⎈

Docker registry#

The Helm charts reference the Docker registry on Harbor to download the relevant Docker images.

Create a values.yml for each Helm chart you want to install in a cluster. This file contains the configuration for the specific chart in the specific cluster.

At a minimum, you need one values.yml file for the SEP Helm chart in each cluster.

The minimum configuration in values.yml consists of your registry credentials:

registryCredentials:
  enabled: true
  registry: harbor.starburstdata.net/starburstdata
  username: yourusername
  password: yourpassword

More details, including using your own Docker registry, are available in the Docker image and registry section for the SEP Helm chart.
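To keep credentials out of files that are shared or version controlled, the minimal values.yml can be generated from environment variables. The following is a sketch only: the variable names `HARBOR_USERNAME` and `HARBOR_PASSWORD` and the function name are assumptions made for this example. Values containing a double quote or backslash would need additional escaping.

```shell
#!/usr/bin/env bash
# Sketch: generate the minimal values.yml content from environment variables.
# HARBOR_USERNAME and HARBOR_PASSWORD are hypothetical variable names.

write_registry_values() {
  # Abort with a message if either variable is unset or empty.
  : "${HARBOR_USERNAME:?set HARBOR_USERNAME first}"
  : "${HARBOR_PASSWORD:?set HARBOR_PASSWORD first}"
  cat <<EOF
registryCredentials:
  enabled: true
  registry: harbor.starburstdata.net/starburstdata
  username: "${HARBOR_USERNAME}"
  password: "${HARBOR_PASSWORD}"
EOF
}
```

With both variables set, `write_registry_values > values.yml` creates the file. A chart can then be installed with a command such as `helm upgrade --install presto starburstdata/starburst-presto --version 338.0.0 --values values.yml`, where the release name `presto` is an arbitrary choice.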

License#

If you intend to use SEP features that require a license, you need to obtain a license file from Starburst and configure it.