Auto scaling a Presto cluster#
AWS Auto Scaling offers automatic control over the size of your Presto cluster (CloudFormation stack).
Manage auto scaling groups#
When you create a cluster, an auto scaling group (ASG) is automatically created for all the workers. To view and manage this ASG, log in to your AWS account and open the AWS ASG page. The page lists the ASGs for the workers of all clusters you have running, and it is where you control how AWS Auto Scaling manages your cluster.
Auto scaling models#
There are three types of auto scaling models you can employ to manage your cluster:
Static or manual auto scaling#
The static or manual auto scaling model is managed from the “Details” tab. This model is configured by default. In this tab, there are three main properties: “Desired Capacity”, “Min” and “Max”. Click the “Edit” button to change these values. When you click “Save”, the auto scaling mechanism starts to satisfy the new requirements by either spinning up new workers or shutting down existing ones.
In the CloudFormation template, by default, all three properties are set to the same value. As a result the number of workers remains constant. When a worker is terminated (or is unavailable for whatever reason), auto scaling starts a new one to satisfy the requirements.
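The same three properties can also be set programmatically. The following is a minimal sketch that only builds the request parameters for the `update_auto_scaling_group` call of the AWS SDK for Python (boto3). The group name `presto-workers-asg` and the helper `static_size_params` are hypothetical, and the actual call is left as a comment because it needs AWS credentials.

```python
def static_size_params(group_name, workers):
    """Build parameters that pin an ASG to a fixed number of workers."""
    if workers < 0:
        raise ValueError("worker count must not be negative")
    # Setting Min, Max and Desired Capacity to the same value keeps the
    # cluster size constant; auto scaling then only replaces lost workers.
    return {
        "AutoScalingGroupName": group_name,
        "MinSize": workers,
        "MaxSize": workers,
        "DesiredCapacity": workers,
    }


params = static_size_params("presto-workers-asg", 5)  # hypothetical group name
print(params)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("autoscaling").update_auto_scaling_group(**params)
```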
Static or scheduled auto scaling#
The static or scheduled auto scaling model is controlled from the “Scheduled Actions” tab. There, you can create a number of scheduled actions that allow you to change the size of the cluster based on the time of day. For example, you can keep a small number of nodes during the night, and boost it during different parts of the day.
The configuration of this model is a simple list of scheduled actions that change the “Min”, “Max” and “Desired Capacity” properties to other static values of your choosing. Each action runs on its configured schedule, either once or repeatedly (cron). Continuing the previous example, you can configure a nightly cooldown: one action lowers the values every evening and another raises them again every morning.
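As an illustration, the nightly cooldown could be described as two recurring actions. This sketch assumes the AWS SDK for Python (boto3) and builds parameters for its `put_scheduled_update_group_action` call. The group name, action names, times and worker counts are made up for the example, and the helper `scheduled_action` is hypothetical.

```python
def scheduled_action(group_name, name, cron, min_size, max_size, desired):
    """Build parameters for one recurring scheduled action."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    if len(cron.split()) != 5:
        raise ValueError("expected a five-field cron expression")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": name,
        "Recurrence": cron,  # cron schedule, interpreted in UTC by default
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


GROUP = "presto-workers-asg"  # hypothetical group name
nightly_cooldown = [
    # Every evening at 20:00, shrink the cluster to 2 workers.
    scheduled_action(GROUP, "evening-scale-in", "0 20 * * *", 2, 2, 2),
    # Every morning at 06:00, bring it back up to 10 workers.
    scheduled_action(GROUP, "morning-scale-out", "0 6 * * *", 10, 10, 10),
]

for action in nightly_cooldown:
    print(action["ScheduledActionName"], action["Recurrence"])
    # With boto3 and credentials available, each action could be created with:
    #   boto3.client("autoscaling").put_scheduled_update_group_action(**action)
```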
Dynamic auto scaling#
Dynamic auto scaling uses policies that you define in the “Scaling Policies” tab. Of the three types of policies, “scaling policy with steps” and “target tracking scaling policy” (the default policy) are the most useful. The third type is a special case of the “with steps” policy that contains a single step. You can change the policy type by clicking a link at the bottom of the “Scaling Policies” tab.
Dynamic target tracking:
With the dynamic target tracking policy you: (1) choose a relevant metric (e.g., average CPU utilization) and state the target value; and (2) indicate how long to wait before reassessing the metric, so that new nodes have time to start up and contribute to the metric value. Additionally, you can disable scale-in so that the mechanism can only increase the worker count and never shrinks the cluster.
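A target tracking policy can be sketched as a single parameter set. The example assumes the AWS SDK for Python (boto3) and its `put_scaling_policy` call. The group name, policy name, 60% target and 300-second warmup are illustrative values, and the helper `target_tracking_policy` is hypothetical.

```python
def target_tracking_policy(group_name, target_cpu, warmup_seconds, scale_in=True):
    """Build parameters for a CPU-based target tracking policy."""
    if not 0 < target_cpu <= 100:
        raise ValueError("target CPU utilization must be in (0, 100]")
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "presto-cpu-target",  # hypothetical policy name
        "PolicyType": "TargetTrackingScaling",
        # Time buffer that lets new nodes start contributing to the metric.
        "EstimatedInstanceWarmup": warmup_seconds,
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu),
            # True means the policy may only grow the cluster.
            "DisableScaleIn": not scale_in,
        },
    }


policy = target_tracking_policy("presto-workers-asg", 60, 300, scale_in=False)
print(policy["TargetTrackingConfiguration"])

# With boto3 and credentials available, the policy could be created with:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```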
Dynamic “with steps”:
The dynamic “with steps” policy is more complex, as it consists of an alarm and a number of adjustments. To define an alarm, you must choose a metric and define its breach criteria (e.g., average CPU utilization higher than 70% over a chosen period of time). The alarm can optionally send an event to an SNS topic for other systems to observe. Once the alarm threshold is breached, a set of adjustments to the number of nodes is executed. Those adjustments can be either exact (setting the number of nodes to a specific value) or incremental. An increment can be a value (e.g., add 2 nodes, or remove 1 node) or a percentage of the current number of workers (e.g., add 10%, or reduce by 20%).
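The arithmetic behind the three kinds of adjustment can be sketched as follows. The adjustment type names match those used by AWS step scaling policies (`ExactCapacity`, `ChangeInCapacity`, `PercentChangeInCapacity`). The rounding of percentage changes follows the rule described in the AWS documentation, as understood here, and the function `apply_adjustment` itself is a hypothetical helper, not part of any AWS API.

```python
import math


def apply_adjustment(current, adjustment_type, value, min_size, max_size):
    """Return the worker count after one step adjustment."""
    if adjustment_type == "ExactCapacity":
        target = value
    elif adjustment_type == "ChangeInCapacity":
        target = current + value
    elif adjustment_type == "PercentChangeInCapacity":
        change = current * value / 100
        if 0 < change < 1:
            change = 1          # small positive changes still add one node
        elif -1 < change < 0:
            change = -1         # small negative changes still remove one node
        else:
            change = math.trunc(change)  # otherwise round toward zero
        target = current + change
    else:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    # The result never leaves the Min/Max bounds of the group.
    return max(min_size, min(max_size, target))


print(apply_adjustment(10, "ChangeInCapacity", 2, 1, 20))
print(apply_adjustment(10, "PercentChangeInCapacity", -20, 1, 20))
print(apply_adjustment(10, "ExactCapacity", 4, 1, 20))
```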
Auto scaling activity#
All events in the auto scaling mechanism can be observed in the “Activity History” tab, which is very useful for debugging. The instances that are currently part of the ASG are listed in the “Instances” tab. There you can observe which instances are being started up or decommissioned.
Auto scaling can also be used for clusters built manually using the Starburst AMIs rather than the CloudFormation stack. The workers need to be manually put into a single ASG and configured as described above. Graceful scaledown of workers, as described in the Graceful scaledown of workers section, does not work for manually set up auto scaling groups.
Graceful scaledown of workers#
When a CloudFormation stack is created using the CloudFormation template, all the workers are automatically organized within an AWS ASG.
When AWS auto scaling resizes the cluster, it starts decommissioning workers. The CloudFormation stack has features to make sure this process doesn’t disrupt the usage of the cluster and, most importantly, that no queries fail because of it.
Without this feature, if a worker is forcefully shut down, all queries currently running on it fail and need to be restarted.
How it works#
With graceful scaledown, when the ASG is modified to shrink the cluster (the number of workers is lowered, or the auto scaling group is configured to do so automatically), AWS auto scaling notifies the workers it intends to shut down and lets them prepare for this.
The worker enters a special state in which it (1) stops serving new requests, (2) continues processing the current query tasks that are scheduled on it and (3) shuts down after finishing that work. Next, after a 2-minute quiet period, the worker process automatically exits and notifies the auto scaling mechanism to proceed with the termination of its EC2 node.
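The sequence above can be modeled as a small state machine. This is only an illustrative sketch of the described behavior, not the actual handler shipped with the stack. All names (`Worker`, `WorkerState`, `on_termination_notice` and so on) are hypothetical, and the 2-minute quiet period is represented by an explicit method call rather than a real timer.

```python
from enum import Enum


class WorkerState(Enum):
    ACTIVE = 1
    DRAINING = 2       # no new work, current tasks keep running
    QUIET_PERIOD = 3   # all tasks done, waiting before exit
    EXITED = 4


class Worker:
    def __init__(self, running_tasks):
        self.state = WorkerState.ACTIVE
        self.running_tasks = running_tasks
        self.termination_released = False

    def accepts_new_tasks(self):
        return self.state is WorkerState.ACTIVE

    def _enter_quiet_period_if_idle(self):
        if self.state is WorkerState.DRAINING and self.running_tasks == 0:
            self.state = WorkerState.QUIET_PERIOD

    def on_termination_notice(self):
        """Auto scaling announced that this worker will be shut down."""
        if self.state is WorkerState.ACTIVE:
            self.state = WorkerState.DRAINING
            self._enter_quiet_period_if_idle()

    def finish_task(self):
        self.running_tasks = max(0, self.running_tasks - 1)
        self._enter_quiet_period_if_idle()

    def quiet_period_elapsed(self):
        """Exit the process and let auto scaling terminate the node."""
        if self.state is not WorkerState.QUIET_PERIOD:
            raise RuntimeError("worker is not in its quiet period")
        self.state = WorkerState.EXITED
        self.termination_released = True


worker = Worker(running_tasks=2)
history = []
worker.on_termination_notice()
history.append(worker.state.name)
worker.finish_task()
history.append(worker.state.name)
worker.finish_task()
history.append(worker.state.name)
worker.quiet_period_elapsed()
history.append(worker.state.name)
print(history)
```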
The maximum time a worker can postpone the AWS auto scaling termination of its node is 48 hours. This is an AWS limitation.
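As a rough illustration of this limit: AWS documents the maximum wait for an instance held by a lifecycle hook as 48 hours or 100 times the hook's heartbeat timeout, whichever is smaller. The sketch below assumes that rule applies to the hook used by the stack, and the helper `max_postponement` is hypothetical.

```python
from datetime import timedelta

HARD_LIMIT = timedelta(hours=48)  # AWS-imposed upper bound


def max_postponement(heartbeat_timeout):
    """Longest time a node's termination can be held back."""
    return min(HARD_LIMIT, 100 * heartbeat_timeout)


print(max_postponement(timedelta(hours=1)))
print(max_postponement(timedelta(minutes=10)))
```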
AWS elements on the stack#
The CloudFormation Template creates a number of resources on the stack:
an SQS queue that the auto scaling lifecycle hook writes to
an IAM role that allows the workers to talk to the SQS, Auto Scaling and EC2 services. The role is fine-grained and allows only the necessary actions. It is discussed in a section below.
All the resources created on the stack are explicit, and you can find them and view their settings/permissions. All resources are terminated once the stack is deleted.
Presto node role permissions#
The Presto node role is created automatically by the CloudFormation template on the stack (and deleted when the stack is deleted).
When using SEP via our CloudFormation template, by default, you do not need to provide anything. The template creates all necessary resources automatically.
If you need to provide your own IAM instance profile for the Presto instances (IamInstanceProfile field in the stack creation form), consult the IAM role Permissions for Presto cluster nodes section. The same applies when launching the AMI manually. Make sure you choose an IAM role that satisfies the requirements.
Graceful scaledown limitations#
Presto instances created manually from the AWS Marketplace AMIs and manually set up in an ASG do not benefit from this mechanism without additional manual setup. They operate without graceful scaledown, so when auto scaling kicks in, all queries that are currently running may fail. In that case, a warning is recorded at boot time in the graceful scaledown handler log saying that the handler is not running. This is intended behavior.