Scheduled scaling
Introduction
This page describes scheduled scaling: the background, the concept, and the procedure for setting it up.
Background
Outages usually happen when the farm reaches its capacity during peak time.
Until auto-scaling is available, we can use scheduled scaling instead:
Concept
Start a new worker node group before the busy period begins, and shut it down after the busy period ends. The busy period is usually 1/3 to 1/2 of the day, so even if we add 3-9 workers for the peak time, the cost is equivalent to only about 1 to 4.5 workers running all day (3 × 1/3 to 9 × 1/2), which is quite affordable.
Based on previous experience, this approach is more stable than auto-scaling for two reasons:
- It performs only one scale-up and one scale-down per day, so there are fewer interruptions.
- It scales down during non-peak hours, which has less impact on the farm.
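The cost estimate in the Concept paragraph can be checked with a few lines of arithmetic. This is a minimal sketch, assuming cost grows linearly with node-hours (no reserved-instance discounts or minimum billing periods); the function name `full_time_equivalent` is hypothetical and only used for illustration.

```python
def full_time_equivalent(extra_workers: int, peak_fraction: float) -> float:
    """Number of always-on workers that would cost the same as running
    `extra_workers` for only `peak_fraction` of each day.

    Assumption: cost is proportional to node-hours.
    """
    if not 0 <= peak_fraction <= 1:
        raise ValueError("peak_fraction must be between 0 and 1")
    return extra_workers * peak_fraction


# Bounds taken from the text: 3-9 extra workers, peak lasting 1/3 to 1/2 of the day.
low = full_time_equivalent(3, 1 / 3)
high = full_time_equivalent(9, 1 / 2)
print(f"{low:.1f} to {high:.1f} full-time worker equivalents")
```

The same function can be reused with a farm's actual peak length, for example an 11-hour window is `11 / 24` of the day.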
Details
Procedure
- Plan the schedule: scale up before the peak hours begin and scale down after the peak hours end.
  - Peak hours: for example, 7-18 for EU8, 8-19 for BR14, etc.
- Setup
  1. Add an additional worker node group for the new pods
  2. Add a new k8s cronjob for the scheduled scaling
- Detailed action during scale up
  1. Increase workers
  2. Add replicas to the scalable deployments
  3. Rolling restart deployments
- Detailed action during scale down
  1. Scale in replicas
  2. Drain and decrease workers
  3. Rolling restart deployments
- Node number settings for the different sizing profiles (instance types: r5.xlarge, r6i.xlarge, etc.):

| Number of concurrent users | Max nodes of group sma-autoscaling-nodes | Max nodes of non-autoscaling nodegroup |
| --- | --- | --- |
| 3000 | 10 | 6 |
| 1000 | 6 | 4 |
| Others | Adjust the number based on the node resource usage monitoring | Adjust the number based on the node resource usage monitoring |
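The planning part of the procedure can be sketched in code: turning a peak window into cron expressions for the scale-up and scale-down jobs, and looking up the node limits for a sizing profile. This is only an illustration under stated assumptions. The 30-minute lead and lag times, and all names (`PeakWindow`, `cron_schedules`, `max_nodes`), are hypothetical. The source also does not say whether the peak hours are local time or UTC, so check which time zone the cronjob schedule is evaluated in before using such values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PeakWindow:
    start_hour: int  # hour at which the peak begins (0-23)
    end_hour: int    # hour at which the peak ends (0-23)


# Peak hours quoted in the procedure.
PEAK_WINDOWS = {"EU8": PeakWindow(7, 18), "BR14": PeakWindow(8, 19)}

# Sizing table: concurrent users -> (max nodes of sma-autoscaling-nodes,
#                                    max nodes of non-autoscaling nodegroup)
SIZING = {3000: (10, 6), 1000: (6, 4)}

# Step order as listed in the procedure.
SCALE_UP_STEPS = (
    "increase workers",
    "add replicas to the scalable deployments",
    "rolling restart deployments",
)
SCALE_DOWN_STEPS = (
    "scale in replicas",
    "drain and decrease workers",
    "rolling restart deployments",
)


def _to_cron(minute_of_day: int) -> str:
    """Format a minute-of-day as a daily five-field cron expression."""
    minute_of_day %= 24 * 60  # wrap around midnight
    return f"{minute_of_day % 60} {minute_of_day // 60} * * *"


def cron_schedules(
    window: PeakWindow, lead_minutes: int = 30, lag_minutes: int = 30
) -> Tuple[str, str]:
    """Return (scale_up, scale_down) cron expressions.

    Scale-up runs `lead_minutes` before the peak begins and scale-down runs
    `lag_minutes` after it ends. Both defaults are assumptions, not values
    from this page.
    """
    scale_up = _to_cron(window.start_hour * 60 - lead_minutes)
    scale_down = _to_cron(window.end_hour * 60 + lag_minutes)
    return scale_up, scale_down


def max_nodes(concurrent_users: int) -> Optional[Tuple[int, int]]:
    """Look up node limits for a listed sizing profile.

    Returns None for profiles not in the table ("Others"), where the numbers
    are adjusted based on node resource usage monitoring.
    """
    return SIZING.get(concurrent_users)


if __name__ == "__main__":
    for farm, window in PEAK_WINDOWS.items():
        up, down = cron_schedules(window)
        print(f"{farm}: scale up at '{up}', scale down at '{down}'")
    print("3000 users:", max_nodes(3000))
```

The lookup deliberately returns `None` for unlisted profiles instead of interpolating, since the table says those numbers should come from monitoring.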