# Scheduled scaling
## Introduction
This page describes scheduled scaling: what it is, why it is used, and how to set it up.
## Background
Outages typically occur because the farm reaches its capacity during peak time.
Until auto-scaling is available, scheduled scaling can be used to add capacity for the peak hours.
## Concept
Start up a new worker node group before the busy time begins, and shut it down after the busy time ends. The busy time is usually 1/3 to 1/2 of the day, so even if we add 3-9 workers for the peak time, the cost is equivalent to roughly 1-4 always-on workers or even less, which is quite affordable.
Based on previous experience, this approach is more stable than auto-scaling, for two reasons:
1. It performs only one scale-up and one scale-down per day, so there are fewer interruptions
2. It scales down during non-peak hours, which has less impact on the farm
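
The cost estimate above is simple arithmetic: the extra workers run for only part of the day, so their average cost equals the worker count multiplied by that fraction of the day. A minimal sketch of that calculation follows; `average_extra_workers` is a hypothetical helper name used for illustration only, not part of any existing tooling.

```python
from fractions import Fraction


def average_extra_workers(extra_workers: int, fraction_of_day: Fraction) -> Fraction:
    """Number of always-on workers that would cost the same as
    `extra_workers` running for only `fraction_of_day` of each day."""
    if not 0 <= fraction_of_day <= 1:
        raise ValueError("fraction_of_day must be between 0 and 1")
    return extra_workers * fraction_of_day


# Bounds quoted in the text: 3-9 extra workers, running 1/3 to 1/2 of the day.
low = average_extra_workers(3, Fraction(1, 3))
high = average_extra_workers(9, Fraction(1, 2))
print(float(low), float(high))  # 1.0 4.5
```

The worst case (9 workers for half the day) comes out at 4.5 worker-equivalents, so the range quoted in the text is a rounded estimate.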
## Details
### Procedure
1. Plan the schedule: scale up before the peak hours begin and scale down after they end.
    1. Peak hours: for example, 7-18 for EU8, 8-19 for BR14, etc.
2. Setup
    1. Add an additional worker node group for the new pods
    2. Add a new k8s CronJob for the scheduled scaling
3. Detailed actions during scale up
    1. Increase workers
    2. Add replicas to the scalable deployments
    3. Rolling restart deployments
4. Detailed actions during scale down
    1. Scale in replicas
    2. Drain and decrease workers
    3. Rolling restart deployments
5. Node count settings for different sizing profiles (instance types: r5.xlarge, r6i.xlarge, etc.):
| Number of concurrent users | Max nodes of group sma-autoscaling-nodes | Max nodes of non-autoscaling nodegroup |
| --- | --- | --- |
| 3000 | 10 | 6 |
| 1000 | 6 | 4 |
| Others | Adjust the number based on the node resource usage monitoring | Adjust the number based on the node resource usage monitoring |
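
To illustrate the planning step, the sketch below derives the two cron schedules (scale up and scale down) from a farm's peak window and looks up the node limits from the table above. It is only a sketch under stated assumptions: the 30-minute safety margin, the helper names (`cron_schedules`, `max_nodes`), and the reading of the peak hours as being in the same time zone as the CronJob schedule are all assumptions, not part of the documented setup. The output uses the standard five-field cron format (`minute hour day-of-month month day-of-week`).

```python
from typing import Optional, Tuple

# Peak windows from the examples in the procedure, as (start_hour, end_hour).
# Assumption: hours are in the same time zone the CronJob schedule uses.
PEAK_HOURS = {"EU8": (7, 18), "BR14": (8, 19)}

# Sizing table: concurrent users -> (max nodes of sma-autoscaling-nodes,
#                                    max nodes of non-autoscaling nodegroup)
MAX_NODES = {3000: (10, 6), 1000: (6, 4)}

MINUTES_PER_DAY = 24 * 60


def _cron_at(minute_of_day: int) -> str:
    """Format a minute-of-day as a daily five-field cron expression."""
    minute_of_day %= MINUTES_PER_DAY  # wrap around midnight if needed
    return f"{minute_of_day % 60} {minute_of_day // 60} * * *"


def cron_schedules(farm: str, margin_minutes: int = 30) -> Tuple[str, str]:
    """Return (scale_up, scale_down) cron expressions for a farm.

    Scale up `margin_minutes` before the peak begins and scale down
    `margin_minutes` after it ends. The margin is a hypothetical default.
    """
    start_hour, end_hour = PEAK_HOURS[farm]
    scale_up = _cron_at(start_hour * 60 - margin_minutes)
    scale_down = _cron_at(end_hour * 60 + margin_minutes)
    return scale_up, scale_down


def max_nodes(concurrent_users: int) -> Optional[Tuple[int, int]]:
    """Look up the table row for a sizing profile.

    Returns None for the "Others" row, where the numbers must be adjusted
    based on node resource usage monitoring.
    """
    return MAX_NODES.get(concurrent_users)


if __name__ == "__main__":
    for farm in PEAK_HOURS:
        up, down = cron_schedules(farm)
        print(f"{farm}: scale up at '{up}', scale down at '{down}'")
    print(max_nodes(3000))
```

With the assumed 30-minute margin, EU8 (peak 7-18) would scale up at `30 6 * * *` and scale down at `30 18 * * *`. The actual margin should be chosen so that the new nodes and replicas are ready before the peak starts.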
## Reference
1. [SMA Autoscaling](https://rndwiki.houston.softwaregrp.net/confluence/display/SMA/SMA+Autoscaling).