Azure Virtual Machine Scale Set Duration and Cool Down Explained


When configuring an Azure Virtual Machine Scale Set (VMSS), there is an option to configure auto scaling rules. Auto scaling is the process of dynamically allocating resources to match performance requirements. As the volume of work grows, an application may need additional resources to maintain the desired performance levels and satisfy service level agreements (SLAs). As demand reduces and the additional resources are no longer needed, they can be automatically removed to minimise costs.

As part of the auto scale configuration inside an Azure Virtual Machine Scale Set, we can set a duration and a cool down period. In this post, I will explain the difference between the two options using a couple of scenarios.

Want to learn more about scaling in Azure?
If you wish to learn more about Azure VM Scale Sets, visit the following Microsoft Learn link, Azure Virtual Machine Scale Sets Overview.

In addition to Azure VM Scale Sets, you can also configure scaling rules for a number of other Azure services such as Azure App Service Plans. When configuring scaling rules for Azure App Service Plans, you can also set up auto scaling based on metrics such as CPU usage, memory usage, HTTP queue length and more. In effect, the App Service Plan includes a built-in VM Scale Set.

For auto scaling best practices and to learn more about the different services in Azure which include built-in scaling, visit the following Microsoft Learn link, Autoscaling guidance – Best practices for cloud applications.

Note: it is important that you plan and configure your scaling rules correctly to avoid performance issues, unnecessary scaling and costs due to an incorrect configuration. The metrics used in this post are for demo purposes only.

Duration in Azure VM Scale Set

Duration is the window of time over which the VM Scale Set looks back at metrics before making a decision to scale.

For example, in the scaling rule below, I have configured the following:

  • If CPU Percentage is greater than 85%
  • for a DURATION of 10 minutes
  • Increase the Instance/VM (Virtual Machine) count by 1

So for this condition to trigger, CPU must be continuously greater than 85% for a duration of 10 minutes. The VM Scale Set looks back at CPU utilisation for the past 10 minutes and, if CPU was constantly greater than 85%, it adds another instance/VM.
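To make the look-back idea concrete, here is a minimal Python sketch of the check as described in this post. It is not how Azure implements it: the function name breached_for_duration, the one-sample-per-minute metrics and the "every sample must be above the threshold" test are all assumptions for illustration, and the real service applies its own metric aggregation, which this sketch does not try to reproduce.

```python
from datetime import datetime, timedelta

def breached_for_duration(samples, now, duration, threshold):
    """Return True if every sample in the look-back window exceeds the threshold.

    samples: list of (timestamp, cpu_percent) tuples (hypothetical metric feed).
    """
    window_start = now - duration
    in_window = [cpu for ts, cpu in samples if window_start <= ts <= now]
    # An empty window is treated as "not breached" (an assumption of this sketch).
    return bool(in_window) and all(cpu > threshold for cpu in in_window)

start = datetime(2024, 1, 1, 15, 0)  # 3.00pm on an arbitrary date
# CPU sits at 90% from 3.00pm to 3.10pm, one sample per minute.
steady = [(start + timedelta(minutes=m), 90.0) for m in range(11)]
# Same trace, but CPU dips to 70% at 3.05pm.
dipped = [(ts, 70.0 if ts.minute == 5 else cpu) for ts, cpu in steady]

now = start + timedelta(minutes=10)
print(breached_for_duration(steady, now, timedelta(minutes=10), 85.0))  # True
print(breached_for_duration(dipped, now, timedelta(minutes=10), 85.0))  # False
```

In this simplified model, a single dip below 85% inside the 10 minute window is enough to stop the rule from triggering.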

Below is a screenshot of an Azure VM Scale Set rule. I have used a green arrow to highlight the DURATION field.

Cool down in Azure VM Scale Set

The cool down period comes into effect after a scale-in (remove VM instance) or a scale-out (add VM instance) event is triggered. For example, if I set a COOL DOWN period of 10 minutes, this instructs the scale set not to scale again for another 10 minutes. You’re simply asking the scale set to take a break during the cool down period, which allows the VM Scale Set to stabilise and check whether the additional VM instance has made a difference to the CPU utilisation.
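The cool down behaviour can be pictured as a simple gate in front of every scale operation. The sketch below is illustrative only: the CooldownGate class and its method names are hypothetical and are not part of any Azure SDK.

```python
from datetime import datetime, timedelta

class CooldownGate:
    """Blocks further scale operations until the cool down period has passed."""

    def __init__(self, cooldown):
        self.cooldown = cooldown
        self.last_scale_time = None  # no scale operation has happened yet

    def can_scale(self, now):
        if self.last_scale_time is None:
            return True
        return now - self.last_scale_time >= self.cooldown

    def record_scale(self, now):
        self.last_scale_time = now

gate = CooldownGate(timedelta(minutes=10))
scaled_at = datetime(2024, 1, 1, 15, 10)                # scale-out at 3.10pm
gate.record_scale(scaled_at)
print(gate.can_scale(scaled_at + timedelta(minutes=5)))   # False, still cooling down
print(gate.can_scale(scaled_at + timedelta(minutes=10)))  # True, cool down has ended
```

Note that the gate only decides whether a scale operation is allowed. It does not touch the metrics, which keep being collected in the meantime.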

I have used a blue arrow to highlight the COOL DOWN field in the image below.

Let’s take a look at what the above scale rule configuration looks like on a diagram.

In the diagram below we start with 1 VM in our scale set.

1. At 3pm CPU for VM1 goes above the threshold of 85%, and constantly remains above the threshold for a duration of 10 minutes (until 3.10pm).

2. At 3.10pm the scale set looks back at the last 10 minutes, between 3.00pm and 3.10pm, and because CPU was constantly above 85% for that duration, the scale set adds another VM, bringing the total to two VMs in our scale set.

3. At 3.10pm, the cool down period also kicks in and no further scaling operations take place. However, the cool down period does not pause time or stop the collection of metrics under the hood. As you can see from the diagram above, adding another VM at 3.10pm makes a difference to the CPU as it normalises between 50% and 60% utilisation, but the metrics are still collected and analysed. The cool down period only asks the VM Scale Set to pause temporarily and not add (scale-out) or remove (scale-in) any further VMs during the configured cool down time of 10 minutes.

4. The additional VM has stabilised the VM Scale Set and operations resume as normal. CPU is averaging between 50% and 60%.

5. At 3.40pm, CPU utilisation increases and is above 85% for a duration of 10 minutes. At 3.50pm, the VM Scale Set looks back at the duration of 10 minutes and makes a decision to add another VM.
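The five steps above can be replayed with a small minute-by-minute simulation. This is a sketch under stated assumptions, not the Azure autoscale engine: the cpu_at trace is a made-up series that mirrors the walkthrough (90% while busy, 55% otherwise), time is counted in minutes after 3.00pm, and the rule fires only when every sample in a full 10 minute window is above the threshold.

```python
DURATION = 10     # minutes to look back
COOLDOWN = 10     # minutes to pause scaling after a scale operation
THRESHOLD = 85.0  # CPU percentage

def cpu_at(minute):
    """Hypothetical CPU trace for the walkthrough (minutes after 3.00pm)."""
    if 0 <= minute <= 10 or 40 <= minute <= 55:
        return 90.0
    return 55.0

def simulate(cpu_trace, end_minute):
    history = {}
    scale_outs = []
    last_scale = None
    for now in range(end_minute + 1):
        history[now] = cpu_trace(now)  # metrics are always collected
        window = [history.get(m) for m in range(now - DURATION, now + 1)]
        # Require a full window in which every sample is above the threshold.
        breached = all(v is not None and v > THRESHOLD for v in window)
        cooling = last_scale is not None and now - last_scale < COOLDOWN
        if breached and not cooling:
            scale_outs.append(now)
            last_scale = now
    return scale_outs

events = simulate(cpu_at, 55)
print([f"3.{m:02d}pm" for m in events])  # ['3.10pm', '3.50pm']
```

The two scale-out events line up with steps 2 and 5: one at 3.10pm and one at 3.50pm, with nothing in between because CPU stays below the threshold.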

That’s how the duration and cool down periods work. I hope this helped you understand the differences.

Now that we understand the differences between duration and cool down in an Azure VM Scale Set (VMSS), let’s move on to another scenario.

What if, after the second VM was added, CPU did not stabilise and constantly remained over 85%? Would the VM Scale Set add another VM straight after the cool down period, or would it wait another duration of 10 minutes before making a decision to add another VM?

Firstly, if the VMSS adds an additional VM and it does not make a difference, for example CPU does not drop below 85%, you should consider investigating and possibly reconfiguring your VM Scale Set rule.

However, because we’re learning, we want to know what would happen in this scenario, right? Ok, another diagram below to explain.

To test this scenario, I deployed a new Virtual Machine Scale Set and configured a VM Scale Set rule as follows:

  • If CPU Percentage is greater than 0.1% (Yes, a silly number, but it’s for testing purposes only!)
  • for a DURATION of 10 minutes
  • Increase the VM (Virtual Machine) count by 1
  • COOL DOWN period of 10 minutes

The diagram above shows that I start with 1 Virtual Machine at 3pm. Because of the silly CPU threshold of 0.1%, the condition is met instantly. At 3.10pm the VM Scale Set looks back at the last 10 minutes (duration), between 3pm and 3.10pm, and adds another VM Instance. A cool down of 10 minutes is triggered to pause any further scaling operations whilst the VM Scale Set is stabilising. However, as you can see from the diagram above, metrics are still being monitored and recorded under the hood. The additional VM has not made a difference as CPU utilisation is still greater than 0.1%.

At 3.20pm, the cooldown period of 10 minutes has expired.

But what happens now?

CPU has constantly been greater than 0.1% throughout the cooldown period and the additional VM has not made a difference. What would happen after the cool down period?

Is another scale operation triggered and a third VM/Instance added shortly after the COOL DOWN period, at point A (3.20pm) shown in the diagram below? Or does the VM Scale Set analyse metrics for another 10 minutes of DURATION before adding another VM at point B (3.30pm) below?

Another VM is added straight after the cool down period, at around 3.20pm. When the VM Scale Set looks back at the last 10 minutes, it includes the metrics recorded during the cool down period. Remember, the cool down period only temporarily pauses scaling operations, but under the hood the time and metrics are still being analysed and recorded.

Below are the scaling logs from my testing.

Let’s zoom in and focus on the time stamp column, which shows the times the VM Scale Set added a new VM/Instance (image below).

According to the time stamps above:

  • at 9.07am a scale operation was triggered to add another VM/Instance. Why? Because CPU was greater than 0.1% for a DURATION of 10 minutes. The VM Scale Set looked back at the duration from approx 8.57am – 9.07am.

  • when the scale-out triggered, a COOL DOWN period of 10 minutes was also initiated to allow the deployment of the new VM and CPU to stabilise.

  • the 10 minute COOL DOWN period ended at around 9.17am, and another scale operation was initiated as soon as it ended. The VM Scale Set did not wait for another 10 minute duration.

  • because I had a CPU threshold of 0.1% set, the VM Scale Set never stabilised and kept scaling out, adding another VM after each cool down period of 10 minutes.

What do we learn from this? When the VM Scale Set looks back at the duration, it includes the time and metrics from the cool down period when deciding whether another scale operation is required.
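To contrast point A and point B, here is a small sketch that models both possibilities side by side. It is purely illustrative: the simulate function and its ignore_cooldown_metrics flag are hypothetical, time is counted in minutes after the start of the test, and CPU is assumed to be above the 0.1% threshold at every minute.

```python
DURATION = 10  # minutes to look back
COOLDOWN = 10  # minutes to pause scaling after a scale operation

def simulate(end_minute, ignore_cooldown_metrics):
    """Return the minutes at which a scale-out happens.

    ignore_cooldown_metrics=False models what my test showed (point A):
    metrics recorded during the cool down count towards the next decision.
    ignore_cooldown_metrics=True models the alternative (point B), where the
    look-back window could only start once the cool down has ended.
    """
    scale_outs = []
    last_scale = None
    earliest_usable = 0  # first minute whose metrics may be used
    for now in range(end_minute + 1):
        cooling = last_scale is not None and now - last_scale < COOLDOWN
        # CPU is always above the threshold here, so the rule is breached as
        # soon as a full look-back window of usable metrics exists.
        full_window = now - DURATION >= earliest_usable
        if full_window and not cooling:
            scale_outs.append(now)
            last_scale = now
            if ignore_cooldown_metrics:
                earliest_usable = now + COOLDOWN
    return scale_outs

print(simulate(40, ignore_cooldown_metrics=False))  # [10, 20, 30, 40]
print(simulate(40, ignore_cooldown_metrics=True))   # [10, 30]
```

The first result, a scale-out every 10 minutes, matches the pattern in my scaling logs. The second result is what the logs would have looked like if the scale set had waited for a fresh duration after each cool down.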

Note: don’t forget to configure a scale-in rule so the scale set can scale in (remove VMs) as CPU levels reduce in less busy times. Without one, the VM Scale Set will only add VMs (scale-out) and won’t know how to remove VMs when they are no longer needed (scale-in). In my demo, having a CPU threshold of 0.1% would never allow the VM Scale Set to scale in. Plan and configure your VM Scale Sets as per your requirements.
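As a rough illustration of pairing the two rules, the sketch below holds a scale-out and a scale-in rule as plain Python dictionaries and checks that both directions are covered. The field names are modelled on the Azure autoscale settings resource as I understand it, so check them against the official reference before using them. The metricResourceUri field is left out, the 30% scale-in threshold is an assumed value for the example, and the missing_directions helper is hypothetical.

```python
scale_out_rule = {
    "metricTrigger": {
        "metricName": "Percentage CPU",
        "timeGrain": "PT1M",
        "statistic": "Average",
        "timeWindow": "PT10M",       # DURATION of 10 minutes
        "timeAggregation": "Average",
        "operator": "GreaterThan",
        "threshold": 85,
    },
    "scaleAction": {
        "direction": "Increase",     # scale-out
        "type": "ChangeCount",
        "value": "1",
        "cooldown": "PT10M",         # COOL DOWN of 10 minutes
    },
}

scale_in_rule = {
    "metricTrigger": {
        "metricName": "Percentage CPU",
        "timeGrain": "PT1M",
        "statistic": "Average",
        "timeWindow": "PT10M",
        "timeAggregation": "Average",
        "operator": "LessThan",
        "threshold": 30,             # assumed value, pick your own
    },
    "scaleAction": {
        "direction": "Decrease",     # scale-in
        "type": "ChangeCount",
        "value": "1",
        "cooldown": "PT10M",
    },
}

def missing_directions(rules):
    """Return the scale directions that no rule in the list covers."""
    present = {rule["scaleAction"]["direction"] for rule in rules}
    return sorted({"Increase", "Decrease"} - present)

print(missing_directions([scale_out_rule]))                 # ['Decrease']
print(missing_directions([scale_out_rule, scale_in_rule]))  # []
```

A check like this is handy as a sanity test before deploying, since a scale set with only a scale-out rule will keep its extra VMs after the busy period has passed.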

And that’s it for now. I hope you found this post useful. If you have any feedback, please feel free to comment below.

See you at the next post.