
Yo-Yo attacks on cloud auto-scaling

Camel Up

Camel Up is a popular board game designed by Steffen Bogen. It is a family-friendly, light-hearted, and entertaining game built around a crazy camel race. What makes Camel Up particularly unique and enjoyable is the unpredictability of the race: camels that land on the same space stack on top of each other, and the camels at the top of a stack move faster, while the ones at the bottom move slower. This adds an element of surprise and excitement, making it hard to predict the winner until the very end.

The Camel Up board game and auto-scaling in the cloud may appear to be two completely distinct areas. However, the camels' up-and-down movement in the game reminds me of the way resources scale up and down in cloud auto-scaling.

Cloud auto-scaling is a very powerful tool, but it can also be a double-edged sword. Without proper scaling configuration and testing, it can cost cloud users a lot. In the end, auto-scaling is a trade-off between performance and cost.

Story

In this blog, the term scale up/down refers to scaling resources, encompassing both vertical and/or horizontal scaling.

Over the past decades, DoS and DDoS attacks have emerged as a grave threat to the Internet's infrastructure. Recent data from Cloudflare indicates a dramatic surge in DDoS attacks over the past few years [comparitech.com]. The increasing number of attacks has led to the emergence of new detection and mitigation solutions as well. In November 2020, the Alibaba Cloud Security Team detected the largest resource-exhaustion DDoS attack on their cloud platform, with a peak of 5.369 million QPS (queries per second). Microsoft mitigated upwards of 359,713 DDoS attacks against their Azure cloud infrastructure during the second half of 2021 [microsoft.com].

Attackers, on the other hand, do not surrender. New kinds of DDoS attacks have emerged to exploit cloud anti-DDoS solutions. Burst attacks, also known as hit-and-run DDoS, are a newer kind of DDoS attack in which the attacker launches periodic bursts of traffic overload at random intervals against online targets. Burst attacks have grown significantly, to the extent that a comprehensive survey in 2017 revealed that half of the participants had seen an increase in burst attacks [radware.com].

Enterprises using cloud services benefit from features such as enhanced scalability and elasticity: auto-scaling allows customers to dynamically scale their applications. Incoming traffic is distributed evenly across multiple endpoints, so individual backend services cannot be overwhelmed until the volume of traffic approaches the capacity of the entire network.

Hostile actors adjust their tactics to the realities of the cloud. The Yo-Yo attack is a new attack technique against the cloud auto-scaling feature. In this method, attackers send a burst of request traffic to significantly increase the load on the cloud server, triggering the auto-scaling mechanism and causing the server to be scaled up.

During this scale-up period, the victim's system deploys resources that far exceed what the legitimate workload requires. Once the scale-up completes, the attacker stops the burst traffic and waits for the auto-scaling mechanism to scale the server back down. By repeating this procedure, the attacker forces the cloud service to scale up and scale down continuously, which adds extra load on the services that respond to the fake requests. In effect, the attacker forces the victim to pay for large amounts of resources that are not actually necessary to handle the legitimate workload. The Yo-Yo attack can affect any platform that uses an auto-scaling mechanism, including container-based and Kubernetes platforms.

Yo-Yo attack

The Yo-Yo attack is designed to exploit the cloud's auto-scaling mechanism. The attacker employs a specific strategy to create a significant load on the cloud by sending bursts of request traffic to a target running on the cloud. As a result, the auto-scaling mechanism triggers, attempting to scale up the cloud resources to handle the high traffic load.

The attack operates in a cyclical manner. After the attacker notices the scaling-up process, they halt the burst attack and wait for the auto-scaling mechanism to scale down the resources. This crucial step is key to the success of the Yo-Yo attack. The attacker then resumes sending the burst traffic to trigger the auto-scaling mechanism to scale up again, perpetuating the cycle.
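As a rough illustration, the cycle can be modeled as a simple timed loop: a burst phase that lasts long enough to trip the scale-up alarm, followed by a quiet phase long enough for the auto-scaler to remove the extra capacity again. The durations, request rate, and function names below are assumptions for the sketch, not measurements from a real attack or from my experiment.

```python
import time

# Illustrative timing model of the Yo-Yo cycle; all durations and rates are
# assumed values for this sketch.
BURST_SECONDS = 5 * 60    # long enough for the scaling alarm to fire and new instances to start
QUIET_SECONDS = 15 * 60   # long enough for the auto-scaler to scale the group back in


def send_burst(rate_rps: int, seconds: int) -> None:
    """Placeholder for a traffic generator used during the on-phase."""
    print(f"on-phase: ~{rate_rps} req/s for {seconds} s to trigger scale-up")
    time.sleep(seconds)


def stay_quiet(seconds: int) -> None:
    """Off-phase: no traffic, so the victim scales back down."""
    print(f"off-phase: waiting {seconds} s for scale-down")
    time.sleep(seconds)


def yo_yo(cycles: int) -> None:
    for _ in range(cycles):
        send_burst(rate_rps=2000, seconds=BURST_SECONDS)  # force scale-up
        stay_quiet(QUIET_SECONDS)                         # wait for scale-down
```

Each full cycle keeps the victim oscillating between an over-provisioned state and a freshly scaled-down one.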

The primary objective of the Yo-Yo attack is not necessarily to take the services offline but rather to inflict financial damage on the victim. Cloud service providers typically follow a consumption-based pricing model, wherein end-users pay for the resources they utilize. This model allows users to pay for additional resources as needed and stop paying for resources that are no longer required. The Yo-Yo attack aims to exploit this pricing model by forcing the victim's system to scale up and consume more resources during the attack cycles, leading to increased costs for the victim without actually providing any legitimate workload.

During each cycle of the Yo-Yo attack, as the victim's system scales up its resources, significant charges accumulate on the victim's account. The attacker forces the victim to pay for large amounts of resources that are not truly necessary to handle the current legitimate workload. For example, cloud providers like AWS charge for EC2 instances based on the time they are consumed, and partial instance-hours are billed per second. This means that once an instance is scaled up, the victim incurs per-second charges even if the scale-up was triggered only by the Yo-Yo attack. Consequently, the financial damage to the victim can be substantial, as they are billed for the extra resources forced upon them by the attacker during each cycle of the attack.
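A back-of-the-envelope calculation makes the economics concrete. The instance count, hourly price, and cycle length below are assumed values for illustration, not AWS list prices or figures from my experiment:

```python
# Rough cost of one Yo-Yo cycle under per-second billing (all numbers assumed).
EXTRA_INSTANCES = 8          # instances the forced scale-up adds
PRICE_PER_HOUR = 0.096       # assumed on-demand price per instance-hour (USD)
BILLABLE_SECONDS = 25 * 60   # time the extra capacity exists before scale-down completes

cost_per_cycle = EXTRA_INSTANCES * PRICE_PER_HOUR * BILLABLE_SECONDS / 3600

cycle_length_s = 35 * 60                      # one burst phase plus one quiet phase
cycles_per_day = 24 * 3600 // cycle_length_s  # how often the attacker can repeat the cycle

print(f"~${cost_per_cycle:.2f} per cycle, ~${cost_per_cycle * cycles_per_day:.2f} per day")
```

Even with these assumed, modest numbers, the wasted spend adds up to several hundred dollars a month for capacity that never serves a legitimate request.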

The graph below shows the results of simulating a Yo-Yo attack against AWS EC2 instances running in Auto Scaling groups. Please keep in mind that the test was conducted under the AWS DDoS simulation testing policy.

Figure: Yo-Yo attack simulation results.
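For reference, a request-count target-tracking policy similar to the one used in such a simulation can be attached to an Auto Scaling group with boto3. The group name, the load balancer/target group resource label, the warm-up time, and the target value below are placeholders, not the exact configuration from my test:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Attach a request-count target-tracking policy to an existing Auto Scaling group.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="yoyo-test-asg",          # hypothetical group name
    PolicyName="request-count-tracking",
    PolicyType="TargetTrackingScaling",
    EstimatedInstanceWarmup=180,                   # seconds a new instance needs before it counts
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            # Placeholder resource label: app/<alb-name>/<alb-id>/targetgroup/<tg-name>/<tg-id>
            "ResourceLabel": "app/my-alb/0123456789abcdef/targetgroup/my-tg/fedcba9876543210",
        },
        "TargetValue": 500.0,                      # desired requests per target (illustrative)
    },
)
```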

The Yo-Yo attack is a new DDoS technique that emerged with the cloud auto-scaling feature, so the number of studies in this area is still very limited. In a paper published in 2017, Anat Bremler-Barr, Eli Brosh, and Mor Sides were the first to discuss the Yo-Yo attack on the auto-scaling mechanism. This groundbreaking paper demonstrates that, apart from its economic effects, the Yo-Yo attack can inflict substantial performance damage.

During the repetitive scale-up process, which takes several minutes due to instance startup, the cloud service suffers significant performance degradation, which is why the Yo-Yo attack can also be classified as a type of Reduction of Quality (RoQ) attack. The paper also shows that the auto-scaling policy configuration is an important factor in minimizing the impact of the Yo-Yo attack.

Building on this study, another group published a paper titled 'Towards Yo-Yo attack mitigation in cloud auto-scaling mechanisms', which proposes a detection and mitigation system for Yo-Yo attacks in cloud auto-scaling mechanisms. The suggested approach is called Trust-based Adversarial Scanner Delaying (TASD). TASD is inspired by two observations. First, in comparison to benign users, Yo-Yo attackers tend to initiate burst requests, leading to more frequent auto-scaling. Second, the attackers cause a substantial difference in request load between the scale-up and scale-down phases. Based on this, the TASD system assigns a trust value to each client according to its behavior and introduces specific delays to suspicious requests while staying within the Quality of Service (QoS) constraints. This manipulation of response times aims to deceive the attackers and mitigate the impact of the Yo-Yo attack.
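A minimal sketch of that idea, assuming a simple per-client trust score and a bounded extra delay (the update rule and constants are my illustration, not the paper's implementation):

```python
import time

MAX_EXTRA_DELAY = 0.5  # seconds of added latency assumed to stay within the QoS bound
trust = {}             # client id -> trust score in [0, 1]


def update_trust(client_id: str, suspicious: bool) -> None:
    """Lower the trust of clients whose bursts coincide with scaling events."""
    t = trust.get(client_id, 1.0)
    t = t - 0.1 if suspicious else min(1.0, t + 0.01)
    trust[client_id] = max(0.0, t)


def extra_delay(client_id: str) -> float:
    """Less trusted clients get a longer, but bounded, artificial delay."""
    return (1.0 - trust.get(client_id, 1.0)) * MAX_EXTRA_DELAY


def handle_request(client_id: str, suspicious: bool) -> None:
    update_trust(client_id, suspicious)
    time.sleep(extra_delay(client_id))  # distorts the response times an attacker probes
    # ... serve the request as usual ...
```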

During my master's thesis, I implemented a realistic scenario to test the TASD approach and enhance the mitigation algorithm. In the original TASD system, an Additive Decrease method was used to update the trust value. To improve the system, I drew inspiration from TCP rate control mechanisms and introduced two optimization methods: ADAI (Additive Decrease/Additive Increase) and MDAI (Multiplicative Decrease/Additive Increase). These methods aim to optimize the TASD detection and mitigation system further. I published the results in a paper that can be accessed here.
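The two trust-update variants can be sketched as follows, borrowing the additive/multiplicative vocabulary from TCP congestion control; the step size and multiplicative factor are illustrative values, not the ones evaluated in the thesis:

```python
def adai(trust: float, suspicious: bool, step: float = 0.05) -> float:
    """Additive Decrease / Additive Increase trust update."""
    trust = trust - step if suspicious else trust + step
    return min(1.0, max(0.0, trust))


def mdai(trust: float, suspicious: bool, step: float = 0.05, factor: float = 0.5) -> float:
    """Multiplicative Decrease / Additive Increase: drop trust sharply on suspicious
    bursts and recover it slowly, mirroring how TCP reacts to congestion."""
    trust = trust * factor if suspicious else trust + step
    return min(1.0, max(0.0, trust))
```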

Results

The study mainly focused on the difference between DDoS and Yo-Yo attacks on cloud auto-scaling and on implementing mitigation and detection methods. Surprisingly, I also found some interesting side results:

  1. The warming time of scale up is the time an instance (VM, container, etc.) needs to become ready to serve, while the warming time of scale down is the time an instance needs to shut down its services and release resources. This warming time plays a significant role in the damage, especially with a simple auto-scaling policy: whenever the scaling metric threshold was set close to the maximum capacity of the service (e.g., a web application), the service degraded during scale up because of the warming time.

    To address this issue, several approaches can be considered. One approach is to minimize the warming time, but this may not be feasible in all scenarios, as some services need time to initialize properly. Another approach is to scale up in steps of two, where auto-scaling adds two instances at a time. A drawback of this option is the need to keep some unused capacity available for a quick response, which results in ongoing costs for the user.

    An alternative option is to adopt an early scale-up strategy combined with a slow scale-down approach, allowing ample time before scaling in the instances (a toy controller after this list sketches the idea).

  2. Yo-Yo attackers need to approximate the auto-scaling state and configuration to maximize the damage. The attacker can send probe requests and compare response times to detect the scale-up status; the same technique is suggested for detecting scale down. However, the experiment's results revealed that probe requests do not necessarily detect the scaling status accurately. For example, in my test scenario, the auto-scaling policy was configured to scale based on the Request Count metric, so the probe requests saw stable response times throughout the test. This stability prevented the attacker from effectively detecting the scaling process.
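To illustrate the early-scale-up/slow-scale-down idea from point 1, here is a toy controller loop; the thresholds and window length are assumptions for the sketch, not values from the experiment:

```python
from collections import deque

SCALE_UP_THRESHOLD = 0.6    # scale out well before the service saturates
SCALE_DOWN_THRESHOLD = 0.3  # consider scaling in only at clearly low load
SCALE_DOWN_WINDOW = 10      # consecutive low-load samples required before scaling in

recent_load = deque(maxlen=SCALE_DOWN_WINDOW)


def desired_instances(load: float, instances: int) -> int:
    """Early scale-up: react immediately. Slow scale-down: wait for a sustained quiet period."""
    recent_load.append(load)
    if load > SCALE_UP_THRESHOLD:
        return instances + 1
    if (len(recent_load) == SCALE_DOWN_WINDOW
            and max(recent_load) < SCALE_DOWN_THRESHOLD
            and instances > 1):
        return instances - 1
    return instances
```

Making scale-down deliberately slower than scale-up blunts the Yo-Yo cycle: the attacker has to wait much longer between bursts, which reduces the number of billable oscillations the victim can be forced through per day.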


Last update: August 3, 2023
