Like any tech geek out there, I like comic books, superheroes, Star Wars, Star Trek, the whole shebang. This time around, I would love to draw your attention to two fictional comic book characters.

Ant-Man

The first guy is Ant-Man, who’s got this awesome bodysuit and helmet that, once put on, modifies the molecular structure of his cells at the atomic level, which in turn lets him shrink to the size of an ant or even smaller. He can also reverse the suit’s effect, either to return to his original size or to keep growing until he pretty much becomes a giant. Yes, he’s been known to grow bigger than buildings. Cool, isn’t it? He can grow or shrink. Just him.

Dupli-Kate

On the other hand, we’ve got Dupli-Kate, a superhero who is kinda cool too. As her name suggests, her superpower lets her make copies of herself by the tens, hundreds, or even thousands. And whenever she is done with the task that warranted the self-duplication, she can merge back into one person. She’s got a twin brother called Multi-Paul, and yes, he has the same power of self-duplication. But he is kind of a villain, so we’ll stick with girl power for this allegory.

So?

As you can see, both of these cool superheroes (Ant-Man and Dupli-Kate) can alter their own mass. Ant-Man increases his mass by growing as a single unit, while Dupli-Kate (or Multi-Paul) expands by creating multiple copies of herself.

Huh???

If I were to read your mind right now, you’d be thinking: this guy is supposed to be a DevOps guy. Why the fuss about comic books and fictional characters, and why mix them into a DevOps conversation? For what purpose?

Well, it turns out Dupli-Kate and Ant-Man are smeared across practically every nook and cranny of DevOps. Everywhere you turn, there is a situation that begs you to summon Ant-Man; in other cases, it’s Dupli-Kate who comes to the rescue. You need them on cloud platforms like AWS, GCP, and Azure, in virtualisation environments like VMware, and in orchestrators like Nomad and Kubernetes. Every corner you turn, they are smack there, spilling your hot coffee over your pretty white blouse like you never saw it coming.

Still don’t know what I’m talking about?

It’s easy. I’m talking about a concept in DevOps called auto-scaling, and auto-scaling works on the same principle as Ant-Man and Dupli-Kate.

Ok, no more riddles.

What is Auto-scaling?

Autoscaling is the automated process of dynamically adjusting computing resources such as virtual machines, containers, or CI/CD pipeline agents to match real-time demand. It keeps application performance steady during traffic spikes and scales resources back down during quieter periods to minimize cloud costs.

Types of Autoscaling

Systems can scale in two distinct ways to handle workload variations:

Dupli-Kate AKA Horizontal Scaling (Scale Out/In): Adding or removing identical server instances or containers. This is the standard method for cloud-native applications.
Ant-Man AKA Vertical Scaling (Scale Up/Down): Increasing or decreasing the capacity of an existing server (e.g., adding or removing RAM or CPU). Because this often requires temporary downtime, it is less commonly automated.

How Autoscaling Works

Autoscaling operates via user-defined policies and metrics to automate the lifecycle of compute resources. It typically relies on the following components:

Metric Thresholds: Monitoring services (such as Prometheus, Datadog, or AWS CloudWatch) continuously track parameters like CPU utilization, memory usage, or network traffic.
Minimum/Maximum Limits: Engineers define a floor and ceiling for resources so systems can scale outward without spiraling out of control financially.
Automated Actions: When a threshold is breached (e.g., CPU usage exceeds 70% for 5 minutes), the autoscaling engine automatically provisions or terminates instances.
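
The three components above can be sketched in a few lines of Python. This is only a minimal illustration, not any vendor’s real engine: the `ScalingPolicy` class, its field names, and the `decide` function are hypothetical, and the 70% / 5-minute numbers are simply the example values from the list.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # Hypothetical policy object; the field names are illustrative only.
    cpu_threshold: float = 70.0  # metric threshold, in percent
    breach_minutes: int = 5      # how long the breach must last
    min_instances: int = 1       # floor
    max_instances: int = 5       # ceiling


def decide(policy: ScalingPolicy, cpu_per_minute: list[float], current: int) -> int:
    """Return the instance count the policy asks for.

    cpu_per_minute holds one CPU reading per minute, oldest first.
    """
    window = cpu_per_minute[-policy.breach_minutes:]
    # The breach only counts if every reading in a full window is over the threshold.
    sustained = (
        len(window) == policy.breach_minutes
        and all(cpu > policy.cpu_threshold for cpu in window)
    )
    desired = current + 1 if sustained else current
    # Clamp so the count never drops below the floor or passes the ceiling.
    return max(policy.min_instances, min(policy.max_instances, desired))
```

With the default policy, five straight readings above 70% move a fleet of 2 to 3, while a single dip inside the window leaves it at 2. A fleet already at the ceiling stays put no matter how hot the CPUs run.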

Common Tools

To implement autoscaling in your own infrastructure, DevOps teams frequently leverage the following services:

Cloud Provider Services: AWS Auto Scaling, Microsoft Azure Autoscale, and Google Cloud Autoscaler.
Container & Kubernetes Orchestration: Kubernetes ships with the Horizontal Pod Autoscaler (HPA), which scales container replicas based on metrics such as CPU utilization, and add-ons like KEDA extend this to scale on specific workload events.
Infrastructure as Code (IaC): Tools like Terraform and Ansible are used to write autoscaling policies as code, making environments repeatable and reliable.
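
For a feel of how the HPA picks a replica count, the Kubernetes documentation describes its core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). Below is a rough Python rendering of just that formula. The function name `desired_replicas` is made up for this post, and the real controller also applies a tolerance, stabilization windows, and min/max bounds that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    # Scale the replica count by how far the observed metric sits from the target,
    # then round up so there is never less capacity than the ratio asks for.
    return math.ceil(current_replicas * (current_metric / target_metric))


# 3 pods averaging 90% CPU against a 60% target
print(desired_replicas(3, 90, 60))
```

In that example the ratio is 1.5, so 3 pods become 5. If the same pods were averaging only 30% against the 60% target, the formula would ask for fewer replicas instead.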

So, back to the fuss about our heroes Ant-Man & Dupli-Kate

Like I said, the fuss about Ant-Man and Dupli-Kate comes down to how each of them expands.

Ant-Man grows upwards, vertically towards the sky. That’s pretty much scaling up and scaling down, the attributes of vertical scaling.
The same goes for Dupli-Kate. By virtue of her self-replication, spreading herself across the horizon, she exhibits scaling out and scaling in, which is pretty much horizontal scaling.

Let’s put it into some examples.

If we use Ant-Man to represent vertical scaling and Dupli-Kate to represent horizontal scaling, auto-scaling can be understood as a system that automatically adjusts resources based on demand.

1. Vertical Auto-Scaling (Ant-Man Model)

Think of Ant-Man as a single server that can grow or shrink in size.

Normal Traffic

One server:
- 2 CPUs
- 4 GB RAM
CPU utilization: 30%

Everything runs normally.

Traffic Spike

Suddenly, 10,000 users visit the application.

The auto-scaling system detects:

CPU utilization > 80%
Memory utilization > 75%

Instead of creating new servers, it upgrades the existing server. Basically scaling up:

Before	After
2 CPUs	8 CPUs
4 GB RAM	16 GB RAM

When traffic drops, the server is resized back down, like Ant-Man returning to normal size. Basically scaling down.

Advantages

Simpler architecture.
No need for load balancing.

Challenges

There is a maximum size a server can reach.
Scaling may require a restart or cause brief disruption depending on the platform.
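
Here is a toy Python sketch of the Ant-Man model: one server that steps up or down a fixed ladder of sizes. The `SIZES` ladder and the `resize` function are invented for illustration. The first and last rungs mirror the before/after table above, the middle rung is an assumed intermediate step, and the scale-down thresholds and the choice to scale up when either metric is high are my own assumptions. Real platforms expose their own instance types and resize APIs.

```python
# Hypothetical size ladder: (CPUs, RAM in GB), smallest to largest.
SIZES = [(2, 4), (4, 8), (8, 16)]


def resize(current_index: int, cpu_percent: float, mem_percent: float) -> int:
    """Return the index in SIZES that the single server should move to."""
    if cpu_percent > 80 or mem_percent > 75:
        # Scale up, but the ladder has a top rung: that is the hardware ceiling.
        return min(current_index + 1, len(SIZES) - 1)
    if cpu_percent < 30 and mem_percent < 30:
        # Scale down (assumed thresholds), never below the smallest size.
        return max(current_index - 1, 0)
    return current_index


# A server on the smallest size under heavy load moves one rung up.
print(SIZES[resize(0, cpu_percent=85, mem_percent=50)])
```

Notice that once the server sits on the top rung, more load changes nothing. That is the first challenge listed above in code form.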

2. Horizontal Auto-Scaling (Dupli-Kate Model)

Think of Dupli-Kate as the ability to create multiple copies of the same server.

Normal Traffic

You start with:

  Load Balancer
      |
   Server-1

CPU utilization = 30%.

Traffic Spike

Demand increases dramatically.

The auto-scaler detects:

Average CPU > 70% for 5 minutes

Instead of making Server-1 larger, it creates copies:

           Load Balancer
         /      |      \
   Server-1 Server-2 Server-3

Requests are distributed across all servers.

Like Dupli-Kate creating clones of herself to share the workload. Basically scaling out.

Larger Spike

Traffic continues increasing:

                Load Balancer
      /      /      |      \      \
    S1      S2      S3      S4     S5

The auto-scaler keeps adding instances until the load returns to acceptable levels or the configured maximum is reached, meaning it keeps scaling out.

Traffic Drops

When CPU utilization falls below 30%:

  Load Balancer
      |
   Server-1

Extra instances are automatically terminated.

Like Dupli-Kate’s clones disappearing when they are no longer needed. Basically scaling in.

Advantages

Virtually unlimited scaling.
Better fault tolerance.
No single server bottleneck.

Challenges

Requires stateless application design.
Needs load balancing and distributed storage/session management.
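
To make "requests are distributed across all servers" concrete, here is a small round-robin sketch in Python. Round-robin is just one common balancing strategy, picked here as an assumption; the `distribute` function and the server names are illustrative and not tied to any real load balancer.

```python
from collections import Counter
from itertools import cycle


def distribute(request_count: int, servers: list[str]) -> Counter:
    """Hand out requests to servers in turn and count what each one received."""
    handled = Counter()
    turn = cycle(servers)  # endlessly repeats Server-1, Server-2, ...
    for _ in range(request_count):
        handled[next(turn)] += 1
    return handled


one = distribute(9000, ["Server-1"])
three = distribute(9000, ["Server-1", "Server-2", "Server-3"])
print(one["Server-1"], three["Server-1"])
```

With a single server, Server-1 takes all 9,000 requests. With three clones behind the balancer, each one handles 3,000, which is the whole point of calling in Dupli-Kate.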

3. Real Cloud Example

Suppose an e-commerce application is running on Amazon Web Services (AWS).

Ant-Man (Vertical Scaling)

At 9 AM:
- EC2 Instance
  - 2 vCPU
  - 4 GB RAM
At 12 PM during a flash sale:
- EC2 Instance
  - 16 vCPU
  - 64 GB RAM

The same machine becomes more powerful. On EC2 this means changing the instance type, which requires stopping and starting the instance.

Dupli-Kate (Horizontal Scaling)

At 9 AM:
- 1 EC2 Instance
At 12 PM:
- 10 EC2 Instances behind an Elastic Load Balancer

The workload is spread across multiple identical servers.

4. How Auto-Scaling Makes Decisions

A typical rule might be:

Scale Out:
If CPU > 70% for 5 minutes
Add 1 instance

Scale In:
If CPU < 30% for 10 minutes
Remove 1 instance

Example Timeline

Time	Users	CPU	Action
09:00	500	25%	No action
10:00	2,000	75%	Add instance
10:05	5,000	80%	Add another instance
10:10	10,000	85%	Add another instance
13:00	1,000	20%	Remove instance
14:00	500	15%	Remove another instance

This is auto-scaling: monitoring workload metrics and automatically increasing or decreasing resources.
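
The rule and timeline above can be replayed in a few lines of Python. This is a simplified sketch: it treats every row of the table as a reading that has already stayed past its threshold for the required 5 or 10 minutes, it assumes the fleet starts with a single instance, and the `replay` function and the floor of 1 are my own additions for illustration.

```python
# (time, users, cpu_percent) rows copied from the example timeline.
TIMELINE = [
    ("09:00", 500, 25),
    ("10:00", 2000, 75),
    ("10:05", 5000, 80),
    ("10:10", 10000, 85),
    ("13:00", 1000, 20),
    ("14:00", 500, 15),
]


def replay(timeline, start=1, floor=1):
    """Apply the scale-out/scale-in rule to each row and record the instance count."""
    instances = start
    history = []
    for time, _users, cpu in timeline:
        if cpu > 70:
            instances += 1                         # scale out
        elif cpu < 30:
            instances = max(floor, instances - 1)  # scale in, respecting the floor
        history.append((time, instances))
    return history


for time, count in replay(TIMELINE):
    print(time, count)
```

Replaying the table gives instance counts of 1, 2, 3, 4, 3, 2. The 09:00 row stays at 1 because the fleet is already at its floor, which lines up with the "No action" entry.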

Summary

Ant-Man (Vertical Scaling)	Dupli-Kate (Horizontal Scaling)
Makes one server bigger	Creates more servers
Scale up/down CPU & RAM	Add/remove instances
Easier to manage	More resilient
Limited by hardware size	Can scale almost indefinitely
Example: 2 CPU → 16 CPU	Example: 1 server → 10 servers

Auto-scaling can use Ant-Man scaling, Dupli-Kate scaling, or a combination of both. Many modern cloud-native applications prefer the Dupli-Kate approach because it provides higher availability and elasticity.

THANKS FOR READING GUYS !!

📋 JayJay's DevOps Diaries ..

Chronicling my journey through Cloud Native Infrastructure.. one step and tool at a time...

AntMan Vs Dupli-Kate
