
Creating an Additional AKS Node Pool with Terraform/OpenTofu (Step-by-Step)
- Posted by Martin Linxfeld
- Categories AKS – Azure Kubernetes Service, kubernetes, opentofu, terraform
- Date November 28, 2025
- Tags AKS custom node pools, AKS kubenet networking, AKS node pools, AKS taints and tolerations, AKS user node pool, AKS VMSS, AKS workload isolation, Azure Container Registry (ACR), Azure Kubernetes Service, Azure VNet Terraform, Deploy to AKS with Terraform, Kubernetes scheduling (taints, labels), OpenTofu AKS, Terraform AKS module
Adding a node pool with Terraform is one of the most practical ways to scale and isolate workloads in Azure Kubernetes Service. In this guide, we’ll create a fully functional user node pool and deploy workloads to it using Terraform/OpenTofu.
Azure Kubernetes Service (AKS) becomes dramatically more flexible when you split workloads across multiple node pools. A single system node pool is fine for demos or tiny clusters — but for real architectures, you need dedicated node groups, different VM sizes, isolated workloads, and optional autoscaling.
In this article, we build exactly that: a separate, dedicated user node pool, fully automated with Terraform/OpenTofu.
We’ll create a node pool with custom labels and taints, generate a Kubernetes manifest whose pods are scheduled only on this pool, and validate everything in the Azure Portal.
When you build AKS clusters with Infrastructure as Code, defining additional node pools in Terraform is a natural way to isolate workloads and let teams scale independently.
Why an Additional Node Pool?
In many production architectures you want:
- Different VM sizes (compute-optimized, memory-optimized, GPU nodes)
- Workload isolation (system pods vs. application pods)
- Taints to guarantee scheduling boundaries
- Node selectors or affinity rules to pin workloads to specific pools
- Separate autoscaling behavior
Terraform makes this repeatable, predictable, and version-controlled.
1. Defining the Additional Node Pool in Terraform
We start with a simple variable representing a list of extra node pools:
variable "additional_node_pools" {
  description = "Additional Node Pool definition"
  default = [
    {
      name                 = "userpool"
      vm_size              = "Standard_D2s_v3"
      node_count           = 2
      mode                 = "User"
      orchestrator_version = null
      subnet_id            = null
      taints               = ["dedicated=user:NoSchedule"]
      labels               = { workload = "apps", sku = "general" }
      max_pods             = 30
      enable_auto_scaling  = false
      min_count            = null
      max_count            = null
      spot                 = false
    }
  ]
}
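The variable above has no type constraint, so a typo in an attribute name only surfaces deep inside the module. A minimal sketch of a stricter declaration, assuming Terraform 1.3+ or OpenTofu for optional() defaults; the validation rules are illustrative additions and are not taken from the module:

```hcl
# Alternative declaration of the same variable (use instead of the untyped one).
variable "additional_node_pools" {
  description = "Additional Node Pool definition"

  type = list(object({
    name                 = string
    vm_size              = string
    node_count           = number
    mode                 = optional(string, "User")
    orchestrator_version = optional(string)
    subnet_id            = optional(string)
    taints               = optional(list(string), [])
    labels               = optional(map(string), {})
    max_pods             = optional(number, 30)
    enable_auto_scaling  = optional(bool, false)
    min_count            = optional(number)
    max_count            = optional(number)
    spot                 = optional(bool, false)
  }))

  # The "userpool" default from the article can be kept here unchanged.
  default = []

  # Linux node pool names: lowercase alphanumeric, starting with a letter, max 12 chars.
  validation {
    condition = alltrue([
      for p in var.additional_node_pools : can(regex("^[a-z][a-z0-9]{0,11}$", p.name))
    ])
    error_message = "Each pool name must be 1-12 lowercase alphanumeric characters and start with a letter."
  }

  # Autoscaling needs both bounds.
  validation {
    condition = alltrue([
      for p in var.additional_node_pools :
      !p.enable_auto_scaling || (p.min_count != null && p.max_count != null)
    ])
    error_message = "Pools with enable_auto_scaling = true must set min_count and max_count."
  }
}
```

With this in place, a pool entry only has to state name, vm_size, and node_count; everything else falls back to the defaults.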
We pass it into the AKS module:
module "aks" {
  source                = "github.com/foggykitchen/terraform-az-fk-aks"
  name                  = "fk-aks-extra"
  resource_group_name   = azurerm_resource_group.foggykitchen_rg.name
  location              = azurerm_resource_group.foggykitchen_rg.location
  create_networking     = true
  network_plugin        = "kubenet"
  additional_node_pools = var.additional_node_pools
}
This instructs Terraform to build an AKS cluster plus a second node pool named userpool. The module exposes an additional_node_pools parameter that takes the list defined above: one object per pool, carrying its labels, taints, VM size, and scaling settings.
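The module's internals are not shown here. As a rough sketch of how a module could turn such a list into node pools, the fragment below uses the azurerm provider's azurerm_kubernetes_cluster_node_pool resource with for_each. The resource names azurerm_kubernetes_cluster.this and azurerm_subnet.aks are hypothetical, and this is not the actual source of terraform-az-fk-aks:

```hcl
resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  # One resource instance per pool, keyed by pool name.
  for_each = { for p in var.additional_node_pools : p.name => p }

  name                  = each.value.name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.this.id # hypothetical cluster resource
  mode                  = each.value.mode
  vm_size               = each.value.vm_size
  orchestrator_version  = each.value.orchestrator_version
  max_pods              = each.value.max_pods

  # Fall back to the module's own subnet when the pool does not specify one
  # (azurerm_subnet.aks is a hypothetical name).
  vnet_subnet_id = coalesce(each.value.subnet_id, azurerm_subnet.aks.id)

  # Labels and taints become node labels and taints in Kubernetes.
  node_labels = each.value.labels
  node_taints = each.value.taints

  # azurerm 4.x argument name; azurerm 3.x calls it enable_auto_scaling.
  auto_scaling_enabled = each.value.enable_auto_scaling
  node_count           = each.value.node_count
  min_count            = each.value.enable_auto_scaling ? each.value.min_count : null
  max_count            = each.value.enable_auto_scaling ? each.value.max_count : null

  # Spot pools need a priority and an eviction policy.
  priority        = each.value.spot ? "Spot" : "Regular"
  eviction_policy = each.value.spot ? "Delete" : null
}
```

Keying for_each by pool name means that adding or removing one entry in the list only touches that pool, instead of shifting every index as count would.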
2. Architecture Overview of the AKS Additional Node Pool (Terraform)
This architecture shows a clean AKS setup extended with a dedicated user node pool.
Here is what you see in the diagram:
- Virtual network (fk-aks-demo-vnet) defining the IP space for the cluster
- Subnet for AKS nodes (fk-aks-demo-subnet) hosting both system and user pools
- Default system pool used only for core Kubernetes components
- Additional user pool (“userpool”) created via Terraform with its own VM size, labels and taints, ensuring only application workloads land there
- Optional Azure Container Registry on the left for pulling container images securely
In short: the design illustrates how AKS can be extended with extra compute groups without touching the system pool — a recommended pattern for scalable, production-grade clusters.
The diagram below illustrates how this Terraform configuration provisions a separate compute group inside the same cluster.
3. Kubernetes Manifest Pinned to the Node Pool
To ensure workloads land only on the new node pool, we include:
- Tolerations → match the taints from Terraform
- NodeSelector → match node labels
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-on-userpool
spec:
  replicas: 2
  selector:
    matchLabels: { app: demo }
  template:
    metadata:
      labels: { app: demo }
    spec:
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "user"
          effect: "NoSchedule"
      nodeSelector:
        workload: "apps"
      containers:
        - name: web
          image: nginx:stable
          ports: [{ containerPort: 80 }]
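With this configuration, the scheduler places the pods exclusively on “userpool”, even if other pools exist: the nodeSelector restricts them to nodes labeled workload=apps, and the toleration lets them past the pool’s dedicated=user:NoSchedule taint.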
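The article does not show how the manifest file is generated. One possible sketch renders it with yamlencode and the hashicorp/local provider, deriving the toleration from the pool's taint string so the two cannot drift apart. The locals and the regex are assumptions for illustration; only the resource name and output path come from the configuration in this article:

```hcl
locals {
  # The first (and only) additional pool from the variable defined earlier.
  userpool = var.additional_node_pools[0]

  # Parse each "key=value:effect" taint into a Kubernetes toleration object.
  userpool_tolerations = [
    for t in local.userpool.taints : merge(
      regex("^(?P<key>[^=]+)=(?P<value>[^:]*):(?P<effect>.+)$", t),
      { operator = "Equal" }
    )
  ]
}

resource "local_file" "app-on-userpool" {
  filename = "${path.module}/generated/app-on-userpool.yaml"

  content = yamlencode({
    apiVersion = "apps/v1"
    kind       = "Deployment"
    metadata   = { name = "app-on-userpool" }
    spec = {
      replicas = 2
      selector = { matchLabels = { app = "demo" } }
      template = {
        metadata = { labels = { app = "demo" } }
        spec = {
          tolerations  = local.userpool_tolerations
          nodeSelector = { workload = local.userpool.labels["workload"] }
          containers = [{
            name  = "web"
            image = "nginx:stable"
            ports = [{ containerPort = 80 }]
          }]
        }
      }
    }
  })
}
```

If the taint or label changes in the variable, the next apply regenerates the manifest with matching tolerations and node selector.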
4. Local Execution: kubectl apply
The Terraform module renders the manifest locally and applies it:
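resource "null_resource" "kubectl_apply" {
  # Wait for the cluster (including the user pool) and the rendered manifest.
  depends_on = [
    module.aks,
    local_file.app-on-userpool
  ]

  provisioner "local-exec" {
    command = join(" && ", [
      # Fetch cluster credentials into the local kubeconfig.
      "az aks get-credentials -g ${azurerm_resource_group.foggykitchen_rg.name} -n ${module.aks.cluster_name} --overwrite-existing",
      # Show nodes with their pool name and custom labels as extra columns.
      "kubectl get nodes -L agentpool,workload,sku",
      # Deploy the generated manifest and print where the pods landed.
      "kubectl apply -f ${path.module}/generated/app-on-userpool.yaml",
      "kubectl get pods -o wide"
    ])
  }
}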
Terraform configures kubectl, deploys the manifest, and prints pod placement.
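One caveat: a null_resource without triggers runs its provisioner only when the resource is first created, so later manifest changes are not re-applied. Below is a minimal sketch of a variant (replacing the block above) that re-runs whenever the rendered manifest changes. It assumes local_file.app-on-userpool sets its content argument, which this article does not show; the rollout check and its 120s timeout are additions, not part of the original workflow:

```hcl
resource "null_resource" "kubectl_apply" {
  # Replace this resource, and re-run the provisioner, when the manifest changes.
  triggers = {
    manifest_sha = sha256(local_file.app-on-userpool.content)
  }

  depends_on = [module.aks]

  provisioner "local-exec" {
    command = join(" && ", [
      "az aks get-credentials -g ${azurerm_resource_group.foggykitchen_rg.name} -n ${module.aks.cluster_name} --overwrite-existing",
      "kubectl apply -f ${local_file.app-on-userpool.filename}",
      # Fail the apply if the Deployment does not become ready in time.
      "kubectl rollout status deployment/app-on-userpool --timeout=120s",
      "kubectl get pods -o wide"
    ])
  }
}
```

Referencing local_file.app-on-userpool directly also creates the dependency on the file, so it no longer needs to be listed in depends_on.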
5. Validating in Azure Portal
Everything matches what Terraform defined:
- Mode: User
- VM size: Standard_D2s_v3
- Taint: dedicated=user:NoSchedule
- Labels: workload=apps, sku=general
- Node count: 2
In Workloads → Deployments, we find:
- app-on-userpool running
- 2 replicas placed on nodes belonging to the userpool
- Clean scheduling based on node selectors and tolerations
This confirms that our Terraform + YAML combination behaves exactly as expected.
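The same check can be done without leaving Terraform. As a sketch, the azurerm provider's azurerm_kubernetes_cluster_node_pool data source can read the pool back and expose the values compared above; the output name userpool_summary is made up for this example:

```hcl
data "azurerm_kubernetes_cluster_node_pool" "userpool" {
  name                    = "userpool"
  kubernetes_cluster_name = module.aks.cluster_name
  resource_group_name     = azurerm_resource_group.foggykitchen_rg.name

  # Read only after the module has created the cluster and the pool.
  depends_on = [module.aks]
}

output "userpool_summary" {
  description = "Node pool settings as reported by Azure"
  value = {
    mode        = data.azurerm_kubernetes_cluster_node_pool.userpool.mode
    vm_size     = data.azurerm_kubernetes_cluster_node_pool.userpool.vm_size
    node_count  = data.azurerm_kubernetes_cluster_node_pool.userpool.node_count
    node_labels = data.azurerm_kubernetes_cluster_node_pool.userpool.node_labels
    node_taints = data.azurerm_kubernetes_cluster_node_pool.userpool.node_taints
  }
}
```

After an apply, terraform output userpool_summary should report the same mode, VM size, labels, and taint that the portal displays.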
6. Read more about production AKS patterns with Terraform
If you’re following the AKS Terraform series, here are the previous articles:
🔗 AKS Kubenet vs Azure CNI — Networking trade-offs explained with Terraform
Understand how AKS networking choices impact Pod IP addressing, traffic flow, scalability, and what you actually observe in Azure Monitor. This guide explains the real production trade-offs between Kubenet and Azure CNI using Terraform examples.
🔗 AKS + Azure Container Registry with Terraform — Secure image supply chain for production clusters
Learn how to provision Azure Container Registry and integrate it with AKS using Terraform/OpenTofu. This guide covers private image pulls, secure authentication, and the baseline container supply chain for production AKS environments.
🔗 Persistent Volumes in AKS with Terraform — The Role of Azure Managed Disks
Understand how AKS provisions persistent storage using the Azure CSI driver and how to automate disk-backed PersistentVolumes with Terraform/OpenTofu. This is the baseline pattern for running stateful workloads on AKS in production.
🔗 Azure Bastion with Terraform — Secure Access to Private AKS Clusters
A hands-on guide to deploying Azure Bastion with Terraform — including the required subnets, NSG rules, and a practical workflow for connecting securely to private AKS nodes. If you’re planning a private AKS cluster, this article explains the exact infrastructure you will need. It also includes screenshots and troubleshooting steps directly from the Azure Portal.
7. What’s Next? Auto-Scaling Node Pools
Adding a fixed-size node pool is just the beginning.
In the next article, we will create an autoscaling node pool, where AKS automatically adds or removes nodes based on cluster load and pod scheduling pressure.
This is one of the most important skills for production-grade AKS operations — and we will build it step by step with Terraform/OpenTofu.
⚡Course: “Azure Kubernetes Service (AKS) with Terraform/OpenTofu — Hands-On Fundamentals (2025 Edition)”
This blog post is part of the AKS course, where we go far deeper into:
- AKS networking (Kubenet, Azure CNI, Overlay, dual-stack)
- Node pools (system/user, taints, labels, tolerations, autoscaling)
- ACR integration and CI/CD workflows
- Observability and Log Analytics
- Storage, identity, RBAC, and production-ready architecture patterns

