10 Key Enhancements in Kubernetes v1.36 for Workload-Aware Scheduling

Kubernetes v1.36 marks a significant leap forward in how the scheduler handles AI/ML and batch workloads. Building on the foundational Workload API introduced in v1.35, this release refines the architecture to improve scalability, atomicity, and flexibility. Below are the ten most important enhancements that make workload-aware scheduling more powerful and production-ready.

1. Clean Separation of Workload and PodGroup APIs

In v1.35, the Workload API bundled both the template and the runtime state of Pod groups into a single resource. That design limited scalability because any change to a Pod’s status required updating the parent Workload object. v1.36 splits these concerns: the Workload now acts exclusively as a static template, while the new PodGroup API handles runtime state. This separation lets controllers and the scheduler work with lighter objects and enables per-replica sharding of status updates, which matters most for large-scale deployments.
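The split can be pictured with a small model. This is only a sketch: the class and field names below (`Workload`, `PodGroup`, `min_count`, `writes`) are illustrative assumptions and do not reproduce the real API schema. It shows that each replica writes status to its own object while the template is never modified.

```python
from dataclasses import dataclass

# Hypothetical, simplified models; the field names are illustrative
# assumptions and are not the real Kubernetes API schema.


@dataclass(frozen=True)
class Workload:
    """Static template: written once, never touched by status updates."""
    name: str
    min_count: int  # gang size required by the template


@dataclass
class PodGroup:
    """Runtime object: one per replica, owning its own status."""
    name: str
    workload_ref: str
    scheduled_pods: int = 0
    writes: int = 0  # counts status writes against this object only

    def record_scheduled(self, count: int) -> None:
        self.scheduled_pods = count
        self.writes += 1


workload = Workload(name="training", min_count=4)
groups = [
    PodGroup(name=f"training-{i}", workload_ref=workload.name)
    for i in range(3)
]

# Each replica reports status on its own object; the template is not written.
for i, group in enumerate(groups):
    group.record_scheduled(i + 1)

print([(g.name, g.scheduled_pods, g.writes) for g in groups])
```

Because `Workload` is frozen, any attempt to write runtime state into the template fails immediately, which mirrors the "static template" role described above.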

2. PodGroup API: The New Runtime Object

The PodGroup resource, part of scheduling.k8s.io/v1alpha2, captures the live scheduling state of a group of Pods that must run together. It holds the actual scheduling policy (e.g., gang scheduling parameters) and references the Workload template that created it. Its status field mirrors individual Pod conditions, giving the scheduler a consolidated view of the group’s health. This dedicated object simplifies debugging and allows independent lifecycle management for each Pod group.
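As a rough illustration of a consolidated group view, the sketch below folds per-Pod phases into one summary. The function name `summarize_group` and the output keys `phases` and `gangSatisfied` are assumptions made for this example; only the Pod phase strings (`Pending`, `Running`, and so on) are standard Kubernetes values.

```python
from collections import Counter


def summarize_group(pod_phases: dict, min_count: int) -> dict:
    """Fold per-Pod phases into a single group-level summary.

    pod_phases maps a Pod name to its phase string. The returned keys are
    hypothetical and chosen only for this illustration.
    """
    counts = Counter(pod_phases.values())
    return {
        "phases": dict(counts),
        # The gang is considered satisfied once enough Pods are running.
        "gangSatisfied": counts.get("Running", 0) >= min_count,
    }


summary = summarize_group(
    {"worker-0": "Running", "worker-1": "Running", "worker-2": "Pending"},
    min_count=3,
)
print(summary)
```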

3. Workload API Becomes a Pure Template

Now that runtime state lives in the PodGroup, the Workload resource is streamlined. It defines one or more podGroupTemplates that specify the scheduling policy – for example, a gang policy with a minimum count of Pods. Workload controllers like the Job controller stamp out these templates to create PodGroup instances. Because the scheduler no longer needs to parse the Workload object, its logic becomes simpler and faster.
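The stamping step might look roughly like the following. Everything here is a sketch under stated assumptions: `stamp_pod_groups`, the `workloadRef` and `status` keys, and the `gang`/`minCount` policy layout are illustrative, and only the `podGroupTemplates` name comes from the description above.

```python
import copy


def stamp_pod_groups(workload: dict, replicas: int) -> list:
    """Create one PodGroup-like dict per template and replica.

    The dict layout is a hypothetical stand-in for the real objects.
    """
    groups = []
    for template in workload["podGroupTemplates"]:
        for i in range(replicas):
            groups.append({
                "name": f"{workload['name']}-{template['name']}-{i}",
                "workloadRef": workload["name"],
                # Deep copy so runtime edits never leak back into the template.
                "policy": copy.deepcopy(template["policy"]),
                "status": {"scheduledPods": 0},
            })
    return groups


workload = {
    "name": "training",
    "podGroupTemplates": [
        {"name": "workers", "policy": {"gang": {"minCount": 4}}},
    ],
}

for group in stamp_pod_groups(workload, replicas=2):
    print(group["name"], group["policy"])
```

Copying the policy into each instance keeps the template read-only from the point of view of whatever consumes the PodGroups.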

4. New PodGroup Scheduling Cycle in kube-scheduler

The kube-scheduler now includes a dedicated PodGroup scheduling cycle. This cycle processes PodGroups atomically, considering the group as a single unit rather than scheduling Pods one by one. It evaluates whether the entire group can be placed before committing any Pod, avoiding partial scheduling failures. This foundational change opens the door to more advanced group-aware scheduling strategies in future releases.

5. Atomic Workload Processing

With the new scheduling cycle, the scheduler treats each PodGroup as an atomic unit. For gang scheduling, this means the scheduler binds the group only if all required Pods can be placed simultaneously. This avoids wasted scheduling attempts and the case where a partially placed group holds resources it cannot use while waiting for the rest of its Pods. Atomic processing also improves predictability for batch jobs, ensuring that resources are not partially allocated and later rolled back.
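The all-or-nothing idea can be sketched in a few lines. This is not the kube-scheduler algorithm: `place_group` is a hypothetical helper that uses a simple first-fit heuristic over a single resource (CPU), and it can miss placements that a smarter search would find. What it does illustrate is the commit step: capacity is only deducted once the whole gang fits.

```python
def place_group(pod_cpu_requests, node_free_cpu, min_count):
    """Try to place a gang of Pods; return {pod_index: node} or None.

    Works on a copy of the node capacities and commits the result only when
    at least min_count Pods fit, so a failed attempt leaves no trace.
    """
    free = dict(node_free_cpu)  # scratch copy, discarded on failure
    assignment = {}
    for pod, cpu in enumerate(pod_cpu_requests):
        # First-fit: take the first node with enough remaining CPU.
        node = next((n for n, f in free.items() if f >= cpu), None)
        if node is None:
            continue
        free[node] -= cpu
        assignment[pod] = node
    if len(assignment) < min_count:
        return None  # nothing was committed
    node_free_cpu.update(free)  # commit the whole group at once
    return assignment


nodes = {"node-a": 4, "node-b": 2}
print(place_group([2, 2, 2], nodes, min_count=3))  # whole gang fits
print(place_group([2, 2], nodes, min_count=2))     # cluster is now full
```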

6. Topology-Aware Scheduling (First Iteration)

v1.36 introduces the first iteration of topology-aware scheduling for PodGroups. When a Workload template includes topology constraints (e.g., spread across zones or nodes), the scheduler now considers these during the PodGroup scheduling cycle. It can co-locate or distribute Pod groups based on hardware topology, reducing latency for tightly coupled AI/ML training jobs and improving resilience for distributed workloads.
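For the co-location case, one simple strategy is to look for a single topology domain that can hold the whole group. The sketch below assumes a hypothetical `pick_zone` helper, nodes described as `(name, zone, free_cpu)` tuples, and identical CPU requests for every Pod; the real scheduler considers far more than this.

```python
from collections import defaultdict


def pick_zone(nodes, pod_count, cpu_per_pod):
    """Return a zone that can host every Pod of the group, or None.

    nodes is a list of (name, zone, free_cpu) tuples. Each node contributes
    as many Pod slots as its free CPU allows.
    """
    slots = defaultdict(int)
    for _name, zone, free_cpu in nodes:
        slots[zone] += free_cpu // cpu_per_pod
    candidates = [zone for zone, count in slots.items() if count >= pod_count]
    if not candidates:
        return None
    # Prefer the tightest fit so larger zones stay free for larger groups;
    # the zone name breaks ties deterministically.
    return min(candidates, key=lambda zone: (slots[zone], zone))


nodes = [
    ("n1", "zone-a", 4),
    ("n2", "zone-a", 3),
    ("n3", "zone-b", 8),
    ("n4", "zone-b", 8),
]
print(pick_zone(nodes, pod_count=3, cpu_per_pod=2))
print(pick_zone(nodes, pod_count=4, cpu_per_pod=2))
```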

7. Workload-Aware Preemption (First Iteration)

Preemption logic in the scheduler now understands PodGroups. If a higher-priority PodGroup cannot schedule due to resource shortage, the scheduler can preempt individual Pods from lower-priority groups – but only if doing so frees enough resources for the entire higher-priority group. This workload-aware preemption avoids breaking atomicity and ensures that preemption decisions are group-conscious, not just Pod-conscious.
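The "only if it frees enough for the whole group" rule can be illustrated as follows. This is a deliberately simplified sketch: `select_victims` is a hypothetical helper, it treats CPU as one cluster-wide pool and ignores per-node fit, and the victim ordering (lowest priority first, then largest Pod) is an assumption rather than the scheduler's documented policy.

```python
def select_victims(needed_cpu, free_cpu, running_pods, incoming_priority):
    """Pick Pods to preempt so that an entire group fits.

    running_pods is a list of (name, priority, cpu) tuples.
    Returns [] if the group already fits, a list of victim names if
    preemption frees enough CPU, or None if even evicting every
    lower-priority Pod would not be enough (so nothing is evicted).
    """
    shortfall = needed_cpu - free_cpu
    if shortfall <= 0:
        return []
    candidates = sorted(
        (p for p in running_pods if p[1] < incoming_priority),
        key=lambda p: (p[1], -p[2]),  # lowest priority first, then largest
    )
    victims, freed = [], 0
    for name, _priority, cpu in candidates:
        if freed >= shortfall:
            break
        victims.append(name)
        freed += cpu
    return victims if freed >= shortfall else None


running = [("a-0", 1, 2), ("a-1", 1, 2), ("b-0", 5, 4), ("c-0", 10, 8)]
print(select_victims(8, 2, running, incoming_priority=7))
print(select_victims(20, 2, running, incoming_priority=7))
```

Returning `None` rather than a partial victim list is the point of the sketch: evicting Pods that still leave the group unschedulable would cost the victims their work without helping anyone.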

8. Dynamic Resource Allocation via ResourceClaim

With the new ResourceClaim support for PodGroups, administrators can leverage Kubernetes’ Dynamic Resource Allocation (DRA) for batch workloads. A Workload can request a set of ResourceClaims that are claimed as a group. The scheduler then coordinates resource allocation and deallocation across all Pods in the group. This is especially valuable for workloads that require GPUs, TPUs, or other specialized hardware that must be provisioned together.
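One way to picture coordinated allocation and deallocation is a claim that tracks which Pods of the group are using it. The `GroupClaim` class below is a hypothetical model, not the DRA API: it allocates its device when the first Pod reserves it and deallocates when the last Pod releases it.

```python
class GroupClaim:
    """Hypothetical model of a claim shared by all Pods in one group."""

    def __init__(self, device: str) -> None:
        self.device = device
        self.allocated = False
        self.reserved_for = set()  # names of Pods currently using the claim

    def reserve(self, pod: str) -> None:
        # The first consumer triggers allocation for the whole group.
        if not self.reserved_for:
            self.allocated = True
        self.reserved_for.add(pod)

    def release(self, pod: str) -> None:
        self.reserved_for.discard(pod)
        # The device is freed only after the last consumer is gone.
        if not self.reserved_for:
            self.allocated = False


claim = GroupClaim("gpu-pool-0")
claim.reserve("worker-0")
claim.reserve("worker-1")
claim.release("worker-0")
print(claim.allocated, sorted(claim.reserved_for))
claim.release("worker-1")
print(claim.allocated, sorted(claim.reserved_for))
```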

9. Job Controller Integration with New APIs

The Job controller has been updated to use the new Workload and PodGroup APIs. In this first phase of integration, the controller creates a Workload that defines PodGroup templates and then creates the corresponding PodGroup instances. Jobs can therefore use gang scheduling and other group-aware features without custom code, which eases adoption for existing batch job users.

10. Performance and Scalability Improvements

Thanks to the API separation and new scheduling cycle, v1.36 delivers measurable performance gains. The PodGroup’s per-replica status sharding reduces contention on the Workload object. The scheduler’s direct PodGroup watch eliminates the overhead of parsing Workload objects. Early benchmarks show reduced scheduling latency for large clusters running thousands of Pod groups, making the system more scalable for production AI/ML training farms.

In summary, Kubernetes v1.36 moves workload-aware scheduling from a promising preview toward a more robust foundation. By decoupling templates from runtime state, introducing atomic scheduling cycles, and adding topology and preemption awareness, this release gives operators and developers better tools for running complex batch and AI/ML workloads efficiently. The PodGroup API is still served as scheduling.k8s.io/v1alpha2, and topology-aware scheduling and workload-aware preemption are first iterations, so details may still change. The integration with the Job controller shows where the design is heading, and future releases are expected to build on this architectural base.
