Kueue v0.10.0-rc.1

Pre-release
@mimowo mimowo released this 26 Nov 09:55
v0.10.0-rc.1
f486009

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is no longer available. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:

    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

    If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
    to start watching for the "Deactivated" reason. (#3593, @mbobrovskyi)
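
For tooling that matches on these reason strings, the rename can be handled with a small lookup table. The following is a minimal sketch, not part of Kueue: the helper names `MigrateEvictionReason` and `IsDeactivated` are hypothetical, and only the reason strings themselves come from the notes above. "Preempted" is used in `main` purely as an example of a reason that is not affected by the rename.

```go
package main

import (
	"fmt"
	"strings"
)

// renamedEvictionReasons maps the pre-v0.10 reasons of the Evicted condition
// to their new names, as listed in the upgrade notes.
var renamedEvictionReasons = map[string]string{
	"InactiveWorkload":                       "Deactivated",
	"InactiveWorkloadAdmissionCheck":         "DeactivatedDueToAdmissionCheck",
	"InactiveWorkloadRequeuingLimitExceeded": "DeactivatedDueToRequeuingLimitExceeded",
}

// MigrateEvictionReason (hypothetical helper) returns the new name for a
// renamed reason and returns any other reason unchanged.
func MigrateEvictionReason(reason string) string {
	if renamed, ok := renamedEvictionReasons[reason]; ok {
		return renamed
	}
	return reason
}

// IsDeactivated (hypothetical helper) reports whether a reason denotes
// deactivation under either the old or the new naming.
func IsDeactivated(reason string) bool {
	return strings.HasPrefix(MigrateEvictionReason(reason), "Deactivated")
}

func main() {
	// "Preempted" stands in for a reason that the rename does not touch.
	for _, r := range []string{"InactiveWorkload", "InactiveWorkloadAdmissionCheck", "Preempted"} {
		fmt.Printf("%s -> %s (deactivated: %t)\n", r, MigrateEvictionReason(r), IsDeactivated(r))
	}
}
```

Accepting both spellings during the transition lets the same alerting or automation code run against clusters before and after the upgrade.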

Changes by Kind

Feature

  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allow scaling a StatefulSet down to zero and up from zero. (#3487, @mbobrovskyi)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)

Bug or Regression

  • Added validation for Deployment queue-name to fail fast (#3555, @mbobrovskyi)
  • Added validation for StatefulSet queue-name to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads which fit in nominal quota to schedule faster (#3547, @gabesaba)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replicas, which was resulting in workloads
    getting stuck pending after a rolling restart of Kueue. (#3612, @mimowo)
  • Fix running a Job when parallelism < completions. Before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed
    and its integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)