HPA wrongly assumes that terminated pods have a utilization of 100% #129866
Comments
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/sig autoscaling |
Hi... Can you try to reproduce this in the latest version? Also, maybe someone from sig cloud-provider can help.
We are reviewing this issue in the sig cloud-provider office hours this week; we have a couple of questions:
Hi..
This doesn't seem like a cloud-provider issue, as it seems limited to HPA only.
Going to remove sig cloud-provider, as I don't believe this is related.
/remove-sig cloud-provider
We are observing similar behavior in the deployment of the nginx-controller. For some reason, after the update, some pods ended up with a Completed status, and the HPA 'counted' these pods, preventing further scale-down.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I can take a look.
Oh sorry, I see you already started working on that, @jm-franc. Are you planning to keep going with it? If not, I'm happy to take it over.
Oh thanks Omer! I've been busy with other things but I think I'll have time to finish this soonish.
Great, thanks!
What happened?
A pod that terminated was considered by the HPA controller to be at its target utilization.
The controller logic (1, 2) is such that, when evaluating a scale-down, it conservatively treats pods for which the utilization metric couldn't be obtained from the metrics API as being at their target utilization (100% of their request), which damps the scale-down. (When evaluating a scale-up, the conservative assumption is that such pods have a utilization of 0, which damps the scale-up.)
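As a rough illustration, here is a minimal, self-contained Go sketch of that fallback. It is condensed and not the actual replica-calculator code; the function name fillMissingPodMetrics and the plain map-based types are illustrative assumptions.

```go
package main

import "fmt"

// fillMissingPodMetrics sketches how the HPA replica calculator backfills
// metrics for pods that reported none. usageRatio is the current/target
// utilization ratio computed from the pods that did report metrics; requests
// maps pod name -> resource request (milli-units); targetUtilization is the
// HPA target in percent. This is a simplified illustration, not the upstream
// implementation.
func fillMissingPodMetrics(metrics map[string]int64, missingPods map[string]struct{},
	requests map[string]int64, usageRatio float64, targetUtilization int64) {
	if len(missingPods) == 0 {
		return
	}
	switch {
	case usageRatio < 1.0:
		// Candidate scale-down: assume missing pods use 100% of their request
		// (or the target, if higher), which damps the scale-down. A pod that
		// terminated successfully ends up in this bucket, which is the
		// behavior this issue reports.
		fallback := targetUtilization
		if fallback < 100 {
			fallback = 100
		}
		for name := range missingPods {
			metrics[name] = requests[name] * fallback / 100
		}
	case usageRatio > 1.0:
		// Candidate scale-up: assume missing pods use nothing, which damps
		// the scale-up.
		for name := range missingPods {
			metrics[name] = 0
		}
	}
}

func main() {
	metrics := map[string]int64{"web-1": 300}                // reported usage (m)
	missing := map[string]struct{}{"web-2": {}}              // e.g. a Completed pod
	requests := map[string]int64{"web-1": 500, "web-2": 500} // requests (m)
	fillMissingPodMetrics(metrics, missing, requests, 0.75, 80)
	fmt.Println(metrics) // web-2 is now counted as using its full request
}
```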
What did you expect to happen?
I expected the controller to assume that a terminated pod has a utilization of 0.
This is already correctly handled for pods that terminated with a failure, but the case where a pod terminated successfully isn't handled.
How can we reproduce it (as minimally and precisely as possible)?
Create a Deployment with pods that terminate (without a failure) and observe that an HPA targeting this Deployment will assume that terminated pods are at target utilization.
Anything else we need to know?
Handling the case where the pod is terminated normally here will fix this.
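For illustration, a minimal sketch of that kind of fix, assuming the pod-grouping step decides which pods to exclude based on pod phase. The helper shouldIgnorePod is hypothetical; the real grouping code also handles readiness and missing metrics.

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// shouldIgnorePod sketches the phase check involved. Pods that are being
// deleted or that Failed are already excluded from the utilization average;
// the proposed fix also excludes pods that Succeeded (terminated normally),
// so they are no longer backfilled as "missing" pods at their full request.
// This helper is illustrative, not the upstream function.
func shouldIgnorePod(pod *v1.Pod) bool {
	return pod.DeletionTimestamp != nil ||
		pod.Status.Phase == v1.PodFailed ||
		pod.Status.Phase == v1.PodSucceeded // proposed addition for this issue
}

func main() {
	completed := &v1.Pod{Status: v1.PodStatus{Phase: v1.PodSucceeded}}
	fmt.Println(shouldIgnorePod(completed)) // true with the proposed change
}
```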
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)