[Flaky test] kubetest.diffResources #129953
Comments
The deletion script seems to ignore the Retry error.
I don't know if there is a gcloud option that retries these kinds of errors, but in this case the problem is that the error is ignored and the deletion does not seem to happen, so resources are leaked and the job fails. See kubernetes/cluster/gce/util.sh, lines 3699 to 3948 in fc268ec.
/help
@aojea: Guidelines: Please ensure that the issue body includes answers to the following questions:
For more details on the requirements of such an issue, please see here and ensure that they are met. If this request no longer meets these requirements, the label can be removed In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@aojea It looks like this issue won't be a blocker for tomorrow's alpha.1 cut for 1.33.
We do also delete resources in "boskos", but there may be a delay, and finding undeleted resources can indicate a bug (e.g. consider PV drivers and storage e2e tests). In this case it seems we just need more robust retries when cleaning up the VMs.
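As a rough illustration of the "more robust retries" idea, here is a minimal sketch of a retry wrapper that surfaces the final failure instead of ignoring it. The function name `retry_cmd` and its arguments are hypothetical and are not taken from cluster/gce/util.sh; the actual fix in the script may look quite different.

```shell
# Hypothetical helper (not from cluster/gce/util.sh).
# Usage: retry_cmd <attempts> <delay-seconds> <command> [args...]
# Runs the command until it succeeds or the attempts are exhausted.
# Returns non-zero if every attempt failed, so the caller can fail
# loudly instead of leaking resources.
retry_cmd() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${i}/${attempts} failed: $*" >&2
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

A caller would wrap the deletion command and check the result, for example `retry_cmd 5 10 gcloud compute instance-templates delete "${template_name}" --quiet || exit 1`, where `template_name` is an assumed variable and the attempt count and delay are arbitrary. The important part is that the last failure propagates to the caller rather than being dropped.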
Not a blocker: this is a CI / environment problem, not a Kubernetes problem.
I will work on this issue as a new contributor.
/triage accepted
@iosebisg please do! Though I will warn that this has only met our self-imposed bar for "help wanted" as opposed to "good first issue": these scripts can only be tested in CI or with a GCP account, and they are barely maintained, with limited docs. That said, any help is welcome; if you're looking for an approachable issue, though, you might look elsewhere. BTW, check out our contributor guide at https://www.kubernetes.dev/docs/
No failures showing up in testgrid anymore, appears to be resolved.
Hi folks, thanks a lot for your support and attention on this issue!
/milestone v1.34
Still intermittent in some other jobs: I don't know if we want to track that here.
Having this issue in v1.34 CI Signal Board as a Non-Blocker. Triage still shows flakes in sig-release dashboards.
Which jobs are flaking?
sig-release-master-blocking
Which tests are flaking?
kubetest.diffResources
Triage Link
Since when has it been flaking?
1/21/2025, 10:21:57 PM
1/24/2025, 10:25:44 PM
1/25/2025, 4:02:44 AM
2/2/2025, 2:27:01 PM
Testgrid link
https://testgrid.k8s.io/sig-release-master-blocking#gce-cos-master-alpha-features
Reason for failure (if possible)
{ Error: 2 leaked resources
+NAME MACHINE_TYPE PREEMPTIBLE CREATION_TIMESTAMP
+bootstrap-e2e-minion-template e2-standard-2 2025-02-02T01:02:43.323-08:00}
Anything else we need to know?
https://kubernetes.slack.com/archives/CN0K3TE2C/p1738574990672559?thread_ts=1738570442.425769&cid=CN0K3TE2C
Relevant SIG(s)
/sig testing