Cron job staggering / randomization / concurrency limiting #91652
/sig apps
Do you mean something like RandomizedDelaySec? Here is a workaround: run a command like `sleep $(shuf -i 10-20 -n 1)` before your job command.
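For anyone looking for a concrete version of that workaround, here is a minimal sketch of a CronJob manifest with the random sleep prepended to the job command. The name, image, schedule, and job command are placeholders, and it assumes `shuf` is available in the image:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: staggered-job            # hypothetical name
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: alpine:3.19       # placeholder image; assumes shuf is available in it
              command: ["/bin/sh", "-c"]
              args:
                # Sleep a random 10-20 seconds before the real work to stagger starts.
                - "sleep $(shuf -i 10-20 -n 1) && echo 'run the actual job command here'"
```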
/assign @soltysh
/cc @MikeSpreitzer
I briefly thought about the new API Priority and Fairness feature in the apiservers, but that will not directly do the trick, because your concern is not with apiserver requests but rather with running workloads in pods. A generalization of leader election to directly enforce a concurrency limit would probably do the trick, but may be heavier weight than is needed here. I like the random-delay idea.
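For context, the only concurrency knob CronJobs have today is per-object: `spec.concurrencyPolicy: Forbid` prevents a single CronJob from overlapping with its own previous run, but it cannot enforce a limit across a set of CronJobs, which is what is being asked for here. A minimal sketch of that existing field (name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: single-flight-job        # hypothetical name
spec:
  schedule: "*/10 * * * *"
  concurrencyPolicy: Forbid      # skip a new run while this CronJob's previous Job is still running
  startingDeadlineSeconds: 300   # optional: count a run as missed if it cannot start within 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: worker
              image: alpine:3.19   # placeholder image
              command: ["/bin/sh", "-c", "echo 'run the actual job command here'"]
```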
I would like to have the Jenkins syntax supported!
Randomizing execution would be awesome. +1 for 'H' (hashed) support: https://en.wikipedia.org/wiki/Cron#Non-standard_characters
The Jenkins syntax can be found at https://www.jenkins.io/doc/book/pipeline/syntax/#cron-syntax
But maybe a more modern route is to go with options like those of systemd.timer, using the systemd.time format.
It does not have the concurrency limit mentioned earlier, but AccuracySec could allow adjusting pod starts, for example based on load. With the status info on previously run Jobs, there would be potential for some smart scheduling, if anyone finds that useful.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
Is this planned?
/lifecycle stale
/lifecycle rotten
/close
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
@szuecs: Reopened this issue.
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle stale
Is this being worked on?
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
We have a use case where we run a periodic synchronization CronJob across projects, and since all of these CronJobs start their pods at the same time, database load increases significantly (the underlying database is the same for all of these projects). This load spike would be reduced if the CronJobs did not all start at the same time. For such short-lived CronJobs, the suggested "workaround" of using an initContainer with a random sleep is not ideal.
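For reference, the initContainer variant of that workaround looks roughly like the sketch below (names and images are placeholders; it assumes `shuf` is available in the init image). Presumably the objection is that for a job that itself only runs for a few seconds, the extra init container start plus the sleep add overhead comparable to the job itself:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: project-sync             # hypothetical name
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          initContainers:
            - name: jitter
              image: busybox:1.36          # placeholder image; assumes shuf is available in it
              # Random 0-60 second delay before the main container starts.
              command: ["/bin/sh", "-c", "sleep $(shuf -i 0-60 -n 1)"]
          containers:
            - name: sync
              image: my-sync-image:latest  # placeholder image
              command: ["/bin/sh", "-c", "echo 'run the actual sync here'"]
```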
Another way to resolve this would be to implement the RANDOM_DELAY variable like cronie does.
/lifecycle stale
/remove-lifecycle stale
What would you like to be added: CronJobs should have fields allowing their start time to be randomized (as systemd timers have), or even distributed within a range so they spread out in equal intervals with other CronJobs of a set. At the very least, it would be useful to specify concurrency policies like "do not run more than two Jobs with the selected annotation", so that scheduling is postponed until that policy can be met.
Why is this needed: I have a cluster with very few cores, and KubeApps adds a CronJob to sync every repo in the cluster, one for each repo, each scheduled to run every 10 minutes. Each sync would only take a few seconds per repo, but because the CronJobs for all repos run at the same time, it causes a "Three Stooges effect", and most of the synchronization jobs end up timing out for lack of available processing. If I could stagger these synchronizations, they could all succeed, and my cluster wouldn't be momentarily starved of resources while it happens.
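Until something like this exists in the API, the closest workaround is to hand-offset the schedules of the CronJobs in the set, which is roughly what a hashed or randomized field would automate. Below is a sketch with two hypothetical repo-sync jobs, both running every 10 minutes but on different minute offsets (names, images, and commands are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sync-repo-a              # hypothetical name
spec:
  schedule: "0-59/10 * * * *"    # minutes 0, 10, 20, 30, 40, 50
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: sync
              image: repo-sync:latest   # placeholder image
              command: ["/bin/sh", "-c", "echo 'sync repo-a'"]
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sync-repo-b              # hypothetical name
spec:
  schedule: "3-59/10 * * * *"    # minutes 3, 13, 23, 33, 43, 53 (offset by three minutes)
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: sync
              image: repo-sync:latest   # placeholder image
              command: ["/bin/sh", "-c", "echo 'sync repo-b'"]
```

The obvious drawback is that the offsets have to be maintained by hand (or by a templating layer) as CronJobs are added and removed, which is exactly the bookkeeping a hashed 'H' field or a random-delay field would take care of.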