Fix preempt on Phoenix, add Frontier walltime #954
Pull Request Overview
This PR improves CI job handling on HPC systems by addressing job preemption scenarios and extending walltime allocations. The changes focus on making the CI more robust when dealing with SLURM scheduler behavior on Phoenix and Frontier systems.
- Adds PREEMPTED state handling to Phoenix job submission scripts to properly fail jobs that get preempted
- Increases walltime for Frontier CI jobs from 01:59:00 to 02:59:00 (roughly 2 to 3 hours) to reduce timeout issues
- Removes emoji characters from log messages for cleaner, more professional output
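As a rough illustration of the PREEMPTED handling summarized above, a job-monitoring script might classify the state string reported by SLURM along these lines. This is a sketch under stated assumptions, not the code from this PR: the function name classify_job_state, its return strings, and the list of other failure states are hypothetical. Only the treatment of PREEMPTED as a terminal, failing state comes from the PR description.

```shell
# Hypothetical helper (not from the PR): map a SLURM job state string
# to "success", "failure", or "active" (still pending or running).
classify_job_state() {
    # sacct can report states such as "CANCELLED by 1234" or "CANCELLED+",
    # so keep only the first word and drop a trailing "+".
    _state="${1%% *}"
    _state="${_state%+}"
    case "$_state" in
        COMPLETED)
            echo "success" ;;
        # PREEMPTED is listed here so a preempted job fails the CI run
        # instead of being polled indefinitely. The other states in this
        # list are assumed for illustration.
        FAILED|CANCELLED|TIMEOUT|PREEMPTED|NODE_FAIL|OUT_OF_MEMORY|BOOT_FAIL|DEADLINE)
            echo "failure" ;;
        *)
            echo "active" ;;
    esac
}
```

In a polling loop, the state could be read with something like `sacct -j "$job_id" -X -n -P -o State` and passed to the helper; the loop would exit non-zero on "failure" and keep sleeping on "active".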
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
File | Description |
---|---|
.github/workflows/phoenix/submit.sh | Adds PREEMPTED state handling and removes emojis from status messages |
.github/workflows/phoenix/submit-bench.sh | Adds PREEMPTED state handling and removes emojis from status messages |
.github/workflows/frontier/submit.sh | Increases job walltime from 01:59:00 to 02:59:00 |
.github/workflows/frontier/submit-bench.sh | Increases job walltime from 01:59:00 to 02:59:00 |
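For context, a SLURM walltime limit like the one changed here is set with the `--time` option (short form `-t`). The fragment below is a minimal sketch that assumes the limit is given as an `#SBATCH` directive. The actual Frontier scripts may instead pass it on the `sbatch` command line, and the job name and node count are placeholders, not values from this PR.

```shell
#!/bin/bash
#SBATCH --job-name=ci-example   # placeholder name, not from the PR
#SBATCH --nodes=1               # placeholder node count, not from the PR
#SBATCH --time=02:59:00         # walltime limit as HH:MM:SS (previously 01:59:00)

# The CI build and test commands would follow here.
```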
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #954 +/- ##
=======================================
Coverage 44.06% 44.06%
=======================================
Files 68 68
Lines 18220 18220
Branches 2292 2292
=======================================
Hits 8029 8029
Misses 8821 8821
Partials 1370 1370
User description
preempt state
PR Type
Bug fix, Enhancement
Description
- Fix Phoenix job handling for PREEMPTED state
- Increase Frontier CI job walltime from 2 to 3 hours
- Replace emoji output with plain text messages
- Improve SLURM job state monitoring reliability
File Walkthrough
File | Description |
---|---|
.github/workflows/frontier/submit-bench.sh | Increase benchmark job walltime |
.github/workflows/frontier/submit.sh | Increase standard job walltime |
.github/workflows/phoenix/submit-bench.sh | Fix preemption handling and messaging: adds PREEMPTED to terminal SLURM job states |
.github/workflows/phoenix/submit.sh | Fix preemption handling and messaging: adds PREEMPTED to terminal SLURM job states |