Probe WRT on GPUs #964
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #964      +/-   ##
==========================================
- Coverage   44.08%   44.03%   -0.05%
==========================================
  Files          69       69
  Lines       19573    19630      +57
  Branches     2428     2428
==========================================
+ Hits         8628     8645      +17
- Misses       9444     9484      +40
  Partials     1501     1501
Is src/simulation/m_time_steppers.fpp [1122-1134] not a race condition on GPU?
@sbryngelson I don't think so, because each thread reads and writes only one location in the array.
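The argument in this reply can be illustrated with a small sketch. It is not the code from m_time_steppers.fpp: the arrays `q_old` and `q_new` and the function `cycle_time_levels` are hypothetical stand-ins, and Python is used only for illustration. The point is that when every loop iteration reads and writes only its own index, the iterations can run in any order (or concurrently, one per GPU thread) and still give the same result.

```python
import random


def cycle_time_levels(q_old, q_new, order):
    """Copy q_new into q_old one index at a time, visiting indices in `order`.

    Each iteration touches only element i of both arrays, so no two
    iterations ever access the same memory location.
    """
    for i in order:
        q_old[i] = q_new[i]
    return q_old


n = 8
q_new = [float(i) ** 2 for i in range(n)]

# Reference result: indices visited in ascending order.
sequential = cycle_time_levels([0.0] * n, q_new, range(n))

# Same update with the indices visited in a shuffled order, mimicking the
# arbitrary scheduling of GPU threads.
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)
permuted = cycle_time_levels([0.0] * n, q_new, shuffled)

# Order does not matter because the iterations are independent.
assert sequential == permuted == q_new
```

A race would only become possible if several iterations wrote to a shared location, for example when accumulating a sum into a single variable.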
User description
Description
Bug fix for probe_wrt on GPUs, which writes the acceleration and center of mass. Previously, GPU runs produced NaNs and performed poorly because this subroutine had not been ported to GPUs.
Checklist
... (docs/)
... examples/ that demonstrate my new feature performing as expected. They run to completion and demonstrate "interesting physics"
... ./mfc.sh format before committing my code
If your code changes any code source files (anything in src/simulation)
To make sure the code is performing as expected on GPU devices, I have:
... nvtx ranges so that they can be identified in profiles
... ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
... ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR
PR Type
Enhancement
Description
Add GPU support for probe write functionality
Implement GPU memory management for finite difference coefficients
Convert array operations to GPU-compatible loops
Add atomic operations for thread-safe center of mass calculations
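Why a center of mass calculation needs atomic updates, unlike an element-wise loop, can be shown with a minimal sketch. It assumes a one-dimensional mass-weighted average; the function `center_of_mass`, its arguments, and the thread count are hypothetical, and the lock is only a CPU-side stand-in for a GPU atomic add. The actual implementation in m_derived_variables.fpp may differ.

```python
import threading


def center_of_mass(masses, positions, num_threads=4):
    """Mass-weighted mean position, accumulated by several threads.

    All threads add into the same two accumulators, so each update must be
    protected. The lock plays the role of an atomic add on a GPU.
    """
    if not masses:
        raise ValueError("at least one cell is required")

    total = {"mass": 0.0, "moment": 0.0}
    lock = threading.Lock()

    def worker(start, stop):
        for i in range(start, stop):
            with lock:  # without this, concurrent read-modify-write could lose updates
                total["mass"] += masses[i]
                total["moment"] += masses[i] * positions[i]

    n = len(masses)
    chunk = (n + num_threads - 1) // num_threads  # ceiling division
    threads = [
        threading.Thread(target=worker, args=(t * chunk, min((t + 1) * chunk, n)))
        for t in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return total["moment"] / total["mass"]


# Usage: three cells with masses 1, 1, 2 at positions 0, 2, 1.
x_cm = center_of_mass([1.0, 1.0, 2.0], [0.0, 2.0, 1.0])
```

Because the order of floating-point additions depends on thread scheduling, results from an atomic accumulation can differ in the last bits between runs, which is worth keeping in mind when comparing GPU and CPU output.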
File Walkthrough
m_checker.fpp (src/simulation/m_checker.fpp): Prohibit probe writes with IGR
m_derived_variables.fpp (src/simulation/m_derived_variables.fpp): GPU support for derived variables computation
m_time_steppers.fpp (src/simulation/m_time_steppers.fpp): GPU-compatible time step cycling