Is a loss above 10,000 normal when training on a single machine with 8 cards? #402

Open
wenshuaishuai123 opened this issue Dec 25, 2024 · 2 comments

Comments

@wenshuaishuai123

2024-12-25 09:58:34,125 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 09:58:42,787 [INFO] Epoch 1/200, Step 100/3995, imgsize (640, 640), loss: 11.6707, lbox: 3.6262, lcls: 5.5689, dfl: 2.4756, cur_lr: 0.09924906492233276
2024-12-25 09:58:42,800 [INFO] Epoch 1/200, Step 100/3995, step time: 5736.71 ms
2024-12-25 09:59:48,228 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 09:59:48,236 [INFO] Epoch 1/200, Step 200/3995, imgsize (640, 640), loss: 8.8465, lbox: 2.9380, lcls: 3.8950, dfl: 2.0134, cur_lr: 0.09849812090396881
2024-12-25 09:59:48,241 [INFO] Epoch 1/200, Step 200/3995, step time: 654.40 ms
2024-12-25 10:00:57,649 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:00:57,655 [INFO] Epoch 1/200, Step 300/3995, imgsize (640, 640), loss: 8712.5938, lbox: 4.1268, lcls: 8705.2168, dfl: 3.2503, cur_lr: 0.09774718433618546
2024-12-25 10:00:57,659 [INFO] Epoch 1/200, Step 300/3995, step time: 694.17 ms
2024-12-25 10:02:03,713 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:02:03,724 [INFO] Epoch 1/200, Step 400/3995, imgsize (640, 640), loss: 15806.3818, lbox: 3.8463, lcls: 15799.6025, dfl: 2.9331, cur_lr: 0.0969962477684021
2024-12-25 10:02:03,733 [INFO] Epoch 1/200, Step 400/3995, step time: 660.74 ms
2024-12-25 10:03:09,462 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:03:09,468 [INFO] Epoch 1/200, Step 500/3995, imgsize (640, 640), loss: 14542.0029, lbox: 3.6080, lcls: 14535.4277, dfl: 2.9665, cur_lr: 0.09624530375003815
2024-12-25 10:03:09,472 [INFO] Epoch 1/200, Step 500/3995, step time: 657.38 ms
2024-12-25 10:04:13,982 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:04:13,988 [INFO] Epoch 1/200, Step 600/3995, imgsize (640, 640), loss: 13640.3447, lbox: 3.4523, lcls: 13633.9424, dfl: 2.9506, cur_lr: 0.09549436718225479
2024-12-25 10:04:13,992 [INFO] Epoch 1/200, Step 600/3995, step time: 645.19 ms
2024-12-25 10:05:18,784 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:05:19,022 [INFO] Epoch 1/200, Step 700/3995, imgsize (640, 640), loss: 14865.3457, lbox: 2.9226, lcls: 14859.6777, dfl: 2.7448, cur_lr: 0.09474343061447144
2024-12-25 10:05:19,026 [INFO] Epoch 1/200, Step 700/3995, step time: 650.34 ms
2024-12-25 10:06:24,077 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:06:24,083 [INFO] Epoch 1/200, Step 800/3995, imgsize (640, 640), loss: 13313.3662, lbox: 3.0049, lcls: 13307.6836, dfl: 2.6778, cur_lr: 0.09399249404668808
2024-12-25 10:06:24,087 [INFO] Epoch 1/200, Step 800/3995, step time: 650.60 ms
2024-12-25 10:07:28,845 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:07:28,851 [INFO] Epoch 1/200, Step 900/3995, imgsize (640, 640), loss: 12766.6729, lbox: 3.1789, lcls: 12761.0127, dfl: 2.4816, cur_lr: 0.09324155002832413
2024-12-25 10:07:28,855 [INFO] Epoch 1/200, Step 900/3995, step time: 647.67 ms
2024-12-25 10:08:33,407 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:08:33,412 [INFO] Epoch 1/200, Step 1000/3995, imgsize (640, 640), loss: 12249.1543, lbox: 3.0418, lcls: 12243.4072, dfl: 2.7049, cur_lr: 0.09249061346054077
2024-12-25 10:08:33,417 [INFO] Epoch 1/200, Step 1000/3995, step time: 645.61 ms
2024-12-25 10:09:37,932 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:09:37,938 [INFO] Epoch 1/200, Step 1100/3995, imgsize (640, 640), loss: 11691.6152, lbox: 2.8525, lcls: 11686.2617, dfl: 2.5009, cur_lr: 0.09173967689275742
2024-12-25 10:09:37,942 [INFO] Epoch 1/200, Step 1100/3995, step time: 645.24 ms
2024-12-25 10:10:42,984 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:10:42,990 [INFO] Epoch 1/200, Step 1200/3995, imgsize (640, 640), loss: 10623.4971, lbox: 2.7333, lcls: 10618.3301, dfl: 2.4338, cur_lr: 0.09098873287439346
2024-12-25 10:10:42,994 [INFO] Epoch 1/200, Step 1200/3995, step time: 650.52 ms
2024-12-25 10:11:48,073 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:11:48,078 [INFO] Epoch 1/200, Step 1300/3995, imgsize (640, 640), loss: 10460.2178, lbox: 2.6495, lcls: 10455.1865, dfl: 2.3816, cur_lr: 0.09023779630661011
2024-12-25 10:11:48,083 [INFO] Epoch 1/200, Step 1300/3995, step time: 650.88 ms
2024-12-25 10:12:52,814 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:12:52,819 [INFO] Epoch 1/200, Step 1400/3995, imgsize (640, 640), loss: 9755.9609, lbox: 2.7133, lcls: 9750.8545, dfl: 2.3935, cur_lr: 0.08948685973882675
2024-12-25 10:12:52,823 [INFO] Epoch 1/200, Step 1400/3995, step time: 647.40 ms
2024-12-25 10:13:57,465 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:13:57,470 [INFO] Epoch 1/200, Step 1500/3995, imgsize (640, 640), loss: 9220.7178, lbox: 2.7529, lcls: 9215.6152, dfl: 2.3498, cur_lr: 0.0887359231710434
2024-12-25 10:13:57,475 [INFO] Epoch 1/200, Step 1500/3995, step time: 646.51 ms
2024-12-25 10:15:02,098 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:15:02,104 [INFO] Epoch 1/200, Step 1600/3995, imgsize (640, 640), loss: 8534.4043, lbox: 2.8290, lcls: 8529.2344, dfl: 2.3404, cur_lr: 0.08798497915267944
2024-12-25 10:15:02,108 [INFO] Epoch 1/200, Step 1600/3995, step time: 646.32 ms
2024-12-25 10:16:07,562 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:16:07,568 [INFO] Epoch 1/200, Step 1700/3995, imgsize (640, 640), loss: 8492.0391, lbox: 2.7059, lcls: 8487.0264, dfl: 2.3066, cur_lr: 0.08723404258489609
2024-12-25 10:16:07,573 [INFO] Epoch 1/200, Step 1700/3995, step time: 654.64 ms
2024-12-25 10:17:12,422 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:17:12,427 [INFO] Epoch 1/200, Step 1800/3995, imgsize (640, 640), loss: 7517.5137, lbox: 2.8103, lcls: 7512.5049, dfl: 2.1989, cur_lr: 0.08648310601711273
2024-12-25 10:17:12,432 [INFO] Epoch 1/200, Step 1800/3995, step time: 648.58 ms
2024-12-25 10:18:17,123 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:18:17,128 [INFO] Epoch 1/200, Step 1900/3995, imgsize (640, 640), loss: 6724.4546, lbox: 2.8944, lcls: 6719.3188, dfl: 2.2414, cur_lr: 0.08573216199874878
2024-12-25 10:18:17,132 [INFO] Epoch 1/200, Step 1900/3995, step time: 647.00 ms
2024-12-25 10:19:21,831 [WARNING] overflow, still update, loss scale adjust to 1024.0
2024-12-25 10:19:21,836 [INFO] Epoch 1/200, Step 2000/3995, imgsize (640, 640), loss: 5919.8125, lbox: 2.4447, lcls: 5915.3486, dfl: 2.0192, cur_lr: 0.08498122543096542
2024-12-25 10:19:21,841 [INFO] Epoch 1/200, Step 2000/3995, step time: 647.07 ms
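
In the log above, the jump in total loss comes almost entirely from `lcls` (8705.2168 of 8712.5938 at step 300), while `lbox` and `dfl` stay in their earlier range. A minimal sketch for checking this on a full log file follows. It assumes the INFO line format shown above; the function names `parse_losses` and `dominant_component` are hypothetical helpers written for this illustration, not part of any training framework.

```python
import re

# Matches the per-step INFO lines in the log above.
# Field names (loss, lbox, lcls, dfl, cur_lr) are copied from that output.
LOSS_LINE = re.compile(
    r"Step (?P<step>\d+)/\d+, imgsize \(\d+, \d+\), "
    r"loss: (?P<loss>[\d.]+), lbox: (?P<lbox>[\d.]+), "
    r"lcls: (?P<lcls>[\d.]+), dfl: (?P<dfl>[\d.]+), "
    r"cur_lr: (?P<cur_lr>[\d.eE+-]+)"
)


def parse_losses(log_text):
    """Return one dict per loss line; lines without losses are skipped."""
    rows = []
    for line in log_text.splitlines():
        match = LOSS_LINE.search(line)
        if match is None:
            continue  # WARNING lines and "step time" lines carry no losses
        row = {key: float(value) for key, value in match.groupdict().items()}
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


def dominant_component(row):
    """Return (name, share) of the largest loss term relative to the total."""
    parts = {name: row[name] for name in ("lbox", "lcls", "dfl")}
    name = max(parts, key=parts.get)
    return name, parts[name] / row["loss"]


# Two lines copied verbatim from the log above (steps 200 and 300).
sample = (
    "2024-12-25 09:59:48,236 [INFO] Epoch 1/200, Step 200/3995, "
    "imgsize (640, 640), loss: 8.8465, lbox: 2.9380, lcls: 3.8950, "
    "dfl: 2.0134, cur_lr: 0.09849812090396881\n"
    "2024-12-25 10:00:57,655 [INFO] Epoch 1/200, Step 300/3995, "
    "imgsize (640, 640), loss: 8712.5938, lbox: 4.1268, lcls: 8705.2168, "
    "dfl: 3.2503, cur_lr: 0.09774718433618546\n"
)

rows = parse_losses(sample)
for row in rows:
    name, share = dominant_component(row)
    print(f"step {row['step']}: largest term is {name} ({share:.1%} of loss)")
```

A classification term that grows by three orders of magnitude while the box terms stay flat, together with an overflow warning on every logged step, is consistent with a learning rate that is too high for this setup, though the log alone does not prove the cause.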

@WongGawa
Collaborator

Hello, if possible, could you please provide the relevant model and some background information?

@wenshuaishuai123
Author

I'm using YOLOv8. After I lowered the learning rate to 0.0001, training became normal; the loss is now only a little over 2.
