Fix fine-tuning training loss accumulation #725
Merged
What does this PR do?
Problem:
In /src/llama_recipes/utils/train_utils.py the training loss is correctly divided by the number of gradient accumulation steps to scale down the gradient:

```python
loss = loss / gradient_accumulation_steps
```

The training loss is then accumulated:

```python
total_loss += loss.detach().float()
```

and used to calculate the average loss across all samples in the epoch:

```python
train_epoch_loss = total_loss / len(train_dataloader)
```

Because the accumulated loss has already been scaled down by gradient_accumulation_steps, and len(train_dataloader) counts every step (including the gradient accumulation ones), train_epoch_loss ends up gradient_accumulation_steps times lower than it should be.
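To make the effect concrete, here is a minimal sketch of the bug (not the actual train_utils.py loop; the values and names are illustrative):

```python
# Illustration of the reported-loss bug with made-up values.
gradient_accumulation_steps = 4
per_step_losses = [2.0] * 8          # true mean loss over the epoch is 2.0
total_loss = 0.0

for loss in per_step_losses:
    loss = loss / gradient_accumulation_steps   # scaled for gradient accumulation
    total_loss += loss                           # accumulated *after* scaling (bug)

train_epoch_loss = total_loss / len(per_step_losses)
print(train_epoch_loss)  # 0.5 instead of 2.0 -> low by a factor of gradient_accumulation_steps
```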
Solution:
Accumulate the loss

```python
total_loss += loss.detach().float()
```

before scaling it down:

```python
loss = loss / gradient_accumulation_steps
```
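A minimal sketch of the corrected ordering (assuming the same variable names as train_utils.py; the surrounding FSDP/autocast/profiling logic is omitted):

```python
for step, batch in enumerate(train_dataloader):
    loss = model(**batch).loss

    # Accumulate the unscaled loss so the epoch average is not
    # divided by gradient_accumulation_steps a second time.
    total_loss += loss.detach().float()

    # Scale the loss only for the backward pass / gradient accumulation.
    loss = loss / gradient_accumulation_steps
    loss.backward()

    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

train_epoch_loss = total_loss / len(train_dataloader)
```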
Thanks for contributing 🎉!