Track the accuracy check regression for DebertaForQuestionAnswering and nanogpt #122987

@shunting314

Description

🐛 Describe the bug

I can reproduce this locally:

time python benchmarks/dynamo/huggingface.py --backend inductor --amp --accuracy --only DebertaForQuestionAnswering --training 

The error message:

E0329 13:56:13.126000 139908396307456 torch/_dynamo/utils.py:1405] RMSE (res-fp64): 0.01772, (ref-fp64): 0.00543 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000

indicates that increasing the tolerance to 0.02 would make the test pass.

On the other hand, the dashboard run uses a different CUDA version (12.1; the local server uses 12.0), which causes different numerical results. According to the error message from the dashboard run

2024-03-28T19:23:00.3165713Z E0328 19:23:00.315000 140426862847168 torch/_dynamo/utils.py:1405] RMSE (res-fp64): 0.01778, (ref-fp64): 0.00506 and shape=torch.Size([2]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.010000

we need a tolerance of 0.03 to pass the test.
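
The two tolerance estimates can be checked with a short sketch. It assumes the accuracy check passes when `res_rmse <= multiplier * ref_rmse + tol / 10`. That condition is an assumption inferred from the logged values, not quoted from `torch/_dynamo/utils.py`, and the helper names `passes` and `min_passing_tol` are hypothetical and not part of PyTorch.

```python
# Sketch of the assumed RMSE pass condition, applied to the two log lines above.
# ASSUMPTION: pass iff res_rmse <= multiplier * ref_rmse + tol / 10.

def passes(res_rmse, ref_rmse, tol, multiplier=3.0):
    """Return True if the result error is within the assumed bound."""
    return res_rmse <= multiplier * ref_rmse + tol / 10.0

def min_passing_tol(res_rmse, ref_rmse, multiplier=3.0):
    """Smallest tol that satisfies the assumed bound (0.0 if none is needed)."""
    return max(0.0, 10.0 * (res_rmse - multiplier * ref_rmse))

# (res-fp64 RMSE, ref-fp64 RMSE) copied from the error messages.
runs = {
    "local (CUDA 12.0)": (0.01772, 0.00543),
    "dashboard (CUDA 12.1)": (0.01778, 0.00506),
}

for name, (res, ref) in runs.items():
    verdicts = {tol: passes(res, ref, tol) for tol in (0.01, 0.02, 0.03)}
    print(name, "min tol ~", round(min_passing_tol(res, ref), 4), verdicts)
```

Under this assumption the local run needs a tolerance of roughly 0.0143, so 0.02 is enough, while the dashboard run needs roughly 0.026, so only 0.03 passes.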

I've also done an ablation test. Starting from commit 57a9a64 (clean trunk), I reverted #122848, #122841, and #121692 in that order, and then the test passes again locally.

nanogpt fails for the same reasons.

Update: DebertaForQuestionAnswering has been passing the accuracy test since June 14, 2024.

Error logs

.

Minified repro

No response

Versions

.

cc @ezyang @chauhang @penguinwu @msaroufim @bdhirsh @anijain2305 @zou3519 @Chillee

Metadata

Assignees

No one assigned

    Labels

    module: pt2 accuracy
    oncall: pt2
    pt2-pass-rate-regression (Track regression of PT2 dashboard pass rate)
    triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)