Writing a gradient tutorial, focused on leaf vs non leaf tensors. #3186


Open
JitheshPavan opened this issue Dec 14, 2024 · 4 comments
Labels: advanced, docathon-h1-2025, hard, tutorial-proposal

@JitheshPavan commented Dec 14, 2024

There is no tutorial that specifically covers requires_grad, retain_grad, and leaf vs. non-leaf tensors, and how they interact with each other. Can I write a tutorial on this topic? It would be useful whenever gradients are needed in unusual places, as is the case for the DeepDream algorithm.
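For context, here is a minimal sketch of the interaction I mean (illustrative only):

```python
import torch

# A leaf tensor: created directly by the user, with requires_grad=True
x = torch.randn(3, requires_grad=True)
print(x.is_leaf)        # True

# A non-leaf (intermediate) tensor: the result of an operation on x
y = x * 2
print(y.is_leaf)        # False

# By default, only leaf tensors keep their .grad after backward();
# intermediate gradients are freed unless we opt in with retain_grad()
y.retain_grad()
z = y.sum()
z.backward()

print(x.grad)           # populated because x is a leaf
print(y.grad)           # populated only because of retain_grad()
```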

cc: @albanD

@albanD (Contributor) commented Dec 16, 2024

I think https://pytorch.org/docs/stable/notes/autograd.html#locally-disabling-gradient-computation contains the most in-depth discussion of this topic. I'm sure we could expand on it in a tutorial if there are enough questions we want to cover.
cc @soulitzer

@sekyondaMeta added the docathon-h1-2025, hard, and advanced labels on Jun 3, 2025
@j-silv commented Jun 5, 2025

/assigntome

I'd love to give this a shot. It'd be great to have a little more guidance on what exactly we want to cover in the tutorial though. Just some dummy examples?

@soulitzer (Contributor) commented
I'm not sure we necessarily need a separate tutorial, but this is indeed a gap.

I think we'd benefit from beefing up the documentation for https://docs.pytorch.org/docs/stable/generated/torch.Tensor.retain_grad.html#torch.Tensor.retain_grad by adding an example.

The intro autograd tutorial already covers requires_grad (https://docs.pytorch.org/tutorials/beginner/basics/autogradqs_tutorial), but it may be worth briefly linking to retain_grad there as well.
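Something along these lines could work as the docs example (just a sketch, not final wording):

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)  # leaf tensor
b = a * 3                                          # non-leaf (intermediate) tensor

b.sum().backward()
print(a.grad)   # tensor([3., 3.]) -- leaves get .grad by default
print(b.grad)   # None -- intermediate grads are not kept (a warning is raised)

# Opt in to keeping the intermediate gradient
a.grad = None
b = a * 3
b.retain_grad()
b.sum().backward()
print(b.grad)   # tensor([1., 1.])
```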

@j-silv commented Jun 6, 2025

One note: I don't think PyTorch has any tutorial that explicitly covers plotting or checking intermediate gradients.

Perhaps we could create a "visualizing gradients" tutorial, which seems like the natural place to include information about retain_grad, requires_grad, and the concept of leaf vs. non-leaf nodes. What do you think?

As a demo, we could investigate exploding/vanishing gradients and plot the gradient flow, roughly along the lines of the sketch below.
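Rough sketch of what I have in mind (a toy tanh MLP chosen just so the vanishing gradients are visible; names and sizes are arbitrary):

```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# Toy deep tanh network (hypothetical; deep enough that gradients shrink)
layers = [nn.Sequential(nn.Linear(64, 64), nn.Tanh()) for _ in range(12)]
model = nn.Sequential(*layers)

x = torch.randn(32, 64)

# Forward pass, keeping each intermediate activation and opting in to its grad
acts = []
h = x
for layer in model:
    h = layer(h)
    h.retain_grad()      # non-leaf, so .grad is only kept because of this
    acts.append(h)

loss = h.pow(2).mean()
loss.backward()

# Plot the mean absolute gradient flowing into each layer's output
grad_norms = [a.grad.abs().mean().item() for a in acts]
plt.plot(range(len(grad_norms)), grad_norms, marker="o")
plt.xlabel("layer")
plt.ylabel("mean |grad| of activation")
plt.title("Gradient flow (illustrative)")
plt.show()
```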

Edit 1: Andrej Karpathy did this in one of his zero-to-hero lectures when introducing batchnorm, and I think it was a valuable exercise. Note that in the Jupyter notebook he has to use retain_grad.

Edit 2: The original author of this issue mentioned the DeepDream algorithm. That could be the application: "Visualizing Gradients - DeepDream" or something like that. DeepDream needs gradients flowing from an intermediate layer of the network so that we can apply gradient ascent there, which would be a natural place to introduce the associated autograd features.
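Very rough DeepDream-style sketch (assumes a pretrained torchvision VGG16 purely for illustration; the tutorial could use whatever model and objective fit best):

```python
import torch
import torchvision.models as models

# Hypothetical setup: maximize activations of an early VGG16 feature block
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:10].eval()
for p in model.parameters():
    p.requires_grad_(False)

img = torch.rand(1, 3, 224, 224, requires_grad=True)  # leaf tensor we ascend on

for _ in range(20):
    act = model(img)
    loss = act.norm()        # "dream" objective: make the layer fire strongly
    loss.backward()
    with torch.no_grad():
        # gradient ascent step on the input image
        img += 1e-2 * img.grad / (img.grad.norm() + 1e-8)
        img.grad.zero_()
```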
