Why can’t non-finite values be detected in the forward pass of a PyTorch Lightning model?
I’m working on a Vision Transformer using the PyTorch Lightning framework. I ran into an issue where the gradients and predictions become NaN after a few iterations of the training step.
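
For reference, here is a minimal sketch of one way to check the forward pass for non-finite values, using plain PyTorch forward hooks. The helper name `add_nonfinite_hooks` and the `toy_model` are hypothetical and stand in for the actual Vision Transformer, which is not shown here. The hook only inspects outputs that are a single tensor, so modules returning tuples or dicts are skipped in this sketch.

```python
import torch
from torch import nn


def add_nonfinite_hooks(model: nn.Module) -> list:
    """Register a forward hook on every submodule that raises on NaN/Inf output.

    Hypothetical helper for illustration; returns the hook handles so the
    hooks can be removed later with handle.remove().
    """
    handles = []

    def make_hook(name: str):
        def hook(module, inputs, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                raise RuntimeError(
                    f"non-finite output in module '{name}' ({type(module).__name__})"
                )
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    return handles


# Toy stand-in for the real model (assumption: any nn.Module works the same way).
toy_model = nn.Sequential(nn.Linear(4, 2), nn.ReLU())
```

Calling `add_nonfinite_hooks(model)` once before training should make the first submodule that produces a NaN or Inf raise a `RuntimeError` naming that submodule, which narrows down where the values first go bad. Since a `LightningModule` is itself an `nn.Module`, the same helper can be applied to it directly.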