How to solve the exploding gradient problem in VAE training?
I am trying to implement a VAE on the CelebA dataset, based on the TensorFlow VAE implementation for MNIST. I have tried varying the batch size, but it has no apparent effect. The generated images are mostly grey almost every time. Ideally, both the KL divergence and the reconstruction loss should be close to zero, but in my case both are increasing exponentially.
This is the loss curve I am getting.
Here is the loss function definition: