LSTM train and test losses diverge drastically
I’m trying to train a model that can predict UFC fights. My data is set up so that, for each bout, it pairs the historical fights of a fighter and their opponent, matching their most recent fights together and working back to the oldest; I add padding if one fighter has more past fights than the other. My problem, though, is that after a number of epochs my train and test losses diverge rapidly. I have already tried a fair number of regularization techniques, such as a learning rate scheduler, weight decay, and gradient clipping, but I am unable to get normal loss behavior. What other techniques can I try? A rough sketch of the kind of setup I mean is below.
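To make the setup concrete, here is a minimal sketch (not my exact code), assuming PyTorch: variable-length fight histories are zero-padded and packed before the LSTM, and weight decay, a learning-rate scheduler, and gradient clipping are applied during training. The feature size, hidden size, and hyperparameter values are placeholders.

```python
import torch
import torch.nn as nn

class FightLSTM(nn.Module):
    def __init__(self, n_features, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # logit for "fighter A wins"

    def forward(self, x, lengths):
        # pack so the LSTM ignores the zero padding added to shorter histories
        packed = nn.utils.rnn.pack_padded_sequence(
            x, lengths.cpu(), batch_first=True, enforce_sorted=False)
        _, (h_n, _) = self.lstm(packed)
        return self.head(h_n[-1]).squeeze(-1)

model = FightLSTM(n_features=32)  # placeholder feature count
criterion = nn.BCEWithLogitsLoss()
# weight decay via AdamW
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
# learning-rate scheduler; stepped once per epoch on the validation loss elsewhere
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=3)

def train_step(x, lengths, y):
    optimizer.zero_grad()
    loss = criterion(model(x, lengths), y)
    loss.backward()
    # gradient clipping, as mentioned above
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```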
Train error decreases consistently, but test error does not, even when the test dataset is a subset of the training dataset
My data consist of 6 features coming from sensors. I am training an LSTM network on this data to predict three values.
During training, the training loss decreased consistently with each epoch, but the test loss did not decrease much after a couple of epochs.
This was the case when there was no overlap between the training and test data, so I tried using a subset of the training data as the test data.
But I still saw the same behavior: the test loss was not decreasing.
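For clarity, this is roughly how I compute the test loss on that training subset (a minimal sketch assuming a PyTorch-style setup; `train_dataset`, `model`, and `criterion` are placeholder names for my actual objects):

```python
import torch
from torch.utils.data import Subset, DataLoader

# carve a small subset out of the training data and measure the loss on it
# the same way the test loss is measured, to compare against the train loss
eval_subset = Subset(train_dataset, list(range(0, len(train_dataset), 10)))
eval_loader = DataLoader(eval_subset, batch_size=64, shuffle=False)

def evaluate(model, criterion):
    model.eval()  # evaluation mode (affects dropout/batch norm, if any)
    total, n = 0.0, 0
    with torch.no_grad():
        for x, y in eval_loader:
            pred = model(x)  # maps the 6 sensor features to the 3 target values
            total += criterion(pred, y).item() * len(x)
            n += len(x)
    model.train()
    return total / n
```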