Tags: python, pytorch, nlp, autoencoder, transformer-model

Issue with PyTorch’s Transformer model repeating the last token during inference

I’ve been trying to build a simple model using PyTorch’s nn.TransformerEncoder and nn.TransformerDecoder modules, but I’m running into an issue I can’t resolve: during inference the model only reproduces the last token fed into it.
For example, given the input tensor [1,2,3,4,5], the model continues the sequence as [1,2,3,4,5,5,5,5,5,5,…], and given [5,2,8,3] it produces [5,2,8,3,3,3,3,3,3,3,…]. This happens even when I use training data as input. A freshly initialized, untrained model does produce diverse output, but since it isn’t trained that output is useless.
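Since the question does not include the original code, the sketch below only illustrates roughly the kind of setup being described: an nn.TransformerEncoder/nn.TransformerDecoder pair with a greedy decoding loop that feeds each predicted token back in. The class name ToySeq2Seq, the dimensions, and the greedy_decode helper are all hypothetical, not the asker's actual implementation.

```python
import torch
import torch.nn as nn

class ToySeq2Seq(nn.Module):
    """Minimal encoder/decoder sketch (positional encoding omitted for brevity)."""
    def __init__(self, vocab_size=100, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        # Causal mask so the decoder cannot attend to future target positions.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        memory = self.encoder(self.embed(src))
        hidden = self.decoder(self.embed(tgt), memory, tgt_mask=tgt_mask)
        return self.out(hidden)

def greedy_decode(model, src, start_token, max_len=10):
    """Greedy inference loop: append the argmax token one step at a time."""
    model.eval()
    tgt = torch.tensor([[start_token]])
    with torch.no_grad():
        for _ in range(max_len):
            logits = model(src, tgt)
            next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
            tgt = torch.cat([tgt, next_token], dim=1)
    return tgt

# The symptom described above: the continuation just repeats the last input token.
src = torch.tensor([[1, 2, 3, 4, 5]])
print(greedy_decode(ToySeq2Seq(), src, start_token=1))
```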