How do I fix the perplexity of my bigram language model with Good-Turing smoothing, which increases with the size of the training set?
For an assignment in my master's course, we were asked to build bigram and trigram language models and apply different forms of smoothing to see how each method affects the model's perplexity. The smoothing methods are Good-Turing and Kneser-Ney, but I am stuck on the Good-Turing method.
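For reference, these are the standard formulas I understand Good-Turing smoothing and perplexity to use, where N_c is the number of bigram types that occur exactly c times in the training data and N is the number of tokens in the test set:

    c* = (c + 1) * N_{c+1} / N_c          (Good-Turing adjusted count for a bigram seen c times)
    PP(W) = P(w_1 w_2 ... w_N)^(-1/N)     (perplexity of the test set W)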