Tag Archive for nlp, gpt-2, perplexity

Why is perplexity not reliable for open-domain text generation tasks?

The paper here says that perplexity, as an automated metric, is not reliable for open-domain text generation tasks. Instead, it uses lm-score, a model-based metric that produces perplexity-like values. What additional benefits does lm-score offer over the perplexity metric?
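
For reference, by "perplexity" I mean the standard quantity: the exponential of the average negative log-probability a language model assigns to each token. Below is a minimal sketch of that definition only. The function name `perplexity` and the sample log-probabilities are made up for illustration, and this is not the paper's lm-score, whose exact computation I am not reproducing here.

```python
import math

def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Hypothetical values: a model that gives every token probability 0.25.
uniform_log_probs = [math.log(0.25)] * 4
print(perplexity(uniform_log_probs))
```

With every token at probability 0.25 the result is approximately 4, which matches the usual reading of perplexity as an effective branching factor.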