Sample weights for loss computation with a Hugging Face transformers model
I'm training a GPT2LMHeadModel in Python using Hugging Face's transformers library. The task is next-token prediction. If I understand correctly, when this model is given a labels argument, it automatically computes the loss over each (input, next_token) pair, i.e. for the input [1, 2, 3] it computes the loss for ([1], [2]) and ([1, 2], [3]).
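For reference, the shifted loss described above can be reproduced outside the model. The sketch below uses random logits as a stand-in for the model's output (the tensor shapes and vocab size are illustrative assumptions), and computes the per-token cross-entropy with `reduction="none"`, which is the natural place to apply per-token or per-sample weights before averaging:

```python
import torch
import torch.nn.functional as F

# Illustrative stand-ins: batch of 1, sequence [1, 2, 3], vocab size 5.
input_ids = torch.tensor([[1, 2, 3]])
logits = torch.randn(1, 3, 5)  # would come from model(input_ids).logits

# Logits at position t predict the token at position t+1, so drop the
# last logit and the first label before computing cross-entropy.
shift_logits = logits[:, :-1, :]   # predictions for the 2nd and 3rd tokens
shift_labels = input_ids[:, 1:]    # targets: [2, 3]

# Per-token loss: one entry per (prefix, next_token) pair,
# i.e. ([1], [2]) and ([1, 2], [3]).
per_token = F.cross_entropy(
    shift_logits.reshape(-1, 5),
    shift_labels.reshape(-1),
    reduction="none",
)

# Example per-token weights (hypothetical values) applied before averaging.
weights = torch.tensor([1.0, 2.0])
weighted_loss = (per_token * weights).sum() / weights.sum()

# The unweighted mean matches the model's default averaging behavior.
loss = per_token.mean()
```

The unweighted `per_token.mean()` corresponds to the scalar loss the model returns when labels are supplied; replacing the mean with a weighted average is one way to introduce sample weights.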