Are there any potential issues training a T5-small from scratch on a task with very limited vocabulary?

Suppose you would like to train a sequence-to-sequence model like T5-small from scratch on a task whose vocabulary is quite limited compared to that of the T5 tokenizer, which was trained on a much larger vocabulary.

For instance, the data have the following format:

Can you please add A and B?

e.g.
Can you please add 45 and 56?
Can you please add 87 and 34?

A and B are just placeholders for integer numbers.
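For concreteness, here is a minimal sketch of how such training pairs could be generated (the target being the sum is an assumption on my part; `make_example` is a hypothetical helper):

```python
import random

def make_example(rng: random.Random) -> tuple[str, str]:
    # Assumption: the target is the sum of the two integers.
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    return f"Can you please add {a} and {b}?", str(a + b)

rng = random.Random(0)
pairs = [make_example(rng) for _ in range(10_000)]
```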

By contrast, the T5 tokenizer was trained to represent a vocabulary of roughly 32K tokens.
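One way to see the mismatch is to run an example through the pretrained tokenizer (a quick check using Hugging Face `transformers`; the exact pieces may differ):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")
ids = tok("Can you please add 45 and 56?").input_ids

print(len(tok))                        # full vocabulary size, roughly 32K
print(tok.convert_ids_to_tokens(ids))  # only a handful of distinct pieces
```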

What considerations and issues should be taken into account, given that only a few tokens in the data change each time?

Basically, only the tokens A and B vary between examples.

Is that still possible?
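For reference, instantiating T5-small from scratch with a shrunken vocabulary is mechanically straightforward in Hugging Face `transformers`; a minimal sketch, assuming a small task-specific tokenizer has already been trained (`SMALL_VOCAB` is a hypothetical size):

```python
from transformers import T5Config, T5ForConditionalGeneration

SMALL_VOCAB = 32  # hypothetical: digits, the fixed task words, special tokens

config = T5Config.from_pretrained("t5-small")  # reuse the t5-small architecture
config.vocab_size = SMALL_VOCAB                # shrink embedding/output layers

model = T5ForConditionalGeneration(config)     # random init, no pretrained weights
print(model.num_parameters())
```

Since T5 ties the input embeddings to the output layer, shrinking `vocab_size` removes a sizable fraction of t5-small's parameters in one step.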
