Error loading HF model and tokenizer from local files: ‘tokenizer’ is not defined
I’m trying to load a HF model (Mistral-7B-v0.1) and its tokenizer offline, from local files, using the transformers library. I’ve downloaded the whole repository (config, model weights, tokenizer files, etc.), but the code I’m using to load them fails with: NameError: name ‘tokenizer’ is not defined.
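The poster’s original snippet isn’t reproduced here. For reference, a minimal sketch of the standard offline-loading pattern (not the poster’s code; the local path is a placeholder). A NameError on `tokenizer` usually means the assignment line never ran, e.g. it raised an exception, or `tokenizer` was referenced before `from_pretrained` was called:

```python
# Minimal offline-loading sketch (assumed, not the poster's code); path is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

local_path = "/path/to/Mistral-7B-v0.1"  # directory with config.json, weights, tokenizer files

# Assign the tokenizer BEFORE it is used anywhere; a NameError means this line
# never executed (or raised) before the first reference to `tokenizer`.
tokenizer = AutoTokenizer.from_pretrained(local_path, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(local_path, local_files_only=True)

print(tokenizer("hello", return_tensors="pt"))
```

`local_files_only=True` ensures transformers never tries to reach the Hub, so a missing file fails loudly instead of silently falling back to a download.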
Failed to import transformers.integrations.peft
```
RuntimeError: Failed to import transformers.models.bert.modeling_bert because of the following error (look up to see its traceback):
Failed to import transformers.integrations.peft because of the following error (look up to see its traceback):
/usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
```
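For context: an undefined C++ symbol in flash_attn_2_cuda almost always indicates an ABI mismatch, i.e. flash-attn was compiled against a different PyTorch build than the one currently installed (often after a torch upgrade). A hedged diagnostic sketch to confirm:

```python
# Diagnostic sketch: check whether the installed torch and flash-attn builds match.
import torch

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)

try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError as e:
    # The undefined-symbol error above surfaces here as an ImportError.
    print("flash-attn import failed:", e)
```

If the import fails as above, rebuilding flash-attn against the current torch (e.g. `pip install flash-attn --no-build-isolation` after uninstalling the old wheel) typically resolves it.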
Memory Error While Fine-tuning a 13B-Parameter Model on 8 H100 GPUs
I am currently trying to fine-tune the AYA model (12.95B parameters) on 8 H100 GPUs, but I’m encountering an out-of-memory error. My system has 640 GB of total GPU memory (8 × 80 GB), which I assumed would be sufficient for this task. I’m not using PEFT or LoRA, and my batch size is set to 1.
I’m wondering if anyone has encountered a similar issue and could provide some guidance. How many GPUs are typically recommended for this task? Any help would be greatly appreciated.
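For what it’s worth, the aggregate 640 GB is usually not the binding constraint: without parameter/optimizer sharding, plain data parallelism replicates the full training state on every 80 GB card. A hedged back-of-envelope sketch of the arithmetic, assuming AdamW with bf16 mixed precision:

```python
# Back-of-envelope memory estimate for full fine-tuning (~13B params, AdamW,
# bf16 mixed precision). All figures exclude activations and framework overhead.
params = 12.95e9

bytes_per_param = (
    2      # bf16 weights
    + 2    # bf16 gradients
    + 4    # fp32 master copy of weights
    + 4    # fp32 Adam first moment
    + 4    # fp32 Adam second moment
)  # = 16 bytes/param

total_gb = params * bytes_per_param / 1e9
print(f"model + grads + optimizer: ~{total_gb:.0f} GB")          # ~207 GB

# Plain DDP keeps all of this on EVERY GPU -> far over 80 GB per H100.
print(f"per GPU with DDP: ~{total_gb:.0f} GB (OOM on 80 GB)")

# Fully sharded (DeepSpeed ZeRO-3 or PyTorch FSDP) splits it across 8 GPUs.
print(f"per GPU fully sharded: ~{total_gb / 8:.0f} GB + activations")
```

Under these assumptions the fix is sharding rather than more GPUs: DeepSpeed ZeRO-3 or PyTorch FSDP (both usable through the HF Trainer / accelerate), plus gradient checkpointing, should bring the per-GPU footprint comfortably under 80 GB.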