Tag Archive for: nlp, artificial-intelligence, large-language-model, llama, fine-tuning

LLaMA 3.1 Fine-tuning with QLoRA – CUDA Out of Memory Error

I’m trying to fine-tune the LLaMA 3.1 8B model with QLoRA, loading it in 4-bit via the bitsandbytes library, on a mental-health conversations dataset from Hugging Face. However, when I run the training code I encounter a torch.cuda.OutOfMemoryError. I’ve tried using multiple GPUs and GPUs with more memory, but the error persists.
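For reference, a minimal QLoRA setup along these lines might look like the sketch below, using `transformers` and `peft`. The checkpoint id, LoRA hyperparameters, and training arguments are assumptions, not the asker's actual code; the memory-related knobs (gradient checkpointing, batch size 1 with gradient accumulation, a paged 8-bit optimizer) are the usual first things to try when 4-bit loading alone still runs out of memory, since activations and optimizer state, not the quantized weights, often dominate VRAM usage.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "meta-llama/Meta-Llama-3.1-8B"  # assumed checkpoint id

# 4-bit NF4 quantization via bitsandbytes, as described in the question.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,       # second-level quantization saves a bit more memory
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",                    # spread layers across available GPUs
)

# Gradient checkpointing trades compute for activation memory,
# which is frequently the actual source of the OOM.
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

# Assumed LoRA hyperparameters; attention projections are a common choice.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="qlora-out",
    per_device_train_batch_size=1,        # keep the micro-batch tiny...
    gradient_accumulation_steps=16,       # ...and recover the effective batch size here
    optim="paged_adamw_8bit",             # paged 8-bit optimizer state
    bf16=True,
    num_train_epochs=1,
)
```

Shortening the maximum sequence length during tokenization is another high-impact lever, since activation memory grows with sequence length. This is a configuration fragment that requires a CUDA GPU and access to the gated LLaMA weights, so it is not runnable as-is on arbitrary hardware.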