Tag Archive for gpu, nvidia, large-language-model, llama-cpp-python

Utilising GPU on vast.ai instead of just CPU

I am running inference on a vast.ai instance with NVIDIA-SMI 560.28.03 and CUDA Version 12.6. I am using llama.cpp to run a GGUF version of Mistral. When I run my code, it uses only the CPU.
Any help getting it to run on the GPU is appreciated.
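
For reference, with the llama-cpp-python bindings the model is only offloaded to the GPU if the package was built with CUDA support (for example by reinstalling with `CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python`; older releases used `-DLLAMA_CUBLAS=on`) and if `n_gpu_layers` is set when constructing the model. Below is a minimal sketch of that pattern, not the asker's actual code; the model path and prompt are placeholders.

```python
from llama_cpp import Llama

# Assumes a CUDA-enabled build of llama-cpp-python and a locally downloaded GGUF file.
llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path to the GGUF model
    n_gpu_layers=-1,  # offload all layers to the GPU; use a smaller number if VRAM is limited
    n_ctx=4096,       # context window size
    verbose=True,     # prints the backend/offload report, so you can confirm layers land on CUDA
)

output = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
print(output["choices"][0]["text"])
```

With `verbose=True`, the load log should report how many layers were offloaded; if it shows zero, the installed wheel was most likely built without CUDA and needs to be reinstalled as above.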