Tag Archive for streamlit, large-language-model, ollama, pgvector, llama3

How to reduce latency in a RAG-based local chatbot for PDFs (Ollama, Llama3, pgvector)?

I am trying to build a local chatbot for PDFs using RAG, Ollama, Llama3, pgvector and Streamlit. It works, but the time to generate the first token is about 262.5005 s or even more. I don't have a GPU; I am working on Windows 11 on a CPU-only machine with 16 GB of RAM. When I run the app and upload a PDF, it takes almost 7-8 minutes to respond to each query. Is there a way to preprocess the PDFs (about 1000 of them) beforehand and then load them into the vector database? Any suggestions would be helpful.
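
To make the preprocessing idea concrete, here is a minimal sketch of an offline ingestion step that chunks already-extracted PDF text, embeds each chunk once, and skips chunks that were stored before. The names `chunk_text`, `ingest`, `embed`, `store` and `seen` are hypothetical and not from any library. `embed` and `store` are placeholders I am assuming would wrap an Ollama embedding call and an INSERT into a pgvector table, and PDF text extraction is assumed to happen before this step.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List, Set, Tuple


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> Iterator[str]:
    """Yield overlapping character windows over `text`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size].strip()
        if piece:
            yield piece
        if start + size >= len(text):
            break  # the last window already reached the end of the text


def ingest(
    docs: Iterable[Tuple[str, str]],                # (doc_id, extracted text) pairs
    embed: Callable[[str], List[float]],            # hypothetical: text -> vector
    store: Callable[[str, str, List[float]], None], # hypothetical: write one row
    seen: Set[str],                                 # hashes of chunks already stored
    size: int = 800,
    overlap: int = 100,
) -> int:
    """Embed and store every chunk not seen before; return how many were added."""
    added = 0
    for doc_id, text in docs:
        for piece in chunk_text(text, size, overlap):
            key = hashlib.sha256(piece.encode("utf-8")).hexdigest()
            if key in seen:
                continue  # already embedded in an earlier run
            store(doc_id, piece, embed(piece))
            seen.add(key)
            added += 1
    return added
```

If something like this ran as a batch job before the Streamlit app starts, each query would only need to embed the question and search the stored vectors, instead of re-processing the uploaded PDF. It would not change how long Llama3 itself takes to generate tokens on a CPU.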
