Splitting HTML File and Saving Chunks using LangChain

I’m very new to LangChain. I have around 100-150 HTML files on my local disk that I need to upload to a server for NLP model training. Each uploaded file may contain at most 20K characters, so I have to split the content of my files into chunks that fit within that limit. I’m trying to do this with the LangChain library, but so far I haven’t managed to split the files into chunks of the size I want. For reference,