Python PDF searcher runs out of RAM
As part of my program, I’m using the third-party pdfminer library in Python to open and read PDF pages, and then using regular expressions to search for specific patterns. I’m also using multiprocessing to parallelize this, because I have a large number of PDFs to analyze. Each process should handle a single PDF.
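
For reference, the setup described above might look roughly like the sketch below. It is not the actual code from the program; it assumes the pdfminer.six fork (its `extract_pages` and `LTTextContainer`), and the names `iter_page_texts`, `search_pdf`, and `search_many` are hypothetical. Pages are read one at a time, and `maxtasksperchild=1` is one way to make each worker process handle exactly one PDF before being replaced.

```python
import re
from functools import partial
from multiprocessing import Pool


def iter_page_texts(path):
    """Yield the text of one page at a time instead of the whole document."""
    # Imported lazily so the rest of the sketch does not require pdfminer.six.
    # Assumption: the pdfminer.six fork is installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    for page_layout in extract_pages(path):
        yield "".join(
            element.get_text()
            for element in page_layout
            if isinstance(element, LTTextContainer)
        )


def search_pdf(path, pattern, page_texts=iter_page_texts):
    """Return (path, [(page_number, matched_text), ...]) for a single PDF."""
    regex = re.compile(pattern)
    hits = []
    for page_number, text in enumerate(page_texts(path), start=1):
        hits.extend((page_number, m.group(0)) for m in regex.finditer(text))
    return path, hits


def search_many(paths, pattern, processes=4):
    """Search each PDF in its own task and yield results as they complete."""
    worker = partial(search_pdf, pattern=pattern)
    # maxtasksperchild=1 replaces each worker process after one PDF.
    with Pool(processes=processes, maxtasksperchild=1) as pool:
        # imap_unordered hands back results as they finish rather than
        # collecting every result in a list first.
        for path, hits in pool.imap_unordered(worker, paths):
            yield path, hits
```

`search_many` would need to be called from under an `if __name__ == "__main__":` guard, as multiprocessing requires on platforms that spawn rather than fork worker processes.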