PySpark running out of memory using splink
I am using the splink library for user clustering.
I am currently using DuckDB as my database, but I read that Spark is more efficient and wanted to try it.
The problem is that I get an OutOfMemoryError, even though I’m just using the code from the splink documentation.
This is the code: