Fuzzy and exact matching using Arrow and DuckDB in R
I have a large dataset of over 43 million rows (3.84 GB) and a second dataset of over 6,000 rows (459 KB). I am trying to do an inner_join() on two columns: an exact match on a common id and a fuzzy match on fullname. I tried the following, based on the information I found in this post, but I am running into memory issues: