Efficient way to perform large spatial join between expansive point dataset and global country polygons?
I would like to partition the Overture Maps global building polygon dataset by country. I’m currently attempting to do this in a distributed manner with Apache Sedona, by spatially joining the building polygons (over 2B rows) to the country polygons (around 200 rows, but with very complex geometries that have many vertices) to assign a ‘Country’ column to each building. I tried converting the building polygons to centroids so that I can do a simpler point-in-polygon join rather than a polygon-in-polygon join. However, the job is taking a very long time to run: when I last checked, it had been running for over 24 hours, and I killed it to conserve resources.
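To make the logic concrete, here is a toy, single-machine sketch in plain Python of the centroid-then-point-in-polygon assignment described above. It is not the actual Sedona job: the function names (`centroid`, `contains`, `assign_country`) and the sample squares are invented for illustration, and it ignores holes, multipolygons, and any spatial indexing. In Sedona SQL, the corresponding functions would be `ST_Centroid` and `ST_Contains`.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # polygon exterior as (x, y) vertices, not repeated at the end


def centroid(ring: Ring) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    n = len(ring)
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0.0:
        # Degenerate polygon: fall back to the mean of the vertices.
        return (sum(p[0] for p in ring) / n, sum(p[1] for p in ring) / n)
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def contains(ring: Ring, pt: Point) -> bool:
    """Ray-casting point-in-polygon test against a single exterior ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def assign_country(building: Ring, countries: Dict[str, Ring]) -> Optional[str]:
    """Return the first country whose polygon contains the building centroid."""
    c = centroid(building)
    for name, ring in countries.items():
        if contains(ring, c):
            return name
    return None


# Hypothetical sample data: two square "countries" and one small building.
countries = {
    "A": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    "B": [(10.0, 0.0), (20.0, 0.0), (20.0, 10.0), (10.0, 10.0)],
}
building = [(12.0, 4.0), (14.0, 4.0), (14.0, 6.0), (12.0, 6.0)]
print(assign_country(building, countries))
```

The per-point cost of `contains` grows with the number of vertices in each country ring, which is why the complexity of the country geometries matters even though there are only about 200 of them.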