Tag Archive for apache-spark, apache-iceberg

Extremely slow MERGE INTO statements

We have an Apache Iceberg data lake. We are using Structured Streaming and receive micro-batches of approximately 10 records. When we merge such a batch DataFrame into a table of approximately 600 records, the merge takes approximately 2 minutes of wall-clock time. The job runs on an EMR cluster that is not under load.
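
For context, the setup described above (merging small streaming micro-batches into an Iceberg table) might look roughly like the following sketch. The table name `glue.db.events`, the temp view name `updates`, and the key column `id` are assumptions for illustration and do not come from the question. It also assumes PySpark 3.3 or later with the Iceberg SQL extensions enabled.

```python
def build_merge_sql(target, source_view, key_cols):
    """Build a MERGE INTO statement that upserts source rows by key."""
    # Join target (t) and source (s) on every key column.
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    return (
        f"MERGE INTO {target} t "
        f"USING {source_view} s "
        f"ON {on_clause} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


def merge_batch(batch_df, batch_id):
    """foreachBatch callback: merge one micro-batch into the target table."""
    # Register the micro-batch as a temp view so SQL can reference it.
    batch_df.createOrReplaceTempView("updates")
    # Hypothetical table and key names; replace with the real ones.
    sql = build_merge_sql("glue.db.events", "updates", ["id"])
    # Use the batch's own session so the temp view is visible to the query.
    batch_df.sparkSession.sql(sql)
```

The callback would be attached with something like `stream_df.writeStream.foreachBatch(merge_batch).start()`, where `stream_df` is a hypothetical streaming DataFrame. Each micro-batch then triggers one `MERGE INTO` and one Iceberg commit.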

Why is Spark SQL running extremely slow?

I’m using Spark SQL to run a simple query against my Iceberg table. Here is some information about the table itself, since it might be useful (state as of the time of posting this question):