Can we create multiple Spark executors within a single driver node on a Databricks cluster?
I have a power user compute with a single driver node, and I'm trying to parallelize forecasting across multiple series by aggregating the data, doing a groupBy on the series key, and then applying a forecasting function to each group.
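To make the setup concrete, here is a minimal sketch of the groupBy + applyInPandas pattern I mean. The column names (series_id, ds, y) and the trivial linear-trend "forecast" are placeholders, not my actual data or model:

```python
import numpy as np
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy data: one row per (series, time step) observation (placeholder columns).
df = spark.createDataFrame(
    [("a", 1, 10.0), ("a", 2, 12.0), ("a", 3, 14.0),
     ("b", 1, 5.0), ("b", 2, 4.0), ("b", 3, 3.0)],
    ["series_id", "ds", "y"],
)

def forecast_one_series(pdf: pd.DataFrame) -> pd.DataFrame:
    # Each call receives all rows for one series as a pandas DataFrame.
    # Fit a trivial linear trend and predict one step ahead (stand-in for the real model).
    slope, intercept = np.polyfit(pdf["ds"], pdf["y"], deg=1)
    next_ds = int(pdf["ds"].max()) + 1
    return pd.DataFrame({
        "series_id": [pdf["series_id"].iloc[0]],
        "ds": [next_ds],
        "yhat": [slope * next_ds + intercept],
    })

forecasts = (
    df.groupBy("series_id")
      .applyInPandas(forecast_one_series,
                     schema="series_id string, ds int, yhat double")
)
forecasts.show()
```

As I understand it, on a single-node cluster the driver node also hosts the single executor, so each group becomes a task and parallelism is bounded by the driver's cores and the number of shuffle partitions rather than by separate executor processes.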
PySpark transformation causing out-of-memory issues
I have a Spark DataFrame with multiple columns that are complex structs. I'm trying to transform the value of a field in one of the struct columns based on the value of a field in another struct column. I'm working with Spark 3.5, so I'm using the withField function. The transformation is along the lines of the sketch below.
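As a minimal illustration of that kind of withField rewrite (the struct names account/profile and fields tier/status are placeholders, not my real schema):

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

# Toy frame with two struct columns (placeholder names and fields).
df = spark.createDataFrame(
    [(("gold",), ("active",)), (("silver",), ("suspended",))],
    "account struct<tier:string>, profile struct<status:string>",
)

# Rewrite profile.status based on account.tier, keeping the rest of the
# profile struct intact via Column.withField (available since Spark 3.1).
df2 = df.withColumn(
    "profile",
    F.col("profile").withField(
        "status",
        F.when(F.col("account.tier") == "gold", F.lit("priority"))
         .otherwise(F.col("profile.status")),
    ),
)
df2.show(truncate=False)
```

Since withField returns the rebuilt struct, the whole struct column gets reassigned with withColumn rather than a single nested field being mutated in place.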