Tag Archive for python, apache-spark, pyspark, apache-spark-sql

Need help installing PySpark on Windows 10

I am trying to install PySpark on my laptop and followed all the steps in
https://medium.com/@deepaksrawat1906/a-step-by-step-guide-to-installing-pyspark-on-windows-3589f0139a30

Configure PySpark to use Sunday as Start of Week

According to the PySpark docs for weekofyear(), Monday is considered the start of the week. However, dayofweek() treats Sunday as the first day (1 = Sunday, 7 = Saturday). I do a lot of reporting on previous periods, calculating change week over week and also against the same period last year. This is a problem because I rely on both weekofyear() and dayofweek() to define these periods, and the results are only consistent if both functions start the week on the same day (in my case, Sunday). Does anyone know of a config setting or similar in PySpark that makes Sunday the start of the week for ALL datetime calculations, including weekofyear()? I really don’t want to have to write a custom function to do this.
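One possible workaround, offered as a sketch rather than a confirmed answer: shifting each date forward by one day before taking the ISO week number makes Sunday land where Monday normally would, so the week effectively starts on Sunday. The helper names below (sunday_week_of_year, sunday_day_of_week, with_sunday_week) and the output column name week_sun are hypothetical. The pure-Python functions only illustrate the arithmetic; the PySpark variant assumes a date or timestamp column and uses the built-in weekofyear(), date_add() and col() functions. Week numbers near a year boundary follow ISO rules applied to the shifted date, which may not match every reporting calendar.

```python
from datetime import date, timedelta


def sunday_week_of_year(d: date) -> int:
    """ISO week number of (d + 1 day), i.e. a week that starts on Sunday."""
    return (d + timedelta(days=1)).isocalendar()[1]


def sunday_day_of_week(d: date) -> int:
    """Day of week with 1 = Sunday ... 7 = Saturday, like Spark's dayofweek()."""
    return d.isoweekday() % 7 + 1


def with_sunday_week(df, date_col: str, out_col: str = "week_sun"):
    """Hypothetical helper: add a Sunday-based week number column to a DataFrame.

    Assumes `date_col` holds dates or timestamps. PySpark is imported lazily
    so the pure-Python helpers above work without a Spark installation.
    """
    from pyspark.sql import functions as F

    # Same one-day shift as sunday_week_of_year, expressed with Spark built-ins.
    return df.withColumn(out_col, F.weekofyear(F.date_add(F.col(date_col), 1)))


if __name__ == "__main__":
    saturday = date(2024, 1, 6)
    sunday = date(2024, 1, 7)
    # Under plain ISO rules both dates fall in the same week; with the shift,
    # Sunday opens the next week.
    print(saturday.isocalendar()[1], sunday.isocalendar()[1])
    print(sunday_week_of_year(saturday), sunday_week_of_year(sunday))
```

With this shift the week number and dayofweek() agree on Sunday as the first day, and no user-defined function is needed because only built-in column functions are composed.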