sparkSession can’t connect to hive databases
I created a Spark session like this:
from pyspark.sql import SparkSession
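For comparison, here is a minimal sketch of a session with Hive support explicitly enabled, which is the most common fix when `spark.sql("SHOW DATABASES")` only sees `default`. This is an assumption-laden example: it presumes pyspark is installed and that the metastore configuration is picked up from `hive-site.xml` on the classpath; the app name is a placeholder.

```python
def build_hive_session(app_name="hive-example"):
    """Create (or reuse) a SparkSession with the Hive catalog enabled.

    Sketch only: assumes pyspark is installed and hive-site.xml is on the
    classpath so the metastore URI does not need to be set here.
    """
    # Imported inside the function so this sketch stays importable
    # even on machines without pyspark.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # without this, Spark uses its in-memory catalog
        .getOrCreate()
    )
```

If `enableHiveSupport()` is missing, Spark silently falls back to its built-in in-memory catalog, which is the usual reason Hive databases appear to be missing.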
How to load .dat file to Hive with additional columns?
I want to load a .dat file (without headers) into a Hive external table. But the Hive table has extra columns, like cob_date, region, and file_name, which are not present in the .dat file. cob_date will be the date embedded in the feed name, e.g. customer_data_20240708.dat. region and file_name can be hard-coded.
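One way to approach this (a sketch, not a definitive answer): parse the cob_date out of the feed name with plain Python, then attach the three extra columns as literals before writing to the table. The column names come from the question; the delimiter, region value, and target table name below are assumptions.

```python
import re
from datetime import date, datetime

def extract_cob_date(file_name):
    """Pull the yyyymmdd token out of a feed name like 'customer_data_20240708.dat'."""
    m = re.search(r"(\d{8})\.dat$", file_name)
    if m is None:
        raise ValueError(f"no yyyymmdd date found in {file_name!r}")
    return datetime.strptime(m.group(1), "%Y%m%d").date()

# With the date in hand, the extra columns can be attached in Spark
# (hedged sketch; delimiter, region value, and table name are assumptions):
#
#   from pyspark.sql import functions as F
#   feed = "customer_data_20240708.dat"
#   df = spark.read.csv(feed, sep="|", header=False)  # adjust sep to your file
#   df = (df.withColumn("cob_date", F.lit(str(extract_cob_date(feed))))
#           .withColumn("region", F.lit("EMEA"))      # hard-coded, per the question
#           .withColumn("file_name", F.lit(feed)))
#   df.write.mode("append").insertInto("my_db.customer_ext")
```

Note that `insertInto` matches columns by position, so the DataFrame's column order must line up with the Hive table definition.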
How to properly pass a Spark session with Hive configuration to a function in PySpark?
I tried running a script like this:
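A common pattern (sketched here under assumptions; the function and table names are illustrative, not from the question) is to create the Hive-enabled session once at the entry point and pass the session object itself as an explicit argument, rather than rebuilding it inside each function:

```python
def count_rows(spark, table_name):
    """Run a query using the session passed in by the caller.

    Any SparkSession created with enableHiveSupport() carries its Hive
    configuration with it, so nothing extra needs to be passed.
    """
    return spark.sql(f"SELECT * FROM {table_name}").count()

# Usage (assuming a Hive-enabled session already exists):
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.enableHiveSupport().getOrCreate()
#   n = count_rows(spark, "customer_db.customers")
```

Because `getOrCreate()` returns the existing session anyway, calling it again inside a helper is not fatal, but passing the session explicitly keeps the function testable and makes the dependency visible.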