
Tag Archive for apache-spark, pyspark, jvm, out-of-memory, java-heap

Facing a "java.lang.OutOfMemoryError: Java heap space" error with the following script

import json
from datetime import datetime
from pathlib import Path
import os
import shutil
from pyspark.sql import functions as F
from pyspark.sql.functions import rtrim, udf, unix_timestamp
from pyspark.sql.types import BinaryType, StringType, TimestampType
from pyspark.sql import SparkSession
from pyspark.conf import SparkConf
from tika import detector
import sys
from datetime import datetime
spark = SparkSession.builder.master("local[1]") […]
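The excerpt is cut off before the cause is visible, so the following is only a sketch of one common thing to check. In local mode the executors run inside the driver JVM, and the default driver heap (spark.driver.memory) is 1g. The heap size has to be fixed before that JVM starts, and one way to do so from a plain Python script is the PYSPARK_SUBMIT_ARGS environment variable. The helper name driver_memory_submit_args and the 4g value are assumptions made for illustration; they do not come from the original script.

```python
import os


def driver_memory_submit_args(memory="4g"):
    """Build a PYSPARK_SUBMIT_ARGS value that requests a larger driver heap.

    Hypothetical helper for illustration; the "4g" default is an assumption.
    """
    # "pyspark-shell" stays as the last token so PySpark can launch its gateway.
    return f"--driver-memory {memory} pyspark-shell"


# Set this before the first SparkSession is created in the process:
# once the JVM is running, its heap size can no longer be changed.
os.environ["PYSPARK_SUBMIT_ARGS"] = driver_memory_submit_args("4g")
```

If the script is launched through spark-submit instead, passing --driver-memory on the command line should have the same effect. Whether more heap actually resolves the error depends on the part of the script that is not shown here.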