How to reduce memory footprint when reading a Parquet file
I want to read a Parquet file batch by batch in parallel. I do this by merging several consecutive row groups into one batch and reading each batch through an arrow::RecordBatchReader. While monitoring memory usage during the read, I noticed that it keeps growing until the whole file has been read. I would like to reduce the memory footprint by releasing memory as soon as a thread finishes reading its current batch.