How to get the values of a dictionary type from a parquet file using pyarrow?
I have a parquet file which I am reading with pyarrow.
How can I extract data from parquet files using pyarrow?
I’m trying to perform data analysis on a large number of fairly large parquet files. The analysis itself is relatively simple, but with pandas, for example, it takes nested for loops to slice the data into smaller and smaller pieces before I can extract what I need.
Efficient way to list parquet file partitions in Python
I have a partitioned parquet file, and I want to read each partition iteratively.