Can Pyarrow Write Multiple Parquet Files To A Folder Like Fastparquet's File_scheme='hive' Option?
I have a multi-million record SQL table that I'm planning to write out to many parquet files in a folder, using the pyarrow library. The data content seems too large to store in a single file.
Solution 1:
Try pyarrow.parquet.write_to_dataset:
https://github.com/apache/arrow/blob/master/python/pyarrow/parquet.py#L938
I opened https://issues.apache.org/jira/browse/ARROW-1858 about adding some more documentation about this.
I recommend seeking support for Apache Arrow on the mailing list dev@arrow.apache.org. Thanks!