
Can pyarrow write multiple Parquet files to a folder like fastparquet's file_scheme='hive' option?

I have a multi-million record SQL table that I'm planning to write out to many parquet files in a folder, using the pyarrow library. The data content seems too large to store in a

Solution 1:

Try pyarrow.parquet.write_to_dataset (https://github.com/apache/arrow/blob/master/python/pyarrow/parquet.py#L938).

I opened https://issues.apache.org/jira/browse/ARROW-1858 about adding some more documentation about this.

I recommend seeking support for Apache Arrow on the mailing list dev@arrow.apache.org. Thanks!
