Python: Extract Gz Files With And Honor Original Filenames And File Extensions
In one folder I have many .gz files. The files compressed inside them have different extensions: some are .txt, some are .csv, some are .xml, and some have other extensions. E.g. gz (the original/compressed file in()) f
Solution 1:
import gzip
import os

# Raw strings keep the backslashes literal; a plain 'C:\UnZipGz' is a syntax
# error in Python 3 because \U starts a Unicode escape.
INPUT_DIRECTORY = r'C:\Xiang'
OUTPUT_DIRECTORY = r'C:\UnZipGz'
GZIP_EXTENSION = '.gz'


def make_output_path(output_directory, zipped_name):
    """ Generate a path to write the unzipped file to.

    :param str output_directory: Directory to place the file in
    :param str zipped_name: Name of the zipped file
    :return str:
    """
    name_without_gzip_extension = zipped_name[:-len(GZIP_EXTENSION)]
    return os.path.join(output_directory, name_without_gzip_extension)


for entry in os.scandir(INPUT_DIRECTORY):
    # Skip anything that is not a .gz file.
    if not entry.name.lower().endswith(GZIP_EXTENSION):
        continue

    output_path = make_output_path(OUTPUT_DIRECTORY, entry.name)
    print('Decompressing', entry.path, 'to', output_path)

    # Use a separate name for the opened file so the loop variable is not shadowed.
    with gzip.open(entry.path, 'rb') as compressed_file:
        with open(output_path, 'wb') as output_file:
            output_file.write(compressed_file.read())
Explanation:
- Iterate through all files in the folder with the relevant extension.
- Generate a path to the new directory without the gzip extension.
- Open the file and write its decompressed contents to the new path.
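The steps above read each decompressed file fully into memory before writing it. For large archives, a streaming copy may be preferable. The following is a minimal sketch of that variation, not part of the original answer; the function name decompress_gz is hypothetical, and it assumes the output name is simply the input name with a trailing .gz removed.

```python
import gzip
import os
import shutil


def decompress_gz(gz_path, output_directory):
    """Decompress one .gz file into output_directory and return the new path."""
    os.makedirs(output_directory, exist_ok=True)
    base_name = os.path.basename(gz_path)
    # Drop a trailing ".gz" (case-insensitive); otherwise keep the name unchanged.
    if base_name.lower().endswith('.gz'):
        base_name = base_name[:-3]
    output_path = os.path.join(output_directory, base_name)
    with gzip.open(gz_path, 'rb') as compressed_file:
        with open(output_path, 'wb') as output_file:
            # copyfileobj copies in chunks, so memory use stays bounded.
            shutil.copyfileobj(compressed_file, output_file)
    return output_path
```

It could be called once per entry in the loop, for example decompress_gz(entry.path, OUTPUT_DIRECTORY).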
To retrieve the original file name, you can use gzinfo: https://github.com/PierreSelim/gzinfo
>>> import gzinfo
>>> info = gzinfo.read_gz_info('bar.txt.gz')
>>> info.fname
'foo.txt'
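If adding a dependency is not an option, the stored name can probably also be read with the standard library alone. The sketch below assumes the gzip header layout described in RFC 1952 (a 10-byte fixed header, an optional extra field, then an optional zero-terminated FNAME field encoded as Latin-1). The function names read_original_name and output_name_for are hypothetical and are not part of gzinfo or the gzip module. Because the FNAME field is optional, the sketch falls back to stripping .gz from the archive name.

```python
import os
import struct

FEXTRA = 0x04  # header flag: an extra field is present
FNAME = 0x08   # header flag: an original file name is present


def read_original_name(gz_path):
    """Return the FNAME field of a gzip header, or None if it is absent."""
    with open(gz_path, 'rb') as raw:
        header = raw.read(10)
        if len(header) < 10 or header[:2] != b'\x1f\x8b':
            raise ValueError('not a gzip file: %s' % gz_path)
        flags = header[3]
        if flags & FEXTRA:
            # Skip the extra field: 2-byte little-endian length, then the data.
            (extra_length,) = struct.unpack('<H', raw.read(2))
            raw.read(extra_length)
        if not flags & FNAME:
            return None
        name = bytearray()
        while True:
            byte = raw.read(1)
            if byte in (b'', b'\x00'):  # end of file or terminating zero byte
                break
            name += byte
        return name.decode('latin-1')


def output_name_for(gz_path):
    """Prefer the name stored in the header; otherwise strip '.gz'."""
    stored = read_original_name(gz_path)
    if stored:
        # Keep only the last path component so a crafted header
        # cannot point outside the output folder.
        name = os.path.basename(stored.replace('\\', '/'))
        if name:
            return name
    base_name = os.path.basename(gz_path)
    return base_name[:-3] if base_name.lower().endswith('.gz') else base_name
```

The result of output_name_for(entry.path) could then replace make_output_path's stripped name when building the output path.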