How To Iterate Over Arbitrary Number Of Files In Parallel In Python?
I have a list of file objects in a list called paths. I'd like to be able to go through and read the first line of each file, do something with this n-tuple of data, then move on to the next line of each file, and so on.
Solution 1:
import itertools

for line_tuple in itertools.izip(*files):
    whatever()
I'd use zip, but that would read the entire contents of the files into memory. Note that files should be a list of file objects; I'm not sure what you mean by "list of file handlers".
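For reference, itertools.izip was removed in Python 3, where the built-in zip is already lazy, so the same approach works without itertools. A minimal sketch, assuming some hypothetical filenames:

# Python 3: zip is lazy, so it never pulls whole files into memory
files = [open(name) for name in ["a.txt", "b.txt", "c.txt"]]  # hypothetical names
try:
    for line_tuple in zip(*files):  # one line from each file; stops at the shortest
        whatever(line_tuple)
finally:
    for f in files:
        f.close()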
Solution 2:
This depends on how "arbitrary" it actually is. As long as the number of files stays below your OS's open-file limit, itertools.izip should work just fine (or itertools.izip_longest, as appropriate).
import itertools

files = [open(f) for f in filenames]
for lines in itertools.izip(*files):
    # do something with the tuple of lines
    pass
for f in files:
    f.close()
If you can have more files than your OS will allow you to open, then you're out of luck (at least as far as an easy solution is concerned).
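If the files may have different lengths, itertools.izip_longest (itertools.zip_longest in Python 3) pads exhausted files with a fill value. Combined with contextlib.ExitStack, every file also gets closed even if processing raises. A sketch under those assumptions, again with hypothetical filenames:

import itertools
from contextlib import ExitStack

filenames = ["a.txt", "b.txt", "c.txt"]  # hypothetical names
with ExitStack() as stack:
    # ExitStack closes every opened file on exit, even on error
    files = [stack.enter_context(open(name)) for name in filenames]
    for lines in itertools.zip_longest(*files, fillvalue=""):
        # lines has one entry per file; exhausted files contribute ""
        pass  # do something with lines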
Solution 3:
The first idea that popped into my mind is the following code; it seems straightforward enough.
fp_list = []
for path in path_array:
    fp = open(path)
    fp_list.append(fp)

line_list = []
for fp in fp_list:
    line = fp.readline()
    line_list.append(line)

# your code here: process line_list

for fp in fp_list:
    fp.close()
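Since the question wants to continue past the first line, the readline step can simply be repeated until every file is exhausted. A sketch building on the same fp_list, relying on readline() returning an empty string only at end of file:

while True:
    line_list = [fp.readline() for fp in fp_list]
    if all(line == "" for line in line_list):  # every file is at EOF
        break
    # process line_list here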