
How To Open More Than 19 Files In Parallel (python)?

I have a project that needs to read data and then write it to more than 23 CSV files in parallel, depending on each line. For example, if the line is about temperature, we should write to

Solution 1:

Each open() call is a nested context; it's just that Python syntax allows you to put them in a comma-separated list. contextlib.ExitStack is a context manager that lets you push as many contexts as you like onto a stack and exits each of them when you are done. So you could do:

import contextlib

files_to_process = (
    ('Results\\GHCN_Daily\\MetLocations.csv', 'locations'),
    ('Results\\GHCN_Daily\\Tmax.csv', 'tmax_d'),
    ('Results\\GHCN_Daily\\Tmin.csv', 'tmin_d'),
    # ...
)

with contextlib.ExitStack() as stack:
    files = {varname: stack.enter_context(open(filename, 'w'))
             for filename, varname in files_to_process}
    # and for instance...
    files['locations'].write('my location\n')

If you find dict access less tidy than attribute access, you can create a simple container class:

class SimpleNamespace:

    def __init__(self, name_val_pairs):
        self.__dict__.update(name_val_pairs)

with contextlib.ExitStack() as stack:
    files = SimpleNamespace((varname, stack.enter_context(open(filename, 'w')))
                            for filename, varname in files_to_process)
    # and for instance...
    files.locations.write('my location\n')
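Note that the standard library already ships an equivalent container, types.SimpleNamespace, so you don't have to define your own. A minimal sketch (the file list here is a made-up stand-in for the real one):

```python
import contextlib
import types

# hypothetical file list for illustration
files_to_process = (
    ('locations.csv', 'locations'),
    ('tmax.csv', 'tmax_d'),
)

with contextlib.ExitStack() as stack:
    # build the namespace from keyword arguments; each file is
    # registered with the stack so it is closed on exit
    files = types.SimpleNamespace(**{
        varname: stack.enter_context(open(filename, 'w'))
        for filename, varname in files_to_process
    })
    files.locations.write('my location\n')
```

This behaves the same as the hand-rolled class above, with a nicer repr for free.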

Solution 2:

I would have a list of possible files = ['humidity', 'temperature', ...] and build a dict that holds, for each possible file, a path and a DataFrame, for example:

import pandas as pd

main_dic = {}

for file in possible_files:
    main_dic[file] = {
        'path': '%s.csv' % file,
        'data': pd.DataFrame([], columns=['value', 'other_column', 'another_column']),  # ...
    }

Afterwards, I would read whatever document you are getting the values from and store each row in the proper dictionary's DataFrame.
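A minimal sketch of that routing step, assuming each input line starts with a category name followed by a value (the input format and column names are made up for illustration). Buffering rows in plain lists and building each DataFrame once at the end is much faster than appending to a DataFrame row by row:

```python
import pandas as pd

possible_files = ['humidity', 'temperature']

# one row buffer per category
rows = {name: [] for name in possible_files}

# hypothetical input lines: "category,value"
input_lines = ['temperature,21.5', 'humidity,0.4', 'temperature,19.0']

for line in input_lines:
    category, value = line.split(',')
    if category in rows:
        rows[category].append({'value': float(value)})

# build the dict of paths and DataFrames in one go
main_dic = {name: {'path': '%s.csv' % name,
                   'data': pd.DataFrame(rows[name], columns=['value'])}
            for name in possible_files}
```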

When finished, just save the data to CSV, for example:

for file in main_dic:
    main_dic[file]['data'].to_csv(main_dic[file]['path'], index=False)

Hope it helps.

Solution 3:

If the data is not very large, why not read in all the data, group it by category (e.g. put all the temperature records into one group), then write each group to its corresponding file in one go?

Solution 4:

It would be fine to open more than 20 files this way:

# your list of file names
file_names = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u']

fh = []  # list of file handles
for f in file_names:
    fh.append(open(f + '.txt', 'w'))

# do what you need here
print("done")

for f in fh:
    f.close()

Though I'm not sure you really need to do so.
