
Pandas Reading CSV Files With Partial Wildcard

I'm trying to write a script that imports a file, then does something with the file and outputs the result into another file.

df = pd.read_csv('somefile2018.csv')

The above code wo

Solution 1:

glob returns a list of matching paths, not a single string, while read_csv expects one path per call. Loop over the matches instead:

from glob import glob
import pandas as pd

for f in glob('somefile*.csv'):
    df = pd.read_csv(f)
    ...
    # the rest of your script

Solution 2:

To read all of the files that follow a certain pattern, so long as they share the same schema, use this function:

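import glob
import pandas as pd

def pd_read_pattern(pattern):
    # Sort the matches so the row order is reproducible
    files = sorted(glob.glob(pattern))

    # DataFrame.append was removed in pandas 2.0, so collect the frames
    # and concatenate them once
    frames = [pd.read_csv(f) for f in files]

    # Note: pd.concat raises ValueError if no file matches the pattern
    return pd.concat(frames, ignore_index=True)

df = pd_read_pattern('somefile*.csv')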

This will work with either an absolute or relative path.
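
As a small illustration of the absolute-path case, here is a sketch that assumes a throwaway temporary directory and two made-up files (tmpdir, the year and value columns, and the file contents are hypothetical, not from the question). glob returns paths in the same form as the pattern it was given, so an absolute pattern yields absolute paths that read_csv can open from any working directory.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical setup: write two small CSV files into a temporary directory.
tmpdir = tempfile.mkdtemp()
for year, value in [(2017, 1), (2018, 2)]:
    pd.DataFrame({"year": [year], "value": [value]}).to_csv(
        os.path.join(tmpdir, f"somefile{year}.csv"), index=False
    )

# Absolute pattern: directory part plus the wildcard file name.
absolute_pattern = os.path.join(tmpdir, "somefile*.csv")

# The matches come back as absolute paths because the pattern was absolute.
absolute_matches = sorted(glob.glob(absolute_pattern))

# Read every match and stack the rows into one DataFrame.
df = pd.concat([pd.read_csv(p) for p in absolute_matches], ignore_index=True)
```

A relative pattern such as 'somefile*.csv' behaves the same way, except that it is resolved against the current working directory.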

Solution 3:

You can list the CSV files in the current working directory and loop over them.

import os
from os import listdir
from os.path import isfile, join

import pandas as pd

mypath = os.getcwd()

# Keep only regular files whose names end in .csv
csvfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f.endswith('.csv')]

for f in csvfiles:
    df = pd.read_csv(join(mypath, f))
    # the rest of your script

Solution 4:

Loop over each file and build a list of DataFrames, then assemble them into one with pd.concat.
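
This answer came without code, so here is a minimal sketch of the idea. The function name read_csv_pattern and the decision to raise FileNotFoundError when nothing matches are assumptions made for illustration, and the files are assumed to share the same columns.

```python
import glob

import pandas as pd


def read_csv_pattern(pattern):
    """Read every CSV matching `pattern` and stack the rows into one DataFrame."""
    # Sort the matches so the row order does not depend on the file system.
    files = sorted(glob.glob(pattern))
    if not files:
        # Assumption: treat "no match" as an error instead of returning an empty frame.
        raise FileNotFoundError(f"no files match {pattern!r}")

    # Build the list of DataFrames first ...
    frames = []
    for f in files:
        frames.append(pd.read_csv(f))

    # ... then concatenate once; ignore_index gives a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Calling read_csv_pattern('somefile*.csv') would then return a single DataFrame. Concatenating once at the end avoids copying the accumulated data on every iteration, which is what happens when a DataFrame is grown inside the loop.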
