Returning Latest File In Directory For Specific Format

I have a directory with files of the format: test_report-01-13-2014.11_53-en.zip test_report-12-04-2013.11_53-en.zip and I need to return the last files based on the date in the f

Solution 1:

glob returns matching paths in an arbitrary order, and it doesn't understand %m-%d-%Y (it's not that smart).

You need to read the list of paths, extract the file name, then get the date from the file name. This will be the key that you will use to sort the list of files.

Here is one way to do just that:

import glob
import os
import datetime

def sorter(path):
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

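pattern = "test_report-*"
search_path = r'C:\temp\test'  # or 'c:/temp/test'

# Join directory and pattern; a raw string cannot end in a single backslash
file_list = glob.glob(os.path.join(search_path, pattern))

# Order by the date, latest first
ordered_list = sorted(file_list, key=sorter, reverse=True)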

os.path.basename is a function to return the last component of a path; since glob will return the full path, the last component will be the file name.

Since your file names have a fixed format, I skipped regular expressions and simply grabbed the date part by slicing the file name, then converted it to a datetime object.
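To make the slice indices concrete, here is a small sketch using one of the file names from the question. The variable names (prefix, start, date_text) are only for illustration; the point is that the prefix is 12 characters long and the date part is 10.

```python
import datetime

# Sample file name taken from the question
name = "test_report-01-13-2014.11_53-en.zip"

prefix = "test_report-"               # 12 characters, so the date starts at index 12
start = len(prefix)
date_text = name[start:start + 10]    # 'MM-DD-YYYY' is 10 characters long

parsed = datetime.datetime.strptime(date_text, "%m-%d-%Y")
print(date_text, parsed.date())
```

If the prefix ever changes length, the hard-coded 12 and 22 would need to change with it, which is the trade-off of slicing over a regular expression.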

Finally, sorted returns a new list with the result of the sort (the normal list.sort method sorts in place and returns None). The key function is what extracts the date and returns it; reverse=True is required to get the returned list in latest-first order.
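The difference between sorted and the in-place sort method can be seen in a short sketch. The helper name date_key is made up for this example, and the list holds the two file names from the question.

```python
import datetime

def date_key(name):
    # Same slice as above: characters 12..21 hold MM-DD-YYYY
    return datetime.datetime.strptime(name[12:22], "%m-%d-%Y")

names = [
    "test_report-12-04-2013.11_53-en.zip",
    "test_report-01-13-2014.11_53-en.zip",
]

# sorted() builds a new list and leaves the original untouched
newest_first = sorted(names, key=date_key, reverse=True)

# list.sort() reorders the list itself and returns None
names_copy = list(names)
result = names_copy.sort(key=date_key)
```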

You can shorten the code a bit by passing the result of glob.glob directly to sorted:

ordered_list = sorted(glob.glob(os.path.join(search_path, pattern)), key=sorter, reverse=True)

To combine this with the function you have written:

import glob, os, datetime

def sorter(path):
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

def getLatestFile(path="./", pattern="*"):
    fformat = os.path.join(path, pattern)
    archives = glob.glob(fformat)

    # Return the newest match, or None when nothing matches
    if archives:
        return sorted(archives, key=sorter, reverse=True)[0]
    return None
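If only the single latest file is needed, a possible variation is to use max with the same key function, which avoids sorting the whole list. This is a sketch, not part of the original answer: latest_by_date is a hypothetical name, and the default argument of max assumes Python 3.4 or later.

```python
import datetime
import os

def sorter(path):
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

def latest_by_date(paths):
    # max() scans the list once; default=None covers the empty case
    return max(paths, key=sorter, default=None)

# Hypothetical paths built from the file names in the question
paths = [
    os.path.join("reports", "test_report-12-04-2013.11_53-en.zip"),
    os.path.join("reports", "test_report-01-13-2014.11_53-en.zip"),
]
latest = latest_by_date(paths)
```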

Solution 2:

The order of archives is arbitrary, and on top of that your file names can't be sorted alphabetically, because the month comes before the year. The easiest way is to sort your list with a key function that extracts a datetime object from the file name:

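import datetime
import os

def getDateFromFilename(filename):
    # Accept full paths (as returned by glob) as well as bare file names
    name = os.path.basename(filename)
    try:
        return datetime.datetime.strptime(name[12:-7], '%m-%d-%Y.%H_%M')
    except ValueError:
        # On Python 3 an int cannot be compared with a datetime,
        # so fall back to the earliest datetime instead of -1
        return datetime.datetime.min

archives.sort(key=getDateFromFilename)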
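Including the time in the key matters when two reports share a date. A small sketch, where the second file name (09_05) is invented for illustration and date_and_time_key is a hypothetical helper name:

```python
import datetime

def date_and_time_key(filename):
    # [12:-7] drops the 'test_report-' prefix and the '-en.zip' suffix
    return datetime.datetime.strptime(filename[12:-7], "%m-%d-%Y.%H_%M")

same_day = [
    "test_report-01-13-2014.11_53-en.zip",
    "test_report-01-13-2014.09_05-en.zip",  # made-up name, same date, earlier time
]

# Ascending sort: the latest report ends up at the end of the list
same_day.sort(key=date_and_time_key)
latest = same_day[-1]
```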

Solution 3:

Thanks a lot for the input. I used a little bit of everything and ended up with this, which works fine for my purposes.

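import datetime
import os

def getDateFromFilename(filename, pattern):
    # pattern is the literal file name prefix, e.g. 'test_report-'
    try:
        return datetime.datetime.strptime(filename, pattern + '%m-%d-%Y.%H_%M-en.zip')
    except ValueError:
        # Unparseable names sort first, so they are never picked as the latest
        return datetime.datetime.min

def getLatestFile(path, pattern):
    files = [f for f in os.listdir(path) if f.startswith(pattern)]
    files.sort(key=lambda f: getDateFromFilename(f, pattern))

    if len(files) > 0:
        return files[-1]
    else:
        return None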

Solution 4:

If you would like to sort your list by name, just do archives = sorted(glob.glob(fformat))
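Keep in mind that, for these file names, a name sort does not match date order, since the month comes first. A quick check with the two file names from the question (date_key is a hypothetical helper):

```python
import datetime

def date_key(name):
    return datetime.datetime.strptime(name[12:22], "%m-%d-%Y")

names = [
    "test_report-01-13-2014.11_53-en.zip",
    "test_report-12-04-2013.11_53-en.zip",
]

by_name = sorted(names)                # plain string comparison
by_date = sorted(names, key=date_key)  # comparison on the parsed date

# The name sort puts the December 2013 file last; the date sort puts January 2014 last
print(by_name[-1])
print(by_date[-1])
```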
