Reading Csv File In Pandas With Historical Dates

I'm trying to read in a file with dates in the (UK) format 13/01/1800. However, some of the dates are before 1677, which cannot be represented by the nanosecond timestamp (see http:
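For context, the limit mentioned in the question can be checked directly. This is only a small sketch, assuming a reasonably recent pandas install; it prints the bounds of the default nanosecond-resolution Timestamp.

```python
import pandas as pd

# Bounds of the default nanosecond-resolution Timestamp.
ts_min = pd.Timestamp.min
ts_max = pd.Timestamp.max

print(ts_min)  # earliest representable instant (in the year 1677)
print(ts_max)  # latest representable instant (in the year 2262)
```

Any date outside this window needs a different representation than a nanosecond Timestamp.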

Solution 1:

You can try it this way:

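import pandas as pd

fn = r'D:\temp\.data\36987699.csv'

def dt_parse(s):
    # Split DD/MM/YYYY and build a daily Period, which is not
    # limited to the nanosecond Timestamp range.
    d, m, y = s.split('/')
    return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')

# Note: date_parser is deprecated as of pandas 2.0.
df = pd.read_csv(fn, parse_dates=[0], date_parser=dt_parse)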

Input file:

Date,col1
13/01/1800,aaa
25/12/1001,bbb
01/03/1267,ccc

Test:

In [16]: df
Out[16]:
         Date col1
0  1800-01-13  aaa
1  1001-12-25  bbb
2  1267-03-01  ccc

In [17]: df.dtypes
Out[17]:
Date    object
col1    object
dtype: object

In [18]: df['Date'].dt.year
Out[18]:
0    1800
1    1001
2    1267
Name: Date, dtype: int64
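Since date_parser is deprecated in pandas 2.0 and later, here is a hedged sketch of the same idea using the converters argument of read_csv instead. It is not part of the original answer. The sample rows from the input file above are inlined through io.StringIO so the snippet runs on its own, and csv_text and years are illustrative names only.

```python
import io
import pandas as pd

# Same rows as the sample input file, inlined so the sketch is self-contained.
csv_text = "Date,col1\n13/01/1800,aaa\n25/12/1001,bbb\n01/03/1267,ccc\n"

def dt_parse(s):
    # DD/MM/YYYY -> daily Period (not limited to the Timestamp range)
    d, m, y = s.split('/')
    return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')

# converters applies dt_parse to every cell of the Date column while reading.
df = pd.read_csv(io.StringIO(csv_text), converters={'Date': dt_parse})

# Each cell is a Period object, so the year can be read off directly.
years = [p.year for p in df['Date']]
print(years)
```

With this approach the Date column holds Period objects in an object-dtype column, much like the result shown in the test above.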

PS: you may want to add a try ... except block to the dt_parse() function to catch the ValueError exceptions that int() raises on malformed input.
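A minimal sketch of that suggestion might look like the following. The name dt_parse_safe and the choice to return pd.NaT for unparseable cells are assumptions made for illustration; the answer itself does not specify what to return. Catching AttributeError for non-string cells is likewise an addition beyond the ValueError mentioned above.

```python
import pandas as pd

def dt_parse_safe(s):
    """Parse DD/MM/YYYY into a daily Period; return NaT for malformed input."""
    try:
        d, m, y = s.split('/')
        return pd.Period(year=int(y), month=int(m), day=int(d), freq='D')
    except (ValueError, AttributeError):
        # ValueError: wrong number of '/'-separated parts, or int() got non-digits.
        # AttributeError: the cell is not a string (assumed here to mean missing).
        return pd.NaT

good = dt_parse_safe('25/12/1001')   # well-formed date
bad = dt_parse_safe('25/Dec/1001')   # int('Dec') raises ValueError
print(good, bad)
```

Returning NaT keeps the rest of the file loadable. Re-raising with the offending string in the message would be the stricter alternative if bad rows should stop the import.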
