
Convert String Date To A Different Format In Pandas Dataframe

I have searched the community for an answer but could not find one. I have a dataframe in Python 3.5.1 that contains a column of dates stored as strings, imported from a CSV file

Solution 1:

If you convert the column of strings to a time series, you could use the dt.strftime method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields

   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942

Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
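# regex=True is needed in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)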
print(df)

since

In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

This uses regex to rearrange pieces of the string without first parsing the string as a date. This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.
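To illustrate the trade-off, here is a minimal sketch that runs both approaches on a made-up series with one invalid entry (month 13). The sample data and variable names are assumptions for illustration only. The regex rewrites the bad string without complaint, while parsing with errors='coerce' marks it as NaT:

```python
import pandas as pd

# Hypothetical sample: the second entry is not a valid date (month 13).
s = pd.Series(['2016/06/08 17:19:53', '2016/13/08 17:19:54'])

# Regex rearrangement rewrites the invalid string without noticing.
swapped = s.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)

# Parsing with an explicit format and errors='coerce' turns invalid strings into NaT.
parsed = pd.to_datetime(s, format='%Y/%m/%d %H:%M:%S', errors='coerce')
formatted = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')

print(swapped.tolist())
print(parsed.isna().tolist())
```

Checking parsed.isna() before formatting is one way to find rows that need attention.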

Solution 2:

For most common date and datetime formats, the pandas to_datetime function can parse the strings without an explicit format. For example:

df.TimeStamp.apply(lambda x: pd.to_datetime(x))

And in the example given from the question,

df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')

will give us the same result.

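Using .apply is convenient if you have multiple columns to convert, since DataFrame.apply can pass each column to pd.to_datetime in turn. On a single column, calling pd.to_datetime on the whole Series is typically faster than an element-wise apply.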
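For the multiple-column case, a rough sketch might look like the following. The column names 'Start' and 'End' and the sample values are hypothetical, not from the question:

```python
import pandas as pd

# Hypothetical frame with two string date columns ('Start' and 'End' are made-up names).
df = pd.DataFrame({
    'Start': ['2016/06/08 17:19:53', '2016/06/09 08:00:00'],
    'End': ['2016/06/08 18:00:00', '2016/06/09 09:30:00'],
})

date_cols = ['Start', 'End']

# DataFrame.apply calls the function once per column (a Series),
# so pd.to_datetime still processes each column as a whole.
df[date_cols] = df[date_cols].apply(
    lambda col: pd.to_datetime(col, format='%Y/%m/%d %H:%M:%S')
                  .dt.strftime('%m/%d/%Y %H:%M:%S')
)

print(df)
```

Because the function receives a whole column at a time, this avoids calling pd.to_datetime once per cell.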

Of course, providing the parsing format is necessary in many situations, for example when the day and month could be confused. For a full list of format codes, please see https://docs.python.org/3/library/datetime.html.
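As one example of where an explicit format helps, consider day-first strings such as '08/06/2016', which could be read as either 8 June or 6 August. The sketch below assumes made-up day-first data and passes the format codes directly:

```python
import pandas as pd

# Hypothetical day-first strings: '08/06/2016' is meant as 8 June 2016.
s = pd.Series(['08/06/2016 17:19:53', '09/06/2016 17:19:54'])

# An explicit format removes the day/month ambiguity.
parsed = pd.to_datetime(s, format='%d/%m/%Y %H:%M:%S')

print(parsed.dt.month.tolist())
print(parsed.dt.strftime('%Y-%m-%d').tolist())
```

The same format codes work in dt.strftime for output, so one reference table covers both parsing and formatting.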
