Pandas: Drop Duplicates In Col[a] Keeping Row Based On Condition On Col[b]
Given the dataframe: df = pd.DataFrame({'col1': ['A', 'A', 'A','B','B'], 'col2': ['type1', 'type2', 'type1', 'type2', 'type1'] , 'hour': ['18:03:30','18:00:48', '18:13:46', '18:11:
Solution 1:
df.drop_duplicates(['col1', 'col2'], keep='last')
Solution 2:
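Following anky_91's comment, I solved it like this:
df.sort_values('hour').drop_duplicates(['col1', 'col2'], keep='last')
Sorting by 'hour' first puts the rows in ascending time order, so keep='last' retains the row with the latest 'hour' for each ('col1', 'col2') pair instead of whichever row happens to come last in the original frame. The 'hour' values are zero-padded HH:MM:SS strings, so sorting them as text gives the same order as sorting them as times.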
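To see why the sort matters, here is a minimal sketch comparing the two solutions. The 'hour' values below are made up for illustration (the data in the question is cut off), and the names unsorted_result and sorted_result are hypothetical.

```python
import pandas as pd

# Illustrative data only: these 'hour' values are assumptions, not the
# values from the original question. The latest (A, type1) row is placed
# first on purpose so that row order and time order disagree.
df = pd.DataFrame({
    'col1': ['A', 'A', 'A', 'B', 'B'],
    'col2': ['type1', 'type2', 'type1', 'type2', 'type1'],
    'hour': ['18:13:46', '18:00:48', '18:03:30', '18:11:29', '18:01:05'],
})

# Solution 1: keeps the last occurrence in the existing row order,
# which is not necessarily the latest time.
unsorted_result = df.drop_duplicates(['col1', 'col2'], keep='last')

# Solution 2: sort by 'hour' first, so the last occurrence of each
# ('col1', 'col2') pair is also the latest one.
sorted_result = df.sort_values('hour').drop_duplicates(['col1', 'col2'], keep='last')

# Optional: restore the original row order after deduplicating.
sorted_result = sorted_result.sort_index()

print(unsorted_result)
print(sorted_result)
```

With this sample, Solution 1 keeps the 18:03:30 row for (A, type1) while Solution 2 keeps the 18:13:46 row. If the times were not zero-padded strings, converting the column first (for example with pd.to_timedelta) would likely be the safer choice before sorting.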