Pandas: Find First Occurrences Of Elements That Appear In A Certain Column
Let's assume that I have the following DataFrame:
df_raw = pd.DataFrame({'id': [102, 102, 103, 103, 103], 'val1': [9,2,4,7,6], 'val2': [np.nan, 3, np.nan, 4, 5], 'val3': [4, np.na
Solution 1:
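If I understand correctly, you can use drop_duplicates to pick the first row for each id, fill the NaN values in those rows with -1, and then concat them back with the remaining rows: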
df1 = df_raw.drop_duplicates('id').fillna(-1)
target = pd.concat([df1, df_raw.loc[~df_raw.index.isin(df1.index)]]).sort_index()
target
        date   id  val1  val2  val3
0 2002-01-01  102     9  -1.0   4.0
1 2002-03-03  102     2   3.0   NaN
2 2003-04-04  103     4  -1.0  -1.0
3 2003-08-09  103     7   4.0   5.0
4 2005-02-03  103     6   5.0   1.0
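For anyone who wants to run this end to end, here is a minimal self-contained sketch of the drop_duplicates and concat approach. The question's DataFrame definition is cut off, so the val3 and date values below are an assumption reconstructed from the printed output, not the original literal.

```python
import numpy as np
import pandas as pd

# Assumption: val3 and date are reconstructed from the printed output,
# because the original DataFrame definition is truncated.
df_raw = pd.DataFrame({
    'id': [102, 102, 103, 103, 103],
    'val1': [9, 2, 4, 7, 6],
    'val2': [np.nan, 3, np.nan, 4, 5],
    'val3': [4, np.nan, np.nan, 5, 1],
    'date': pd.to_datetime(['2002-01-01', '2002-03-03', '2003-04-04',
                            '2003-08-09', '2005-02-03']),
})

# First row of each id, with NaN replaced by -1 in every column.
df1 = df_raw.drop_duplicates('id').fillna(-1)

# Remaining rows are left untouched; sort_index restores the original order.
rest = df_raw.loc[~df_raw.index.isin(df1.index)]
target = pd.concat([df1, rest]).sort_index()

print(target)
```

Note that fillna(-1) here applies to every column of the first-occurrence rows, not only val2 and val3, so restrict the columns first if that matters for your data.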
Solution 2:
You can use pd.Series.duplicated with Boolean row indexing:
mask = ~df_raw['id'].duplicated()
val_cols = ['val2', 'val3']
df_raw.loc[mask, val_cols] = df_raw.loc[mask, val_cols].fillna(-1)
print(df_raw)
    id  val1  val2  val3       date
0  102     9  -1.0   4.0 2002-01-01
1  102     2   3.0   NaN 2002-03-03
2  103     4  -1.0  -1.0 2003-04-04
3  103     7   4.0   5.0 2003-08-09
4  103     6   5.0   1.0 2005-02-03
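The duplicated mask approach modifies df_raw in place. If you would rather keep the original intact, the same idea can be wrapped in a small helper that works on a copy. This is only a sketch: fill_first_occurrence is a hypothetical name, and the val3 and date values are an assumption reconstructed from the printed output, since the question's DataFrame definition is truncated.

```python
import numpy as np
import pandas as pd

def fill_first_occurrence(df, key, cols, fill_value=-1):
    """Hypothetical helper: fill NaN in `cols` only on the first row of each `key` value."""
    out = df.copy()
    # duplicated() marks every repeat as True, so its negation selects first occurrences.
    first = ~out[key].duplicated()
    out.loc[first, cols] = out.loc[first, cols].fillna(fill_value)
    return out

# Assumption: val3 and date are reconstructed from the printed output.
df_raw = pd.DataFrame({
    'id': [102, 102, 103, 103, 103],
    'val1': [9, 2, 4, 7, 6],
    'val2': [np.nan, 3, np.nan, 4, 5],
    'val3': [4, np.nan, np.nan, 5, 1],
    'date': pd.to_datetime(['2002-01-01', '2002-03-03', '2003-04-04',
                            '2003-08-09', '2005-02-03']),
})

result = fill_first_occurrence(df_raw, 'id', ['val2', 'val3'])
print(result)
```

Because only the listed columns are touched, any NaN in other columns of the first-occurrence rows stays as it is, and df_raw itself is not changed.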