Pandas: Find First Occurrences Of Elements That Appear In A Certain Column

Let's assume that I have the following data-frame: df_raw = pd.DataFrame({'id': [102, 102, 103, 103, 103], 'val1': [9,2,4,7,6], 'val2': [np.nan, 3, np.nan, 4, 5], 'val3': [4, np.na

Solution 1:

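If I understand correctly, you can use drop_duplicates to pick out the first row for each id, fill its missing values, and then concat it back with the remaining rows: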

df1 = df_raw.drop_duplicates('id').fillna(-1)
target = pd.concat([df1, df_raw.loc[~df_raw.index.isin(df1.index)]]).sort_index()
target

        date   id  val1  val2  val3
0 2002-01-01  102     9  -1.0   4.0
1 2002-03-03  102     2   3.0   NaN
2 2003-04-04  103     4  -1.0  -1.0
3 2003-08-09  103     7   4.0   5.0
4 2005-02-03  103     6   5.0   1.0
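For anyone who wants to run this approach end to end, here is a minimal sketch with each step split out. The sample frame is an assumption: the definition in the question is cut off, so the val3 values are chosen to be consistent with the printed result, and the date column is left out. The names first_rows and other_rows are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed sample data: the original definition is truncated in the question,
# so val3 is reconstructed to be consistent with the printed output.
df_raw = pd.DataFrame({
    "id": [102, 102, 103, 103, 103],
    "val1": [9, 2, 4, 7, 6],
    "val2": [np.nan, 3, np.nan, 4, 5],
    "val3": [4, np.nan, np.nan, 5, 1],
})

# First row for each id, with its missing values replaced by -1.
first_rows = df_raw.drop_duplicates("id").fillna(-1)

# Every row that is not a first occurrence stays untouched.
other_rows = df_raw.loc[~df_raw.index.isin(first_rows.index)]

# Recombine the two pieces and restore the original row order.
target = pd.concat([first_rows, other_rows]).sort_index()
print(target)
```

Note that fillna(-1) here applies to every column of the first-occurrence rows. To restrict it to particular columns, a dict such as fillna({'val2': -1, 'val3': -1}) should work.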

Solution 2:

You can use pd.Series.duplicated with Boolean row indexing:

mask = ~df_raw['id'].duplicated()
val_cols = ['val2', 'val3']

df_raw.loc[mask, val_cols] = df_raw.loc[mask, val_cols].fillna(-1)

print(df_raw)

    id  val1  val2  val3       date
0  102     9  -1.0   4.0 2002-01-01
1  102     2   3.0   NaN 2002-03-03
2  103     4  -1.0  -1.0 2003-04-04
3  103     7   4.0   5.0 2003-08-09
4  103     6   5.0   1.0 2005-02-03
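The same idea can be wrapped in a small helper that returns a new frame instead of modifying df_raw in place. This is only a sketch: fill_first_occurrence is a hypothetical name, and the sample data is assumed (the question's definition is truncated, so val3 is reconstructed from the printed output and the date column is omitted).

```python
import numpy as np
import pandas as pd


def fill_first_occurrence(df, key, cols, value=-1):
    """Return a copy of df with NaNs in `cols` filled on the first row of each `key` value."""
    out = df.copy()
    # duplicated() is False on the first occurrence of each key value,
    # so negating it selects exactly those first rows.
    mask = ~out[key].duplicated()
    out.loc[mask, cols] = out.loc[mask, cols].fillna(value)
    return out


# Assumed sample data, consistent with the printed output above.
df_raw = pd.DataFrame({
    "id": [102, 102, 103, 103, 103],
    "val1": [9, 2, 4, 7, 6],
    "val2": [np.nan, 3, np.nan, 4, 5],
    "val3": [4, np.nan, np.nan, 5, 1],
})

result = fill_first_occurrence(df_raw, "id", ["val2", "val3"])
print(result)
```

Because the helper works on a copy, df_raw keeps its original NaN values, which can be convenient when the raw data is needed again later.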
