Pandas - Create A New Column With Aggregation Of Previous Column
I have a dataframe with 2 columns:
CLASS STUDENT
'Sci' 'Francy'
'math' 'Alex'
'math' 'Arthur'
'math' 'Katy'
'eng' 'Jack'
'eng' 'Paul'
'eng' 'Francy'
I want to add a
Solution 1:
IIUC, use a conditional assignment with groupby + transform:
df.loc[df.CLASS=='math','New']=df.groupby('CLASS').STUDENT.transform(','.join)
df
Out[290]:
CLASS STUDENT New
0 Sci Francy NaN
1 math Alex Alex,Arthur,Katy
2 math Arthur Alex,Arthur,Katy
3 math Katy Alex,Arthur,Katy
4 eng Jack NaN
5 eng Paul NaN
6 eng Francy NaN
More info: the groupby + transform call computes the joined string for every group, so you can either assign them all or use a conditional assignment to pick only the ones you need:
df.groupby('CLASS').STUDENT.transform(','.join)
Out[291]:
0 Francy
1 Alex,Arthur,Katy
2 Alex,Arthur,Katy
3 Alex,Arthur,Katy
4 Jack,Paul,Francy
5 Jack,Paul,Francy
6 Jack,Paul,Francy
Name: STUDENT, dtype: object
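For anyone who wants to run this end to end, here is a minimal self-contained sketch of the groupby + transform approach. The sample frame is rebuilt from the question's table, and the names joined and mask are only illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "CLASS": ["Sci", "math", "math", "math", "eng", "eng", "eng"],
    "STUDENT": ["Francy", "Alex", "Arthur", "Katy", "Jack", "Paul", "Francy"],
})

# transform returns one joined string per row, aligned with df's index.
joined = df.groupby("CLASS")["STUDENT"].transform(",".join)

# Keep the joined value only for the 'math' rows; the other rows stay NaN.
mask = df["CLASS"] == "math"
df.loc[mask, "New"] = joined[mask]

print(df)
```

Because joined keeps the original index, swapping mask for another condition (or dropping it entirely) should be all it takes to fill other classes.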
Solution 2:
You can just use str.join:
df.loc[df['CLASS'] == 'math', 'new_col'] = ', '.join(df.loc[df['CLASS'] == 'math', 'STUDENT'])
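If the same thing is needed for more than one class, the one-liner can be wrapped in a small function. This is only a sketch: add_joined_column and its parameters are hypothetical names, and it assumes the frame has the CLASS and STUDENT columns from the question.

```python
import pandas as pd

def add_joined_column(df, class_name, new_col="new_col", sep=", "):
    """Hypothetical helper: write the joined STUDENT names of one class into new_col."""
    out = df.copy()  # leave the caller's frame untouched
    mask = out["CLASS"] == class_name
    # str.join builds a single string; .loc broadcasts it to every matching row.
    out.loc[mask, new_col] = sep.join(out.loc[mask, "STUDENT"])
    return out

df = pd.DataFrame({
    "CLASS": ["Sci", "math", "math", "math", "eng", "eng", "eng"],
    "STUDENT": ["Francy", "Alex", "Arthur", "Katy", "Jack", "Paul", "Francy"],
})
result = add_joined_column(df, "math")
print(result)
```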
Solution 3:
You can do this:
import numpy as np
import pandas as pd
df = pd.DataFrame({"CLASS": ['sci','math','math','math','eng','eng','eng'], "STUDENT": ['Francy','Alex','Arthur','Katy','Jack','Paul','Francy']})
Step 1: define your function
def get_student_list(class_name):
    # Collect the students of the given class and join them into one string
    students = list(df[df['CLASS'] == class_name]['STUDENT'])
    return ", ".join(students)
Step 2: use the numpy where func:
requested_class = 'math'
df['NEW_COL'] = np.where(df['CLASS'] == requested_class, get_student_list(requested_class), np.nan)
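One thing worth checking with the np.where approach: when one branch is a string and the other is a float NaN, NumPy may promote the whole result to a string dtype, in which case the "missing" rows would hold the text 'nan' rather than a real NaN. A minimal sketch that sidesteps the question uses Series.where instead, which is a swapped-in technique and not the answer's original code. The names mask and students are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "CLASS": ["sci", "math", "math", "math", "eng", "eng", "eng"],
    "STUDENT": ["Francy", "Alex", "Arthur", "Katy", "Jack", "Paul", "Francy"],
})

requested_class = "math"
mask = df["CLASS"] == requested_class
students = ", ".join(df.loc[mask, "STUDENT"])

# Start from a column filled with the joined string, then blank out the other rows.
# Series.where keeps values where the condition is True and puts NaN elsewhere.
df["NEW_COL"] = pd.Series(students, index=df.index).where(mask)

print(df)
print(df["NEW_COL"].isna().sum())  # number of rows outside the requested class
```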
Solution 4:
Another way using pivot_table and map:
df['NEW_COL'] = df.CLASS.map(pd.pivot_table(df, 'STUDENT', 'CLASS', 'CLASS', aggfunc=','.join)['math']).fillna(np.nan)
Out[331]:
CLASS STUDENT NEW_COL
0 Sci Francy NaN
1 math Alex Alex,Arthur,Katy
2 math Arthur Alex,Arthur,Katy
3 math Katy Alex,Arthur,Katy
4 eng Jack NaN
5 eng Paul NaN
6 eng Francy NaN
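The lookup-then-map idea can also be sketched with a plain groupby aggregation in place of pivot_table. This is a variant rather than the answer's exact code, and per_class is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "CLASS": ["Sci", "math", "math", "math", "eng", "eng", "eng"],
    "STUDENT": ["Francy", "Alex", "Arthur", "Katy", "Jack", "Paul", "Francy"],
})

# One joined string per class, indexed by class name.
per_class = df.groupby("CLASS")["STUDENT"].agg(",".join)

# Keep only the class of interest so every other class maps to NaN.
df["NEW_COL"] = df["CLASS"].map(per_class.loc[["math"]])

print(df)
```

Passing the full per_class Series to map instead would fill the joined names for every class.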