Summing Rows Based On Keyword Within Index
I am trying to sum multiple rows together based on a keyword that is part of the index - but it is not the entire index. For example, the index could look like
Solution 1:
Maybe this:
df.groupby(df.index.to_series()
.str.split('_', expand=True)[1]
)['Count'].sum()
Output:
1
Apple 45
Banana 100
Name: Count, dtype: int64
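For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The index labels and the Count values are assumptions made up for illustration (the question's actual data is not shown), chosen so that the totals match the output above.

```python
import pandas as pd

# Hypothetical data: labels follow a "<number>_<keyword>_<colour>" pattern,
# and the Count values are invented for this example.
df = pd.DataFrame(
    {"Count": [50, 30, 20, 25, 20]},
    index=[
        "1234_Banana_Green",
        "4321_Banana_Yellow",
        "2244_Banana_Brown",
        "12345_Apple_Red",
        "1267_Apple_Blue",
    ],
)

# Use the second underscore-separated field of each index label as the group key.
keyword = df.index.to_series().str.split("_").str[1]

# Sum Count within each keyword group.
totals = df.groupby(keyword)["Count"].sum()
print(totals)
```

The key here is a Series that shares the DataFrame's index, so groupby can line it up with the rows without adding a helper column.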
Solution 2:
Given the following dataframe:
import pandas as pd

raw_data = {'id': ['1234_Banana_Green', '4321_Banana_Yellow',
                   '2244_Banana_Brown', '12345_Apple_Red',
                   '1267_Apple_Blue']}
df = pd.DataFrame(raw_data).set_index(['id'])
Try this code:
df = df.reset_index()
df['extracted_keyword'] = df['id'].apply(lambda x: x.split('_')[1])
df.groupby(["extracted_keyword"]).count()
This gives the number of rows per keyword:
id
extracted_keyword
Apple 2
Banana 3
If you want to restore the index, add this at the end:
df = df.set_index(['id'])
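Note that count() gives the number of rows per keyword, not a sum. If the frame also has a numeric column to total, something like the sketch below may be closer to what the question asks. It assumes a column named Count with invented values, and it groups on the index directly so no reset_index / set_index round trip is needed.

```python
import pandas as pd

# Hypothetical data: same ids as above, plus an assumed numeric column "Count".
df = pd.DataFrame(
    {
        "id": ["1234_Banana_Green", "4321_Banana_Yellow", "2244_Banana_Brown",
               "12345_Apple_Red", "1267_Apple_Blue"],
        "Count": [50, 30, 20, 25, 20],  # invented values
    }
).set_index("id")

# Extract the keyword from the index itself; the original index stays in place.
keyword = df.index.str.split("_").str[1]

# Row count and total per keyword in one pass.
summary = df.groupby(keyword)["Count"].agg(["count", "sum"])
print(summary)
```

Because df is never modified, there is nothing to restore afterwards.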