
Manipulating Data Frames

I have a data frame df with a column called Rule_ID. It holds data like this:

Rule_ID
[u'2c78g',u'df567',u'5ty78']
[u'2c78g',u'd67gh',u'df890o']
[u'd67gh',u'df890o',u'5ty78']
[

Solution 1:

Option 1

df.Rule_ID.apply(pd.Series).stack().value_counts()

df890o    3
5ty78     3
2c78g     3
d67gh     2
df567     1
dtype: int64

Option 2

import numpy as np

# flatten all lists into one array, then count each value
pd.Series(np.concatenate(df.Rule_ID.values)).value_counts()

df890o    3
5ty78     3
2c78g     3
d67gh     2
df567     1
dtype: int64
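
As a further sketch, not part of the original answer: on pandas 0.25 or later, Series.explode() should give the same tally without building an intermediate wide frame. The sample frame below is an assumption. It holds only the three rows visible in the question (the fourth is cut off there), so its counts differ from the output shown above.

```python
import pandas as pd

# Assumed sample data: only the three rows visible in the question.
df = pd.DataFrame({
    "Rule_ID": [
        ["2c78g", "df567", "5ty78"],
        ["2c78g", "d67gh", "df890o"],
        ["d67gh", "df890o", "5ty78"],
    ]
})

# explode() gives every list element its own row;
# value_counts() then tallies how often each ID occurs.
counts = df["Rule_ID"].explode().value_counts()
print(counts)
```

With these three rows, every ID appears twice except df567, which appears once.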

If the column holds string representations of lists rather than actual lists, parse them first:

from ast import literal_eval
import numpy as np

pd.Series(np.concatenate([literal_eval(x) for x in df.Rule_ID.values])).value_counts()
# or
# df.Rule_ID.apply(literal_eval).apply(pd.Series).stack().value_counts()

df890o    3
5ty78     3
2c78g     3
d67gh     2
df567     1
dtype: int64
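
To check the string case end to end, here is a minimal sketch. It assumes each cell is stored as text such as "[u'2c78g',u'df567',u'5ty78']"; the sample frame is hypothetical and covers only the three rows visible in the question. literal_eval accepts the u'' prefix in Python 3, so the cells parse into ordinary lists of str.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample: each cell is the text of a list, not a list object.
df = pd.DataFrame({
    "Rule_ID": [
        "[u'2c78g',u'df567',u'5ty78']",
        "[u'2c78g',u'd67gh',u'df890o']",
        "[u'd67gh',u'df890o',u'5ty78']",
    ]
})

# Parse each string into a real Python list.
parsed = df["Rule_ID"].apply(literal_eval)

# Flatten the lists (one element per row) and count occurrences.
counts = parsed.explode().value_counts()
print(counts)
```

literal_eval only evaluates Python literals, so it is a safer choice than eval for data read from a file.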
