
Random Sample By Group: How To Specify N, Not Weight? (using DataFrameGroupBy.sample)

This question follows on from an earlier question (other contributors asked me to post it as a new one). We have this mock df: df = pd.DataFrame({ 'id': [1, 2, 3, 4, 5, 6, 7,

Solution 1:

You can group the dataframe on country, call .sample on each group separately with the number of rows for that group looked up in a dictionary, and finally combine the sampled groups with pd.concat:

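d = {'USA': 4, 'Canada': 2}  # mapping dict: country -> number of rows to sample
# k is the group key (the country), g is that group's sub-frame
pd.concat([g.sample(d[k]) for k, g in df.groupby('country', sort=False)])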

   id country
0   1     USA
4   5     USA
1   2     USA
2   3     USA
6   7  Canada
9  10  Canada
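
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The mock frame in the question is cut off, so the df below is a hypothetical stand-in, and the helper name sample_by_group and its seed parameter are my own additions, not part of pandas. The per-group loop is needed because DataFrameGroupBy.sample takes a single n that applies to every group. The sketch also caps each request at the group size, since .sample raises an error if n exceeds the number of rows when sampling without replacement.

```python
import pandas as pd

# Hypothetical stand-in for the question's (truncated) mock frame.
df = pd.DataFrame({
    "id": range(1, 11),
    "country": ["USA"] * 6 + ["Canada"] * 4,
})

# Number of rows to draw from each group.
n_per_group = {"USA": 4, "Canada": 2}


def sample_by_group(frame, key, sizes, seed=None):
    """Draw sizes[k] rows from each group k; groups missing from sizes are skipped."""
    parts = [
        # Cap n at the group size so an oversized request does not raise.
        g.sample(n=min(sizes[k], len(g)), random_state=seed)
        for k, g in frame.groupby(key, sort=False)
        if k in sizes
    ]
    return pd.concat(parts)


sampled = sample_by_group(df, "country", n_per_group, seed=0)
print(sampled)
```

Passing a seed makes the draw repeatable. With seed=None the rows change from run to run, which is why a result like the one shown above will not match exactly on your machine.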
