Random Sample By Group: How To Specify N, Not Weight? (using DataFrameGroupBy.sample)

May 29, 2023

This question follows this question (I was asked to post it as a new question by other contributors). We have this mock df: df = pd.DataFrame({ 'id': [1, 2, 3, 4, 5, 6, 7,

Solution 1:

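DataFrameGroupBy.sample takes a single n that applies to every group, so it cannot draw a different number of rows per group on its own. Instead, group the DataFrame by country, call .sample on each group with the sample size looked up in a dictionary, and then pd.concat the sampled groups: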

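d = {'USA': 4, 'Canada': 2}  # number of rows to sample for each country
# draw d[k] rows from each country's group, then stack the samples back together
pd.concat([g.sample(d[k]) for k, g in df.groupby('country', sort=False)])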

   id country
0   1     USA
4   5     USA
1   2     USA
2   3     USA
6   7  Canada
9  10  Canada
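
The one-liner above raises a KeyError if a country is missing from the dictionary and a ValueError if a group has fewer rows than requested. Below is a minimal, self-contained sketch of the same idea that handles both cases. The data is hypothetical (the original frame is truncated in the question), and the helper name sample_by_group, its parameters, and the choice to cap n at the group size are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: the question's frame is truncated, so these rows are assumed.
df = pd.DataFrame({
    "id": list(range(1, 11)),
    "country": ["USA"] * 6 + ["Canada"] * 4,
})

# Number of rows to draw from each group.
n_per_group = {"USA": 4, "Canada": 2}


def sample_by_group(frame, key, sizes, random_state=None):
    """Sample sizes[name] rows from each group of `frame` grouped by `key`.

    Groups missing from `sizes` are skipped, and a request larger than
    the group is capped at the group size (an illustrative design choice).
    """
    parts = []
    for name, group in frame.groupby(key, sort=False):
        if name not in sizes:
            continue  # no sample size given for this group
        n = min(sizes[name], len(group))  # avoid asking for more rows than exist
        parts.append(group.sample(n=n, random_state=random_state))
    return pd.concat(parts)


sampled = sample_by_group(df, "country", n_per_group, random_state=0)
print(sampled)
```

Passing random_state makes the draw reproducible; leave it as None to get a different sample on each call. If silently capping is not what you want, replace the min(...) line with a plain sizes[name] so that an oversized request raises an error instead.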
