Random Sample By Group: How To Specify N, Not Weight? (using DataFrameGroupBy.sample)
This is a follow-up to an earlier question (other contributors asked me to post it as a new question). We have this mock df: df = pd.DataFrame({ 'id': [1, 2, 3, 4, 5, 6, 7,
Solution 1:
You can group the dataframe on country, then .sample each group separately, taking the number of rows for each group from the dictionary, and finally .concat all the sampled groups:
d = {'USA': 4, 'Canada': 2} # mapping dict
pd.concat([g.sample(d[k]) for k, g in df.groupby('country', sort=False)])
   id country
0   1     USA
4   5     USA
1   2     USA
2   3     USA
6   7  Canada
9  10  Canada
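Because the mock df in the question is cut off, the same idea can be sketched end to end with a stand-in frame. The data below (ids 1 to 10, five rows per country) and the helper name sample_by_group are assumptions made for illustration, not part of the original answer. The helper also skips any group that has no entry in the dictionary, which is a design choice added here; the one-liner above would raise a KeyError in that case.

```python
import pandas as pd

# Hypothetical stand-in for the question's mock data (the original is truncated).
df = pd.DataFrame({
    'id': range(1, 11),
    'country': ['USA'] * 5 + ['Canada'] * 5,
})

d = {'USA': 4, 'Canada': 2}  # number of rows to draw per country


def sample_by_group(frame, key, sizes, random_state=None):
    """Draw sizes[group] rows from each group of frame[key].

    Groups without an entry in `sizes` are skipped (an assumption of this
    sketch). `random_state` is passed to every per-group sample call.
    """
    parts = [
        g.sample(n=sizes[k], random_state=random_state)
        for k, g in frame.groupby(key, sort=False)
        if k in sizes
    ]
    return pd.concat(parts)


sampled = sample_by_group(df, 'country', d, random_state=0)
print(sampled)
print(sampled['country'].value_counts())
```

Note that .sample raises a ValueError when the requested n is larger than the group, unless replace=True is passed, so the counts in the dictionary should not exceed the group sizes.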