
Python Pandas Check If A Value Occurs More Than Once In The Same Day

I have a Pandas dataframe as below. What I am trying to do is check if a station has variable yyy and any other variable on the same day (as in the case of station1). If this is tr

Solution 1:

I might index using a boolean array. We want to delete the rows (if I understand what you're after, anyway!) which have yyy and whose dateuse/station combination has more than one row.

We can use transform to broadcast the size of each dateuse/station group back onto every row of the dataframe, which gives a boolean Series marking the rows in groups of length > 1. Then we can & this with a mask of where the yyys are, and negate the result with ~ to keep everything else.

>>> multiple = df.groupby(["dateuse", "station"])["variable1"].transform(len) > 1
>>> must_be_isolated = df["variable1"] == "yyy"
>>> df[~(multiple & must_be_isolated)]
               dateuse   station variable1
0  2012-08-12 00:00:00  station1       xxx
2  2012-08-23 00:00:00  station2       aaa
3  2012-08-23 00:00:00  station3       bbb
4  2012-08-25 00:00:00  station4       ccc
5  2012-08-25 00:00:00  station4       ccc
6  2012-08-25 00:00:00  station4       ccc
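
For anyone who wants to run this end to end, here is a minimal sketch. The original dataframe is not shown on this page, so the input below is an assumption: rows 0 and 2 to 6 are copied from the output above, and row 1 is a guessed yyy row for station1 on the same day. The sketch uses transform("size") in place of transform(len), which should give the same counts, and the mixed_values mask is an optional, hypothetical variation, not part of the answer.

```python
import pandas as pd

# Assumed input: rows 0 and 2-6 mirror the output shown above; row 1 is a
# hypothetical "yyy" row for station1 on the same day.
df = pd.DataFrame({
    "dateuse": pd.to_datetime([
        "2012-08-12", "2012-08-12", "2012-08-23", "2012-08-23",
        "2012-08-25", "2012-08-25", "2012-08-25",
    ]),
    "station": ["station1", "station1", "station2", "station3",
                "station4", "station4", "station4"],
    "variable1": ["xxx", "yyy", "aaa", "bbb", "ccc", "ccc", "ccc"],
})

# Group size broadcast back to every row: True where the dateuse/station
# group has more than one row.
multiple = df.groupby(["dateuse", "station"])["variable1"].transform("size") > 1

# True where the row holds the value that must be alone in its group.
must_be_isolated = df["variable1"] == "yyy"

# Keep every row except the yyy rows that share a group with other rows.
result = df[~(multiple & must_be_isolated)]
print(result)

# Optional variation (an assumption about intent): only count groups that
# contain more than one distinct value, so repeated yyy rows with nothing
# else in the group would be kept.
mixed_values = df.groupby(["dateuse", "station"])["variable1"].transform("nunique") > 1
result_strict = df[~(mixed_values & must_be_isolated)]
```

With this assumed input, both masks drop only row 1. They differ only when a station has several yyy rows and nothing else on a given day: the size-based mask removes those rows, the nunique-based one keeps them.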
