How Do I Use A Specific Column's Value In A Pandas DataFrame Where Clause?

I'm trying to select all cells in a pandas DataFrame that meet a certain criterion, but only in rows where a specific column also meets a separate criterion. Given the following DataFrame: A B

Solution 1:

This should work directly, but pandas doesn't have a broadcasting "and" operator yet (it will happen in 0.14). Here's a workaround.

In [74]: df
Out[74]: 
     A  B  C  D
1/1  0  1  0  1
1/2  2  1  1  1
1/3  3  0  1  0
1/4  1  0  1  2
1/5  1  0  1  1
1/6  2  0  2  1
1/7  3  5  2  3

This is a where operation: essentially, it puts np.nan wherever the condition is False.

In [78]: x = df[df>df.shift(1)]

In [79]: x
Out[79]: 
      A   B   C   D
1/1 NaN NaN NaN NaN
1/2   2 NaN   1 NaN
1/3   3 NaN NaN NaN
1/4 NaN NaN NaN   2
1/5 NaN NaN NaN NaN
1/6   2 NaN   2 NaN
1/7   3   5 NaN   3
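
For reference, here is a minimal, self-contained sketch of the step above. It rebuilds the example frame by hand (the string index labels are an assumption about how the dates were stored) and checks that boolean-frame indexing gives the same result as calling DataFrame.where explicitly.

```python
import pandas as pd

# Rebuild the example frame shown above; the string labels are an assumption.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

# True where a cell is greater than the cell one row above it.
# The first row is compared against NaN, so it is False everywhere.
mask = df > df.shift(1)

x = df[mask]              # indexing with a boolean frame
x_where = df.where(mask)  # the explicit spelling of the same operation

print(x)
print(x.equals(x_where))
```

Cells where the mask is False become NaN, which is why the integer columns come back as floats.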

Then select rows by the second condition:

In [80]: x[df.D>1]
Out[80]: 
      A   B   C  D
1/4 NaN NaN NaN  2
1/7   3   5 NaN  3
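
Both steps can be wrapped up together. The sketch below is one possible way to do it, not the only one; the helper name increased_where is hypothetical, and the frame is rebuilt by hand from the example above.

```python
import pandas as pd


def increased_where(df, row_condition):
    """Hypothetical helper: keep cells that rose versus the previous row,
    then keep only the rows where row_condition is True."""
    increased = df.where(df > df.shift(1))  # step 1: cell-wise where
    return increased.loc[row_condition]     # step 2: row-wise boolean filter


# Same data as the example above; the string index is an assumption.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

result = increased_where(df, df["D"] > 1)
print(result)
```

The row condition is computed on the original df, not on the NaN-filled frame, so a row is kept whenever its D value passes the test, even if D itself did not increase.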

Solution 2:

I think the problem is actually that the boolean array from the shift operation is one element shorter than the other conditional. Try adding a False to the first conditional at index zero; you should then be able to combine the two conditionals.
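
A rough sketch of that padding idea follows. It assumes the comparison was built from NumPy slices, which is what produces an array one element short; Series.shift itself returns a result of the same length as its input. The column values are taken from column D of the frame in Solution 1.

```python
import numpy as np
import pandas as pd

# Column D from the example frame in Solution 1.
d = pd.Series([1, 1, 0, 2, 1, 1, 3], name="D")
values = d.to_numpy()

# Comparing slices drops the first element, so this has length 6, not 7.
rising = values[1:] > values[:-1]

# Pad with False at index zero so the lengths match again.
padded = np.concatenate(([False], rising))

# Now the two conditionals can be combined element-wise.
combined = padded & (values > 1)

print(len(rising), len(padded))
print(d[combined])
```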

If the problem really is with the second conditional, could you post the result of

DF.dtypes

It looks like it's not an int type, given the NaN array error.
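
As a small illustration of what that check can reveal, the sketch below uses a hypothetical two-column frame (not the asker's data). An integer column cannot hold NaN, so any column that has passed through a where operation is converted to float.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

# Hypothetical frame for illustration only.
df = pd.DataFrame({"A": [0, 2, 3, 1], "D": [1, 1, 0, 2]})
print(df.dtypes)  # both columns start out as integers

# Masking introduces NaN, which forces the columns to float.
masked = df.where(df > df.shift(1))
print(masked.dtypes)

print(is_integer_dtype(df["D"]), is_float_dtype(masked["D"]))
```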

