Selecting Multiple (neighboring) Rows Conditionally
Solution 1:
Here's a try. You would maybe want to use rolling or expanding (for speed and elegance) instead of explicitly looping with range, but I did it that way so as to be able to print out the rows being used to calculate each boolean.
df = df[['X', 'Y', 'Z']]  # remove the "total" column in order
                          # to make the syntax a little cleaner
df = df.head(4)           # keep the example more manageable
for i in range(len(df)):
    for k in range(i + 1, len(df) + 1):
        df_sum = df[i:k].sum()  # column sums over rows i .. k-1
        print("rows", i, "to", k, (df_sum > 0).all() & (df_sum.sum() > 10))
rows 0 to 1 True
rows 0 to 2 True
rows 0 to 3 True
rows 0 to 4 True
rows 1 to 2 False
rows 1 to 3 True
rows 1 to 4 True
rows 2 to 3 True
rows 2 to 4 True
rows 3 to 4 True
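For what it's worth, here is a rough sketch of the rolling variant mentioned above. It is not the code that produced the output shown; the sample data and the helper name window_flags are made up for illustration, and it assumes a default integer index (0, 1, 2, ...) with no missing values. For each window width it lets rolling compute the column sums of every run of neighboring rows at once, then applies the same two tests (every column sum positive, grand total above 10).

```python
import pandas as pd

# Hypothetical sample data; the DataFrame from the question is not shown here.
df = pd.DataFrame({
    "X": [5, -3, 4, 6],
    "Y": [2, 1, -1, 3],
    "Z": [4, 2, 5, -2],
})


def window_flags(frame, width):
    """Test every run of `width` neighboring rows of `frame`.

    Returns a boolean Series indexed by the LAST row of each run.
    """
    # rolling(...).sum() leaves NaN in the first width-1 rows; drop those.
    sums = frame.rolling(window=width).sum().dropna()
    return (sums > 0).all(axis=1) & (sums.sum(axis=1) > 10)


# Collect the results keyed the same way as the explicit loop: (i, k)
# means rows i .. k-1.
results = {}
for width in range(1, len(df) + 1):
    for end_label, ok in window_flags(df, width).items():
        start = end_label - width + 1
        results[(start, end_label + 1)] = bool(ok)

for (i, k), ok in sorted(results.items()):
    print("rows", i, "to", k, ok)
```

This still loops over the window widths, but the inner loop over start rows is handed to pandas, which is where most of the time goes on a long frame.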
Solution 2:
I am not sure I understood your question correctly, but if you are looking to filter a dataframe on multiple conditions at once, you can consider this approach:
new_df = df[(df["X"] > 0) & (df["Y"] < 0)]
The & operator combines conditions with AND; replace it with | for OR. Do remember to wrap each condition in parentheses ().
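To make the AND/OR difference concrete, here is a small sketch on made-up data (the column values are invented, not taken from the question). The parentheses matter because & and | bind more tightly than comparisons such as > and <, so without them Python would try to evaluate 0 & df["Y"] first.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "X": [1, -2, 3, 4, -1],
    "Y": [-1, -5, 2, -3, 1],
})

# AND: keep rows where X is positive and Y is negative.
both = df[(df["X"] > 0) & (df["Y"] < 0)]

# OR: keep rows where at least one of the two conditions holds.
either = df[(df["X"] > 0) | (df["Y"] < 0)]

print(both)
print(either)
```

Both results keep the original index labels, so you can see which rows survived each filter.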
Lastly, if you want to remove duplicate rows, you can use this (it returns a new dataframe rather than changing new_df in place):
new_df.drop_duplicates()
You can find more information about this function here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html
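A quick sketch of how drop_duplicates behaves, again on invented data. The subset and keep parameters are part of the pandas API; the values in the frame are just an assumption for the example.

```python
import pandas as pd

# Hypothetical filtered result containing a repeated row.
new_df = pd.DataFrame({"X": [1, 1, 2], "Y": [-1, -1, -1]})

# Drop rows that are identical across all columns; the first copy is kept.
deduped = new_df.drop_duplicates()

# Compare only column Y and keep the last occurrence instead.
by_y = new_df.drop_duplicates(subset=["Y"], keep="last")

print(deduped)
print(by_y)
```

Note that new_df itself is unchanged, so assign the result to a name if you want to keep it.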
Hope my answer is useful to you.