
Filtering Multiple Columns Pandas

I have a method which takes a pandas dataframe as an input: def dfColumnFilter(df, columnFilter, columnName): ''' Returns a filtered DataFrame Keyword arguments: df

Solution 1:

You can use *args to accept any number of (column, value) pairs:

def filter_df(df, *args):
    for k, v in args:
        df = df[df[k] == v]
    return df

It can be used like this:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 1], 'b': [1, 3, 3, 3]})

>>> filter_df(df, ('a', 1), ('b', 3))
    a   b
2   1   3
3   1   3

Note

In theory, you could use **kwargs, which would have a more pleasing usage:

filter_df(df, a=1, b=2)

but the keyword form only works for columns whose names are valid Python identifiers.
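
For illustration, a minimal sketch of what the **kwargs variant might look like. The name filter_df_kwargs and the extra column 'my col' are hypothetical and not part of the original answer. A column name that is not a valid identifier can still be passed by unpacking a dict, at the cost of the tidy call syntax.

```python
import pandas as pd


def filter_df_kwargs(df, **kwargs):
    # Hypothetical helper: keep only rows where every named column equals its value.
    for column, value in kwargs.items():
        df = df[df[column] == value]
    return df


# Sample data; 'my col' is an assumed column added to show a non-identifier name.
df = pd.DataFrame({'a': [1, 2, 1, 1], 'b': [1, 3, 3, 3], 'my col': [0, 0, 1, 0]})

# Keyword form: works because 'a' and 'b' are valid identifiers.
by_keywords = filter_df_kwargs(df, a=1, b=3)

# Dict-unpacking form: needed for a name such as 'my col'.
by_dict = filter_df_kwargs(df, **{'my col': 1})
```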

Edit

See comment below by @Goyo for a better implementation point.

Solution 2:

You can combine boolean masks directly, where column1 and column2 hold the column names:

filtered_df = df[(df[column1]=='foo') & (df[column2]=='bar')]

You can chain further conditions with &, wrapping each comparison in parentheses, because & binds more tightly than ==.
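
When the number of conditions is not known in advance, one possible approach is to collect the masks in a list and fold them together with &. The sketch below assumes sample column names and values (column1, column2, column3) chosen only for illustration; functools.reduce and operator.and_ are from the standard library.

```python
from functools import reduce
import operator

import pandas as pd

# Assumed sample data, not taken from the original question.
df = pd.DataFrame({
    'column1': ['foo', 'foo', 'baz', 'foo'],
    'column2': ['bar', 'qux', 'bar', 'bar'],
    'column3': [1, 2, 3, 4],
})

# Each entry is a boolean Series aligned with df's index.
conditions = [
    df['column1'] == 'foo',
    df['column2'] == 'bar',
    df['column3'] > 1,
]

# Equivalent to conditions[0] & conditions[1] & conditions[2].
mask = reduce(operator.and_, conditions)
filtered_df = df[mask]
```

Building a single mask and indexing once avoids creating an intermediate DataFrame for every condition.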
