
How To Check If Pandas Rows Contain Any Full String Or Substring Of A List?

I have a list of strings, list_ = ['abc', 'def', 'xyz'], and a df with a column CheckCol. I want to check whether each value in CheckCol contains any of the strings in the list, either in whole or as a substring.

Solution 1:

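I think the easiest solution to implement is a regular expression. In a regex, the pipe | means "or", so '|'.join(list_) builds a single pattern that matches any of the substrings we want to check. str.contains tests each value against that pattern, and np.where keeps the original value where it matches and an empty string otherwise.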

import pandas as pd
import numpy as np

list_ = ['abc', 'def', 'xyz']

df = pd.DataFrame({
    'CheckCol': ['a','ab','abc','abd-def']
})

# Keep the value where it contains any string from list_, else use ''
df['NewCol'] = np.where(df['CheckCol'].str.contains('|'.join(list_)), df['CheckCol'], '')

print(df)

#   CheckCol   NewCol
# 0        a
# 1       ab
# 2      abc      abc
# 3  abd-def  abd-def
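One caveat with joining the list directly: if any string contains regex metacharacters such as . or +, they are interpreted as pattern syntax rather than literal text. The sketch below escapes each string with re.escape before joining. The list and column values here are made up for illustration and are not from the question.

```python
import re
import pandas as pd

# Hypothetical search strings that contain regex metacharacters
list_ = ['a.c', 'x+y']

df = pd.DataFrame({'CheckCol': ['abc', 'a.c', 'x+y=z', 'xxy']})

# Escape each string so it is matched literally, then join with "or"
pattern = '|'.join(re.escape(s) for s in list_)
mask = df['CheckCol'].str.contains(pattern)

# For comparison: the same join without escaping
unescaped = df['CheckCol'].str.contains('|'.join(list_))

print(mask.tolist())       # [False, True, True, False]
print(unescaped.tolist())  # [True, True, False, True]
```

Without escaping, 'a.c' matches 'abc' (the dot matches any character) and 'x+y' fails to match the literal text 'x+y=z'. For the plain alphabetic strings in the question, both forms give the same result.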

NOTE: Your variable name list was changed to list_. list is a Python built-in rather than a reserved word, but assigning to it shadows the built-in, so try to avoid it.
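If CheckCol can contain missing values, str.contains may return a missing value for those rows instead of True or False, which np.where does not treat as a clean "no match". A minimal sketch, assuming a column with one missing entry (the data is made up for illustration), passes na=False so missing rows count as non-matches:

```python
import numpy as np
import pandas as pd

list_ = ['abc', 'def', 'xyz']

# Hypothetical data with a missing value in the column
df = pd.DataFrame({'CheckCol': ['abc', None, 'abd-def', 'a']})

# na=False makes missing values evaluate to False in the mask
mask = df['CheckCol'].str.contains('|'.join(list_), na=False)
df['NewCol'] = np.where(mask, df['CheckCol'], '')

print(mask.tolist())          # [True, False, True, False]
print(df['NewCol'].tolist())  # ['abc', '', 'abd-def', '']
```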
