How To Check If Pandas Rows Contain Any Full String Or Substring Of A List?
I have a list of strings, list_ = ['abc', 'def', 'xyz'], and a DataFrame df with a column CheckCol. I want to check whether each value in CheckCol contains any of the strings in the list, either as the whole value or as a substring.
Solution 1:
I think the simplest solution to implement is a regular expression. In regex, the pipe | means "or", so '|'.join(yourlist) builds a single pattern that matches whenever any of the substrings occurs in the value.
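To make the joined pattern concrete, here is a small sketch using only the standard re module. The sample values are made up for illustration and mirror the ones used in the pandas example.

```python
import re

list_ = ['abc', 'def', 'xyz']

# Joining with the pipe yields one alternation pattern
pattern = '|'.join(list_)
print(pattern)  # abc|def|xyz

# re.search succeeds if any alternative occurs anywhere in the string
samples = ['a', 'ab', 'abc', 'abd-def']
hits = [bool(re.search(pattern, s)) for s in samples]
print(hits)  # [False, False, True, True]
```

Series.str.contains treats its argument as a regex by default, which is why the same joined string can be passed to it directly.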
import pandas as pd
import numpy as np
list_ = ['abc', 'def', 'xyz']
df = pd.DataFrame({
'CheckCol': ['a','ab','abc','abd-def']
})
df['NewCol'] = np.where(df['CheckCol'].str.contains('|'.join(list_)), df['CheckCol'], '')
print(df)
#   CheckCol   NewCol
# 0        a
# 1       ab
# 2      abc      abc
# 3  abd-def  abd-def
NOTE: Your variable name list was changed to list_. Avoid shadowing Python built-in names such as list.
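One caveat with the joined pattern: if the list entries contain regex metacharacters (such as . or +), they are interpreted as regex syntax rather than literal text. A minimal sketch of one way to guard against that, assuming you want literal matching, is to pass each entry through re.escape before joining. The terms and sample values below are hypothetical, and na=False is used so that missing values count as "no match".

```python
import re
import pandas as pd

# Hypothetical search terms that contain regex metacharacters
terms = ['a.c', 'x+y']
df = pd.DataFrame({'CheckCol': ['abc', 'a.c', 'x+y=z', 'xxy', None]})

# Unescaped: '.' matches any character and '+' means "one or more"
naive_pattern = '|'.join(terms)
naive_mask = df['CheckCol'].str.contains(naive_pattern, na=False)

# Escaped: every term is matched as literal text
safe_pattern = '|'.join(re.escape(t) for t in terms)
safe_mask = df['CheckCol'].str.contains(safe_pattern, na=False)

print(naive_mask.tolist())  # [True, True, False, True, False]
print(safe_mask.tolist())   # [False, True, True, False, False]
```

The boolean mask can be passed to np.where as in the answer above, or used directly to filter rows with df[safe_mask].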