How To Check If Pandas Rows Contain Any Full String Or Substring Of A List?
I have a list of strings, list_ = ['abc', 'def', 'xyz'], and a DataFrame df with a column CheckCol. I want to check whether each value in CheckCol contains any of the strings in the list, either as the whole value or as a substring.
Solution 1:
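I think the easiest solution to implement is a regular expression. In regex, the pipe | means "or", so '|'.join(list_) builds a single pattern ('abc|def|xyz') that matches any of the substrings we want to check. str.contains tests each row against that pattern, and np.where keeps the original value where it matches and an empty string otherwise.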
import pandas as pd
import numpy as np
list_ = ['abc', 'def', 'xyz']
df = pd.DataFrame({
'CheckCol': ['a','ab','abc','abd-def']
})
df['NewCol'] = np.where(df['CheckCol'].str.contains('|'.join(list_)), df['CheckCol'], '')
print(df)
#   CheckCol   NewCol
# 0        a
# 1       ab
# 2      abc      abc
# 3  abd-def  abd-def

NOTE: Your variable name list was changed to list_. Try to avoid shadowing Python built-in names such as list.
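One caveat with joining the list directly: if any of the strings contains a regex metacharacter (for example '.' or '+'), it is interpreted as regex syntax rather than literal text, and missing values in the column produce NaN rather than False. The following is a minimal sketch of a more defensive variant, assuming the substrings should be matched literally and that missing values should count as "no match". The extra None row and the names pattern and mask are illustrative additions, not part of the original answer.

```python
import re

import pandas as pd

list_ = ['abc', 'def', 'xyz']
# Same data as above, plus a hypothetical missing value to show the na handling.
df = pd.DataFrame({'CheckCol': ['a', 'ab', 'abc', 'abd-def', None]})

# Escape each substring so regex metacharacters are matched literally.
pattern = '|'.join(re.escape(s) for s in list_)

# na=False treats missing values as "no match" instead of returning NaN.
mask = df['CheckCol'].str.contains(pattern, regex=True, na=False)

# Keep the original value where the row matches, otherwise an empty string.
df['NewCol'] = df['CheckCol'].where(mask, '')
print(df)
```

If only a True/False flag is needed, mask can be assigned to a column directly instead of building NewCol.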