
How To Print Missing Item Of The List While Using Str.contains Pandas

I am filtering some data from a CSV file, which works fine, but when matching list items via a str.contains regex in pandas, it prints the result for the items which it finds but i

Solution 1:

Instead of using .str.contains, use .str.extractall to get exactly the substrings that match the items in your list. Then check which elements of the list matched at least once, using either .isin or set logic.

pat = '(' + '|'.join(search_list) + ')'  # '(kpc2021|kpc8291|kpc8471|kpc8472|kpc1165)'

result = pd.DataFrame({'item': search_list})
result['in_df'] = result['item'].isin(df['Server Name'].str.extractall(pat)[0])

print(result)

      item  in_df
0  kpc2021   True
1  kpc8291   True
2  kpc8471   True
3  kpc8472  False
4  kpc1165  False

Using .str.extractall we get a Series of the substrings that matched. It has a MultiIndex: the outer level is the original DataFrame index, and the inner level counts the matches found on that row (.extractall can return multiple matches per row).

df['Server Name'].str.extractall(pat)[0]
#    match
# 0  0        kpc2021
# 1  0        kpc8291
# 2  0        kpc8471
# Name: 0, dtype: object
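
To see the whole thing end to end, here is a minimal sketch. The sample DataFrame is made up, since the real CSV is not shown in the question, and the re.escape call is an addition of mine to guard against regex metacharacters in the search terms:

```python
import re
import pandas as pd

# Hypothetical sample data; the real CSV is not shown in the question.
df = pd.DataFrame({'Server Name': ['kpc2021.example.com',
                                   'kpc8291.example.com',
                                   'kpc8471.example.com',
                                   'abc0001.example.com']})
search_list = ['kpc2021', 'kpc8291', 'kpc8471', 'kpc8472', 'kpc1165']

# One capture group containing every search term as an alternative.
pat = '(' + '|'.join(re.escape(s) for s in search_list) + ')'

# Series of the substrings that actually matched somewhere in the column.
matched = df['Server Name'].str.extractall(pat)[0]

result = pd.DataFrame({'item': search_list})
result['in_df'] = result['item'].isin(matched)

# The items from search_list that never matched.
missing = result.loc[~result['in_df'], 'item'].tolist()
print(missing)
```

With this sample data, missing holds the two items that never appear in the column.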

Solution 2:

I think you can try comparing two lists:

serverName_list=df['Server Name'].unique().tolist()

If all elements of the Server Name column have the same format, you should clean the data first, for example:

serverName_clean_list=[] 
for element in serverName_list:
    serverName_clean_list.append(element.split(".")[0])

Then, following the approach from "Python find elements in one list that are not in the other":

import numpy as np
# yields the elements of search_list that are NOT in serverName_clean_list,
# i.e. the searched items that are missing from the data
main_list = np.setdiff1d(search_list, serverName_clean_list).tolist()
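
A rough, self-contained version of this idea might look like the sketch below. The sample data is hypothetical, and it assumes every server name starts with the searched token followed by a dot. Unlike str.contains, this is an exact comparison, not a substring match:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; assumes names look like '<token>.<domain>'.
df = pd.DataFrame({'Server Name': ['kpc2021.example.com',
                                   'kpc8291.example.com',
                                   'kpc8471.example.com',
                                   'abc0001.example.com']})
search_list = ['kpc2021', 'kpc8291', 'kpc8471', 'kpc8472', 'kpc1165']

# Keep only the part before the first dot, then deduplicate.
clean_names = (df['Server Name'].astype(str)
               .str.split('.').str[0]
               .unique().tolist())

# setdiff1d returns the sorted unique values of the first argument
# that do not appear in the second one.
missing = np.setdiff1d(search_list, clean_names).tolist()
print(missing)
```

Note that np.setdiff1d sorts its result, so the order of search_list is not preserved.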

Solution 3:

To return the rows that do not match, negate the mask with ~:

df = df[~df['Server Name'].astype(str).str.contains('|'.join(search_list))]

Solution 4:

Try with the ~ symbol:

df = df[~df['Server Name'].astype(str).str.contains('|'.join(search_list))]
print(df)
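
Keep in mind that the ~ filter gives you the rows of the DataFrame that match none of the search terms, which is not the same thing as the search terms that are missing from the data. A small sketch of the difference, using made-up sample data and a plain per-item check (my own addition, not part of the answer above):

```python
import pandas as pd

# Hypothetical sample data; the real CSV is not shown in the question.
df = pd.DataFrame({'Server Name': ['kpc2021.example.com',
                                   'kpc8291.example.com',
                                   'kpc8471.example.com',
                                   'abc0001.example.com']})
search_list = ['kpc2021', 'kpc8291', 'kpc8471', 'kpc8472', 'kpc1165']

names = df['Server Name'].astype(str)

# Rows whose server name contains none of the search terms.
mask = names.str.contains('|'.join(search_list))
non_matching_rows = df[~mask]

# Search terms that do not occur in any row (literal substring check).
missing_items = [s for s in search_list
                 if not names.str.contains(s, regex=False).any()]

print(non_matching_rows)
print(missing_items)
```

The per-item loop scans the column once per search term, which is fine for a short list but can get slow for a long one.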
