
Select All Divs Except Ones With Certain Classes In Beautifulsoup

As discussed in this question, one can easily get all divs with certain classes. But here, I have a list of classes that I want to exclude, and I want to get all divs that don't ha

Solution 1:

Using a CSS selector, try this (the class list inside :not() must not be quoted):

divs = soup.select("div:not(.class1, .class2, .class3)")
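When the excluded classes live in a Python list, the selector can be built from that list. The following is a minimal sketch, assuming the class names are plain identifiers that need no CSS escaping; the sample HTML and the names excluded and selector are made up for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup, not taken from the question
html = """
<div class="class1"></div>
<div class="class2 extra"></div>
<div class="class4"></div>
<div></div>
"""
soup = BeautifulSoup(html, "html.parser")

excluded = ["class1", "class2", "class3"]

# Builds "div:not(.class1, .class2, .class3)" from the list
selector = "div:not({})".format(", ".join("." + name for name in excluded))

divs = soup.select(selector)
print(divs)
```

A div is dropped if any one of its classes is in the list, and divs with no class attribute at all are kept.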


Solution 2:

Alternatively, pass a function as the class_ filter:

soup.find_all('div', class_=lambda x: x not in classToIgnore)

Example

from bs4 import BeautifulSoup
html = """
<div class="c1"></div>
<div class="c1"></div>
<div class="c2"></div>
<div class="c3"></div>
<div class="c4"></div>
"""
soup = BeautifulSoup(html, 'html.parser')
classToIgnore = ["c1", "c2"]
print(soup.find_all('div', class_=lambda x: x not in classToIgnore))

Output

[<div class="c3"></div>, <div class="c4"></div>]
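One caveat: Beautiful Soup treats class as a multi-valued attribute, so a filter that looks at one class value at a time may still match a div such as class="c1 c5", because c5 is not in the list. A sketch of a stricter variant is to pass a function that receives the whole tag and checks all of its classes at once. The helper name has_no_ignored_class is hypothetical, and the sample HTML is made up for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup with a multi-class div and a class-less div
html = """
<div class="c1 c5"></div>
<div class="c3"></div>
<div></div>
"""
soup = BeautifulSoup(html, "html.parser")
classToIgnore = {"c1", "c2"}


def has_no_ignored_class(tag):
    # tag.get("class") is a list of class names, or None if the attribute is missing
    classes = tag.get("class") or []
    # Keep only divs whose class list shares nothing with classToIgnore
    return tag.name == "div" and classToIgnore.isdisjoint(classes)


kept = soup.find_all(has_no_ignored_class)
print(kept)
```

Here the div with class="c1 c5" is excluded because one of its classes is ignored, while the div without any class is kept.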

If you are dealing with nested divs, try removing the unwanted inner divs with decompose() first, and then just call find_all('div'):

for div in soup.find_all('div', class_=lambda x: x in classToIgnore):
    div.decompose()
print(soup.find_all('div'))

This might leave some extra whitespace behind, but you can easily strip that off later.
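To see the effect on nested markup, here is a small self-contained sketch. The HTML is made up for illustration, and it passes the list directly as the class_ filter, which matches a div carrying any of the listed classes.

```python
from bs4 import BeautifulSoup

# Illustrative markup: an ignored div nested inside a wanted one
html = '<div class="c3">outer<div class="c1">inner</div></div><div class="c4">other</div>'
soup = BeautifulSoup(html, "html.parser")
classToIgnore = ["c1", "c2"]

# find_all collects the matches up front, so tags can be removed while looping
for div in soup.find_all("div", class_=classToIgnore):
    div.decompose()

remaining = soup.find_all("div")
print(remaining)
```

Note that decompose() modifies the tree in place: the outer div is returned without its ignored child. If the original tree is still needed afterwards, parse the HTML again into a second soup.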
