How to find tags with only certain attributes - BeautifulSoup

Question

How would I, using BeautifulSoup, search for tags containing ONLY the attributes I search for?

For example, I want to find all <td valign="top"> tags.

The following code:

    raw_card_data = soup.fetch('td', {'valign': re.compile('top')})

gets all of the data I want, but also grabs any <td> tag that has valign="top" alongside other attributes.

I also tried:

    raw_card_data = soup.findAll(re.compile('<td valign="top">'))

and this returns nothing (probably because of bad regex).

I was wondering if there was a way in BeautifulSoup to say "Find <td> tags whose only attribute is valign:top".

UPDATE: For example, if an HTML document contained the following <td> tags:

    <td valign="top">.....</td><br />
    <td width="580" valign="top">.......</td><br />
    <td>.....</td><br />

I would want only the first <td> tag (<td valign="top">) to return.

User · Answer

Just pass it as an argument of findAll:

    >>> from BeautifulSoup import BeautifulSoup
    >>> soup = BeautifulSoup("""
    ... <html>
    ... <head><title>My Title!</title></head>
    ... <body><table>
    ... <tr><td>First!</td>
    ... <td valign="top">Second!</td></tr>
    ... </table></body></html>
    ... """)
    >>>
    >>> soup.findAll('td')
    [<td>First!</td>, <td valign="top">Second!</td>]
    >>>
    >>> soup.findAll('td', valign='top')
    [<td valign="top">Second!</td>]
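For readers on the newer bs4 package, here is a minimal sketch of the same keyword-argument filter under Python 3. It assumes bs4 is installed and uses the standard-library "html.parser" backend; the sample markup is made up for illustration and modelled on the cells in the question. It also shows that filtering on valign="top" still returns cells that carry additional attributes, which is the behaviour the question wants to avoid.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the cells in the question.
html = (
    '<table><tr>'
    '<td valign="top">only valign</td>'
    '<td width="580" valign="top">valign and width</td>'
    '<td>no attributes</td>'
    '</tr></table>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all is the bs4 spelling of findAll; keyword arguments filter on
# attribute values but say nothing about other attributes on the tag.
matches = soup.find_all("td", valign="top")

for td in matches:
    print(td.attrs, td.get_text())
```

Both the first and the second cell come back, so an extra check on the tag's attrs is still needed to get "only valign".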

User · Answer

The easiest way to do this is with the new CSS-style select method:

    soup = BeautifulSoup(html)
    results = soup.select('td[valign="top"]')
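A CSS attribute selector matches on the attributes it names and ignores any others, so on its own it does not express "only valign". The sketch below, which assumes bs4 with the standard-library "html.parser" backend and made-up sample markup, narrows the selector's result by comparing each tag's attrs dict.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup based on the question's three cells.
html = (
    '<td valign="top">a</td>'
    '<td width="580" valign="top">b</td>'
    '<td>c</td>'
)
soup = BeautifulSoup(html, "html.parser")

# Matches every <td> whose valign is "top", whatever else is on the tag.
selected = soup.select('td[valign="top"]')

# Keep only the tags whose attribute dict is exactly {"valign": "top"}.
exact = [td for td in selected if td.attrs == {"valign": "top"}]

print(len(selected), len(exact))
```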

User · Answer

If you only want to match on the attribute name, whatever its value:

    from bs4 import BeautifulSoup
    import re

    soup = BeautifulSoup(html.text, 'lxml')
    results = soup.findAll("td", {"valign": re.compile(r".*")})

    # As per Steve Lorimer, it is better to pass True instead of a regex.
    results = soup.findAll("td", {"valign": True})
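As a small self-contained illustration of the True form, the sketch below assumes bs4 with the standard-library "html.parser" backend (so lxml is not required) and uses made-up markup. It collects the valign value of every cell that has the attribute at all.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two cells with valign, one without.
html = (
    '<td valign="top">a</td>'
    '<td valign="bottom">b</td>'
    '<td>c</td>'
)
soup = BeautifulSoup(html, "html.parser")

# True matches any tag that has the attribute, regardless of its value.
with_valign = soup.find_all("td", {"valign": True})
values = [td["valign"] for td in with_valign]

print(values)
```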

User · Answer

Find using an attribute in any tag:

    # <th class="team" data-sort="team">Team</th>
    soup.find_all(attrs={"class": "team"})

    # <th data-sort="team">Team</th>
    soup.find_all(attrs={"data-sort": "team"})
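The attrs dict is particularly useful for names that cannot be written as Python keyword arguments. A minimal sketch, assuming bs4 with the standard-library "html.parser" backend and a made-up table header:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup.
html = (
    '<table><tr>'
    '<th class="team" data-sort="team">Team</th>'
    '<th>Score</th>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

# "data-sort" contains a hyphen, so it is not a valid keyword argument
# name; the attrs dict is the way to filter on it.
by_data = soup.find_all(attrs={"data-sort": "team"})

# "class" is a reserved word in Python; bs4 accepts either the attrs
# dict or the class_ keyword.
by_class = soup.find_all(attrs={"class": "team"})
by_class_kw = soup.find_all(class_="team")

print([th.get_text() for th in by_data])
```

No tag name is given here, so the search runs over every tag in the document.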

User · Answer

As explained in the BeautifulSoup documentation, you may use this:

    soup = BeautifulSoup(html)
    results = soup.findAll("td", {"valign": "top"})

EDIT:

To return tags that have only the valign="top" attribute, you can check the length of the tag's attrs property:

    from BeautifulSoup import BeautifulSoup

    html = """<td valign="top">.....</td>
    <td width="580" valign="top">.......</td>
    <td>.....</td>"""

    soup = BeautifulSoup(html)
    results = soup.findAll("td", {"valign": "top"})

    # Every result already has valign="top"; exactly one attribute
    # therefore means valign is the only one.
    for result in results:
        if len(result.attrs) == 1:
            print(result)

That returns:

    <td valign="top">.....</td>

User · Answer

Adding a combination of Chris Redford's and Amr's answers, you can also search for an attribute name with any value using the select command:

    from bs4 import BeautifulSoup as Soup

    html = """<td valign="top">.....</td>
    <td width="580" valign="top">.......</td>
    <td>.....</td>"""

    soup = Soup(html, 'lxml')
    results = soup.select('td[valign]')

User · Answer

You can use lambda functions in findAll, as explained in the documentation. In your case, to search for td tags with only valign="top", use the following:

    td_tag_list = soup.findAll(
        lambda tag: tag.name == "td" and
                    len(tag.attrs) == 1 and
                    tag.get("valign") == "top")
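The same function-filter idea can be wrapped in a small reusable helper. The sketch below assumes bs4 with the standard-library "html.parser" backend; has_exactly is a hypothetical name introduced for illustration, not part of BeautifulSoup. Comparing the whole attrs dict also covers a cell whose single attribute is something other than valign, a case where indexing tag["valign"] would raise KeyError.

```python
from bs4 import BeautifulSoup


def has_exactly(name, attrs):
    """Return a filter matching tags called `name` whose attributes equal `attrs`.

    Hypothetical helper for illustration; not a BeautifulSoup API.
    """
    def matcher(tag):
        return tag.name == name and tag.attrs == attrs
    return matcher


# Hypothetical sample markup, including a cell that has one attribute
# which is not valign.
html = (
    '<td valign="top">a</td>'
    '<td width="580" valign="top">b</td>'
    '<td width="580">c</td>'
    '<td>d</td>'
)
soup = BeautifulSoup(html, "html.parser")

# find_all accepts a function that takes a tag and returns True or False.
only_valign_top = soup.find_all(has_exactly("td", {"valign": "top"}))

print(only_valign_top)
```

One caveat: bs4 parses multi-valued attributes such as class into lists by default, so an expected dict involving class would need a list value, for example {"class": ["team"]}.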
