Note that this isn't perfect, since if you had something like, say, <a title=">">
it would break. However, it's about the closest you'd get in non-library Python without a really complex function:
import re
TAG_RE = re.compile(r'<[^>]+>')
def remove_tags(text):
return TAG_RE.sub('', text)
However, as lvc mentions xml.etree
is available in the Python Standard Library, so you could probably just adapt it to serve like your existing lxml
version:
def remove_tags(text):
return ''.join(xml.etree.ElementTree.fromstring(text).itertext())