BeautifulSoup Remove HTML Tags
Python/BeautifulSoup – how to remove all tags from an element?
How can I simply strip all tags from an element I find in BeautifulSoup?
asked Apr 25 ’13 at 4:26 by Hugo
With BeautifulStoneSoup gone in bs4, it’s even simpler in Python 3:
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")  # html holds the markup to strip
text = soup.get_text()
print(text)
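To apply the same idea to a single element rather than the whole document, something like the following sketch should work. The sample markup, the `id="content"` attribute, and the variable names are made up for illustration; `get_text()` and its `separator` and `strip` arguments are part of bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question
html = '<div id="content"><p>First paragraph</p><p>Second <b>paragraph</b></p></div>'

soup = BeautifulSoup(html, "html.parser")
element = soup.find("div", id="content")

# All text beneath the element, with the tags dropped
plain = element.get_text()
print(plain)   # First paragraphSecond paragraph

# Strip each text fragment and join the fragments with a space
spaced = element.get_text(separator=" ", strip=True)
print(spaced)  # First paragraph Second paragraph
```

Without a separator, text from adjacent tags is concatenated directly, so passing `separator` is usually worth it when the markup has block-level elements.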
answered Jan 27 ’15 at 2:47 by shawnl
answered Apr 29 ’14 at 0:40 by Bobby
Use get_text(); it returns all the text in a document or beneath a tag as a single Unicode string.
For instance, remove all the different script tags from the following text:
The expected result is:
Signal et Communication
Ingénierie Réseaux et Télécommunications
Here is the source code:
#!/usr/bin/env python3
from bs4 import BeautifulSoup

text = '''
'''
soup = BeautifulSoup(text, "html.parser")
print(soup.get_text())
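Whether get_text() leaves out the contents of script and style tags can depend on the bs4 version, so a more defensive variant removes those tags explicitly before extracting the text. The markup below is a hypothetical stand-in built around the expected result above, not the original sample.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the two visible text lines come from the expected result
html = """
<html><head><script>var tracked = true;</script><style>p { color: red; }</style></head>
<body><p>Signal et Communication</p><script>console.log("hi");</script>
<p>Ingénierie Réseaux et Télécommunications</p></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Remove every <script> and <style> tag together with its contents
for tag in soup(["script", "style"]):
    tag.decompose()

# One text fragment per line, with surrounding whitespace stripped
text = soup.get_text(separator="\n", strip=True)
print(text)
```

Calling `soup([...])` is shorthand for `soup.find_all([...])`; it returns a list, so decomposing tags inside the loop does not disturb the iteration.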
answered Jul 20 ’15 at 16:37 by SparkAndShine
You can use the decompose method in bs4:
soup = bs4.BeautifulSoup('