How to remove tags using BeautifulSoup in Python?

Prerequisite: BeautifulSoup module
In this article, we are going to write a Python script that removes a tag from the parse tree and then completely destroys it and its contents. For this, we use the decompose() method, which is built into the module.
Syntax:
Tag.decompose()
Tag.decompose() removes a tag from the tree of a given HTML document, then completely destroys it and its contents. The method itself returns None.
Implementation:
Example 1:
Python3
# import module
from bs4 import BeautifulSoup

# HTML markup to parse
markup = '<a href="https://www.zambiatek.com/">Welcome to <i>zambiatek.com</i></a>'
soup = BeautifulSoup(markup, 'html.parser')

# display before decompose
print("Before Decompose")
print(soup.a)

# decompose the <a> tag; decompose() returns None
new_tag = soup.a.decompose()
print("After decomposing:")
print(new_tag)
Output:
Before Decompose
<a href="https://www.zambiatek.com/">Welcome to <i>zambiatek.com</i></a>
After decomposing:
None
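The None printed after decomposing is the return value of decompose(), not what is left of the tree. As a minimal sketch, assuming a small made-up markup string (the names markup and bold are only illustrative), the remaining tree can be printed directly, and the removed tag can be checked through its decomposed property, which should be available in Beautiful Soup 4.9.0 and later:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration
markup = '<p>Hello <b>bold</b> world</p>'
soup = BeautifulSoup(markup, 'html.parser')

# Keep a reference to the tag so it can be inspected afterwards
bold = soup.b

# Remove the tag from the tree and destroy it and its contents
result = bold.decompose()

print(result)           # return value of decompose()
print(soup)             # the tree without the <b> tag
print(soup.b)           # lookup of the removed tag in the tree
print(bold.decomposed)  # whether the tag object has been destroyed
```

Printing soup rather than the return value is what shows that the tag is really gone from the document.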
Example 2: Scraping the HTML document at a given URL and decomposing the whole tree.
Python3
# import module
from bs4 import BeautifulSoup
import requests

# Get the HTML of the page
# (url must hold the address of the page to scrape)
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

# Before decomposing
print("Before Decomposing")
print(soup)

# decompose the whole soup; decompose() returns None
result = soup.decompose()
print("After decomposing:")
print(result)
Output:
Before Decomposing
<!DOCTYPE html>
<!--[if IE 7]>
<html class="ie ie7" lang="en-US" prefix="og: http://ogp.me/ns#">
<![endif]-->
<!--[if IE 8]>
<html class="ie ie8" lang="en-US" prefix="og: http://ogp.me/ns#">
<![endif]-->
<!--[if !(IE 7) | !(IE 8) ]><!-->
<html lang="en-US" prefix="og: http://ogp.me/ns#">
<!--<![endif]-->
<head>
<meta charset="utf-8"/>..
……
After decomposing:
None
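Decomposing the whole soup destroys the entire document, so nothing is left to work with. A more typical use is to remove only selected tags and keep the rest. The following is a minimal sketch of that idea, assuming an inline HTML string instead of a scraped page, and assuming script and style are the tags to drop (both choices are illustrative, not taken from the example above):

```python
from bs4 import BeautifulSoup

# Hypothetical document, used in place of a downloaded page
html = """<html><head><title>Demo</title><style>p {color: red;}</style></head>
<body><p>Keep me</p><script>alert('x');</script></body></html>"""
soup = BeautifulSoup(html, 'html.parser')

# find_all() returns a list of matches, so it is safe to
# decompose each tag while looping over the result
for tag in soup.find_all(['script', 'style']):
    tag.decompose()

# The remaining tree no longer contains <script> or <style>
print(soup)
```

The same loop works on a soup built from requests.get(url).text; only the source of the HTML changes.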



