Removing webpage newline characters in Python

An issue I recently came across while using the Python requests module: when I was trying to parse HTML text, I couldn't remove the newline characters ('\n') with strip().

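The solution is to call the decode() method on the webpage content before you parse the text. The content comes back as a bytes object, and decode() turns it into a regular string (str), so the newline characters can then be handled with the usual string methods.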


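import requests

url = 'https://www.google.com'  # requests needs the scheme (http:// or https://)
page = requests.get(url)
# page.content is bytes; decode() returns a str (UTF-8 by default)
html = page.content.decode()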
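
To see why decoding matters, here is a small offline sketch. It is only a guess at the cause, since the failing code is not shown in this post: it assumes the bytes were turned into text with str() somewhere along the way. The `raw` value is a made-up stand-in for page.content, and all variable names are hypothetical.

```python
# Hypothetical response body, standing in for page.content (a bytes object).
raw = b"<p>Hello</p>\n<p>World</p>\n"

# str() on bytes returns the repr ("b'...'"), so each newline shows up as
# a literal backslash followed by "n" instead of a real newline character.
as_repr = str(raw)

# decode() returns a real str containing real newline characters.
text = raw.decode("utf-8")

# strip() only trims whitespace at the two ends of the string.
stripped = text.strip()

# To drop newlines inside the text, replace them or split on them.
no_newlines = text.replace("\n", "")
lines = text.splitlines()

print(repr(as_repr))
print(repr(stripped))
print(lines)
```

Note that even after decoding, strip() leaves the newline in the middle of the text alone, so replace() or splitlines() is the better fit when every newline has to go.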

Computational biology PhD candidate at the Australian National University. I love writing (both articles and software), learning more about the world around us, and beekeeping. I also write for BioSky.co

Written by Jack Simpson
