Removing webpage newline characters in Python

by Jack Simpson December 27, 2016

An issue I recently came across whilst using the Python requests module was that while I was trying to parse HTML text, I couldn’t remove the newline characters ‘\n’ with strip().

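The solution is to call the decode() method on the webpage content before you parse the text. page.content is a bytes object, and decode() converts it into a regular string (UTF-8 by default), which the usual string methods then handle as expected.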


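import requests

# requests needs a full URL, including the scheme
url = 'https://google.com'
page = requests.get(url)
# page.content is bytes; decode() returns a str (UTF-8 by default)
text = page.content.decode()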
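
To see why decoding helps, here is a minimal sketch that needs no network access. The `raw` value is a made-up stand-in for `page.content`, and the guess that the stubborn newlines come from calling `str()` on the bytes is my assumption about one likely cause, not something confirmed above.

```python
# Hypothetical stand-in for page.content (requests returns bytes here).
raw = b"<p>Hello</p>\n<p>World</p>\n"

# str() on bytes gives the printable representation, so each newline
# becomes the two characters backslash and n, which strip() ignores.
as_repr = str(raw)

# decode() gives a real string containing real newline characters.
as_text = raw.decode("utf-8")

# Split on the real newlines and strip each line.
lines = [line.strip() for line in as_text.splitlines() if line.strip()]

print(as_repr)
print(lines)
```

Once the content is decoded, splitlines() and strip() work line by line as you would expect.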
