Removing webpage newline characters in Python

by Jack Simpson

An issue I recently came across while using the Python requests module was that, when trying to parse HTML text, I couldn't remove the newline characters ('\n') with strip().

The solution is to call the decode() method on the webpage content before you parse the text. page.content is a bytes object, and decode() converts it into a regular string, on which strip() removes newlines as expected.


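import requests

# requests needs a full URL, including the scheme
url = 'https://www.google.com'
page = requests.get(url)

# page.content is bytes; decode() returns a str (UTF-8 by default)
text = page.content.decode()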
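
To see why decoding helps, here is a minimal sketch that needs no network access. It assumes the stubborn newlines came from converting the bytes with str() rather than decode(), which is a guess on my part since the failing code isn't shown above. The `raw` value is a made-up stand-in for page.content.

```python
# Hypothetical stand-in for page.content, which requests returns as bytes.
raw = b"<p>Hello</p>\n"

# Assumed cause: str() on bytes gives the repr, so the newline turns into
# the two literal characters backslash and 'n', which strip() leaves alone.
as_repr = str(raw)

# decode() gives real text containing an actual newline character.
as_text = raw.decode("utf-8")

print(repr(as_repr.strip()))  # still ends with a literal backslash-n
print(repr(as_text.strip()))  # newline removed
```

If your page isn't UTF-8, pass the right encoding name to decode() instead.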
