Data scientist at Port Jackson Partners in Sydney, Australia. My PhD was in computational biology. In my spare time I write about medical research at BioSky.co.
An issue I recently came across while using the Python requests module: when parsing HTML text, I couldn't remove the newline characters '\n' with strip().
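The solution is to call the decode() method on the webpage content before parsing the text. page.content is a bytes object, and decode() converts it to a regular string (UTF-8 by default), after which the newline characters can be removed as usual.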
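import requests

url = 'https://www.google.com'  # requests needs the scheme (http:// or https://)
page = requests.get(url)
text = page.content.decode()  # bytes -> str, with real newline characters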
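To see why decode() helps, here is a minimal sketch that runs without a network call. It rests on an assumption about the cause: that the bytes content was turned into text with str() at some point, which leaves an escaped backslash-n in the string instead of a real newline. The variable raw is a hypothetical stand-in for page.content, not real output from any site.

```python
# Hypothetical stand-in for page.content (requests returns the body as bytes).
raw = b"<p>Hello</p>\n<p>World</p>\n"

# str() on bytes keeps the b'...' wrapper and escapes the newline, so the
# result contains a backslash followed by 'n' rather than a newline character.
as_repr = str(raw)

# decode() converts the bytes into a proper str with real newlines.
text = raw.decode("utf-8")

# Real newlines can now be split on and stripped as usual.
lines = [line.strip() for line in text.splitlines() if line.strip()]

print(repr(as_repr))
print(lines)
```

If that assumption holds, strip() had nothing to remove in the first case because the string contained no actual newline character. As an alternative, requests also exposes page.text, which returns the body already decoded to a string.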