I am trying to get a name from a public LinkedIn URL using Python 2.7 and requests.
The code used to work fine.
import requests
from bs4 import BeautifulSoup
url = "https://www.linkedin.com/in/linustorvalds"
html = requests.get(url).content
link = BeautifulSoup(html).title.text.split("|")[0].replace(" ","")
print link
The desired output is:
linustorvalds
I am getting the following error message:
AttributeError: 'NoneType' object has no attribute 'text'
The problem seems to be that html does not contain the real content of the page, so no title element is found and .title is None. This is the result of printing html:
<html><head>
<script type="text/javascript">
window.onload = function() {
var newLocation = "";
if (window.location.protocol == "http:") {
var cookies = document.cookie.split("; ");
for (var i = 0; i < cookies.length; ++i) {
if ((cookies[i].indexOf("sl=") == 0) && (cookies[i].length > 3)) {
newLocation = "https:" + window.location.href.substring(window.location.protocol.length);
}
}
}
if (newLocation.length == 0) {
var domain = location.host;
var newDomainIndex = 0;
if (domain.substr(0, 6) == "touch.") {
newDomainIndex = 6;
}
else if (domain.substr(0, 7) == "tablet.") {
newDomainIndex = 7;
}
if (newDomainIndex) {
domain = domain.substr(newDomainIndex);
}
newLocation = "https://" + domain + "/uas/login?trk=sentinel_org_block&session_redirect=" + encodeURIComponent(window.location)
}
window.location.href = newLocation;
}
</script>
</head></html>
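The failure can be detected before it raises. The following is a minimal sketch, not a fix: it assumes the response body is available as text (for example requests.get(url).text) and uses a plain regular expression instead of BeautifulSoup so that it has no third-party dependencies. The names LOGIN_MARKER, extract_name and is_login_redirect are hypothetical helpers introduced only for this illustration; the marker string is copied from the script above.

```python
import re

# Substring copied from the redirect script in the response above.
# Its presence suggests the body is the login-redirect stub, not a profile.
LOGIN_MARKER = "/uas/login?trk=sentinel_org_block"


def extract_name(html):
    """Return the part of <title> before "|" with spaces removed.

    Returns None when the document has no <title>, instead of raising
    AttributeError the way BeautifulSoup(html).title.text does.
    """
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    return match.group(1).split("|")[0].replace(" ", "")


def is_login_redirect(html):
    """Return True if the body contains the login-redirect marker."""
    return LOGIN_MARKER in html
```

With these helpers, a caller could test is_login_redirect(html) first and only call extract_name(html) when a real page came back.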
Am I being blocked? How can I make this code work as before?
Thanks a lot!
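The script ends with window.location.href = newLocation, which is a JavaScript redirect. requests does not execute JavaScript, so you probably need to follow that redirect yourself.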
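To see where that redirect leads, the script's logic can be mirrored in Python. This is only a sketch under stated assumptions: it covers the branch taken when the page is loaded over https (or no "sl=" cookie is set), and login_redirect_target is a hypothetical name, not part of requests. The safe characters passed to quote are chosen to approximate JavaScript's encodeURIComponent.

```python
try:
    from urllib.parse import quote, urlsplit  # Python 3
except ImportError:
    from urllib import quote  # Python 2
    from urlparse import urlsplit


def login_redirect_target(page_url):
    """Rebuild the URL the script assigns to window.location.href.

    Assumes the https branch of the script: the host loses a leading
    "touch." or "tablet." prefix and the original URL is appended,
    percent-encoded, as the session_redirect parameter.
    """
    host = urlsplit(page_url).netloc
    for prefix in ("touch.", "tablet."):
        if host.startswith(prefix):
            host = host[len(prefix):]
            break
    # encodeURIComponent leaves these characters unescaped in addition
    # to letters, digits, "-", "_" and ".".
    encoded = quote(page_url, safe="~()*!'")
    return ("https://" + host
            + "/uas/login?trk=sentinel_org_block&session_redirect="
            + encoded)
```

For the URL in the question this yields a /uas/login address, which suggests that following the redirect would land on a sign-in page rather than on the profile itself.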