I am attempting to log in to a website using Requests and seem to be hitting a wall. Any advice would be appreciated.

I'm attempting to log in to economist.com (no reason, just something I have a username and password for), whose login page is at https://www.economist.com/user/login and whose login form has the attribute action="https://www.economist.com/user/login?destination=%2F".

Using the Chrome developer tools, the form data for a login request is as follows:

name: ///////// 
pass: ////////
form_build_id: form-483956e97a61f73fbc0ebf06b04dbe3f
form_id: user_login
securelogin_original_baseurl: https://www.economist.com
op: Log in

My code GETs the login page and uses BeautifulSoup to extract the form_build_id. It then POSTs to the login URL with my username and password, the retrieved form_build_id, and the other hidden fields. Finally, it uses BeautifulSoup to check whether the homepage banner shows a login or a logout link, to determine whether I have actually logged in.

The code is as follows:

import requests
from bs4 import BeautifulSoup

# Setting user agent to a real browser instead of requests
headers = requests.utils.default_headers()
headers.update(
    {
        'User-Agent': 'Mozilla/5.0',
    }
)

# create a session and login
s = requests.Session()
login_page = s.get('https://www.economist.com/user/login', headers=headers)
login = BeautifulSoup(login_page.text, 'lxml')
form = login.select_one("form > div > input")
payload = {
    'name': '////////////',
    'pass': '////////',
    'form_build_id': form['value'],
    'form_id': 'user_login',
    'securelogin_original_baseurl': 'https://www.economist.com',
    'op': 'Log in',
}
response = s.post("https://www.economist.com/user/login?destination=%2F",
                  data=payload, headers=headers)

# check homepage banner to see if login or logout link is there
url = "https://www.economist.com/"
r = s.get(url, headers=headers)
soup = BeautifulSoup(r.text, 'lxml')
banner = soup.select("div > div > span > a")
for table_row in banner:
    print(table_row['href'])

When run, this code shows that the banner still has the login link instead of the logout link, which, I assume, means that it's not logged in. I know I must have made some very simple mistake in here, but after reading through other similar questions on here, I can't seem to find where I'm going awry. I'd appreciate any advice on making this work.

1 Answer

I tried your code, and only one thing did not work for me. Change:

form = login.select_one("form > div > input") 

To:

form = login.find('input', attrs={'name': "form_build_id"})
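
To see why the positional selector is fragile, here is a minimal sketch on made-up markup (the HTML below is a hypothetical stand-in, not the real Economist login page, and it uses the built-in "html.parser" so lxml is not needed). The CSS selector returns whichever input happens to come first under the form's div, while a lookup by the name attribute does not depend on field order:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a login form; the real page differs.
SAMPLE_HTML = """
<form action="/user/login" method="post">
  <div>
    <input type="text" name="name" value="">
    <input type="password" name="pass" value="">
    <input type="hidden" name="form_build_id" value="form-abc123">
    <input type="hidden" name="form_id" value="user_login">
  </div>
</form>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Positional selector: first <input> that is a child of a <div> child of <form>.
by_position = soup.select_one("form > div > input")

# Lookup by name: independent of where the field sits in the form.
by_name = soup.find("input", attrs={"name": "form_build_id"})

print(by_position["name"])  # name
print(by_name["value"])     # form-abc123
```

On this sample the positional selector picks the username field, so form['value'] would send the wrong value as form_build_id. Whether that happens on the real page depends on its markup at the time of the request.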

Then log in normally. To check whether I am logged in, I request a page that only logged-in users can visit: http://www.economist.com/subscriptions/activation

If you can visit this page, you are logged in; otherwise you will be redirected to https://www.economist.com/user/register?destination=subscriptions%2Factivation&rp=activating

import requests
from bs4 import BeautifulSoup

# Setting user agent to a real browser instead of requests
headers = requests.utils.default_headers()
headers.update(
    {
        'User-Agent': 'Mozilla/5.0',
    }
)

# create a session and login
s = requests.Session()
login_page = s.get('https://www.economist.com/user/login', headers=headers)
login = BeautifulSoup(login_page.text, 'lxml')
# look the hidden field up by name instead of by its position in the form
form = login.find('input', attrs={'name': "form_build_id"})

payload = {
    'name': '*****',
    'pass': '*****',
    'form_build_id': form['value'],
    'form_id': 'user_login',
    'securelogin_original_baseurl': 'https://www.economist.com',
    'op': 'Log in',
}
response = s.post("https://www.economist.com/user/login?destination=%2F",
                  data=payload, headers=headers)

# request a page that only logged-in users can see, then inspect the final URL
activation_page = s.get('http://www.economist.com/subscriptions/activation', headers=headers)
if activation_page.url == 'https://www.economist.com/user/register?destination=subscriptions%2Factivation&rp=activating':
    print("Failed to log in")
elif activation_page.url == 'http://www.economist.com/subscriptions/activation':
    print("Logged in successfully!")
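
As the comment below notes, the protected page may itself redirect somewhere else after a successful login, so comparing against one exact success URL can miss it. A more forgiving sketch checks only the path of the final URL. It assumes that an anonymous request ends up under /user/register or /user/login, which is inferred from this thread and not verified against the site. The helper name looks_logged_in is hypothetical:

```python
from urllib.parse import urlsplit

# Paths an anonymous visitor is assumed to be redirected to (per this thread).
# Treat this tuple as an assumption and adjust it for the site you target.
ANONYMOUS_PATHS = ("/user/register", "/user/login")


def looks_logged_in(final_url):
    """Return True unless the final URL is one of the anonymous-only pages."""
    path = urlsplit(final_url).path
    return not path.startswith(ANONYMOUS_PATHS)


print(looks_logged_in(
    "https://www.economist.com/user/register"
    "?destination=subscriptions%2Factivation&rp=activating"
))  # False
print(looks_logged_in("http://www.economist.com/subscriptions/activation"))  # True
```

With the session from the code above, this would be called as looks_logged_in(activation_page.url).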

1 Comment

Thanks. This works. I think my form was working, though yours is a much less breakable form. I think my problem all along was a crappy test to see if I was logged in. Yours is much more elegant (though /activation redirects me to /thankyou, but the premise is the same).
