I am trying to use requests to scrape some data from this website, but I am running into some issues.

My code is the following:

import requests
from requests.auth import HTTPBasicAuth
r = requests.get("https://v4.fitnessandlifestylecentre.com/WebAccess/login.aspx", auth=HTTPBasicAuth('atoto', 'password'))
print(r.text)

(the login password combination is not a valid one for obvious reasons)

However, the response is not the page I would see after a successful login. Instead I get the login page back, slightly changed (probably because the website treated the request as an unsuccessful login attempt).

Can you please help me understand what is going wrong?

EDIT: I have tried to post the arguments in the following manner:

payload = {'edUsername': 'atoto', 'edPassword': 'password'}
r = requests.get("https://v4.fitnessandlifestylecentre.com/WebAccess/login.aspx", data=payload)

but the result is the same. I noticed some hidden variables in the form. Should I post them as well?

  • print (r.status_code, r.headers, r.request.headers) Commented Dec 23, 2013 at 15:32

3 Answers

1

You should check the form for hidden fields (there are some there).

There is probably a field for CSRF protection. Inspect the form and the response you get back from requests closely, and check whether the returned page contains any errors (application-level errors in the HTML, not HTTP errors).
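One way to inspect the form's hidden fields programmatically is sketched below, using only the standard library's html.parser. The names HiddenInputParser and extract_hidden_fields and the sample markup are made up for illustration; the real login page will have its own fields and values.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of the hidden form inputs found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Illustrative markup only (an assumption, not copied from the real site).
SAMPLE_FORM = """
<form method="post" action="login.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="edUsername" />
  <input type="password" name="edPassword" />
</form>
"""

hidden = extract_hidden_fields(SAMPLE_FORM)
```

In practice the HTML string would be the text of the response to a GET on the login page, and the resulting dict would be merged with the username and password before posting.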

1

I have observed that when logging in, the following data are posted to the server:

[screenshot of the posted form data]

So I think you have to put those fields into a dict and then POST them to the server. Note that the ASP.NET hidden field names start with two underscores. For example:

>>> payload = {'__VIEWSTATE': 'THE_LONG_STRING', '__EVENTVALIDATION': 'THE_LONG_STRING', 'edUsername': YOUR_USER_NAME, ...}  # plus the other posted fields
>>> res = requests.post(url, data=payload)

0

It may be that the website does not support HTTP Basic Authentication. In that case you need to submit the values of the fields on the login form to the login.aspx URL using an HTTP POST request, e.g.:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)

See http://docs.python-requests.org/en/latest/user/quickstart/#more-complicated-post-requests
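To illustrate why the auth=HTTPBasicAuth(...) attempt in the question does not reach a form-based login: HTTP Basic Authentication sends the credentials in an Authorization request header, not as form fields in the request body. The sketch below builds that header by hand with the standard library. The helper name basic_auth_header is hypothetical, and the UTF-8 encoding is an assumption that makes no difference for ASCII credentials.

```python
import base64


def basic_auth_header(username, password):
    """Build the header an HTTP Basic Authentication client would send."""
    # Assumption: UTF-8 encoding; identical to Latin-1 for ASCII input.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + token}


header = basic_auth_header("atoto", "password")
```

A login page that reads edUsername and edPassword from the posted form never looks at this header, so the server just serves the login page again.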

Also, perhaps the login form page is responding with cookies. In that case you need to make two requests: one to retrieve the login form page (and its cookies), and a second one that submits your form data along with the cookie data. See http://docs.python-requests.org/en/latest/user/quickstart/#cookies

Also, ensure the hidden form values you submit in your second request match the values in the form in your first response.

UPDATE:

The login form is setting cookies, so to emulate a normal browser login you should send those back in your second request.

Your first request would be like this:

>>> import requests
>>> url = "https://v4.fitnessandlifestylecentre.com/WebAccess/login.aspx"
>>> r1 = requests.get(url)

You can access the cookies through the response object's cookies attribute:

>>> r1.cookies
<<class 'requests.cookies.RequestsCookieJar'>[Cookie(version=0, name='ASP.NET_SessionId', value='plhmrq3syuqgcyab1g52nq55', port=None, port_specified=False, domain='v4.fitnessandlifestylecentre.com', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False), Cookie(version=0, name='SDAWA_culture', value='en-US', port=None, port_specified=False, domain='v4.fitnessandlifestylecentre.com', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=1392999422, discard=False, comment=None, comment_url=None, rest={}, rfc2109=False)]>

Your second request should submit the cookies like so (assuming your credentials and the other form data are in a dict called payload):

r2 = requests.post(url, data=payload, cookies=r1.cookies)

5 Comments

I have tried this already as well but will amend my question.
Yes, the server may well be checking for the hidden form variable so include that too. Any normal sign in with a web browser will submit all form variables including hidden ones.
For some reason I am unable to come up with something that works.
I have expanded my answer.
If cookies need to be maintained across requests, rather than using the system above you should use a Requests Session object, which will do the hard work for you.
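The Session suggestion in the last comment could look roughly like the sketch below. It is not a tested login for this particular site. The function name login and its extract_hidden parameter (any callable that returns the form's hidden inputs as a dict) are assumptions made for illustration. The session argument is expected to behave like requests.Session, which stores cookies from each response and sends them with later requests.

```python
def login(session, url, credentials, extract_hidden):
    """GET the login form, then POST hidden fields plus credentials.

    session        -- object with get(url) and post(url, data=...) methods,
                      for example a requests.Session instance
    credentials    -- dict of visible form fields, e.g. edUsername/edPassword
    extract_hidden -- hypothetical callable: HTML string -> dict of hidden inputs
    """
    # First request: fetch the form; a Session keeps the cookies it sets.
    form_page = session.get(url)

    # Reuse the hidden values from this exact response, then add credentials.
    payload = dict(extract_hidden(form_page.text))
    payload.update(credentials)

    # Second request: same session, so the stored cookies are sent along.
    return session.post(url, data=payload)
```

With requests installed, a call might look like login(requests.Session(), url, {'edUsername': 'atoto', 'edPassword': 'password'}, some_extractor). Whether the site also expects further fields, such as the submit button's name, would need to be checked against the form.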
