In my case, I need to extract data from a production machine. My co-workers designed the following strategy (API) for me.
- Log in to the application - use the cookie to set up your application / scripts;
- Access the project home page - lots of project information in JSON format;
- Extract project IDs by using the filter, and use a special API to get the detailed information of each qualified project ...
Here is my code:
import requests
# A Session keeps cookies between requests
s = requests.Session()
login_data = dict(User='XXXXXX', Password='XXXXXX')
url = 'http://internal-pilot-XXXXXXX-elb-15h4lq2sm46fi-6574XXXXX.us-east-1.elb.amazonaws.com/XXXX-editor-web/spring_security_login'
# Step 1: log in
s.post(url, data=login_data)
# Step 2: request the project list (JSON)
r = s.get('http://internal-pilot-XXXXXXX-elb-15h4lq2sm46fi-6574XXXXX.us-east-1.elb.amazonaws.com/XXXX-editor-web/api/projects')
I executed the code line by line. After running 's.post(url, data=login_data)', I noticed that although I got 'Response [200]', the session cookie jar was empty.
>>> s.post(url, data=login_data)
<Response [200]>
>>> s.cookies
<RequestsCookieJar[]>
After running s.get('...'), I noticed that even though I got 'Response [403]', the session cookie jar was no longer empty.
>>> s.get('http://internal-pilot-XXXXXXX-elb-15h4lq2sm46fi-6574XXXXX.us-east-1.elb.amazonaws.com/oaqc-editor-web/api/projects')
<Response [403]>
>>> s.cookies.get_dict()
{'AWSELB': '39E1F543067A169F5670C20A97C217D25E0183C29D4C14F38EFC1FC58E993C6F96E88F97B58950E092F4C948A0A99AE42DED20A93E542EFD80F074EB26477729DB0DD1B5469C655062CB6005E3C6F5BDDDCEA57A12', 'JSESSIONID': '61A6DFB9DDDF1BFD7FD4F6B47E7E2B2D'}
Then I tried r = s.get('...', cookies=s.cookies) but still got 'Response [403]', which I take to mean that the cookies were not successfully stored and passed to the following request.
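To narrow this down, here is a rough sketch of how the login response could be inspected to see whether the login actually went through. The helper name summarize_response is my own, and the fake response at the bottom is only an offline stand-in with made-up values so the helper can be tried without network access; with the real session it would be called as summarize_response(s.post(url, data=login_data)).

```python
from types import SimpleNamespace


def summarize_response(resp):
    """Collect the fields that hint at whether a login request succeeded.

    Works on a requests.Response (or any object exposing the same
    status_code, url, history, and headers attributes).
    """
    return {
        "status": resp.status_code,
        # URL after any redirects were followed
        "final_url": resp.url,
        # Intermediate redirect responses; a Set-Cookie header may be sent
        # on one of these hops rather than on the final response
        "redirects": [
            (hop.status_code, hop.headers.get("Location"))
            for hop in resp.history
        ],
        # Set-Cookie header of the final response, or None if absent
        "set_cookie": resp.headers.get("Set-Cookie"),
    }


# Offline stand-in for a response (hypothetical values, not real output)
fake_hop = SimpleNamespace(status_code=302, headers={"Location": "/home"})
fake_resp = SimpleNamespace(
    status_code=200,
    url="http://example.invalid/home",
    history=[fake_hop],
    headers={"Set-Cookie": "JSESSIONID=abc"},
)
summary = summarize_response(fake_resp)
print(summary)
```

If the summary for the real login request shows no redirects and no Set-Cookie header, the 200 may simply be the login page being returned again rather than a successful login, but that is only my guess.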
Did I make a mistake here? I searched and found many similar discussions on Stack Overflow, but none of them solved my issue. Thanks a lot.