12

I'm trying to login to https://www.voxbeam.com/login using requests to scrape data. I'm a python beginner and I have done mostly tutorials, and some web scraping on my own with BeautifulSoup.

Looking at the HTML:

<form id="loginForm" action="https://www.voxbeam.com//login" method="post" autocomplete="off">

<input name="userName" id="userName" class="text auto_focus" placeholder="Username" autocomplete="off" type="text">

<input name="password" id="password" class="password" placeholder="Password" autocomplete="off" type="password">

<input id="challenge" name="challenge" value="78ed64f09c5bcf53ead08d967482bfac" type="hidden">

<input id="hash" name="hash" type="hidden">

I understand I should be using the method post, and sending userName and password

I'm trying this:

import requests
import webbrowser

url = "https://www.voxbeam.com/login"
login = {'userName': 'xxxxxxxxx',
         'password': 'yyyyyyyyy'}

print("Original URL:", url)

r = requests.post(url, data=login)

print("\nNew URL", r.url)
print("Status Code:", r.status_code)
print("History:", r.history)

print("\nRedirection:")
for i in r.history:
    print(i.status_code, i.url)

# Open r in the browser to check if I logged in
new = 2  # open in a new tab, if possible
webbrowser.open(r.url, new=new)

I’m expecting, after a successful login to get in r the URL to the dashboard, so I can begin scraping the data I need.

When I run the code with the authentication information in place of xxxxxx and yyyyyy, I get the following output:

Original URL: https://www.voxbeam.com/login

New URL https://www.voxbeam.com/login
Status Code: 200
History: []

Redirection:

Process finished with exit code 0

I get in the browser a new tab with www.voxbeam.com/login

Is there something wrong in the code? Am I missing something in the HTML? It’s ok to expect to get the dashboard URL in r, or to be redirected and trying to open the URL in a browser tab to check visually the response, or I should be doing things in a different way?

I been reading many similar questions here for a couple of days, but it seems every website authentication process is a little bit different, and I checked http://docs.python-requests.org/en/latest/user/authentication/ which describes other methods, but I haven’t found anything in the HTML that would suggest I should be using one of those instead of post

I tried too

r = requests.get(url, auth=('xxxxxxxx', 'yyyyyyyy')) 

but it doesn’t seem to work either.

1
  • 1
    You should post all form fields ( userName, password, challenge, hash ) Commented Apr 7, 2017 at 19:33

4 Answers 4

17

As said above, you should send values of all fields of form. Those can be find in the Web inspector of browser. This form send 2 addition hidden values:

url = "https://www.voxbeam.com//login"
data = {'userName':'xxxxxxxxx','password':'yyyyyyyyy','challenge':'zzzzzzzzz','hash':''}  
# note that in email have encoded '@' like uuuuuuu%40gmail.com      

session = requests.Session()
r = session.post(url, headers=headers, data=data)

Also, many sites have protection from a bot like hidden form fields, js, send encoded values, etc. As variants you could:

1) Use a cookies from manual login:

url = "https://www.voxbeam.com"
headers = {'user-agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36"}
cookies = {'PHPSESSID':'zzzzzzzzzzzzzzz', 'loggedIn':'yes'}

s = requests.Session()
r = s.post(url, headers=headers, cookies=cookies)

2) Use module Selenium:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

url = "https://www.voxbeam.com//login"
driver = webdriver.Firefox()
driver.get(url)

u = driver.find_element_by_name('userName')
u.send_keys('xxxxxxxxx')
p = driver.find_element_by_name('password')
p.send_keys('yyyyyyyyy')
p.send_keys(Keys.RETURN)
Sign up to request clarification or add additional context in comments.

3 Comments

Thank you for your help. I'm still working on the login, but you set me on the right track. The comment about using %40 instead of @ was a great detail, as I was doing it the wrong way.
You're manually defining cookies? That's not how any website uses cookies... You have to access cookies from the response.
if the login form is generated by js from backend server, this method wont work because driver.find_element_by_name('userName') will fail
3

Try to specify the URL more clearly as follows :

  url=https://www.voxbeam.com//login?id=loginForm

This will setFocus on the login form so that POST method applys

Comments

1

It's very tricky depending on how the website handles the login process but what I did was that I used Charles which is a proxy application and listened to requests that my browser sent to the website's server while I was logging in manually. Afterwards I copied the exact same header and cookie that was shown in Charles into my own python code and it worked! I assume the cookie and header are used to prevent bot logging in.

Comments

0
from webbot import Browser

web = Browser() # this will navigate python to browser

link = web.go_to('enter your login page url') 
#remember click the login button then place here

login = web.click('login') #if you have login button in your web , if you have signin button then replace login with signin, in my case it is login


id = web.type('enter your Id/Username/Emailid',into='Id/Username/Emilid',id='txtLoginId') #id='txtLoginId' this varies from web to web find this by inspecting the Id/Username/Emailid Button, in my case it is txtLoginId

next = web.click('NEXT', tag='span')

passw = web.type('Enter Your Password', into='Password', id='txtpasswrd')
#id='txtpasswrd' (this also varies from web to web similiarly inspect the Password Button)in my case it is txtpasswrd

home = web.click('NEXT', id="fa fa-home", tag='span') 
# id="fa fa-home" (Now inspect all necessary Buttons and move accordingly) in my case it is fa fa-home
next11 = web.click('NEXT', tag='span')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.