0

Getting a blank screen when running a Python program

Please help. This may be a duplicate question, but I don't know Python well; I am an Android developer.

Here is my code:

import sys
import requests
from bs4 import BeautifulSoup, SoupStrainer

home_url = 'https://parivahan.gov.in/rcdlstatus/'
post_url = 'https://parivahan.gov.in/rcdlstatus/vahan/rcDlHome.xhtml'
# Everything before the last four digits: GJ03KA
first = sys.argv[1]
# The last four digits: 0803
second = sys.argv[2]

r = requests.get(url=home_url)
cookies = r.cookies
soup = BeautifulSoup(r.text, 'html.parser')
viewstate = soup.select('input[name="javax.faces.ViewState"]')[0]['value']

data = {
    'javax.faces.partial.ajax':'true',
    'javax.faces.source': 'form_rcdl:j_idt32',
    'javax.faces.partial.execute':'@all',
    'javax.faces.partial.render': 'form_rcdl:pnl_show form_rcdl:pg_show form_rcdl:rcdl_pnl',
    'form_rcdl:j_idt32':'form_rcdl:j_idt32',
    'form_rcdl':'form_rcdl',
    'form_rcdl:tf_reg_no1': first,
    'form_rcdl:tf_reg_no2': second,
    'javax.faces.ViewState': viewstate,
}

r = requests.post(url=post_url, data=data, cookies=cookies)
soup = BeautifulSoup(r.text, 'html.parser')
table = SoupStrainer('tr')
soup = BeautifulSoup(soup.get_text(), 'html.parser', parse_only=table)
print(soup.get_text())
4
  • r returns response 500, i.e. Internal Server Error. Visiting the URL parivahan.gov.in/rcdlstatus/vahan/rcstatus.xhtml in a browser also returns error code 500, with a "Bad request" message. Are you sure it is the right address? Commented May 21, 2018 at 16:15
  • Always check those error codes! And note that in HTTP all the codes in the 200s indicate success. Commented May 21, 2018 at 16:17
  • @Claire I edited the code. Please help me: I tried many variations of the code but had no success in Python, and I don't know much about it. Today is the first time in my life I have touched it. Commented May 21, 2018 at 16:58
  • Try this, better, free and, legal way shrouded-falls-48764.herokuapp.com youtube.com/watch?v=bMj-1BGbxfc Commented Sep 20, 2020 at 4:16

4 Answers

2

If you print the result of the requests POST (r), you'll see a 500 error, which is the generic HTTP status for a server-side error. My guess is that the URL is wrong or the data being posted to it isn't formatted correctly.
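
A minimal sketch of that check, assuming you just want to classify the status code before parsing anything. The helper name describe_status is made up for illustration; it uses only the standard library so it runs without network access. With requests you would pass it r.status_code (requests also offers r.ok and r.raise_for_status() for the same purpose).

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return (is_success, reason_phrase) for an HTTP status code.

    Hypothetical helper: only codes in the 200s count as success.
    """
    is_success = 200 <= status_code < 300
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Code is not one of the registered HTTP status codes
        phrase = "Unknown status"
    return is_success, phrase


# With requests, the same check would be applied to r.status_code:
#     r = requests.post(post_url, data=data, cookies=cookies)
#     ok, phrase = describe_status(r.status_code)
#     if not ok:
#         print("Request failed:", r.status_code, phrase)
print(describe_status(500))
print(describe_status(200))
```

If the status is not a success, the body is an error page rather than the result table, so parsing it for tr tags will print nothing useful.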

1

Let me open a new answer in response to the renewed question.

After trying a few approaches with just requests and urllib, I think it is better to drive a real browser with Selenium WebDriver.

The following code will grab the table rows as you want.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

url = 'https://parivahan.gov.in/rcdlstatus/'

# Optional: run a "headless" browser, i.e. suppress the browser window
chrome_options = Options()
chrome_options.add_argument("--headless")

# Let the driver open, fill and submit the form
driver = webdriver.Chrome(options=chrome_options)
driver.get(url)
driver.delete_all_cookies()
wait = WebDriverWait(driver, 10)
# The j_idt34 / j_idt63 ids are generated by the site and may change
wait.until(EC.element_to_be_clickable((By.NAME, 'form_rcdl:j_idt34')))
input1 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no1')
input1.send_keys('GJ03KA')
input2 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no2')
input2.send_keys('0803')
driver.find_element(By.NAME, 'form_rcdl:j_idt34').click()

# Get the result table
try:
    wait.until(EC.presence_of_element_located((By.ID, "form_rcdl:j_idt63")))
    result_html = driver.page_source
    #print(result_html)
    soup = BeautifulSoup(result_html, 'lxml')
    print(soup.find_all('tr'))
except TimeoutException:
    print('Time out.')
finally:
    # Close the browser whether or not the result appeared
    driver.quit()

Running this prints the HTML tags of the table rows found in soup.

I hope the government does not find out and block this approach before you get to try it, lol.

Hope this helps!
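
If you need plain values rather than raw tags, here is a rough sketch of turning table rows into lists of cell text. It uses only the standard library so it runs without extra installs; RowCollector, table_rows and the sample HTML are made up for illustration, and the real page's labels and values will differ. It ignores nested tables. With BeautifulSoup the equivalent is roughly [td.get_text(strip=True) for td in tr.find_all('td')] for each tr.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the <tr> currently open
        self._cell = None   # text fragments of the cell currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_rows(html):
    """Return a list of rows, each a list of cell strings."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample standing in for driver.page_source
sample = """
<table>
  <tr><td>Registration No:</td><td>XX00XX0000</td></tr>
  <tr><td>Fuel Type:</td><td>PETROL</td></tr>
</table>
"""
for row in table_rows(sample):
    print(row)
```

In the Selenium script you would call table_rows(result_html) instead of passing the sample string.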

8 Comments

Hi @Claire, this opens the page in a browser, but I need the result as an API response or as the web page's HTML source in the soup, because only then will I be able to build an API on top of it.
I do get the HTML source in result_html when I print the commented line. But with your code, when it tries to print soup.findAll('tr'), it throws a TimeoutException.
Hi @ShubhamSejpal With the "headless" argument as specified in answer, it will not open a browser window.
@ShubhamSejpal I'm not sure. The code prints the tags in soup alright here, with or without the commented line. Here's a demo for printing out the Web Page HTML source result_html: imgur.com/a/5Jaf8Yh You can manipulate the soup (soup form of result_html), to extract information and further handle it as you wish.
@ShubhamSejpal Maybe could you provide the Traceback to have a look?
0

The URL that actually returns a valid form webpage is 'https://parivahan.gov.in/rcdlstatus/'.

Entering the example ID (Reg No.) in a browser brings up the error message "Registration No. does not exist!!! Please check the number." (which makes total sense; I do hope you didn't put a real ID in public, lol).

Since I don't have a valid ID to test with, please see if this solves your problem.

Another thing I noticed is that the fields for entering the registration number should be "form_rcdl:tf_reg_no1" and "form_rcdl:tf_reg_no2". You can view the HTML source of the webpage (e.g. Ctrl+U in Chrome) to verify.

3 Comments

I have now also changed the registration number in the code comment to my real bike number.
@ShubhamSejpal Let me have a look. Nice that you are now dynamically retrieving the viewstate already. It's a long way you've come so far for the first day! :D
@ShubhamSejpal Done :) Just posted a new answer to keep comments relevant, so newcomers will know what's going on
0

You have hardcoded j_idt32 as the button id. Note that the button id on this website is dynamic, so your program should pick up the right button id at run time. Here is the solution:

import sys
import re
import requests
from bs4 import BeautifulSoup, SoupStrainer

home_url = 'https://parivahan.gov.in/rcdlstatus/?pur_cd=102'
post_url = 'https://parivahan.gov.in/rcdlstatus/vahan/rcDlHome.xhtml'
# Everything before the last four digits: MH02CL
first = sys.argv[1]
# The last four digits: 0555
second = sys.argv[2]

r = requests.get(url=home_url)
cookies = r.cookies
soup = BeautifulSoup(r.text, 'html.parser')
viewstate = soup.select('input[name="javax.faces.ViewState"]')[0]['value']
# The submit button's id (e.g. form_rcdl:j_idt32) is generated by the site,
# so take the first button whose id starts with "form_rcdl".
button = soup.find('button', id=re.compile(r'^form_rcdl'))
if button is None:
    sys.exit('Submit button not found; the page layout may have changed.')
button_id = button['id']

data = {
    'javax.faces.partial.ajax':'true',
    'javax.faces.source':button_id,
    'javax.faces.partial.execute':'@all',
    'javax.faces.partial.render': 'form_rcdl:pnl_show form_rcdl:pg_show form_rcdl:rcdl_pnl',
    button_id:button_id,
    'form_rcdl':'form_rcdl',
    'form_rcdl:tf_reg_no1': first,
    'form_rcdl:tf_reg_no2': second,
    'javax.faces.ViewState': viewstate,
}

r = requests.post(url=post_url, data=data, cookies=cookies)
#print (r.text)
soup = BeautifulSoup(r.text, 'html.parser')
table = SoupStrainer('tr')
soup = BeautifulSoup(soup.get_text(), 'html.parser', parse_only=table)
print(soup.get_text())
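
To see the lookup in isolation, here is a small offline sketch of the same idea: pull the ViewState value and the first button id that starts with the form prefix out of a page. It uses only the standard library; FormFieldFinder, find_form_fields and the sample HTML (including the id j_idt46) are invented for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class FormFieldFinder(HTMLParser):
    """Find the JSF ViewState value and the first matching button id."""

    def __init__(self, form_prefix):
        super().__init__()
        self.form_prefix = form_prefix
        self.viewstate = None
        self.button_id = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "javax.faces.ViewState":
            self.viewstate = attrs.get("value")
        elif tag == "button" and self.button_id is None:
            # Keep only the first button whose id starts with the prefix
            element_id = attrs.get("id") or ""
            if element_id.startswith(self.form_prefix):
                self.button_id = element_id


def find_form_fields(html, form_prefix="form_rcdl"):
    """Return (viewstate, button_id); either may be None if not found."""
    finder = FormFieldFinder(form_prefix)
    finder.feed(html)
    finder.close()
    return finder.viewstate, finder.button_id


# Made-up page fragment; real ids and values will differ
sample = """
<form id="form_rcdl">
  <input type="text" name="form_rcdl:tf_reg_no1" />
  <input type="text" name="form_rcdl:tf_reg_no2" />
  <button id="form_rcdl:j_idt46" name="form_rcdl:j_idt46">Check Status</button>
  <button id="form_rcdl:j_idt47">Reset</button>
  <input type="hidden" name="javax.faces.ViewState" value="-123:456" />
</form>
"""
viewstate, button_id = find_form_fields(sample)
print(viewstate, button_id)
```

Checking for None before building the POST data makes a changed page layout fail loudly instead of silently posting a wrong button id.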
