Extract button link text from a website with Python and Selenium

Question

I want to extract the text of some link buttons from the page whose URL is in the code below, but I can't get it to work. After the page opens, I select an option from the "Choose a Product" dropdown. If I choose the first option, "Acrylic Coatings", three types appear: "Primers", "Intermediates", and "Finishes". I want to extract their text, which I'm unable to do.

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome('~/chromedriver.exe')

driver.get('http://www.asianpaintsppg.com/applications/protective_products.aspx')
lst_name = ['Acrylic Coatings','Glass Flake Coatings']

for i in lst_name:
    print(i)
    driver.find_element_by_xpath("//select[@name='txtProduct']/option[text()="+"'"+str(i)+"'"+"]").click()
    page = requests.get("http://www.asianpaintsppg.com/applications/protective_products.aspx")
    soup = BeautifulSoup(page.content, 'html.parser')
    for div in soup.findAll('table', attrs={'id':'dataLstSubCat'}):
      print(div.find('a')['href'])

But I get empty values here. Any help would be appreciated.

I think this will help you: Previously asked similar question
– hitesh kaushik, Commented May 24, 2019 at 6:26

SIM · Accepted Answer · 2019-05-24 06:58:00Z

2

You can get the subcategories without using Selenium. Try POST requests, as shown below.

import requests
from bs4 import BeautifulSoup

url = "http://www.asianpaintsppg.com/applications/protective_products.aspx"

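with requests.Session() as s:
    # Initial GET: fetch the page so the form's current field values can be read
    r = s.get(url)
    soup = BeautifulSoup(r.text,"lxml")
    # Copy every named <input> into the payload ('' when it has no value attribute)
    payload = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
    payload['txtProduct'] = '2'  # value of the dropdown option to select
    # POST the form back, as the browser does when the dropdown changes
    res = s.post(url,data=payload)
    sauce = BeautifulSoup(res.text,"lxml")
    # Subcategory elements have ids starting with 'dataLstSubCat_'
    subcat = [item.text for item in sauce.select("[id^='dataLstSubCat_']")]
    print(subcat)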

Output you may get:

['Primers', 'Intermediates', 'Finishes']
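
To make the payload step more concrete, here is a minimal offline sketch of the same idea. It assumes the page is a typical ASP.NET WebForms form whose hidden fields (such as __VIEWSTATE) have to be sent back with the POST. The names FormFieldParser, build_payload and SAMPLE_HTML are hypothetical, and the HTML fragment and its option values are made up for illustration, not copied from the site. It uses only the standard library so it runs without network access, and it also looks up the dropdown value from the visible product name instead of hardcoding a number.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect <input name=...> values and <option> labels from a page."""

    def __init__(self):
        super().__init__()
        self.inputs = {}    # input name -> value ('' when there is no value)
        self.options = {}   # visible option text -> option value
        self._option_value = None  # value of the <option> being read, if any
        self._option_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "name" in attrs:
            self.inputs[attrs["name"]] = attrs.get("value") or ""
        elif tag == "option":
            self._option_value = attrs.get("value") or ""
            self._option_text = []

    def handle_data(self, data):
        # Only text inside an open <option> is of interest
        if self._option_value is not None:
            self._option_text.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._option_value is not None:
            label = "".join(self._option_text).strip()
            self.options[label] = self._option_value
            self._option_value = None


def build_payload(html, product_name, select_name="txtProduct"):
    """Return the form data to POST in order to select `product_name`."""
    parser = FormFieldParser()
    parser.feed(html)
    payload = dict(parser.inputs)  # hidden fields go back unchanged
    payload[select_name] = parser.options[product_name]
    return payload


# Made-up fragment in the general shape of a WebForms page (an assumption).
SAMPLE_HTML = """
<form method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123">
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789">
  <input type="hidden" name="__EVENTTARGET">
  <select name="txtProduct">
    <option value="0">Choose a Product</option>
    <option value="2">Acrylic Coatings</option>
    <option value="7">Glass Flake Coatings</option>
  </select>
</form>
"""

payload = build_payload(SAMPLE_HTML, "Glass Flake Coatings")
print(payload)
```

Against the real page, the HTML would come from the initial s.get(url) and the resulting dictionary would be passed as data= to s.post(url, ...), as in the answer's code.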

answered May 24, 2019 at 6:58


2 Comments

QHarr Over a year ago

Much better there :-)

Andre_k Over a year ago

@SIM could you please explain how your code works? It would be appreciated. Thanks.

QHarr · Accepted Answer · 2019-05-24 07:51:19Z

1

You want .text, not href, and you also need a wait condition so the page has time to update. Use this selector:

#dataLstSubCat a

Then extract .text in a loop or comprehension:

items = [item.text for item in soup.select('#dataLstSubCat a')]

You can do the whole thing with Selenium. You need one wait condition to ensure the content is present, and an additional wait condition for the text to change after the first iteration. I use time.sleep for the latter, which is suboptimal.

items = [item.text for item in  WebDriverWait(driver,5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#dataLstSubCat a")))]

Additional imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

You could probably do the whole thing with POST requests, and an initial GET, as it looks like the page uses __doPostBack (.aspx) where the value from the dropdown above is used to return the subitems.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

driver = webdriver.Chrome()  # assumes chromedriver can be found on PATH
driver.get('http://www.asianpaintsppg.com/applications/protective_products.aspx')

lst_name = ['Acrylic Coatings','Glass Flake Coatings']

for i in lst_name:
    # Select the dropdown option whose visible text matches the product name
    driver.find_element(By.XPATH, f"//select[@name='txtProduct']/option[text()='{i}']").click()
    # Wait up to 5 seconds for the subcategory links to be present, then read their text
    items = [item.text for item in WebDriverWait(driver,5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#dataLstSubCat a")))]
    print(items)
    time.sleep(2)  # crude pause so the page can update before the next selection
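
The "additional wait condition for the text to change" mentioned above could be sketched as a custom condition, since WebDriverWait.until accepts any callable that takes the driver and returns a truthy value when it is done. This is only a sketch under assumptions: LinkTextsChanged, FakeElement and FakeDriver are hypothetical names, the fake objects exist only so the logic can be exercised without a browser, and the second subcategory list is made up rather than taken from the site.

```python
class LinkTextsChanged:
    """Wait condition: succeed once the matched links' texts differ from `previous`."""

    def __init__(self, css_selector, previous):
        self.css_selector = css_selector
        self.previous = list(previous)

    def __call__(self, driver):
        # "css selector" is the string value of selenium's By.CSS_SELECTOR
        elements = driver.find_elements("css selector", self.css_selector)
        texts = [element.text for element in elements]
        # Truthy (the new texts) only when links exist and they have changed
        return texts if texts and texts != self.previous else False


# Stand-in objects so the condition can be tried without a browser.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, texts):
        self.texts = texts

    def find_elements(self, by, value):
        return [FakeElement(text) for text in self.texts]


old = ["Primers", "Intermediates", "Finishes"]
condition = LinkTextsChanged("#dataLstSubCat a", old)
still_old = condition(FakeDriver(old))                    # page not updated yet
updated = condition(FakeDriver(["Primers", "Finishes"]))  # made-up new content
print(still_old, updated)

# With a real driver, the loop body might look like this (not run here):
#   previous = []
#   for name in lst_name:
#       Select(driver.find_element(By.NAME, "txtProduct")).select_by_visible_text(name)
#       previous = WebDriverWait(
#           driver, 10, ignored_exceptions=[StaleElementReferenceException]
#       ).until(LinkTextsChanged("#dataLstSubCat a", previous))
#       print(name, previous)
```

One caveat with this design: if two products really do share the same subcategories, the condition never succeeds and the wait times out. Waiting on EC.staleness_of for one of the previously located links is an alternative if the page replaces those elements on each postback.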

edited May 24, 2019 at 7:51

answered May 24, 2019 at 6:22


8 Comments

Andre_k Over a year ago

it gives [] / empty list

QHarr Over a year ago

Did you use the wait condition?

Andre_k Over a year ago

No, I have not used any wait condition. How do I implement this in my code?

QHarr Over a year ago

Try that first, as shown above; you may be trying to access the elements before the DOM has been updated.

Andre_k Over a year ago

Your code gives this output: "Acrylic Coatings ['Primers', 'Intermediates', 'Finishes'] Glass Flake Coatings ['Primers', 'Intermediates', 'Finishes']". For "Acrylic Coatings" it's right, but for "Glass Flake Coatings" it's not giving the proper output.


KunduK · Accepted Answer · 2019-05-24 06:59:44Z

0

Use the following code; it prints the text of each subcategory link.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions

driver = webdriver.Chrome()  # assumes chromedriver can be found on PATH
driver.get('http://www.asianpaintsppg.com/applications/protective_products.aspx')
lst_name = ['Acrylic Coatings','Glass Flake Coatings']

for i in lst_name:
    # Select the dropdown option whose visible text matches the product name
    driver.find_element(By.XPATH, f"//select[@name='txtProduct']/option[text()='{i}']").click()
    # Wait up to 10 seconds for the subcategory link buttons to be present
    elements=WebDriverWait(driver, 10).until(expected_conditions.presence_of_all_elements_located((By.XPATH, '//table[@id="dataLstSubCat"]//tr//td//a[starts-with(@id,"dataLstSubCat_LnkBtnSubCat_")]')))
    for ele in elements:
        print(ele.text)

edited May 24, 2019 at 6:59

answered May 24, 2019 at 6:48


4 Comments

Andre_k Over a year ago

This is not the output I'm expecting; please refer to the question.

KunduK Over a year ago

@deepesh: Sorry for that. Just try the updated code.

Andre_k Over a year ago

This gives the output "Primers Intermediates Finishes" for the first item in the list and also "Primers Intermediates Finishes" for the second item in lst_name, which is not actually the case.

KunduK Over a year ago

This is because it is in a loop: if you look at your code, you are selecting a dropdown value on each iteration, and it prints the results each time.
