I am currently using Selenium to extract the text of a table on a website. Here is the code:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager


# Use Chrome to access the web
browser = webdriver.Chrome(ChromeDriverManager().install())

# Open the website and click the search button
browser.get('https://launchstudio.bluetooth.com/Listings/Search')
browser.find_element(By.ID, 'searchButton').click()

# Poll until the table has been populated with the search results
table_text = browser.find_element(By.CLASS_NAME, 'table').text

while len(table_text) < 80:
    time.sleep(0.5)  # avoid a tight busy loop while the results load
    table_text = browser.find_element(By.CLASS_NAME, 'table').text

print(table_text)

browser.close()

However, I am looking for a way to do the same with Requests/Beautiful Soup or any other library, so that I can schedule it as a task in Windows and store the result in a table at every x interval. I want all of this to happen in the background and then trigger a notification, etc.

What I want is to open this website, click the search button (or trigger the corresponding JavaScript), and then export the table as a DataFrame or similar.

Can you please guide me here?

thanks in advance!!

4 Comments
  • Do you realize that bs4 and requests won't be able to trigger JavaScript? Commented Jan 25, 2021 at 12:58
  • I do, but one thing I have learnt is, there is always a way. Is it possible to do all of this without the chrome window popping up? I mean all this in memory is what I mean. Commented Jan 25, 2021 at 13:07
  • Well, why don't you use the headless mode? Commented Jan 25, 2021 at 13:07
  • @baduker : There is an API which can retrieve all the values so selenium is not necessary here. Commented Jan 25, 2021 at 13:15

1 Answer

If you open the Network tab of the browser's developer tools, you will find the API that the page calls. You can send the same POST request to get all the values. The maxResults field lets you limit the number of results as well.

https://platformapi.bluetooth.com/api/platform/Listings/Search

import requests
import pandas as pd

# Search payload for the POST request (duplicate keys removed)
data = {
    "searchString": "",
    "searchQualificationsAndDesigns": True,
    "searchDeclarationOnly": True,
    "bqaApprovalStatusId": -1,
    "bqaLockStatusId": -1,
    "layers": [],
    "listingDateEarliest": "",
    "listingDateLatest": "",
    "maxResults": 5000,  # upper limit on the number of results returned
    "memberId": "",
    "productTypeId": 0,
    "searchEndProductList": False,
    "searchMyCompany": False,
    "searchPRDProductList": True,
    "specName": 0,
    "userId": 0,
}

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
}

url = "https://platformapi.bluetooth.com/api/platform/Listings/Search"
response = requests.post(url, headers=headers, data=data).json()
df = pd.DataFrame(response)
print(df)
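
Since the script is meant to run unattended, it may help to fail loudly when the endpoint does not answer as expected rather than trying to parse an error page. The following is only a sketch under assumptions: `fetch_listings` and its `post` parameter are hypothetical names, not part of the site's API, and `post` is expected to behave like `requests.post`. Passing the callable in keeps the function easy to exercise without network access.

```python
from typing import Any, Callable, Dict, Optional

SEARCH_URL = "https://platformapi.bluetooth.com/api/platform/Listings/Search"


def fetch_listings(
    post: Callable[..., Any],
    payload: Dict[str, Any],
    max_results: int = 100,
    headers: Optional[Dict[str, str]] = None,
    timeout: float = 30.0,
) -> Any:
    """POST the search payload and return the decoded JSON body.

    `post` is any callable with the same call shape as `requests.post`
    (hypothetical injection point, used here to keep the sketch testable).
    """
    # Copy the payload so the caller's dict is not mutated,
    # and override maxResults to limit the size of the response.
    body = dict(payload, maxResults=max_results)
    response = post(SEARCH_URL, headers=headers, data=body, timeout=timeout)
    # Raise on 4xx/5xx responses instead of decoding an error page.
    response.raise_for_status()
    return response.json()
```

With the names from the answer, a call would look like `fetch_listings(requests.post, data, max_results=50, headers=headers)`, and the return value can be passed to `pd.DataFrame` as before.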

You can export the result to a CSV file.

df.to_csv("testresult.csv")
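
If the script is run at a fixed interval (for example from Windows Task Scheduler), a CSV file gets overwritten on each run. One possible alternative is to append every run to a database table with a timestamp. This is a minimal sketch using only the standard library; the function name `store_snapshot`, the table name `listings_snapshot`, and its two columns are illustrative assumptions. Each record is stored as JSON text because the column layout of the API response is not known here.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Any, Dict, Iterable


def store_snapshot(conn: sqlite3.Connection, records: Iterable[Dict[str, Any]]) -> int:
    """Append one timestamped batch of records; return the number of rows stored."""
    # Create the (hypothetical) snapshot table on first use.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings_snapshot ("
        "fetched_at TEXT NOT NULL, record_json TEXT NOT NULL)"
    )
    # One UTC timestamp per run, shared by every row of the batch.
    fetched_at = datetime.now(timezone.utc).isoformat()
    rows = [(fetched_at, json.dumps(record, sort_keys=True)) for record in records]
    conn.executemany(
        "INSERT INTO listings_snapshot (fetched_at, record_json) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Assuming `df` is the DataFrame built above, the records could be passed as `store_snapshot(sqlite3.connect("listings.db"), df.to_dict("records"))`, where `listings.db` is a placeholder file name.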

1 Comment

The platformapi.bluetooth.com seems to be broken. Does anyone know what happened to it?
