Getting the value of a script's "var" using Selenium with Python

Question

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("url_goes_here")

p_id = driver.find_elements_by_tag_name("script")

This gets me the script I need. I don't need to execute it, since it has already run on the initial page load. It contains a variable named "task". How do I access its value with Selenium?

0buz · Accepted Answer · 2020-05-06 17:29:16Z

1

The regex module re can help you with that:

import re
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("url_goes_here")

p_id = driver.find_elements_by_tag_name("script")

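for script in p_id:
    # Read the raw JavaScript source held inside the <script> element
    inner_html = script.get_property('innerHTML')
    # Capture everything between "var task = " and the last ";" on that line
    task = re.search(r'var task = (.*);', inner_html)
    if task is not None:
        print(task.group(1))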

This looks through the innerHTML of each script element and searches it for the pattern 'var task = (.*);'. The capture group (.*) holds whatever is assigned to task, and it is printed whenever a match is found.
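A minimal sketch of the same idea run against a hard-coded string, so the pattern can be checked without a browser. The sample script text and the helper name extract_var are made up for illustration. The non-greedy (.*?) and the json.loads step are assumptions: they only hold if the value sits on one line, contains no semicolon of its own, and is a JSON-compatible literal.

```python
import json
import re

# Hypothetical script source, standing in for script.get_property('innerHTML').
SAMPLE_INNER_HTML = """
var other = 1;
var task = {"id": 42, "title": "demo"};
var last = "x";
"""


def extract_var(source, name):
    """Return the raw text assigned to `var <name>`, or None if absent."""
    # Non-greedy group: stop at the first semicolon after the assignment.
    pattern = r"var\s+" + re.escape(name) + r"\s*=\s*(.*?);"
    match = re.search(pattern, source)
    return match.group(1) if match else None


raw = extract_var(SAMPLE_INNER_HTML, "task")
print(raw)

# Only valid when the assigned value happens to be JSON-compatible.
task = json.loads(raw)
print(task["id"])
```

If the value is not valid JSON (single quotes, trailing commas, function calls), skip json.loads and work with the raw string instead.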

answered May 6, 2020 at 17:29

0buz

3 Comments

Alexander Over a year ago

First off, you're the only person in this thread so far who seems to have gotten what I'm actually asking, namely to extract the inner contents of the appropriate <script> tag which I've already scraped, and go from there. So thanks! Now, as for this: script.get_property('innerHTML') It returns an empty string! :(

0buz Over a year ago

Strange... If you use .get_attribute instead of .get_property, do you still get an empty string?

Alexander Over a year ago

I never got around to trying .get_attribute, but I eventually solved my problem by extracting the appropriate element via .find_elements_by_xpath, rather than by tag name as depicted in the original question, and THEN using your .get_property('innerHTML') suggestion. My issue is solved now, and since you were the only one who at least put me on the right path to its solution, I will mark your answer as correct! Thank you!!! :)

rocknrold · Accepted Answer · 2020-05-06 03:51:38Z

0

You can access the value of a tag or any HTML element via .text or .getText()

answered May 6, 2020 at 3:51

rocknrold

1 Comment

Alexander Over a year ago

I was aware of these methods, but the first one returns an empty string to me, no matter what I try it on. Whereas the second gives: AttributeError: 'WebElement' object has no attribute 'getText'
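For context on the empty string: .text returns only the text that is rendered on the page, and the contents of a <script> element are never rendered. A sketch of reading the source through the textContent DOM property instead. The helper name and the marker string are hypothetical, and the elements are assumed to be the WebElement objects returned by the find_elements call in the question.

```python
def script_sources_containing(elements, marker):
    """Return the source of every element whose textContent contains marker."""
    sources = []
    for element in elements:
        # textContent is a DOM property; unlike .text it does not depend on
        # the element being visible. Treat a missing value as an empty string.
        source = element.get_attribute("textContent") or ""
        if marker in source:
            sources.append(source)
    return sources
```

With the question's list this would be called as script_sources_containing(p_id, "var task").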

E_net4 · Accepted Answer · 2020-05-11 10:51:46Z

0

Since you are using find_elements_by_tag_name(), which returns a list of elements, iterate over that list, check whether element.text contains task, and print the text of that element.

p_id = driver.find_elements_by_tag_name("script")
for script in p_id:
    if 'task' in script.text:
        print(script.text)

edited May 11, 2020 at 10:51

E_net4

answered May 6, 2020 at 9:15

KunduK

4 Comments

Alexander Over a year ago

Since you are using find_elements_by_tag_name() which returns list of elements.. Any better suggestion for scraping for scripts specifically? Would searching by... let's say, XPath, produce a better formatted output or something? Second, this is the actual content of an element of the output list that gets produced by the execution of the find_elements_by_tag_name():

<selenium.webdriver.remote.webelement.WebElement (session="53d2976784532dd4717abff68170b22a", element="8d9f432d-7a2e-47e0-8023-d6c092ee9620")>

As you can see, it's not very legible.

KunduK Over a year ago

@Alexander: I am extremely sorry if I haven't understood your requirement. I believe you are searching for the script tag that contains the text task, is that right?

Alexander Over a year ago

The script tag is already FOUND and stored in a list. The question is how to extract the value of its variable "task" from the raw data stored in the list.

KunduK Over a year ago

Well, for that you need to post the script's details. A regular expression will definitely work to extract the value.
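If the goal is the variable's current value rather than its source text, one alternative is to ask the browser for it. This is a sketch, assuming the page declares task with var at the top level of a classic (non-module) script, in which case it becomes a property of window. read_global is a hypothetical helper name, and driver is assumed to be the WebDriver from the question.

```python
def read_global(driver, name):
    """Return the page's global variable `name`, or None if it is undefined."""
    # arguments[0] is the name passed in from Python. An undefined value
    # comes back to Python as None.
    return driver.execute_script("return window[arguments[0]];", name)
```

After driver.get(...), a call such as read_global(driver, "task") returns JSON-like values as ordinary Python objects (dict, list, str, numbers). It will not see variables declared inside a function or with let/const.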

Akzy · Accepted Answer · 2023-01-27 04:46:39Z

0

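# Use XPath instead:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("url_goes_here")

# Replace "ADDXPATH" with an XPath that selects the target <script> element
p_id = driver.find_element(By.XPATH, "ADDXPATH")
print(p_id.get_attribute('outerHTML'))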
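One possible XPath for the "ADDXPATH" placeholder, as a sketch: select the <script> element whose text contains the declaration. The expression and the helper name are assumptions, not part of the answer. The locator strategy is passed as the plain string "xpath", which is the value of By.XPATH, so the snippet does not need the selenium import to be defined.

```python
# Hypothetical XPath: the first <script> whose text contains "var task".
TASK_SCRIPT_XPATH = "//script[contains(., 'var task')]"


def task_script_source(driver, xpath=TASK_SCRIPT_XPATH):
    """Return the innerHTML of the first <script> element matched by xpath."""
    # "xpath" is the string value of selenium's By.XPATH constant.
    element = driver.find_element("xpath", xpath)
    return element.get_attribute("innerHTML")
```

find_element raises NoSuchElementException when nothing matches, so the call may need a try/except if the script is not guaranteed to be on the page.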

edited Jan 27, 2023 at 4:46

Akzy

answered Jan 25, 2023 at 16:23

Nasir
