2

I'm new to Python and web scraping, so I'm not sure what the value in between an element's <div> tags is called. Sorry for not being able to be more specific.

<div class="syllable">value</div>

Is there a way to assign the value in between the <div> tags to a string variable in Python, using Selenium with XPath? For example, the "value" in the element would be stored as a string, and printing it would give:

value

I'm new to Python and Selenium, so I can't figure it out.

3 Answers

3

To print out the text of the element:

elem=driver.find_element_by_class_name("syllable")
print(elem.text)

xpath:

# Selenium locators must select an element, not a text() node, so locate the div and read its .text
elem=driver.find_element_by_xpath("//div[@class='syllable']")
print(elem.text)
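
On newer Selenium releases the `find_element_by_*` helpers used above are no longer available (as far as I know they were removed in Selenium 4.3), so the same lookup goes through `find_element` with a locator strategy. Here is a minimal sketch, assuming an already-created `driver` and the page from the question. `syllable_text` is a hypothetical helper name, not part of Selenium.

```python
def syllable_text(driver, class_name="syllable"):
    """Return the rendered text of the first <div> with the given class.

    `driver` is assumed to be a Selenium WebDriver; any object exposing
    find_element(by, value) behaves the same way here.
    """
    # "xpath" is the string value of selenium.webdriver.common.by.By.XPATH;
    # pass By.XPATH instead if you already import By.
    xpath = f"//div[@class='{class_name}']"
    # The XPath selects the element itself; .text then gives its visible text.
    return driver.find_element("xpath", xpath).text
```

With a live session this would be called as `print(syllable_text(driver))`.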

2

In HTML it is called the element's innerText.

You can retrieve this value in Selenium using the element's text property, or with get_attribute.

This returns the rendered text (that is, the text actually displayed on the page):

elem=driver.find_element_by_class_name("syllable")
print(elem.text)

This returns the text without checking the style attribute, meaning it returns the value even if it is not displayed in the UI:

elem=driver.find_element_by_class_name("syllable")
print(elem.get_attribute("textContent"))

You can also find the element by its text:

# partial match
elem=driver.find_element_by_xpath("//div[contains(text(),'value')]")
print(elem.text)

# exact match
elem=driver.find_element_by_xpath("//div[text()='value']")
print(elem.text)

# exact match on the element's full string value (its own text plus the text of any child such as a span); extra text in a child means the element is not returned
elem=driver.find_element_by_xpath("//div[.='value']")
print(elem.text)
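
The difference between matching on `text()` and matching on `.` is easier to see on a small document. The sketch below is an offline approximation using the standard library's `xml.etree.ElementTree`, not Selenium: ElementTree supports only a subset of XPath (it has no `text()`), so the `text()` case is emulated by hand. The three-div page and the `direct_text_nodes` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: three divs that all contain the word "value".
PAGE = """
<body>
  <div id="plain">value</div>
  <div id="nested"><span>value</span></div>
  <div id="extra">value<span> more</span></div>
</body>
"""
root = ET.fromstring(PAGE)


def direct_text_nodes(elem):
    """Text nodes that are direct children of elem (what XPath text() selects)."""
    nodes = [elem.text] + [child.tail for child in elem]
    return [t for t in nodes if t]


# Stand-in for //div[text()='value']: some direct text node equals 'value'.
text_eq = [d.get("id") for d in root.iter("div") if "value" in direct_text_nodes(d)]

# //div[.='value']: the full string value (all descendant text) equals 'value'.
dot_eq = [d.get("id") for d in root.findall(".//div[.='value']")]

print(text_eq)  # ['plain', 'extra']
print(dot_eq)   # ['plain', 'nested']
```

So `text()='value'` ignores text held by child elements, while `.='value'` compares everything inside the div as one string.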

Also note:

It is also worth reading about outerHTML and innerHTML.
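
As a rough illustration of how textContent, innerHTML and outerHTML differ, here is an offline sketch using the standard library's `xml.etree.ElementTree` rather than a browser. The nested `<b>` tag is an assumption added so that the three values differ, and a real browser's serialization may not match this output exactly.

```python
import xml.etree.ElementTree as ET

# Same div as in the question, with a hypothetical nested <b> tag.
div = ET.fromstring('<div class="syllable">va<b>lue</b></div>')

# textContent: every descendant text node, concatenated, with tags dropped.
text_content = "".join(div.itertext())

# innerHTML: the markup inside the element (leading text plus serialized children).
inner_html = (div.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in div
)

# outerHTML: the element itself plus everything inside it.
outer_html = ET.tostring(div, encoding="unicode")

print(text_content)  # value
print(inner_html)    # va<b>lue</b>
print(outer_html)    # <div class="syllable">va<b>lue</b></div>
```

In Selenium the corresponding values come from `elem.get_attribute("textContent")`, `elem.get_attribute("innerHTML")` and `elem.get_attribute("outerHTML")`.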

2

To print the text value you can use any of the following Locator Strategies:

  • Using class_name and get_attribute("textContent"):

    print(driver.find_element_by_class_name("syllable").get_attribute("textContent"))
    
  • Using css_selector and get_attribute("innerHTML"):

    print(driver.find_element_by_css_selector("div.syllable").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//div[@class='syllable']").text)
    

Ideally, you should induce WebDriverWait for visibility_of_element_located(), and you can use any of the following Locator Strategies:

  • Using CLASS_NAME and get_attribute("textContent"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CLASS_NAME, "syllable"))).get_attribute("textContent"))
    
  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.syllable"))).text)
    
  • Using XPATH and get_attribute():

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='syllable']"))).get_attribute("innerHTML"))
    
  • Console Output:

    value
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

