0

My code opens a website and scrapes each row's title and time.

However, there is one tag (p) that I am not sure how to get with the find_element_by_* methods.

On the website, it holds the information shown under each title.

Here is my code so far:

import time

from selenium import webdriver
from bs4 import BeautifulSoup
import requests

driver = webdriver.Chrome()
driver.get('https://www.nutritioncare.org/ASPEN21Schedule/#tab03_19')
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
eachRow = driver.find_elements_by_class_name('timeline__item')
time.sleep(1)
for item in eachRow:
    time.sleep(1)
    title = item.find_element_by_class_name('timeline__item-title')
    tim = item.find_element_by_class_name('timeline__item-time')
    tex = item.find_element_by_tag_name('p') # This is the part I don’t know how to scrape
    print(title.text, tim.text, tex.text)

3 Answers

1

I checked the page, and each row contains several p tags. I suggest using find_elements_by_tag_name instead of find_element_by_tag_name, so that you get every p tag in the row (including the one you want). Then iterate over those elements, join their text, and call strip() on the result.

import time

from selenium import webdriver

driver = webdriver.Chrome()

driver.get('https://www.nutritioncare.org/ASPEN21Schedule/#tab03_19')
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
eachRow = driver.find_elements_by_class_name('timeline__item')
time.sleep(1)
for item in eachRow:
    time.sleep(1)
    title = item.find_element_by_class_name('timeline__item-title')
    tim = item.find_element_by_class_name('timeline__item-time')
    # find_elements_* (plural) returns a list of every <p> inside this row
    tex = item.find_elements_by_tag_name('p')
    # Join the text of all <p> tags into a single string
    text = " ".join([i.text for i in tex]).strip()
    print(title.text, tim.text, text)

1

Since the webpage has several p tags, it would be better to use find_elements_by_tag_name('p'), as in the answer above, and print each non-empty paragraph on its own line. Replace the print call in that code with the following:

    print(title.text, tim.text)
    for t in tex:
        # Skip <p> elements that have no visible text
        if t.text:
            print(t.text)

1 Comment

From a comment: "find_element_by_* and find_elements_by_* are removed in Selenium 4.3.0. Use find_element instead." That does not really answer the question of what to do when the number of matching elements is not exactly one, though. There may be a canonical Stack Overflow answer somewhere.
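For Selenium 4.3.0 and later, the same idea can be sketched with find_elements, which returns a list (empty when nothing matches) instead of raising, so zero, one, or many p tags are all handled the same way. This is only a rough sketch, assuming the class names from the question still match the page. The names paragraph_text and scrape_schedule are hypothetical helpers, not part of Selenium, and the literal "tag name" is the string value of By.TAG_NAME, used directly so the helper needs no import.

```python
import time


def paragraph_text(item):
    """Join the text of every non-empty <p> inside a Selenium WebElement.

    find_elements returns a list, which is empty when nothing matches,
    so this works for zero, one, or many <p> tags.
    """
    # "tag name" is the string value of selenium's By.TAG_NAME.
    paragraphs = item.find_elements("tag name", "p")
    texts = (p.text.strip() for p in paragraphs)
    return " ".join(t for t in texts if t)


def scrape_schedule(url):
    """Yield (title, time, description) per timeline row. Needs Chrome."""
    # Imported here so paragraph_text can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(1)  # crude wait for the rows to render, as in the question
        for item in driver.find_elements(By.CLASS_NAME, "timeline__item"):
            title = item.find_element(By.CLASS_NAME, "timeline__item-title").text
            when = item.find_element(By.CLASS_NAME, "timeline__item-time").text
            yield title, when, paragraph_text(item)
    finally:
        driver.quit()
```

Looping over scrape_schedule('https://www.nutritioncare.org/ASPEN21Schedule/#tab03_19') and printing each tuple would then mirror the output of the original script.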
0

Maybe try using a different find_elements_by_class... call. I don't use Python that much, but try this if you haven't already.

5 Comments

The p tag does not have a class name, unfortunately.
What does 'p' represent?
Paragraph. I'm not sure whether it counts as a tag, a CSS selector, etc.
I don't know then, because tag name should work. If it doesn't, I guess I can't help, sorry.
Unless XPath works: //p[text() = 'JBL']
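
If XPath is used, one detail matters when searching inside a row: an expression starting with // searches the whole document even when called on an element, while a leading dot (.//p) keeps the search within that element. A minimal sketch under that assumption follows. paragraphs_by_xpath is a hypothetical helper name, "xpath" is the string value of By.XPATH, and the normalize-space() predicate is just one way to skip empty p tags.

```python
def paragraphs_by_xpath(item):
    """Return the text of each non-empty <p> that is a descendant of item."""
    # "xpath" is the string value of selenium's By.XPATH.
    # The leading "." scopes the search to this element; "//p" alone would
    # match every <p> in the document.
    found = item.find_elements("xpath", ".//p[normalize-space()]")
    return [p.text for p in found]
```

Inside the loop from the question, this would be called as paragraphs_by_xpath(item), where item is one timeline__item row.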
