I am using Python/Selenium to extract some text from a website to further sort it in Google Sheets.

There are 15 headers whose text I need to extract. Each header's text is inside an h5 tag.

Here is the HTML for one of the headers:

<tr class="dayHeader">
  <td colspan="7" style="padding:10px 0;">
    <hr>
    <h5>&nbsp;&nbsp;Thursday - 28 January 2021</h5>
  </td>
</tr>

What I have done is the following:

headers = driver.find_elements_by_tag_name('h5')
results = []

for header in headers:
    result = header.text
    results.append(result)

The for loop above outputs the following list:

['Result 1']
['Result 1', 'Result 2']
['Result 1', 'Result 2', 'Result 3']

Instead, how can I get it to output:

['Result 1', 'Result 2', 'Result 3']
  • The above for loop shouldn't be outputting anything... did you miss a print statement when you copied your code over? Commented Jan 29, 2021 at 7:13

1 Answer

The problem is the indentation of your print call. If print is outside the loop:

headers = driver.find_elements_by_tag_name('h5')
results = []

for header in headers:
    result = header.text
    results.append(result)

print(results)

It prints only once, after the loop has finished:

['Result 1', 'Result 2', 'Result 3']

If print is instead indented inside the loop:

headers = driver.find_elements_by_tag_name('h5')
results = []

for header in headers:
    result = header.text
    results.append(result)
    print(results)

It prints on every iteration, showing the list as it grows:

['Result 1']
['Result 1', 'Result 2']
['Result 1', 'Result 2', 'Result 3']
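
To try the difference without a browser, here is a minimal sketch that swaps the Selenium elements for a plain list of placeholder strings. The names `header_texts`, `collect`, and `print_each_step` are made up for this illustration and are not part of Selenium or of the original code.

```python
# Placeholder strings standing in for the .text of each h5 element (assumed values).
header_texts = ['Result 1', 'Result 2', 'Result 3']


def collect(texts, print_each_step=False):
    """Append each text to a list, optionally printing inside the loop."""
    results = []
    for text in texts:
        results.append(text)
        if print_each_step:
            # Indented inside the loop: runs once per iteration.
            print(results)
    return results


results = collect(header_texts)
# Outside the loop: runs once, after the list is complete.
print(results)

# The same list can be built with a list comprehension.
results_short = [text for text in header_texts]
```

Calling `collect(header_texts, print_each_step=True)` reproduces the three growing lists from the question, while the final list returned is the same either way.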

1 Comment

Of course! How could I not see this. Thanks for your help!