selenium scraping returns empty string after first few elements

Question

I am scraping a website using Selenium in Python. The XPath finds the 20 elements that contain the search results. However, text is available only for the first 6 elements; the rest return empty strings. This is true for every page of the results.

The xpath used:

results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")

The XPath matches 20 elements in Chrome.

Text of the results:

[tt.text for tt in results]

Anonymized output:

['Abcddwedwada',
 'Asefdasdfaca',
 'Asdaafcascac',
 'Asdadaacjkhi',
 'Sfskjfbsfvbkd',
 'Fjsbfksjnsvas',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '']

I have also tried extracting the ids of the 20 elements and looking each one up with driver.find_element_by_id, but I still get empty strings after the first 6 elements.

Could you share a link to the page?

– Andersson, Commented Mar 3, 2017 at 6:51
linkedin.com/search/results/people/…

– mrbot, Commented Mar 3, 2017 at 7:48

Chanda Korat · Accepted Answer · 2017-03-03 10:02:19Z

1

Try this:

[str(tt.text) for tt in results if str(tt.text) != '']

or

[tt.text for tt in results if len(tt.text) > 0]
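As a small illustration, here are both expressions applied to a plain list of strings. The values are placeholders shaped like the anonymized output in the question, not real data. Both forms keep only the non-empty entries; neither recovers text that was never loaded.

```python
# Placeholder strings standing in for the .text values from the question.
texts = ['Abcddwedwada', 'Asefdasdfaca', '', '', '']

# First form: compare each string against the empty string.
non_empty_a = [str(t) for t in texts if str(t) != '']

# Second form: keep strings whose length is greater than zero.
non_empty_b = [t for t in texts if len(t) > 0]
```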

edited Mar 3, 2017 at 10:02

answered Mar 3, 2017 at 6:52


3 Comments

mrbot Over a year ago

This filters out the results with empty strings

Chanda Korat Over a year ago

@mrbot What is the type of the empty string ''? unicode or string?

mrbot Over a year ago

The type of the empty string is str.

Andersson · Accepted Answer · 2017-03-03 12:02:06Z

1

I assume the reason is the following: when you open the page there are 20 entries (<li> elements in a <ul>), but the content of only 6 of them is displayed. The content of the other elements may be displayed only after scrolling down, because the content of those 14 entries is generated dynamically from XHR requests.

So you might need to scroll down to the last element in the list:

from selenium.webdriver.support.ui import WebDriverWait as wait

# Wait up to 10 seconds for 20 result entries that pass the not(text()='') check
wait(driver, 10).until(lambda x: len(x.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view') and not(text()='')]")) == 20)
results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")
# Accessing this property scrolls the last result into view
results[-1].location_once_scrolled_into_view
[tt.text for tt in results]
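A variation on the same idea, offered only as a rough sketch: scroll each entry into view individually and poll its text for a short time, instead of scrolling once to the last entry. The helper name collect_texts and the timeout and poll values are hypothetical. The sketch assumes the page fills in an entry once it enters the viewport, which has not been confirmed for this site.

```python
import time


def collect_texts(driver, elements, timeout=5.0, poll=0.25):
    """Scroll each element into view and wait briefly for its text to appear.

    `driver` is expected to be a Selenium WebDriver (only execute_script is
    used) and `elements` the list returned by a find_elements call.
    """
    texts = []
    for element in elements:
        # Bring the element into the viewport so the page can render its content.
        driver.execute_script("arguments[0].scrollIntoView(true);", element)
        deadline = time.monotonic() + timeout
        text = element.text
        # Poll until the element has text or the timeout expires.
        while not text and time.monotonic() < deadline:
            time.sleep(poll)
            text = element.text
        texts.append(text)
    return texts
```

With the variables above it would be called as collect_texts(driver, results). Entries that never receive text within the timeout are returned as empty strings, so the output list stays aligned with the input list.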

Try it and let me know the results.

edited Mar 3, 2017 at 12:02

answered Mar 3, 2017 at 8:48


5 Comments

mrbot Over a year ago

It didn't work. I thought of that and tried: driver.execute_script("window.scrollTo(0, Y);")

mrbot Over a year ago

Could this have anything to do with using pyvirtualdisplay?

mrbot Over a year ago

All 20 elements return True for is_displayed().

Andersson Over a year ago

Try the code from the updated answer and let me know if it still doesn't work as expected.

mrbot Over a year ago

The new wait statement returned True, and the results still have empty strings after the 6th result.
