I am new to web scraping and am using BeautifulSoup and Selenium. I am trying to scrape data from the following webpage:
https://epl.bibliocommons.com/item/show/2300646980
I am trying to scrape the section "Staff Lists that include this Title". In particular, I want to count the <li> tags, as I only need the number of items/links on that staff list.
I have tried the following on the HTML code provided by "Inspect"-ing the page. The following is the block of HTML code I am trying to scrape from:
<div class="ugc_bandage">
<div class="lists_heading clearfix">
<h3 data-test-id="ugc-lists-heading">
Listed
</h3>
<div class="ugc_add_link">
<div class="dropdown saveToButton clearfix" id="save_to_2300646980_id_7a3ateh0panp1uv0he1v7aqmj9" data-test-id="add-to-list-dropdown-container">
<a href="#" aria-expanded="false" aria-haspopup="true" class=" dropdown-toggle dropdown-toggle hide_trigger_icon" data-test-id="add-to-list-save-button" data-toggle="dropdown" id="save_button_2300646980_id_7a3ateh0panp1uv0he1v7aqmj9" rel="nofollow">
<i aria-hidden="true" class=" icon-plus"></i>
<span aria-hidden="true">Add</span><span class="sr-only" data-js="sr-only-dropdown-toggle" data-text-collapsed="Add, collapsed" data-text-expanded="Add, expanded">Add, collapsed</span><span aria-hidden="true" class="icon-arrow"></span></a>
<ul class="dropdown-menu">
<li>
<a href="/user_lists/new?bib=2300646980&origin=https%3A%2F%2Fepl.bibliocommons.com%2Fitem%2Fload_ugc_content%2F2300646980" class="newList">Create a New List</a>
</li>
<li>
<a href="/lists/add_bib/mine?bib=2300646980_fangirl" data-js="cp-overlay" id="more_lists_id_7a3ateh0panp1uv0he1v7aqmj9">Existing Lists »</a>
</li>
</ul>
</div>
</div>
</div>
<h4 data-test-id="staff-lists-that-include-this-title">Staff Lists that include this Title</h4>
<div data-analytics="{ "SubFeature": "Lists that include this title" }" class="expand clearfix" id="all_lists_expand" testid="text_listsincluding">
<ul class="further_list">
<li> [LIST ENTRIES START HERE, BUT THERE ARE SO MANY THAT THEY WOULD MAKE THIS POST TOO LONG.] </li>
- I tried scraping the above code using the XPath copied from inspecting the staff list section (id="all_lists_expand"):
element = driver.find_elements_by_xpath('//*[@id="rightBar"]/div[3]/div')
- I tried scraping the section using the class name:
element = driver.find_element_by_class_name('expand clearfix')
- I also tried scraping using a CSS selector:
element = driver.find_element_by_css_selector('#all_lists_expand')
I have also done other variants of the code above, looking for classes of the element's parents, xpaths, etc.
All of the above attempts return None. I am not sure what I am doing wrong. Am I supposed to trigger an event or something using Selenium? I am not clicking on any of the links in the list, or even keeping a list of the links; I just need to count how many links there are to begin with.
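To make the goal concrete, here is a minimal sketch of the count I am after, run against a static HTML string instead of the live page. It uses only Python's standard-library html.parser so that it runs without a browser. The names ListItemCounter, count_list_items and sample_html are made up for illustration, and the sample markup is a shortened, well-formed stand-in for the real block above, not the actual page content. It does not address whether the section is present in the HTML that Selenium sees.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ListItemCounter(HTMLParser):
    """Counts <li> tags nested inside the element with a given id.

    Assumes well-formed markup (every non-void tag is explicitly closed).
    """

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0   # 0 = outside the container, >0 = nesting depth inside it
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag not in VOID_TAGS:
                self.depth += 1
            if tag == "li":
                self.count += 1
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1  # entered the container element

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1  # reaches 0 again when the container closes


def count_list_items(html, container_id="all_lists_expand"):
    """Return the number of <li> tags inside the element with container_id."""
    parser = ListItemCounter(container_id)
    parser.feed(html)
    return parser.count


# Hypothetical, shortened stand-in for the real markup.
sample_html = """
<ul class="dropdown-menu"><li>Create a New List</li><li>Existing Lists</li></ul>
<div class="expand clearfix" id="all_lists_expand">
  <ul class="further_list">
    <li><a href="#">List A</a></li>
    <li><a href="#">List B</a></li>
    <li><a href="#">List C</a></li>
  </ul>
</div>
"""

print(count_list_items(sample_html))
```

Only the <li> tags inside id="all_lists_expand" are counted; the two entries in the "Add" dropdown menu are ignored. With Selenium, I assume the string passed in would be driver.page_source, taken once the staff list section has actually loaded.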
