Selenium Python Text Extraction

Question

I am trying to extract some information from a web page, but I do not know how to target specifically what I want.

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

capabilities = webdriver.DesiredCapabilities().FIREFOX
capabilities["marionette"] = True
binary = FirefoxBinary("C:/PATH/Mozilla Firefox/firefox.exe")
driver = webdriver.Firefox(firefox_binary=binary, capabilities=capabilities, executable_path="geckodriver.exe")
driver.get("https://www.iparkit.com/Minneapolis")
content = driver.page_source

I would like to extract the addresses that are in the side bar. Here is an attempt to obtain the addresses:

address = driver.find_element_by_class_name('sidebar')
address.text


' SORT BY DISTANCE\n SORT BY PRICE\nLooking For A Specific Event?\nBUY\n1\nGateway Garage\n\n400 S 3rd Street\nMinneapolis, MN 55415\n 3 mins | Walk Distance\n (612) 338-2643\n$8.00\nBUY\n2\nGovernment Center Garage\n\n415 South 5th Street\nMinneapolis, MN 55415\n 5 mins | Walk Distance\n (612) 338-2643\n$13.00\nBUY\n3\n517 MARQUETTE\n\n517 MARQUETTE AVE\nMINNEAPOLIS, MN 55402\n 6 mins | Walk Distance\n (612) 746-3045\n$14.00\nBUY\n4\nMidtown Garage\n\n11 South 4th St.\nMinneapolis, MN 55402\n 7 mins | Walk Distance\n (612) 333-3940\n$13.00\nBUY\n5\nCentre Village Garage\n\n700 5th Avenue South\nMinneapolis, MN 55415\n 8 mins | Walk Distance\n (612) 338-2643\n$11.00\nBUY\n6\nGaviidae Commons Garage\n\n61 South 6th Street\nMinneapolis, MN 55402\n 8 mins | Walk Distance\n\n$15.00\nBUY\n7\nMarTen\n\n921 Marquette Avenue\nMinneapolis, MN 55402\n 13 mins | Walk Distance\n (612) 334-3498\n$9.00\nBUY\n8\nLoring Garage\n\n1300 Nicollet Mall\nMinneapolis, MN 55403\n 21 mins | Walk Distance\n (612) 338-2643\n$7.00'

How can I get just the street addresses, like this:

400 S 3rd Street
415 South 5th Street
517 MARQUETTE AVE
...

demouser123 · Accepted Answer · 2018-05-08 18:17:14Z

Why are you using address = driver.find_element_by_class_name('sidebar')? That locator selects the whole sidebar, which is why you are getting so much unwanted text in your output.

The text you want is rendered in a div produced by a repeater (ng-repeat), since this is an Angular page.

<div ng-show=" ! searchInProgress" ng-repeat="result in results track by result.id" ng-click="goToLocation(result)" class="module shade mar-15-bot ng-scope" style="cursor: pointer;">

You could try something like this (I have not verified it against the live page):

get_all_divs = driver.find_elements_by_css_selector('.module.shade.mar-15-bot.ng-scope')

This returns every div generated by the repeater. The text you want is in a p tag inside a div within each of them.

for card in get_all_divs:
    print(card.find_element_by_css_selector('div > p').text)

For each element with those classes, this looks inside it for the first p tag that is a direct child of a div, and prints that p tag's text.
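
The two steps above can be combined into one helper. This is only a sketch under assumptions: extract_addresses and CARD_SELECTOR are names made up here, the selectors are the ones from this answer and have not been checked against the live page, and it uses the generic find_elements(by, value) form with the locator string "css selector" (the value of By.CSS_SELECTOR) so that it does not depend on the find_elements_by_* helpers.

```python
# Selector for one result card, copied from the class attribute of the repeater div.
CARD_SELECTOR = ".module.shade.mar-15-bot.ng-scope"


def extract_addresses(driver, by="css selector"):
    """Collect the text of the first 'div > p' match inside every result card.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    find_elements(by, value) and to return elements offering find_element(by, value).
    """
    addresses = []
    for card in driver.find_elements(by, CARD_SELECTOR):
        paragraph = card.find_element(by, "div > p")
        addresses.append(paragraph.text.strip())
    return addresses


# Hypothetical usage with the driver created in the question:
# for address in extract_addresses(driver):
#     print(address)
```

Returning a list rather than printing inside the loop makes it easier to check the result before doing anything else with it.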

A bit rusty with Python so you might have to make changes to the for loop that I have written.
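
If the selectors do not match the page, a fallback that needs no further locators is to parse the sidebar text the question already has. This is a different technique from the one above, shown only as a sketch: street_addresses and CITY_LINE are hypothetical names, and it assumes every street line is immediately followed by a "City, ST 12345" line, as in the sample output in the question.

```python
import re

# Matches a "City, ST 12345" line such as "Minneapolis, MN 55415".
CITY_LINE = re.compile(r"^.+,\s*[A-Za-z]{2}\s+\d{5}(?:-\d{4})?$")


def street_addresses(sidebar_text):
    """Return the non-empty line that precedes each "City, ST ZIP" line."""
    lines = [line.strip() for line in sidebar_text.splitlines() if line.strip()]
    return [
        lines[i - 1]
        for i in range(1, len(lines))
        if CITY_LINE.match(lines[i])
    ]


# First two entries of the sidebar text shown in the question.
sample = (
    " SORT BY DISTANCE\n SORT BY PRICE\nLooking For A Specific Event?\nBUY\n1\n"
    "Gateway Garage\n\n400 S 3rd Street\nMinneapolis, MN 55415\n"
    " 3 mins | Walk Distance\n (612) 338-2643\n$8.00\nBUY\n2\n"
    "Government Center Garage\n\n415 South 5th Street\nMinneapolis, MN 55415\n"
    " 5 mins | Walk Distance\n (612) 338-2643\n$13.00"
)

print(street_addresses(sample))
# ['400 S 3rd Street', '415 South 5th Street']
```

With the driver from the question, the call would be street_addresses(driver.find_element_by_class_name('sidebar').text).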

edited May 8, 2018 at 18:17

answered May 8, 2018 at 17:05


2 Comments

mm_nieder Over a year ago

Hi @demouser123 thank you for your response, and apologies I'm quite new with selenium. I am running into some errors on get_all_divs = self.driver.find_elements_by_class_name('.module.shade.mar-15-bot.ng-scope'). When I write get_all_divs = driver.find_elements_by_class_name('.module.shade.mar-15-bot.ng-scope') I'm receiving this:

InvalidSelectorException: Message: Given css selector expression "..module.shade.mar-15-bot.ng-scope" is invalid: SyntaxError: '..module.shade.mar-15-bot.ng-scope' is not a valid selector

demouser123 Over a year ago

See the edited answer. Instead of class_name, use css_selector to get the list of elements.
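
The error message shows why the class_name call failed: the selector it reports starts with two dots, so the class-name locator appears to add the leading dot itself, and it expects a single class name rather than a compound selector. A compound selector such as the one in the answer belongs in a CSS selector locator. As a small illustration (css_from_classes is a hypothetical helper, not part of Selenium), the selector can be built from the class attribute of the repeater div:

```python
def css_from_classes(class_attr):
    """Turn a space-separated class attribute into a compound CSS selector."""
    return "".join("." + name for name in class_attr.split())


# Class attribute copied from the repeater div shown in the answer.
selector = css_from_classes("module shade mar-15-bot ng-scope")
print(selector)
# .module.shade.mar-15-bot.ng-scope
```

The resulting string is what gets passed to find_elements_by_css_selector.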
