
I am trying to take the text of three sets of Selenium elements scraped from a website and put it in a table format, so that I have an array of 3 columns by 25 rows.

Example of desired end output:

Country | Date | Election Type
Zambia | August 12, 2021 (All day) | General

Current code below:

# Installing selenium in Jupyter notebook
!pip install selenium

#checking my file path since we'll need to make sure our webdriver is in the same path
import os
import sys
os.path.dirname(sys.executable)

#Opens webbrowser chrome
from selenium import webdriver

#Manually directing the webdriver path and specifying the Chrome browser
browser = webdriver.Chrome('/Users/peterschoffelen/Documents/Copulus/chromedriver')
type(browser)

#Pulls up the webpage for the elections calendar
browser.get('https://www.ndi.org/elections-calendar')

#Clicking load more button

#tools needed
import time
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

#Finding the link
linkElem = browser.find_element(By.LINK_TEXT, 'LOAD MORE')
type(linkElem)

#Clicks the load more button until it can no longer be found, to make sure all available links are shown
while True:
    try:
        #Re-find the button on every pass, since the earlier reference can go stale once the page updates
        loadMoreButton = browser.find_element(By.LINK_TEXT, 'LOAD MORE')
        time.sleep(2)
        loadMoreButton.click()
        time.sleep(5)
    except Exception as e:
        print(e)
        break
print("Complete")
time.sleep(10)

#creating a list of elements for each of the fields I am interested in
election_date = browser.find_elements(By.CSS_SELECTOR, 'span.date-display-single')
election_country = browser.find_elements(By.CSS_SELECTOR, 'div.election-title')
election_type = browser.find_elements(By.CSS_SELECTOR, 'div.election-type')

#checking the type for the created objects and looking at the first element in each object
print(type(election_date))
print(type(election_date[0]))
print(election_date[0])

#The checks above showed us that we need to specifically extract the text from the elements we scraped
print(election_date[0].text)
print(election_country[0].text)
print(election_type[0].text)

#With this number being greater than 10, we can be confident we have gotten all the elections available
print(len(election_date))

#Collecting the text of every element into one list

elections = []

for d, t, c in zip(election_date, election_type, election_country):
    date_text = d.text
    type_text = t.text
    country_text = c.text
    elections.append(country_text)
    elections.append(type_text)
    elections.append(date_text)

#Show us the whole list we have now created, which was meant to be a table of three columns
print(elections)

#Show us the first three entries
print(elections[0:3])

import numpy as np

democracy = np.array(elections)

#Shows that I have just created a single-column, 75-row array, which is not what I want :(
print(democracy.shape)

Thanks in advance for any help!

  • going to test it now, but pretty sure you want to say elections.append([country_text, type_text, date_text]) Commented Jul 19, 2021 at 18:03
  • Since asking my question I have made some progress:
    ```
    date_text = []
    for date in election_date:
        text = date.text
        date_text.append(text)
    print(date_text)

    type_text = []
    for t in election_type:
        text = t.text
        type_text.append(text)
    print(type_text)

    country_text = []
    for c in election_country:
        text = c.text
        country_text.append(text)
    print(country_text)

    #x=[]
    #for i in document:
    #    x.append(i.text)

    election_table = np.vstack([date_text, type_text, country_text])
    print(election_table)
    print(election_table.shape)
    ```
    Commented Jul 19, 2021 at 18:08
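A note on the `np.vstack` attempt in the comment above: stacking three 1-D lists with `np.vstack` makes each list a row, so the result has one row per field, not one row per election. A minimal sketch, assuming plain strings as stand-ins for the scraped text (the two sample elections and the names `stacked`, `transposed`, and `by_columns` are made up for illustration):

```python
import numpy as np

# Stand-ins for the text extracted from the scraped elements (two elections)
date_text = ['AUGUST 12, 2021 (ALL DAY)', 'SEPTEMBER 8, 2021 (ALL DAY)']
type_text = ['GENERAL', 'GENERAL']
country_text = ['ZAMBIA', 'MOROCCO']

# vstack treats each list as one row: 3 rows (fields) by 2 columns (elections)
stacked = np.vstack([date_text, type_text, country_text])
print(stacked.shape)  # (3, 2)

# Transposing swaps the axes: one row per election, one column per field
transposed = stacked.T
print(transposed.shape)  # (2, 3)

# column_stack builds the same layout directly, treating each list as a column
by_columns = np.column_stack([date_text, type_text, country_text])
print(by_columns.shape)  # (2, 3)
```

With 25 scraped elections the same calls would give (3, 25) from `np.vstack` and (25, 3) after transposing or with `np.column_stack`.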

1 Answer


If I replace your 3 append lines with elections.append([country_text, type_text, date_text]), I get a 25 by 3 array, which I think is what you wanted. The output looks like:

[['ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)'], ['MOROCCO', 'GENERAL', 'SEPTEMBER 8, 2021 (ALL DAY)'], ['RUSSIA', 'LEGISLATIVE', 'SEPTEMBER 19, 2021 (ALL DAY)'], ['HAITI', 'PRESIDENTIAL AND PARLIAMENTARY', 'SEPTEMBER 26, 2021 (ALL DAY)'], ['SOMALIA', 'PRESIDENTIAL', 'OCTOBER 10, 2021 (ALL DAY)'], ['IRAQ', 'PARLIAMENTARY', 'OCTOBER 10, 2021 (ALL DAY)'], ['BULGARIA', 'PRESIDENTIAL', 'OCTOBER 20, 2021 (ALL DAY)'], ['CHAD', 'LEGISLATIVE', 'OCTOBER 24, 2021 (ALL DAY)'], ['UZBEKISTAN', 'GENERAL', 'OCTOBER 24, 2021 (ALL DAY)'], ['CAPE VERDE', 'PRESIDENTIAL', 'OCTOBER 2021'], ['MALI', 'REFERENDUM', 'OCTOBER 31, 2021 (ALL DAY)'], ['NICARAGUA', 'GENERAL', 'NOVEMBER 7, 2021 (ALL DAY)'], ['ARGENTINA', 'LEGISLATIVE', 'NOVEMBER 14, 2021 (ALL DAY)'], ['VENEZUELA', 'MUNICIPAL, REGIONAL', 'NOVEMBER 21, 2021 (ALL DAY)'], ['CHILE', 'GENERAL', 'NOVEMBER 21, 2021 (ALL DAY)'], ['HONDURAS', 'GENERAL', 'NOVEMBER 28, 2021 (ALL DAY)'], ['THE GAMBIA', 'PRESIDENTIAL', 'DECEMBER 4, 2021 (ALL DAY)'], ['HONG KONG', 'LEGISLATIVE', 'DECEMBER 19, 2021 (ALL DAY)'], ['COSTA RICA', 'GENERAL', 'FEBRUARY 6, 2022 (ALL DAY)'], ['HONG KONG', 'EXECUTIVE', 'MARCH 27, 2022 (ALL DAY)'], ['COLOMBIA', 'PRESIDENTIAL', 'MAY 29, 2022 (ALL DAY)'], ['INDIA', 'PRESIDENTIAL', 'JULY 2022'], ['KENYA', 'GENERAL', 'AUGUST 9, 2022 (ALL DAY)'], ['BRAZIL', 'GENERAL', 'OCTOBER 2, 2022 (ALL DAY)'], ['NIGERIA', 'GENERAL', 'FEBRUARY 18, 2023 (ALL DAY)']]

The first 3 items are: [['ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)'], ['MOROCCO', 'GENERAL', 'SEPTEMBER 8, 2021 (ALL DAY)'], ['RUSSIA', 'LEGISLATIVE', 'SEPTEMBER 19, 2021 (ALL DAY)']] and the dimensions are: (25, 3)
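The difference can be reproduced without a browser. The following is a minimal sketch, assuming plain strings in place of the scraped `.text` values (the three sample rows come from the output above; the names `countries`, `types`, `dates`, `rows`, `flat`, and `reshaped` are hypothetical). It also shows that a flat list like the one in the question could be reshaped after the fact, assuming its length is a multiple of 3.

```python
import numpy as np

# Stand-ins for the scraped .text values (first three rows of the output above)
countries = ['ZAMBIA', 'MOROCCO', 'RUSSIA']
types = ['GENERAL', 'GENERAL', 'LEGISLATIVE']
dates = ['AUGUST 12, 2021 (ALL DAY)', 'SEPTEMBER 8, 2021 (ALL DAY)',
         'SEPTEMBER 19, 2021 (ALL DAY)']

# Appending one inner list per election gives one row per election
rows = []
for c, t, d in zip(countries, types, dates):
    rows.append([c, t, d])
table = np.array(rows)
print(table.shape)  # (3, 3)

# Three separate appends per election produce a flat 1-D list instead
flat = []
for c, t, d in zip(countries, types, dates):
    flat.append(c)
    flat.append(t)
    flat.append(d)
print(np.array(flat).shape)  # (9,)

# The flat list can still be reshaped into rows of 3 columns
reshaped = np.array(flat).reshape(-1, 3)
print(reshaped.shape)  # (3, 3)
```

Building the rows up front is usually easier to read, but `reshape(-1, 3)` is handy when the flat list already exists.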


5 Comments

Awesome! This did the job I was hoping for. Can you help me understand why adding the brackets allows this operation to work, whereas elections.append(country_text, type_text, date_text) did not work for me?
elections.append(text1, text2, text3) does not work at all: list.append takes exactly one argument, so Python raises a TypeError. Your original three separate append calls added each of those 3 items to a 1-D list. A 2-D list/array is a list/array of lists/arrays, so appending [text1, text2, text3] adds another row (of 3 columns).
@PeterSchoffelen if this resolved your question - why didn't you accept the answer?
@Prophet my mistake, I am new to asking questions on Stack Exchange.
I understand that, it's OK, mate.
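On the `append` question raised in the comments, the behavior can be checked in a few lines. This is a minimal sketch using sample strings from the answer's output; `error_message` is a hypothetical name introduced only to capture the exception text.

```python
elections = []
error_message = None

# list.append accepts exactly one argument, so passing three raises TypeError
try:
    elections.append('ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)')
except TypeError as err:
    error_message = str(err)
    print(error_message)

# Wrapping the values in a list makes them a single argument: one new row
elections.append(['ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)'])
print(len(elections))     # number of rows
print(len(elections[0]))  # number of columns in the first row
```

The failed call leaves the list untouched, so after the second call `elections` holds exactly one row of three values.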
