Python: Creating a table to CSV from multiple selenium.text lists

Question

I am trying to take the text of three lists of Selenium elements scraped from a website and put it in a table, so that I end up with an array of 25 rows by 3 columns.

Example of desired end output:

Country	Date	Election Type
Zambia	August 12, 2021 (All day)	General

Current code below:

# Installing selenium in Jupyter notebook
!pip install selenium

#checking my file path since we'll need to make sure our webdriver is in the same path
import os
import sys
os.path.dirname(sys.executable)

#Opens webbrowser chrome
from selenium import webdriver

#Manually directing the webdriver path and specifying the Chrome browser
browser = webdriver.Chrome('/Users/peterschoffelen/Documents/Copulus/chromedriver')
type(browser)

#Pulls up the webpage for the elections calendar
browser.get('https://www.ndi.org/elections-calendar')

#Clicking load more button

#tools needed
import time
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

#Finding the link
linkElem = browser.find_element_by_link_text('LOAD MORE')
type(linkElem)

#Clicks the LOAD MORE button repeatedly until it can no longer be found, so that all available elections are shown
while True:
    try:
        #Re-locate the button on every pass; the reference found earlier can go stale once the page updates
        loadMoreButton = browser.find_element_by_link_text('LOAD MORE')
        time.sleep(2)
        loadMoreButton.click()
        time.sleep(5)
    except Exception as e:
        print(e)
        break
print("Complete")
time.sleep(10)

#creating an object for each of the elements I am interested in
election_date = browser.find_elements_by_css_selector('span.date-display-single')
election_country = browser.find_elements_by_css_selector('div.election-title')
election_type = browser.find_elements_by_css_selector('div.election-type')

#checking the type for the created objects and looking at the first element in each object
print(type(election_date))
print(type(election_date[0]))
print(election_date[0])

#The checks above showed that we need to specifically extract the text from the elements we scraped
print(election_date[0].text)
print(election_country[0].text)
print(election_type[0].text)

#With this number being greater than 10, we can be confident we have gotten all the elections available
print(len(election_date))

#Creating that list as an object

elections = []

for d, t, c in zip(election_date, election_type, election_country):
    date_text = d.text
    type_text = t.text
    country_text = c.text
    elections.append(country_text)
    elections.append(type_text)
    elections.append(date_text)

#Show us the whole table we have now created of three columns
print(elections)

#Looking at the first three entries
print(elections[0:3])

import numpy as np

democracy = np.array(elections)

#Shows that I have just created a single-column, 75-row array, which is not what I want :(
print(democracy.shape)

Thanks in advance for any help!

Going to test it now, but pretty sure you want to say elections.append([country_text, type_text, date_text])
– Jeremy Kahan, Commented Jul 19, 2021 at 18:03
Since asking my question I have made some progress:
```
date_text = []
for date in election_date:
    text = date.text
    date_text.append(text)
print(date_text)

type_text = []
for t in election_type:
    text = t.text
    type_text.append(text)
print(type_text)

country_text = []
for c in election_country:
    text = c.text
    country_text.append(text)
print(country_text)

#x=[]
#for i in document:
#    x.append(i.text)

election_table = np.vstack([date_text, type_text, country_text])
print(election_table)
print(election_table.shape)
```
– Peter Schoffelen, Commented Jul 19, 2021 at 18:08
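A note on the np.vstack approach in the comment above: stacking three lists vertically puts each list in its own row, so the result has 3 rows and one column per election rather than one row per election. The sketch below illustrates this with two hypothetical sample entries standing in for the scraped text (the real lists would hold 25 each), and shows np.column_stack as one possible way to get one row per election instead.

```python
import numpy as np

# Hypothetical stand-ins for the .text values scraped from the page
date_text = ["AUGUST 12, 2021 (ALL DAY)", "SEPTEMBER 8, 2021 (ALL DAY)"]
type_text = ["GENERAL", "GENERAL"]
country_text = ["ZAMBIA", "MOROCCO"]

# vstack treats each list as one row: 3 rows, one column per election
stacked = np.vstack([date_text, type_text, country_text])
print(stacked.shape)

# column_stack treats each list as one column: one row per election
table = np.column_stack([country_text, date_text, type_text])
print(table.shape)
print(table[0])
```

Transposing the vstack result (stacked.T) would give the same orientation; column_stack simply skips the extra step.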

Jeremy Kahan · Accepted Answer · 2021-07-19 18:18:22Z

1

If I replace your three append lines with elections.append([country_text, type_text, date_text]), I get a 25 by 3 result, which I think is what you wanted. The output looks like:

[['ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)'], ['MOROCCO', 'GENERAL', 'SEPTEMBER 8, 2021 (ALL DAY)'], ['RUSSIA', 'LEGISLATIVE', 'SEPTEMBER 19, 2021 (ALL DAY)'], ['HAITI', 'PRESIDENTIAL AND PARLIAMENTARY', 'SEPTEMBER 26, 2021 (ALL DAY)'], ['SOMALIA', 'PRESIDENTIAL', 'OCTOBER 10, 2021 (ALL DAY)'], ['IRAQ', 'PARLIAMENTARY', 'OCTOBER 10, 2021 (ALL DAY)'], ['BULGARIA', 'PRESIDENTIAL', 'OCTOBER 20, 2021 (ALL DAY)'], ['CHAD', 'LEGISLATIVE', 'OCTOBER 24, 2021 (ALL DAY)'], ['UZBEKISTAN', 'GENERAL', 'OCTOBER 24, 2021 (ALL DAY)'], ['CAPE VERDE', 'PRESIDENTIAL', 'OCTOBER 2021'], ['MALI', 'REFERENDUM', 'OCTOBER 31, 2021 (ALL DAY)'], ['NICARAGUA', 'GENERAL', 'NOVEMBER 7, 2021 (ALL DAY)'], ['ARGENTINA', 'LEGISLATIVE', 'NOVEMBER 14, 2021 (ALL DAY)'], ['VENEZUELA', 'MUNICIPAL, REGIONAL', 'NOVEMBER 21, 2021 (ALL DAY)'], ['CHILE', 'GENERAL', 'NOVEMBER 21, 2021 (ALL DAY)'], ['HONDURAS', 'GENERAL', 'NOVEMBER 28, 2021 (ALL DAY)'], ['THE GAMBIA', 'PRESIDENTIAL', 'DECEMBER 4, 2021 (ALL DAY)'], ['HONG KONG', 'LEGISLATIVE', 'DECEMBER 19, 2021 (ALL DAY)'], ['COSTA RICA', 'GENERAL', 'FEBRUARY 6, 2022 (ALL DAY)'], ['HONG KONG', 'EXECUTIVE', 'MARCH 27, 2022 (ALL DAY)'], ['COLOMBIA', 'PRESIDENTIAL', 'MAY 29, 2022 (ALL DAY)'], ['INDIA', 'PRESIDENTIAL', 'JULY 2022'], ['KENYA', 'GENERAL', 'AUGUST 9, 2022 (ALL DAY)'], ['BRAZIL', 'GENERAL', 'OCTOBER 2, 2022 (ALL DAY)'], ['NIGERIA', 'GENERAL', 'FEBRUARY 18, 2023 (ALL DAY)']]

The first 3 items are: [['ZAMBIA', 'GENERAL', 'AUGUST 12, 2021 (ALL DAY)'], ['MOROCCO', 'GENERAL', 'SEPTEMBER 8, 2021 (ALL DAY)'], ['RUSSIA', 'LEGISLATIVE', 'SEPTEMBER 19, 2021 (ALL DAY)']] and the dimensions are: (25, 3)
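Since the question title mentions CSV, here is a minimal sketch of writing such a list of rows to a file with Python's standard csv module. The two sample rows, the output path, and the variable names rows and csv_path are assumptions for illustration; the column order is rearranged to match the Country, Date, Election Type header from the question.

```python
import csv
import os
import tempfile

# Hypothetical sample rows in the (country, type, date) order built above
elections = [
    ["ZAMBIA", "GENERAL", "AUGUST 12, 2021 (ALL DAY)"],
    ["MOROCCO", "GENERAL", "SEPTEMBER 8, 2021 (ALL DAY)"],
]

# Reorder each row to match the desired header: Country, Date, Election Type
rows = [[country, date, kind] for country, kind, date in elections]

# Assumed output location; any writable path would do
csv_path = os.path.join(tempfile.gettempdir(), "elections.csv")

# newline="" lets the csv module control line endings itself
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Country", "Date", "Election Type"])
    writer.writerows(rows)
```

The dates contain commas, so the writer wraps those fields in quotes; reading the file back with csv.reader restores them as single values.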

answered Jul 19, 2021 at 18:18



5 Comments

Peter Schoffelen Over a year ago

Awesome! This did the job I was hoping for. Can you help me understand why adding the brackets allows this operation to work, versus elections.append(country_text, type_text, date_text), which did not work for me?

Jeremy Kahan Over a year ago

elections.append(text1, text2, text3) does not work at all: list.append takes exactly one argument, so Python raises a TypeError. Your three separate append calls each added one item to a 1-D list. A 2-D list/array is a list/array of lists/arrays, so appending [text1, text2, text3] adds another row (of 3 columns).
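The behaviour of the different append forms can be checked directly. This is a small sketch using hypothetical sample strings; the names flat, table, and error_message are only for illustration.

```python
# Three separate append calls each add one item to a flat, 1-D list
flat = []
flat.append("ZAMBIA")
flat.append("GENERAL")
flat.append("AUGUST 12, 2021 (ALL DAY)")
print(len(flat))

# Passing three arguments in one call fails, because list.append
# accepts exactly one argument
try:
    flat.append("MOROCCO", "GENERAL", "SEPTEMBER 8, 2021 (ALL DAY)")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Appending a single list adds one row of three columns
table = []
table.append(["ZAMBIA", "GENERAL", "AUGUST 12, 2021 (ALL DAY)"])
print(len(table), len(table[0]))
```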

Prophet Over a year ago

@PeterSchoffelen if this resolved your question - why didn't you accept the answer?

Peter Schoffelen Over a year ago

@Prophet my mistake, I am new to asking questions on stack exchange

Prophet Over a year ago

I understand that, it's OK, mate.
