I've never really used the csv module before, but I've managed to scrape a bunch of information from websites and now want to put it in a CSV file. The issue I'm having is that every value I write gets split into individual characters separated by commas (e.g. Jane Doe becomes J,a,n,e, ,D,o,e).

Also, I have three lists (one with names, one with emails, and one with titles) and I would like to write each one as its own column in the CSV file (so col1 = name, col2 = title, col3 = email).

Any thoughts on how to execute this? Thanks.


from bs4 import BeautifulSoup
import requests
import csv


# Read the whitespace-separated URLs from the input file.
with open('websites.txt', 'r') as f:
    urls = f.read().split()

name_lst = []
position_lst = []
email_lst = []

for url in urls:

    print(f'CURRENTLY PARSING: {url}')
    print()

    res = requests.get(url)
    soup = BeautifulSoup(res.text, 'html.parser')

    try:
        for information in soup.find_all('tr', class_='sidearm-staff-member'):
            names = information.find("th", attrs={'headers': "col-fullname"}).text.strip()
            positions = information.find("td", attrs={'headers': "col-staff_title"}).text.strip()
            emails = information.find("td", attrs={'headers': "col-staff_email"}).script
            target = emails.text.split('var firstHalf = "')[1]
            fh = target.split('";')[0]
            lh = target.split('var secondHalf = "')[1].split('";')[0]
            emails = fh + '@' + lh

            name_lst.append(names)
            position_lst.append(positions)
            email_lst.append(emails)


    except Exception as e:
        pass
       
with open('test.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)
    for line in name_lst:
        csv_writer.writerow(line)
    for line in position_lst:
        csv_writer.writerow(line)
    for line in email_lst:
        csv_writer.writerow(line)

2 Answers

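Writing your data column by column is straightforward: write one row per list index, where each row contains the elements of the three lists at that index. Your values were being split into characters because writerow() expects a sequence of fields, so a plain string is treated as a sequence of single-character fields. Here is the code: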

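# newline='' prevents extra blank lines between rows on Windows
with open('test.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    # zip() yields one (name, position, email) tuple per index
    for name, position, email in zip(name_lst, position_lst, email_lst):
        csv_writer.writerow([name, position, email])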
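
To see the difference between the two behaviours side by side, here is a small self-contained sketch. The sample names, titles, and emails are made up for illustration, and it writes to an in-memory io.StringIO buffer instead of test.csv so it can be run without touching the disk. The header row is optional and is included only because the question asks for named columns.

```python
import csv
import io

# Hypothetical sample data standing in for the scraped lists.
name_lst = ["Jane Doe", "John Smith"]
position_lst = ["Head Coach", "Assistant Coach"]
email_lst = ["jdoe@example.com", "jsmith@example.com"]

# writerow() treats its argument as a sequence of fields, so a bare
# string is written as one field per character.
broken = io.StringIO(newline="")
csv.writer(broken).writerow("Jane Doe")
broken_output = broken.getvalue()

# Passing a list gives one field per list element; zip() pairs up the
# entries that share an index across the three lists.
fixed = io.StringIO(newline="")
writer = csv.writer(fixed)
writer.writerow(["Name", "Title", "Email"])  # optional header row
writer.writerows(zip(name_lst, position_lst, email_lst))
fixed_output = fixed.getvalue()

print(repr(broken_output))
print(fixed_output)
```

Swapping the StringIO buffer for open('test.csv', 'w', newline='') should give the same rows in a file.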

Assuming that name_lst, position_lst and email_lst are all correct and are the same size, your problem is in the last part of your code, where you write the data to a CSV file.

Here is a way to do this:

fieldnames = ['Name', 'Position', 'Email']
# newline='' prevents extra blank lines between rows on Windows
with open('Data_to_Csv.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()  # first row: Name,Position,Email
    for i in range(len(name_lst)):
        writer.writerow({'Name': name_lst[i], 'Position': position_lst[i], 'Email': email_lst[i]})

This would of course fail with an IndexError if position_lst or email_lst is shorter than name_lst. You need to add a dummy value for any entry that is not available, so that all three lists have the same number of values.
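
If the lists do end up with different lengths, one possible way to pad the short ones is itertools.zip_longest from the standard library. This is only a sketch: the sample data is invented, the output goes to an in-memory buffer rather than Data_to_Csv.csv, and the empty string used as the fill value is an arbitrary choice.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical lists of unequal length: the last email is missing.
name_lst = ["Jane Doe", "John Smith"]
position_lst = ["Head Coach", "Assistant Coach"]
email_lst = ["jdoe@example.com"]

fieldnames = ["Name", "Position", "Email"]
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()

# zip_longest() keeps going until the longest list is exhausted and
# substitutes fillvalue for the entries that are missing.
for row in zip_longest(name_lst, position_lst, email_lst, fillvalue=""):
    writer.writerow(dict(zip(fieldnames, row)))

rows = buffer.getvalue().splitlines()
print(rows)
```

Note that this pads only at the end of a short list. If a value is missing in the middle, every later entry in that column shifts up by one row, so appending a placeholder at scrape time, when the value is found to be missing, is the safer option.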
