I'm scraping search results from Bing. Everything works except the output to the CSV file. I've also tried pandas but can't get the output right. I need the "url" in column A and the "name" in column B, next to the corresponding link.

example search link

def scrape():
    urls = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "b_algo")))
    url = [div.find_element_by_tag_name('a').get_attribute('href') for div in urls]
    names = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "b_algo")))
    name = [div.find_element_by_tag_name('h2 > a').get_attribute('innerHTML').split('-')[0].strip() for div in names]

    x1 = [url]
    x2 = [name]

    pp.pprint([url,name])

    with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
        wr = csv.writer(f)
        for items in x1:
            wr.writerow([x1,x2])
scrape()

2 Answers

Try this: put the URLs in the first column and the names in the second, then write the DataFrame to the CSV file.

import pandas as pd
df = pd.DataFrame(url)
df.columns =['A']
df['B']=name
print(df)
df.to_csv(bing_parameters.file_name, index=False)

2 Comments

That worked perfectly. How do I append to the file instead of overwriting it with pandas?
df.to_csv(bing_parameters.file_name, mode='a', index=False)
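
One caveat with mode='a': to_csv writes the header row by default, so every append adds another "A,B" line to the file. A minimal sketch of one way around that, writing the header only when the file is new or empty. The helper name `append_rows`, the temporary file, and the sample rows are hypothetical, not from the question:

```python
import os
import tempfile

import pandas as pd


def append_rows(df, path):
    """Append df to the CSV at path; write the header only for a new or empty file."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode="a", index=False, header=write_header)


# Hypothetical demo: two appends to a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "results.csv")
    append_rows(pd.DataFrame({"A": ["https://www.example.com/1"], "B": ["foo"]}), demo_path)
    append_rows(pd.DataFrame({"A": ["https://www.example.com/2"], "B": ["bar"]}), demo_path)
    with open(demo_path, encoding="utf-8") as f:
        demo_lines = f.read().splitlines()

print(demo_lines)
```

In the scraper you would pass bing_parameters.file_name as the path instead of the temporary file.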

Let's say you have this data:

x1 = ['foo']
x2 = ['https://www.example.com']

Then your existing code is doing something like this:

for items in x1:
    print([x1, x2])

Giving this incorrect output:

[['foo'], ['https://www.example.com']]

The code loops over the contents of x1 (a list containing one item, so the loop runs once) and writes a row containing x1 and x2, which are both lists rather than individual values.
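
To see what this does to the file itself, here is a small sketch that writes to an in-memory buffer instead of a real file (the io.StringIO buffer is my substitution for the demo; the data is the same sample as above). csv.writer converts any non-string cell with str(), so a list ends up in the cell as its bracketed text:

```python
import csv
import io

x1 = ["foo"]
x2 = ["https://www.example.com"]

buf = io.StringIO()
wr = csv.writer(buf)
wr.writerow([x1, x2])        # each cell is a whole list, converted with str()
wr.writerow([x1[0], x2[0]])  # each cell is a single string

rows = buf.getvalue().splitlines()
print(rows[0])  # the list text, brackets and quotes included
print(rows[1])  # plain values
```

The first row is the kind of output the question describes; the second is what a spreadsheet expects.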

If x1 and x2 are always single-item lists, you can explicitly select the first item in each and dispense with the loop:

with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerow([x1[0], x2[0]])

Or simply don't create these redundant lists:

with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerow([name, url])

If x1 and x2 contain multiple corresponding items, you can zip them together:

x1 = [name1, name2]
x2 = [url1, url2]

with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    for name, url in zip(x1, x2):
        wr.writerow([name, url])

or even:

x1 = [name1, name2]
x2 = [url1, url2]

with open(bing_parameters.file_name, 'a', newline='\n', encoding='utf-8') as f:
    wr = csv.writer(f)
    wr.writerows(zip(x1, x2))
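
Putting the zip approach together for the layout asked for in the question (link in column A, name in column B), a minimal self-contained sketch might look like this. The function name `append_results`, the temporary file, and the sample values are hypothetical; I've also used newline='' when opening the file, which is what the csv module documentation recommends:

```python
import csv
import os
import tempfile


def append_results(path, urls, names):
    """Append one row per search result: link in column A, name in column B."""
    if len(urls) != len(names):
        raise ValueError("urls and names must have the same length")
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(zip(urls, names))


# Hypothetical sample data standing in for the scraped lists.
urls = ["https://www.example.com/1", "https://www.example.com/2"]
names = ["foo", "bar"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.csv")
    append_results(path, urls, names)
    # Read the file back to check what was written.
    with open(path, newline="", encoding="utf-8") as f:
        written = list(csv.reader(f))

print(written)
```

In the scraper, the call would be something like append_results(bing_parameters.file_name, url, name), using the two lists built in scrape().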
