I have scraped some links using the following code, but I can't seem to store them in a single column of an Excel sheet. When I run the code, each link address gets split into its individual characters, which end up spread across multiple columns.

for h1 in soup.find_all('h1', class_="title entry-title"):
    print(h1.find("a")['href'])

This prints all the links I need.

To store them in a CSV file I used:

import csv
with open('links1.csv', 'wb') as f:
    writer = csv.writer(f)
    for h1 in soup.find_all('h1', class_="title entry-title"):
        writer.writerow(h1.find("a")['href'])

I also tried storing the result in a variable, for instance:

for h1 in soup.find_all('h1', class_="title entry-title"):
    dat = h1.find("a")['href']

and then tried passing dat to other CSV-writing code, but that did not work either.

3 Comments
  • What do you mean by "store in one column"? Like one link per line? Commented Jan 28, 2017 at 21:31
  • A pandas DataFrame may make things a lot easier (see the sketch after these comments). Commented Jan 28, 2017 at 21:31
  • @Bobby yes, one link per line. Commented Jan 28, 2017 at 21:33
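
For what the pandas suggestion would look like, here is a minimal sketch, assuming soup is the BeautifulSoup object from the question; the column name link and the output file name are arbitrary:

import pandas as pd

# collect every href into a list, then write it out as a single-column CSV
links = [h1.find("a")['href']
         for h1 in soup.find_all('h1', class_="title entry-title")]
pd.DataFrame({'link': links}).to_csv('links1.csv', index=False)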

1 Answer

If you only need one link per line, you may not even need a csv writer; this looks like plain file writing to me.

The newline character should serve you well for one link per line:

with open('file', 'w') as f:
    for h1 in soup.find_all('h1', class_="title entry-title"):
        # write each href on its own line
        f.write(str(h1.find("a")['href']) + '\n')

2 Comments

Yeah, I originally wanted to get CSV as output, but I guess I can do this and then import it into Excel.
@song0089 There is nothing magical about CSV; just name your file xxx.csv. If you have more than one column in a row, separate the values with commas. This is fairly simple for now; if the data gets complicated, I would say check out pandas.
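
For completeness: the original attempt spread each link across many cells because writer.writerow() expects a sequence of fields, so a bare string is treated as a sequence of single characters. Wrapping each href in a one-item list keeps the whole link in one column. A minimal sketch, reusing the links1.csv name from the question (newline='' is what the csv module expects when the file is opened in text mode on Python 3):

import csv

with open('links1.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for h1 in soup.find_all('h1', class_="title entry-title"):
        # a one-item list means a single field, i.e. one column per row
        writer.writerow([h1.find("a")['href']])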
