
I'm trying to save all the data (i.e. all pages) in a single CSV file, but this code only saves the final page's data. For example, here url[] contains 2 URLs, and the final CSV only contains the 2nd URL's data. I'm clearly doing something wrong in the loop, but I don't know what. Also, the page contains 100 data points, but this code only writes the first 44 rows. Please help with this issue.

from bs4 import BeautifulSoup
import requests
import csv
url = ["http://sfbay.craigslist.org/search/sfc/npo","http://sfbay.craigslist.org/search/sfc/npo?s=100"]
for ur in url:
    r = requests.get(ur)
    soup = BeautifulSoup(r.content)
    g_data = soup.find_all("a", {"class": "hdrlnk"})
    gen_list=[]
    for row in g_data:
       try:
            name = row.text
       except:
            name=''
       try:
            link = "http://sfbay.craigslist.org"+row.get("href")
       except:
            link=''
       gen=[name,link]
       gen_list.append(gen)

with open ('filename2.csv','wb') as file:
    writer=csv.writer(file)
    for row in gen_list:
        writer.writerow(row)

2 Answers


gen_list is being re-initialized inside your loop that runs over the URLs, so each page's rows overwrite the previous page's:

gen_list=[]

Move this line above the for loop so the list accumulates rows from every page:

...
url = ["http://sfbay.craigslist.org/search/sfc/npo","http://sfbay.craigslist.org/search/sfc/npo?s=100"]
gen_list=[]
for ur in url:
...
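
For completeness, here is a minimal corrected sketch of the question's script with that one change applied (assuming Python 3, so the CSV file is opened in text mode with newline='' and an explicit parser is passed to BeautifulSoup):

from bs4 import BeautifulSoup
import requests
import csv

url = ["http://sfbay.craigslist.org/search/sfc/npo",
       "http://sfbay.craigslist.org/search/sfc/npo?s=100"]

gen_list = []  # accumulates rows from every page
for ur in url:
    r = requests.get(ur)
    soup = BeautifulSoup(r.content, "html.parser")
    for row in soup.find_all("a", {"class": "hdrlnk"}):
        name = row.text
        link = "http://sfbay.craigslist.org" + row.get("href", "")
        gen_list.append([name, link])

# write everything once, after all pages have been scraped
with open("filename2.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(gen_list)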


I found your post later; you may want to try this method:

import requests
from bs4 import BeautifulSoup
import csv

final_data = []
url = "https://sfbay.craigslist.org/search/sss"
r = requests.get(url)
data = r.text

soup = BeautifulSoup(data, "html.parser")
get_details = soup.find_all(class_="result-row")

# collect the link from each result row
for details in get_details:
    getclass = details.find_all(class_="hdrlnk")
    for link in getclass:
        link1 = link.get("href")
        final_data.append([link1])
print(final_data)

filename = "sfbay.csv"
with open("./" + filename, "w", newline="") as csvfile:
    writer = csv.writer(csvfile, delimiter=",")
    writer.writerows(final_data)
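
If you need more than one results page, as in the question, the same pattern can be wrapped in a loop over the URLs (a sketch, assuming both pages use the same result-row/hdrlnk markup and the ?s=100 offset from the question):

import csv
import requests
from bs4 import BeautifulSoup

urls = ["https://sfbay.craigslist.org/search/sfc/npo",
        "https://sfbay.craigslist.org/search/sfc/npo?s=100"]

final_data = []
for url in urls:
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    # one row per listing link on this page
    for details in soup.find_all(class_="result-row"):
        for link in details.find_all(class_="hdrlnk"):
            final_data.append([link.get("href")])

with open("sfbay.csv", "w", newline="") as csvfile:
    csv.writer(csvfile).writerows(final_data)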

