
I am trying to scrape lyrics from an API and print the responses directly to a CSV file, like so:

def scrape_genius_lyrics(urls):

    all_lyrics=[]

    headers = {'Authorization': 'mytoken'}
    base_url = 'https://genius.com/'

    with codecs.open('genius.csv', 'ab', encoding='utf8') as outputfile:
        outwriter = csv.writer(outputfile)

    for url in urls:
        page_url = base_url + url
        try:
            page = requests.get(page_url, headers=headers)
            html = BeautifulSoup(page.text, "html.parser")
            [h.extract() for h in html('script')]
            lyrics = html.find('div', class_='lyrics').get_text()         
            # outwriter.writerow(lyrics)
            all_lyrics.append(lyrics)
            print lyrics
        except:
            print 'could not find page for {}'.format(url)

However, I only see responses if I comment out outwriter.writerow(lyrics); otherwise the program halts and does not print lyrics.

How can I save each song's lyrics to its own row of the CSV file, at each iteration?
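(A minimal sketch, stdlib only and using a throwaway temp directory as a hypothetical stand-in for genius.csv, suggests why the program halts: the with block closes the file before the loop runs, so the next writerow raises ValueError:)

```python
import csv
import os
import tempfile

# Hypothetical stand-in for genius.csv, written to a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "genius.csv")

with open(path, "w", newline="", encoding="utf8") as outputfile:
    outwriter = csv.writer(outputfile)
# The with block has ended here, so outputfile is now closed.

try:
    outwriter.writerow(["some lyrics"])
except ValueError as err:
    print(err)  # I/O operation on closed file.
```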

  • [h.extract() for h in html('script')] on its own does nothing... Did you want to save that list? Commented Sep 23, 2017 at 23:32

1 Answer


You should indent that for loop into the with block, so the file is still open when the writer is used:

with codecs.open('genius.csv', 'ab', encoding='utf8') as outputfile:
    outwriter = csv.writer(outputfile)

    for url in urls:
        page_url = base_url + url
        ...

You should also decide whether you really need to hold all_lyrics in memory while you write the same information to the file.

You can always re-open the file later and read all_lyrics back from it.
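One more pitfall worth flagging before you uncomment that line: csv.writer.writerow expects a sequence of fields, so passing the lyrics string directly iterates it character by character, one column per letter. Wrapping it in a one-element list puts the whole string in a single cell. A small sketch (with hypothetical lyrics text, writing to an in-memory buffer):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

lyrics = "Hello darkness"  # hypothetical lyrics text

writer.writerow(lyrics)    # wrong: string is iterated character by character
writer.writerow([lyrics])  # right: one field, one row per song

print(buf.getvalue())
# H,e,l,l,o, ,d,a,r,k,n,e,s,s
# Hello darkness
```

So inside your loop, outwriter.writerow([lyrics]) should give you one row per song.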


