I have a loop. Every time the loop runs, a new list is created. I want to add all of these lists together. My code is as follows:

while i < len(symbolslist):

    htmltext = urllib.urlopen('my-url.com/'+symbolslist[i]).read()
    pattern = re.compile('<a target="_blank" href="(.+?)" rel="nofollow"')
    applink = re.findall(pattern, htmltext)
    applink += applink
    i+=1

where applink is a list. However, with my current code, only the last two lists get added together. What am I doing wrong?

Thanks!

  • applink = re.findall(pattern,htmltext) <...> applink += applink Commented Jun 30, 2015 at 4:18

1 Answer

The issue is that you are using applink as the variable name to store the list returned by re.findall(), so every iteration replaces whatever you had collected so far with a new list. The following applink += applink then only appends that newest list to itself. Instead, store the result of re.findall() under a different name and extend applink with it (using extend() or +=).

Code -

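import re
import urllib  # Python 2; urllib.urlopen was moved to urllib.request in Python 3

# symbolslist is assumed to be defined earlier, as in the question
regex = '<a target="_blank" href="(.+?)" rel="nofollow"'
pattern = re.compile(regex)  # compile once, outside the loop

applink = []  # accumulates the links from every page
i = 0
while i < len(symbolslist):
    url = "http://www.indeed.com/resumes/-/in-Singapore?co=SG&start=" + str(symbolslist[i])
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    tempapplink = re.findall(pattern, htmltext)  # links found on this page only
    print tempapplink
    applink += tempapplink  # extend the running list instead of replacing it
    i += 1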
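
If you want to try the accumulation pattern without hitting the network, here is a minimal Python 3 sketch. The PAGES dictionary, the symbol names "AAA" and "BBB", and the collect_links helper are all made up for illustration and stand in for the downloaded HTML; only the regular expression comes from the code above.

```python
import re

# Hypothetical stand-in for the downloaded pages, so the sketch runs offline.
PAGES = {
    "AAA": '<a target="_blank" href="/app/1" rel="nofollow">one</a>',
    "BBB": '<a target="_blank" href="/app/2" rel="nofollow">two</a>'
           '<a target="_blank" href="/app/3" rel="nofollow">three</a>',
}

# Same pattern as above, compiled once.
PATTERN = re.compile(r'<a target="_blank" href="(.+?)" rel="nofollow"')


def collect_links(symbols, pages):
    """Return every matched href across all symbols, in order."""
    all_links = []  # the accumulator lives outside the loop
    for symbol in symbols:
        found = PATTERN.findall(pages[symbol])  # new list for this page only
        all_links.extend(found)  # grow the accumulator in place
    return all_links


applink = collect_links(["AAA", "BBB"], PAGES)
print(applink)  # ['/app/1', '/app/2', '/app/3']
```

Iterating with for symbol in symbols also removes the need for the manual i counter, which is one less thing to forget to initialise or increment.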