Python add string to a list loop

Question

I have a variable vk_read, produced by Python's HTMLParser, which holds data like this: ['id168233095']

Now I'm trying to collect every value of vk_read into a single list by the time the script finishes. It should look like: ['id168233095', 'id1682334534', 'id16823453', 'etc...']

if vk_read:
    vk_ids = []
    for line in vk_read:
        if vk_read != '':
            vk_ids.append(vk_read)
            print(vk_ids)

This is the result:

['id168233095']
['id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095']
['id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095', 'id168233095']

After some advice, the code was changed (see the end of this post):

if vk_read not in vk_ids:
    vk_ids.append(vk_read)
print(vk_ids)

But in this case result is:

['id45849605']
['id91877071']
['id17422363']
['id119899405']
['id65045632']
['id168233095']

It looks as if vk_read adds itself up to 10 times, and only then does my script move on to the next value.

I also tried list.insert() and got the same result.

How can I write this loop so that all the different results end up in one list, with the script running as many times as there is data to find in the parsed file?

Nota bene: I've updated the code as advised to use list1.append(list0), but in my case this method still returns the same result as described above. I also changed the list name to avoid further confusion.

LAST UPDATE: Thanks for helping, guys, you really pushed me in the right direction (same on Stack Overflow).

The problem appears to be that you are reinitializing the list to an empty list in each iteration:

from html.parser import HTMLParser
import re, sys, random, csv

with open('test.html', 'r', encoding='utf-8') as content_file:
    read_data = content_file.read()

vk_ids = []

class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        href = str(attrs)
        for line in href:
            id_tag = re.findall(r'/\S+$', href)
            id_raw = str(id_tag)

            if re.search(r"/\w+'\)\]", id_raw):
                global vk_read
                vk_read = id_raw
            else:
                break
            for ch in ['/', ')', '[', ']', '"', "'"]:
                if ch in vk_read:

                    vk_read = vk_read.replace(ch, "")

            # https://stackoverflow.com/questions/30328193/python-add-string-to-a-list-loop
            # vk_read is a single string, so test it as a whole rather than per character
            if vk_read not in vk_ids:
                vk_ids.append(vk_read)
            print(vk_ids)
            break

N.B. After last changes

print(type(vk_ids))
<class 'list'>
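
The effect of re-initializing the list on every pass can be reproduced in isolation. The sketch below is only an illustration: the sample values and both function names are made up for the demonstration and do not come from the original script.

```python
# Hypothetical stand-in for the values the parser produces one at a time.
incoming = ["id45849605", "id91877071", "id45849605", "id17422363"]


def collect_resetting(values):
    """Buggy pattern: the list is recreated for every value."""
    for value in values:
        vk_ids = []          # wiped on each pass
        vk_ids.append(value)
    return vk_ids            # only the last value survives


def collect_accumulating(values):
    """Fixed pattern: the list is created once, before the loop."""
    vk_ids = []
    for value in values:
        if value and value not in vk_ids:
            vk_ids.append(value)
    return vk_ids


print(collect_resetting(incoming))     # ['id17422363']
print(collect_accumulating(incoming))  # ['id45849605', 'id91877071', 'id17422363']
```

The first function matches the symptom of one ID per printed list; the second keeps every distinct ID.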

for line in vk_read: Why aren't you using line inside your for-loop?
– Steven Rumbalski, Commented May 19, 2015 at 14:17
It's probably a good idea not to name a variable list, as it shadows an often used builtin.
– Steven Rumbalski, Commented May 19, 2015 at 14:17
list.insert(0, vk_read) is a very inefficient operation because each time you insert an item all the other items need to be shifted one location to the right. This will become really slow if your list grows large.
– Steven Rumbalski, Commented May 19, 2015 at 14:19
@trianglesis is it your actual indentation? If so, everything after id_tag = re.findall(...) is wrong. I assume it should all be in the for line in href loop.
– Julien Spronck, Commented May 19, 2015 at 16:08
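
If newest-first order is really what is wanted, there are two common ways to avoid the cost of repeated list.insert(0, ...). This is a small sketch with made-up sample values, not code from the thread: append and reverse once at the end, or use collections.deque, whose appendleft does not shift the existing items.

```python
from collections import deque

ids = ["id1", "id2", "id3"]  # hypothetical sample values

# Option 1: append normally, then reverse once at the end.
newest_first = []
for vk_id in ids:
    newest_first.append(vk_id)
newest_first.reverse()

# Option 2: a deque supports cheap insertion at the left end.
front_loaded = deque()
for vk_id in ids:
    front_loaded.appendleft(vk_id)

print(newest_first)        # ['id3', 'id2', 'id1']
print(list(front_loaded))  # ['id3', 'id2', 'id1']
```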

Luis · Accepted Answer · 2017-09-26 21:16:01Z

3

How about:

vk_ids = []
if vk_read:
    for line in vk_read:
        vk_ids.append(format(line))
    print(vk_ids)


Julien Spronck · Accepted Answer · 2015-05-19 15:02:56Z

0

It appears that you are inside a loop and that vk_read is a string that changes at each iteration. In that case, initialize the list once, before the loop:

vk_ids = [] ## initialize list outside the main loop

## main loop
for some_variable in some_kind_of_iterator: ## this is just a placeholder, i don't know what your loop looks like.

    ## get the value for vk_read
    vk_read = ...

    ## append to vk_ids
    if vk_read and vk_read not in vk_ids:
        vk_ids.append(vk_read)

print(vk_ids)
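
For a large number of IDs, the `not in vk_ids` test scans the whole list each time. One possible variation, sketched here with a hypothetical helper name and sample data, keeps a set alongside the list so the membership test stays cheap while the original order is preserved.

```python
def unique_in_order(values):
    """Return the non-empty values with duplicates dropped, keeping first-seen order."""
    seen = set()      # fast membership checks
    result = []       # preserves insertion order
    for value in values:
        if value and value not in seen:
            seen.add(value)
            result.append(value)
    return result


sample = ["id1", "", "id2", "id1", "id3", "id2"]  # hypothetical input
print(unique_in_order(sample))  # ['id1', 'id2', 'id3']
```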


trianglesis Over a year ago

I've been trying different constructions and also trying to keep the code readable, but now I have vk_ids.append(vk_read), and vk_ids is <class 'list'>, yet the list still does not collect the different values from the variable. I'm missing something.

Julien Spronck · Accepted Answer · 2015-05-19 15:06:35Z

0

In your code, you were not making use of the line variable inside the loop. At each iteration, you are inserting the entire vk_read variable.

Assuming that vk_read is a list, you can use a list comprehension:

lis = [line for line in vk_read if line != '']
print(lis)

If you need it reversed (as seems to be the case, judging by your use of insert), just use reversed:

lis = list(reversed([line for line in vk_read if line != '']))

However, vk_read seems to be a string, not a list, in which case iterating over it yields single characters.
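
The difference between looping over a string and appending it whole can be checked directly. In this sketch, as_list is a hypothetical helper added only for illustration; the sample value is the one from the question.

```python
vk_read = "id168233095"  # a single string, as in the question

# Iterating over a string yields one character at a time.
chars = [ch for ch in vk_read]
print(chars[:3])  # ['i', 'd', '1']

# Appending the string itself keeps it whole.
vk_ids = []
vk_ids.append(vk_read)
print(vk_ids)  # ['id168233095']


def as_list(value):
    """Hypothetical helper: wrap a lone string, copy any other iterable."""
    return [value] if isinstance(value, str) else list(value)


print(as_list("id1"))           # ['id1']
print(as_list(["id1", "id2"]))  # ['id1', 'id2']
```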

edited May 19, 2015 at 15:06

answered May 19, 2015 at 14:19


Steven Rumbalski Over a year ago

His example code is actually equivalent to lis = reversed([vk_read for line in vk_read if vk_read != '']). The if vk_read != '' can be skipped as the loop wouldn't happen if vk_read were equal to an empty string. reversed is used because OP is using list.insert(0, vk_read). The most efficient equivalent would be lis = len(vk_read) * [vk_read] (reversed doesn't really matter because we're just inserting vk_read, not vk_read's ordered contents.)

Julien Spronck Over a year ago

I just assumed that it was a mistake since line is not referenced inside the loop

Frank V Over a year ago

While this may or may not be right, you should try to establish the problem the OP ran in to and why this can help solve the problem.

Julien Spronck Over a year ago

@FrankV let me know if my last edit is more helpful.

trianglesis Over a year ago

I tried:

if vk_read:
    vk_ids = [line for line in vk_read if line != '']
    print(vk_ids)

And I still get the same result, only now split into individual characters: ['i', 'd', '1', '6', '8', '2', '3', '3', '0', '9', '5']


Community · Accepted Answer · 2017-05-23 12:08:09Z

0

My bad, I was doing it wrong: I ran the iteration and the list append in a way that wiped the previous list every time. Here is the comment about it.
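
One way to make this kind of accidental reset harder is to keep the list on the parser instance instead of in a global. The following is a minimal sketch under stated assumptions: the class name VkIdParser and the sample markup are hypothetical (the real test.html is not shown in the question), and the ID is assumed to be the last path segment of each href.

```python
from html.parser import HTMLParser


class VkIdParser(HTMLParser):
    """Collects the last path segment of every href, without duplicates."""

    def __init__(self):
        super().__init__()
        self.vk_ids = []  # created once per parser, not once per tag

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        for name, value in attrs:
            if name == "href" and value:
                vk_id = value.rstrip("/").rsplit("/", 1)[-1]
                if vk_id and vk_id not in self.vk_ids:
                    self.vk_ids.append(vk_id)


# Hypothetical markup standing in for the parsed file.
html = (
    '<a href="/id168233095">a</a>'
    '<a href="/id45849605">b</a>'
    '<a href="/id168233095">c</a>'
)
parser = VkIdParser()
parser.feed(html)
print(parser.vk_ids)  # ['id168233095', 'id45849605']
```

Reading the attribute pairs directly also avoids converting attrs to a string and cleaning it up with regular expressions and character replacement.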

edited May 23, 2017 at 12:08 by Community Bot

answered May 19, 2015 at 19:02 by trianglesis
