2

I'm pretty new to python, but I think I catch on fast.

Anyways, I'm making a program (not for class, but to help me) and have come across a problem.

I'm trying to document a list of things, and by things I mean close to a thousand of them, with some repeating. So my problem is this:

I would not like to add redundant names to the list, instead I would just like to add a 2x or 3x before (or after, whichever is simpler) it, and then write that to a txt document.

I'm fine with reading and writing from text documents, but my only problem is the conditional statement, I don't know how to write it, nor can I find it online.

for lines in list_of_things:
   if(lines=="XXXX x (name of object here)"):

And then whatever under the if statement. My only problem is that the "XXXX" can be replaced with any string number, but I don't know how to include a variable within a string, if that makes any sense. Even if it is turned into an int, I still don't know how to use a variable within a conditional.

The only thing I can think of is making multiple if statements, which would be really long.

Any suggestions? I apologize for the wall of text.

1
  • To clarify, are you saying you have a source with potentially redundant lines, and ultimately you want to output unique lines prefixed with a count? Also, is order important? Commented Mar 28, 2012 at 17:43

6 Answers 6

5

I'd suggest looping over the lines in the input file and using a dictionary keyed on each line: insert the key the first time you see a line, increment its value by one every time you see that line again, and then generate your output file from that dictionary.

catalog = {}
for line in input_file:
    if line in catalog:
        catalog[line] += 1
    else:
        catalog[line] = 1

Alternatively, with a defaultdict:

from collections import defaultdict
catalog = defaultdict(int)
for line in input_file:
    catalog[line] += 1

Then just run through that dict and print it out to a file.
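A minimal sketch of that last step, assuming Python 3. The helper names `count_lines` and `format_catalog` and the sample data are made up for illustration. Lines are stripped so that identical names compare equal even if one has a trailing newline, and the count prefix is only added for repeated items.

```python
from collections import defaultdict


def count_lines(lines):
    """Count how often each non-empty line occurs (hypothetical helper)."""
    catalog = defaultdict(int)
    for line in lines:
        name = line.strip()  # drop the trailing newline and stray spaces
        if name:
            catalog[name] += 1
    return catalog


def format_catalog(catalog):
    """Build one output line per unique name, prefixed with 'Nx ' if repeated."""
    out = []
    for name, count in catalog.items():
        if count > 1:
            out.append("{0}x {1}".format(count, name))
        else:
            out.append(name)
    return out


# Sample data standing in for the lines of the input file
sample = ["sword\n", "shield\n", "sword\n", "potion\n", "sword\n"]
result = format_catalog(count_lines(sample))
print("\n".join(result))
```

To get a text document out of this, the joined string could be written with something like `with open("counts.txt", "w") as f: f.write("\n".join(result) + "\n")`, where `counts.txt` is just a placeholder file name.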

8 Comments

I think this is what he was asking. Same as what I was about to suggest.
@NolenRoyalty: Ultimately I would suggest it too, but it should be an appended bit of info, after first explaining the standard dictionary way, since he is a new Python programmer.
@jdi Fair enough, the solution is correct either way (assuming we've understood the question correctly).
Yes, that's what I was asking. I had to look into dictionaries a bit, but I got it. Seemed simple enough, thanks for the timely response! I guess I was just approaching it the wrong way.
@user1298844: Once you feel like you understand dictionaries, you should look up defaultdict as Nolen Royalty was suggesting. It makes this counter a little simpler because any key you haven't seen yet is automatically created with a value of 0.
1

You may be looking for regular expressions and something like

import re

for line in text:
    match = re.match(r'(\d+) x (.*)', line)
    if match:
        count = int(match.group(1))
        object_name = match.group(2)
        ...
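One possible way to continue from there, assuming the goal is to total the counts per object name. The name `parse_counted` and the sample lines are made up for illustration, and treating a line without a count prefix as a single item is an assumption, not something the question states.

```python
import re

# Same pattern as above: a number, " x ", then the object name
COUNT_PATTERN = re.compile(r'(\d+) x (.*)')


def parse_counted(line):
    """Return (count, name) for a line such as '3 x sword' (hypothetical helper)."""
    match = COUNT_PATTERN.match(line)
    if match:
        return int(match.group(1)), match.group(2)
    return 1, line  # assumption: a bare name stands for a single item


totals = {}
for line in ["2 x sword", "shield", "3 x sword"]:
    count, name = parse_counted(line)
    totals[name] = totals.get(name, 0) + count

print(totals)
```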

1 Comment

I get the sense that the pattern the OP was showing was really an awkward approach to wanting to count the lines, and that he had preformatted these strings as he ingested them and then wanted to reparse them.
0

Something like this?

list_of_things=['XXXX 1', 'YYYY 1', 'ZZZZ 1', 'AAAA 1', 'ZZZZ 2']

for line in list_of_things:
    for e in ['ZZZZ','YYYY']:
        if e in line:
            print(line)

Output:

YYYY 1
ZZZZ 1
ZZZZ 2

You can also use if line.startswith(e): or a regex (if I am understanding your question...)
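A short sketch of the startswith variant, assuming Python 3. `str.startswith` accepts a tuple of prefixes, so the inner loop is not needed; the variable names `prefixes` and `matches` are only for illustration.

```python
list_of_things = ['XXXX 1', 'YYYY 1', 'ZZZZ 1', 'AAAA 1', 'ZZZZ 2']
prefixes = ('ZZZZ', 'YYYY')  # startswith accepts a tuple of prefixes

# Keep only the lines that begin with one of the prefixes
matches = [line for line in list_of_things if line.startswith(prefixes)]
print(matches)
```

Unlike `e in line`, this only matches at the beginning of the line, which may or may not be what you want.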

0

To include a variable in a string, use format():

>>> i = 123
>>> s = "This is an example {0}".format(i)
>>> s
'This is an example 123'

In this case, the {0} indicates that you're going to put a variable there. If you have more variables, use "This is an example {0} and more {1}".format(i, j) (so a number for each variable, starting from 0).
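Applied to the conditional from the question, this might look like the sketch below. The values of `name`, `count`, and `line` are made-up sample data; the point is only that the formatted string can be compared like any other string.

```python
name = "sword"
count = 3

# Build the string to compare against from the two variables
expected = "{0} x {1}".format(count, name)

line = "3 x sword"
is_match = (line == expected)
print(is_match)
```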

0

This should do it:

from itertools import groupby

# groupby only merges adjacent equal items, so the input must already be sorted/grouped
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
print(["%dx %s" % (len(list(group)), key) for key, group in groupby(a)])
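Since groupby only groups consecutive equal items, an unsorted list needs sorting first. A small sketch of that, assuming Python 3 and using made-up sample names:

```python
from itertools import groupby

things = ['sun', 'moon', 'green', 'grey', 'sun', 'grass', 'green']

# sorted() puts equal names next to each other so groupby sees each name once
labels = ["%dx %s" % (len(list(group)), key)
          for key, group in groupby(sorted(things))]
print(labels)
```

Note that sorting loses the original order of the list, so this only fits if order does not matter.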

7 Comments

Especially with the OP admittedly being a new python programmer, you should maybe be a little less obscure with this answer.
He said he catches fast :) hehe I'm sorry if this was too advanced or something, I just hope it helps the OP, but remember this is a public Q&A and people who are not beginners might come by later.
Well if you are going to suggest something advanced, at least fix it so it doesn't use constant string concatenation: ["%dx %s" % (len(list(group)), key) for key, group in groupby(a)]
@jdi Maybe I sounded like a smartass in my last comment? I wasn't trying to be one. I'm just trying to help out. Good suggestion BTW, I've edited.
This is actually pretty interesting. The class I'm taking is fairly slow, unfortunately. I get the gist of this, and am playing around with groupby a bit more to see exactly how you did that. Thanks!
0

There are two ways to approach this. 1) Use a dictionary to capture the count of each item, then build a list that formats each item with its count:

list_of_things = ['sun', 'moon', 'green', 'grey', 'sun', 'grass', 'green']
listItemCount = {}
countedList = []
# count how many times each item appears
for lines in list_of_things:
    if lines in listItemCount:
        listItemCount[lines] += 1
    else:
        listItemCount[lines] = 1
# add the count suffix only for items that repeat
for id in listItemCount:
    if listItemCount[id] > 1:
        countedList.append(id + ' - x' + str(listItemCount[id]))
    else:
        countedList.append(id)
for item in countedList:
    print(item)

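The output of the above would be the following (the order of the lines depends on the Python version; since Python 3.7, dicts keep insertion order):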

sun - x2
grass
green - x2
grey
moon

Or 2) use collections.Counter to make things simpler, as shown below:

import collections

list_of_things = ['sun', 'moon', 'green', 'grey', 'sun', 'grass', 'green']
# Counter does the counting loop for us
listItemCount = collections.Counter(list_of_things)
listItemCountDict = dict(listItemCount)
countedList = []
# add the count suffix only for items that repeat
for id in listItemCountDict:
    if listItemCountDict[id] > 1:
        countedList.append(id + ' - x' + str(listItemCountDict[id]))
    else:
        countedList.append(id)
for item in countedList:
    print(item)

The output of the above would be the following (again, the order of the lines depends on the Python version):

sun - x2
grass
green - x2
grey
moon
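If a predictable order is wanted, `Counter.most_common()` returns the items sorted from most to least frequent. A minimal sketch, assuming Python 3 and reusing the sample list from above; the variable names `counts` and `lines` are only for illustration.

```python
from collections import Counter

list_of_things = ['sun', 'moon', 'green', 'grey', 'sun', 'grass', 'green']
counts = Counter(list_of_things)

lines = []
# most_common() yields (item, count) pairs, highest count first
for name, count in counts.most_common():
    if count > 1:
        lines.append("{0} - x{1}".format(name, count))
    else:
        lines.append(name)

for item in lines:
    print(item)
```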
