4

I have a list of strings, some of which are equal. I need a script that counts how many times each string occurs. For example:

I have a list with some words:

"House"
"Dream"
"Tree"
"Tree"
"House"
"Sky"
"House"

And the output should look like this:

"House" - 3
"Tree" - 2
"Dream" - 1
and so on

2
  • sort file.txt | uniq -c will do what you want on unix or cygwin. Otherwise, if this is an assignment, you need to tell us what you tried already and what didn't work about it. Commented Dec 2, 2011 at 23:41
  • Do they need to be sorted in the result? Commented Dec 2, 2011 at 23:59

5 Answers

8

Use collections.Counter(). It is designed for exactly this use case:

>>> import collections
>>> seq = ["House", "Dream", "Tree", "Tree", "House", "Sky", "House"]
>>> for word, cnt in collections.Counter(seq).most_common():
...     print('%r - %d' % (word, cnt))
...

'House' - 3
'Tree' - 2
'Sky' - 1
'Dream' - 1
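
For readers on Python 3, the same idea can be sketched as below. This is a minimal sketch, not part of the original answer: the helper name count_words is hypothetical, and the double-quoted output format is taken from the question. Words with equal counts come out in whatever order most_common() yields them, so the order of the count-1 entries should not be relied on across Python versions.

```python
from collections import Counter


def count_words(words):
    """Return (word, count) pairs, most frequent first (hypothetical helper)."""
    return Counter(words).most_common()


seq = ["House", "Dream", "Tree", "Tree", "House", "Sky", "House"]

# Print each word in the "word" - count format used in the question.
for word, cnt in count_words(seq):
    print('"%s" - %d' % (word, cnt))
```

The first two lines printed are "House" - 3 and "Tree" - 2, followed by the words that occur once.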

2 Comments

This is a great solution, but note that Counter is only available in Python 2.7 and later.
There is a Py2.5 and Py2.6 backport of Counter at code.activestate.com/recipes/576611
5

Solution

This is quite simple (words is a list of words you want to process):

result = {}
for word in set(words):
    result[word] = words.count(word)

It does not require any additional modules.

Test

For the following words value:

words = ['House', 'Dream', 'Tree', 'Tree', 'House', 'Sky', 'House']

it will give you the following result:

>>> result
{'Dream': 1, 'House': 3, 'Sky': 1, 'Tree': 2}

Does it answer your question?

3 Comments

If you want to avoid the standard library for some reason, it would be better to replace words.count(word) with result.get(word, 0) + 1. This simple change replaces an O(n) operation with an O(1) operation.
@RaymondHettinger: The change may be simple, but it is more involved than what you proposed; as written, your proposal does not work. Did you mean that I should replace set(words) with words, and result[word] = words.count(word) with result[word] = result.get(word, 0) + 1?
result = {} and for word in words: result[word] = result.get(word, 0) + 1 and if you want to put a bow-tie on it: for word, cnt in sorted(result.items(), reverse=True): print repr(word), '-', cnt
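
The single-pass variant discussed in these comments can be sketched as follows, written for Python 3. The function name count_with_dict is hypothetical. Sorting by descending count and then alphabetically is an addition made here so that the output order is deterministic; the comment above sorts the items in reverse order instead.

```python
def count_with_dict(words):
    """Count occurrences in one pass over the list (hypothetical helper)."""
    counts = {}
    for word in words:
        # dict.get returns 0 for a word that has not been seen yet.
        counts[word] = counts.get(word, 0) + 1
    return counts


words = ['House', 'Dream', 'Tree', 'Tree', 'House', 'Sky', 'House']
result = count_with_dict(words)

# Most frequent first; ties are broken alphabetically.
ordered = sorted(result.items(), key=lambda item: (-item[1], item[0]))
for word, cnt in ordered:
    print('"%s" - %d' % (word, cnt))
# "House" - 3
# "Tree" - 2
# "Dream" - 1
# "Sky" - 1
```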
3
from collections import defaultdict

# strings is the list of strings to count
counts = defaultdict(int)  # missing keys start at 0
for s in strings:
    counts[s] += 1
for k, v in counts.items():
    print('"%s" - %d' % (k, v))

1 Comment

defaultdict(int) is enough. You don't need lambdas here
2

I will extend Tadeck's answer to print the results.

for word in set(words):
    print('"%s" - %d' % (word, words.count(word)))

2 Comments

The same comment applies as with Tadeck's solution. Each call to words.count(word) is an O(n) scan of the list. You're much better off using a Python dictionary with its O(1) lookups.
@RaymondHettinger: Same comment as below my answer: your comment is about word in words loop, not word in set(words) loop, correct?
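
To make the cost difference in this exchange concrete, here is a rough sketch for Python 3 that checks both approaches give the same counts and then times them. The function names, the generated sample data, and the sizes (500 distinct words, 5000 items) are assumptions chosen for illustration. Looping over set(words) still calls words.count once per distinct word, so its cost grows with the number of distinct words times the list length, while the dictionary version makes one pass.

```python
import random
import timeit


def count_via_list_count(words):
    """One full scan of the list per distinct word."""
    return {word: words.count(word) for word in set(words)}


def count_single_pass(words):
    """One dictionary update per list item."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Made-up sample data: 5000 items drawn from 500 distinct words.
random.seed(0)
vocabulary = ["word%d" % i for i in range(500)]
sample = [random.choice(vocabulary) for _ in range(5000)]

# Both approaches must agree on the counts.
assert count_via_list_count(sample) == count_single_pass(sample)

# Timings vary by machine, so no expected numbers are shown.
for func in (count_via_list_count, count_single_pass):
    seconds = timeit.timeit(lambda: func(sample), number=3)
    print("%s: %.4f s" % (func.__name__, seconds))
```

For a short list with few distinct words, such as the one in the question, the difference is negligible.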
1

The code below should give you the expected output:

stringvalues = ['House', 'Home', 'House', 'House', 'Home']
newdict = {}  # maps each string to the number of times it has been seen
for s in stringvalues:
    if s in newdict:
        newdict[s] = newdict[s] + 1
    else:
        newdict[s] = 1
for k, v in newdict.items():
    print("%s-%s" % (k, v))
