I have a list and I want to build a vocabulary from it. Then I want to show each word along with the number of times that string appears in the list.

The sample list is below.

    new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

First, I created the vocabulary with the code below.

    vocabulary = []
    for word in new_list:
        # keep only the first occurrence of each word
        if word not in vocabulary:
            vocabulary.append(word)
    print(vocabulary)

The vocabulary contains five words: count, once, one, this, thus (listed here alphabetically; the code above keeps them in order of first appearance).

I want to show how many times each word occurs in the list, like this: [count][1], [once][2], [one][2], [this][1], [thus][2].

To get that result, I tried the code below.

    matris = []

    for word in new_list:
        # one single-element row per list entry: only the count, not the word
        matris.append([new_list.count(word)])

    for x in matris:
        print(x)

This code prints only the counts. Can someone advise how I can print each word together with its count, in the [once][2] format?

2 Answers

Use a Counter dict to get the word counts, then iterate over its .items():

from collections import Counter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

cn = Counter(new_list)
for k,v in cn.items():
    print("{} appears {} time(s)".format(k, v))

If you want that particular output format, put the brackets in the str.format template:

for k,v in cn.items():
    print("[{}][{}]".format(k,v))

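Output (the order depends on the Python version; from Python 3.7 on, a Counter keeps first-appearance order):

[thus][2]
[count][1]
[one][2]
[once][2]
[this][1]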
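
Since the question also asks for the vocabulary itself, here is a minimal sketch, assuming Python 3, of how the same Counter could serve as the vocabulary: its keys are the distinct words, so sorting them gives the alphabetical listing from the question. The name `vocabulary` is taken from the question; nothing else is new.

```python
from collections import Counter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
cn = Counter(new_list)

# The keys of the Counter are the distinct words, i.e. the vocabulary.
# Iterating a Counter yields its keys, so sorted(cn) sorts the words.
vocabulary = sorted(cn)

for word in vocabulary:
    print("[{}][{}]".format(word, cn[word]))
```

This prints the pairs in alphabetical order of the word, regardless of the order in which the words first appear in the list.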

To get the output from highest count to lowest, use .most_common() (words with equal counts are not sorted alphabetically):

cn = Counter(new_list)
for k,v in cn.most_common():
    print("[{}][{}]".format(k,v))

Output:

[once][2]
[thus][2]
[one][2]
[count][1]
[this][1]

If you want the counts sorted from highest to lowest and, among words with the same count, the words sorted alphabetically, pass sorted a key of (-x[1], x[0]). Negating the count sorts it in descending order, while the word breaks ties in ascending order:

for k, v in sorted(cn.items(), key=lambda x: (-x[1],x[0])):
    print("[{}][{}]".format(k, v))

Output:

[once][2]
[one][2]
[thus][2]
[count][1]
[this][1]
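
If the pairs should appear on a single comma-separated line, as written in the question, one possible sketch (assuming Python 3) is to build each piece with the same format string and join them. The names `ordered` and `line` are only illustrative.

```python
from collections import Counter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
cn = Counter(new_list)

# Sort by descending count first, then alphabetically by word.
ordered = sorted(cn.items(), key=lambda x: (-x[1], x[0]))

# Format every (word, count) pair and join them into one line.
line = ", ".join("[{}][{}]".format(k, v) for k, v in ordered)
print(line)
```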
3 Comments

Thanks Padraic. It works well. I also want to sort the words by count from maximum to minimum: first by count, then alphabetically, such as [once][2], [one][2], [thus][2], [count][1], [this][1].
See the wiki: wiki.python.org/moin/HowTo/Sorting, especially the section on key functions. Sort by word first, THEN by value to get what you expect. Also note that sorting in Python is stable, according to this question: stackoverflow.com/questions/1915376/…
@Padraic it does not sort alphabetically. "for k, v in sorted(cn.items(), key=lambda x: -x[1]):" It still sorts according to number of counts only.
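
The two-pass approach suggested in the comments (sort by word first, then by count) could be sketched roughly as follows, assuming Python 3. It relies on Python's sort being stable, and on reverse=True keeping the existing order of items with equal keys. The names `by_word` and `by_count` are hypothetical.

```python
from collections import Counter
from operator import itemgetter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
cn = Counter(new_list)

# Pass 1: alphabetical order by word.
by_word = sorted(cn.items(), key=itemgetter(0))

# Pass 2: descending order by count. The sort is stable, so words
# with the same count keep the alphabetical order from pass 1.
by_count = sorted(by_word, key=itemgetter(1), reverse=True)

for k, v in by_count:
    print("[{}][{}]".format(k, v))
```

This should give the same order as a single sort with the key (-x[1], x[0]); which of the two reads better is a matter of taste.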
new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
vocabulary = list(dict.fromkeys(new_list))
print(*vocabulary, sep = "\n")

OUTPUT:
one
thus
once
count
this

#######################
matris = ["[" + str(item) + "][" + str(new_list.count(item)) + "]"
          for item in new_list]
print(*list(dict.fromkeys(matris)), sep = "\n")

OUTPUT:
[one][2]
[thus][2]
[once][2]
[count][1]
[this][1]
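
The version above calls new_list.count once per element, so it rescans the list for every word. As an alternative sketch (not the answer's own method), the counts can be collected in a single pass with a plain dict. This assumes Python 3.7 or later, where dicts keep insertion order, which the dict.fromkeys calls above rely on as well. The name `counts` is illustrative.

```python
new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

# Single pass: add 1 to the running count of each word.
counts = {}
for word in new_list:
    counts[word] = counts.get(word, 0) + 1

# Keys come back in first-appearance order (Python 3.7+).
matris = ["[{}][{}]".format(word, n) for word, n in counts.items()]
print(*matris, sep="\n")
```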
