2

I'm new to python and I'm trying to write a function whose description is as follows: I have a list of integers. From this list I have to find the item with max frequency and print that. This seems pretty straight forward unless I have a limitation that the function must complete the execution within 10 sec and should consume memory < 512 MB. For shorter list length my function works fine but for a list of length 100000 it tanks. I'm not able to optimize the code. I have 2 implementations for the same:

Implementation #1

def returnMaxFrequency(ar):
    freqList = []
    for val in ar:
        freq = ar.count(val)
        freqList.append(freq)
    return(max(freqList))

Implementation #2

def returnMaxFrequency(ar):   
    freqDict = {x:ar.count(x) for x in ar}   
    maxFreq = max(freqDict.values())
    return maxFreq

Eg

if ar = [3 2 1 3]
o/p: 2

Using NumPy is not an option here. (Can't use external package)

6

6 Answers 6

4

Easiest (and reasonably fast) is likely the built-in Counter:

from collections import Counter
winner = Counter(ar).most_common(1)[0]

An even faster method (and using no extra memory, but destroying the original array) is given in this article, replicated here:

# Python program to find the maximum repeating number 

# Returns maximum repeating element in arr[0..n-1]. 
# The array elements are in range from 0 to k-1 
def maxRepeating(arr, n,  k): 

    # Iterate though input array, for every element 
    # arr[i], increment arr[arr[i]%k] by k 
    for i in range(0,  n): 
        arr[arr[i]%k] += k 

    # Find index of the maximum repeating element 
    max = arr[0] 
    result = 0
    for i in range(1, n): 

        if arr[i] > max: 
            max = arr[i] 
            result = i 

    # Uncomment this code to get the original array back 
    #for i in range(0, n): 
    #    arr[i] = arr[i]%k 

    # Return index of the maximum element 
    return result 

(Parts of this code could be replaced by more performant alternates, in particular using the max function instead of the second loop.)

Sign up to request clarification or add additional context in comments.

2 Comments

He wrote he can't even use numpy as he "Can't use external package"
@Aryerez: collections is not an external package.
3

Both of your implementations are basically the same, the second one only uses a list comprehension instead of the for loop. Both algorithms are in O(n^2) because count is in O(n) and you are calling it n times (once for each value).

If you want to optimize, reduce the complexity (to O(n)):

def returnMaxFrequency(ar):   
    freqDict = {x:0 for x in ar}
    for val in ar:
        freqDict[val] = freqDict[val] + 1
    maxFreq = max(freqDict.values())
    return maxFreq

2 Comments

You can skip the dict initialization by using collections.defaultdict.
defaultdict is not required to skip dictionary initialization. dict.get(key,[default=0]) will do
3

Hope this helps!

We are using Python's High-Performance container datatypes(Counter)

from collections import Counter

def returnMaxFrequency(ar):
    return max(Counter(t).values())

Counter does a frequency mapping of your number and created a dict, once dict is created you are using a max to get max-freq of the list.

Using Dict is an efficient way of generating a frequency count, unless your are going for distributed compute solutions

Note: collections is a python in-built package i.e. comes with setup. Not an external library.

9 Comments

He wrote he can't even use numpy as he "Can't use external package"
@Aryerez - It's in-built package. Here is a link for your reference - docs.python.org/2/library/collections.html
Regarding "High-Performance container datatypes": Counter is implemented in Python, and is no more performant than the dict solution, just more readable and succint. It is way faster than any solution involving count, though.
@Amadan - thanks! can you prove if dict has faster performance than Counter?
@Amadan I just reran the timings given in this answer and it seems Counter is actually faster than solutions based on dict and defaultdict for recent Python versions (tested this with v3.6.7).
|
1

Returns most frequently occurring value in list

max(set(ar), key=ar.count) 

1 Comment

It needs to be added that this has horrible algorithmic complexity.
0

The second implement is nice, but inside the dict-comprehension change ar to set(ar), and it will check every item only once:

def returnMaxFrequency(ar):   
    freqDict = {x:ar.count(x) for x in set(ar)}   
    maxFreq = max(freqDict.values())
    return maxFreq

Comments

0

What about this?:

max(ar.count(i) for i in ar)

Or this?:

max(map(ar.count,ar))

1 Comment

Solutions iterating the input list more than once are inferior.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.