What is more efficient? Using .replace() or passing string to list

Question

Solving the following problem from CodeFights:

Given two strings, find the number of common characters between them. For s1 = "aabcc" and s2 = "adcaa", the output should be commonCharacterCount(s1, s2) = 3.

Strings have 3 common characters - 2 "a"s and 1 "c".

The way I approached it, whenever I took a letter into account I wanted to cancel it out so as not to count it again. I know strings are immutable, even when using methods such as .replace() (replace() method returns a copy of the string, no the actual string changed).

In order to mutate said string what I tend to do at the start is simply pass it on to a list with list(mystring) and then mutate that.

Question is... what is more efficient of the following? Take into account that option B gets done over and over, worst case scenario the strings are equal and have a match for match. Meanwhile option A happens once.

Option A)

list(mystring)

Option B)

mystring = mystring.replace(letterThatMatches, "")

Have you tried timing them? Also, option A does not do any work with said list, so those aren't really equivalent. — Chris
– Chris, Commented Jul 6, 2017 at 17:21
Neither, just make a set of both strings and ask for the intersection. — Matteo Italia
– Matteo Italia, Commented Jul 6, 2017 at 17:22
@MatteoItalia: a set will not take the number of as, etc. into account. — willeM_ Van Onsem
– willeM_ Van Onsem, Commented Jul 6, 2017 at 17:24

willeM_ Van Onsem · Accepted Answer · 2017-07-06 17:30:43Z

The idea of calling replace on the string for each element you find, is simply not a good idea: it takes O(n) to do that. If you do that for every character in the other string, it will result in an O(m×n) algorithm with m the number of characters of the first string, and n the number of characters from the second string.

You can simply use two Counters, then calculate the minimum of the two, and then calculate the sum of the counts. Like:

from collections import Counter

def common_chars(s1,s2):
    c1 = Counter(s1) # count for every character in s1, the amount
    c2 = Counter(s2) # count for every character in s2, the amount
    c3 = c1 & c2 # produce the minimum of each character
    return sum(c3.values()) # sum up the resulting counts

Or as a one-liner:

def common_chars(s1,s2):
    return sum((Counter(s1) & Counter(s2)).values())

If dictionary lookup can be done in O(1) (which usually holds for an average case), this is an O(m+n) algorithm: the counting then happens in O(m) and O(n) respectively, and calculating the minimum runs in the number of different characters (which is at most O(max(m,n)). Finally taking the sum is again an O(max(m,n)) operation.

For the sample input, this results in:

>>> common_chars("aabcc","adcaa")
3

Collectives™ on Stack Overflow

What is more efficient? Using .replace() or passing string to list

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related