1

Solving the following problem from CodeFights:

Given two strings, find the number of common characters between them. For s1 = "aabcc" and s2 = "adcaa", the output should be commonCharacterCount(s1, s2) = 3.

Strings have 3 common characters - 2 "a"s and 1 "c".

The way I approached it, whenever I took a letter into account I wanted to cancel it out so as not to count it again. I know strings are immutable, even when using methods such as .replace() (replace() method returns a copy of the string, no the actual string changed).

In order to mutate said string what I tend to do at the start is simply pass it on to a list with list(mystring) and then mutate that.

Question is... what is more efficient of the following? Take into account that option B gets done over and over, worst case scenario the strings are equal and have a match for match. Meanwhile option A happens once.

Option A)

list(mystring) 

Option B)

mystring = mystring.replace(letterThatMatches, "")
3
  • Have you tried timing them? Also, option A does not do any work with said list, so those aren't really equivalent. Commented Jul 6, 2017 at 17:21
  • 2
    Neither, just make a set of both strings and ask for the intersection. Commented Jul 6, 2017 at 17:22
  • 1
    @MatteoItalia: a set will not take the number of as, etc. into account. Commented Jul 6, 2017 at 17:24

1 Answer 1

7

The idea of calling replace on the string for each element you find, is simply not a good idea: it takes O(n) to do that. If you do that for every character in the other string, it will result in an O(m×n) algorithm with m the number of characters of the first string, and n the number of characters from the second string.

You can simply use two Counters, then calculate the minimum of the two, and then calculate the sum of the counts. Like:

from collections import Counter

def common_chars(s1,s2):
    c1 = Counter(s1) # count for every character in s1, the amount
    c2 = Counter(s2) # count for every character in s2, the amount
    c3 = c1 & c2 # produce the minimum of each character
    return sum(c3.values()) # sum up the resulting counts

Or as a one-liner:

def common_chars(s1,s2):
    return sum((Counter(s1) & Counter(s2)).values())

If dictionary lookup can be done in O(1) (which usually holds for an average case), this is an O(m+n) algorithm: the counting then happens in O(m) and O(n) respectively, and calculating the minimum runs in the number of different characters (which is at most O(max(m,n)). Finally taking the sum is again an O(max(m,n)) operation.

For the sample input, this results in:

>>> common_chars("aabcc","adcaa")
3
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.