4

I have this code here, it is supposed to remove the common letters from both the lists n1 and n2. But when i run this code it only runs once as in it removes only 'a' from both n1 and n2 and doesnt remove 'k'.

Just to clarify this code should always work on only 2 words.

name1 = "abdjek"
name2 = "doarhsnk"

n1l = list(name1)
n2l = list(name2)

for i in range(len(n1l)):
   for j in range(len(n2l)):
         if n1l[i] == n2l[j]:
               n1l.pop(i)
               n2l.pop(j)
               n1l.append('0')
               n2l.append('1')

Ok wait, it seems to work for the above 2 names but when i have name1 = "naveen" and name2 = "darshana" it doesnt work!

6
  • When I run this, it produces the expected result of n1l = ['b', 'j', 'e', '0', '0', '0'] and n2l = ['o', 'r', 'h', 's', 'n', '1', '1', '1']. Commented Jan 31, 2010 at 20:46
  • I just pasted it into my python interpreter and it seems to do what you want (removes duplicate letters and adds 0 or 1 to keep the length of the lists the same). That's not the algorithm I would have used, though. Commented Jan 31, 2010 at 20:47
  • You could do this a lot more efficiently with set operators. Commented Jan 31, 2010 at 20:54
  • 1
    There are much more Pythonic ways to do this. The most obvious things I would change are: 1. Strings are effectively sequences of characters, so you don't need to explicitly create lists. 2. for ch in aString: and then using ch is much preferred to for i in range(len(aString)): and then using aString[i]. Commented Jan 31, 2010 at 20:58
  • What if there are 3 occurrences of a letter in name1 and only 1 in name 2? Are they all removed? Commented Jan 31, 2010 at 21:02

6 Answers 6

5

I suggest a much simpler approach:

def removecommon(name1, name2):
  common = set(name1).intersection(name2)
  res1 = ''.join(n for n in name1 if n not in common)
  res2 = ''.join(n for n in name2 if n not in common)
  return res1, res2

n1, n2 = removecommon('naveen', 'darshana')
print n1, n2

emits vee drsh as desired.

Edit: as the OP now specified (in a comment -- pls remember to edit your question too, oh OP!) that he actually wants to remove only the first occurrence in each word of each common letter, the needed algorithm is of course completely different. A simple approach (feasible if the length of the words is not too high):

def removefirstcommon(name1, name2):
  common = set(name1).intersection(name2)
  n1 = list(name1)
  for c in common: n1.remove(c)
  n2 = list(name2)
  for c in common: n2.remove(c)
  return ''.join(n1), ''.join(n2)

A more elaborate approach (while slower for normal-length words) would be faster for extremely long words (since the following is O(N) while the former's O(N squared)):

def removefirstcommonlongwords(name1, name2):
  common = set(name1).intersection(name2)
  def mustrem(c, copycom):
    res = c not in copycom
    copycom.discard(c)
    return res
  cop = set(common)
  n1 = [c for c in name1 if mustrem(c, cop)]
  n2 = [c for c in name2 if mustrem(c, common)]
  return ''.join(n1), ''.join(n2)
Sign up to request clarification or add additional context in comments.

2 Comments

@Alex Martelli thanks for this new approach but i have 1 more problem, i have to remove only the first instance of the match, i must get an out put of 'veen' and 'darsha'. I dont want to remove all the instances of 'n' and 'a'
@Prabhu, how do you get 'darsha' by removing the first occurrences of a and n from 'darshana'? 'drshaa' would seem to follow the "remove the first occurrence" rule. Clarify please?
2

A more pythonic approach would be to use sets and list comprehensions.

name1 = "naveen"; name2 = "darshana"

name1_set=set(name1)
name2_set=set(name2)

clean1=[x for x in  name1 if x not in name2_set]
clean2=[x for x in name2 if x not in name1_set]

clean1.extend(['0']*(len(name1)-len(clean1)))
clean2.extend(['1']*(len(name2)-len(clean2)))

print clean1,clean2

set gives us O(1) lookups, thus making the whole process faster by making it O(N) instead of O(N^2).

EDIT: In light of your later comment that the number of occurrences matter, this is the updated version that takes that into account.

name1 = "naveen"; name2 = "darshana"

count1={}
count2={}


for x in name1:
    count1[x]=count1.get(x,0)+1

for x in name2:
    count2[x]=count2.get(x,0)+1

def remove_dups(name,count,null):
    clean=[]
    for x in name:
        if count.get(x,0):
            count[x]-=1
        else:
            clean.append(x)
    clean.extend([null]*(len(name)-len(clean)))
    return clean

clean1=remove_dups(name1,count2,'0')
clean2=remove_dups(name2,count1,'1')

print clean1,clean2

It uses dicts to keep counts of occurrences. Whenever a character is removed, the corresponding count for the other name is decremented. Complexity is still O(N).

It prints ['v', 'e', 'e', 'n', '0', '0'] and ['d', 'r', 's', 'h', 'a', 'a', '1', '1']. Is that what you wanted?

2 Comments

set takes an iterable, so set('foo') works, no need to convert to an intermediate list.
@Mike: Right. Forgot that string is already iterable. Answer updated, thanks.
0

It's working here for me. That is, if I add print statements thus:

name1 = "abdjek"
name2 = "doarhsnk"

n1l = list(name1)
n2l = list(name2)

print "Lists before loop:"
print n1l
print n2l

for i in range(len(n1l)):
    for j in range(len(n2l)):
        if n1l[i] == n2l[j]:
           n1l.pop(i)
           n2l.pop(j)
           n1l.append('0')
           n2l.append('1')

print "Lists after loop:"
print n1l
print n2l

the characters 'a', 'd', and 'k' are all removed:

> python test.py 
Lists before loop:
['a', 'b', 'd', 'j', 'e', 'k']
['d', 'o', 'a', 'r', 'h', 's', 'n', 'k']
Lists after loop:
['b', 'j', 'e', '0', '0', '0']
['o', 'r', 'h', 's', 'n', '1', '1', '1']

1 Comment

ok i think it works... but when i have name1 = "naveen" name2 = "darshana" it doesnt work!!
0

Your code might not be working as you expect it to, as it is removing paired letters. For instance, you see an a, you then remove both a's from your words...

Comments

0

Your code most probably fails because you pop common letters from anywhere in the list, but append the replacements ("0" and "1") to the end of the list. They should be at position i and j, respectively.

So the loop should probably look like this:

for i in range(len(n1l)):
   for j in range(len(n2l)):
         if n1l[i] == n2l[j] and n1l[i] not in ("0", "1"):
               print "common letter ", n1l[i]
               # Replace i-th and j-th element
               n1l[i] = "0"
               n2l[j] = "1"

Anyway, there more "pythonic" ways, which are already shown in the other answers.

EDIT: Tested and working also with name1 = "naveen" / name2 = "darshana".

Comments

0

Here's some quite (IMHO quite elegant) code that runs in O(n). If word 1 has N occurrences of the letter x, it removes the first N x's from word 2 (and vice versa) -- I think this is what you want, but I could be wrong.

from collections import defaultdict

def build(s, chars_s, chars_t):
    """Return characters of s, with duplicate characters from t removed."""
    for i, char in enumerate(s):
        indexes_s, indexes_t = chars_s[char], chars_t[char]
        if len(indexes_s) > len(indexes_t) and i >= indexes_s[len(indexes_t)]:
            yield char

def rm_dup(a, b):
    """Pairwise remove duplicate letters in a and b."""
    chars_a, chars_b = defaultdict(list), defaultdict(list)
    for i, char in enumerate(a): chars_a[char].append(i)
    for i, char in enumerate(b): chars_b[char].append(i)
    return (''.join(build(a, chars_a, chars_b)),
            ''.join(build(b, chars_b, chars_a)))

print rm_dup('naveen', 'darshana')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.