Matching Strings in Python?

Question

Using Python, how can I check whether 3 consecutive chars within a string (A) are also contained in another string (B)? Is there any built-in function in Python?

EXAMPLE:

A = FatRadio
B = fradio

Assuming that I have defined a threshold of 3, the Python script should return true, since there are three consecutive characters in B that are also contained in A (note that this is the case for 4 and 5 consecutive characters as well).

Do you mean something like looping over character triplets from (A) and looking for each one in (B)?
– Alexis, Commented Aug 13, 2013 at 12:27
If I get it right, yes! So, as I wrote below, I need Python to try all possible combinations (of words of length 3) on its own...
– user2295350, Commented Aug 13, 2013 at 12:36
Also, do you mean three identical chars, or any char triplet?
– Alexis, Commented Aug 13, 2013 at 12:38

Giwrgos Tsopanoglou · Accepted Answer · 2013-08-13 13:37:44Z

2

How about this?

char_count = 3  # Or whatever threshold you want
found = False
if len(A) >= char_count and len(B) >= char_count:
    # Slide a window of char_count characters over A
    for i in range(len(A) - char_count + 1):
        some_chars = A[i:i + char_count]
        if some_chars in B:
            found = True  # Hooray!
            break
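
For reuse, the loop can be wrapped in a function that returns a boolean. This is only a minimal sketch: the name `shares_run` and its parameters are made up for illustration and are not part of the original answer. The second call uses the strings from the comment below, where the shared triplet sits at the very end of the first string.

```python
def shares_run(a, b, char_count=3):
    """Return True if some run of char_count consecutive characters of a occurs in b."""
    if char_count <= 0 or len(a) < char_count or len(b) < char_count:
        return False
    # Slide a window of char_count characters over a and test each slice against b.
    for i in range(len(a) - char_count + 1):
        if a[i:i + char_count] in b:
            return True
    return False


print(shares_run("FatRadio", "fradio"))    # "adi" occurs in both strings
print(shares_run("abcdefghijk", "zzijk"))  # the shared triplet is the last window of a
```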

edited Aug 13, 2013 at 13:37

answered Aug 13, 2013 at 12:24

Giwrgos Tsopanoglou


2 Comments

Bakuriu Over a year ago

@user2295350 This solution is broken. Try with A = "abcdefghijk" and B = "zzijk". Even though they share a triplet the code doesn't find it.

Giwrgos Tsopanoglou Over a year ago

@Bakuriu Yeah, I missed that. Fixed it now

Bakuriu · Accepted Answer · 2013-08-13 13:40:12Z

2

You can use the difflib module:

import difflib

def have_common_triplet(a, b):
    matcher = difflib.SequenceMatcher(None, a, b)
    return max(size for _,_,size in matcher.get_matching_blocks()) >= 3

Result:

>>> have_common_triplet("FatRadio", "fradio")
True
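
If the matched text itself is of interest, `SequenceMatcher.find_longest_match` returns the longest shared block directly. The following is a rough sketch rather than part of the original answer: `longest_common_block` is a hypothetical helper name, and passing `autojunk=False` is an assumption made here to switch off the heuristic that may ignore very frequent characters in long inputs.

```python
import difflib


def longest_common_block(a, b):
    """Return the longest run of consecutive characters shared by a and b."""
    # autojunk=False disables the popularity heuristic, which can otherwise
    # treat frequent characters as junk when b has 200 or more items.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


block = longest_common_block("FatRadio", "fradio")
print(block, len(block) >= 3)
```

For the example strings the longest shared block is "adio", because the comparison is case-sensitive and "R" does not match "r".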

Note, however, that SequenceMatcher does much more than find the first common triplet, so it could take significantly more time than a naive approach. A simpler solution could be:

def have_common_group(a, b, size=3):
    # Only use start indices that leave room for a full group of `size` characters
    first_indices = range(len(a) - size + 1)
    second_indices = range(len(b) - size + 1)
    seqs = {b[i:i+size] for i in second_indices}
    return any(a[i:i+size] in seqs for i in first_indices)

This should perform better, especially when the match is at the beginning of the string.
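
The question mentions that the example also matches for 5 consecutive characters, which only holds if case is ignored ("Radio" vs. "radio"). Assuming that is the intent, one option is to normalize both strings before comparing. A minimal sketch under that assumption; `have_common_group_ci` is a hypothetical name, and `str.casefold` requires Python 3.3 or later.

```python
def have_common_group_ci(a, b, size=3):
    """Case-insensitive check for a shared run of `size` consecutive characters."""
    a, b = a.casefold(), b.casefold()
    # Every full-length window of b, stored in a set for fast membership tests.
    seqs = {b[i:i + size] for i in range(len(b) - size + 1)}
    return any(a[i:i + size] in seqs for i in range(len(a) - size + 1))


print(have_common_group_ci("FatRadio", "fradio", size=5))  # "radio" matches once case is ignored
```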

edited Aug 13, 2013 at 13:40

answered Aug 13, 2013 at 12:58

Bakuriu


Alexis · Accepted Answer · 2013-08-13 12:45:13Z

1

I don't know of any built-in function for this, so I guess the simplest implementation would be something like this:

a = 'abcdefgh'
b = 'foofoofooabcfoo'

for i in range(len(a) - 2):  # include the last triplet, a[-3:]
  if a[i:i+3] in b:
    print('then true!')

Which could be shortened to:

search_results = [i for i in range(len(a) - 2) if a[i:i+3] in b]
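
The list comprehension collects start indices. If only a yes/no answer is needed, or the matching substrings themselves, something along these lines may be more convenient. This is just a sketch; the variable names `triplets`, `found` and `matches` are chosen here for illustration.

```python
a = 'abcdefgh'
b = 'foofoofooabcfoo'

# All triplets of a, including the last one starting at len(a) - 3.
triplets = [a[i:i + 3] for i in range(len(a) - 2)]

found = any(t in b for t in triplets)      # stops at the first hit
matches = [t for t in triplets if t in b]  # every triplet of a that occurs in b

print(found, matches)
```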

answered Aug 13, 2013 at 12:45

Alexis
