1

I could do longest common substring using two strings each time. But consider 3 strings below:

  1. ABZDCC
  2. ABZDEC
  3. EFGHIC

Here we see that the lcs of the first two strings is ABZD. But when this will be compared to the third string, the length of lcs will be zero. But it is clear that the lcs is "C". How can I find the longest common substring of n strings using suffix array?

2
  • 1
    The position of the substrings is relevant or not? I mean, if the last string was CEFGHI, would the answer still be C? Commented Oct 12, 2021 at 7:45
  • Yes. Position is not relevant Commented Oct 12, 2021 at 11:19

2 Answers 2

1

If you have a suffix array that contains all the suffixes of every input string, then for any string X that is a (contiguous) substring of all the input strings, there is a contiguous subarray in which every suffix starts with X, that includes a suffix from every input string.

Furthermore, if X is a longest common substring, then the subarray can be as small as possible, such that the first and last suffixes in the subarray are the only suffixes from their corresponding input strings.

To find the longest substring, then:

  1. For each position in the suffix array, find the smallest subarray starting at that position that includes a suffix from every string. You can use the incrementing-two-pointers technique to do this in amortized constant time per entry.
  2. For each such subarray, find the longest common prefix of the first and last entries, which will be a common prefix of every item in the subarray.
  3. Remember the longest common prefix you find, which is the longest substring that occurs in every input.
Sign up to request clarification or add additional context in comments.

Comments

0

When working with more than two strings, find all common substrings between the two shortest input strings and then start eliminating common substrings that aren't included in the other input strings. When done, return the longest common remaining substring.

1 Comment

@RobbyCornelissen your comment is correct; I gave a generic answer in the case that the question erroneously mentions suffix arrays, because it happens :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.