Would like to discuss algorithms, no code.
Problem: Let S and T be two sequences of elements. Find the common subsequences between them where the order of the elements is preserved.
It should have O(n + m) running time where n is the length of S, and m is the length of T. I would also like to make the assumption that for the most part the two sequences will be similar.
An optimal solution?: After some research, one solution that appears to be optimal is to first build a generalised suffix tree for the two sequences. Then find the longest common substring and consider this subsequence to be part of the solution. Then either remove this subsequence from the tree or build a new suffix tree with this subsequence removed from the two original sequences to form S' and T'. Then find the longest common substring between S' and T', and so on.
To analyze the running time, building the tree takes O(n) and you can find the lengths and starting positions of the longest common substrings of S and T in O(n + m).
Are there other (more) practical solutions that someone knows of or can link to? Any published papers considering the same or related problem you all know about? Input and constructive criticism about the above solution? Thanks for all your time!
aaaaabbbbcccddeandeddcccbbbbaaaaain such a way your construction takes sqrt(n) * n not n to find all substrings. (assume alphabet has at leastnvariable, in other case you can find similar contradiction.)