Is there a known algorithm for combining strings so that whatever most of the input strings have in common ends up in the resulting string? What I mean is this:

input-1: "This is a Tsst"

input-2: "This is Test"

input-3: "Thi5 ia a Test"


result: "This is a Test" 

The inputs vary in length, both in words and in characters, which is what makes this a problem for me.

  • Do you necessarily want to output one of the original inputs, as in your example? Commented Sep 22, 2017 at 6:17
  • Sorry, I don't understand what you mean. Commented Sep 22, 2017 at 8:03
  • Sorry, I misspelled a word. My question is: is the result exactly one of the inputs? In your example, the result is input-2. So do you have to choose among the inputs, or can it be a combination of the different inputs? Commented Sep 22, 2017 at 8:07
  • No, it's a mix of what is most common across all of the inputs. In input 2 the 'a' is missing, but it is present in inputs 1 and 3, so it makes it into the result. Commented Sep 22, 2017 at 8:47

1 Answer

Yes, but it's rather involved.

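You do a multiple alignment of the strings, treating them as sequences, using Clustal or a variant. Then you read off the consensus sequence, i.e. the most common character in each aligned column. Clustal accepts a scoring matrix, which is intended for protein sequences, but one could be written for English letters so that similar-looking characters count as near matches (k is similar to c, 5 to s and so on).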
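To make the align-then-vote idea concrete, here is a minimal Python sketch. It is not Clustal and not a true multiple alignment: as a simplifying assumption it aligns every input pairwise against the first input using the standard library's `difflib.SequenceMatcher`, then takes a majority vote per aligned column. The function name `consensus` and the tie-breaking rule (ties go to the first input) are choices made for this illustration only, and there is no scoring matrix, so '5' and 's' are simply treated as different characters.

```python
from collections import Counter
from difflib import SequenceMatcher

GAP = ""  # vote meaning "this input has nothing in this column"


def consensus(strings):
    """Majority-vote consensus of strings aligned to the first one."""
    if not strings:
        return ""
    ref = strings[0]
    n = len(strings)
    # column_votes[i] counts what each input has at reference position i;
    # the reference itself casts the first vote.
    column_votes = [Counter({ch: 1}) for ch in ref]
    # insert_votes[i] counts text inserted just before reference position i
    # (index len(ref) means "appended at the end").
    insert_votes = [Counter() for _ in range(len(ref) + 1)]

    for other in strings[1:]:
        matcher = SequenceMatcher(None, ref, other, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "insert":
                insert_votes[i1][other[j1:j2]] += 1
                continue
            # "equal", "replace", "delete": pair characters one to one;
            # reference characters without a partner receive a gap vote.
            paired = min(i2 - i1, j2 - j1)
            for k in range(i2 - i1):
                vote = other[j1 + k] if k < paired else GAP
                column_votes[i1 + k][vote] += 1
            if j2 - j1 > paired:
                # Leftover characters of a longer replacement are
                # treated as an insertion after the replaced span.
                insert_votes[i2][other[j1 + paired:j2]] += 1

    out = []
    for i in range(len(ref) + 1):
        if insert_votes[i]:
            text, count = insert_votes[i].most_common(1)[0]
            if count * 2 > n:  # keep only if a strict majority inserted it
                out.append(text)
        if i < len(ref):
            out.append(column_votes[i].most_common(1)[0][0])
    return "".join(out)


print(consensus(["This is a Tsst", "This is Test", "Thi5 ia a Test"]))
# prints: This is a Test
```

Because everything is aligned against the first input, the result can depend on the order of the inputs, and a badly corrupted first string will hurt the outcome. A real multiple alignment tool avoids that bias, which is the main reason to reach for one when the inputs are noisy.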
