I have some strings and I want a measure of their similarity that, unlike string edit distance for example, is based on structural similarity rather than on the letters themselves.
For example, "312164" and "48479" should get a very high score, since both consist only of digits and have a similar length. The same should hold for "Bla blubb" and "bla bloob blo", because both contain only letters with gaps in between. Pairs like "apple" and "app3 f" should score lower: they share some letters, but their structure differs.
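To make the idea more concrete, here is a rough sketch of what I imagine, not a finished solution. It assumes four character classes of my own choosing (digit, letter, whitespace, other), collapses each run of the same class into one symbol, and then compares the resulting patterns with a plain Levenshtein distance. The class name and the method names are made up for illustration, and collapsing runs means the original string length is ignored entirely, which may or may not be what is wanted.

```java
public class StructuralSimilarity {

    // Map a character to a coarse class symbol.
    // Assumed classes: D = digit, L = letter, S = whitespace, O = anything else.
    static char classify(char c) {
        if (Character.isDigit(c)) return 'D';
        if (Character.isLetter(c)) return 'L';
        if (Character.isWhitespace(c)) return 'S';
        return 'O';
    }

    // Build the structural pattern: one symbol per run of same-class characters,
    // e.g. "app3 f" becomes "LDSL".
    static String pattern(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char k = classify(s.charAt(i));
            if (sb.length() == 0 || sb.charAt(sb.length() - 1) != k) {
                sb.append(k);
            }
        }
        return sb.toString();
    }

    // Standard Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Score in [0, 1]: 1.0 means identical patterns, lower means more structural edits.
    static double similarity(String x, String y) {
        String p = pattern(x);
        String q = pattern(y);
        int max = Math.max(p.length(), q.length());
        if (max == 0) return 1.0; // two empty strings have the same (empty) structure
        return 1.0 - (double) levenshtein(p, q) / max;
    }

    public static void main(String[] args) {
        System.out.println(similarity("312164", "48479"));
        System.out.println(similarity("Bla blubb", "bla bloob blo"));
        System.out.println(similarity("apple", "app3 f"));
    }
}
```

With these assumptions the three example pairs come out in the order I want: the digit strings share the pattern "D" and score 1.0, the two phrases ("LSL" vs. "LSLSL") score about 0.6, and "apple" vs. "app3 f" ("L" vs. "LDSL") scores about 0.25. Whether length should also count, and how, is something this sketch leaves open.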
Something like that... Does anybody have a clue? In Java, if possible.
Thank you!