1

Do you know any strictly equivalent implementation of the PHP similar_text function in Java?

2
  • warm-up: stackoverflow.com/questions/907997/string-distance-library Commented Jan 4, 2010 at 16:09
  • Not exactly. PHP's similar_text is different from the Levenshtein distance. From the PHP similar_text manual: "This calculates the similarity between two strings as described in Oliver [1993]. [...] Returns the number of matching chars in both strings." I cannot find any Java implementation of the Oliver similarity algorithm. Commented Jan 4, 2010 at 16:19

5 Answers

1

Here is my implementation in Java:

package comwebndesignserver.server;

/*
 * 
 * DenPashkov 2012 
 * http://www.facebook.com/pashkovdenis
 * PHP Similar String Implementation
 * 30.07.2012 
 * 
 */

public class SimilarString {

    private String  string = "" ;
    private String string2 = ""; 
    public int procent = 0 ; 
    private int position1 =0 ; 
    private int position2 =0;

    // Similar String 
    public SimilarString(String str1,  String str2){
        this.string = str1.toLowerCase();   
        this.string2 = str2.toLowerCase(); 
    }
    public SimilarString() {

    }
    // Set string 
    public SimilarString setString(String str1,  String str2){
        this.string = str1.toLowerCase(); 
        this.string2 = str2.toLowerCase(); 
        return this ; 
    }

    // Returns the similarity as a percentage (0..100).
    // Note: only the longest common substring is counted here; PHP's
    // similar_text also recurses into the text left and right of that match.
    public int similar(){
        string = string.trim();
        string2 = string2.trim();
        int len_str1 = string.length();
        int len_str2 = string2.length();

        int max = 0;
        this.procent = 0;
        if (len_str1 > 0 && len_str2 > 0) {
            // find the longest run of equal characters starting at (p, q)
            for (int p = 0; p < len_str1; p++) {
                for (int q = 0; q < len_str2; q++) {
                    int l = 0;
                    while ((p + l < len_str1) && (q + l < len_str2)
                            && (string.charAt(p + l) == string2.charAt(q + l))) {
                        l++;
                    }
                    if (l > max) {
                        max = l;
                        position1 = p;
                        position2 = q;
                    }
                }
            }

            // sim * 200.0 / (t1_len + t2_len)
            this.procent = max * 200 / (len_str1 + len_str2);
        }
        return this.procent;
    }
}

1

This works the same way as PHP's similar_text function, following php_similar_str, php_similar_char and PHP_FUNCTION(similar_text) in the string.c file of the PHP sources:

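private float similarText(String first, String second)   {
    // Note: PHP's similar_text is case-sensitive; drop these two lines for a strict equivalent.
    first = first.toLowerCase();
    second = second.toLowerCase();
    // Two empty strings would give 0/0 (NaN) below, so report 0 percent instead.
    if (first.length() + second.length() == 0) {
        return 0f;
    }
    return (float)(this.similar(first, second)*200)/(first.length()+second.length());
}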

private int similar(String first, String second)  { 
    int p, q, l, sum;
    int pos1=0;
    int pos2=0;
    int max=0;
    char[] arr1 = first.toCharArray();
    char[] arr2 = second.toCharArray();
    int firstLength = arr1.length;
    int secondLength = arr2.length;

    for (p = 0; p < firstLength; p++) {
        for (q = 0; q < secondLength; q++) {
            for (l = 0; (p + l < firstLength) && (q + l < secondLength) && (arr1[p+l] == arr2[q+l]); l++);            
            if (l > max) {
                max = l;
                pos1 = p;
                pos2 = q;
            }

        }
    }
    sum = max;
    if (sum > 0) {
        if (pos1 > 0 && pos2 > 0) {
            sum += this.similar(first.substring(0, pos1>firstLength ? firstLength : pos1), second.substring(0, pos2>secondLength ? secondLength : pos2));
        }

        if ((pos1 + max < firstLength) && (pos2 + max < secondLength)) {
            sum += this.similar(first.substring(pos1 + max, firstLength), second.substring(pos2 + max, secondLength));
        }
    }       
    return sum;
}
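
For a strict match with PHP, two details of the port above may matter: PHP compares raw bytes rather than UTF-16 chars, and it does not lower-case its input. The following is a minimal sketch under those assumptions, not a definitive implementation. The class name SimilarTextBytes and the choice of UTF-8 as the encoding are assumptions of mine and are not part of PHP or of the answer above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named for illustration only.
public class SimilarTextBytes {

    // Number of matching bytes, compared case-sensitively.
    // Assumption: the PHP side works with UTF-8 encoded strings.
    public static int similarText(String first, String second) {
        return similar(first.getBytes(StandardCharsets.UTF_8),
                       second.getBytes(StandardCharsets.UTF_8));
    }

    // Percentage as sim * 200 / (len1 + len2); 0 when both strings are empty.
    public static double percent(String first, String second) {
        int total = first.getBytes(StandardCharsets.UTF_8).length
                  + second.getBytes(StandardCharsets.UTF_8).length;
        return total == 0 ? 0.0 : similarText(first, second) * 200.0 / total;
    }

    private static int similar(byte[] a, byte[] b) {
        int max = 0, pos1 = 0, pos2 = 0;
        // Find the first longest common run of bytes.
        for (int p = 0; p < a.length; p++) {
            for (int q = 0; q < b.length; q++) {
                int l = 0;
                while (p + l < a.length && q + l < b.length && a[p + l] == b[q + l]) {
                    l++;
                }
                if (l > max) {
                    max = l;
                    pos1 = p;
                    pos2 = q;
                }
            }
        }
        int sum = max;
        if (sum > 0) {
            // Recurse into what lies to the left and to the right of the run.
            if (pos1 > 0 && pos2 > 0) {
                sum += similar(Arrays.copyOfRange(a, 0, pos1),
                               Arrays.copyOfRange(b, 0, pos2));
            }
            if (pos1 + max < a.length && pos2 + max < b.length) {
                sum += similar(Arrays.copyOfRange(a, pos1 + max, a.length),
                               Arrays.copyOfRange(b, pos2 + max, b.length));
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(similarText("World", "Word"));          // 4
        System.out.println(similarText("bafoobar", "barfoomeow")); // 5
        System.out.println(similarText("barfoomeow", "bafoobar")); // 3
    }
}
```

The values in main were traced by hand through the algorithm. They also show that swapping the arguments can change the result, because only the first longest run found is used as the split point.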

0

As for Java, your best bet might be the StringUtils class from the Apache Commons Lang library, which contains the getLevenshteinDistance method that the other SO posts mention.
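
To see how far a Levenshtein-based score can drift from similar_text, here is a small self-contained comparison. It is only a sketch: the class and method names are made up for illustration, the edit distance is a plain textbook implementation rather than the Apache Commons one, and the similar_text part is a character-based port, so it is an approximation of PHP's byte-based behaviour.

```java
// Hypothetical comparison class; all names are illustrative only.
public class LevenshteinVsSimilarText {

    // Textbook dynamic-programming edit distance (insert, delete, substitute).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // similar_text-style count: longest common run, then recurse left and right.
    static int similarText(String a, String b) {
        int max = 0, pos1 = 0, pos2 = 0;
        for (int p = 0; p < a.length(); p++) {
            for (int q = 0; q < b.length(); q++) {
                int l = 0;
                while (p + l < a.length() && q + l < b.length()
                        && a.charAt(p + l) == b.charAt(q + l)) {
                    l++;
                }
                if (l > max) {
                    max = l;
                    pos1 = p;
                    pos2 = q;
                }
            }
        }
        int sum = max;
        if (sum > 0) {
            if (pos1 > 0 && pos2 > 0) {
                sum += similarText(a.substring(0, pos1), b.substring(0, pos2));
            }
            if (pos1 + max < a.length() && pos2 + max < b.length()) {
                sum += similarText(a.substring(pos1 + max), b.substring(pos2 + max));
            }
        }
        return sum;
    }

    // The "longer length minus edit distance" score.
    static int lengthMinusDistance(String a, String b) {
        return Math.max(a.length(), b.length()) - levenshtein(a, b);
    }

    public static void main(String[] args) {
        // The two scores agree here: 4 and 4.
        System.out.println(similarText("World", "Word") + " vs " + lengthMinusDistance("World", "Word"));
        // They disagree here: 2 and 1.
        System.out.println(similarText("abc", "cab") + " vs " + lengthMinusDistance("abc", "cab"));
    }
}
```

The "abc" / "cab" pair shows that the two measures can disagree, so a Levenshtein-based score is at best an approximation of similar_text.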

1 Comment

So you could take the longer string's length and subtract the Levenshtein distance to get a number close to what similar_text would produce, although the two measures are not guaranteed to agree. For the percentage result you would divide the result by the length.

0

  1. Download the source code for PHP (http://php.net/downloads.php)
  2. Uncompress it.
  3. Convert the similar_text() function in ext\standard\string.c to Java.
  4. Then eat some ice-cream for tea :D
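
For step 3, one way to carry pointer-style C code over to Java is to replace each "pointer + length" pair with an offset and a length into the original String, so the recursion never copies substrings. The sketch below assumes that approach. The class name SimilarTextOffsets is hypothetical, and the comparison is on Java chars rather than bytes, so results may differ from PHP for non-ASCII input.

```java
// Hypothetical class name; a sketch of one way to port pointer-style code.
public class SimilarTextOffsets {

    public static int similarText(String s1, String s2) {
        return similarChar(s1, 0, s1.length(), s2, 0, s2.length());
    }

    // Each (off, len) pair stands in for a C "pointer + length" pair.
    private static int similarChar(String s1, int off1, int len1,
                                   String s2, int off2, int len2) {
        int max = 0, pos1 = 0, pos2 = 0;
        // Find the first longest common run inside the two windows.
        for (int p = 0; p < len1; p++) {
            for (int q = 0; q < len2; q++) {
                int l = 0;
                while (p + l < len1 && q + l < len2
                        && s1.charAt(off1 + p + l) == s2.charAt(off2 + q + l)) {
                    l++;
                }
                if (l > max) {
                    max = l;
                    pos1 = p;
                    pos2 = q;
                }
            }
        }
        int sum = max;
        if (sum > 0) {
            // Left of the run: same offsets, shorter lengths.
            if (pos1 > 0 && pos2 > 0) {
                sum += similarChar(s1, off1, pos1, s2, off2, pos2);
            }
            // Right of the run: advance the offsets past the run.
            if (pos1 + max < len1 && pos2 + max < len2) {
                sum += similarChar(s1, off1 + pos1 + max, len1 - pos1 - max,
                                   s2, off2 + pos2 + max, len2 - pos2 - max);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(similarText("World", "Word")); // 4
    }
}
```

Passing offsets keeps the shape of the C recursion visible, which makes it easier to compare the port line by line against string.c.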

1 Comment

OK, so I've converted the C similar_text() to Java. I have a love/hate relationship with C, lol. Converting slightly hacky pointer code (obviously written to be efficient for PHP) to Java wasn't easy (for me anyway, hehe). Unfortunately the code won't fit here... now just point 4) to finish :)

-1

I think you can take a look at this post: PHP similar_text function in Javascript

That's a JavaScript equivalent of PHP's similar_text. You only need to adapt it to Java. Sorry if that doesn't help; I think JavaScript and Java syntax differ only a little.

At least, you know the implementation algorithm

1 Comment

Javascript and Java are completely different
