4

I am trying to understand the algorithm from the famous 'diff' paper here, running on characters in the two command-line arguments. However, my code does not produce the results I expect. I've written the algorithm to match their variables to the extent possible:

$ ./diff abbcabba cbabac #Hmm.. I think it should be 5
SES: 10  
$ ./diff abcdefg abcdefg #0! Great!
SES: 0
$ ./diff abcXefg abcYefg # 2 seems reasonable
SES: 2
$ ./diff abcXefg abcefg # clearly wrong
SES: 7

Here is my code (Sorry about the wall of code):

    a = argv[1];
    b = argv[2];
    alen = strlen(a);
    blen = strlen(b);
    tlen = alen + blen;
    maxd = tlen;

    vp = (int *)calloc(2 * maxd + 1, sizeof(int));

    // How I'm thinking about pointer arithmetic:
    // vp in [0, 2*maxd + 1) == [0, 2*maxd]
    // v  in [-maxd, maxd + 1) == [-maxd, maxd]
    v = vp + maxd;
    v[1] = 0;
    for (D = 0; D <= maxd; D++) {
            for (k = -D; k <= D; k += 2) {
                    if (k == -D || (k != D && v[k-1] < v[k+1])) {
                            x = v[k + 1];
                    } else {
                            x = v[k - 1] + 1;
                    }
                    y = x - k;
                    while (x < alen && y < blen && a[x] == b[x]) {
                            x++;
                            y++;
                    }
                    v[k] = x;
                    if (x >= alen && y >= blen) {
                            printf("SES: %d\n", D);
                            goto out;
                    }
            }
    }
    printf("Nothing found. SES > %d\n", maxd);

Any idea where the flaw is here? I've been finding it incredibly hard to search online for this problem...

1
  • I'm a little surprised about the (int*) cast in front of calloc(3) -- that cast hasn't been necessary since prototypes were added in C89, and its presence can mask warnings and errors. Commented Mar 1, 2012 at 23:07

1 Answer 1

4

It seems that the problem is in this line:

while (x < alen && y < blen && a[x] == b[x]) {

Here b[x] should be b[y], which gives:

while (x < alen && y < blen && a[x] == b[y]) {

With this change the results for your examples are 6, 0, 2 and 1. This seems to be accurate.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.