21

Are there any libraries out there for Java that will accept two strings, and return a string with formatted output as per the *nix diff command?

e.g. feed in

test 1,2,3,4
test 5,6,7,8
test 9,10,11,12
test 13,14,15,16

and

test 1,2,3,4
test 5,6,7,8
test 9,10,11,12,13
test 13,14,15,16

as input, and it would give you

test 1,2,3,4                                                    test 1,2,3,4
test 5,6,7,8                                                    test 5,6,7,8
test 9,10,11,12                                               | test 9,10,11,12,13
test 13,14,15,16                                                test 13,14,15,16

Exactly the same as if I had passed the files to diff -y expected actual

I found this question, and it gives some good advice on general libraries for giving you programmatic output, but I'm wanting the straight string results.

I could call diff directly as a system call, but this particular app will be running on unix and windows and I can't be sure that the environment will actually have diff available.

5 Answers 5

15

java-diff-utils

The DiffUtils library for computing diffs, applying patches, generationg side-by-side view in Java

Diff Utils library is an OpenSource library for performing the comparison operations between texts: computing diffs, applying patches, generating unified diffs or parsing them, generating diff output for easy future displaying (like side-by-side view) and so on.

Main reason to build this library was the lack of easy-to-use libraries with all the usual stuff you need while working with diff files. Originally it was inspired by JRCS library and it's nice design of diff module.

Main Features

  • computing the difference between two texts.
  • capable to hand more than plain ascci. Arrays or List of any type that implements hashCode() and equals() correctly can be subject to differencing using this library
  • patch and unpatch the text with the given patch
  • parsing the unified diff format
  • producing human-readable differences
Sign up to request clarification or add additional context in comments.

2 Comments

The actively maintained fork seems to be github.com/bkromhout/java-diff-utils
producing human-readable differences. Is this done yet? could you point me towards an example.
6

I ended up rolling my own. Not sure if it's the best implementation, and it's ugly as hell, but it passes against test input.

It uses java-diff to do the heavy diff lifting (any apache commons StrBuilder and StringUtils instead of stock Java StringBuilder)

public static String diffSideBySide(String fromStr, String toStr){
    // this is equivalent of running unix diff -y command
    // not pretty, but it works. Feel free to refactor against unit test.
    String[] fromLines = fromStr.split("\n");
    String[] toLines = toStr.split("\n");
    List<Difference> diffs = (new Diff(fromLines, toLines)).diff();

    int padding = 3;
    int maxStrWidth = Math.max(maxLength(fromLines), maxLength(toLines)) + padding;

    StrBuilder diffOut = new StrBuilder();
    diffOut.setNewLineText("\n");
    int fromLineNum = 0;
    int toLineNum = 0;
    for(Difference diff : diffs) {
        int delStart = diff.getDeletedStart();
        int delEnd = diff.getDeletedEnd();
        int addStart = diff.getAddedStart();
        int addEnd = diff.getAddedEnd();

        boolean isAdd = (delEnd == Difference.NONE && addEnd != Difference.NONE);
        boolean isDel = (addEnd == Difference.NONE && delEnd != Difference.NONE);
        boolean isMod = (delEnd != Difference.NONE && addEnd != Difference.NONE);

        //write out unchanged lines between diffs
        while(true) {
            String left = "";
            String right = "";
            if (fromLineNum < (delStart)){
                left = fromLines[fromLineNum];
                fromLineNum++;
            }
            if (toLineNum < (addStart)) {
                right = toLines[toLineNum];
                toLineNum++;
            }
            diffOut.append(StringUtils.rightPad(left, maxStrWidth));
            diffOut.append("  "); // no operator to display
            diffOut.appendln(right);

            if( (fromLineNum == (delStart)) && (toLineNum == (addStart))) {
                break;
            }
        }

        if (isDel) {
            //write out a deletion
            for(int i=delStart; i <= delEnd; i++) {
                diffOut.append(StringUtils.rightPad(fromLines[i], maxStrWidth));
                diffOut.appendln("<");
            }
            fromLineNum = delEnd + 1;
        } else if (isAdd) {
            //write out an addition
            for(int i=addStart; i <= addEnd; i++) {
                diffOut.append(StringUtils.rightPad("", maxStrWidth));
                diffOut.append("> ");
                diffOut.appendln(toLines[i]);
            }
            toLineNum = addEnd + 1; 
        } else if (isMod) {
            // write out a modification
            while(true){
                String left = "";
                String right = "";
                if (fromLineNum <= (delEnd)){
                    left = fromLines[fromLineNum];
                    fromLineNum++;
                }
                if (toLineNum <= (addEnd)) {
                    right = toLines[toLineNum];
                    toLineNum++;
                }
                diffOut.append(StringUtils.rightPad(left, maxStrWidth));
                diffOut.append("| ");
                diffOut.appendln(right);

                if( (fromLineNum > (delEnd)) && (toLineNum > (addEnd))) {
                    break;
                }
            }
        }

    }

    //we've finished displaying the diffs, now we just need to run out all the remaining unchanged lines
    while(true) {
        String left = "";
        String right = "";
        if (fromLineNum < (fromLines.length)){
            left = fromLines[fromLineNum];
            fromLineNum++;
        }
        if (toLineNum < (toLines.length)) {
            right = toLines[toLineNum];
            toLineNum++;
        }
        diffOut.append(StringUtils.rightPad(left, maxStrWidth));
        diffOut.append("  "); // no operator to display
        diffOut.appendln(right);

        if( (fromLineNum == (fromLines.length)) && (toLineNum == (toLines.length))) {
            break;
        }
    }

    return diffOut.toString();
}

private static int maxLength(String[] fromLines) {
    int maxLength = 0;

    for (int i = 0; i < fromLines.length; i++) {
        if (fromLines[i].length() > maxLength) {
            maxLength = fromLines[i].length();
        }
    }
    return maxLength;
}

2 Comments

Can you post maxLength() amd value of NONE? Thanks for saving me tons of development
Did you find the java-diff-utils library wanting? Just want to know (excuse the pun).
0

Busybox has a diff implementation that is very lean, should not be hard to convert to java, but you would have to add the two-column functionality.

Comments

0

http://c2.com/cgi/wiki?DiffAlgorithm I found this on Google and it gives some good background and links. If you care about the algorithm beyond just doing the project, a book on basic algorithm that covers Dynamic Programming or a book just on it. Algorithm knowledge is always good:)

1 Comment

Getting working implementations in Java is the easy part - they're out there. The hard bit was actually formatting the output in a format that is required.
0

You can use Apache Commons Text library to achieve this. This library provides 'diff' capability based on "very efficient algorithm from Eugene W. Myers".

This provides you ability to create your own visitor so that you can process the diff in the way you want & may be output to console or HTML etc. Here is one article which walks through nice & simple example to output side by side diff in HTML format using Apache Commons Text library & simple Java code.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.