0

I wrote code to find all URLs within a PDF file and replace the ones that match the parameters passed from a PHP script.

It works fine when a single URL is passed, but I don't know how to handle more than one URL. I'm guessing I need a loop that reads the array length and calls the changeURL method with the correct parameters.

I actually made it work with if statements (if myarray.length < 4 do this, if it is < 6 do that, if < 8 ...), but I'm guessing this is not the optimal way, so I removed it and want to try something else.

Parameters passed from PHP (in this order):

  • args[0] - Location of original PDF
  • args[1] - Location of new PDF
  • args[2] - URL 1 (URL to be changed)
  • args[3] - URL 1a (URL that will replace URL 1)
  • args[4] - URL 2 (URL to be changed)
  • args[5] - URL 2a (URL that will replace URL 2)
  • args...

and so on... up to maybe around 16 args, depending on how many URLs the PDF file contains.

Here's the code:

Main.java

public class Main {

    public static void main(String[] args) {

        if (args.length >= 4) {
            URLReplacer.changeURL(args);
        } else {
            System.out.println("PARAMETER MISSING FROM PHP");
        }
    }
}

URLReplacer.java

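import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.action.type.PDAction;
import org.apache.pdfbox.pdmodel.interactive.action.type.PDActionURI;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;

public class URLReplacer {

    // a[0] = original PDF, a[1] = new PDF, a[2] = URL to change, a[3] = its replacement
    public static void changeURL(String... a) {

        try (PDDocument doc = PDDocument.load(a[0])) {
            List<?> allPages = doc.getDocumentCatalog().getAllPages();
            for (int i = 0; i < allPages.size(); i++) {
                PDPage page = (PDPage) allPages.get(i);
                List annotations = page.getAnnotations();
                for (int j = 0; j < annotations.size(); j++) {
                    PDAnnotation annot = (PDAnnotation) annotations.get(j);
                    // Only link annotations with a URI action carry a URL.
                    if (annot instanceof PDAnnotationLink) {
                        PDAnnotationLink link = (PDAnnotationLink) annot;
                        PDAction action = link.getAction();
                        if (action instanceof PDActionURI) {
                            PDActionURI uri = (PDActionURI) action;
                            String oldURL = uri.getURI();

                            // Only the first URL pair (a[2], a[3]) is handled here.
                            if (a[2].equals(oldURL)) {
                                //System.out.println("Page " + (i + 1) + ": Replacing " + oldURL + " with " + a[3]);
                                uri.setURI(a[3]);
                            }
                        }
                    }
                }
            }
            doc.save(a[1]);
        } catch (IOException | COSVisitorException e) {
            e.printStackTrace();
        }
    }
}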

I have tried all sorts of loops but, with my limited Java skills, did not have any success.

Also, if you notice any dodgy code, kindly let me know so I can learn best practices from more experienced programmers.

2
  • You could use an array and transmit it as JSON. Useful if you have dozens of parameters ... Commented Nov 25, 2015 at 15:26
  • I have no experience with JSON, I am just a beginner, but I will make sure to look into that possibility. Thanks! Commented Nov 25, 2015 at 15:30

3 Answers

1

Your main problem, as I understand it, is the variable number of arguments that you have to send from PHP to Java.

1. You can transmit them one by one, as in your example.

2. Or you can send them in a structure. There are several options. JSON is rather simple on the PHP side; there are multiple examples here: encode json using php?

For Java you have: Decoding JSON String in Java.

There are other formats too (like XML, which seems too complex for this).


0

I'd structure your method to accept specific parameters. I used a Map to pass the URLs; a custom object would be another option.

Also notice how the loops have changed; it might give you a hint about some Java features.

public static void changeURL(String originalPdf, String targetPdf, Map<String, String> urls) {

    try (PDDocument doc = PDDocument.load(originalPdf)) {
        // Typed lists let the for-each loops below drop the explicit casts.
        List<PDPage> allPages = doc.getDocumentCatalog().getAllPages();
        for (PDPage page : allPages) {
            List<PDAnnotation> annotations = page.getAnnotations();
            for (PDAnnotation annot : annotations) {
                if (annot instanceof PDAnnotationLink) {
                    PDAnnotationLink link = (PDAnnotationLink) annot;
                    PDAction action = link.getAction();
                    if (action instanceof PDActionURI) {
                        PDActionURI uri = (PDActionURI) action;
                        String oldURL = uri.getURI();

                        // Key = URL to be changed, value = URL that replaces it.
                        for (Map.Entry<String, String> url : urls.entrySet()) {
                            if (url.getKey().equals(oldURL)) {
                                uri.setURI(url.getValue());
                            }
                        }

                    }
                }
            }
        }
        // The document is loaded and saved once, however many URL pairs are passed.
        doc.save(targetPdf);
    } catch (IOException | COSVisitorException e) {
        e.printStackTrace();
    }
}
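
Because the map is keyed by the URL to be changed, the inner loop over urls.entrySet() can also be written as a single lookup. The following is only a minimal sketch of that idea with no PDFBox dependency; the class name UrlLookup, the helper replacementFor, and the example.com URLs are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class UrlLookup {

    // Returns the replacement registered for oldUrl, or oldUrl itself
    // when the map holds no entry for it (hypothetical helper).
    static String replacementFor(Map<String, String> urls, String oldUrl) {
        String newUrl = urls.get(oldUrl);
        return newUrl != null ? newUrl : oldUrl;
    }

    public static void main(String[] args) {
        Map<String, String> urls = new HashMap<>();
        urls.put("http://old.example.com/a", "http://new.example.com/a");

        // A URL with a mapping comes back replaced.
        System.out.println(replacementFor(urls, "http://old.example.com/a"));
        // A URL without a mapping comes back unchanged.
        System.out.println(replacementFor(urls, "http://unrelated.example.com"));
    }
}
```

Inside changeURL the same idea would be String newURL = urls.get(oldURL); followed by uri.setURI(newURL) only when newURL is not null.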

If you have to get the URLs and PDF locations from the command line, call the changeURL function like this:

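public static void main(String[] args) {

    // Both PDF paths plus complete old/new URL pairs means an even count of at least 4.
    if (args.length >= 4 && args.length % 2 == 0) {
        String originalPdf = args[0];
        String targetPdf = args[1];
        Map<String, String> urls = new HashMap<>();
        // args[2] and args[3] form the first pair, args[4] and args[5] the next, and so on.
        for (int i = 2; i < args.length; i += 2) {
            urls.put(args[i], args[i + 1]);
        }
        URLReplacer.changeURL(originalPdf, targetPdf, urls);
    } else {
        System.out.println("PARAMETER MISSING FROM PHP");
    }
}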
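
The pairing step can also be pulled out of main so it can be tried without a PDF at hand. This is a rough sketch under that assumption: UrlPairs and parse are hypothetical names, and an odd argument count is rejected because the last URL would otherwise have no replacement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UrlPairs {

    // Builds an old-URL -> new-URL map from args[2], args[3], args[4], ...
    // args[0] and args[1] are the PDF locations and are skipped here.
    static Map<String, String> parse(String[] args) {
        if (args.length < 4 || args.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "Expected two PDF paths followed by complete old/new URL pairs");
        }
        // LinkedHashMap keeps the pairs in the order they were passed.
        Map<String, String> urls = new LinkedHashMap<>();
        for (int i = 2; i < args.length; i += 2) {
            urls.put(args[i], args[i + 1]);
        }
        return urls;
    }

    public static void main(String[] args) {
        String[] sample = {"in.pdf", "out.pdf", "http://a.example", "http://b.example"};
        // prints {http://a.example=http://b.example}
        System.out.println(parse(sample));
    }
}
```

With a helper like this, main would only need to catch the IllegalArgumentException, print the error message, and otherwise pass args[0], args[1] and the returned map to changeURL.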

1 Comment

A question, though: I noticed you changed the loops. What is the difference between for(PDPage page: allPages){} and for (int i = 0; i < allPages.size(); i++) {}?

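On the loop question: both forms visit the same elements in the same order. The enhanced for loop goes through the list's iterator, so there is no index variable and, with a typed list, no cast; the indexed form is still the one to use when the position itself is needed, as in the commented-out "Page (i + 1)" message in the question. A small sketch with plain strings standing in for pages (LoopStyles and its methods are hypothetical names):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopStyles {

    // Indexed loop: i is available, e.g. for a 1-based page number.
    static List<String> labelIndexed(List<String> pages) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            out.add("Page " + (i + 1) + ": " + pages.get(i));
        }
        return out;
    }

    // Enhanced for loop: same elements, same order, no index to manage.
    static List<String> copyForEach(List<String> pages) {
        List<String> out = new ArrayList<>();
        for (String page : pages) {
            out.add(page);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("cover", "intro");
        System.out.println(labelIndexed(pages)); // prints [Page 1: cover, Page 2: intro]
        System.out.println(copyForEach(pages));  // prints [cover, intro]
    }
}
```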
-1

Off the top of my head, you could do something like this:

public static void main(String[] args) {

    if (args.length >= 4 && args.length % 2 == 0) {
        for(int i = 2; i < args.length; i += 2) {
            URLReplacer.changeURL(args[0], args[1], args[i], args[i+1]);
            args[0] = args[1];
        }
    } else {
        System.out.println("PARAMETER MISSING FROM PHP");
    }
}

3 Comments

Inefficient: this makes changeURL process the PDF document several times.
@Aragorn The whole way he handles this process is wrong in my opinion; I simply provided a solution that works while respecting his interface. I don't believe that my answer deserves your negative vote.
@GiorgosKritsotakis, though I don't think your answer deserves a negative vote, I find that Aragorn's example (the accepted answer) is more efficient and elegant. But thank you for your answer, as it also works.
