9

I have a server that generates a PDF. I have no access to the server and cannot change its settings.

When the server produces the PDF, it embeds the following JavaScript code in the file so that whenever a PDF reader/viewer opens it, the PRINT DOCUMENT screen opens automatically. This is very inconvenient and frustrating.

The code inside the file at the very start looks like this:

%PDF-1.4
%âãÏÓ
1 0 obj
<</S/JavaScript/JS(this.print\(true , 0,this.numPages-1,false\);\r)>>
endobj
3 0 obj
<</Length 10/Filter/FlateDecode>>stream
xœ+ä SNIP

I thought it would be an easy task to remove the JavaScript line and prevent the auto-print screen from popping up.

I have tried this (I just did a string search and replace and removed line 4). This DOES stop the print screen from appearing, BUT when the file is opened in a few PDF viewers (GoodReader etc.) it is instantly flagged as a corrupted PDF.

I can click the repair option and everything works fine, but I would like to know: is there anything I could do to replace the JavaScript code with some sort of NOOP code, so that the file is not corrupted while the print page is still prevented?

Here's a link to a source file: https://www.dropbox.com/s/kziy6evi57cfhb3/2014-04-04_EIKY.pdf (800k)

Is there a way to nullify a PDF object or something similar?

Thank you.

6
  • So, it was the document owner's intention to print the documents immediately after they are opened. In fact, the command to execute the JavaScript is not in obj 1 (which only contains the code to be executed), but in another object which refers to obj 1. Commented Apr 5, 2014 at 10:55
  • Noodling around in a PDF using a text editor can be dangerous for the file; using a hex editor is already less risky. But when it comes to "cleaning up" a whole collection of documents, it is recommended to use an appropriate tool. You may be willing to spend some time developing something using one of the libraries out there, or you might look at some commercial tools (for that, you might talk to Appligent, for example). Commented Apr 5, 2014 at 11:02
  • It was for a personal project only. It was forcing a print preview screen on my ipad and other devices and was just an annoyance rather than a show stopper. This quick "hack" is all that I needed at this stage. Thanks for the info though. Commented Apr 5, 2014 at 19:16
  • @PilotSnipes could you please let us know which library you used for parsing the PDF file? If possible, can you upload the relevant part of your script? Commented Feb 12, 2020 at 8:31
  • @SakthiSureshAnand it really wasn't anything fancy. And it still works as of today! I will leave my code below as a new answer. Commented Feb 16, 2020 at 11:33

4 Answers

17

A PDF records the byte offset of each object in its cross-reference (xref) table, so you can't add or remove characters without invalidating those offsets. But you can overwrite characters with the same number of other characters. You can change this:

<</S/JavaScript/JS(this.print\(true , 0,this.numPages-1,false\);\r)>>

to this

<</S/JavaScript/JS(;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;\r)>>

for example.
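As a rough illustration of why the same-length rule matters, here is a minimal Python sketch. It is not part of the original answer: the `blank_out` helper and the `sample` bytes are made up for the demo, and the fragment only imitates the start of the file in the question. It compares where a later object starts after overwriting the script versus deleting it.

```python
def blank_out(data: bytes, script: bytes) -> bytes:
    """Overwrite every occurrence of `script` with semicolons of equal length."""
    return data.replace(script, b";" * len(script))


# Script body as it appears in the question (raw bytes, backslashes kept).
script = rb"this.print\(true , 0,this.numPages-1,false\);"

# Hypothetical fragment shaped like the start of the file in the question.
sample = (
    b"%PDF-1.4\n1 0 obj\n<</S/JavaScript/JS(" + script + rb"\r)>>"
    b"\nendobj\n3 0 obj\n<</Length 10>>\nendobj\n"
)

patched = blank_out(sample, script)      # same length, script neutralised
deleted = sample.replace(script, b"")    # shorter, later objects shift

marker = b"3 0 obj"
print("original offset:", sample.index(marker))
print("after overwrite:", patched.index(marker))
print("after deletion: ", deleted.index(marker))
```

Overwriting leaves `3 0 obj` at the same byte position, while deleting moves it, and that shift is what invalidates the recorded offsets. On a real file, reading and writing in binary mode (`"rb"` / `"wb"`) keeps the other bytes untouched.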


6 Comments

Have just tested and this works now perfectly! Thank you for the quick answer!
If you dare to noodle around in the PDF file, the number of characters has to remain the same because a PDF has an xref table, which lists where each object begins as an offset from the beginning of the file. If you delete anything, the offsets are no longer correct. A well-behaved PDF viewer tries to fix that, but there are limits, and beyond them the document is corrupted for good.
What editor do you use to remove JavaScript from a PDF?
Any text editor should work. On Windows, Notepad. On Linux or OSX, vim.
@JoeFrambach It looks like gibberish and I can't find the JavaScript I know is embedded. Maybe it needs tricky encoding settings?
6

The easy way:

  1. Open the file with Notepad++ or a similar editor.
  2. Find the JavaScript code that triggers the print dialogue. You can use the editor's find dialogue (Ctrl+F) and search for the string "this.print". The rest of the code might change from document to document.
  3. Select all the characters inside the brackets of the JS instruction and count them, e.g.

    /JS (this.print({bUI:true,bSilent:false,bShrinkToFit:true});)

    See attached pic 1

  4. Replace all the content inside the brackets with exactly the same number of semicolons, e.g.

    /JS (;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;)

  5. Save the document.
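The manual steps above could be automated along these lines. This is a minimal Python sketch, not part of the original answer: `neutralise_print` and `JS_STRING` are hypothetical names, it assumes the script sits uncompressed in the file as a `/JS (...)` literal string, and it only copes with one level of unescaped nested brackets.

```python
import re

# Body of a PDF literal string: an escaped byte, a plain byte, or one level
# of balanced unescaped parentheses (deeper nesting is not handled here).
_PLAIN = rb"(?:\\.|[^\\()])"
_BODY = rb"(?:" + _PLAIN + rb"|\(" + _PLAIN + rb"*\))*"
JS_STRING = re.compile(rb"(/JS\s*\()(" + _BODY + rb")(\))", re.DOTALL)


def neutralise_print(pdf: bytes) -> bytes:
    """Blank out every /JS string that calls this.print, keeping its length."""

    def pad(match):
        body = match.group(2)
        if b"this.print" not in body:
            return match.group(0)  # leave unrelated scripts alone
        # Same number of bytes as before, so object offsets do not move.
        return match.group(1) + b";" * len(body) + match.group(3)

    return JS_STRING.sub(pad, pdf)


# Demo on the instruction from step 3 (in memory, no file needed).
sample = b"/JS (this.print({bUI:true,bSilent:false,bShrinkToFit:true});)"
patched = neutralise_print(sample)
print(patched.decode("ascii"))
print(len(sample) == len(patched))
```

To apply it to a real file, read the bytes with `open(path, "rb")`, pass them through `neutralise_print`, and write the result with `open(path, "wb")`. Checking that the length is unchanged is a cheap sanity test before overwriting anything.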


3

@SakthiSureshAnand was looking for the code/library I used. It really is nothing special, but I thought I'd leave it here.

A simple PHP script requests the original file and reads its contents into a string. A single regex replace is then all I use to overwrite the unwanted printing code before writing the amended file to disk:

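// Read the original PDF into a string (file_get_contents is binary-safe).
$fileString = file_get_contents('source.pdf');

// Overwrite the script body with the same number of semicolons so the file
// length, and with it every byte offset in the xref table, stays unchanged.
$pdf = preg_replace_callback(
  '%(<</S/JavaScript/JS\()(.*?)(\)>>)%i',
  function ($m) {
    return $m[1] . str_repeat(';', strlen($m[2])) . $m[3];
  },
  $fileString
);

// Do what you want with the fixed $pdf string.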

Hope that helps someone.


1

With Foxit Reader on Windows, you can print the document to PDF, and the resulting PDF no longer has the JavaScript actions.

1 Comment

Yes, but the resulting PDF is sadly often 10x the size of the original
