I'm trying to extract the text of a PDF given its URL. Following the example on the pdf.js website, I understand how to render a PDF on the client side, but I'm running into issues when I try to do this server-side.

I installed the package with npm i pdfjs-dist

I tried the code below as a simple example to load the pdf:

var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';
var pdfjsLib = require("pdfjs-dist")
var loadingTask = pdfjsLib.getDocument(url);

loadingTask.promise.then(function (pdf) {
    console.log(pdf);
}).catch(function (error){
    console.log(error)
})

But when I run this, I get the following error:

  message: 'The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. `ReadableStream` and/or `Promise.allSettled`); please use an ES5-compatible build instead.',
  name: 'UnknownErrorException',
  details: 'Error: The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. `ReadableStream` and/or `Promise.allSettled`); please use an ES5-compatible build instead.'

Any ideas on how to go about this? All I'm trying to do is extract the text of a PDF from its URL, server-side, using Node.js. I'd appreciate any input!

3 Answers

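You need to import the ES5 build of pdf.js. As the error message says, the default build relies on features such as ReadableStream and Promise.allSettled that your Node.js environment does not provide natively. The code below should work: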

var pdfjsLib = require("pdfjs-dist/es5/build/pdf.js");
var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';
var loadingTask = pdfjsLib.getDocument(url);

loadingTask.promise.then(function (pdf) {
    console.log(pdf);
}).catch(function (error){
    console.log(error)
})

Also check out https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js for a working example with node.js
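
Since the goal is the text rather than the document object, the next step might look like the sketch below. It assumes the getPage and getTextContent methods of pdf.js 2.x, and that each text item carries a str field. extractText is a hypothetical helper name, and the library is passed in as an argument so that whichever build path works for your installed version can be used.

```javascript
// Hypothetical helper: collects the text of every page of a PDF.
// `pdfjsLib` is whichever pdf.js build you were able to require;
// `source` is anything getDocument accepts (here, a URL string).
async function extractText(pdfjsLib, source) {
  const pdf = await pdfjsLib.getDocument(source).promise;
  const pages = [];

  // pdf.js page numbers start at 1, not 0.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();

    // Keep only items that actually hold text, then join them with spaces.
    const pageText = content.items
      .filter((item) => typeof item.str === "string")
      .map((item) => item.str)
      .join(" ");
    pages.push(pageText);
  }

  // One line of output per page.
  return pages.join("\n");
}

// Usage (assumes the ES5 build path from this answer):
//   var pdfjsLib = require("pdfjs-dist/es5/build/pdf.js");
//   extractText(pdfjsLib, url).then(console.log).catch(console.error);
```

Joining items with a single space is a simplification: it ignores the position of each item on the page, so column layouts and line breaks are not preserved.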

I had the same problem ("The browser/environment lacks native support for critical functionality used by the PDF.js library (e.g. ReadableStream and/or Promise.allSettled); please use an ES5-compatible build instead.") but with Angular 8, so I'm leaving the solution here in case someone needs it:

package.json configuration:

  • Angular version: 8.2.14
  • pdfjs-dist: 2.4.456

component:

import * as pdfjs from 'pdfjs-dist/es5/build/pdf';
import { pdfjsworker } from 'pdfjs-dist/es5/build/pdf.worker.entry';

pdfjs.GlobalWorkerOptions.workerSrc = pdfjsworker;

1 Comment

Sweet solution. I was facing the same issue in a React (17.0) app where I was using the react-pdf module. I removed it, installed the version mentioned above, and it worked. Thanks.

I faced the same issue with the latest version of pdfjs-dist (2.8.335) in a Node.js project. As the other answers mention, the fix is to change the import path.

But in my case the path pdfjs-dist/es5/build/pdf didn't work.

In the latest version it has changed to pdfjs-dist/legacy/build/pdf.js
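
If the same code has to run against more than one pdfjs-dist 2.x release, one option is to try the known paths in order. This is only a sketch: loadPdfjs and CANDIDATE_PATHS are hypothetical names, the two paths are the ones quoted in the answers here, and other releases may use different file names, so check the installed package if neither resolves.

```javascript
// Entry points quoted in the answers on this page, newest layout first.
const CANDIDATE_PATHS = [
  "pdfjs-dist/legacy/build/pdf.js", // e.g. 2.8.335
  "pdfjs-dist/es5/build/pdf.js",    // e.g. 2.4.456
];

// Hypothetical helper: returns the first candidate that can be required.
// `requireFn` defaults to Node's require; it is a parameter only so the
// lookup can be exercised without the package installed.
function loadPdfjs(requireFn = require) {
  for (const path of CANDIDATE_PATHS) {
    try {
      return requireFn(path);
    } catch (error) {
      // Only a missing module means "try the next path";
      // anything else is a real failure and is rethrown.
      if (error.code !== "MODULE_NOT_FOUND") throw error;
    }
  }
  throw new Error(
    "No ES5/legacy build of pdfjs-dist found; tried: " + CANDIDATE_PATHS.join(", ")
  );
}

// Usage:
//   var pdfjsLib = loadPdfjs();
//   var loadingTask = pdfjsLib.getDocument(url);
```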

1 Comment

Yup, this still seems to be the official way.
