
So, here is my problem: I have a large text file (around 150 MB) with hundreds of thousands of lines. I need to read the contents of the file, parse it so that the lines are wrapped in appropriate HTML tags, and write the result into a window.document.open() object.

My code works for files up to about 50 MB in size.

var rawFile = new XMLHttpRequest();
rawFile.open("GET", file, true);
rawFile.onreadystatechange = function () {
    if (rawFile.readyState === 4) {
        if (rawFile.status === 200 || rawFile.status === 0) {
            var allText = rawFile.responseText;
            var contents = allText.split("\n");
            var w = window.open();
            w.document.open();
            for (var i = 0; i < contents.length; i++) {
                // logic so that str = appropriate tags + contents[i]
                w.document.write(str);
            }
            w.document.close();
        }
    }
};
rawFile.send(null);

The code works. The logic works. But if the file size is greater than 100 MB or so, Chrome crashes. I think reading the file in chunks and then writing each chunk to the window.document.open() object would solve this problem for me.

Any advice on how I could go about accomplishing this is very much appreciated. Thank you :)

(Ignore any errors in the code I posted above; my actual code is very large, so I just wrote a miniature version of it.)

  • Check this Commented Jun 12, 2017 at 7:46
  • Is there no way I can do this with plain javascript/jquery without using any plugins? Commented Jun 12, 2017 at 7:51
  • Of course there is. Create entire plugin on your own. Re-invent the wheel. ha ha. Commented Jun 12, 2017 at 7:52
  • I mean, any simpler solution? This seems like such a common use case that I assumed there would be a simpler solution. Commented Jun 12, 2017 at 8:36

1 Answer


Your approach will cripple the browser because you are processing the entire response at once. A better approach is to break the work down so that you are processing smaller chunks, or alternatively to stream the file through your process.

Using the Fetch API rather than XMLHttpRequest will get you access to the streaming data. The big advantage of using the stream is that you aren't hogging the browser's memory when you're processing the content.

The following code outlines how to use streams to perform the task:

var file_url = 'URL_TO_FILE';
// @link https://developer.mozilla.org/en-US/docs/Web/API/Request/Request
var myRequest = new Request(file_url);
// fetch returns a promise
fetch(myRequest)
  .then(function (response) {
    var contentLength = response.headers.get('Content-Length');
    // response.body is a readable stream
    // @link https://learn.microsoft.com/en-us/microsoft-edge/dev-guide/performance/streams-api
    var myReader = response.body.getReader();
    // the reader result will need to be decoded to text
    // @link https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/TextDecoder
    var decoder = new TextDecoder();
    // decoded text accumulates here until it can be processed
    var buffer = '';
    // you could use the number of bytes received to implement a progress indicator
    var received = 0;
    // read() returns a promise
    return myReader.read().then(function processResult(result) {
      // the result object contains two properties:
      // done  - true if the stream is finished
      // value - the data (undefined once done is true)
      if (result.done) {
        // flush any bytes the decoder is still holding
        buffer += decoder.decode();
        /* process the remaining buffer string */
        return;
      }
      // update the total number of bytes received
      received += result.value.length;
      // result.value is a Uint8Array, so it needs to be decoded;
      // {stream: true} tells the decoder that more chunks are coming
      buffer += decoder.decode(result.value, {stream: true});
      /* process the buffer string */

      // read the next piece of the stream and process the result
      return myReader.read().then(processResult);
    });
  })
  .catch(function (err) {
    console.error('Fetch failed:', err);
  });

I didn't implement the code for processing the buffer, but the algorithm would be as follows:

If the buffer contains a newline character:
    Split the buffer into an array of lines
If there is still more data to read:
    Save the last array item because it may be an incomplete line
    Do this by setting the content of the buffer to that of the last array item
Process each line in the array
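The steps above could be sketched as a small pure helper (the function name and shape here are illustrative, not part of the code above):

```javascript
// Split the buffered text into complete lines, keeping any trailing
// partial line in the buffer for the next chunk.
function drainBuffer(buffer, done) {
  var lines = buffer.split('\n');
  var remainder = '';
  if (!done) {
    // the last array item may be an incomplete line, so keep it buffered
    remainder = lines.pop();
  }
  return { lines: lines, remainder: remainder };
}
```

In the stream loop you would call this after each decode, process `lines`, and set `buffer = remainder`; when `result.done` is true, pass `done = true` so the final line is processed too.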

A quick look at Can I Use tells me that this won't work in IE, because the Fetch API wasn't implemented before the Edge browser. However, there's no need to despair: as always, some kind soul has implemented a polyfill for non-supporting browsers.
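Before loading a polyfill, you could feature-detect the pieces this approach relies on. A minimal sketch (the helper name is made up; the properties it checks are the standard APIs used above):

```javascript
// Returns true if the given global object exposes everything the
// streaming approach needs: fetch, readable streams, and TextDecoder.
function supportsFetchStreaming(globalObj) {
  return typeof globalObj.fetch === 'function' &&
         typeof globalObj.ReadableStream === 'function' &&
         typeof globalObj.TextDecoder === 'function';
}
```

In a browser you would call it as `supportsFetchStreaming(window)` and fall back to the polyfill (or a chunked XHR approach) when it returns false.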


Comments

Thank you very much! I'll try this and get back to you :)
Hi! So I tried doing what you mentioned. I still have the problem. I am processing a huge file (over 1.5M lines), and when I process the buffer and write into a new window, the script finishes executing (better than my original one), but I still get the message "the page has become unresponsive. Wait or kill" in Chrome. Chrome doesn't crash, but I still can't get what I want because the page becomes unresponsive. Is this because I am exceeding one of Chrome's default memory limits?
Update: Now I get "Chrome ran out of memory while displaying this web page." Any workarounds for this?
I think the problem you asked to be solved isn't the actual problem you need to solve. No web browser was designed to handle your application in its current form. Why do you need all of the content displayed at once in the DOM? Humans can't process the information in 100+ lines at once, which is a universe away from your 1.5M+ lines. A better solution would be to process the file on the server and only send the necessary data to the browser one page at a time. Can you describe what it is that the application is doing?
Have you considered logging to a database? Databases compress text really well, so you will save on disk space relative to your current log files. A JS application that pages through the data from the database would be a cleaner solution than what you currently have.
