Converting byte array output into Blob corrupts file

Question

I am using the Office JavaScript API to write an Add-in for Word using Angular.

I want to retrieve the Word document through the API, then convert it to a file and upload it via POST to a server.

The code I am using is nearly identical to the documentation code that Microsoft provides for this use case: https://dev.office.com/reference/add-ins/shared/document.getfileasync#example---get-a-document-in-office-open-xml-compressed-format

The server endpoint requires uploads to be POSTed through a multipart form, so when creating the $http call I build a FormData object and append the file (a blob) as well as some metadata to it.

The file is being transmitted to the server, but when I open it, it has become corrupted and it can no longer be opened by Word.

According to the documentation, the Office.context.document.getFileAsync function returns a byte array. However, the resulting fileContent variable is a string. When I console.log this string it seems to be compressed data, like it should be.

My guess is that I need to do some preprocessing before turning the string into a Blob. But which preprocessing? Base64 conversion through atob doesn't seem to be doing anything.

let sendFile = (fileContent) => {

  let blob = new Blob([fileContent], {
      type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
    }),
    fd = new FormData();

  blob.lastModifiedDate = new Date();

  fd.append('file', blob, 'uploaded_file_test403.docx');
  fd.append('case_id', caseIdReducer.data());

  $http.post('/file/create', fd, {
      transformRequest: angular.identity,
      headers: {
        'Content-Type': undefined
      }
    })
    .success(() => {

      console.log('upload succeeded');

    })
    .error(() => {
      console.log('upload failed');
    });

};


function onGotAllSlices(docdataSlices) {

  let docdata = [];

  for (let i = 0; i < docdataSlices.length; i++) {
    docdata = docdata.concat(docdataSlices[i]);
  }

  let fileContent = new String();

  for (let j = 0; j < docdata.length; j++) {
    fileContent += String.fromCharCode(docdata[j]);
  }

  // Now all the file content is stored in 'fileContent' variable,
  // you can do something with it, such as print, fax...

  sendFile(fileContent);

}

function getSliceAsync(file, nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived) {
  file.getSliceAsync(nextSlice, (sliceResult) => {

    if (sliceResult.status === 'succeeded') {
      if (!gotAllSlices) { // Failed to get all slices, no need to continue.
        return;
      }

      // Got one slice, store it in a temporary array.
      // (Or you can do something else, such as
      // send it to a third-party server.)
      docdataSlices[sliceResult.value.index] = sliceResult.value.data;
      if (++slicesReceived === sliceCount) {
        // All slices have been received.
        file.closeAsync();

        onGotAllSlices(docdataSlices);

      } else {
        getSliceAsync(file, ++nextSlice, sliceCount, gotAllSlices, docdataSlices, slicesReceived);
      }
    } else {

      gotAllSlices = false;
      file.closeAsync();
      console.log(`getSliceAsync Error: ${sliceResult.error.message}`);
    }
  });
}

// User clicks button to start document retrieval from Word and uploading to server process
ctrl.handleClick = () => {

  Office.context.document.getFileAsync(Office.FileType.Compressed, {
      sliceSize: 65536 /*64 KB*/
    },
    (result) => {
      if (result.status === 'succeeded') {

        // If the getFileAsync call succeeded, then
        // result.value will return a valid File Object.
        let myFile = result.value,
          sliceCount = myFile.sliceCount,
          slicesReceived = 0,
          gotAllSlices = true,
          docdataSlices = [];

        // Get the file slices.
        getSliceAsync(myFile, 0, sliceCount, gotAllSlices, docdataSlices, slicesReceived);

      } else {

        console.log(`Error: ${result.error.message}`);

      }
    }
  );
};

Squrler · Accepted Answer · 2016-09-14 22:21:21Z

12

I ended up doing this with the fileContent string:

let bytes = new Uint8Array(fileContent.length);

for (let i = 0; i < bytes.length; i++) {
    bytes[i] = fileContent.charCodeAt(i);
}

I then proceed to build the Blob with these bytes:

let blob = new Blob([bytes], { type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' });

If I then send this via a POST request, the file isn't mangled and can be opened correctly by Word.

I still get the feeling this can be achieved with less hassle and fewer steps. If anyone has a better solution, I'd be very interested to learn.
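A likely explanation for why the typed array helps: the Blob constructor encodes string parts as UTF-8, so every character above 0x7F turns into two bytes, whereas a Uint8Array part is copied byte for byte. The sketch below uses made-up byte values rather than a real document, and only compares the resulting sizes:

```javascript
// Stand-in for the bytes collected from the slices (illustrative values only).
const docdata = [0x50, 0x4b, 0x03, 0x04, 0xff, 0x80];

// The question's approach: one character per byte.
const fileContent = String.fromCharCode(...docdata);

// String parts are encoded as UTF-8, so 0xff and 0x80 become two bytes each.
const fromString = new Blob([fileContent]);

// The fix from this answer: copy the character codes into a typed array first.
const bytes = Uint8Array.from(fileContent, (ch) => ch.charCodeAt(0));
const fromBytes = new Blob([bytes]);

console.log(fromString.size); // 8, two bytes longer than the input
console.log(fromBytes.size);  // 6, same length as the input
```

A real .docx is a ZIP archive, so any extra bytes like these are enough to make it unreadable.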

JohnnyAW · Accepted Answer · 2017-08-11 16:52:34Z

1

Thanks for your answer, Uint8Array was the solution. Just one small improvement, to avoid creating the string:

let bytes = new Uint8Array(docdata.length);
for (var i = 0; i < docdata.length; i++) {
    bytes[i] = docdata[i];
}
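Assuming docdata is a plain array of numbers in the range 0 to 255, as in the question's code, the loop can probably be dropped as well, because the Uint8Array constructor accepts an array directly. A minimal sketch with illustrative values:

```javascript
// Stand-in for the concatenated slice data from the question (illustrative values).
const docdata = [80, 75, 3, 4, 255, 128];

// The typed-array constructor copies the numbers out of a plain array.
const bytes = new Uint8Array(docdata);

const blob = new Blob([bytes], {
  type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
});

console.log(bytes.length, blob.size); // 6 6
```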

Endless · Accepted Answer · 2016-09-14 20:25:40Z

0

Pff! What is wrong with getting an instance of File and not using the FileReader API? C'mon Microsoft!

You should take the byte array and throw it into the Blob constructor. Turning binary data into a string in JavaScript is a bad idea that can lead to an "out of range" error or incorrect encoding.

Just do something along these lines:

var byteArray = new Uint8Array(3)
byteArray[0] = 97
byteArray[1] = 98
byteArray[2] = 99
new Blob([byteArray])

If the chunk is an instance of a typed array or an instance of Blob/File, you can just do:

blob = new Blob([blob, chunk])
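Applied to the question's code, that could look roughly like the sketch below. It assumes each entry of docdataSlices is a plain array of byte values (what sliceResult.value.data appears to hold for the compressed file type); the values here are made up.

```javascript
// Hypothetical slices standing in for the sliceResult.value.data arrays
// collected in docdataSlices (illustrative values only).
const docdataSlices = [
  [80, 75, 3, 4],
  [255, 128]
];

// Wrap each slice in a Uint8Array and pass them all to a single Blob.
// The Blob concatenates its parts in order, so no string is ever built.
const parts = docdataSlices.map((slice) => new Uint8Array(slice));
const blob = new Blob(parts, {
  type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
});

console.log(blob.size); // 6
```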

And please... don't base64 encode it (roughly 33% larger + slower)
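For a sense of the size cost, Base64 emits four output characters for every three input bytes. A quick check with an arbitrary 3000-byte buffer:

```javascript
// 3000 zero bytes; the content does not matter for the size comparison.
const bytes = new Uint8Array(3000);

// btoa expects a "binary string" with one character per byte.
const binary = String.fromCharCode(...bytes);
const encoded = btoa(binary);

console.log(bytes.length);   // 3000
console.log(encoded.length); // 4000
```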

1 Comment

Squrler Over a year ago

>> "Pff! what is wrong with a getting a instance of File and not using FileReader api? c'mon Microsoft!"
Don't I know it... Very confusing. When you say create a Uint8Array, where do you input the slices?
