
I am trying to upload files to Google Drive with the Google API using the following code:

import httplib2
from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools
try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/drive.file',
    'https://www.googleapis.com/auth/drive.appdata',
    'https://www.googleapis.com/auth/drive.apps.readonly',
]
store = file.Storage('scope.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store, flags) if flags else tools.run(flow, store)
    DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
else:
    credentials = creds
    http = credentials.authorize(httplib2.Http())
    DRIVE = discovery.build('drive', 'v3', http=http)

FILES = (
    ('/home/vkm/mayur/Demo_Google_API.zip', 'application/vnd.google-apps.document'),
)

for filename, mimeType in FILES:
    metadata = {'name': filename}
    if mimeType:
        metadata['mimeType'] = mimeType
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))

I am able to upload small files, but when I try with an 8 GB file it raises a MemoryError. Here is the error message I am getting:

Traceback (most recent call last):
  File "demo.py", line 46, in <module>
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 853, in method
    payload = media_upload.getbytes(0, media_upload.size())
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 482, in getbytes
    return self._fd.read(length)
MemoryError
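The traceback shows the client calling `self._fd.read(length)` on the whole file at once, so all 8 GB must fit in memory. As a generic (not Drive-specific) illustration of the difference, reading in fixed-size chunks keeps peak memory at one chunk regardless of file size:

```python
def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield successive chunks of a file; peak memory use is one chunk."""
    with open(path, 'rb') as f:
        # iter() keeps calling f.read(chunk_size) until it returns b'' at EOF
        for chunk in iter(lambda: f.read(chunk_size), b''):
            yield chunk
```

This is the same idea the resumable-upload answers below rely on.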
  • Your machine is running out of memory. Increase the available memory (if only it were that easy), compress the file, or split it into multiple files and send each separately. Commented Feb 18, 2018 at 10:42
  • @mhawke thanks for the reply. Is there any other way to do this? Commented Feb 18, 2018 at 10:53

3 Answers


Vikram's comment revealed a problem in mhawke's answer: next_chunk needs to be called on the return value of:

request = DRIVE.files().create(body=metadata, media_body=media)

not on the return value of request.execute().

Here is a snippet of Python code that I verified as working for uploads of files up to 10 MB to my Google Drive account:

from googleapiclient.http import MediaFileUpload

# Upload some file that just happens to be binary (we
# don't care about metadata, just upload it without
# translation):
the_file_to_upload = 'some_binary_file'
metadata = {'name': the_file_to_upload}
# Note the chunksize restrictions given in
# https://developers.google.com/api-client-library/python/guide/media_upload
media = MediaFileUpload(the_file_to_upload,
                        chunksize=1024 * 1024,
                        # Not sure whether or not this mimetypes is necessary:
                        mimetype='text/plain',
                        resumable=True)
request = drive_service.files().create(body=metadata, media_body=media)
response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        print("Uploaded %d%%." % int(status.progress() * 100))
print("Upload of {} is complete.".format(the_file_to_upload))

Here is a snippet of Python code that downloads the same file to a different file name, so that I can use sha1sum to verify the file was not altered by Google Drive on the round trip.

import io
from googleapiclient.http import MediaIoBaseDownload

# Verify downloading works without translation:
request = drive_service.files().get_media(fileId=response['id'])
# Use io.FileIO. Refer to:
# https://google.github.io/google-api-python-client/docs/epy/googleapiclient.http.MediaIoBaseDownload-class.html
out_filename = the_file_to_upload + ".out"
fh = io.FileIO(out_filename, mode='wb')
downloader = MediaIoBaseDownload(fh, request, chunksize=1024 * 1024)
done = False
while done is False:
    status, done = downloader.next_chunk()
    if status:
        print("Download %d%%." % int(status.progress() * 100))
print("Download Complete!")
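For the sha1sum comparison mentioned above, here is a small helper that hashes each file in chunks, so the check itself does not load the whole file into memory (the file names in the comment are the ones assumed in the snippets above):

```python
import hashlib

def sha1_of(path, blocksize=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(blocksize), b''):
            h.update(block)
    return h.hexdigest()

# e.g. compare the original against the downloaded copy:
# assert sha1_of(the_file_to_upload) == sha1_of(out_filename)
```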

4 Comments

This appears to be partially working for me. It prints the progress until it calls next_chunk on the last chunk, then it returns a 413. My best guess is that MediaIoBaseUpload (which I'm using instead of MediaFileUpload) implements the request chunking incorrectly. Either that or there is a newly imposed file size limit that isn't well documented.
Good answer, this works for me. However, how do I get back the fields using this approach? For example, I want the id of the uploaded file, in a standard upload I would just specify fields="id" inside the .create() and run .execute() on that which would give it back.
@KillerKode Not really sure where you would get the "id". It's been too long ago when I posted this, but I do see response['id'] might be it. Even if that were the case, it might not help you if you hadn't first uploaded the file and had access to the response object. Maybe someone else would know how to obtain that info from some other API (e.g., from searching Drive for filenames and such).
I thought so too but got an exception when trying it lol. Google documentation is never great. Thanks anyway, I don't really need the ID but wanted to put it into my reusable functions for the future. Main thing is my 100 MB uploads are working a treat now :).

You could upload the file using a resumable media upload. This will send the file in chunks and should not max out your memory, which I assume is happening because your client is trying to send the whole file at once.

To do this you need to pass a MediaFileUpload object to the create() method with the resumable flag set to True. Optionally, you can also set the chunksize.

from googleapiclient.http import MediaFileUpload

metadata = {'name': filename}
media = MediaFileUpload(filename, mimetype=mimetype, resumable=True)

request = DRIVE.files().create(body=metadata, media_body=media)
response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        print("Uploaded %d%%." % int(status.progress() * 100))
print("Upload Complete!")

Try reducing the chunksize if needed.
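If you do tune chunksize, note that the client library's media-upload guide says resumable-upload chunk sizes should be a multiple of 256 KB. A small sketch of rounding a requested size up to that granularity (`round_chunksize` is a hypothetical helper for illustration, not part of the API):

```python
CHUNK_UNIT = 256 * 1024  # resumable uploads work in 256 KB units

def round_chunksize(requested):
    """Round a requested chunk size up to the nearest 256 KB multiple,
    with a minimum of one unit."""
    units = max(1, -(-requested // CHUNK_UNIT))  # ceiling division
    return units * CHUNK_UNIT
```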

3 Comments

Thanks for the answer, but I am getting an AttributeError: 'dict' object has no attribute 'next_chunk' error.
@VikramSinghChandel I ran into the very same error you did. I have concluded that either mhawke's answer was wrong to begin with, or that it was correct at one point in time but the API maintainers invalidated it. See my modification of his code in my answer at stackoverflow.com/a/49483101/257924
This answer is correct if you remove the execute() call.

The easiest way to upload large files to Google Drive with Python is just to add resumable=True:

from googleapiclient.http import MediaFileUpload    
media = MediaFileUpload(filename, resumable=True)

