
I am trying to upload files to Google Drive with the Google API using the following code:

import httplib2
from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools
try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/drive.file',
    'https://www.googleapis.com/auth/drive.appdata',
    'https://www.googleapis.com/auth/drive.apps.readonly',
]
store = file.Storage('scope.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store, flags) if flags else tools.run(flow, store)
    DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
else:
    credentials = creds
    http = credentials.authorize(httplib2.Http())
    DRIVE = discovery.build('drive', 'v3', http=http)

FILES = (
    ('/home/vkm/mayur/Demo_Google_API.zip', 'application/vnd.google-apps.document'),
)

for filename, mimeType in FILES:
    metadata = {'name': filename}
    if mimeType:
        metadata['mimeType'] = mimeType
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))

I am able to upload small files, but when I try with an 8 GB file it raises a MemoryError. Here is the error message I am getting:

Traceback (most recent call last):
  File "demo.py", line 46, in <module>
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 853, in method
    payload = media_upload.getbytes(0, media_upload.size())
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 482, in getbytes
    return self._fd.read(length)
MemoryError
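The traceback shows the client calling `self._fd.read(length)` on the whole file at once, so all 8 GB must fit in memory. As a generic (not Drive-specific) illustration of the difference, reading in fixed-size chunks keeps peak memory at one chunk regardless of file size:

```python
def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield successive chunks of a file; peak memory use is one chunk."""
    with open(path, 'rb') as f:
        # iter() keeps calling f.read(chunk_size) until it returns b'' at EOF
        for chunk in iter(lambda: f.read(chunk_size), b''):
            yield chunk
```

This is the same idea the resumable-upload answers below rely on.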
  • Your machine is running out of memory. Increase the available memory (if only it were that easy), compress the file, or split it into multiple files and send each separately. Commented Feb 18, 2018 at 10:42
  • @mhawke thanks for the reply. Is there any other way to do this? Commented Feb 18, 2018 at 10:53

3 Answers


Vikram's comment revealed a problem in mhawke's answer: next_chunk needs to be called on the return value of:

request = DRIVE.files().create(body=metadata, media_body=media)

not on the return value of request.execute().

Here is a snippet of Python code that I verified as working for uploads of files up to 10 MB to my Google Drive account:

from googleapiclient.http import MediaFileUpload

# Upload some file that just happens to be binary (we
# don't care about metadata, just upload it without
# translation):
the_file_to_upload = 'some_binary_file'
metadata = {'name': the_file_to_upload}
# Note the chunksize restrictions given in
# https://developers.google.com/api-client-library/python/guide/media_upload
media = MediaFileUpload(the_file_to_upload,
                        chunksize=1024 * 1024,
                        # Not sure whether or not this mimetypes is necessary:
                        mimetype='text/plain',
                        resumable=True)
request = drive_service.files().create(body=metadata, media_body=media)
response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        print("Uploaded %d%%." % int(status.progress() * 100))
print("Upload of {} is complete.".format(the_file_to_upload))

Here is a snippet of Python code that downloads the same file to a different file name, so that I can use sha1sum to verify the file was not altered by Google Drive on the round trip.

import io
from googleapiclient.http import MediaIoBaseDownload

# Verify downloading works without translation:
request = drive_service.files().get_media(fileId=response['id'])
# Use io.FileIO. Refer to:
# https://google.github.io/google-api-python-client/docs/epy/googleapiclient.http.MediaIoBaseDownload-class.html
out_filename = the_file_to_upload + ".out"
fh = io.FileIO(out_filename, mode='wb')
downloader = MediaIoBaseDownload(fh, request, chunksize=1024 * 1024)
done = False
while done is False:
    status, done = downloader.next_chunk()
    if status:
        print("Download %d%%." % int(status.progress() * 100))
print("Download Complete!")
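For the sha1sum comparison mentioned above, here is a small helper that hashes each file in chunks, so the check itself does not load the whole file into memory (the file names in the comment are the ones assumed in the snippets above):

```python
import hashlib

def sha1_of(path, blocksize=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(blocksize), b''):
            h.update(block)
    return h.hexdigest()

# e.g. compare the original against the downloaded copy:
# assert sha1_of(the_file_to_upload) == sha1_of(out_filename)
```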

4 Comments

This appears to be partially working for me. It prints the progress until it calls next_chunk on the last chunk, then it returns a 413. My best guess is that MediaIoBaseUpload (which I'm using instead of MediaFileUpload) implements the request chunking incorrectly. Either that or there is a newly imposed file size limit that isn't well documented.
Good answer, this works for me. However, how do I get back the fields using this approach? For example, I want the id of the uploaded file, in a standard upload I would just specify fields="id" inside the .create() and run .execute() on that which would give it back.
@KillerKode Not really sure where you would get the "id". It's been too long ago when I posted this, but I do see response['id'] might be it. Even if that were the case, it might not help you if you hadn't first uploaded the file and had access to the response object. Maybe someone else would know how to obtain that info from some other API (e.g., from searching Drive for filenames and such).
I thought so too but got an exception when trying it lol. Google documentation is never great. Thanks anyway, I don't really need the ID but wanted to put it into my reusable functions for the future. Main thing is my 100 MB uploads are working a treat now :).

You could upload the file using a resumable media upload. This will send the file in chunks and should not max out your memory, which I assume is happening because your client is trying to send the whole file at once.

To do this you need to pass a MediaFileUpload object to the create() method with the resumable flag set to True. Optionally, you can also set the chunksize.

from googleapiclient.http import MediaFileUpload

metadata = {'name': filename}
media = MediaFileUpload(filename, mimetype=mimetype, resumable=True)

request = DRIVE.files().create(body=metadata, media_body=media)
response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        print("Uploaded %d%%." % int(status.progress() * 100))
print("Upload Complete!")

Try reducing the chunksize if needed.
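If you do tune chunksize, note that the client library's media-upload guide says resumable-upload chunk sizes should be a multiple of 256 KB. A small sketch of rounding a requested size up to that granularity (`round_chunksize` is a hypothetical helper for illustration, not part of the API):

```python
CHUNK_UNIT = 256 * 1024  # resumable uploads work in 256 KB units

def round_chunksize(requested):
    """Round a requested chunk size up to the nearest 256 KB multiple,
    with a minimum of one unit."""
    units = max(1, -(-requested // CHUNK_UNIT))  # ceiling division
    return units * CHUNK_UNIT
```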

3 Comments

Thanks for the answer, but I am getting an AttributeError: 'dict' object has no attribute 'next_chunk' error.
@VikramSinghChandel I ran into the very same error you did. I have concluded that either mhawke's answer was wrong to begin with, or that it was correct at one point in time but the API maintainers invalidated it. See my modification of his code in my answer at stackoverflow.com/a/49483101/257924
This answer is correct if you remove the execute() call.

The easiest way to upload large files to Google Drive with Python is just to add resumable=True:

from googleapiclient.http import MediaFileUpload    
media = MediaFileUpload(filename, resumable=True)

