8

I would like to upload a file to a web server. From what I have read, the best way to do this is to use the multipart/form-data encoding type on an HTTP POST request.

My research seems to indicate that there is no simple way to do this using the Python standard library. I am using Python 3.

(Note: see a package called requests (PyPI Link) to easily accomplish this)

I am currently using this method:

import mimetypes, http.client
boundary = 'wL36Yn8afVp8Ag7AmP8qZ0SA4n1v9T' # Randomly generated
for fileName in fileList:
    # Add boundary and header
    dataList.append('--' + boundary)
    dataList.append('Content-Disposition: form-data; name={0}; filename={0}'.format(fileName))

    fileType = mimetypes.guess_type(fileName)[0] or 'application/octet-stream'
    dataList.append('Content-Type: {}'.format(fileType))
    dataList.append('')

    with open(fileName) as f: 
        # Bad for large files
        dataList.append(f.read())

dataList.append('--'+boundary+'--')
dataList.append('')
contentType = 'multipart/form-data; boundary={}'.format(boundary)

body = '\r\n'.join(dataList)
headers = {'Content-type': contentType}

conn = http.client.HTTPConnection('http://...')
req = conn.request('POST', '/test/', body, headers)

print(conn.getresponse().read())

This works to send text.

There are two issues: This is text only, and the whole text file must be stored in memory as a giant string.

How can I upload any binary file? Is there a way to do this without reading the whole file into memory?

3
  • 1
    You should REALLY use the requests package you mentioned already. There is a lot of stuff which is not handled by the standard library, like session management, authentication, checking SSL certificates, ... What's your reason for not using the request module? Commented Apr 3, 2013 at 7:54
  • You could just copy code out of the requests module that does what you need. Commented Apr 3, 2013 at 18:09
  • 1
    It's of course your decission. But if I can choose between packaging a high quality, well tested, "mission proved", ... library or my own custom code, I would always choose the first one. ;-) Commented Apr 3, 2013 at 20:23

4 Answers 4

3

Take a look at small Doug Hellmann's urllib2, translated by me to python3.

I use it nearly this way:

import urllib.request
import urllib.parse
from lib.multipart_sender import MultiPartForm

myfile = open('path/to/file', 'rb')
form = MultiPartForm()
form.add_field('token', apipost[mycgi['domain']]._token)
form.add_field('domain', mycgi['domain'])
form.add_file('file', 'logo.jpg', fileHandle=myfile)
form.make_result()

url = 'http://myurl'
req1 = urllib.request.Request(url)
req1.add_header('Content-type', form.get_content_type())
req1.add_header('Content-length', len(form.form_data))
req1.add_data(form.form_data)
fp = urllib.request.urlopen(req1)
print(fp.read()) # to view status
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you, but the idea is to use the standard library.
2

I had a look at this module

class HTTPConnection:
    # ...
    def send(self, data): # line 820
        """Send `data' to the server.
        ``data`` can be a string object, a bytes object, an array object, a
        file-like object that supports a .read() method, or an iterable object.
        """

data is exactly body. You may pass an iterator like this: (I did not try it out)

def body():
  for fileName in fileList:
    # Add boundary and header
    yield('--' + boundary) + '\r\n'
    yield('Content-Disposition: form-data; name={0}; filename=    {0}'.format(fileName)) + '\r\n'

    fileType = mimetypes.guess_type(fileName)[0] or 'application/octet-stream'
    yield('Content-Type: {}'.format(fileType)) + '\r\n'
    yield('\r\n')

    with open(fileName) as f: 
        # Bad for large files
        yield f.read()
    yield('--'+boundary+'--') + '\r\n'
    yield('') + '\r\n'

3 Comments

This is a good idea, and works with a few modifications in 3.2 or 3.3. If an iterable is given to HTTPConnection.request or HTTPConnection.send then the content-length header also must be specified. However, I am using Python 3.1. Perhaps it is not feasible.
What problem do you get with python 3.1?
In 3.1 body cannot be an iterable. We actually have moved away from 3.1, so this answers my question.
0

I answered a similar question on how to do this using just python's standard library email module, here's the relevant method:

import email.parser
import email.mime.multipart
import email.mime.text
import email.mime.base
import mimetypes
import os
import io

def encode_multipart(fields: dict[str, str], files: dict[str, io.IOBase]):
    multipart_data = email.mime.multipart.MIMEMultipart("form-data")

    # Add form fields
    for key, value in fields.items():
        part = email.mime.text.MIMEText(str(value), "plain")
        part.add_header("Content-Disposition", f"form-data; name=\"{key}\"")
        multipart_data.attach(part)

    # Add files
    for key, fp in files.items():
        mimetype = mimetypes.guess_type(fp.name)[0]
        maintype, subtype = mimetype.split("/", maxsplit=1)
        basename = os.path.basename(fp.name)
        part = email.mime.base.MIMEBase(maintype, subtype)
        part.set_payload(fp.read())

        part.add_header(
            "Content-Disposition",
            f"form-data; name=\"{key}\";filename=\"{basename}\""
        )
        email.encoders.encode_base64(part)
        multipart_data.attach(part)

    headerbytes, body = multipart_data.as_bytes().split(b"\n\n", 1)
    hp = email.parser.BytesParser().parsebytes(headerbytes, headersonly=True)

    return hp._headers, body

The function isn't verifying the data or handling any errors but this should be enough to get people started and they can read my other answer for more details.

Comments

-2

You can use unirest to make the call. Sample code

import unirest

# consume async post request
def consumePOSTRequestSync():
 params = {'test1':'param1','test2':'param2'}

 # we need to pass a dummy variable which is open method
 # actually unirest does not provide variable to shift between
 # application-x-www-form-urlencoded and
 # multipart/form-data
 params['dummy'] = open('dummy.txt', 'r')
 url = 'http://httpbin.org/post'
 headers = {"Accept": "application/json"}
 # call get service with headers and params
 response = unirest.post(url, headers = headers,params = params)
 print "code:"+ str(response.code)
 print "******************"
 print "headers:"+ str(response.headers)
 print "******************"
 print "body:"+ str(response.body)
 print "******************"
 print "raw_body:"+ str(response.raw_body)

# post sync request multipart/form-data
consumePOSTRequestSync()

Check out the blog post for more details http://stackandqueue.com/?p=57

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.