Multipart-Encoded and Python Requests

It’s easy to find examples on the web of how to send multipart-encoded data, such as images and other files, using Python’s requests library. The requests documentation even has a section dedicated to it. Still, a couple of days ago I struggled with the Content-Type header.

The recommended Content-Type for multipart-encoded files and images is multipart/form-data, and requests sets it for us automatically when we use the “files” parameter. Here’s an example taken from the requests documentation:

>>> import requests
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

As you can see, you don’t even need to set the header. Moving on, we often need custom headers, such as x-api-key. So we might write:

>>> headers = {'x-auth-api-key': '<SOME_TOKEN>', 'Content-type': 'multipart/form-data'}
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files, headers=headers)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

Right? Unfortunately, no. Most likely you will receive an error like one of these:

ValueError: Invalid boundary in multipart form: b'' 

or

{'detail': 'Multipart form parse error - Invalid boundary in multipart: None'}

You can get the same kind of failure from a simple Node.js server, because it’s not a matter of language or framework. In the Node.js case, request.files will be undefined because the boundary is not set.

So, what’s the catch?

The catch is that even when we need custom headers, we must not set 'Content-type': 'multipart/form-data' ourselves. If we do, our value overrides the one requests would generate, and requests can no longer do its magic of adding the boundary field to the header.
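To see the difference without sending anything over the network, here is a minimal sketch that only prepares the two requests and prints the resulting Content-Type header. The helper name prepare_upload and the in-memory stand-in for report.xls are my own assumptions for illustration; they are not part of requests.

```python
import io
import requests

url = 'https://httpbin.org/post'  # same test endpoint as above; nothing is sent here

def prepare_upload(headers):
    # Hypothetical helper: an in-memory stand-in for open('report.xls', 'rb'),
    # so the sketch needs no file on disk.
    files = {'file': ('report.xls', io.BytesIO(b'fake spreadsheet bytes'))}
    # prepare() builds the headers and body without sending the request.
    return requests.Request('POST', url, files=files, headers=headers).prepare()

# Custom header only: requests fills in Content-Type itself.
auto = prepare_upload({'x-auth-api-key': '<SOME_TOKEN>'})

# Custom header plus a hand-written Content-Type: our value wins.
manual = prepare_upload({'x-auth-api-key': '<SOME_TOKEN>',
                         'Content-type': 'multipart/form-data'})

print(auto.headers['Content-Type'])    # multipart/form-data; boundary=<random value>
print(manual.headers['Content-Type'])  # multipart/form-data  (no boundary)
```

In the second case the body is still split with a boundary, but the header no longer tells the server what that boundary is, which matches the "Invalid boundary" errors above.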

For multipart entities the boundary directive is required, which consists of 1 to 70 characters from a set of characters known to be very robust through email gateways, and not ending with white space. It is used to encapsulate the boundaries of the multiple parts of the message. Often, the header boundary is prepended with two dashes and the final boundary has two dashes appended at the end. (source)

Here’s an example of a request containing multipart/form-data:

[Image: example of a request containing multipart/form-data]
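For a text version you can reproduce yourself, the sketch below prepares a small upload and prints the header and the raw body that requests would send. The file name hello.txt and its contents are made up for illustration, and the boundary value will differ on every run.

```python
import io
import requests

# A tiny in-memory "file"; the name and contents are placeholders.
files = {'file': ('hello.txt', io.BytesIO(b'Hello, multipart!'))}

# Build the request without sending it.
prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

content_type = prepared.headers['Content-Type']
# The boundary is the value after "boundary=" in the header.
boundary = content_type.split('boundary=')[1]

print('Content-Type:', content_type)
print()
# The body is bytes; this one is plain ASCII, so it decodes cleanly.
print(prepared.body.decode('utf-8'))
```

In the printed body, each part starts with two dashes followed by the boundary, and the last line is the boundary with two dashes on both sides, as the quoted description says.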

So, there it is. When using requests to POST files and/or images, use the files parameter and “forget” the Content-Type header, because the library will handle it for you.
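If the headers come from shared configuration and you can’t be sure nobody added a Content-Type, one defensive option is to strip it before uploading. This is only a sketch: drop_content_type and build_upload are hypothetical helpers of mine, not part of requests.

```python
import io
import requests

def drop_content_type(headers):
    # Return a copy of the headers without any Content-Type entry,
    # whatever its capitalization.
    return {k: v for k, v in headers.items() if k.lower() != 'content-type'}

def build_upload(url, fileobj, filename, headers):
    # Hypothetical helper: prepare a multipart upload and let requests
    # generate the Content-Type header, boundary included.
    return requests.Request(
        'POST', url,
        files={'file': (filename, fileobj)},
        headers=drop_content_type(headers),
    ).prepare()

headers = {'x-auth-api-key': '<SOME_TOKEN>', 'Content-type': 'multipart/form-data'}
prepared = build_upload('https://httpbin.org/post',
                        io.BytesIO(b'fake spreadsheet bytes'),
                        'report.xls', headers)

print(prepared.headers['x-auth-api-key'])
print(prepared.headers['Content-Type'])
```

To actually send the prepared request, pass it to a session, for example requests.Session().send(prepared).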

Nice, huh? 😄
Not when I was suffering. 😒