I'm new to Python, so I am building a simple program to convert YAML to JSON and JSON to YAML.
The YAML-to-JSON conversion writes the JSON on a single line, although a JSON validator says it is correct.
This is my code so far:
def parseyaml(inFileType, outFileType):
    infile = input('Please enter a {} filename to parse: '.format(inFileType))
    outfile = input('Please enter a {} filename to output: '.format(outFileType))

    with open(infile, 'r') as stream:
        try:
            datamap = yaml.safe_load(stream)
            with open(outfile, 'w') as output:
                json.dump(datamap, output)
        except yaml.YAMLError as exc:
            print(exc)
    print('Your file has been parsed.\n\n')
def parsejson(inFileType, outFileType):
    infile = input('Please enter a {} filename to parse: '.format(inFileType))
    outfile = input('Please enter a {} filename to output: '.format(outFileType))

    with open(infile, 'r') as stream:
        try:
            datamap = json.load(stream)
            with open(outfile, 'w') as output:
                yaml.dump(datamap, output)
        except yaml.YAMLError as exc:
            print(exc)
    print('Your file has been parsed.\n\n')
An example of the original YAML vs. the new YAML
Original:
inputs:
  webTierCpu:
    type: integer
    minimum: 2
    default: 2
    maximum: 5
    title: Web Server CPU Count
    description: The number of CPUs for the Web nodes
New:
inputs:
  dbTierCpu: {default: 2, description: The number of CPUs for the DB node, maximum: 5,
    minimum: 2, title: DB Server CPU Count, type: integer}
It doesn't look like it's decoding all of the JSON, so I'm not sure where I should go next...
Your file is losing its formatting because the default dump routine writes all leaf nodes in YAML flow style, whereas your input is block style all the way.
You are also losing the order of the keys, first because the JSON parser loads into a plain dict, and second because dump sorts the output.
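(A side note, separate from the solution below: if you stay with plain PyYAML, passing default_flow_style=False to yaml.dump keeps block style, and sort_keys=False, available in PyYAML 5.1+, stops the sorting. A minimal sketch of that tweak to the original parsejson call, which still does not give the round-trip control described next:)

yaml.dump(datamap, output, default_flow_style=False, sort_keys=False)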
If you look at your intermediate JSON you can already see that the key order is
gone at that point. To preserve it, use the new API to load your YAML
and use a special JSON encoder as a replacement for json.dump that can
handle the subclasses of Mapping in which the YAML is loaded, similar to
this example
from the standard Python docs.
Assuming your YAML is stored in input.yaml:
import sys
import json
from collections.abc import Mapping, Sequence
from collections import OrderedDict
import ruamel.yaml

# if you instantiate a YAML instance as yaml, you have to explicitly import the error
from ruamel.yaml.error import YAMLError

yaml = ruamel.yaml.YAML()  # this uses the new API
# if you have standard indentation, no need to use the following
yaml.indent(sequence=4, offset=2)

input_file = 'input.yaml'
intermediate_file = 'intermediate.json'
output_file = 'output.yaml'


class OrderlyJSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Mapping):
            return OrderedDict(o)
        elif isinstance(o, Sequence):
            return list(o)
        return json.JSONEncoder.default(self, o)


def yaml_2_json(in_file, out_file):
    with open(in_file, 'r') as stream:
        try:
            datamap = yaml.load(stream)
            with open(out_file, 'w') as output:
                output.write(OrderlyJSONEncoder(indent=2).encode(datamap))
        except YAMLError as exc:
            print(exc)
            return False
    return True


yaml_2_json(input_file, intermediate_file)

with open(intermediate_file) as fp:
    sys.stdout.write(fp.read())
which gives:
{
  "inputs": {
    "webTierCpu": {
      "type": "integer",
      "minimum": 2,
      "default": 2,
      "maximum": 5,
      "title": "Web Server CPU Count",
      "description": "The number of CPUs for the Web nodes"
    }
  }
}
You see that your JSON has the appropriate key order, which we also
need to preserve on loading. You can do that without subclassing
anything, by asking json.load to put JSON objects into the subclass of
Mapping that the YAML parser uses internally, by providing object_pairs_hook.
from ruamel.yaml.comments import CommentedMap


def json_2_yaml(in_file, out_file):
    with open(in_file, 'r') as stream:
        try:
            datamap = json.load(stream, object_pairs_hook=CommentedMap)
            # if you need to "restore" literal style scalars, etc.
            # walk_tree(datamap)
            with open(out_file, 'w') as output:
                yaml.dump(datamap, output)
        except YAMLError as exc:  # yaml is a YAML() instance here, so use the imported error class
            print(exc)
            return False
    return True


json_2_yaml(intermediate_file, output_file)

with open(output_file) as fp:
    sys.stdout.write(fp.read())
Which outputs:
inputs:
  webTierCpu:
    type: integer
    minimum: 2
    default: 2
    maximum: 5
    title: Web Server CPU Count
    description: The number of CPUs for the Web nodes
And I hope that that is similar enough to your original input to be acceptable.
Notes:
When using the new API I tend to use yaml as the name of the
instance of ruamel.yaml.YAML(), instead of doing from ruamel import
yaml. That however masks the use of yaml.YAMLError, because the
error class is not an attribute of YAML().
If you are developing this kind of stuff, I can recommend separating
at least the user input from the actual functionality. It should be
trivial to write your parseyaml and parsejson to call yaml_2_json resp.
json_2_yaml (see the sketch after these notes).
Any comments in your original YAML file will be lost, although
ruamel.yaml can load them. JSON originally did allow comments, but that is
not in the specification, and no parsers that I know of can output comments.
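As a hedged sketch of that separation (assuming yaml_2_json and json_2_yaml from above are importable; the wrapper names simply mirror the question's functions):

def parseyaml(inFileType, outFileType):
    # only collect the file names here; the conversion lives in yaml_2_json
    infile = input('Please enter a {} filename to parse: '.format(inFileType))
    outfile = input('Please enter a {} filename to output: '.format(outFileType))
    if yaml_2_json(infile, outfile):
        print('Your file has been parsed.\n\n')


def parsejson(inFileType, outFileType):
    infile = input('Please enter a {} filename to parse: '.format(inFileType))
    outfile = input('Please enter a {} filename to output: '.format(outFileType))
    if json_2_yaml(infile, outfile):
        print('Your file has been parsed.\n\n')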
Since your real file has literal block scalars, you have to use some magic to get those back.
Include the following functions, which walk the tree, recursing into dict values and list elements, and convert, in place (hence no return value), any string with an embedded newline to a type that is written out to YAML as a literal block style scalar:
from ruamel.yaml.scalarstring import PreservedScalarString, SingleQuotedScalarString
from ruamel.yaml.compat import string_types, MutableMapping, MutableSequence


def preserve_literal(s):
    return PreservedScalarString(s.replace('\r\n', '\n').replace('\r', '\n'))


def walk_tree(base):
    if isinstance(base, MutableMapping):
        for k in base:
            v = base[k]  # type: Text
            if isinstance(v, string_types):
                if '\n' in v:
                    base[k] = preserve_literal(v)
                elif '${' in v or ':' in v:
                    base[k] = SingleQuotedScalarString(v)
            else:
                walk_tree(v)
    elif isinstance(base, MutableSequence):
        for idx, elem in enumerate(base):
            if isinstance(elem, string_types):
                if '\n' in elem:
                    base[idx] = preserve_literal(elem)
                elif '${' in elem or ':' in elem:
                    base[idx] = SingleQuotedScalarString(elem)
            else:
                walk_tree(elem)
And then do
walk_tree(datamap)
after you load the data from JSON.
With all of the above you should have only one line that differs in your Wordpress.yaml file.
function yaml_validate {
  python -c 'import sys, yaml, json; yaml.safe_load(sys.stdin.read())'
}

function yaml2json {
  python -c 'import sys, yaml, json; print(json.dumps(yaml.safe_load(sys.stdin.read())))'
}

function yaml2json_pretty {
  python -c 'import sys, yaml, json; print(json.dumps(yaml.safe_load(sys.stdin.read()), indent=2, sort_keys=False))'
}

function json_validate {
  python -c 'import sys, yaml, json; json.loads(sys.stdin.read())'
}

function json2yaml {
  python -c 'import sys, yaml, json; print(yaml.dump(json.loads(sys.stdin.read())))'
}
More useful Bash tricks at http://github.com/frgomes/bash-scripts
I have created a simple API with FastAPI, and I want to export the output to a text file (.txt).
This is simplified code:
import sys
from clases.sequence import Sequence
from clases.read_file import Read_file
from fastapi import FastAPI

app = FastAPI()


@app.get("/DNA_toolkit")
def sum(input: str):  # pass the sequence in, this time as a query param
    DNA = Sequence(input)  # get the result (i.e., 4)
    return {"Length": DNA.length(),  # return the response
            "Reverse": DNA.reverse(),
            "complement": DNA.complement(),
            "Reverse and complement": DNA.reverse_and_complement(),
            "gc_percentage": DNA.gc_percentage()
            }
And this is the output
{"Length":36,"Reverse":"TTTTTTTTTTGGGGGGGAAAAAAAAAAAAAAAATAT","complement":"ATATTTTTTTTTTTTTTTTCCCCCCCAAAAAAAAAA","Reverse and complement":"AAAAAAAAAACCCCCCCTTTTTTTTTTTTTTTTATA","gc_percentage":5.142857142857143}
The file I would like to get
Length 36
Reverse TTTTTTTTTTGGGGGGGAAAAAAAAAAAAAAAATAT
complement ATATTTTTTTTTTTTTTTTCCCCCCCAAAAAAAAAA
Reverse and complement AAAAAAAAAACCCCCCCTTTTTTTTTTTTTTTTATA
Is there a simple way to do this? This is my first time working with APIs, and I don't even know how feasible this is.
dict1 = {"Length": 36, "Reverse": "TTTTTTTTTTGGGGGGGAAAAAAAAAAAAAAAATAT", "complement": "ATATTTTTTTTTTTTTTTTCCCCCCCAAAAAAAAAA", "Reverse and complement": "AAAAAAAAAACCCCCCCTTTTTTTTTTTTTTTTATA", "gc_percentage": 5.142857142857143}

with open("output.txt", "w") as data:
    for k, v in dict1.items():
        append_data = k + " " + str(v)
        data.write(append_data)
        data.write("\n")
Output:
Length 36
Reverse TTTTTTTTTTGGGGGGGAAAAAAAAAAAAAAAATAT
complement ATATTTTTTTTTTTTTTTTCCCCCCCAAAAAAAAAA
Reverse and complement AAAAAAAAAACCCCCCCTTTTTTTTTTTTTTTTATA
gc_percentage 5.142857142857143
You can use the open function to create a new file and write your output. And as @Blackgaurd told you, this isn't a code-writing service.
Also, I wrote this code really quickly, so some syntax errors may occur.
import sys
import datetime
from clases.sequence import Sequence
from clases.read_file import Read_file
from fastapi import FastAPI

app = FastAPI()


@app.get("/DNA_toolkit")
def sum(input: str):  # pass the sequence in, this time as a query param
    DNA = Sequence(input)  # get the result (i.e., 4)
    res = {"Length": DNA.length(),  # build the response
           "Reverse": DNA.reverse(),
           "complement": DNA.complement(),
           "Reverse and complement": DNA.reverse_and_complement(),
           "gc_percentage": DNA.gc_percentage()
           }

    # with open('result.txt', 'w+') as resFile:
    #     for i in res:
    #         resFile.write(i + " " + str(res[i]) + "\n")
    # Uncomment the above if you don't want to save the result into a
    # file with a unique id; otherwise go with the method I wrote below...

    filename = str(datetime.datetime.now().date()) + '_' + str(datetime.datetime.now().time()).replace(':', '.')
    with open(filename + '.txt', 'w+') as resFile:
        for i in res:
            resFile.write(i + " " + str(res[i]) + "\n")

    return res  # return the response
I'm going to assume that you have already got your data somehow by calling your API.
# data = requests.get(...).json()
# save to file:
with open("DNA_insights.txt", 'w') as f:
    for k, v in data.items():
        f.write(f"{k}: {v}\n")
I have a file, memory.txt, and I want to store an instance of the class Weapon() in a dictionary, on the second line.
with open(memorypath(), "r") as f:
    lines = f.readlines()

inv = inventory()
if "MAINWEAPON" not in inv or inv["MAINWEAPON"] == "":
    inv["MAINWEAPON"] = f"""Weapon(sw, 0, Ability(0, "0"), ["{name}'s first weapon."], dmg=30, cc=20, str=15)"""
lines[1] = str(inv) + "\n"

with open(memorypath(), "w") as f:
    f.writelines(lines)
(inventory and memorypath are from another file I have for utility functions)
Though, with what I have, if I get inv["MAINWEAPON"] I'll just get the string, not the class instance. And I have to store it as a string, or else I'll get something like <__main__.Weapon object at (some hexadecimal address)>.
How do I get the class itself upon getting inv["MAINWEAPON"]?
Another thing: I feel like I'm getting confused with newlines, because memory.txt has 6 lines but gets shortened to 5. Please tell me if I'm doing anything wrong.
If you have a class, you can represent it as a dict and save it in JSON format.
class Cat:
    name: str

    def __init__(self, name: str):
        self.name = name

    def dict(self):
        return {'name': self.name}

    @classmethod
    def from_dict(cls, d):
        return cls(name=d['name'])
Now you can save the class as JSON to a file like this:
import json

cat = Cat('simon')
with open('cat.json', 'w') as f:
    json.dump(cat.dict(), f)
And you can load the json again like this:
with open('cat.json', 'r') as f:
    d = json.load(f)
    cat = Cat.from_dict(d)
Update
Since Python 3.7 it has been possible to use dataclasses, and here is an example of how you can use them to save classes in JSON format.
If you want to use the JSON file as a database and be able to append new entities to it, you will have to load the file into memory, append the new data, and finally overwrite the old JSON file. The code below does exactly that.
from dataclasses import dataclass, asdict
import json


@dataclass
class Cat:
    name: str


def load_cats() -> list[Cat]:
    try:
        with open('cats.json', 'r') as fd:
            return [Cat(**x) for x in json.load(fd)]
    except FileNotFoundError:
        return []


def save_cat(c):
    data = [asdict(x) for x in load_cats() + [c]]
    with open('cats.json', 'w') as fd:
        json.dump(data, fd)


c = Cat(name='simon')
save_cat(c)
cats = load_cats()
print(cats)
The simplest approach I can suggest would be dataclasses.asdict, as mentioned; or else, use a serialization library that supports dataclasses. There are a lot of good ones out there, but for this purpose I might suggest dataclass-wizard. Further, if you want to transform an arbitrary JSON object into a dataclass structure, you can use the included CLI tool. When serializing, it will automatically apply a key transform (snake_case to camelCase), but this is easily customizable as well.
Disclaimer: I am the creator (and maintainer) of this library.
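A rough sketch of how that library is typically used (the JSONWizard mixin and its from_dict/to_dict methods are recalled from the library's documentation; treat the exact names as assumptions and check the project docs):

from dataclasses import dataclass

from dataclass_wizard import JSONWizard  # assumed import path


@dataclass
class Cat(JSONWizard):
    name: str


# from_dict / to_dict come from the mixin (assumed API)
cat = Cat.from_dict({'name': 'simon'})
print(cat.to_dict())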
I want to make a key-value store in JSON. Everything should work through arguments entered at the console. That is, the data is first written to a file, and then it must be read from there.
Input: python storage.py --key key_name --value value_name
Output: python storage.py --key key_name
The function that handles the arguments and the function that writes data both work. But I have a problem with the file-reading function. I need to print the value for a given key, or the values if there are several.
The recorded JSON looks something like this:
{"key": "Pepe", "value": "Pepeyaya"}{"key": "PepeHug", "value": "KekeHug"}{"key": "Pepega", "value": "Kekega"}{"key": "Pepe", "value": "Keke"}
I tried reading the file like this:
data = json.loads(f.read())
But the error is exactly the same as the one shown below.
In other similar topics I saw that the "dictionaries" in JSON are written into a list. I tried something like this:
data = json.loads([f.read()])
Result:
TypeError: the JSON object must be str, bytes or bytearray, not list
Also:
data = json.load([f])
Result:
AttributeError: 'list' object has no attribute 'read'
I tried to change the recording function, but I can't write everything into the pre-created list; everything is written to the right of it. Something like this:
[]{"key": "Pepe", "value": "Pepeyaya"}{"key": "PepeHug", "value": "KekeHug"}{"key": "Pepega", "value": "Kekega"}{"key": "Pepe", "value": "Keke"}
Code:
import os
import tempfile
import json
import sys


def create_json(path):
    with open(path, mode='a', encoding='utf-8') as f:
        json.dump([], f)


def add(key, value, path):
    with open(path, mode='a', encoding='utf-8') as f:
        entry = {'key': key, 'value': value}
        json.dump(entry, f)


def read(a_key, path):
    read_result = ""
    with open(path) as f:
        data = json.load(f)
        print(data)
        my_list = data
        for i in my_list:
            for key, value in i.items():
                if key == a_key:
                    read_result += value + ", "
                    print(value)


def main():
    storage_path = os.path.join(tempfile.gettempdir(), 'storage.json')
    if sys.argv[1] == "--key":
        arg_key = sys.argv[2]
        if len(sys.argv) <= 3:
            read(arg_key, storage_path)
        elif sys.argv[3] == "--value":
            arg_value = sys.argv[4]
            add(arg_key, arg_value, storage_path)
        else:
            print("Please enter valid arguments")
    else:
        print("Please enter valid arguments")


if __name__ == '__main__':
    main()
In general, with the attached code, I now get this error:
json.decoder.JSONDecodeError: Extra data: line 1 column 39 (char 38)
What I need is that, on the request
python storage.py --key Pepe
I get back the values stored under Pepe (Pepeyaya and Keke in the example above).
This is a basic storage method. It is very inefficient for large JSON files, but it's an example that shows how you can do the job.
import os
import sys
import json

# storage.py key_name value
key = sys.argv[1]
value = sys.argv[2]

data_path = "data.json"
if os.path.isfile(data_path):
    with open("data.json") as target:
        json_data = json.load(target)
else:
    json_data = {}

json_data[key] = value
with open("data.json", "w") as target:
    json.dump(json_data, target)
In your case the problem is caused by the append flag when you open the file. If you really want to keep appending to a single JSON object instead, you would need to delete the trailing '}' of the existing JSON, append a ', "key": value' item, and then add the '}' character back again.
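A minimal sketch of that in-place append idea, assuming the file already ends with '}' and already holds at least one entry (so the leading comma is valid); the load-modify-rewrite approach above is usually simpler and safer:

import json

def append_entry(path, key, value):
    with open(path, 'rb+') as f:
        f.seek(-1, 2)                 # position on the last byte
        if f.read(1) != b'}':
            raise ValueError('file does not end with a JSON object')
        f.seek(-1, 2)
        f.truncate()                  # drop the closing brace
        new_item = ', {}: {}'.format(json.dumps(key), json.dumps(value))
        f.write(new_item.encode('utf-8') + b'}')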
I have a CSV file which contains hundreds of thousands of rows, and below are some sample lines:
1,Ni,23,28-02-2015 12:22:33.2212-02
2,Fi,21,28-02-2015 12:22:34.3212-02
3,Us,33,30-03-2015 12:23:35-01
4,Uk,34,31-03-2015 12:24:36.332211-02
I need to take the last column of the CSV data, which is in the wrong datetime format, and convert it to the default datetime format ("YYYY-MM-DD hh:mm:ss[.nnn]").
I have tried the following script to get the lines from it and write them into a flow file.
import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback


class PyStreamCallback(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.readLines(inputStream, StandardCharsets.UTF_8)
        for line in text[1:]:
            outputStream.write(line + "\n")


flowFile = session.get()
if flowFile != None:
    flowFile = session.write(flowFile, PyStreamCallback())
    flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename'))
    session.transfer(flowFile, REL_SUCCESS)
but I am not able to find a way to convert it to the output below:
1,Ni,23,28-02-2015 12:22:33.221
2,Fi,21,29-02-2015 12:22:34.321
3,Us,33,30-03-2015 12:23:35
4,Uk,34,31-03-2015 12:24:36.332
I have checked for solutions with my friend (Google) and was still not able to find one.
Can anyone guide me on converting this input data into my required output?
In this transformation the unnecessary data is located at the end of each line, so it's really easy to manage the transform task with a regular expression.
^(.*:\d\d)((\.\d{1,3})(\d*))?(-\d\d)?
Check the regular expression and explanation here:
https://regex101.com/r/sAB4SA/2
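For a quick local check of that expression outside NiFi, here is a small Python sketch using two of the sample lines from the question:

import re

pattern = re.compile(r'^(.*:\d\d)((\.\d{1,3})(\d*))?(-\d\d)?')

for line in ['1,Ni,23,28-02-2015 12:22:33.2212-02',
             '3,Us,33,30-03-2015 12:23:35-01']:
    m = pattern.search(line)
    # keep the timestamp up to seconds plus at most three fractional digits
    print(m.group(1) + (m.group(3) or ''))
# prints:
# 1,Ni,23,28-02-2015 12:22:33.221
# 3,Us,33,30-03-2015 12:23:35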
Since you have a large file, it is better not to load all of it into memory. The following call loads the whole file into memory:
IOUtils.readLines(inputStream, StandardCharsets.UTF_8)
It is better to iterate line by line.
So this code is for the NiFi ExecuteScript processor with the Python (Jython) language:
import sys
import re
import traceback
from org.apache.commons.io import IOUtils
from org.apache.nifi.processor.io import StreamCallback
from org.python.core.util import StringUtil
from java.lang import Class
from java.io import BufferedReader
from java.io import InputStreamReader
from java.io import OutputStreamWriter


class TransformCallback(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        try:
            writer = OutputStreamWriter(outputStream, "UTF-8")
            reader = BufferedReader(InputStreamReader(inputStream, "UTF-8"))
            line = reader.readLine()
            p = re.compile('^(.*:\d\d)((\.\d{1,3})(\d*))?(-\d\d)?')
            while line != None:
                # print line
                match = p.search(line)
                writer.write(match.group(1) + (match.group(3) if match.group(3) != None else ''))
                writer.write('\n')
                line = reader.readLine()
            writer.flush()
            writer.close()
            reader.close()
        except:
            traceback.print_exc(file=sys.stdout)
            raise


flowFile = session.get()
if flowFile != None:
    flowFile = session.write(flowFile, TransformCallback())
    # Finish by transferring the FlowFile to an output relationship
    session.transfer(flowFile, REL_SUCCESS)
And since the question is about NiFi, here are alternatives that may be easier.
The same code as above, but in Groovy for the NiFi ExecuteScript processor:
def ff = session.get()
if(!ff)return
ff = session.write(ff, {rawIn, rawOut->
    // ## transform streams into reader and writer
    rawIn.withReader("UTF-8"){reader->
        rawOut.withWriter("UTF-8"){writer->
            reader.eachLine{line, lineNum->
                if(lineNum>1) { // # skip the first line
                    // ## let us use a regular expression to transform each line
                    writer << line.replaceAll( /^(.*:\d\d)((\.\d{1,3})(\d*))?(-\d\d)?/ , '$1$3' ) << '\n'
                }
            }
        }
    }
} as StreamCallback)
session.transfer(ff, REL_SUCCESS)
ReplaceText processor
And if a regular expression is OK, the easiest way in NiFi is a ReplaceText processor, which can do a regular-expression replace line by line.
In this case you don't need to write any code; just build the regular expression and configure your processor correctly.
This is just pure Jython. It is an example that can be adapted to the OP's needs.
Define a datetime parser for this CSV file:
from datetime import datetime


def parse_datetime(dtstr):
    mydatestr = '-'.join(dtstr.split('-')[:-1])
    try:
        return datetime.strptime(mydatestr, '%d-%m-%Y %H:%M:%S.%f').strftime('%d-%m-%Y %H:%M:%S.%f')[:-3]
    except ValueError:
        return datetime.strptime(mydatestr, '%d-%m-%Y %H:%M:%S').strftime('%d-%m-%Y %H:%M:%S')
My test.csv includes data like this (2015 didn't have 29 Feb, so I had to change the OP's example):
1,Ni,23,27-02-2015 12:22:33.2212-02
2,Fi,21,28-02-2015 12:22:34.3212-02
3,Us,33,30-03-2015 12:23:35-01
4,Uk,34,31-03-2015 12:24:36.332211-02
Now the solution:
with open('test.csv') as fi:
    for line in fi:
        line_split = line.split(',')
        out_line = ', '.join(word if i < 3 else parse_datetime(word) for i, word in enumerate(line_split))
        # print(out_line)
        # you can write this out_line to a file here.
Printing out_line looks like this:
1, Ni, 23, 27-02-2015 12:22:33.221
2, Fi, 21, 28-02-2015 12:22:34.321
3, Us, 33, 30-03-2015 12:23:35
4, Uk, 34, 31-03-2015 12:24:36.332
You can get them with a regex:
(\d\d-\d\d-\d\d\d\d\ \d\d:\d\d:)(\d+(?:\.\d+)*)(-\d\d)$
Then just replace group #2 with a rounded version of itself.
See regex example at regexr.com
You could even do it "nicer" by capturing every single value with its own group, putting them into a datetime.datetime object, and printing it from there, but IMHO that would be overkill in maintainability and lose you too much performance.
I had no way to test this code:
import re
...
pattern = r'^(.{25})(\d+(?:\.\d+)*)(-\d\d)$'  # used a fixed offset for simplicity
....
for line in text[1:]:
    match = re.search(pattern, line)
    line = match.group(1) + str(round(float(match.group(2)), 3)) + match.group(3)
    outputStream.write(line + "\n")
I would like to POST multipart/form-data encoded data.
I have found an external module that does it: http://atlee.ca/software/poster/index.html
However, I would rather avoid this dependency. Is there a way to do this using the standard library?
Thanks.
The standard library does not currently support that. There is a cookbook recipe that includes a fairly short piece of code that you may just want to copy, though, along with long discussions of alternatives.
It's an old thread, but still a popular one, so here is my contribution using only standard modules.
The idea is the same as here, but it supports Python 2.x and Python 3.x.
It also has a body generator to avoid unnecessary memory usage.
import codecs
import mimetypes
import sys
import uuid

try:
    import io
except ImportError:
    pass  # io is required in python3 but not available in python2


class MultipartFormdataEncoder(object):
    def __init__(self):
        self.boundary = uuid.uuid4().hex
        self.content_type = 'multipart/form-data; boundary={}'.format(self.boundary)

    @classmethod
    def u(cls, s):
        if sys.hexversion < 0x03000000 and isinstance(s, str):
            s = s.decode('utf-8')
        if sys.hexversion >= 0x03000000 and isinstance(s, bytes):
            s = s.decode('utf-8')
        return s

    def iter(self, fields, files):
        """
        fields is a sequence of (name, value) elements for regular form fields.
        files is a sequence of (name, filename, file-type) elements for data to be uploaded as files.
        Yield the body's chunks as bytes.
        """
        encoder = codecs.getencoder('utf-8')
        for (key, value) in fields:
            key = self.u(key)
            yield encoder('--{}\r\n'.format(self.boundary))
            yield encoder(self.u('Content-Disposition: form-data; name="{}"\r\n').format(key))
            yield encoder('\r\n')
            if isinstance(value, int) or isinstance(value, float):
                value = str(value)
            yield encoder(self.u(value))
            yield encoder('\r\n')
        for (key, filename, fd) in files:
            key = self.u(key)
            filename = self.u(filename)
            yield encoder('--{}\r\n'.format(self.boundary))
            yield encoder(self.u('Content-Disposition: form-data; name="{}"; filename="{}"\r\n').format(key, filename))
            yield encoder('Content-Type: {}\r\n'.format(mimetypes.guess_type(filename)[0] or 'application/octet-stream'))
            yield encoder('\r\n')
            with fd:
                buff = fd.read()
                yield (buff, len(buff))
            yield encoder('\r\n')
        yield encoder('--{}--\r\n'.format(self.boundary))

    def encode(self, fields, files):
        body = io.BytesIO()
        for chunk, chunk_len in self.iter(fields, files):
            body.write(chunk)
        return self.content_type, body.getvalue()
Demo
# some utf8 key/value pairs
fields = [('প্রায়', 42), ('bar', b'23'), ('foo', 'ން:')]
files = [('myfile', 'image.jpg', open('image.jpg', 'rb'))]
# iterate and write chunk in a socket
content_type, body = MultipartFormdataEncoder().encode(fields, files)
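As a follow-up, a minimal sketch of actually POSTing the encoded body with the standard library on Python 3 (the URL is a placeholder):

import urllib.request

content_type, body = MultipartFormdataEncoder().encode(fields, files)
req = urllib.request.Request('http://example.com/upload', data=body)
req.add_header('Content-Type', content_type)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.reason)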
You can't do this quickly with the stdlib. However, see the MultiPartForm class in this PyMOTW. You can probably use or modify that to accomplish whatever you need:
PyMOTW: urllib2 - Library for opening URLs.