Writing a set to an output file in Python

I usually use json for lists, but it doesn't work for sets. Is there a similar function to write a set to an output file f? Something like this, but for sets:
f = open('kos.txt', 'w')
json.dump(list, f)
f.close()

JSON is not a Python-specific format. It knows about lists and dictionaries, but not sets or tuples.
But if you want to persist a pure Python data structure, you could use string conversion.
with open('kos.txt', 'w') as f:
    f.write(str({1, 3, (3, 5)}))  # a set holding numbers and a tuple
Then read it back again using ast.literal_eval:
import ast

with open('kos.txt', 'r') as f:
    my_set = ast.literal_eval(f.read())
This also works for lists of sets, nested lists with sets inside, and so on, as long as the data can be evaluated as a literal and no set is empty (a known limitation of literal_eval, since the empty set has no literal form). So almost any basic Python object structure serialized with str can be parsed back this way.
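For instance, a nested structure survives the same round trip. A minimal sketch (the file name is arbitrary):
import ast

data = [{1, 2}, [3, {4, (5, 6)}]]  # list holding sets, a nested list, a tuple

with open('nested.txt', 'w') as f:
    f.write(str(data))

with open('nested.txt', 'r') as f:
    restored = ast.literal_eval(f.read())

print(restored == data)  # True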
For the empty-set case there is a kludge to apply, since the string set() cannot be parsed back:
import ast

with open('kos.txt', 'r') as f:
    ser = f.read()
my_set = set() if ser == str(set()) else ast.literal_eval(ser)
You could also use the pickle module, but it produces binary data, which is more "opaque". There is also a way to use json (see How to JSON serialize sets?). But for your needs I would stick with str/ast.literal_eval.
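The json route mentioned above amounts to converting the set to a list on the way out and back to a set on the way in. A minimal sketch (the file name is arbitrary):
import json

s = {1, 3, 5}

with open('kos.json', 'w') as f:
    json.dump(list(s), f)  # JSON has no set type, so store a list

with open('kos.json', 'r') as f:
    s2 = set(json.load(f))  # rebuild the set on load

print(s2 == s)  # True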

Using ast.literal_eval(f.read()) will raise ValueError: malformed node or string if an empty set was written to the file. I think pickle is the better choice here: if the set is empty, it gives no error.
import pickle

s = set()

# To save to a file
with open('kos.txt', 'wb') as f:
    pickle.dump(s, f)

# To read it back from the file
with open('kos.txt', 'rb') as f:
    my_set = pickle.load(f)

Related

Deserialize json array directly to a set in python

Is there a way to deserialize a JSON array directly to a set?
data.json (yes, this is just a JSON array):
["a","b","c"]
Notice that the JSON array contains unique elements.
Currently my workflow is the following.
open_file = open(path, 'r')
json_load = json.load(open_file) # this returns a list
return set(json_load) # which I am then converting to a set.
Is there a way to do something like this?
open_file = open(path, 'r')
return json.load(open_file, **arguments) # this returns a set.
Also is there any other way to go about doing it without the json module perhaps? Surely I am not the first one to need a set decoder.
No. You would have to subclass the json module's JSONDecoder and override the method that creates the object to do it yourself.
And it is not worth the trouble: JSON arrays map naturally to Python lists, since they are ordered and may contain duplicates, so a set cannot faithfully represent a JSON array. It is therefore not the job of a JSON decoder to produce a set.
Converting is the best you can do. You could create a function and call it when you need:
def json_load_set(f):
    return set(json.load(f))
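Used like this (a small sketch; data.json is the file from the question):
import json

def json_load_set(f):
    return set(json.load(f))

with open('data.json', 'r') as open_file:
    items = json_load_set(open_file)

print(items)  # {'a', 'b', 'c'} (set order is arbitrary)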

How to alter the pickle database in Python?

I have a pickle database which I am reading using the following code
import pickle, pprint
import sys

def main(datafile):
    with open(datafile, 'rb') as fin:
        data = pickle.load(fin)
    pprint.pprint(data)

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Pickle database file must be given as an argument.")
        sys.exit()
    main(sys.argv[1])
I recognised that it contained a dictionary. I want to delete/edit some values from this dictionary and make a new pickle database.
I am storing the output of this program in a file (so that I can read the elements in the dictionary and choose which ones to delete). How do I read this file (pprinted data structures) and create a pickle database from it?
As stated in the Python docs, pprint is guaranteed to produce output that is valid Python syntax, as long as the objects are representable as Python literals. So what you are doing is fine, as long as the data consists of dicts, lists, numbers, strings, etc. In particular, if some value deep down in the dict is not representable as a literal (e.g. a custom object), this will fail.
Now reading the output file back should be quite straightforward:
import ast

with open('output.txt') as fo:
    data = fo.read()
obj = ast.literal_eval(data)
This is assuming that you keep one object per file and nothing more.
Note that you may use built-in eval instead of ast.literal_eval but that is quite unsafe since eval can run arbitrary Python code.
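Putting the pieces together, the whole round trip might look like this. A hedged sketch (the file names are made up for illustration; edit output.txt by hand between the two steps):
import ast
import pickle
import pprint

# 1. Dump the pickled dict to a human-readable text file
with open('data.pickle', 'rb') as fin:
    data = pickle.load(fin)
with open('output.txt', 'w') as fo:
    pprint.pprint(data, stream=fo)

# ... edit output.txt by hand, deleting or changing entries ...

# 2. Read the edited text back and pickle it again
with open('output.txt') as fo:
    edited = ast.literal_eval(fo.read())
with open('data_new.pickle', 'wb') as fout:
    pickle.dump(edited, fout)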

storing and retrieving lists from files

I have a very big list of lists. One of my programs does this:
power_time_array = [[1, 2, 3], [1, 2, 3]]  # in a short form

with open(file_name, 'w') as out:
    out.write(str(power_time_array))
Now another independent script need to read this list of lists back.
How do I do this?
What I have tried:
with open(file_name, 'r') as app_trc_file:
    power_trace_of_application.append(app_trc_file.read())
Note: power_trace_application is a list of list of lists.
This stores it as a list with one element as a huge string.
How does one efficiently store and retrieve big lists or list of lists from files in python?
You can serialize your list to JSON and deserialize it back. This doesn't really change the on-disk representation, since your list is already valid JSON:
import json

power_time_array = [[1, 2, 3], [1, 2, 3]]  # in a short form

with open(file_name, 'w') as out:
    json.dump(power_time_array, out)
and then just read it back:
with open(file_name, 'r') as app_trc_file:
    power_trace_of_application = json.load(app_trc_file)
For speed, you can use a JSON library with a C backend (such as ujson). This can also handle custom objects, provided you supply an encoder for them (for example the default argument to json.dump), as sketched below.
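A minimal sketch of that custom-object case, using the standard json module's default hook (the Sample class is made up for illustration):
import json

class Sample:
    def __init__(self, watts, seconds):
        self.watts = watts
        self.seconds = seconds

samples = [Sample(1.5, 0.1), Sample(2.0, 0.2)]

# default= is called for any object json cannot serialize natively
with open('samples.json', 'w') as out:
    json.dump(samples, out, default=lambda o: o.__dict__)

with open('samples.json', 'r') as f:
    raw = json.load(f)  # a list of dicts
    samples2 = [Sample(**d) for d in raw]  # rebuild the objects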
Use the json library to efficiently read and write structured information (in the form of JSON) to a text file.
To write data to the file, use json.dump(), and to retrieve JSON data from the file, use json.load().
It will be faster:
from ast import literal_eval

power_time_array = [[1, 2, 3], [1, 2, 3]]

with open(file_name, 'w') as out:
    out.write(repr(power_time_array))

with open(file_name, 'r') as app_trc_file:
    power_trace_of_application.append(literal_eval(app_trc_file.read()))

Writing and reading a list from a file in Python

I want to save a list in Python to a file so that it can be read back later and assigned to a list variable.
As an example:
list = [42,54,24,65]
This should be written to the file as
[42,54,24,65] or
list = [42,54,24,65]
and should be readable later from Python so it can be assigned back to a list variable.
Right now I'm using the following code.
f = open('list_file', 'w')
f.write(values)
f.close()
This gives me an error
TypeError: write() argument must be str, not list
How can I fix this?
Thanks
You could also do it with pickle. It works similarly to json, but it can serialize a broader set of Python objects than json. json serializes to text and is human readable, while pickle serializes to bytes and is not human readable.
Consider this example:
import pickle, json

list_ = [42, 54, 24, 65]

with open('list_file.pickle', 'wb') as fp, open('list_file.json', 'w') as fj:
    pickle.dump(list_, fp)
    json.dump(list_, fj)

with open('list_file.pickle', 'rb') as fp, open('list_file.json', 'r') as fj:
    list_unpickled = pickle.load(fp)
    list_from_json = json.load(fj)

print(list_unpickled)  # [42, 54, 24, 65]
print(list_from_json)  # [42, 54, 24, 65]
Notice that with pickle you have to open the files with the 'b' for binary reading/writing.
A side note: do not shadow built-in names such as list with your own variables.
According to section 12.1.4 of the documentation, the following types can be pickled:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see the section Pickling Class Instances for details); a small sketch of this last case follows the list.
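For example, an instance of a module-level class whose attributes are themselves picklable round-trips without trouble. A minimal sketch (the Fruit class and file name are made up for illustration):
import pickle

class Fruit:  # defined at the top level of a module
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

banana = Fruit('banana', 120)

with open('fruit.pickle', 'wb') as f:
    pickle.dump(banana, f)  # the instance's __dict__ is picklable

with open('fruit.pickle', 'rb') as f:
    restored = pickle.load(f)

print(restored.name, restored.weight)  # banana 120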
If you just have a simple list, then you can use JSON and the json module.
import json

data = [42, 54, 24, 65]

with open('output.txt', 'w') as f_out:
    json.dump(data, f_out)

with open('output.txt', 'r') as f_in:
    data2 = json.load(f_in)

print(data2)  # [42, 54, 24, 65]
And the contents of output.txt look like
[42, 54, 24, 65]
Map all values in the list to strings first, since the write method only supports strings, e.g. list = list(map(str, list)).
Also, calling a variable list is bad practice; use something like ls or any other name that does not shadow a built-in. If you want to use the values later, you can delimit them with spaces: write them with f.write(" ".join(list)), then read them back into a list with list = f.readline().split(). This keeps the values in the list as strings; to get them back to ints, map again: list = list(map(int, list)). A sketch of this round trip follows below.
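A minimal sketch of that space-delimited round trip, using ls instead of list as suggested (the file name is arbitrary):
ls = [42, 54, 24, 65]

# Write the values as one space-delimited line
with open('list_file', 'w') as f:
    f.write(" ".join(map(str, ls)))

# Read them back and convert the strings to ints again
with open('list_file', 'r') as f:
    ls2 = list(map(int, f.readline().split()))

print(ls2)  # [42, 54, 24, 65]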
According to the error, your code is passing a list to f.write(); you need to pass a string.
Assuming you want to write one value per line, the code below should work.
f = open('list_file', 'w')
for value in list:
    f.write(str(value) + "\n")
f.close()
To read later you can just open file again and read using this code:
f = open('list_file', 'r')
for line in f:
    print(line.strip())
f.close()
Turning my comment into an answer:
Try Saving and loading objects and using pickle:
import pickle

filehandler = open("Fruits.obj", "wb")
pickle.dump(banana, filehandler)  # banana is whatever object you want to save
filehandler.close()
To load the data, use:
file = open("Fruits.obj",'r')
object_file = pickle.load(file)

How do I encode/decode a dictionary in Python 3 to/from an external file?

I'm relatively new to encoding and decoding; in fact I don't have any experience with it at all.
I was wondering: how would I encode a dictionary in Python 3 into an unreadable format that would prevent someone from modifying it outside the program?
Likewise, how would I then read from that file and decode the dictionary back?
My test code right now only writes to and reads from a plain text file.
import ast

myDict = {}

# Writer
fileModifier = open('file.txt', 'w')
fileModifier.write(str(myDict))
fileModifier.close()

# Reader
fileModifier = open('file.txt', 'r')
myDict = ast.literal_eval(fileModifier.read())
fileModifier.close()
Depending on what your dictionary is holding, you can use an encoding library like json or pickle (useful for storing more complex Python data structures).
Here is an example using json; to use pickle, just replace the instances of json with pickle and open the files in binary mode ('wb'/'rb').
import json

myDict = {}

# Writer
fileModifier = open('file.txt', 'w')
json.dump(myDict, fileModifier)
fileModifier.close()

# Reader
fileModifier = open('file.txt', 'r')
myDict = json.load(fileModifier)
fileModifier.close()
The typical thing to use here is either the json or pickle module (both in the standard library). The process is called "serialization". pickle can serialize almost arbitrary Python objects, whereas json can only serialize basic types/objects (integers, floats, strings, lists, dictionaries). json output is human readable, whereas pickle files aren't.
An alternative to encoding/decoding is simply to use a file as a dict: Python's shelve module does exactly this. It uses a file as a database and provides a dict-like interface to it.
It has some limitations, for example keys must be strings, and it is obviously slower than a normal dict since it performs I/O operations.
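A minimal sketch of the shelve approach (the file name and keys are made up for illustration):
import shelve

# Write: the shelf behaves like a dict that is persisted to disk
with shelve.open('mydata') as db:
    db['scores'] = {'alice': 10, 'bob': 7}

# Read it back, e.g. in a later run of the program
with shelve.open('mydata') as db:
    scores = db['scores']

print(scores)  # {'alice': 10, 'bob': 7}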
