How to sort json in the dump statement? - python

I read this page at W3Schools and noticed it shows you can dump and sort keys in alphabetical order. Is it possible to sort by time instead?
My dump statement:
with open("./warns.json", "w") as f:
    json.dump(warns, f)
How would I dump it and sort it by date?

From https://docs.python.org/3/library/json.html#json.dump :
If sort_keys is true (default: False), then the output of dictionaries will be sorted by key.
Sorting by date is different, as it depends on the structure of your JSON. If you really need it, you can modify the encoder:
To use a custom JSONEncoder subclass (e.g. one that overrides the default() method to serialize additional types), specify it with the cls kwarg; otherwise JSONEncoder is used.
This mechanism is meant for serializing additional types, but you may also be able to use it to control the output order.
Another solution is to use a sorted array in the JSON, to ensure the order is respected.
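As a minimal sketch of that sorted-array approach, assuming warns is a list of records that each carry an ISO-formatted "date" key (the key name and format here are illustrative):

import json

warns = [
    {"user": "alice", "date": "2021-03-02T10:15:00"},
    {"user": "bob", "date": "2021-01-15T08:30:00"},
]

warns.sort(key=lambda w: w["date"])  # ISO 8601 strings sort chronologically

with open("./warns.json", "w") as f:
    json.dump(warns, f, sort_keys=True)  # sort_keys orders dict keys, not list order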

Related

How to make SEVERAL types BOTH serializable and deserializable to/from JSON in Python?

I want to save and load dicts, some fields of which are datetimes and ndarrays. How can I make both of these types JSON serializable and deserializable, so that I can use both json.load and json.dump? I want to provide the serializer and deserializer code myself.
For datetime it will be a string in a specified format (datetime.strftime in the serializer and datetime.strptime in the deserializer), and for ndarray it will be a one-line list.
How can I do that? In many of the examples I found, the following shortcomings appear:
1) only serialization is provided (I need deserialization too)
2) only one custom type's serialization is provided (I need two)
How can I accomplish this?
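A minimal sketch of one way to do both directions, using a JSONEncoder subclass for serialization and an object_hook for deserialization; the marker keys ("__datetime__", "__ndarray__") and the date format are assumptions, not a standard:

import json
from datetime import datetime
import numpy as np

FMT = "%Y-%m-%dT%H:%M:%S"

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # serialize datetimes as tagged strings and ndarrays as tagged lists
        if isinstance(obj, datetime):
            return {"__datetime__": obj.strftime(FMT)}
        if isinstance(obj, np.ndarray):
            return {"__ndarray__": obj.tolist()}
        return super().default(obj)

def custom_hook(d):
    # undo the tagging on the way back in
    if "__datetime__" in d:
        return datetime.strptime(d["__datetime__"], FMT)
    if "__ndarray__" in d:
        return np.array(d["__ndarray__"])
    return d

# usage: json.dump(data, f, cls=CustomEncoder)
#        json.load(f, object_hook=custom_hook)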

Accessing Unicode Values in a Python Dictionary

I have a dictionary full of unicode keys/values due to importing JSON through json.loads().
dictionaryName = {u'keyName' : u'valueName'}
I'm trying to access values inside the dictionary as follows:
accessValueName = dictionaryName.get('keyName')
This returns None, presumably because it is looking for the string 'keyName' while the dictionary is full of unicode keys. I tried sticking a 'u' in front of my keyName when making the call, but it still returns None.
accessValueName = dictionaryName.get(u'keyName')
I also found several seemingly outdated methods to convert an entire dictionary from unicode to string values; however, they did not work, and I am not sure I need the entire thing converted anyway.
How can I either convert the entire dictionary from Unicode to String or just access the values using the keyname?
EDIT:
I just realized that I was trying to access a value from a nested dictionary that I did not notice was nested.
The solution is indeed:
accessValueName = dictionaryName.get('keyName')
Dictionaries store values in a hash table, keyed by the hash of the key object.
print(hash(u"example"))
print(hash("example"))
Both calls print the same hash, so the same dictionary value should be accessible with either form of the key.
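For completeness, a tiny illustration of the nested-dictionary pitfall mentioned in the edit (the key names here are made up):

data = {u'outer': {u'keyName': u'valueName'}}

print(data.get('keyName'))                   # None -- 'keyName' is one level down
print(data.get('outer', {}).get('keyName'))  # u'valueName'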

Insert list into SQLite3 cell

I'm new to Python and even newer to SQL, and I have just run into the following problem:
I want to insert a list (or actually, a list containing one or more dictionaries) into a single cell in my SQL database. This is one row of my data:
[a,b,c,[{key1: int, key2: int},{key1: int, key2: int}]]
As the number of dictionaries inside the lists varies, and I want to iterate through the elements of the list later on, I thought it would make sense to keep it in one place (thus not splitting the list into its individual elements). However, when trying to insert the list as-is, I get the following error:
sqlite3.InterfaceError: Error binding parameter 2 - probably unsupported type.
How can this kind of list be inserted into a single cell of my SQL database?
SQLite has no facility for a 'nested' column; you'd have to store your list as text or a binary data blob: serialise it on the way in, deserialise it again on the way out.
How you serialise to text or binary data depends on your use case. JSON (via the json module) could be suitable if your lists and dictionaries contain only text, numbers, booleans and None (with the dictionaries using only strings as keys). JSON is supported by a wide range of other languages, so you keep your data reasonably compatible. Alternatively, you could use pickle, which serialises to a binary format and can handle just about anything Python can throw at it, but it is specific to Python.
You can then register an adapter to handle converting between the serialisation format and Python lists:
import json
import sqlite3

def adapt_list_to_JSON(lst):
    # serialise the Python list to a JSON byte string on the way in
    return json.dumps(lst).encode('utf8')

def convert_JSON_to_list(data):
    # deserialise the JSON byte string back to a Python list on the way out
    return json.loads(data.decode('utf8'))

sqlite3.register_adapter(list, adapt_list_to_JSON)
sqlite3.register_converter("json", convert_JSON_to_list)
Then connect with detect_types=sqlite3.PARSE_DECLTYPES and declare your column type as json, or use detect_types=sqlite3.PARSE_COLNAMES and put [json] in a column alias (SELECT datacol AS "datacol [json]" FROM ...) to trigger the conversion on loading.
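A minimal usage sketch, assuming the adapter and converter above have been registered (the table and column names here are illustrative):

import sqlite3

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE rows (a TEXT, b TEXT, c TEXT, data json)")

# the list in the last position is adapted to a JSON blob automatically
row = ("a", "b", "c", [{"key1": 1, "key2": 2}, {"key1": 3, "key2": 4}])
conn.execute("INSERT INTO rows VALUES (?, ?, ?, ?)", row)

# the declared column type "json" triggers the converter on the way out
(stored,) = conn.execute("SELECT data FROM rows").fetchone()
print(stored)  # a Python list of dicts again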

How to store numerical lookup table in Python (with labels)

I have a scientific model which I am running in Python which produces a lookup table as output. That is, it produces a many-dimensional 'table' where each dimension is a parameter in the model and the value in each cell is the output of the model.
My question is how best to store this lookup table in Python. I am running the model in a loop over every possible parameter combination (using the fantastic itertools.product function), but I can't work out how best to store the outputs.
It would seem sensible to simply store the output as an ndarray, but I'd really like to be able to access the outputs based on the parameter values, not just indices. For example, rather than accessing the values as table[16][5][17][14], I'd prefer to access them using variable names/values, for example:
table[solar_z=45, solar_a=170, type=17, reflectance=0.37]
or something similar to that. It'd be brilliant if I were able to iterate over the values and get their parameter values back - that is, being able to find out that table[16]... corresponds to the outputs for solar_z = 45.
Is there a sensible way to do this in Python?
Why don't you use a database? I have found MongoDB (and the official Python driver, Pymongo) to be a wonderful tool for scientific computing. Here are some advantages:
Easy to install - simply download the executables for your platform (2 minutes tops, seriously).
Schema-less data model
Blazing fast
Provides map/reduce functionality
Very good querying functionalities
So, you could store each entry as a MongoDB entry, for example:
{"_id":"run_unique_identifier",
"param1":"val1",
"param2":"val2" # etcetera
}
Then you could query the entries as you will:
import pymongo

# connect to the "mycollection" collection in the "mydb" database
data = pymongo.MongoClient("localhost", 27017)["mydb"]["mycollection"]
for entry in data.find():  # this will yield all results
    print(entry["param1"])  # do something with param1
Whether or not MongoDB/pymongo are the answer to your specific question, I don't know. However, you could really benefit from checking them out if you are into data-intensive scientific computing.
If you want to access the results by name, you could use a nested Python dictionary instead of an ndarray, and serialize it to a JSON text file using the json module.
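A minimal sketch of that nested-dictionary approach; the parameter names, values, and the run_model() stub below are illustrative placeholders, not part of the question:

import itertools
import json

def run_model(solar_z, solar_a):  # hypothetical stand-in for the real model
    return solar_z + solar_a

solar_z_values = [0, 45, 90]
solar_a_values = [0, 170]

table = {}
for solar_z, solar_a in itertools.product(solar_z_values, solar_a_values):
    table.setdefault(solar_z, {})[solar_a] = run_model(solar_z, solar_a)

value = table[45][170]  # access by parameter value rather than by index

# note: JSON object keys are strings, so numeric keys come back as strings
with open("lookup.json", "w") as f:
    json.dump(table, f)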
One option is to use a numpy ndarray for the data (as you do now), and write a parser function to convert the query values into row/column indices.
For example:
solar_z_dict = {...}  # maps each parameter value to its array index
solar_a_dict = {...}
...
def lookup(dataArray, solar_z, solar_a, type, reflectance):
    return dataArray[solar_z_dict[solar_z],
                     solar_a_dict[solar_a],
                     type_dict[type],
                     reflectance_dict[reflectance]]
You could also build the index expression as a string and eval it, if you want some fields to be given as "None" and translated to ":" (to return the full table along that dimension).
For example, rather than accessing the values as table[16][5][17][14]
I'd prefer to access them somehow using variable names/values
That's what numpy's dtypes are for:
from sys import argv
import numpy as np
dt = [('L', 'float64'), ('T', 'float64'), ('NMSF', 'float64'), ('err', 'float64')]
data = np.loadtxt(argv[1], dtype=dt)
Now you can access the data columns by name: data['L'], data['T'], data['NMSF'].
More info on dtypes:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html

Store django forms.MultipleChoiceField in Models directly

Say I have choices defined as follows:
choices = (('1', 'a'),
           ('2', 'b'),
           ('3', 'c'))
And a form that renders and inputs these values in a MultipleChoiceField,
class Form1(forms.Form):
    field = forms.MultipleChoiceField(choices=choices)
What is the right way to store field in a model?
I can of course loop through forms.cleaned_data['field'] and obtain a value that fits in models.CommaSeparatedIntegerField.
Again, each time I retrieve these values, I will have to loop and convert them back into options.
I think there is a better way; as it stands, I am re-implementing the functionality that CommaSeparatedIntegerField is supposed to provide.
The first thing I would consider is better normalization of your database schema; if a single instance of your model can have multiple values for this field, the field perhaps should be a linked model with a ForeignKey instead.
If you're using Postgres, you could also use an ARRAY field; Django now has built-in support.
If you can't do either of those, then you do basically need to reimplement a (better) version of CommaSeparatedIntegerField. The reason is that CommaSeparatedIntegerField is nothing but a plain CharField whose default formfield representation is a regex-validated text input. In other words, it doesn't do anything that's useful to you.
What you need to write is a custom ListField or MultipleValuesField that expects a Python list and returns a Python list, but internally converts that list to/from a comma-separated string for insertion in the database. Read the documentation on custom model fields; I think in your case you'll want a subclass of CharField with two methods overridden: to_python (convert CSV string to Python list) and get_db_prep_value (convert Python list to CSV string).
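A minimal sketch of such a field, following the two overrides described above; this is untested, and ListField is an illustrative name, not a Django built-in:

from django.db import models

class ListField(models.CharField):
    """Stores a Python list of strings as a comma-separated string."""

    def to_python(self, value):
        # CSV string from the database -> Python list
        if isinstance(value, list):
            return value
        if not value:
            return []
        return value.split(',')

    def get_db_prep_value(self, value, connection=None, prepared=False):
        # Python list -> CSV string for the database
        return ','.join(value)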
I just had this same problem, and my solution, since (as Carl Meyer put it) I don't want a normalized version of this "list of strings", is to just have a CharField in the model. This way the model stores the list of items as a single string. In my case the items are countries.
So the model declaration is just
countries = models.CharField(max_length=XXX)
where XXX is a precalculated value of twice the length of my country list. It's simpler for us to check whether the current country is in this list than to model it as an M2M to a Country table.
