I'm new to Python, and I'm trying to encode a data dict as JSON.
My dict is:
data = { ('analogInput', 18) : [('objectName', 'AI8-Voltage'),
                                ('presentValue', 238.3),
                                ('units', 'Volts')],
         ('analogInput', 3) : [('objectName', 'AI3-Pulse'),
                               ('presentValue', 100),
                               ('units', 'Amp')]
       }
When I try to do foo = json.dumps(data), I get this message:
Fatal error: keys must be str, int, float, bool or None, not tuple
I've searched for answers, but I don't understand how to proceed in my case.
Thanks for any answers.
First of all, not all types can be used for JSON keys.
Keys must be strings, and values must be a valid JSON data type (string, number, object, array, Boolean or null).
For more information, take a look at this.
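For instance, a quick check with the standard json module shows that scalar keys are coerced to strings, while a tuple key raises the error from the question:

import json

print(json.dumps({1: 'a', None: 'b'}))   # {"1": "a", "null": "b"}

try:
    json.dumps({('analogInput', 18): []})
except TypeError as exc:
    print(exc)  # keys must be str, int, float, bool or None, not tuple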
As a feasible solution, I recommend implementing two functions: one that converts your tuples to strings, and one that converts the strings back to tuples. A quite simple example is provided below:
import json

data = { ('analogInput', 18) : [('objectName', 'AI8-Voltage'),
                                ('presentValue', 238.3),
                                ('units', 'Volts')],
         ('analogInput', 3) : [('objectName', 'AI3-Pulse'),
                               ('presentValue', 100),
                               ('units', 'Amp')]
       }

def tuple_to_str(t):
    # It can be implemented with more options
    return str(t[0]) + '_' + str(t[1])

def str_to_tuple(s):
    l = s.split('_')
    # The second item is an int
    l[1] = int(l[1])
    return tuple(l)

if __name__ == "__main__":
    # build a dict of the data with string keys
    s_data = dict()
    for key in data:
        s_data[tuple_to_str(key)] = data[key]
    x = json.dumps(s_data)
    # load the json back, still with string keys
    raw_data = json.loads(x)
    final_data = dict()
    for key in raw_data:
        final_data[str_to_tuple(key)] = raw_data[key]
    # the original tuple keys are restored
    print(final_data)
The error is explicit. In a Python dict, the key can be any hashable type, including a tuple, a frozenset or a frozen dict (but not a list, a set or a dict).
But json.dumps only accepts str, int, float, bool or None keys, and coerces the non-string ones to strings, since JSON object keys are always strings.
Long story short, your input dictionary cannot be directly converted to JSON.
Possible workarounds:
use a different serialization tool. For example, pickle can accept any Python type, but is not portable to non-Python applications. You could also use a custom serialization format, if you write both the serialization and deserialization parts
convert the key to a string. At deserialization time, you would just have to convert the string back to a tuple with ast.literal_eval:
js = json.dumps({str(k): v for k,v in data.items()})
giving: {"('analogInput', 18)": [["objectName", "AI8-Voltage"], ["presentValue", 238.3], ["units", "Volts"]], "('analogInput', 3)": [["objectName", "AI3-Pulse"], ["presentValue", 100], ["units", "Amp"]]}
You can load it back with:
data2 = {ast.literal_eval(k): v for k,v in json.loads(js).items()}
giving {('analogInput', 18): [['objectName', 'AI8-Voltage'], ['presentValue', 238.3], ['units', 'Volts']], ('analogInput', 3): [['objectName', 'AI3-Pulse'], ['presentValue', 100], ['units', 'Amp']]}
Note that the JSON round trip has changed the tuples into lists.
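If you also want the inner lists back as tuples, you can convert them while loading; a small variation on the data2 line above, building on the js string produced earlier:

import ast
import json

data2 = {ast.literal_eval(k): [tuple(item) for item in v]
         for k, v in json.loads(js).items()}
# {('analogInput', 18): [('objectName', 'AI8-Voltage'), ('presentValue', 238.3), ('units', 'Volts')], ...}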
Related
Sorry if it's too much of a noob question.
I have a dictionary where the keys are bytes (like b'access_token') instead of strings.
{
    b'access_token': [b'b64ssscba8c5359bac7e88cf5894bc7922xxx'],
    b'token_type': [b'bearer']
}
Usually I access the elements of a dictionary with data_dict.get('key'), but in this case I was getting None instead of the actual value.
How do I access them, or is there a way to convert this bytes-keyed dict to a string-keyed dict?
EDIT: I actually get this dict from parsing a query string like access_token=absdhasd&scope=abc with urllib.parse.parse_qs(string).
You can use str.encode() and bytes.decode() to swap between the two (optionally providing an argument that specifies the encoding; 'UTF-8' is the default). As a result, you can take your dict:
my_dict = {
    b'access_token': [b'b64ssscba8c5359bac7e88cf5894bc7922xxx'],
    b'token_type': [b'bearer']
}
and just do a comprehension to swap all the keys:
new_dict = {k.decode(): v for k,v in my_dict.items()}
# {
# 'access_token': [b'b64ssscba8c5359bac7e88cf5894bc7922xxx'],
# 'token_type': [b'bearer']
# }
Similarly, you can just use .encode() when accessing the dict in order to get a bytes object from your string:
my_key = 'access_token'
my_value = my_dict[my_key.encode()]
# [b'b64ssscba8c5359bac7e88cf5894bc7922xxx']
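If you also want the values (which are lists of bytes) as strings, the same idea extends to both levels; a small sketch, assuming the dict shape from the question:

decoded = {k.decode(): [v.decode() for v in values]
           for k, values in my_dict.items()}
# {'access_token': ['b64ssscba8c5359bac7e88cf5894bc7922xxx'], 'token_type': ['bearer']}

Alternatively, since the dict comes from urllib.parse.parse_qs, decoding the query string to str before parsing it should give you str keys and values directly.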
Most probably, you are making some silly mistake.
It is working fine in my tests.
Perhaps you forgot to add the b prefix when trying to index the dictionary:
d = {
    b'key1': [b'val1'],
    b'key2': [b'val2']
}
d[b'key1']      # --> returns [b'val1']
d.get(b'key2')  # --> returns [b'val2']
Perhaps this could be something you're looking for?
my_dict = {
    b'access_token': [b'b64ssscba8c5359bac7e88cf5894bc7922xxx'],
    b'token_type': [b'bearer']
}
print(my_dict.get(b'access_token'))
The dataframe 'dataset' is automatically generated by PowerBI. Here is the result of my dataset.head(10).to_clipboard(sep=',', index=False):
coordinates,status
"[143.4865219,-34.7560602]",not started
"[143.4865241,-34.7561332]",not started
"[143.4865264,-34.7562088]",not started
"[143.4865286,-34.7562818]",not started
"[143.4865305,-34.7563453]",not started
"[143.4865327,-34.7564183]",not started
"[143.486535,-34.756494]",not started
"[143.4865371,-34.756567]",not started
"[143.486539,-34.7566304]",not started
"[143.4865412,-34.7567034]",not started
Then, to get the JSON, I do data = dataset.to_json(orient='records'), which gives me this result:
[{"coordinates":"[143.4865219,-34.7560602]","status":"not started"},{"coordinates":"[143.4865241,-34.7561332]","status":"not started"},
How do I get this instead, with no quotes on the coordinates values?
[{"coordinates":[143.4865219,-34.7560602],"status":"not started"},{"coordinates":[143.4865241,-34.7561332],"status":"not started"},
Edit:
print(type(data))
<class 'str'>
You could use ast.literal_eval:
Safely evaluate an expression node or a string containing a Python
literal or container display. The string or node provided may only
consist of the following Python literal structures: strings, bytes,
numbers, tuples, lists, dicts, sets, booleans, and None.
This can be used for safely evaluating strings containing Python
values from untrusted sources without the need to parse the values
oneself.[...]
Your data seems to be a string, not a list as Python would print it (Python uses single quotes by default; the double quotes in your data indicate that it is a string, ready to be saved to a json file for example). So you first have to convert it to a Python object with json.loads:
from ast import literal_eval
import json

data = """[{"coordinates":"[143.4865219,-34.7560602]","status":"not started"},{"coordinates":"[143.4865241,-34.7561332]","status":"not started"}]"""
data = json.loads(data)
for d in data:
    d['coordinates'] = literal_eval(d['coordinates'])
print(data)
# [{'coordinates': [143.4865219, -34.7560602], 'status': 'not started'}, {'coordinates': [143.4865241, -34.7561332], 'status': 'not started'}]
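If you can change the script before the JSON is produced, the same parsing can be applied directly to the DataFrame column and the records serialized with the standard json module; a minimal sketch, assuming pandas is available and using a small stand-in for the PowerBI-provided dataset:

import json
import pandas as pd

# Hypothetical stand-in for the PowerBI-provided 'dataset' dataframe
dataset = pd.DataFrame({
    "coordinates": ["[143.4865219,-34.7560602]", "[143.4865241,-34.7561332]"],
    "status": ["not started", "not started"],
})

# Parse each coordinate string into a real list, then serialize the records
dataset["coordinates"] = dataset["coordinates"].apply(json.loads)
data = json.dumps(dataset.to_dict(orient="records"))
print(data)
# [{"coordinates": [143.4865219, -34.7560602], "status": "not started"}, ...]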
import json
s = '[{"coordinates":"[143.4865219,-34.7560602]","status":"not started"},{"coordinates":"[143.4865241,-34.7561332]","status":"not started"}]'
d = json.loads(s)
d[0]['coordinates'] = json.loads(d[0]['coordinates'])
Applying this concept to every value can be done as follows:
for dic in d:
    for key, value in dic.items():
        try:
            temp = json.loads(value)
            if isinstance(temp, list):
                dic[key] = temp
        except Exception:
            pass
or, if you are sure there will be a coordinates key in every dictionary and that its value is a "list":
for dic in d: dic['coordinates'] = json.loads(dic['coordinates'])
You can simply use the eval function.
new = []
l = '[{"coordinates":"[143.4865219,-34.7560602]","status":"not started"},{"coordinates":"[143.4865241,-34.7561332]","status":"not started"}]'
l = eval(l)
for each_element in l:
    temp = {}
    for k, v in each_element.items():
        if k == 'coordinates':
            temp[k] = eval(v)
        else:
            temp[k] = v
    new.append(temp)
print(new)
I have data that look like this:
data = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
In a nice human-readable way it would look like this:
somekey : value4thekey
second-key : valu3-can.be?anything
third_k3y : it%can have spaces;too
How should I parse the data so that data['somekey'] gives me value4thekey?
Note: The & is connecting all of the different items
How I am currently tackling it
Currently, I use this ugly solution:
all = data.split('&')
for i in all:
    if i.startswith('somekey'):
        print i
This solution is very bad due to multiple obvious limitations. It would be much better if I could somehow parse it into a Python tree object.
I'd split the string by & to get a list of key-value strings, and then split each such string by : to get key-value pairs. Using dict and list comprehensions actually makes this quite elegant:
result = {k:v for k, v in (part.split(':') for part in data.split('&'))}
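A quick check with the string from the question:

data = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
result = {k: v for k, v in (part.split(':') for part in data.split('&'))}
print(result['somekey'])    # value4thekey
print(result['third_k3y'])  # it%can have spaces;too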
You can parse your data directly into a dictionary - split on the item separator & and then split each item again on the key/value separator ::
table = {
    key: value for key, value in
    (item.split(':') for item in data.split('&'))
}
This allows you direct access to elements, e.g. as table['somekey'].
If you don't have objects within a value, you can parse it to a dictionary
structure = {}
for ele in data.split('&'):
    ele_split = ele.split(':')
    structure[ele_split[0]] = ele_split[1]
You can now use structure to get the values:
print structure["somekey"]
#returns "value4thekey"
Since the items have a common "key:value" format,
you can use the separators as parameters to split on.
for i in x.split("&"):
    print(i.split(":"))
This generates a two-element list for each item, where the first element is the key and the second is the value. Iterate through them and load them into a dictionary and you should be good!
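A minimal sketch of that loading step, assuming x holds the string from the question:

x = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
result = {}
for i in x.split("&"):
    key, value = i.split(":")
    result[key] = value
print(result["somekey"])  # value4thekey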
I'd format data to YAML and parse the YAML
import re
import yaml

data = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
yaml_data = re.sub('[:]', ': ', re.sub('[&]', '\n', data))
y = yaml.load(yaml_data)
for k in y:
    print "%s : %s" % (k, y[k])
Here's the output:
third_k3y : it%can have spaces;too
somekey : value4thekey
second-key : valu3-can.be?anything
We are attempting to refactor and modify a Python program so that it can take a user-defined JSON file, parse that file, and then execute a workflow based on the options the user wants and has defined in the JSON. So basically, the user has to specify a dictionary in JSON, and when this JSON file is parsed by the Python program, we obtain a Python dictionary which we then pass in as an argument to a class that we instantiate in a top-level module. To sum this up, the JSON dictionary defined by the user eventually gets added into the instance namespace while the Python program is running.
Implementing the context managers to parse the JSON inputs was not a problem for us. However, we have a requirement that we be able to use the JSON dictionary (which gets subsequently added into the instance namespace) and generate multiple lines from a Jinja2 template file using looping within a template. We attempted to use this line for one of the key-value pairs in the JSON:
"extra_scripts" : [["Altera/AlteraCommon.lua",
"Altera/StratixIV/EP4SGX70HF35C2.lua"]]
and this sits in a large dictionary object, let's call it option_space_dict. For simplicity, in this example it has only 4 key-value pairs (assume that "extra_scripts" is 'key4' here), although in our program it is much larger:
option_space_dict = {
    'key1' : ['value1'],
    'key2' : ['value2'],
    'key3' : ['value3A', 'value3B', 'value3C'],
    'key4' : [['value4A', 'value4B']]
}
which is then parsed by this line:
import itertools
option_space = [ dict(itertools.izip(option_space_dict, opt)) for opt in itertools.product(*option_space_dict.itervalues()) ]
to get the option_space which essentially differs from option_space_dict in that it is something like:
[
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3A',
      'key4' : ['value4A', 'value4B'] },
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3B',
      'key4' : ['value4A', 'value4B'] },
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3C',
      'key4' : ['value4A', 'value4B'] }
]
So the option_space we generate serves us well for what we want to do with the Jinja2 templating. However, in order to get this, the key4 key that we added to option_space_dict caused an issue somewhere else in the program, which did:
# ignore self.option as it is not relevant to the issue here
def getOptionCompack(self) :
    return [ (k, v) for k, v in self.option.iteritems() if set([v]) != set(self.option_space_dict[k]) ]
I get the error TypeError: unhashable type: 'list' stemming from the fact that the value of key4 contains a nested list structure, which is 'unhashable'.
So we kind of hit a barrier. Does anyone have a suggestion on how we could overcome this, i.e. how to specify our JSON files in that way and do what we want with Jinja2, while still being able to parse the data structures out in the same format?
Thanks a million!
You can normalize your key data structures to use hashable types after they have been parsed from JSON.
Since key4 is a list, you have two options:
Convert it to a tuple where order is significant. E.g.,
key = tuple(key)
Convert it to a frozenset where order is insignificant. E.g.,
key = frozenset(key)
If a key can contain a dictionary, then you'll have two additional options:
Convert it to either a sorted tuple or frozenset of its item tuples. E.g.,
key = tuple(sorted(key.iteritems())) # Use key.items() for Python 3.
# OR
key = frozenset(key.iteritems()) # Use key.items() for Python 3.
Convert it to a third-party frozendict (Python 3 compatible version here). E.g.,
import frozendict
key = frozendict.frozendict(key)
Depending on how simple or complex your keys are, you may have to apply the transformation recursively.
Since your keys come directly from JSON, you can check for the native types directly:
if isinstance(key, list):
    # Freeze list.
elif isinstance(key, dict):
    # Freeze dict.
If you want to support the generic types, you can do something similar to:
import collections

if isinstance(key, collections.Sequence) and not isinstance(key, basestring):  # Use str for Python 3.
    # NOTE: Make sure to exclude basestring because it meets the requirements for a Sequence (of characters).
    # Freeze list.
elif isinstance(key, collections.Mapping):
    # Freeze dict.
Here is a full example:
def getOptionCompack(self):
    results = []
    for k, v in self.option.iteritems():
        k = self.freeze_key(k)
        if set([v]) != set(self.option_space_dict[k]):
            results.append((k, v))
    return results

def freeze_key(self, key):
    if isinstance(key, list):
        return frozenset(self.freeze_key(subv) for subv in key)
    # If dictionaries need to be supported, uncomment this.
    #elif isinstance(key, dict):
    #    return frozendict((subk, self.freeze_key(subv)) for subk, subv in key.iteritems())
    return key
Where self.option_space_dict already had its keys converted using self.freeze_key().
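For completeness, that up-front conversion of option_space_dict could look like this (a sketch in the same Python 2 style as the rest of the answer, assuming option_space_dict holds the raw parsed data):

self.option_space_dict = {
    self.freeze_key(k): v
    for k, v in option_space_dict.iteritems()
}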
We have managed to figure out the solution to this problem. The main gist of our solution is a helper function that converts a nested list into a tuple. Going back to my question, remember we had this list: [["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]]?
With our original getOptionCompack(self) method, and the way we were invoking it, what happened was that we directly tried to convert the list to a set with the statement
return [ (k, v) for k, v in self.option.iteritems() if set([v]) != set(self.option_space_dict[k])]
where set(self.option_space_dict[k]), iterating over k, would eventually hit the dictionary key-value pair that gives us one instance of doing set([["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]]),
which was the cause of the error. This is because a list object is not hashable, and set() hashes each element of the outer list that is fed to it, and the element in this case is an inner list. Try doing set([[2]]) and you will see what I mean.
So we figured the workaround would be to define a helper function that accepts a list (or any iterable) and tests whether each element in it is a list. If an element is not a list, its type is left unchanged; if it is a (nested) list, the helper converts it to a tuple instead. Doing this for every element, it builds and returns a set. The definition of the function is:
# Helper function to build a set
def Set(iterable) :
    return { tuple(v) if isinstance(v, list) else v for v in iterable }
and so a call that invoked Set() would be in our example:
Set([["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]])
and the object that it returns to itself would be:
{("Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua")}
The inner nested list gets converted to a tuple, which is an object type that fits within a set, as denoted by the {} enclosing the tuple. That's why it works now: the set can be formed.
We proceeded to redefine our original method to use our own Set() function:
def getOptionCompack(self) :
    return [ (k, v) for k, v in self.option.iteritems() if Set([v]) != Set(self.option_space_dict[k]) ]
and now we no longer get the TypeError, which solved the problem. It seems like a lot of trouble just to do this, but the reason we went through all of it was to have an objective means of comparing two objects by "normalizing" them to the same type, a set, in order to perform some other action later on in our source code.
Let us say I have a custom data structure made up of primitive dicts. I need to serialize this using JSON. My structure is as follows:
path_list_dict = {(node1, node2 .. nodeN): (float1, float2, float3)}
So this is keyed with a tuple, and the value is a tuple of three values. Each node element in the key is a custom class object with a __str__ method written for it. The wrapper dict which identifies each dict entry in path_list_dict with a key is as follows:
path_options_dict = {'Path1': {(node1, node2 .. nodeN): (float1, float2, float3)}, 'Path2': {(nodeA1, nodeA2 .. nodeAN): (floatA1, floatA2, floatA3)}}
and so on.
When I try to serialize this using JSON, I of course run into a TypeError, because the inner dict has tuples as keys and values, and a dict needs string keys to be serialized. This can easily be taken care of by inserting the str(tuple) representation into the dict instead of the native tuple.
What I am concerned about is that when I receive it and unpack the values, I am going to have all strings at the receiving end. The key tuple of the inner dict that consists of custom class elements is now represented as a str. Will I be able to recover the embedded data? Or is there some other way to do this better?
For more clarity, I am using this JSON tutorial as reference.
You have several options:
Serialize with a custom key prefix that you can pick out and unserialize again:
tuple_key = '__tuple__({})'.format(','.join(key))
would produce '__tuple__(node1,node2,nodeN)' as a key, which you could parse back into a tuple on the other side:
if key.startswith('__tuple__('):
    key = tuple(key[10:-1].split(','))
Demo:
>>> key = ('node1', 'node2', 'node3')
>>> '__tuple__({})'.format(','.join(key))
'__tuple__(node1,node2,node3)'
>>> mapped_key = '__tuple__({})'.format(','.join(key))
>>> tuple(mapped_key[10:-1].split(','))
('node1', 'node2', 'node3')
Don't use dictionaries, use a list of lists:
{'Path': [[[node1, node2 .. nodeN], [float1, float2, float3]], [...]]}
You can build such a list simply from the dict.items() result:
>>> json.dumps({(1, 2, 3): ('foo', 'bar')}.items())
'[[[1, 2, 3], ["foo", "bar"]]]'
and when decoding, feed the whole thing back into dict() while mapping each key-value list to tuples:
>>> dict(map(tuple, kv) for kv in json.loads('[[[1, 2, 3], ["foo", "bar"]]]'))
{(1, 2, 3): (u'foo', u'bar')}
The latter approach is more suitable for custom classes as well, as the JSONEncoder.default() method will still be handed these custom objects for you to serialize to a suitable dictionary object, which gives a suitable object_hook passed to JSONDecoder() a chance to return fully deserialized custom objects again.
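To make that last point concrete, here is a minimal sketch of that encoder/decoder pairing, assuming a hypothetical Node class standing in for the custom node objects from the question:

import json

class Node(object):
    # Hypothetical stand-in for the custom node class from the question.
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Node(%r)' % self.name

class NodeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Node):
            # Tag the object so the decoder can recognise it later.
            return {'__node__': obj.name}
        return json.JSONEncoder.default(self, obj)

def node_hook(d):
    # Turn tagged dictionaries back into Node instances.
    if '__node__' in d:
        return Node(d['__node__'])
    return d

payload = json.dumps([[[Node('node1'), Node('node2')], [1.0, 2.0, 3.0]]], cls=NodeEncoder)
restored = json.loads(payload, object_hook=node_hook)
# restored is a list of [nodes, floats] pairs with Node objects rebuilt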