How to make a dictionary value the key of another value? - python

I have an ajax POST that sends a dictionary from javascript to my Flask back-end like this:
{'output[0][description]': ['Source File'],
'output[0][input]': ['Some_document.pdf'],
'output[1][description]': ['Name'],
'output[1][input]': ['Ari'],
'output[2][description]': ['Address'],
'output[2][input]': ['12 fake st']}
So I am trying to reorganize it on the back-end to look like this:
['Source File']:['Some_document.pdf'],
['Name']:['Ari],
['Address']:['12 fake st'],
Any ideas?

One problem : You can't use a list as the key of the dict because it's not hashable.
You could use the regular expression module (re) to examine each key to determine if it conforms to the expression
output\[(\d+)\]\[description\]
for each one that does, find the corresponding key
output[$1][input]
put them together in the final dict.
The following is a sketch:
import re
P=re.compile('output\[(\d+)\]\[description\]')
inp = {'output[0][description]': ['Source File'], 'output[0][input]': ['Some_document.pdf'],
'output[1][description]': ['Name'], 'output[1][input]': ['Ari'],
'output[2][description]': ['Address'], 'output[2][input]': ['12 fake st']}
out = {}
for key in inp :
m = P.fullmatch(key)
if m :
out[inp[key][0]] = inp['output['+str(m.group(1))+'][input]'][0]
print(out)

I agree with #Klaus D.'s comment, you need to reorganize your API to use JSONs, that would simplify things but until then the following solution would be a lot faster than using regex and deliver the expected output
i=0
for key,val in inp.items():
if i<3:
print(f"{inp['output['+str(i)+'][description]']}:{inp['output['+str(i)+'][input]']}")
i+=1

Related

Which serialization format is this?

I’m dealing with some serialized data fetched from an SQL Server database and that looks like this :
('|AFoo|BBaar|C61|DFoo Baar|E200060|F200523|G200240|', )
Any idea which format is this ? And is there any Python package that can deserilize this ?
What you show is a tuple that contains one value - a string. You can use string.split to construct a list of the string's component parts - i.e., AFoo, BBaar etc
t = ('|AFoo|BBaar|C61|DFoo Baar|E200060|F200523|G200240|', )
for e in t:
values = [v for v in e.split('|') if v]
print(values)
Output:
['AFoo', 'BBaar', 'C61', 'DFoo Baar', 'E200060', 'F200523', 'G200240']
Note:
The for loop is used as a generic approach that allows for multiple strings in the tuple. For the data fragment shown in the question, this isn't actually necessary

I can't get a value from a JSON API response in python

So I am struggling with getting a value from a JSON response. Looking in other post I have managed to write this code but when I try to search for the key (character_id) that I want in the dictionary python says that the key doesn't exist. My solution consists in getting the JSON object from the response, converting it into a string with json.dumps() and the converting it into a dictionary with json.loads(). Then I try to get 'character_id' from the dictionary but it doesn't exist. I am guessing it is related with the format of the dictionary but I have little to none experience in python. The code that makes the query and tries to get the values is this: (dataRequest is a fuction that makes the request and return the response from the api)
characterName = sys.argv[1];
response = dataRequest('http://census.daybreakgames.com/s:888/get/ps2:v2/character/?name.first_lower=' + characterName + '&c:show=character_id')
jsonString = json.dumps(response.json())
print(jsonString)
dic = json.loads(jsonString)
print(dic)
if 'character_id' in dic:
print(dic['character_id'])
The output of the code is:
{"character_list": [{"character_id": "5428662532301799649"}], "returned": 1}
{'character_list': [{'character_id': '5428662532301799649'}], 'returned': 1}
Welcome #Prieto! From what I can see, you probably don't need to serialize/de-serialize the JSON -- response.json() returns a python dictionary object already.
The issue is that you are looking for the 'character_id' key at the top-level of the dictionary, when it seems to be embedded inside another dictionary, that is inside a list. Try something like this:
#...omitted code
for char_obj in dic["character_list"]:
if "character_id" in char_obj:
print(char_obj["character_id"])
if your dic is like {"character_list": [{"character_id": "5428662532301799649"}], "returned": 1}
you get the value of character_id by
print(dic['character_list'][0][character_id])
The problem here is that you're trying to access a dictionary where the key is actually character_list.
What you need to do is to access the character_list value and iterate over or filter the character_id you want.
Like this:
print(jsonString)
dic = json.loads(jsonString)
print(dic)
character_information = dic['character_list'][0] # we access the character list and assume it is the first value
print(character_information["character_id"]) # this is your character id
The way I see it, the only hiccup with the code is this :
if 'character_id' in dic:
print(dic['character_id'])
The problem is that, the JSON file actually consists of actually 2 dictionaries , first is the main one, which has two keys, character_list and returned. There is a second sub-dictionary inside the array, which is the value for the key character_list.
So, what your code should actually look like is something like this:
for i in dic["character_list"]:
print(i["character_id"])
On a side-note, it will help to look at JSON file in this way :
{
"character_list": [
{
"character_id": "5428662532301799649"
}
],
"returned": 1
}
,where, elements enclosed in curly-brackets'{}' imply they are in a dictionary, whereas elements enclosed in curly-brackets'[]' imply they are in a list

Python 2.7 parsing data

I have data that look like this:
data = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
In a nice human-readable way it would look like this:
somekey : value4thekey
second-key : valu3-can.be?anything
third_k3y : it%can have spaces;too
How should I parse the data so when I do data['somekey'] I would get >>> value4thekey?
Note: The & is connecting all of the different items
How am I currently tackling with it
Currently, I use this ugly solution:
all = data.split('&')
for i in all:
if i.startswith('somekey'):
print i
This solution is very bad due to multiple obvious limitations. It would be much better if I can somehow parse it into a python tree object.
I'd split the string by & to get a list of key-value strings, and then split each such string by : to get key-value pairs. Using dict and list comprehensions actually makes this quite elegant:
result = {k:v for k, v in (part.split(':') for part in data.split('&'))}
You can parse your data directly to a dictionary - split on the item separator & then split again on the key,value separator ::
table = {
key: value for key, value in
(item.split(':') for item in data.split('&'))
}
This allows you direct access to elements, e.g. as table['somekey'].
If you don't have objects within a value, you can parse it to a dictionary
structure = {}
for ele in data.split('&'):
ele_split = ele.split(':')
structure[ele_split[0]] = ele_split[1]
You can now use structure to get the values:
print structure["somekey"]
#returns "value4thekey"
Since the keys have a common format of being in the form of "key":"value".
You can use it as a parameter to split on.
for i in x.split("&"):
print(i.split(":"))
This would generate an array of even items where every even index is the key and odd index being the value. Iterate through the array and load it into a dictionary. You should be good!
I'd format data to YAML and parse the YAML
import re
import yaml
data = 'somekey:value4thekey&second-key:valu3-can.be?anything&third_k3y:it%can have spaces;too'
yaml_data = re.sub('[:]', ': ', re.sub('[&]', '\n', data ))
y = yaml.load(yaml_data)
for k in y:
print "%s : %s" % (k,y[k])
Here's the output:
third_k3y : it%can have spaces;too
somekey : value4thekey
second-key : valu3-can.be?anything

Manipulating a list of dictionaries

I successfully imported from the web this json file, which looks like:
[{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"},
etc ...
I want to extract the values of the key moid_au, later compare moid_au with the key values of h_mag.
This works: print(data[1]['moid_au']), but if I try to ask all the elements of the list it won't, I tried: print(data[:]['moid_au']).
I tried iterators and a lambda function but still has not work yet, mostly because I'm new in data manipulation. It works when I have one dictionary, not with a list of dictionaries.
Thanks in advance for other tips. Some links were confusing.
Sounds like you are using lambda wrong because you need map as well:
c = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
list(map(lambda rec: rec.get('moid_au'), c))
['0.035', '0.028']
Each lambda grabs a record from your list and you map your function to that.
Using print(data[:]['moid_au']) equals to print(data['moid_au']), and you can see that it won't work, as data has no key named 'moid_au'.
Try working with a loop:
for item in data:
print(item['moid_au'])
using your approach to iterate over the whole array to get all the instances of a key,this method might work for you
a = [data[i]['moid_au']for i in range(len(data))]
print(a)
In which exact way do you want to compare them?
Would it be useful getting the values in a way like this?
list_of_dicts = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"}, {"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
mod_au_values = [d["moid_au"] for d in list_of_dicts]
h_mag_values = [d["h_mag"] for d in list_of_dicts]
for key, value in my_list.items ():
print key
print value
for value in my_list.values ():
print value
for key in my_list.keys():
print key

Parse string with three-level delimitation into dictionary

I've found how to split a delimited string into key:value pairs in a dictionary elsewhere, but I have an incoming string that also includes two parameters that amount to dictionaries themselves: parameters with one or three key:value pairs inside:
clientid=b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0&keyid=987654321&userdata=ip:192.168.10.10,deviceid:1234,optdata:75BCD15&md=AMT-Cam:avatar&playbackmode=st&ver=6&sessionid=&mk=PC&junketid=1342177342&version=6.7.8.9012
Obviously these are dummy parameters to obfuscate proprietary code, here. I'd like to dump all this into a dictionary with the userdata and md keys' values being dictionaries themselves:
requestdict {'clientid' : 'b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0', 'keyid' : '987654321', 'userdata' : {'ip' : '192.168.10.10', 'deviceid' : '1234', 'optdata' : '75BCD15'}, 'md' : {'Cam' : 'avatar'}, 'playbackmode' : 'st', 'ver' : '6', 'sessionid' : '', 'mk' : 'PC', 'junketid' : '1342177342', 'version' : '6.7.8.9012'}
Can I take the slick two-level delimitation parsing command that I've found:
requestDict = dict(line.split('=') for line in clientRequest.split('&'))
and add a third level to it to handle & preserve the 2nd-level dictionaries? What would the syntax be? If not, I suppose I'll have to split by & and then check & handle splits that contain : but even then I can't figure out the syntax. Can someone help? Thanks!
I basically took Kyle's answer and made it more future-friendly:
def dictelem(input):
parts = input.split('&')
listing = [part.split('=') for part in parts]
result = {}
for entry in listing:
head, tail = entry[0], ''.join(entry[1:])
if ':' in tail:
entries = tail.split(',')
result.update({ head : dict(e.split(':') for e in entries) })
else:
result.update({head: tail})
return result
Here's a two-liner that does what I think you want:
dictelem = lambda x: x if ':' not in x[1] else [x[0],dict(y.split(':') for y in x[1].split(','))]
a = dict(dictelem(x.split('=')) for x in input.split('&'))
Can I take the slick two-level delimitation parsing command that I've found:
requestDict = dict(line.split('=') for line in clientRequest.split('&'))
and add a third level to it to handle & preserve the 2nd-level dictionaries?
Of course you can, but (a) you probably don't want to, because nested comprehensions beyond two levels tend to get unreadable, and (b) this super-simple syntax won't work for cases like yours, where only some of the data can be turned into a dict.
For example, what should happen with 'PC'? Do you want to make that into {'PC': None}? Or maybe the set {'PC'}? Or the list ['PC']? Or just leave it alone? You have to decide, and write the logic for that, and trying to write it as an expression will make your decision very hard to read.
So, let's put that logic in a separate function:
def parseCommasAndColons(s):
bits = [bit.split(':') for bit in s.split(',')]
try:
return dict(bits)
except ValueError:
return bits
This will return a dict like {'ip': '192.168.10.10', 'deviceid': '1234', 'optdata': '75BCD15'} or {'AMT-Cam': 'avatar'} for cases where each comma-separated component has a colon inside it, but a list like ['1342177342'] for cases where any of them don't.
Even this may be a little too clever; I might make the "is this in dictionary format" check more explicit instead of just trying to convert the list of lists and see what happens.
Either way, how would you put that back into your original comprehension?
Well, you want to call it on the value in the line.split('='). So let's add a function for that:
def parseCommasAndColonsForValue(keyvalue):
if len(keyvalue) == 2:
return keyvalue[0], parseCommasAndColons(keyvalue[1])
else:
return keyvalue
requestDict = dict(parseCommasAndColonsForValue(line.split('='))
for line in clientRequest.split('&'))
One last thing: Unless you need to run on older versions of Python, you shouldn't often be calling dict on a generator expression. If it can be rewritten as a dictionary comprehension, it will almost certainly be clearer that way, and if it can't be rewritten as a dictionary comprehension, it probably shouldn't be a 1-liner expression in the first place.
Of course breaking expressions up into separate expressions, turning some of them into statements or even functions, and naming them does make your code longer—but that doesn't necessarily mean worse. About half of the Zen of Python (import this) is devoted to explaining why. Or one quote from Guido: "Python is a bad language for code golf, on purpose."
If you really want to know what it would look like, let's break it into two steps:
>>> {k: [bit2.split(':') for bit2 in v.split(',')] for k, v in (bit.split('=') for bit in s.split('&'))}
{'clientid': [['b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0']],
'junketid': [['1342177342']],
'keyid': [['987654321']],
'md': [['AMT-Cam', 'avatar']],
'mk': [['PC']],
'playbackmode': [['st']],
'sessionid': [['']],
'userdata': [['ip', '192.168.10.10'],
['deviceid', '1234'],
['optdata', '75BCD15']],
'ver': [['6']],
'version': [['6.7.8.9012']]}
That illustrates why you can't just add a dict call for the inner level—because most of those things aren't actually dictionaries, because they had no colons. If you changed that, then it would just be this:
{k: dict(bit2.split(':') for bit2 in v.split(',')) for k, v in (bit.split('=') for bit in s.split('&'))}
I don't think that's very readable, and I doubt most Python programmers would. Reading it 6 months from now and trying to figure out what I meant would take a lot more effort than writing it did.
And trying to debug it will not be fun. What happens if you run that on your input, with missing colons? ValueError: dictionary update sequence element #0 has length 1; 2 is required. Which sequence? No idea. You have to break it down step by step to see what doesn't work. That's no fun.
So, hopefully that illustrates why you don't want to do this.

Categories