I'm working on some code that processes a json database with very detailed information into a simpler format. It copies some of the fields and reserializes others into a new json file.
I'm currently using a dictionary comprehension like this MVCE:
converted_data = {
    raw_item['name']: {
        'state': raw_item['field_a'],
        'variants': [variant for variant in raw_item['field_b']['variants']]
    } for raw_item in json.loads(my_file.read())
}
An example file (not the actual data being used) is this:
[
{
"name": "Object A",
"field_a": "foo",
"field_b": {
"bar": "baz",
"variants": [
"foo",
"bar",
"baz"
]
}
},
{
"name": "Object B",
"field_a": "foo",
"field_b": {
"bar": "baz"
}
}
]
The challenge is that not all items contain variations. I see two potential solutions:
1. Use an if statement to conditionally add the variations field to the dictionary.
2. Include an empty variations field for all items and fill it if the raw item contains variations.
I'll probably settle on the 2nd solution. However, is there a way to conditionally include a particular field within a dictionary comprehension?
Edit: In other words, is approach 1 possible inside a dictionary comprehension?
An example of the desired output (using a dictionary comprehension) would be as follows:
{
"Object A": {
"state": "foo",
"variants": ["foo", "bar", "baz"]
},
"Object B": {
"state": "foo"
}
}
I've found some other questions that change the entries conditionally or filter the entries, but these don't unconditionally create an item where a particular field (in the item) is conditionally absent.
I'm not sure you realise you can use a conditional expression (if/else) as a dictionary value, which seems like a very clean way to solve it to me:
converted_data = {
    raw_item['name']: {
        'state': raw_item['field_a'],
        'variants': [] if 'variants' not in raw_item['field_b'] else
        [str(variant) for variant in raw_item['field_b']['variants']]
    } for raw_item in example
}
(Note: using str() instead of undefined function that was given in initial example)
After clarification of the question, here's an alternative solution that builds a different dictionary (one without the 'variants' key) when there are no variants:
converted_data = {
    raw_item['name']: {
        'state': raw_item['field_a'],
        'variants': [str(variant) for variant in raw_item['field_b']['variants']]
    } if 'variants' in raw_item['field_b'] else {
        'state': raw_item['field_a'],
    } for raw_item in example
}
If the question actually is: can a key/value pair in a dictionary literal be optional (which would solve your problem) then the answer is simply "no". But the above achieves the same for this simple example.
If the real-life situation is more complicated, simply construct the dictionary as in the first solution given here and then use del dictionary['key'] to remove any added keys that have a None or [] value after construction.
For example, after the first example, converted_data could be cleaned up with:
for item in converted_data.values():
    if not item['variants']:
        del item['variants']
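Another option, if you are on Python 3.5 or newer, is to unpack a conditionally empty dictionary inside the literal. There is still no dedicated syntax for an optional key, but this gets close while staying inside the comprehension. A minimal sketch, assuming the input has the same shape as the example file (`example` here is a hand-written stand-in for the parsed JSON):

```python
# Stand-in for the parsed example file (assumption: same shape as in the question).
example = [
    {"name": "Object A", "field_a": "foo",
     "field_b": {"bar": "baz", "variants": ["foo", "bar", "baz"]}},
    {"name": "Object B", "field_a": "foo",
     "field_b": {"bar": "baz"}},
]

converted_data = {
    raw_item["name"]: {
        "state": raw_item["field_a"],
        # Unpack either a one-key dict or an empty dict, so the
        # 'variants' key only appears when the source item has it.
        **(
            {"variants": [str(v) for v in raw_item["field_b"]["variants"]]}
            if "variants" in raw_item["field_b"]
            else {}
        ),
    }
    for raw_item in example
}

print(converted_data)
```

Whether this reads better than the two-dictionary version above is a matter of taste; it mainly pays off when several keys are optional.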
You could pass the process out to a function?
def check_age(age):
    return age >= 18
my_dic = {
    "age": 25,
    "allowed_to_drink": check_age(25)
}
You end up with the value as the result of the function call:
{'age': 25, 'allowed_to_drink': True}
How you would implement this I don't know, but some food for thought.
Related
Let's say I have a dictionary:
episode = {
"translations": [{
"language": {
"code": "de"
},
"title": "German"
}, {
"language": {
"code": "en"
},
"title": "English"
}, {
"language": {
"code": "fr"
},
"title": "French"
}]
}
I would like to get the entry that matches a specific language code. I could walk through the entire list using the following code:
for translation in episode['translations']:
    if translation['language']['code'] == 'fr':
        language = translation
        break
But that seems a bit excessive, and a waste of resources. Is there a better way of doing this, without having to walk through the entire array?
If the data is stored in a list, then the only way to extract queries based on a condition is to iterate over the entries. In the snippet you provide, the implicit assumption of using break is that there is a unique entry of interest (or perhaps the first match is of interest).
For more optimal queries of this data, it should be transformed to a different structure. For example, it's possible to convert it to a pandas dataframe or convert the data to a dictionary where keys are translation['language']['code'] (so look-ups become O(1)).
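As a rough sketch of the dictionary approach: build the index once (which still walks the list a single time) and then every later look-up is a plain key access. The name `by_code` is made up for this example, and it assumes each language code appears at most once; with duplicates, the last entry wins.

```python
episode = {
    "translations": [
        {"language": {"code": "de"}, "title": "German"},
        {"language": {"code": "en"}, "title": "English"},
        {"language": {"code": "fr"}, "title": "French"},
    ]
}

# One pass over the list to build the index (hypothetical name: by_code).
by_code = {t["language"]["code"]: t for t in episode["translations"]}

# Later look-ups no longer iterate; .get() returns None for unknown codes.
language = by_code.get("fr")
missing = by_code.get("es")

print(language)
print(missing)
```

This only helps if you query the same data more than once; for a single look-up the loop with break is just as good.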
Short of modifying how you structured your data*, I don't see a way that doesn't involve traversing the whole dictionary. That being said, there are a few more elegant ways to do it, although elegance is highly subjective:
filter(lambda x: 'fr' in x['language'].values(), episode['translations'])
would give you an iterable that contains all the entries in your list that have the required language code. Calling next on it would give you the first one, for instance.
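A similar idea can be sketched with a generator expression instead of filter. It is lazy, so next stops at the first match, and passing a default to next avoids a StopIteration when nothing matches. The variable names here are only illustrative:

```python
translations = [
    {"language": {"code": "de"}, "title": "German"},
    {"language": {"code": "en"}, "title": "English"},
    {"language": {"code": "fr"}, "title": "French"},
]

# next() pulls items from the generator until the first match, then stops.
match = next(
    (t for t in translations if t["language"]["code"] == "fr"),
    None,  # default returned when no entry matches
)

# A code that is not present falls back to the default instead of raising.
missing = next(
    (t for t in translations if t["language"]["code"] == "es"),
    None,
)

print(match)
print(missing)
```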
Edit:
* what SultanOrazbayev proposed in their answer is one such way to modify your data structure.
A list comprehension is neater although not really different from the original code except that it allows for more than one dictionary entry having a particular code. For example:
episode = {
"translations": [{
"language": {
"code": "de"
},
"title": "German"
}, {
"language": {
"code": "en"
},
"title": "English"
}, {
"language": {
"code": "fr"
},
"title": "French"
}]
}
list_ = [d for d in episode['translations'] if d['language']['code'] == 'en']
print(list_)
Output:
[{'language': {'code': 'en'}, 'title': 'English'}]
How do I access visibilites?
I am trying like this: dev1['data']['results :visibilites ']
dev1 = {
"status": "OK",
"data": {
"results": [
{
"tradeRelCode": "ZT55",
"customerCode": "ZC0",
"customerName": "XYZ",
"tier": "null1",
"visibilites": [
{
"code": "ZS0004207",
"name": "Aabc Systems Co.,Ltd",
"siteVisibilityMap": {
},
"customerRef": "null1"
}
]
}
],
"pageNumber": 3,
"limit": 1,
"total": 186
}
}
You can use dev1['data']['results'][0]['visibilites'].
It will contain a list of one dictionary.
To access this dictionary directly, use dev1['data']['results'][0]['visibilites'][0]
dev1['data'] is a dictionary that has results as a key.
You can access the item associated with the results key (a list) using (dev1['data'])['results'].
To access the only member of this list, use ((dev1['data'])['results'])[0].
This gives you a dictionary that has tradeRelCode, customerCode, customerName, tier and visibilites keys.
To access the item associated with the visibilites key (a list), you have to use (((dev1['data'])['results'])[0])['visibilites'].
To finally access the only dictionary contained in this list, you have to use ((((dev1['data'])['results'])[0])['visibilites'])[0].
The parentheses are here to show that Python digs into each dictionary or list in order from left to right (Python ignores the redundant parentheses, so you can keep them in the code if that is clearer for you).
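If the long chained expression is hard to read, the same walk can be sketched one step at a time with intermediate variables. The variable names are made up for this example, and `dev1` is a trimmed copy of the structure from the question (note the key really is spelled "visibilites" in the data):

```python
# Trimmed copy of the structure from the question.
dev1 = {
    "status": "OK",
    "data": {
        "results": [
            {
                "tradeRelCode": "ZT55",
                "visibilites": [
                    {"code": "ZS0004207", "name": "Aabc Systems Co.,Ltd"}
                ],
            }
        ],
        "pageNumber": 3,
    },
}

results = dev1["data"]["results"]           # a list of dictionaries
first_result = results[0]                   # the only dictionary in that list
visibilites = first_result["visibilites"]   # again a list
first_visibility = visibilites[0]           # the dictionary you are after

print(type(results).__name__, type(first_result).__name__)
print(first_visibility["code"])
```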
In your data structure, use the path:
dev1['data']['results'][0]['visibilites']
Try this
dev1['data']['results'][0]['visibilites']
Reason:
This is a list -> dev1['data']['results']
So, access this -> dev1['data']['results'][0]
and then you obtain this ->
{'tradeRelCode': 'ZT55',
'customerCode': 'ZC0',
'customerName': 'XYZ',
'tier': 'null1',
'visibilites': [{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]}
and then you can have -> dev1['data']['results'][0]['visibilites']
which results in ->
[{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]
which is a list and you can index the first element which is another dictionary
Alright, so I'm struggling a little bit with trying to parse my JSON object.
My aim is to grab a certain JSON key and return its value.
JSON File
{
"files": {
"resources": [
{
"name": "filename",
"hash": "0x001"
},
{
"name": "filename2",
"hash": "0x002"
}
]
}
}
I've developed a function which allows me to parse the JSON code above
Function
def parsePatcher():
    url = '{0}/{1}'.format(downloadServer, patcherName)
    patch = urllib2.urlopen(url)
    data = json.loads(patch.read())
    patch.close()
    return data
Okay so now I would like to do a foreach statement which prints out each name and hash inside the "resources": [] object.
Foreach statement
for name, hash in patcher["files"]["resources"]:
    print name
    print hash
But it only prints out "name" and "hash" not "filename" and "0x001"
Am I doing something incorrect here?
By using name, hash as the for loop target, you are unpacking the dictionary:
>>> d = {"name": "filename", "hash": "0x001"}
>>> name, hash = d
>>> name
'name'
>>> hash
'hash'
This happens because iteration over a dictionary only produces the keys:
>>> list(d)
['name', 'hash']
and unpacking uses iteration to produce the values to be assigned to the target names.
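That this worked at all comes down to chance: on Python 3.3 and newer, with hash randomisation enabled by default, the order of those two keys could equally be reversed. (Since Python 3.7, dictionaries preserve insertion order, so the result is stable there, but you would still get the keys rather than the values.)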
Just use one name to assign the dictionary to, and use subscription on that dictionary:
for resource in patcher["files"]["resources"]:
    print resource['name']
    print resource['hash']
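For readers on Python 3, where print is a function, the same loop can be sketched as follows. The JSON is inlined as a string here (an assumption, standing in for the downloaded patch file), and the name-to-hash dictionary at the end is just an illustrative extra, not something the question asked for:

```python
import json

# Inlined stand-in for the downloaded patch file (assumption).
raw = """
{
  "files": {
    "resources": [
      {"name": "filename", "hash": "0x001"},
      {"name": "filename2", "hash": "0x002"}
    ]
  }
}
"""

patcher = json.loads(raw)

# Each element of the list is a dictionary, so bind it to one name
# and subscript it, rather than unpacking it into two names.
for resource in patcher["files"]["resources"]:
    print(resource["name"], resource["hash"])

# Optional: a name -> hash mapping for quick look-ups later on.
hashes = {r["name"]: r["hash"] for r in patcher["files"]["resources"]}
print(hashes)
```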
So what you intend to do is:
for dic in x["files"]["resources"]:
    print dic['name'],dic['hash']
You need to iterate over the dictionaries in the resources array.
The problem seems to be that you have a list of dictionaries. First get each element of the list, then ask the element (which is a dictionary) for the values of the keys name and hash.
EDIT: this is tested and works
mydict = {"files": { "resources": [{ "name": "filename", "hash": "0x001"},{ "name": "filename2", "hash": "0x002"}]} }
for element in mydict["files"]["resources"]:
    for d in element:
        print d, element[d]
If you have multiple files with multiple resources inside them, this generalized solution works:
for keys in patcher:
    for indices in patcher[keys].keys():
        print(patcher[keys][indices])
Output checked on my side:
>>> for keys in patcher:
...     for indices in patcher[keys].keys():
...         print(patcher[keys][indices])
...
[{'hash': '0x001', 'name': 'filename'}, {'hash': '0x002', 'name': 'filename2'}]
I have JSON output as follows:
{
"service": [{
"name": ["Production"],
"id": ["256212"]
}, {
"name": ["Non-Production"],
"id": ["256213"]
}]
}
I wish to find all IDs where the pair contains "Non-Production" as a name.
I was thinking along the lines of running a loop to check, something like this:
data = json.load(urllib2.urlopen(URL))
for key, value in data.iteritems():
    if "Non-Production" in key[value]: print key[value]
However, I can't seem to get the name and ID from the "service" tree, it returns:
if "Non-Production" in key[value]: print key[value]
TypeError: string indices must be integers
Assumptions:
The JSON is in a fixed format; this can't be changed.
I do not have root access and am unable to install any additional packages.
Essentially, the goal is to obtain a list of IDs of non-production "services" in the most efficient way.
Here you go:
data = {
"service": [
{"name": ["Production"],
"id": ["256212"]
},
{"name": ["Non-Production"],
"id": ["256213"]}
]
}
for item in data["service"]:
    if "Non-Production" in item["name"]:
        print(item["id"])
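Since the goal is a list of IDs rather than printed output, the same loop can be sketched as a list comprehension. Each "id" value is itself a list in this JSON, so the second for flattens it. The name `non_production_ids` is just chosen for this example:

```python
data = {
    "service": [
        {"name": ["Production"], "id": ["256212"]},
        {"name": ["Non-Production"], "id": ["256213"]},
    ]
}

# Keep only services whose name list contains "Non-Production",
# then flatten their "id" lists into one flat list of strings.
non_production_ids = [
    service_id
    for item in data["service"]
    if "Non-Production" in item["name"]
    for service_id in item["id"]
]

print(non_production_ids)  # ['256213']
```

This needs only the standard library, so the no-extra-packages constraint is not an issue.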
Whenever I see JSON I think about functional programming! Anyone else?!
I think it is a better idea to use functions like concat or flat, filter and reduce, etc.
E.g. a one-liner:
[s.get('id', [0])[0] for s in filter(lambda srv: "Non-Production" in srv.get('name', []), data.get('service', []))]
EDIT:
I updated the code: even if data = {}, the result will be [], an empty ID list.
I have a JSON response from an API that looks like this:
{
"meta": {
"code": 200
},
"data": {
"username": "luxury_mpan",
"bio": "Recruitment Agents👑👑👑👑\nThe most powerful manufacturers,\nwe have the best quality.\n📱Wechat:13255996580💜💜\n📱Whatsapp:+8618820784535",
"website": "",
"profile_picture": "https://scontent.cdninstagram.com/t51.2885-19/10895140_395629273936966_528329141_a.jpg",
"full_name": "Mpan",
"counts": {
"media": 17774,
"followed_by": 7982,
"follows": 7264
},
"id": "1552277710"
}
}
I want to fetch the values of "media", "followed_by" and "follows" and store them in three different lists, as shown in the code below:
for r in range(1,5):
    var=r,st.cell(row=r,column=3).value
    xy=var[1]
    ij=str(xy)
    myopener=Myopener()
    url=myopener.open('https://api.instagram.com/v1/users/'+ij+'/?access_token=641567093.1fb234f.a0ffbe574e844e1c818145097050cf33')
    beta=json.load(url)
    for item in beta['data']:
        list1.append(item['media'])
        list2.append(item['followed_by'])
        list3.append(item['follows'])
When I run it, it shows the error: TypeError: string indices must be integers
How would my loop change in order to fetch the above-mentioned values?
Also, out of curiosity: is there any way to fetch the WhatsApp number from the "bio" key in the data dictionary?
I have referred questions similar to this and still did not get my answer. Please help!
beta['data'] is a dictionary object. When you iterate over it with for item in beta['data'], the values taken by item will be the keys of the dictionary: "username", "bio", etc.
So then when you ask for, e.g., item['media'] it's like asking for "username"['media'], which of course doesn't make any sense.
It isn't quite clear what it is that you want: is it just the stuff inside counts? If so, then instead of for item in beta['data']: you could just say item = beta['data']['counts'], and then item['media'] etc. will be the values you want.
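Assuming it is the counts you are after, a minimal sketch of the change looks like this. `beta` is a hand-written stand-in for one API response, trimmed to the relevant keys; in your code this part would sit inside the outer loop, once per response:

```python
# Stand-in for one decoded API response (trimmed to the relevant keys).
beta = {
    "meta": {"code": 200},
    "data": {
        "username": "luxury_mpan",
        "counts": {"media": 17774, "followed_by": 7982, "follows": 7264},
    },
}

list1, list2, list3 = [], [], []

# No inner loop needed: go straight to the nested "counts" dictionary.
counts = beta["data"]["counts"]
list1.append(counts["media"])
list2.append(counts["followed_by"])
list3.append(counts["follows"])

print(list1, list2, list3)  # [17774] [7982] [7264]
```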
As to your secondary question: I suggest looking into regular expressions.
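As a starting point, here is a rough sketch with the standard re module. The pattern is an assumption based on the one bio shown in the question (the label "Whatsapp", a colon, then an optional + and digits); other profiles may well format it differently, so treat it as something to adapt rather than a general solution:

```python
import re

# The bio string from the example response.
bio = (
    "Recruitment Agents👑👑👑👑\nThe most powerful manufacturers,\n"
    "we have the best quality.\n📱Wechat:13255996580💜💜\n"
    "📱Whatsapp:+8618820784535"
)

# Assumed format: "whatsapp" (any case), optional spaces around ":",
# then an optional "+" followed by digits.
pattern = re.compile(r"whatsapp\s*:\s*(\+?\d+)", re.IGNORECASE)

match = pattern.search(bio)
number = match.group(1) if match else None  # None when no number is found

print(number)  # +8618820784535
```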