This is my dictionary format:
quest_attr = {
"questions": [
{
"Tags": [
{
"tagname": ""
}
],
"Title": "",
"Authors": [
{
"name": ""
}
],
"Answers": [
{
"ans": ""
}
],
"Related_Questions": [
{
"quest": ""
}
]
}
]
}
I want to add a list of "Tags" such that the result will be:
"questions":[
{
"Tags": [
{"tagname":"#Education"}, {"tagname":"#Social"}
],
remaining fields...
}
The remaining fields can be assumed to be null. And I want to add multiple questions to the main "questions" list.
I am using this code but the results are not as expected.
ind=0
size=len(tags)
while ind<size:
    quest_attr["questions"].append({["Tags"].append({"tagname":tags[ind]})})
    ind=ind+1
And if I maintain a variable for looping through the list of questions like:
quest_attr["questions"][ind]["Tags"].append({"tagname":tags[ind]})
It gives an error that the index is out of range. What should I do?
It appears that the index variable ind is intended to iterate only through the list of tags. The way you have the append structured, your loop will attempt to attach the next tag to the next question in the questions list, instead of adding the rest of the tags to the same question.
If you want to add the same set of tags to multiple questions, you need to loop through the questions list separately and nest the append statement for the tags inside an inner loop. On the other hand, if there is only one question you want to target, just use its index, [0] in this case.
Something like this would perhaps work better but more context would help:
for question in quest_attr["questions"]:
    for tag in tags:
        question["Tags"].append({"tagname":tag})
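If the goal is several questions, each with its own list of tags, a minimal sketch along these lines may help. The helper name make_question is hypothetical (it is not part of the original code), and the remaining fields are simply left empty, as the question allows.

```python
def make_question(tags, title=""):
    """Build one question entry; only Tags and Title are filled in."""
    return {
        "Tags": [{"tagname": tag} for tag in tags],
        "Title": title,
        "Authors": [],
        "Answers": [],
        "Related_Questions": [],
    }


quest_attr = {"questions": []}

# Each append adds a whole question, so no index bookkeeping is needed.
quest_attr["questions"].append(make_question(["#Education", "#Social"]))
quest_attr["questions"].append(make_question(["#Python"]))

print(quest_attr["questions"][0]["Tags"])
```

Building the complete question first and appending it afterwards avoids indexing into a position in the questions list that does not exist yet, which is what raises the out-of-range error.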
Please don't mix dicts and lists as heavily as your code does.
Here I recommend a simpler layout:
quest_attr = {
'questions': {
"Tags":[],
"Title":"",
"Authors":[],
"Answers":[],
"Related_Questions":[]
}
}
tags = [ {"tagname":"#Education"},{"tagname":"#Social"} ]
quest_attr["questions"]['Tags'] += tags
print(quest_attr)
Related
I have a file of hundreds of JSON schemas that I need to parse and fix so that the "isRelatedto" property holds a single object (dictionary), without the extra strings outside the object. For strings that start with "pubmed", I also want to change the prefix to "pmid" and insert the value into the object inside the "isRelatedto" array (list). The problem is that some of the "isRelatedto" arrays contain an object and some do not, as shown below. When I run the code I wrote to make the change, the schemas without an object are not changed. I would really appreciate your help!
"isRelatedto": [
"pubmed:18984613",
"pubmed:25392406",
"pubmed:33147627"
]
"isRelatedto": [
"pubmed:20947564",
"pmcid:PMC3013774",
{
"#id": "https://doi.org/10.1093/nar/gkq901",
"#type": "sc:CreativeWork"
}
]
The expected result for the first one will be:
"isRelatedto": [
{
"pmid": "18984613",
"pmid": "25392406",
"pmid": "33147627"
}
]
and for the second one will be:
"isRelatedto": [
{
"#id": "https://doi.org/10.1093/nar/gkq901",
"#type": "sc:CreativeWork",
"pmid": "20947564"
}
]
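One possible approach is sketched below, under stated assumptions. A Python dict (and, in practice, a JSON object) cannot hold the same "pmid" key three times, so this sketch stores several IDs as a list under a single "pmid" key. Strings with other prefixes, such as "pmcid:", are dropped because the expected output above omits them. The function name normalize_related is hypothetical.

```python
def normalize_related(items):
    """Collapse an "isRelatedto" list into a list holding one object."""
    merged = {}
    pmids = []
    for item in items:
        if isinstance(item, dict):
            # Keep the fields of any object already present.
            merged.update(item)
        elif isinstance(item, str) and item.startswith("pubmed:"):
            # "pubmed:18984613" -> "18984613"
            pmids.append(item.split(":", 1)[1])
        # Any other string is ignored (assumption based on the expected output).

    if len(pmids) == 1:
        merged["pmid"] = pmids[0]
    elif pmids:
        # Assumption: multiple IDs become a list, since keys must be unique.
        merged["pmid"] = pmids
    return [merged] if merged else []


print(normalize_related(["pubmed:18984613", "pubmed:25392406", "pubmed:33147627"]))
```

Because the object is created even when the input list contains only strings, schemas without an existing object are handled the same way as those with one.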
I have a document structure like this:
{
"name": "Example",
"description": "foo",
"vocabulary": [
["apple", "pomme"],
["hello", "bonjour"],
["bus", "bus"]
]
}
Now I want to pull an array inside the vocabulary array by specifying the first item, e.g.:
{"$pull": {"vocabulary.$": ["apple"]}}
Which should remove the array ["apple", "pomme"] from vocabulary, but this doesn't work.
I tried this ($pull from nested array), but it did not work, it threw
pymongo.errors.WriteError:
The positional operator did not find the match needed from the query., full error: {'index': 0, 'code': 2, 'errmsg': 'The positional operator did not find the match needed from the query.'}
Very tricky question.
I think the $ positional operator is not suitable for this case.
Instead, you need an aggregation pipeline in the update query.
Use $filter to keep only the inner arrays in which "apple" is not ($not) present ($in), then $set the result back to the vocabulary field.
db.collection.update({},
[
{
"$set": {
"vocabulary": {
$filter: {
input: "$vocabulary",
cond: {
$not: {
$in: [
"apple",
"$$this"
]
}
}
}
}
}
}
])
Sample Mongo Playground
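Since the question uses pymongo, here is a rough sketch of the same pipeline written as Python data, together with a pure-Python mirror of what the $filter stage computes. It assumes MongoDB 4.2+ (where updates accept a pipeline) and a hypothetical collection object, so the actual database call is left as a comment. The helper name drop_pairs_containing is also hypothetical.

```python
# The update pipeline from above, expressed as Python dicts and lists.
pipeline = [
    {
        "$set": {
            "vocabulary": {
                "$filter": {
                    "input": "$vocabulary",
                    "cond": {"$not": {"$in": ["apple", "$$this"]}},
                }
            }
        }
    }
]

# With pymongo this would be passed as the update argument, e.g.:
# collection.update_many({}, pipeline)   # `collection` is hypothetical


def drop_pairs_containing(vocabulary, word):
    """Pure-Python equivalent of the $filter condition, for illustration."""
    return [pair for pair in vocabulary if word not in pair]


vocabulary = [["apple", "pomme"], ["hello", "bonjour"], ["bus", "bus"]]
print(drop_pairs_containing(vocabulary, "apple"))
```

The pure-Python function is only there to show which inner arrays survive the filter; it does not touch the database.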
I'm new to Python, trying to gather data from a json file that consists of a list that contains info inside dictionaries as follows. How do I extract the "count" data from this? (Without using list comprehension)
{
"stats":[
{
"name":"Ron",
"count":98
},
{
"name":"Sandy",
"count":89
},
{
"name":"Sam",
"count":77
}
]
}
Index the list using the stats key then iterate through it
data = {
"stats":[
{
"name":"Ron",
"count":98
},
{
"name":"Sandy",
"count":89
},
{
"name":"Sam",
"count":77
}
]
}
for stat in data['stats']:
    count = stat['count']
Consider the dictionary data stored in a variable source.
source = {
"stats":[
{
"name":"Ron",
"count":98
},
{
"name":"Sandy",
"count":89
},
{
"name":"Sam",
"count":77
}
]
}
Now to access the count field inside of "stats" we use indexing.
For example, to view the count of "Ron" you would write:
print(source['stats'][0]['count'])
This will result in 98
Similarly, for "Sam" it will be
print(source['stats'][2]['count'])
And the result will be 77
In short, we first index the dictionary by key, then the array by position, and then name the field whose data you want.
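If the position of an entry is not known in advance, one option is to build a lookup by name with a plain for loop (no list comprehension). This is a small sketch; the variable name counts_by_name is an assumption, not something from the question.

```python
source = {
    "stats": [
        {"name": "Ron", "count": 98},
        {"name": "Sandy", "count": 89},
        {"name": "Sam", "count": 77},
    ]
}

# Map each name to its count so entries can be looked up without an index.
counts_by_name = {}
for stat in source["stats"]:
    counts_by_name[stat["name"]] = stat["count"]

print(counts_by_name["Sandy"])
```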
I hope it helped.
Simply append all those values to do calculations:
count_values = []
for dic in data['stats']:
    count_values.append(dic['count'])
# Do anything with count_values
print(count_values)
According to the Zen of Python, "Simple is better than complex."
Thus, list comprehension is actually the best way to extract the information you need and still have it available for further processing (in the form of a list of count values).
d = <your dict-list>
count_data_list = [ x['count'] for x in d['stats'] ]
If not, and your intention is to process the "count" data as it is extracted, I'd suggest a for-loop:
d = <your dict-list>
for x in d['stats']:
    count_data = x['count']
    <process "count_data">
Using the map function will do that in a single line:
>>> result = list(map(lambda x: x['count'], data['stats']))
>>> result
[98, 89, 77]
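As a variation on the map approach, operator.itemgetter from the standard library can stand in for the lambda. This is a sketch using the same sample data as the question.

```python
from operator import itemgetter

data = {
    "stats": [
        {"name": "Ron", "count": 98},
        {"name": "Sandy", "count": 89},
        {"name": "Sam", "count": 77},
    ]
}

# itemgetter("count") returns a callable equivalent to lambda x: x["count"].
counts = list(map(itemgetter("count"), data["stats"]))
print(counts)
```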
I'm working on some code that processes a json database with very detailed information into a simpler format. It copies some of the fields and reserializes others into a new json file.
I'm currently using a dictionary comprehension like this MCVE:
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variations': [variant for variant in raw_item['field_b']['variations']]
} for raw_item in json.loads(my_file.read())
}
An example file (not the actual data being used) is this:
[
{
"name": "Object A",
"field_a": "foo",
"field_b": {
"bar": "baz",
"variants": [
"foo",
"bar",
"baz"
]
}
},
{
"name": "Object B",
"field_a": "foo",
"field_b": {
"bar": "baz"
}
}
]
The challenge is that not all items contain variations. I see two potential solutions:
1. Use an if statement to conditionally add the variations field to the dictionary.
2. Include an empty variations field for all items and fill it if the raw item contains variations.
I'll probably settle on the 2nd solution. However, is there a way to conditionally include a particular field within a dictionary comprehension?
Edit: In other words, is approach 1 possible inside a dictionary comprehension?
An example of the desired output (using a dictionary comprehension) would be as follows:
{
"Object A": {
"state": "foo",
"variants": ["foo", "bar", "baz"]
},
"Object B": {
"state": "foo"
}
}
I've found some other questions that change the entries conditionally or filter the entries, but these don't unconditionally create an item where a particular field (in the item) is conditionally absent.
I'm not sure you realise you can use the if inside an assignment, which seems like a very clean way to solve it to me:
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variants': [] if 'variants' not in raw_item['field_b'] else
[str(variant) for variant in raw_item['field_b']['variants']]
} for raw_item in example
}
(Note: using str() instead of undefined function that was given in initial example)
After clarification of the question, here's an alternate solution that builds a different dictionary (omitting the 'variants' key when there are none):
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variants': [str(variant) for variant in raw_item['field_b']['variants']]
} if 'variants' in raw_item['field_b'] else {
'state': raw_item['field_a'],
} for raw_item in example
}
If the question actually is: can a key/value pair in a dictionary literal be optional (which would solve your problem) then the answer is simply "no". But the above achieves the same for this simple example.
If the real life situation is more complicated, simply construct the dictionary as in the first solution given here and then use del(dictionary['key']) to remove any added keys that have a None or [] value after construction.
For example, after the first example, converted_data could be cleaned up with:
for item in converted_data.values():
    if not item['variants']:
        del(item['variants'])
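A further variation, not taken from the answer above, is to unpack a conditional dictionary inside the literal with ** (this assumes Python 3.5 or later). The sketch below reuses the example data from the question and avoids both the duplicated 'state' line and the clean-up pass.

```python
example = [
    {
        "name": "Object A",
        "field_a": "foo",
        "field_b": {"bar": "baz", "variants": ["foo", "bar", "baz"]},
    },
    {
        "name": "Object B",
        "field_a": "foo",
        "field_b": {"bar": "baz"},
    },
]

converted_data = {
    raw_item["name"]: {
        "state": raw_item["field_a"],
        # Unpack either a one-key dict or an empty dict, so the key is
        # only present when the raw item actually has variants.
        **(
            {"variants": list(raw_item["field_b"]["variants"])}
            if "variants" in raw_item["field_b"]
            else {}
        ),
    }
    for raw_item in example
}

print(converted_data)
```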
You could pass the process out to a function?
def check_age(age):
    return age >= 18
my_dic = {
"age": 25,
"allowed_to_drink": check_age(25)
}
You end up with the value as the result of the function call
{'age': 25, 'allowed_to_drink': True}
How you would implement this I don't know, but some food for thought.
I have JSON output as follows:
{
"service": [{
"name": ["Production"],
"id": ["256212"]
}, {
"name": ["Non-Production"],
"id": ["256213"]
}]
}
I wish to find all IDs where the pair contains "Non-Production" as a name.
I was thinking along the lines of running a loop to check, something like this:
data = json.load(urllib2.urlopen(URL))
for key, value in data.iteritems():
if "Non-Production" in key[value]: print key[value]
However, I can't seem to get the name and ID from the "service" tree, it returns:
if "Non-Production" in key[value]: print key[value]
TypeError: string indices must be integers
Assumptions:
The JSON is in a fixed format, this can't be changed
I do not have root access and am unable to install any additional packages
Essentially, the goal is to obtain a list of IDs of non-production "services" in the most efficient way.
Here you go:
data = {
"service": [
{"name": ["Production"],
"id": ["256212"]
},
{"name": ["Non-Production"],
"id": ["256213"]}
]
}
for item in data["service"]:
    if "Non-Production" in item["name"]:
        print(item["id"])
Whenever I see JSON I think about functional programming! Anyone else?!
I think it is a better idea to use functions like concat or flat, filter, reduce, etc.
E.g., a one-liner:
[s.get('id', [0])[0] for s in filter(lambda srv: "Non-Production" in srv.get('name', []), data.get('service', []))]
EDIT:
I updated the code so that even if data = {}, the result will be [], an empty ID list.
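For comparison, the same lookup can be written as a small function that returns a flat list of IDs, using only the standard library. This is a sketch; the function name non_production_ids is hypothetical, and it assumes the fixed JSON layout shown in the question.

```python
def non_production_ids(data):
    """Return every ID whose service name list contains "Non-Production"."""
    ids = []
    for service in data.get("service", []):
        if "Non-Production" in service.get("name", []):
            # "id" is itself a list in this JSON, so extend rather than append.
            ids.extend(service.get("id", []))
    return ids


data = {
    "service": [
        {"name": ["Production"], "id": ["256212"]},
        {"name": ["Non-Production"], "id": ["256213"]},
    ]
}

print(non_production_ids(data))
```

Using .get() with empty defaults means missing keys yield an empty result instead of raising KeyError.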