MongoDB: adding a new field at a specific position in an existing document (Python)

I need to insert a new field into an existing document at a specific position.
Sample document:
{
    "name": "user",
    "age": "21",
    "designation": "Developer"
}
Given the sample document above, I want to add "university": "ASU" directly after the "age" key. Is this possible?

Here's what you can do: treat the document as a dict, add the new field, find the index of 'age' in the list of items, and then reorder the items. This relies on dicts preserving insertion order (Python 3.7+). See below:
>>> dic = { "name": "user", "age" : "21", "designation": "Developer" }
>>> dic['university'] = 'ASU'
>>> dic
{'name': 'user', 'age': '21', 'designation': 'Developer', 'university': 'ASU'}
The university field has been added at the end. Now we reorder the items using dic.items().
>>> i = list(dic.items())
>>> i
[('name', 'user'), ('age', '21'), ('designation', 'Developer'), ('university', 'ASU')]
# now find the index of the 'age' field by comparing keys
>>> index = [j for j in range(len(i)) if i[j][0] == 'age'][0]
# the comprehension returns a one-element list; [0] extracts the index from it
>>> index
1
# insert the last element ('university') right after the age field, i.e. at index + 1
>>> i.insert(index+1,i[-1])
# then build the dict, dropping the last element, which has already been inserted after age
>>> dict(i[:-1])
{'name': 'user', 'age': '21', 'university': 'ASU', 'designation': 'Developer'}
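The same reordering can be wrapped in a small helper. This is only a sketch: insert_after is a hypothetical name (it is not part of pymongo or the standard library), it assumes Python 3.7+ where dicts keep insertion order, and it only rearranges the Python dict. Writing the result back to MongoDB is a separate step that is not shown here.

```python
def insert_after(doc, after_key, new_key, new_value):
    """Return a new dict with new_key placed directly after after_key.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    if after_key not in doc:
        raise KeyError(after_key)
    if new_key == after_key:
        raise ValueError("new_key must differ from after_key")
    result = {}
    for key, value in doc.items():
        if key == new_key:
            continue  # drop any old copy so the new position wins
        result[key] = value
        if key == after_key:
            result[new_key] = new_value
    return result


doc = {"name": "user", "age": "21", "designation": "Developer"}
reordered = insert_after(doc, "age", "university", "ASU")
print(reordered)
# {'name': 'user', 'age': '21', 'university': 'ASU', 'designation': 'Developer'}
```

Building a new dict in one pass avoids the list insert and slice, and it leaves the original document untouched.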

Related

python dynamic nested dictionary to csv

The obtained output below are from query results.
{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'pineapple': '2', 'grape': '0', 'apple': 'unknown'},'day': 'Tues', 'month': 'July', 'address': 'long', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}
{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'lemon': '2', 'grape': '0', 'apple': 'unknown', 'strawberry': '1'},'day': 'Mon', 'month': 'January', 'address': 'longer', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}
#worked but not with fruits and dynamic header
date = json.dumps(q['date']) #convert it to string
date = re.split("(:|\}| )", date)[4] #and split to get value
for q in db.fruits.aggregate(query):
    print('"' + q['day'] + '","' + q['month'] + '","' + date + '","' + q['time'] + '","' + q['buyer'] + '","' + q['seller'] + '"')
#below close to what I want but having issue with nested and repeated rows
ffile = open("fruits.csv", "w")
w = csv.DictWriter(ffile, q.keys())
w.writeheader()
w.writerow(q)
I want to create a CSV from it.
I am able to get everything in the desired table except the fruits. I am stuck on the nested dictionary field and on the dynamic table header.
mongoexport doesn't work for me at the moment.
The fruits field can have different nested keys and values each time.
I am still exploring csv.writer and trying to add a condition for when I find a nested dict. [I will update with an answer if I manage to create the CSV.]
A hint on how to create this CSV would be nice to have.
Thank you to anyone who shares a link to a similar question.
Not a problem!
We'll need to flatten the deep structure so we can collect all possible keys from it to form the CSV header. That requires a recursive function (flatten_dict here) that takes an input dict and turns it into an output dict that contains no nested dicts; here, the keys are tuples, e.g. ('foo', 'bar', 'baz').
We run that function over all input rows, gathering the keys we encounter along the way into the known_keys set.
That set is sorted (since we assume that the original dicts don't really have an intrinsic order either), and each key tuple is joined with dots to form the CSV header row.
Then, the flattened rows are simply iterated over and written (taking care to write an empty string for non-existent values).
The output is e.g.
_id,address,buyer,date.date,day,fruits.apple,fruits.grape,fruits.lemon,fruits.pineapple,fruits.strawberry,month,seller
651f3e6e5723b7c1,long,B1001,210324,Tues,unknown,0,,2,,July,S1301
651f3e6e5723b7c2,longer,B1001,210324,Mon,unknown,0,2,,1,January,S1301
import csv
import sys
rows = [
    {
        "_id": "651f3e6e5723b7c1",
        "fruits": {"pineapple": "2", "grape": "0", "apple": "unknown"},
        "day": "Tues",
        "month": "July",
        "address": "long",
        "buyer": "B1001",
        "seller": "S1301",
        "date": {"date": 210324},
    },
    {
        "_id": "651f3e6e5723b7c2",
        "fruits": {
            "lemon": "2",
            "grape": "0",
            "apple": "unknown",
            "strawberry": "1",
        },
        "day": "Mon",
        "month": "January",
        "address": "longer",
        "buyer": "B1001",
        "seller": "S1301",
        "date": {"date": 210324},
    },
]
def flatten_dict(d: dict) -> dict:
    """
    Flatten hierarchical dicts into a dict of path tuples -> deep values.
    """
    out = {}
    def _flatten_into(into, pairs, prefix=()):
        for key, value in pairs:
            p_key = prefix + (key,)
            if isinstance(value, list):
                # list items are keyed by their index
                _flatten_into(into, enumerate(value), p_key)
            elif isinstance(value, dict):
                _flatten_into(into, value.items(), p_key)
            else:
                # leaf value: store it under its full path
                into[p_key] = value
    _flatten_into(out, d.items())
    return out
known_keys = set()
flat_rows = []
for row in rows:
    flat_row = flatten_dict(row)
    known_keys |= set(flat_row.keys())
    flat_rows.append(flat_row)
ordered_keys = sorted(known_keys)
writer = csv.writer(sys.stdout)
writer.writerow([".".join(map(str, key)) for key in ordered_keys])
for flat_row in flat_rows:
    writer.writerow([str(flat_row.get(key, "")) for key in ordered_keys])
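As a variation on the approach above, csv.DictWriter can fill in the blanks itself through its restval argument. This is a minimal sketch, assuming dotted string keys instead of tuples and nested dicts only (no lists); flatten and rows_to_csv are illustrative names, not taken from the answer.

```python
import csv
import io


def flatten(d, prefix=""):
    """Flatten nested dicts into a single dict with dotted string keys."""
    out = {}
    for key, value in d.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            out.update(flatten(value, name))
        else:
            out[name] = value
    return out


def rows_to_csv(rows):
    """Return CSV text whose header is the sorted union of all flattened keys."""
    flat_rows = [flatten(row) for row in rows]
    header = sorted({key for row in flat_rows for key in row})
    buf = io.StringIO()
    # restval="" writes an empty cell for keys a row does not have
    writer = csv.DictWriter(buf, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(flat_rows)
    return buf.getvalue()


sample = [
    {"fruits": {"apple": "1"}, "day": "Mon"},
    {"fruits": {"lemon": "2"}, "day": "Tues"},
]
print(rows_to_csv(sample))
```

To write to a file instead, the same DictWriter can be pointed at a handle opened with open("fruits.csv", "w", newline="").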

Inserting items into the middle of a dict

I have a list of dictionaries where all dicts contain similar items but some dicts are missing certain items. Here's an example of what it would look like:
data = [
    {
        'city': 'Toronto',
        'colour': 'blue'
    },
    {
        'city': 'London',
        'country': 'UK',
        'colour': 'green',
        'name': 'Alex'
    },
    {
        'city': 'Kingston',
        'colour': 'purple',
        'name': 'Alex'
    }
]
I need to match the format of the largest dict by inserting items (with the same keys but blank values) into the smaller dicts. I also need to preserve the order of the keys so I can't just insert them at the end. Following the previous example, it would look like this:
data = [
    {
        'city': 'Toronto',
        'country': '',
        'colour': 'blue',
        'name': ''
    },
    {
        'city': 'London',
        'country': 'UK',
        'colour': 'green',
        'name': 'Alex'
    },
    {
        'city': 'Kingston',
        'country': '',
        'colour': 'purple',
        'name': 'Alex'
    }
]
I'm not sure how to loop through and add entries to each dict, since the dicts I'm comparing are different sizes. I've tried copying the largest dict and editing it, adding blank values to the end of each dict and reformatting those, and creating new dicts as I loop through, but nothing has worked so far.
Here is my code so far (where all_keys is a list of all the keys in the correct order).
def format_data(input_data, all_keys):
    formatted_list = [{} for i in range(len(input_data))]
    increment = 0
    for i in range(len(input_data)):
        for key, value in input_data[i].items():
            if (key == all_keys[increment]):
                formatted_list[i][increment].update(all_keys[increment], '')
            else:
                formatted_list[i][increment].update(key, value)
            increment += 1
        increment = 0
    return formatted_list
How can I format this? Thanks!
Before Python 3.7, dictionaries are not guaranteed to preserve insertion order. If you need a specific order, you must specify it explicitly.
For Python <3.7, you can use collections.OrderedDict. Either way, there is no built-in operation for "inserting into the middle of a dictionary"; you build a new mapping with the keys in the order you want.
The example below uses set.union to compute the union of all keys and sorted to order them alphabetically (note that this is alphabetical order, not the key order shown in the question).
from collections import OrderedDict
keys = sorted(set().union(*data))
res = [OrderedDict([(k, d.get(k, '')) for k in keys]) for d in data]
Result:
print(res)
[OrderedDict([('city', 'Toronto'),
('colour', 'blue'),
('country', ''),
('name', '')]),
OrderedDict([('city', 'London'),
('colour', 'green'),
('country', 'UK'),
('name', 'Alex')]),
OrderedDict([('city', 'Kingston'),
('colour', 'purple'),
('country', ''),
('name', 'Alex')])]
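You can also collect the keys with a set comprehension. Note that a set does not preserve key order, and that b.get(i) fills missing values with None (null in the JSON output):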
import json
d = [{'city': 'Toronto', 'colour': 'blue'}, {'city': 'London', 'country': 'UK', 'colour': 'green', 'name': 'Alex'}, {'city': 'Kingston', 'colour': 'purple', 'name': 'Alex'}]
full_keys = {i for b in map(dict.keys, d) for i in b}
final_dict = [{i:b.get(i) for i in full_keys} for b in d]
print(json.dumps(final_dict, indent=4))
Output:
[
    {
        "colour": "blue",
        "city": "Toronto",
        "name": null,
        "country": null
    },
    {
        "colour": "green",
        "city": "London",
        "name": "Alex",
        "country": "UK"
    },
    {
        "colour": "purple",
        "city": "Kingston",
        "name": "Alex",
        "country": null
    }
]
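If the column order is already known (the question mentions an all_keys list), the alphabetical sort can be skipped and each dict rebuilt in that order. This is a minimal sketch, assuming Python 3.7+ (plain dicts keep insertion order); pad_dicts is a hypothetical helper name and the sample data is copied from the question.

```python
def pad_dicts(data, all_keys, fill=""):
    """Rebuild every dict with the keys in all_keys order, filling gaps with fill."""
    return [{key: d.get(key, fill) for key in all_keys} for d in data]


data = [
    {"city": "Toronto", "colour": "blue"},
    {"city": "London", "country": "UK", "colour": "green", "name": "Alex"},
    {"city": "Kingston", "colour": "purple", "name": "Alex"},
]
all_keys = ["city", "country", "colour", "name"]

padded = pad_dicts(data, all_keys)
print(padded[0])
# {'city': 'Toronto', 'country': '', 'colour': 'blue', 'name': ''}
```

Any key that is present in a dict but missing from all_keys is dropped, so all_keys should cover every key you want to keep.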

Quick way to add a key to every value in a list

Let's say I have a list of elements
tagsList = ['dun', 'dai', 'che']
How do I convert the above into the following?
tagsDictionaries = [
    {
        'name': 'dun'
    },
    {
        'name': 'dai'
    },
    {
        'name': 'che'
    }
]
I want to do this with a for loop
tagsDictionaries = [{'name': item} for item in tagsList]
Something like this would work for a flat dictionary. It needs a unique key each time:
tagDictionary = {}
for tag in tagsList:
    tagDictionary.update({tag + 'uniquekey': tag})
What you show in your example is a list of dictionaries, which can be built as follows:
tagListDict = []
for tag in tagsList:
    tagListDict.append({'name': tag})
Here is a basic for loop that will get you the output that you are seeking:
tagsList = ['dun', 'dai', 'che']
tagsDictionaries = []
for name in tagsList:
    new_dict = {'name': name}
    tagsDictionaries.append(new_dict)
print(tagsDictionaries)
Here is your output:
[{'name': 'dun'}, {'name': 'dai'}, {'name': 'che'}]

Generate single dictionary from a list of dictionaries

I have a method that takes a list of field names. In the method, I am making an API call out to get a record which will contain a list of dictionaries of fields.
API call example:
"fields": [
    {
        "datetime_value": "1987-02-03T00:00:00",
        "name": "birth_date"
    },
    {
        "text_value": "Dennis",
        "name": "first_name"
    },
    {
        "text_value": "Monsewicz",
        "name": "last_name"
    },
    {
        "text_value": "Male",
        "name": "sex"
    },
    {
        "text_value": "White",
        "name": "socks"
    }
]
My method signature looks like contact(contact_id, contact_fields), where contact_fields looks like ['last_name', 'first_name'].
The final fields dictionary I am trying to create would look like this (I'm not worried about order):
{
    "last_name": "Monsewicz",
    "first_name": "Dennis"
}
So, basically, I want to generate a single dictionary where each key is the name attribute from a dictionary in the list, but only if that name is in the list of field names passed into the method.
I've tried this:
"fields": {x: y for x, y in contact['fields'] if x in contact_fields}
Something like this?
>>> fields
[{'datetime_value': '1987-02-03T00:00:00', 'name': 'birth_date'},
{'name': 'first_name', 'text_value': 'Dennis'},
{'name': 'last_name', 'text_value': 'Monsewicz'},
{'name': 'sex', 'text_value': 'Male'},
{'name': 'socks', 'text_value': 'White'}]
>>> output = {}
>>> for field in fields:
...     key = field.pop('name')
...     _unused_key, value = field.popitem()
...     output[key] = value
...
>>> output
{'birth_date': '1987-02-03T00:00:00',
'first_name': 'Dennis',
'last_name': 'Monsewicz',
'sex': 'Male',
'socks': 'White'}
How about this one-liner?
output = dict((x['name'], x['text_value']) for x in fields)
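It loops through fields, pulls out the name/text_value pairs, and constructs a dict from them. Note that it raises a KeyError for any entry without a text_value key, such as birth_date in the sample above, which has datetime_value instead.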
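Neither snippet above filters by the requested field names, which the question asks for. The sketch below adds that filter without mutating the input. It assumes each entry holds "name" plus exactly one value key (text_value, datetime_value, and so on); pick_fields is a hypothetical helper name.

```python
def pick_fields(fields, wanted):
    """Map name -> value for the entries whose name is in wanted."""
    wanted = set(wanted)
    result = {}
    for field in fields:
        name = field.get("name")
        if name not in wanted:
            continue
        # Assumption: the only key besides "name" holds the value.
        values = [value for key, value in field.items() if key != "name"]
        if values:
            result[name] = values[0]
    return result


fields = [
    {"datetime_value": "1987-02-03T00:00:00", "name": "birth_date"},
    {"text_value": "Dennis", "name": "first_name"},
    {"text_value": "Monsewicz", "name": "last_name"},
    {"text_value": "Male", "name": "sex"},
    {"text_value": "White", "name": "socks"},
]
print(pick_fields(fields, ["last_name", "first_name"]))
# {'first_name': 'Dennis', 'last_name': 'Monsewicz'}
```

Because it reads the entries instead of popping from them, the fields list can be reused afterwards.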

How to properly parse JSON with simplejson?

I can have the following JSON string:
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ],
[ { "cid" : "1",
"name" : "Something"
} ],
[ { "cid" : 1,
"name" : "Something-else"
} ]
] }
or one of the following:
{"error":"some-error"}
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ],
[ { "cid" : "1",
"name" : ""
} ],
[ { "cid" : 1,
"name" : "Something-else"
} ]
] }
{ "response" : [ [ { "name" : "LA_",
"uid" : 123456
} ] ] }
So I can't be sure that all children and elements are present. Are the following checks enough to get the Something value?
if jsonstr.get('response'):
    jsonstr = jsonstr.get('response')[1][0]
    if jsonstr:
        name = jsonstr.get('name')
        if jsonstr: # I don't need empty value
            # save in the database
Can the same be simplified?
Your examples show that the inner lists are not always present (the last one contains only a single inner list), so a hard-coded index such as [1][0] is not a safe way to reach the object whose name attribute is Something.
Instead of nesting all those if statements, you can get away with using a list comprehension. Observe that if you iterate the response key, you get a list of lists, each with a dictionary inside of it:
>>> data = {"response":[[{"uid":123456,"name":"LA_"}],[{"cid":"1","name":"Something"}],[{"cid":1,"name":"Something-else"}]]}
>>> [lst for lst in data.get('response')]
[[{'name': 'LA_', 'uid': 123456}], [{'name': 'Something', 'cid': '1'}], [{'name': 'Something-else', 'cid': 1}]]
If you index the first item in each list (lst[0]), you end up with a list of objects:
>>> [lst[0] for lst in data.get('response')]
[{'name': 'LA_', 'uid': 123456}, {'name': 'Something', 'cid': '1'}, {'name': 'Something-else', 'cid': 1}]
If you then add an if condition into your list comprehension to match the name attribute on the objects, you get a list with a single item containing your desired object:
>>> [lst[0] for lst in data.get('response') if lst[0].get('name') == 'Something']
[{'name': 'Something', 'cid': '1'}]
And then, by indexing the first item of that final list, you get the desired object:
>>> [lst[0] for lst in data.get('response') if lst[0].get('name') == 'Something'][0]
{'name': 'Something', 'cid': '1'}
So then you can just turn that into a function and move on with your life:
def get_obj_by_name(data, name):
    objects = [lst[0] for lst in data.get('response', []) if lst[0].get('name') == name]
    if objects:
        return objects[0]
    return None
print(get_obj_by_name(data, 'Something'))
# => {'name': 'Something', 'cid': '1'}
print(get_obj_by_name(data, 'Something')['name'])
# => Something
And it should be resilient and return None if the response key isn't found:
print(get_obj_by_name({"error":"some-error"}, 'Something'))
# => None
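One case the function above does not cover is an empty inner list, where lst[0] raises IndexError. The sketch below is a slightly more defensive variant written as plain loops. It assumes the JSON has already been parsed into a dict (for example with json.loads); find_by_name is a hypothetical name.

```python
import json


def find_by_name(data, name):
    """Return the first object under "response" whose "name" equals name, else None."""
    for inner in data.get("response", []):
        for obj in inner:  # an empty inner list is simply skipped
            if isinstance(obj, dict) and obj.get("name") == name:
                return obj
    return None


raw = '{"response": [[{"name": "LA_", "uid": 123456}], [], [{"cid": "1", "name": "Something"}]]}'
data = json.loads(raw)

print(find_by_name(data, "Something"))
# {'cid': '1', 'name': 'Something'}
print(find_by_name({"error": "some-error"}, "Something"))
# None
```

It also checks every object in each inner list, not just the first, which may or may not be what you want.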
