How do I convert ObjectId to string after find()? - python

I have this example of an object in a movies collection:
{
"_id":{"$oid":"5f5101c31a05d8a343f944b1"},
"title":"Mother to Earth",
"year":2020,
"description":"A group of simps tries to find the source of an obscure meme game.",
"screenings":
[
{
"screeningID":{"$oid":"5f5101c31a05d8a343f944b0"},
"timedate":"2020-09-29, 18:00PM",
"tickets":46
}
]
}
And I want this to be the output of a find() function, with title as the query. However, when I include _id and screeningID, I get a TypeError: Object of type ObjectId is not JSON serializable error. I need the screeningID's value, in order to use it in a later part of my code, and preferably as a string. How do I do that?
EDIT: Here are the two lines of code in question:
result = movies.find_one({'title':data['title']})
result = {'title': result['title'],"year": result['year'],'description': result['description'],'screenings': [result['screenings']]}
I skipped the conditional checks I had in there, for simplicity's sake. As is, this produces the error I showed above. The only solution is to add {'_id':0, 'screenings.screeningID':0} in the projection of the first line, but this means losing the ObjectIds, and especially screeningID, which I need for later.

I ran your code and got no errors. If you run this do you get any errors?
from pymongo import MongoClient
from bson import ObjectId
data = {
"_id": ObjectId("5f5101c31a05d8a343f944b1"),
"title":"Mother to Earth",
"year":2020,
"description":"A group of simps tries to find the source of an obscure meme game.",
"screenings":
[
{
"screeningID": ObjectId("5f5101c31a05d8a343f944b0"),
"timedate":"2020-09-29, 18:00PM",
"tickets":46
}
]
}
db = MongoClient(port=27019)['testdatabase']
db.testcollection.delete_many({})
db.testcollection.insert_one(data)
result = db.testcollection.find_one({'title':data['title']})
result = {'title': result['title'],"year": result['year'],'description': result['description'],'screenings': [result['screenings']]}
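If the error comes from serializing the result (for example with json.dumps or Flask's jsonify), one option is to convert the ObjectId values with str(). The sketch below is a minimal illustration, not the asker's code. FakeObjectId is a hypothetical stand-in for bson.ObjectId so that the example runs without PyMongo; it relies only on the fact that str() of a real ObjectId yields its 24-character hex string.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId (str() gives the hex string)."""

    def __init__(self, hex_value):
        self._hex = hex_value

    def __str__(self):
        return self._hex


# Shape of the document returned by find_one() in the question (shortened).
result = {
    "_id": FakeObjectId("5f5101c31a05d8a343f944b1"),
    "title": "Mother to Earth",
    "screenings": [
        {"screeningID": FakeObjectId("5f5101c31a05d8a343f944b0"), "tickets": 46}
    ],
}

# Convert a single nested ObjectId to a plain string for later use.
screening_id = str(result["screenings"][0]["screeningID"])

# default=str is called for every value json cannot serialize on its own.
payload = json.dumps(result, default=str)
print(screening_id)
print(payload)
```

With a real PyMongo result, the same two calls (str() on the field, or default=str when dumping) should apply unchanged.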


How to order bug reports by creation time

I am currently querying Bugzilla as follows:
r = requests.get(
"https://bugzilla.mozilla.org/rest/bug",
params={
"chfield": "[Bug creation]",
"chfieldfrom": "2015-01-01",
"chfieldto": "2016-01-01",
"resolution": "FIXED",
"limit": 200,
"api_key": api_key,
"include_fields": [
"id",
"description",
"creation_time",
],
},
)
and all I would like to add to my query is a method for ordering the bug reports. I have scoured the web for a method for ordering these results: ultimately, I would like them to be ordered from "2016-01-01" descending. I have tried adding the following key-value pairs to params:
"order": "creation_time desc"
"sort_by": "creation_time", "order" : "desc"
"chfieldorder": "desc"
and I've tried editing the URL to be https://bugzilla.mozilla.org/rest/bug?orderBy=creation_time:desc, but none of these approaches has worked. Unfortunately, invalid keys are ignored silently: results are returned, just not in sorted order.
Ordering and ranges (i.e., chfieldfrom and chfieldto) were not in any of the documentation that I found either.
I am aware that a hacked method of gathering ordered results would be to specify a narrow range of dates to get bug reports from, but I'm hoping there exists an actual key-value pair that can be specified to achieve the task.
Notably, of course: sorting after the request returns in r is invalid, because the results in r do not contain the most recent bugs.
You need to add
"order": [
"opendate DESC",
],
to your params.
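For reference, here is a sketch of the question's params with the order key added. No request is sent; urlencode from the standard library is used only to show the resulting query string, and the api_key entry is left out as an assumption that it is not needed to illustrate the ordering.

```python
from urllib.parse import urlencode

params = {
    "chfield": "[Bug creation]",
    "chfieldfrom": "2015-01-01",
    "chfieldto": "2016-01-01",
    "resolution": "FIXED",
    "limit": 200,
    "include_fields": ["id", "description", "creation_time"],
    "order": ["opendate DESC"],  # newest bugs first
}

# doseq=True repeats the key for each list item, as requests does for lists.
query_string = urlencode(params, doseq=True)
print(query_string)
```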
Quick test
To see more easily that it works, just run something like this after you received the response in r:
import json
data = json.loads(r.content)
bugs = data['bugs']
times = [x['creation_time'] for x in bugs]
print(times)
gives:
['2016-01-01T21:53:20Z', '2016-01-01T21:37:58Z', '2016-01-01T20:12:07Z', '2016-01-01T19:29:30Z', '2016-01-01T19:10:46Z', '2016-01-01T15:56:35Z',...
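Rather than reading the list by eye, the ordering can also be checked programmatically. This is a small sketch using the timestamps printed above; it assumes all values share the same ISO 8601 format, so that plain string comparison matches chronological order.

```python
# Timestamps copied from the output above.
times = [
    "2016-01-01T21:53:20Z",
    "2016-01-01T21:37:58Z",
    "2016-01-01T20:12:07Z",
    "2016-01-01T19:29:30Z",
    "2016-01-01T19:10:46Z",
    "2016-01-01T15:56:35Z",
]

# Each timestamp must be greater than or equal to the one after it.
is_descending = all(a >= b for a, b in zip(times, times[1:]))
print(is_descending)
```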
Details
If you are interested in the details: It looks like some fields in the Bugzilla codebase have different field names.
Take a look here https://github.com/bugzilla/bugzilla/blob/5.2/Bugzilla/Search.pm#L557:
# Backward-compatibility for old field names. Goes new_name => old_name.
# These are here and not in _translate_old_column because the rest of the
# code actually still uses the old names, while the fielddefs table uses
# the new names (which is not the case for the fields handled by
# _translate_old_column).
my %old_names = (
creation_ts => 'opendate',
delta_ts => 'changeddate',
work_time => 'actual_time',
);

How to iterate over a python dictionary, setting the key of the dictionary as another dictionary's value

I come from a C++ background, I am new to Python, and I suspect this problem has something to do with [im]mutability.
I am building a JSON representation in Python that involves several layers of nested lists and dictionaries in one "object". My goal is to call jsonify on the end result and have it look like nicely structured data.
I hit a problem while building out an object:
approval_groups_list = list()
approval_group_dict = dict()
for groupMemKey, groupvals in groupsAndMembersDict.items():
    approval_group_dict["group_name"] = groupMemKey
    approval_group_dict["name_dot_numbers"] = groupvals # groupvals is a list of strings
    approval_groups_list.append(approval_group_dict)
entity_approval_unit["approval_groups"] = approval_groups_list
The first iteration behaves as expected, but afterwards every entry in the list mirrors whichever groupMemKey was processed last.
groupsAndMembersDict= {
'Art': ['string.1', 'string.2', 'string.3'],
'Math': ['string.10', 'string.20', 'string.30']
}
Expected result:
approval_groups:
[
{
"group_name": "Art",
"name_dot_numbers": ['string.1', 'string.2', 'string.3']
},
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
}
]
Actual Result:
approval_groups:
[
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
},
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
}
]
What is happening, and how do I fix it?
Your problem is not the immutability, but the mutability of objects. I'm sure you would have ended up with the same result with the equivalent C++ code.
You construct approval_group_dict before the for loop and keep reusing it. All you have to do is to move the construction inside for so that a new object is created for each loop:
approval_groups_list = list()
for groupMemKey, groupvals in groupsAndMembersDict.items():
    approval_group_dict = dict()
    ...
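To make the difference visible, here is a minimal sketch with shortened sample data (the variable names are illustrative, not from the question). It builds the list both ways and checks whether the entries are the same object.

```python
groups = {"Art": ["string.1"], "Math": ["string.10"]}

# Buggy: one dict is created before the loop and appended repeatedly.
shared = {}
buggy = []
for name, members in groups.items():
    shared["group_name"] = name
    shared["name_dot_numbers"] = members
    buggy.append(shared)  # appends a reference to the same object

# Fixed: a new dict is created on every iteration.
fixed = []
for name, members in groups.items():
    fixed.append({"group_name": name, "name_dot_numbers": members})

# True: both list entries refer to one and the same dict.
same_object = buggy[0] is buggy[1]
print(same_object, [g["group_name"] for g in fixed])
```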
Through writing this question, it dawned on me to try a few things including this, which fixed my problem - however, I still don't know exactly why this works. Perhaps it is more like a pointer/referencing problem?
approval_groups_list = list()
approval_group_dict = dict()
for groupMemKey, groupvals in groupsAndMembersDict.items():
    approval_group_dict["group_name"] = groupMemKey
    approval_group_dict["name_dot_numbers"] = groupvals
    approval_groups_list.append(approval_group_dict.copy()) # <== note, here is the difference ".copy()"
entity_approval_unit["approval_groups"] = approval_groups_list
EDIT: The problem turns out to be that Python always passes references to objects (sometimes called "pass by object reference"). If you are new to Python like me, this means that appending a dict to a list stores a reference to that one dict, not a copy, so later changes to the dict show up in every list entry. Immutable objects only look like "pass by value" because they cannot be changed in place. So in a way it did have to do with [im]mutability. Mostly it had to do with my lack of understanding of how Python passes references.
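The following sketch illustrates that behaviour with two hypothetical helper functions (mutate and rebind are names made up for this example): changing an object in place is visible to the caller, while assigning a new object to the parameter name is not.

```python
def mutate(items):
    items.append("added")  # changes the caller's list in place


def rebind(items):
    items = ["new list"]  # only rebinds the local name; caller is unaffected


original = ["a"]

mutate(original)
after_mutate = list(original)

rebind(original)
after_rebind = list(original)

print(after_mutate, after_rebind)
```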

Accessing Nested JSON [AWS Metadata] with Python

I'm using Lambda to run through my AWS account, returning a list of all instances. I need to be able to print out all of the 'VolumeId' values, but I can't work out how to access them as they are nested. I am able to print out the first VolumeId for each instance, however, some of the instances have several volumes, and some only have one. I think I know why I get these results, but I can't work out what to do to get all of them back.
Here's a snippet of what the JSON for one instance looks like:
{
'Groups':[],
'Instances':[
{
'AmiLaunchIndex':0,
'ImageId':'ami-0',
'InstanceId':'i-0123',
'InstanceType':'big',
'KeyName':'nonprod',
'LaunchTime':'date',
'Monitoring':{
'State':'disabled'
},
'Placement':{
'AvailabilityZone':'world',
'GroupName':'',
'Tenancy':'default'
},
'PrivateDnsName':'secret',
'PrivateIpAddress':'1.2.3.4',
'ProductCodes':[
],
'PublicDnsName':'',
'State':{
'Code':80,
'Name':'stopped'
},
'StateTransitionReason':'User initiated',
'SubnetId':'subnet-1',
'VpcId':'vpc-1',
'Architecture':'yes',
'BlockDeviceMappings':[
{
'DeviceName':'/sda',
'Ebs':{
'AttachTime':'date',
'DeleteOnTermination':True,
'Status':'attached',
'VolumeId':'vol-1'
}
},
{
'DeviceName':'/sdb',
'Ebs':{
'AttachTime':'date',
'DeleteOnTermination':False,
'Status':'attached',
'VolumeId':'vol-2'
}
}
],
This is what I'm doing to get the first VolumeId:
ec2client = boto3.client('ec2')
ec2 = ec2client.describe_instances()
for reservation in ec2["Reservations"]:
    for instance in reservation["Instances"]:
        instanceid = instance["InstanceId"]
        volumes = instance["BlockDeviceMappings"][0]["Ebs"]["VolumeId"]
        print("The associated volume IDs for this instance are: ",(volumes))
I think the reason that I'm getting just the first ID is because I'm referencing the first element within "BlockDeviceMappings", but I can't work out how to get the other ones. If I try it without specifying the [0], I get the list indices must be integers or slices, not str error. I tried to use a dictionary instead of a list too, but felt like I was barking up the wrong tree with that one. Any suggestions/help would be appreciated!
One possible answer, not particularly pythonic
...
id_list = []
volumes_data = instance["BlockDeviceMappings"]
for element in volumes_data:
    id_list.append(element["Ebs"]["VolumeId"])
Or else use json.loads and then iterate through the JSON using .get syntax like the final answer in this
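The same idea can be written as a list comprehension per instance. This is a sketch that runs on a trimmed, hard-coded copy of the response shape from the question instead of calling boto3; the variable names (reservations, volumes_by_instance) are illustrative, and the "Ebs" check is an assumption to guard against mappings without an EBS entry.

```python
# Trimmed stand-in for ec2client.describe_instances()["Reservations"].
reservations = [
    {
        "Instances": [
            {
                "InstanceId": "i-0123",
                "BlockDeviceMappings": [
                    {"DeviceName": "/sda", "Ebs": {"VolumeId": "vol-1"}},
                    {"DeviceName": "/sdb", "Ebs": {"VolumeId": "vol-2"}},
                ],
            }
        ]
    }
]

volumes_by_instance = {}
for reservation in reservations:
    for instance in reservation["Instances"]:
        # Iterate over every mapping instead of indexing only element [0].
        volumes_by_instance[instance["InstanceId"]] = [
            mapping["Ebs"]["VolumeId"]
            for mapping in instance.get("BlockDeviceMappings", [])
            if "Ebs" in mapping
        ]

for instance_id, volume_ids in volumes_by_instance.items():
    print("The associated volume IDs for", instance_id, "are:", volume_ids)
```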

TypeError : Trouble accessing JSON metadata with Python

So I'm trying to access the following JSON data with Python, and when I give the statement:
print school['students']
The underlying data gets printed but what I really want to be able to do is print the 'id' value.
{ 'students':[
{
'termone':{
'english':'fifty',
'science':'hundred'
},
'id':'RA1081310005'
}
]
}
So when I do the following I get an error :
print school ['students']['id']
TypeError: list indices must be integers, not str
Can anyone suggest how I can access the ID and where I'm going wrong?
school['students'] is a list. You need to index into that list first, because the id key belongs to its first element. Try this:
school['students'][0]['id']
Out: 'RA1081310005'
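If the list can hold more than one student, indexing with [0] only reaches the first one. A minimal sketch (written for Python 3, using the data from the question) that collects every id:

```python
school = {
    "students": [
        {
            "termone": {"english": "fifty", "science": "hundred"},
            "id": "RA1081310005",
        }
    ]
}

# Each element of the list is a dict, so the string key works per element.
student_ids = [student["id"] for student in school["students"]]
print(student_ids)
```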
The problem here is that 'id' belongs to a dictionary that is wrapped in a list, so it cannot be reached directly with a string key. If you control the data and only ever have one student, you can change your dictionary to the following:
school = {'students':{
    'termone': {
        "english": "fifty",
        "science": "hundred"
    },
    "id":"RA1081310005"
    }
}
Basically, you have a list and, with a single student, there is no need for it, so I removed it.

Return Random Result from JSON using PyMongo

I'm attempting to retrieve a random result from a collection of JSON data using PyMongo. I'm using Flask and MongoDB. Here is how it is set up:
def getData():
    dataCollection = db["data"]
    for item in dataCollection.find({},{"Category":1,"Name":1,"Location":1,"_id":0}):
        return jsonify(item)
return jsonify(item) returns 1 result and it is always the first one. How can I randomize this?
I tried importing the random module (import random) and switched the last line to random.choice(jsonify(item)) but that results in an error.
Here is what the data looks like that was imported into MongoDB:
[
{
"Category":"Tennis",
"Name":"ABC Courts",
"Location":"123 Fake St"
},
{
"Category":"Soccer",
"Name":"XYZ Arena",
"Location":"319 Ace Blvd"
},
{
"Category":"Basketball",
"Name":"Dome Courts",
"Location":"8934 My Way"
},
]
You're always getting one result because return jsonify(item) ends the request on the first iteration of the loop. jsonify returns a response object; it does not just turn the result from Mongo into a JSON object. If you want to pick a random result, turn the Mongo cursor into a list and then use random.choice:
item = random.choice(list(dataCollection.find({},{"Category":1,"Name":1,"Location":1,"_id":0})))
return jsonify(item)
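Note that random.choice raises an IndexError on an empty list, which would happen if the collection has no matching documents. The sketch below shows one way to guard against that; pick_random is a hypothetical helper, and a plain iterator over hard-coded dicts stands in for the PyMongo cursor so the example runs without a database.

```python
import random


def pick_random(cursor):
    """Return one random document, or None if there are no documents."""
    documents = list(cursor)  # materialize the cursor so it can be indexed
    return random.choice(documents) if documents else None


# Stand-in for the documents returned by dataCollection.find(...).
sample = [
    {"Category": "Tennis", "Name": "ABC Courts", "Location": "123 Fake St"},
    {"Category": "Soccer", "Name": "XYZ Arena", "Location": "319 Ace Blvd"},
    {"Category": "Basketball", "Name": "Dome Courts", "Location": "8934 My Way"},
]

item = pick_random(iter(sample))
print(item)
```

Loading the whole result into a list is fine for small collections; for large ones, letting the database pick the document (for example with the $sample aggregation stage) is probably the better fit.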
