MongoDB find in array of objects - python

I want to query MongoDB for all users that have 'artist' equal to 'Iowa' in any object of an array.
In Python I'm doing:
Vkuser._get_collection().find({
    'member_of_group': 20548570,
    'my_music': {
        'items': {
            '$elemMatch': {
                'artist': 'Iowa'
            }
        }
    }
})
But this returns nothing. I also tried this:
{'member_of_group': 20548570, 'my_music': {'$elemMatch': {'$.artist': 'Iowa'}}}
That didn't work either.
Here is part of a document with the array:
"can_see_audio" : 1,
"my_music" : {
"items" : [
{
"name" : "Anastasia Plotnikova",
"photo" : "http://cs620227.vk.me/v620227451/9c47/w_okXehPbYc.jpg",
"id" : "864451",
"name_gen" : "Anastasia"
},
{
"title" : "Ain't Talkin' 'Bout Dub",
"url" : "http://cs4964.vk.me/u14671028/audios/c5b8a0735224.mp3?extra=jgV4ZQrFrsfxZCJf4gsRgnKWvdAfIqjE0M6eMtxGFpj2yp4vjs5DYgAGImPMp4mCUSUGJzoyGeh2Es6L-H51TPa3Q_Q",
"lyrics_id" : 24313846,
"artist" : "Apollo 440",
"genre_id" : 18,
"id" : 344280392,
"owner_id" : 864451,
"duration" : 279
},
{
"title" : "Animals",
"url" : "http://cs1316.vk.me/u4198685/audios/4b9e4536e1be.mp3?extra=TScqXzQ_qaEFKHG8trrwbFyNvjvJKEOLnwOWHJZl_cW5EA6K3a9vimaMpx-Yk5_k41vRPywzuThN_IHT8mbKlPcSigw",
"lyrics_id" : 166037,
"artist" : "Nickelback",
"id" : 344280351,
"owner_id" : 864451,
"duration" : 186
},

The following query should work. You can use dot notation to query into subdocuments and arrays:
Vkuser._get_collection().find({
    'member_of_group': 20548570,
    'my_music.items.artist': 'Iowa'
})
The following query worked for me in the mongo shell:
db.collection1.find({ "my_music.items.artist" : "Iowa" })
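A likely reason the original attempts return nothing is that `'my_music': {'items': {...}}` compares `my_music` against a whole embedded document instead of reaching into the array. The sketch below builds two filter documents that do reach into `my_music.items`. The helper names are hypothetical, and the resulting dictionaries are meant to be passed to `find()` as in the answer above.

```python
def artist_filter_dot(group_id, artist):
    """Dot notation: matches if any item in my_music.items has this artist."""
    return {
        'member_of_group': group_id,
        'my_music.items.artist': artist,
    }


def artist_filter_elem_match(group_id, artist):
    """Same intent, with $elemMatch applied to the array path itself."""
    return {
        'member_of_group': group_id,
        'my_music.items': {'$elemMatch': {'artist': artist}},
    }


query = artist_filter_dot(20548570, 'Iowa')
print(query)

# Hypothetical usage, assuming the same model as in the question:
# Vkuser._get_collection().find(query)
```

For a single field, dot notation is enough. `$elemMatch` matters when several conditions must hold for the same array element.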

Related

MongoEngine not deleting all documents

I have some unit tests that submit data to a server, which saves it into a document through MongoEngine. At the end of the tests, I want to delete all of the documents the tests created:
@router.delete("/all", summary="Delete all jobs in an organization")
async def delete_all_jobs(job_data: AuthorizedResource = Depends(CanActOnResource("delete", "jobs"))):
    MongoJob.objects(organization=job_data.organization).delete()
However, when I call this endpoint, some of the documents are only partially deleted.
This is what a document looks like before being deleted:
{
"_id" : "242d07ac-eafb-4875-a8f4-8ec89c7bc21f",
"_cls" : "MongoJob",
"_created_by" : "tom.mclean",
"_date_created" : ISODate("2022-02-24T08:23:50.943Z"),
"_date_modified" : ISODate("2022-02-24T08:25:02.062Z"),
"_modified_by" : "tom.mclean",
"client_info" : {
"protocol" : "tcp",
"interface" : "0.0.0.0",
"port" : 0
},
"grib_data" : {
"grib_dir_clim" : "X:\\Weather_Files\\Climatology",
"grib_dir_wind" : "X:\\Weather_Files\\NOAA_Forcasts",
"grib_dir_wave" : "X:\\Weather_Files\\NOAA_Forcasts"
},
"organization" : "8b50d3f2-03fe-4aca-9cf6-9922854f2989",
"output_dir" : "C:\\Users\\Tom.Mclean\\src\\routingserver\\WeatherRouting\\WeatherRouting\\..\\output",
"polars" : [
"5d19d7d0-eba2-49e5-8719-760d352d50dc"
],
"result" : {
"costs" : {
"total_cost" : null,
"fuel_cost" : null,
"hire_cost" : null
}
},
"route_form" : {
"waypoints" : [
{
"type" : "Waypoint",
"lon" : -7.25,
"lat" : 49.42,
"normal_deviation" : 0.2
},
{
"type" : "Waypoint",
"lon" : -50.0,
"lat" : 40.0,
"normal_deviation" : 0.0
}
],
"start_time" : ISODate("2022-02-24T08:23:50.842Z"),
"arrival_window" : {
"early" : null,
"late" : null
},
"max_tws" : 40.0,
"max_lat" : 65.0,
"min_lat" : -40.0,
"max_speed" : 16.0,
"min_speed" : 8.0,
"great_circle" : false,
"objective_funcs" : [
{
"hire_cost" : 16000.0,
"fuel_cost" : 550.0
}
],
"decision_time" : 24.0,
"course_change_angle" : 15.0,
"speed_step" : 0.5
},
"ship" : "f2775ef8-c58d-4aa3-a6a0-b82539535e88",
"status" : "FAILED",
"wave_data" : false
}
After calling that API endpoint, some of the documents are deleted, but others are left with just three fields:
{
"_id" : "5f04ffc3-45a3-4652-a79d-68b37e737268",
"_date_modified" : ISODate("2022-02-24T15:13:28.013Z"),
"status" : "FAILED"
}
If I run the unit tests in debug mode, pause on the line that calls the delete endpoint, and resume later, all the documents are deleted safely:
@classmethod
def tearDownClass(cls) -> None:
    # TODO Once jobs can be deleted, clear test jobs from the routing server
    loop = asyncio.new_event_loop()
    loop.run_until_complete(cls.oauth.get_new_access_token())
    organization_path = cls.api._organization_path
    pathname = f"{organization_path}/jobs/all"
    loop.run_until_complete(cls.api.delete(pathname, token=cls.oauth.access_token))  # <- PAUSE HERE
How can I ensure that all of the documents are deleted safely? I could add a pause to the unit test before calling the delete endpoint, but that does not feel right; I would rather fix the underlying issue.

MongoDB Python MongoEngine - Returning Document by filter of Embedded Documents Sum of Filtered property

I am using Python and MongoEngine to query the document below in MongoDB.
I need a query that efficiently returns documents only when their embedded 'keywords' documents match the following criteria:
Filter the keywords to those where the property 'sfr' is less than or equal to 100000
Sum the filtered keywords
Return the parent documents where the sum of the keywords matching the criteria is greater than 9
Example structure:
{
"_id" : ObjectId("5eae60e4055ef0e717f06a50"),
"registered_data" : ISODate("2020-05-03T16:12:51.999+0000"),
"UniqueName" : "SomeUniqueNameHere",
"keywords" : [
{
"keyword" : "carport",
"search_volume" : NumberInt(10532),
"sfr" : NumberInt(20127),
"percent_contribution" : 6.47,
"competing_product_count" : NumberInt(997),
"avg_review_count" : NumberInt(143),
"avg_review_score" : 4.05,
"avg_price" : 331.77,
"exact_ppc_bid" : 3.44,
"broad_ppc_bid" : 2.98,
"exact_hsa_bid" : 8.33,
"broad_hsa_bid" : 9.29
},
{
"keyword" : "party tent",
"search_volume" : NumberInt(6944),
"sfr" : NumberInt(35970),
"percent_contribution" : 4.27,
"competing_product_count" : NumberInt(2000),
"avg_review_count" : NumberInt(216),
"avg_review_score" : 3.72,
"avg_price" : 210.16,
"exact_ppc_bid" : 1.13,
"broad_ppc_bid" : 0.55,
"exact_hsa_bid" : 9.66,
"broad_hsa_bid" : 8.29
}
]
}
From the research I have been doing, I believe an aggregate query might do what I am attempting.
Unfortunately, being new to MongoDB and MongoEngine, I am struggling to structure the query and have not found an example similar to what I am attempting (a red flag, right?).
I did find an example of an aggregate query but am unsure how to express my criteria in it. Maybe something like this is getting close, but it does not work:
pipeline = [
{
"$lte": {
"$sum" : {
"keywords" : {
"$lte": {
"keyword": 100000
}
}
}: 9
}
}
]
data = product.objects().aggregate(pipeline)
Any guidance would be greatly appreciated.
Thanks,
Ben
You can try something like this:
db.collection.aggregate([
  {
    $project: { // the first project filters the keywords array
      registered_data: 1,
      UniqueName: 1,
      keywords: {
        $filter: {
          input: "$keywords",
          as: "item",
          cond: {
            $lte: [
              "$$item.sfr",
              100000
            ]
          }
        }
      }
    }
  },
  {
    $project: { // the second project gets the length of the filtered keywords array
      registered_data: 1,
      UniqueName: 1,
      keywords: 1,
      keywordsLength: {
        $size: "$keywords"
      }
    }
  },
  {
    $match: { // then match on that length (strictly greater than 9, as the question asks)
      keywordsLength: {
        $gt: 9
      }
    }
  }
])
You can test it on Mongo Playground.
Hope it helps.
Note: for simplicity, I used only the sfr property from the keywords array.
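Since the question uses Python, the same three stages can be written as a plain list of dictionaries. This is a minimal sketch: the function name and its parameters are hypothetical, and the final `$match` uses `$gt` because the question asks for "greater than 9".

```python
def build_keyword_pipeline(max_sfr=100000, min_matches=9):
    """Return aggregation stages: filter keywords by sfr, count them,
    and keep documents with more than min_matches matching keywords."""
    return [
        # Stage 1: keep only keywords whose sfr is <= max_sfr.
        {'$project': {
            'registered_data': 1,
            'UniqueName': 1,
            'keywords': {'$filter': {
                'input': '$keywords',
                'as': 'item',
                'cond': {'$lte': ['$$item.sfr', max_sfr]},
            }},
        }},
        # Stage 2: record how many keywords survived the filter.
        {'$project': {
            'registered_data': 1,
            'UniqueName': 1,
            'keywords': 1,
            'keywordsLength': {'$size': '$keywords'},
        }},
        # Stage 3: keep documents with strictly more than min_matches keywords.
        {'$match': {'keywordsLength': {'$gt': min_matches}}},
    ]


pipeline = build_keyword_pipeline()
print(pipeline)

# Hypothetical usage with the model from the question. Depending on the
# MongoEngine version, aggregate() may expect the list or unpacked stages.
# data = product.objects().aggregate(pipeline)
```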

Extract values from oddly-nested Python

I must be really slow because I spent a whole day googling and trying to write Python code to simply list the "code" values only so my output will be Service1, Service2, Service2. I have extracted json values before from complex json or dict structure. But now I must have hit a mental block.
This is my json structure.
import json

myjson='''
{
"formatVersion" : "ABC",
"publicationDate" : "2017-10-06",
"offers" : {
"Service1" : {
"code" : "Service1",
"version" : "1a1a1a1a",
"index" : "1c1c1c1c1c1c1"
},
"Service2" : {
"code" : "Service2",
"version" : "2a2a2a2a2",
"index" : "2c2c2c2c2c2"
},
"Service3" : {
"code" : "Service4",
"version" : "3a3a3a3a3a",
"index" : "3c3c3c3c3c3"
}
}
}
'''
# parse the JSON string above into a Python dict
somejson = json.loads(myjson)
print(somejson["offers"])  # I tried so many variations to no avail.
If you want the "code" values:
>>> [s['code'] for s in somejson['offers'].values()]
['Service1', 'Service2', 'Service4']
somejson["offers"] is a dictionary. It seems you want to print its keys.
In Python 2:
print(somejson["offers"].keys())
In Python 3:
print(list(somejson["offers"].keys()))
In Python 3, keys() returns a view object rather than a list, so wrap it in list() if you need an actual list.
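To make the difference between the keys and the "code" values concrete, here is a small sketch using a trimmed copy of the question's structure (only the "code" fields are kept; the variable names are illustrative).

```python
import json

# Trimmed copy of the structure from the question.
myjson = '''
{
  "offers": {
    "Service1": {"code": "Service1"},
    "Service2": {"code": "Service2"},
    "Service3": {"code": "Service4"}
  }
}
'''

offers = json.loads(myjson)["offers"]

# Iterating a dict yields its keys, so list(offers) equals list(offers.keys()).
service_names = list(offers)

# The "code" values live inside each nested dict, so iterate over values().
codes = [entry["code"] for entry in offers.values()]

print(service_names)
print(codes)
```

Note that the third key is "Service3" while its "code" is "Service4", so the two lists differ.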
This should do the trick if you do not know the number of services in the JSON in advance:
import json
myjson='''
{
"formatVersion" : "ABC",
"publicationDate" : "2017-10-06",
"offers" : {
"Service1" : {
"code" : "Service1",
"version" : "1a1a1a1a",
"index" : "1c1c1c1c1c1c1"
},
"Service2" : {
"code" : "Service2",
"version" : "2a2a2a2a2",
"index" : "2c2c2c2c2c2"
},
"Service3" : {
"code" : "Service4",
"version" : "3a3a3a3a3a",
"index" : "3c3c3c3c3c3"
}
}
}
'''
# parse the JSON string above into a Python dict
somejson = json.loads(myjson)
# Without knowing the services in advance:
offers = somejson["offers"]
keys = offers.keys()
for service in keys:
    print(somejson["offers"][service]["code"])

How do I delete values from this document in MongoDB using Python

I have a document structured like this:
{
"_id" : ObjectId("564c0cb748f9fa2c8cdeb20f"),
"username" : "blah",
"useremail" : "blah@blahblah.com",
"groupTypeCustomer" : true,
"addedpartners" : [
"562f1a629410d3271ba74f74",
"562f1a6f9410d3271ba74f83"
],
"groupName" : "Mojito",
"groupTypeSupplier" : false,
"groupDescription" : "A group for fashion designers"
}
Now I want to delete one of the values from the 'addedpartners' array and update the document.
I want to delete just 562f1a6f9410d3271ba74f83 from the addedpartners array.
This is what I tried earlier:
db.myCollection.update({'_id':'564c0cb748f9fa2c8cdeb20f'},{'$pull':{'addedpartners':'562f1a6f9410d3271ba74f83'}})
db.myCollection.update(
{ _id: ObjectId(id) },
{ $pull: { 'addedpartners': '562f1a629410d3271ba74f74' } }
);
Try this:
db.myCollection.update({}, {$unset : {"addedpartners.1" : 1 }})
db.myCollection.update({}, {$pull : {"addedpartners" : null}})
There is no way to delete an array element by its index directly, so this first unsets the element and then pulls the resulting null. I think this is going to work, but I haven't tried it yet.
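In Python, the `$pull` approach can be sketched as below. One possible reason the first attempt in the question changes nothing is that `_id` is passed as a plain string while the stored `_id` is an ObjectId. The helper name is hypothetical, and a string stands in for the id only so the sketch runs without PyMongo installed.

```python
def build_pull_update(doc_id, partner_id):
    """Return (filter, update) documents that remove partner_id from addedpartners."""
    query = {'_id': doc_id}
    update = {'$pull': {'addedpartners': partner_id}}
    return query, update


# Assumption: in real use doc_id is a bson.ObjectId (bson ships with PyMongo):
#   from bson import ObjectId
#   doc_id = ObjectId('564c0cb748f9fa2c8cdeb20f')
# A plain string is used here only to keep the sketch dependency-free.
query, update = build_pull_update('564c0cb748f9fa2c8cdeb20f',
                                  '562f1a6f9410d3271ba74f83')
print(query, update)

# Hypothetical usage with PyMongo 3 or later:
# db.myCollection.update_one(query, update)
```

The values in `addedpartners` are stored as strings in the document shown, so the value to pull stays a string.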

logical error in python dictionary traversal

One of my MongoDB queries through PyMongo returns:
{ "_id" : { "origin" : "ABE", "destination" : "DTW", "carrier" : "EV" }, "Ddelay" : -5.333333333333333,
"Adelay" : -12.666666666666666 }
{ "_id" : { "origin" : "ABE", "destination" : "ORD", "carrier" : "EV" }, "Ddelay" : -4, "Adelay" : 14 }
{ "_id" : { "origin" : "ABE", "destination" : "ATL", "carrier" : "EV" }, "Ddelay" : 6, "Adelay" : 14 }
I am traversing the result in my Python module as shown below, but I get only two of the three results. I believe I should not use len(results) as I currently do. Can you please help me traverse the result correctly? I need to display all three results in the resulting JSON document in the web UI.
Thank you.
Code:
pipe = [{'$match': {'origin': {"$in": [origin_ID]}}},
        {"$group": {'_id': {'origin': "$origin", 'destination': "$dest", 'carrier': "$carrier"},
                    "Ddelay": {'$avg': "$dep_delay"}, "Adelay": {'$avg': "$arr_delay"}}},
        {"$limit": 4}]
results = connect.aggregate(pipeline=pipe)
#pdb.set_trace()
DATETIME_FORMAT = '%Y-%m-%d'
for x in range(len(results)):
    origin = (results['result'][x])['_id']['origin']
    destination = (results['result'][x])['_id']['destination']
    carrier = (results['result'][x])['_id']['carrier']
    Adelay = (results['result'][x])['Adelay']
    Ddelay = (results['result'][x])['Ddelay']
    obj = {'Origin': origin,
           'Destination': destination,
           'Carrier': carrier,
           'Avg Arrival Delay': Adelay,
           'Avg Dep Delay': Ddelay}
    json_result.append(obj)
return json.dumps(json_result, indent=2, sort_keys=False, separators=(',', ':'))
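Older PyMongo versions return the aggregation result as a dict in this format:
{u'ok': 1.0, u'result': [...]}
So you should iterate over results['result']:
for x in results['result']:
    ...
In your code, len(results) is the length of the dict (2, for the keys 'ok' and 'result'), not the length of the result list, which is why only two rows are processed.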
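The fix can be sketched with a hard-coded sample shaped like the return value described above, using the three rows from the question. The `to_rows` helper is a hypothetical name, and no database connection is involved.

```python
import json

# Sample shaped like the dict described above, with the rows from the question.
results = {
    'ok': 1.0,
    'result': [
        {'_id': {'origin': 'ABE', 'destination': 'DTW', 'carrier': 'EV'},
         'Ddelay': -5.333333333333333, 'Adelay': -12.666666666666666},
        {'_id': {'origin': 'ABE', 'destination': 'ORD', 'carrier': 'EV'},
         'Ddelay': -4, 'Adelay': 14},
        {'_id': {'origin': 'ABE', 'destination': 'ATL', 'carrier': 'EV'},
         'Ddelay': 6, 'Adelay': 14},
    ],
}


def to_rows(results):
    """Flatten each aggregated document into the row format used in the question."""
    rows = []
    for doc in results['result']:  # iterate over the list, not the outer dict
        key = doc['_id']
        rows.append({
            'Origin': key['origin'],
            'Destination': key['destination'],
            'Carrier': key['carrier'],
            'Avg Arrival Delay': doc['Adelay'],
            'Avg Dep Delay': doc['Ddelay'],
        })
    return rows


json_result = to_rows(results)
# len(results) is 2 (the keys 'ok' and 'result'); len(json_result) is 3.
print(len(results), len(json_result))
print(json.dumps(json_result, indent=2))
```

Iterating directly over the list also removes the need for index arithmetic, so the loop cannot stop short of the last row.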
