Mongodb Pymongo using $set to create an array/list/collection - python

I'm trying to use $set to create an array/list/collection (not sure which is proper terminology), and I'm not sure how to do it. For example:
I have a document inserted into my database that looks like this:
"_id": (unique, auto-generated id)
"Grade": Sophomore
I want to insert a collection/list/array using update. So, basically I want this:
"_id": (unique, auto-generated id)
"Grade": Sophomore
"Information"{
"Class_Info": [
{"Class_Name": "Math"}
]
What I've been doing so far is using .update and dot notation. So, what I was trying to do was use $set like this:
collection.update({'_id': unique ID}, {'$set': {'Information.Class_Info.Class_Name': 'Math}})
However, what that is doing is making Class_Info a document and not a list/collection/array, so it's doing:
"_id": (unique id)
"Grade": Sophomore
"Information"{
"Class_Info": {
"Class_Name": "Math"
}
How do I specify that I want Class_Info to be a list? IF for some reason I absolutely cannot use $set to do this, it is very important that I can use dot notation because of the way the rest of my program works, so if I'm supposed to use something other than $set, can it have dot notation to specify where to insert the list? (I know $push is another option, but it doesn't use dot notation, so I can't really use it in my case).
Thanks!

If you want to do it with only one instruction but starting up from NOT having any key created yet, this is the only way to do it ($set will never create an array that's not explicit, like {$set: {"somekey": [] }}
db.test.update(
{ _id: "(unique id)" },
{ $push: {
"Information.Class_Info": { "Class_Name": "Math" }
}}
)
This query does the trick, push to a non-existing key Information.Class_Info, the object you need to create as an array. This is the only possible solution with only one instruction, using dot notation and that works.

There is a way to do it with one instructions, $set and dot notation, as follows:
db.test.updateOne(
{ _id: "my-unique-id" },
{ $set: {
"Information.Class_Info": [ { "Class_Name": "Math" } ]
}}
)
There is also a way to do it with two instructions and the array index in the dot notation, allowing you to use similar statements to add more array elements:
db.test.updateOne(
{ _id: "my-unique-id" },
{ $set: { "Information.Class_Info": [] }}
)
db.test.updateOne(
{ _id: "my-unique-id" },
{ $set: {
"Information.Class_Info.0": { "Class_Name": "Math" },
"Information.Class_Info.1": { "Class_AltName": "Mathematics" }
}}
)
Deviating from these options has interesting failure modes:
If you try to combine the second option into a single updateOne() call, which is usually possible, MongoDB will complain that "Updating the path 'Information.Class_Info.0' would create a conflict at 'Information.Class_Info'"
If you try to use dot the notation with the array index ("Information.Class_Info.0.Class_Name": "Math") but without creating an empty array first, then MongoDB will create an object with numeric keys ("0", "1", …). It really refuses to create array except when told explicitly using […] (as also told in the answer by #Maximiliano).

Related

How to swap two elsticsearch indexes

I want to implement cash for highly loaded elasticsearch-based search system. I want to store cash in special elastic index. The problem is in cache warm-up: once an hour my system needs to update cached results with the fresh ones.
So, I'm creating a new empty index and fill it with updated results, then I need to swap old index and new index, so users can use fresh cached results.
The question is: how to swap two elasticsearch indexes efficiently?
For this kind of scenario you use something that is called "index alias swapping".
You have an alias that points to your current index, you fill a new index with the fresh records, and then you point this alias to the new index.
Something like this:
Current index name is items-2022-11-26-001
Create alias items pointing to items-2022-11-26-001
POST _aliases
{
"actions": [
{
"add": {
"index": "items-2022-11-26-001",
"alias": "items"
}
}
]
}
Create new index with fresh data items-2022-11-26-002
When it finishes, now point the items alias to items-2022-11-26-002
POST _aliases
{
"actions": [
{
"remove": {
"index": "items-2022-11-26-001",
"alias": "items"
}
},
{
"add": {
"index": "items-2022-11-26-002",
"alias": "items"
}
}
]
}
Delete items-2022-11-26-001
You run all your queries against "items" alias that will act as an index.
References:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html

How to get "key" if value (i.e. data in string) in mongodb is konwn using python? [duplicate]

Is it possible to wildcard the key in a query? For instance, given the following record, I'd like to do a .find({'a.*': 4})
This was discussed here https://jira.mongodb.org/browse/SERVER-267 but it looks like it's not been resolved.
{
'a': {
'b': [1, 2],
'c': [3, 4]
}
}
As asked, this is not possible. The server issue you linked to is still under "issues we're not sure of".
MongoDB has some intelligence surrounding the use of arrays, and I think that's part of the complexity surrounding such a feature.
Take the following query db.foo.find({ 'a.b' : 4 } ). This query will match the following documents.
{ a: { b: 4 } }
{ a: [ { b: 4 } ] }
So what does "wildcard" do here? db.foo.find( { a.* : 4 } ) Does it match the first document? What about the second?
Moreover, what does this mean semantically? As you've described, the query is effectively "find documents where any field in that document has a value of 4". That's a little unusual.
Is there a specific semantic that you're trying to achieve? Maybe a change in the document structure will get you the query you want.
I've came across this question because I faced the same issue. The accepted answer provider here does explains why this is not supported but not really solves the issue itself.
I've ended up with a solution that makes the wildcard usage purposed here redundant and share here just in case someone will find this post some day
Why I wanted to use wildcards in my MongoDB queries?
In my case, I needed this "feature" in order to be able to find a match inside a dictionary (just as the question's code demonstrates).
What's the alternatives?
Use a reversed map (very similar to how DNS works) and simply use it. So, in our case we can use something similar to this:
{
"a": {
"map": {
"b": [1, 2, 3],
"c": [3, 4]
},
"reverse-map": {
"1": [ "b" ],
"2": [ "b" ],
"3": [ "b", "c" ],
"4": [ "c" ]
}
}
}
I know, it takes more memory and insert / update operations should validate this set is always symmetric and yet - it solves the problem. Now, instead of making an imaginary query like
db.foo.find( { a.map.* : 4 } )
I can make an actual query
db.foo.find( { a.reverse-map.4 : {$exists: true} } )
Which will return all items that have a specific value (in our example 4)
I know - this approach takes more memory and you need to manage indexes properly if you want to gain good performance (read the docs) and still - it's good for my use-case. Hope this helps someone else someday as well
Starting from MongoDB v3.4+, you can use $objectToArray to convert a into an array of k-v tuples for querying.
db.collection.aggregate([
{
"$addFields": {
"a": {
"$objectToArray": "$a"
}
}
},
{
$match: {
"a.v": 4
}
},
{
"$addFields": {
// cosmetics to revert back to original structure
"a": {
"$arrayToObject": "$a"
}
}
}
])
Here is the Mongo playground for your reference.

Select multiple values from a document

Background
I have data stored in the following format
{
"player_id": "VU3R5HNTAGMK",
"markers": {
"BICF2P964092": "GC",
"BICF2G630653981": "CG",
"BICF2P483996": "CT",
"BICF2S23452916": "CG",
"chr26_19147949": "TC",
}
}
You can imagine i have data stored for multiple players and each has a unique player_id and they all have varying number of markers with different marker values.
In the above case a marker is BICF2P964092 and it's marker value is GC.
I am trying to query my mongo db in various ways. One obvious way is by using player_id. To do that I do the following col.find({"player_id": "VU3R5HNTAGMK"})
Another thing i want to do is maybe I just want to know value of a specific marker for a specific player. So for that I can do the following col.find({"player_id": "VU3R5HNTAGMK"}, {'markers.BICF2P964092'})
ISSUE
I also want to be able to get values for multiple markers for a specific player and i am not able to do so. I have tried the following with no luck.
col.find({"player_id": "VU3R5HNTAGMK"},{'markers': {'$in': ["BICF2P964092", "chr26_19147949"]}})
col.find({"player_id": "VU3R5HNTAGMK"}, {'markers.BICF2P964092'}, {'markers.chr26_19147949'})
I would really appreciate it if someone can help me write a query where i can get multiple marker values for specified marker and player_id
You can simply do the following
col.find({“player_id”: “VU3R5HNTAGMK”}, {“markers.” + m: 1 for m in [“ BICF2P964092", “BICF2G630653981”]})
As you've tagged this pymongo, you might be as best to process the marker values in python after the find; e.g.
docs = col.find({"player_id": "VU3R5HNTAGMK"})
for doc in docs:
for marker, value in doc.get('markers').items():
if marker in ["BICF2P964092", "chr26_19147949"]:
print(marker, value)
#Belly Buster solution is good if you want to handle this using python.
But, there is a way to completely handle this on the MongoDB side using Aggregation.
You can combine $objectToArray, $filter, and $arrayToObject operators in $project stage.
collection.aggregate([
{
"$match": {
"player_id": "VU3R5HNTAGMK" # <-- All your match conditons
}
},
{
"$project": {
"player_id": 1, # All the other keys which you want to project
"markers": {
"$arrayToObject": {
"$filter": {
"input": {
"$objectToArray": "$markers"
},
"as": "elem",
"cond": {
"$in": [
"$$elem.k",
[
# <-- List of key names you want to project
"BICF2G630653981",
"BICF2P483996"
]
]
},
},
},
},
}
},
])
Note: You have to use MongoDB version >= 3.4.4 for this aggregation query to work.

how to fetch data from json schema? error shown-TypeError: string indices must be integers

I have a json response from an API in this way:-
{
"meta": {
"code": 200
},
"data": {
"username": "luxury_mpan",
"bio": "Recruitment Agents👑👑👑👑\nThe most powerful manufacturers,\nwe have the best quality.\n📱Wechat:13255996580💜💜\n📱Whatsapp:+8618820784535",
"website": "",
"profile_picture": "https://scontent.cdninstagram.com/t51.2885-19/10895140_395629273936966_528329141_a.jpg",
"full_name": "Mpan",
"counts": {
"media": 17774,
"followed_by": 7982,
"follows": 7264
},
"id": "1552277710"
}
}
I want to fetch the data in "media", "followed_by" and "follows" and store it in three different lists as shown in the below code:--
for r in range(1,5):
var=r,st.cell(row=r,column=3).value
xy=var[1]
ij=str(xy)
myopener=Myopener()
url=myopener.open('https://api.instagram.com/v1/users/'+ij+'/?access_token=641567093.1fb234f.a0ffbe574e844e1c818145097050cf33')
beta=json.load(url)
for item in beta['data']:
list1.append(item['media'])
list2.append(item['followed_by'])
list3.append(item['follows'])
When I run it, it shows the error TypeError: string indices must be integers
How would my loop change in order to fetch the above mentioned values?
Also, Asking out of curiosity:- Is there any way to fetch the Watzapp no from the "BIO" key in data dictionary?
I have referred questions similar to this and still did not get my answer. Please help!
beta['data'] is a dictionary object. When you iterate over it with for item in beta['data'], the values taken by item will be the keys of the dictionary: "username", "bio", etc.
So then when you ask for, e.g., item['media'] it's like asking for "username"['media'], which of course doesn't make any sense.
It isn't quite clear what it is that you want: is it just the stuff inside counts? If so, then instead of for item in beta['data']: you could just say item = beta['data']['counts'], and then item['media'] etc. will be the values you want.
As to your secondary question: I suggest looking into regular expressions.

Selectively retrieving depending on existence of key in map

Is it possible to selectively retrieve depending on the existence of keys in a map in mongodb? And if so, how do you go about doing it?
Suppose I have a document that looks like this for example..
{ "_id": 1234,
"parentfield1" : {
"childfield1" : { ...},
"childfield2" : { ...},
"childfield5" : { ...}, // There might be many childfields.. > 50
},
}
How would I be able to selectively retrieve from the document a/some particular childfields given multiple options to choose from? Some of which may not exist in the document.
i.e.
input "childfield1", "childfield2", "childfield3"
-> output
{ "_id": 1234,
"parentfield1": {
"childfield1" : { ... },
"childfield2" : { ... },
},
}
Is it even doable? Is it possible to do efficiently also?
Any help would be great (python, go).
Yes, that's the purpose of the projection parameter of find:
db.collection.find({_id: 1234}, {
'parentfield1.childfield1': 1,
'parentfield1.childfield2': 1,
'parentfield1.childfield3': 1
});
If a specified field isn't present in a given doc, the other matching fields will still be included.
Build up the projection parameter object programmatically if you want it to be dynamic.

Categories