Updating entire collection in MongoDB - python

I have a MongoDB collection with various documents in it. Every N seconds my Python script retrieves some data from an API, and I want to update each document in the collection with its new version, so the entire collection has to be updated.
result = db.main_tst.insert_one(dic)
This is how I insert the data. Now, instead of inserting dic, I need to update it. How can I do that with Python in MongoDB? I know there is the update_many() method, but I've only found how to update a specific document rather than the entire collection.

It should be simple. For example, the following updates every matching document whose name field is 'N/A', setting it to "No name":
filterQuery = { 'name': 'N/A'}
updateQuery = { "$set": { "name": "No name" } }
result = mycol.update_many(filterQuery, updateQuery)
For your requirement, where every document in the collection must be updated, pass an empty filter {} so that update_many() matches all documents:
filterQuery = {}
updateQuery = { "$set": { "name": "No name" } }
result = mycol.update_many(filterQuery, updateQuery)
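If each document needs its own new content rather than one shared value, a per-document replace may be closer to what the question asks. The following is a minimal sketch, assuming every document fetched from the API carries a unique key. The field name `id` and the helper `refresh_collection` are hypothetical, and `collection` is expected to be a pymongo collection such as `db.main_tst`.

```python
def refresh_collection(collection, fresh_docs, key="id"):
    """Replace each stored document with its fresh version, matched on `key`.

    `key` is an assumed unique field shared by the API data and the stored
    documents; it is not part of the original question.
    """
    matched = 0
    for doc in fresh_docs:
        # upsert=True inserts the document when no stored one matches yet.
        result = collection.replace_one({key: doc[key]}, doc, upsert=True)
        matched += result.matched_count
    return matched
```

Calling `refresh_collection(db.main_tst, docs_from_api)` on every polling cycle would then overwrite the stored documents one by one. For large collections, batching the same operations with `bulk_write` is probably faster.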

Related

How to search for a value in two different type of field or index or heading of mongodb using python?

I am new to any kind of programming. This is an issue I encountered when using mongodb. Below is the collection structure of the document I imported from two different csv files.
{
"_id": {
"$oid": "61bc4217ed94f9d5fe6a350c"
},
"Telephone Number": "8429950810",
"Date of Birth": "01/01/1945"
}
{
"_id": {
"$oid": "61bc4217ed94f9d5fe6a350c"
},
"Telephone Number": "8129437810",
"Date of Birth": "01/01/1998"
}
{
"_id": {
"$oid": "61bd98d36cc90a9109ab253c"
},
"TELEPHONE_NUMBER": "9767022829",
"DATE_OF_BIRTH": "16-Jun-98"
}
{
"_id": {
"$oid": "61bd98d36cc9090109ab253c"
},
"TELEPHONE_NUMBER": "9567085829",
"DATE_OF_BIRTH": "16-Jan-91"
}
The first two entries come from one CSV file and the next two from another. I am now building a user interface where users can search for a telephone number. How do I write a find() query that searches for the number in both fields (Telephone Number and TELEPHONE_NUMBER)? If that is not possible, is there a way to rename the fields to a common format while importing the CSV into the database? Alternatively, could I import each CSV into its own collection and search both collections together, or create a compound index and search that instead? I am using pymongo for all operations.
Thank you.
You can use an $or query when different keys store the same kind of data. Note that the numbers are stored as strings, so the value must be a string in both branches:
yourmongocoll.find({"$or": [{"Telephone Number": "8429950810"}, {"TELEPHONE_NUMBER": "8429950810"}]})
Assuming you already have a connection string for pymongo, the following example queries for the telephone number "8429950810":
from pymongo import MongoClient
client = MongoClient("connection_string")
db = client["db"]
collection = db["collection"]
results = collection.find({"Telephone Number":"8429950810"})
Note that find() returns a cursor. If you would like the documents in a list, wrap the query in list():
results = list(collection.find({"Telephone Number":"8429950810"}))
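A minimal sketch that covers both parts of the question: searching either field name with `$or`, and optionally renaming the second CSV's fields once so that a single key is enough afterwards. The helper names are hypothetical; `$or` and `$rename` are standard MongoDB operators, and `collection` is assumed to be a pymongo collection.

```python
def phone_filter(number):
    """Build a filter matching the number under either field name."""
    number = str(number)  # the imported values are stored as strings
    return {"$or": [{"Telephone Number": number},
                    {"TELEPHONE_NUMBER": number}]}


def normalize_field_names(collection):
    """Rename the upper-case CSV fields to the spaced form, once."""
    return collection.update_many(
        {"TELEPHONE_NUMBER": {"$exists": True}},
        {"$rename": {"TELEPHONE_NUMBER": "Telephone Number",
                     "DATE_OF_BIRTH": "Date of Birth"}},
    )
```

A search would then look like `list(collection.find(phone_filter("8429950810")))`. Renaming does not reconcile the two date formats; that would need a separate conversion step.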

How do I get the most recently inserted document in MongoDB with all its fields?

I'm working on a REST application in Python with Flask and the pymongo driver, but anyone who knows MongoDB well may be able to answer my question.
Suppose I'm inserting a new document into a collection, say students. I want to get the whole inserted document as soon as it is saved in the collection. Here is what I've tried so far:
res = db.students.insert_one({
"name": args["name"],
"surname": args["surname"],
"student_number": args["student_number"],
"course": args["course"],
"mark": args["mark"]
})
If I call:
print(res.inserted_id)  # I get the id
How can I get something like:
{
"name": "student1",
"surname": "surname1",
"mark": 78,
"course": "ML",
"student_number": 2
}
from the res object? If I print res, I get <pymongo.results.InsertOneResult object at 0x00000203F96DCA80>.
Put the data to be inserted into a dictionary variable; on insert, the variable will have the _id added by pymongo.
from pymongo import MongoClient
db = MongoClient()['mydatabase']
doc = {
"name": "name"
}
db.students.insert_one(doc)
print(doc)
prints:
{'name': 'name', '_id': ObjectId('60ce419c205a661d9f80ba23')}
Unfortunately, the commenters are correct. PyMongo does not directly provide what you are asking for. You are expected to use the inserted_id from the result and, if you need the full document from the collection later, run a regular query for it.
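The "insert, then query by inserted_id" pattern described above can be sketched as a small helper. This is only an illustration: the name `insert_and_fetch` is hypothetical, and `collection` is assumed to be a pymongo collection such as `db.students`.

```python
def insert_and_fetch(collection, doc):
    """Insert `doc`, then read the stored version back by its new _id."""
    result = collection.insert_one(doc)
    # A second round trip: fetch exactly what the server stored.
    return collection.find_one({"_id": result.inserted_id})
```

The extra query costs one more round trip. If the locally mutated dictionary (with `_id` added by pymongo) is enough, the read can be skipped.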

Update all MongoDB fields with their own values with PyMongo

I want to update every field of my MongoDB collection using the field's own value to do so.
Example: if I have this document: "string": "foo", a possible update would do this: "string": $string.lower(). Here, $string would be "foo", but I don't know how to do this with PyMongo.
I've tried this:
user_collection.update_many({}, { "$set": { "word": my_func("$word")}})
Which replaces everything with "$word".
I've been able to do it successfully iterating each document but it takes too long.
As far as I know, you can't run a Python function inside a single update statement. You can either express the change in MongoDB's own expression language, using an aggregation-pipeline update (note the list around the $set stage; this requires MongoDB 4.2 or later):
user_collection.update_many({}, [{ "$set": {"name": { "$concat": ["$name", "_2"]}}}])
or fetch and update each document from pymongo:
for obj in user_collection.find({}):  # replace {} with your own filter
    user_collection.update_one({"_id": obj['_id']}, { "$set": {"name": my_func(obj['name']) } })
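For the lower-casing example in the question, a server-side version might look like the sketch below. It assumes MongoDB 4.2 or later, where `update_many` accepts an aggregation pipeline (a list) so that `"$word"` is evaluated as the field's current value instead of being stored as a literal string. The helper names are hypothetical; `$toLower` is a standard aggregation operator.

```python
def lowercase_pipeline(field):
    """Build an update pipeline that lower-cases `field` in place."""
    return [{"$set": {field: {"$toLower": "$" + field}}}]


def lowercase_field(collection, field):
    # Passing a list (not a dict) makes this a pipeline update,
    # which requires MongoDB 4.2 or newer.
    return collection.update_many({}, lowercase_pipeline(field))
```

`lowercase_field(user_collection, "word")` would then update every document in one server-side operation. Transformations that have no MongoDB operator still need the per-document loop.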

MongoDB - Merge Update

I am having a little problem updating a MongoDB document (using pymongo). I have found several answers for similar questions that didn't work out for me.
Background: I am crawling some websites and saving information to a MongoDB.
Assume I got the following document from a web page and stored in a MongoDB collection:
original_doc = {
'id': some_id,
'data': {
'key1': value1,
'key2': value2
}
}
After some time, I may want to crawl the same page again and get the following document:
new_doc = {
'id': some_id,
'data': {
'key2': new_value2,
'new_key3': new_value3
}
}
Now I want to update the already existing MongoDB document in the collection so it looks like this:
updated_doc = {
'id': some_id,
'data': {
'key1': value1,
'key2': new_value2,
'new_key3': new_value3
}
}
So basically the old document should be overwritten with the new document, but without losing any data from the original document that does not exist in the new one.
I first thought I could use the $set to update the document, but then the (key1, value1) entry gets lost. And I do not know the key of the new entry as I am not in control of the data returned by the website, so I can't just use {$set: {data.new_key3: new_doc}} either.
Is there a solution for this?
You should use _id as the selector to update the document. The query will look like this:
db.collection.update(
    {"_id": ObjectId("55c789499dd5f5f78633da59")},  // put the _id to match here
    {$set: {"data.key2": "new_value2", "data.new_key3": "new_value3"}}
)
Because $set with dot notation touches only the named nested fields, this query updates the existing document with the new data and leaves data.key1 intact. The _id stays the same as in the old document.
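Since the keys inside data are not known in advance, the dot-notation update can be generated from the new document instead of being written by hand. A minimal sketch in Python, assuming a single level of nesting under `data` and keys that contain no dots; the helper name `build_merge_update` is hypothetical.

```python
def build_merge_update(new_doc, parent="data"):
    """Turn new_doc[parent] into a dot-notation $set update.

    Only the keys present in new_doc[parent] are written, so keys that
    exist only in the stored document are left untouched.
    """
    fields = {f"{parent}.{key}": value
              for key, value in new_doc[parent].items()}
    return {"$set": fields}
```

With pymongo this could be applied as `collection.update_one({"id": new_doc["id"]}, build_merge_update(new_doc))`, matching on the question's own `id` field.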

Removing _id element from Pymongo results

I'm attempting to create a web service using MongoDB and Flask (using the pymongo driver). A query to the database returns documents with the "_id" field included, of course. I don't want to send this to the client, so how do I remove it?
Here's a Flask route:
@app.route('/theobjects')
def index():
    objects = db.collection.find()
    return str(json.dumps({'results': list(objects)},
                          default=json_util.default,
                          indent=4))
This returns:
{
"results": [
{
"whatever": {
"field1": "value",
"field2": "value",
},
"whatever2": {
"field3": "value"
},
...
"_id": {
"$oid": "..."
},
...
}
]}
I thought it was a dictionary and I could just delete the element before returning it:
del objects['_id']
But that returns a TypeError:
TypeError: 'Cursor' object does not support item deletion
So it isn't a dictionary, but something I have to iterate over with each result as a dictionary. So I try to do that with this code:
for object in objects:
    del object['_id']
Each object dictionary looks the way I'd like it to now, but the objects cursor is empty. So I try to create a new dictionary and after deleting _id from each, add to a new dictionary that Flask will return:
new_object = {}
for object in objects:
    for key, item in objects.items():
        if key == '_id':
            del object['_id']
    new_object.update(object)
This just returns a dictionary with the first-level keys and nothing else.
So this is sort of a standard nested dictionaries problem, but I'm also shocked that MongoDB doesn't have a way to easily deal with this.
The MongoDB documentation explains that you can exclude _id with
{ _id : 0 }
But that does nothing with pymongo. The Pymongo documentation explains that you can list the fields you want returned, but "(“_id” will always be included)". Seriously? Is there no way around this? Is there something simple and stupid that I'm overlooking here?
To exclude the _id field in a find query in pymongo, you can use:
db.collection.find({}, {'_id': False})
The documentation is somewhat misleading on this, as it says the _id field is always included, but you can exclude it as shown above.
If you want only specific fields and still want to exclude _id, list them in the projection, which is the second argument to find() (the first argument is the filter):
db.collection.find({}, {'required_column_A': 1, 'required_col_B': 1, '_id': False})
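A small helper can make the projection harder to get wrong. This is a minimal sketch; the name `projection_without_id` is hypothetical, and the result is meant to be passed as the second argument to pymongo's `find()`.

```python
def projection_without_id(*fields):
    """Include the named fields (all fields if none are given) and drop _id."""
    projection = {name: 1 for name in fields}
    projection["_id"] = 0  # _id is the one field that may be excluded alongside inclusions
    return projection
```

For example, `db.collection.find({}, projection_without_id("required_column_A", "required_col_B"))` returns only those two fields, and `db.collection.find({}, projection_without_id())` returns everything except `_id`.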
You are calling
del objects['_id']
on the cursor object!
The cursor object is an iterable over the result set, not a single document that you can manipulate.
for obj in objects:
    del obj['_id']
is likely what you want.
So your claim is completely wrong, as the following code shows:
import pymongo
client = pymongo.MongoClient()
db = client['mydb']
db.foo.delete_many({})  # start from an empty collection
db.foo.insert_one({'foo': 42})
for row in db.foo.find():
    del row['_id']
    print(row)
$ bin/python foo.py
> {'foo': 42}

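The empty cursor reported in the question follows from the fact that iterating a cursor consumes it. A minimal sketch, assuming the results fit in memory: build a list of cleaned copies in a single pass and return that list instead of the cursor. The helper name `without_ids` is hypothetical.

```python
def without_ids(documents):
    """Return a list of copies of the documents with _id removed.

    `documents` can be any iterable of dicts, including a pymongo cursor;
    it is read exactly once.
    """
    return [{k: v for k, v in doc.items() if k != "_id"}
            for doc in documents]
```

In the Flask route this would be used as `json.dumps({'results': without_ids(db.collection.find())})`, although excluding `_id` in the projection avoids transferring the field in the first place.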