I am bulk-updating data in a MongoDB database from Python using pymongo with the following code:
# Update data
if data_to_be_updated:
    update = collection.bulk_write([
        UpdateOne(
            filter={'source_id': item['source_id']},
            update={'$set': item}
        ) for item in data_to_be_updated
    ])
How do I get the MongoDB _id's of the updated documents?
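Your update variable is a BulkWriteResult, and its bulk_api_result attribute is a dictionary describing what the bulk operation changed.
update.bulk_api_result['upserted'] lists only the documents that were inserted through an upsert (each entry carries the operation index and the new _id). For documents that were matched and updated, the result reports only counts (nMatched, nModified), not their _ids, so you have to query for them with the same filter.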
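Since the bulk result does not carry the _ids of matched documents, one way to recover them is to query again on the same source_id values. This is a minimal sketch, assuming source_id identifies each document; the helper name ids_for_updated is hypothetical, and collection is expected to be a pymongo Collection:

```python
def ids_for_updated(collection, items):
    """Return the _id of every document whose source_id appears in items."""
    source_ids = [item["source_id"] for item in items]
    # Project only _id so the server does not send back full documents.
    cursor = collection.find({"source_id": {"$in": source_ids}}, {"_id": 1})
    return [doc["_id"] for doc in cursor]
```

After the bulk_write call it could be used as ids = ids_for_updated(collection, data_to_be_updated).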
Related
I have a dictionary of updated values, and I want to update the corresponding field values in Solr from Python.
Basically, I am fetching tweets with tweepy using the logic below:
outtweets = [
    {
        "favorite_count": tweet.favorite_count,
        "retweet_count": tweet.retweet_count,
    }
    for tweet in all_tweets
]
But the favorite_count and retweet_count fields are dynamic; their values change every time.
Below is the Apache Solr data/schema that is already present.
{
"favorite_count":0,
"retweet_count":16,
}
Since these fields keep changing, how do I update just these values in Solr?
You can use Solr atomic updates to update only the two fields.
import pysolr
solr = pysolr.Solr("http://localhost:8983/solr/collectionName")
# add() takes a list of documents; {'set': value} replaces only that field
solr.add([{'id': 1, 'favorite_count': {'set': 0}, 'retweet_count': {'set': 16}}])
solr.commit()
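To build the atomic-update document from a dictionary of changed values rather than hard-coding each field, something like the following could work. It is only a sketch: the helper name to_atomic_update is hypothetical, and it assumes the unique key field is called id, as in the answer above.

```python
def to_atomic_update(doc_id, changes, id_field="id"):
    """Build a Solr atomic-update document that sets each changed field."""
    update = {id_field: doc_id}
    for field, value in changes.items():
        # {"set": value} replaces the stored value of this field only.
        update[field] = {"set": value}
    return update
```

The returned dictionary can then be passed in a list, for example solr.add([to_atomic_update(1, dict_1)]).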
I am using pymongo and have below query to get the documents from a collection:
coll_employee = db.get_collection('employeeDetails')
query1 = [{'$match': {'EmployeeId': ObjectId('5edde542f6468910e080e462')}}]
document = coll_employee.aggregate(query1)
tmp1_list = []
for i in document:
    tmp1_list.append(i)
print(tmp1_list)
I am querying on EmployeeId, which is an ObjectId. Running the code above, I get all the documents from the collection. Is there a way to get only the most recently created document? Please help. Thanks.
Hi, you can sort by _id in descending order (-1) so the latest record comes first, and then use limit to retrieve just that record.
In the mongo shell, that is the following query on your collection:
db.employeeDetails.find({'EmployeeId': ObjectId('5edde542f6468910e080e462')}).sort({'_id': -1}).limit(1)
PyMongo's syntax is a little different from the mongo shell's. You don't need aggregation for this; a simple find() combined with sort() and limit() does the job. The query below should help:
coll_employee.find({'EmployeeId': ObjectId('5edde542f6468910e080e462')}).sort('_id', -1).limit(1)
You can sort on any field whose value increases as documents are inserted. As you mentioned in the comments, you have a created field, so you can use that instead of _id.
Here is the solution which worked for me:
query1 = {'EmployeeId': ObjectId('5edde542f6468910e080e462')}
document = coll_employee.find(query1).sort('_id', -1).limit(1)  # -1 = descending, newest first
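If only a single document is needed, find_one with a sort argument returns it directly instead of a cursor. This is a sketch under the assumption that _id holds default ObjectIds, whose ordering follows creation time to the second; the helper name latest_document is hypothetical:

```python
def latest_document(collection, employee_id):
    """Return the newest document for employee_id, or None if there is none."""
    # Sorting on _id descending (-1) puts the most recently created ObjectId first.
    return collection.find_one({"EmployeeId": employee_id}, sort=[("_id", -1)])
```

It would be called as latest_document(coll_employee, ObjectId('5edde542f6468910e080e462')).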
I'm a beginner in MongoDB and PyMongo, and I'm working on a project where I have a students MongoDB collection. I want to add a new field, specifically a student's address, to every document in my collection (the field is added everywhere as null and will be filled in by me later).
However, when I try to add the new field using this example, I get the following syntax error:
client = MongoClient('mongodb://localhost:27017/') #connect to local mongodb
db = client['InfoSys'] #choose infosys database
students = db['Students']
students.update( { $set : {"address":1} } ) #set address field to every column (error happens here)
How can I fix this error?
You are using the update operation incorrectly. In the mongo shell, update has the following syntax:
db.collection.update(
    <query>,
    <update>,
    <options>
)
The required <query> parameter is missing from your call. It has to be present, even if it is just an empty document {}. In the mongo shell, the following query will work:
db.students.update(
    {}, // Match all the documents.
    {$set : {"address": 1}}, // Update the address field.
    {multi: true} // Update every match; otherwise Mongo updates only the first matching document.
)
In Python, operators such as $set must be written as quoted strings (the unquoted $set is what raises the syntax error), and you can use update_many to achieve the same thing:
students.update_many(
    {},
    {"$set": {"address": 1}}
)
You can read more about this operation in the PyMongo documentation for update_many.
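Since the question asks for the field to start out as null, a variation is to set it to None and touch only the documents that do not have it yet. This is a minimal sketch; the helper name add_missing_field is hypothetical, and collection is expected to be a pymongo Collection:

```python
def add_missing_field(collection, field, default=None):
    """Add field with a default value to every document that lacks it."""
    result = collection.update_many(
        {field: {"$exists": False}},  # only documents without the field
        {"$set": {field: default}},   # None is stored as BSON null
    )
    return result.modified_count
```

Calling add_missing_field(students, "address") would return the number of documents that were changed, and running it a second time should change nothing.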
The previous answer here is spot on, but it looks like your question may relate more to PyMongo and how it manages updates to collections. https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html
According to the docs, it looks like you may want to use the 'update_many()' function. You will still need to make your query (all documents, in this case) as the first argument, and the second argument is the operation to perform on all records.
from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017/') #connect to local mongodb
db = client['InfoSys'] #choose infosys database
students = db['Students']
students.update_many({}, {"$set": {"address": 1}}) #empty filter matches every document
I solved my problem by iterating through every document in my collection and adding the address field to each one.
cursor = students.find({})
for student in cursor:
    # Filter on _id rather than passing the whole document as the query.
    students.update_one({'_id': student['_id']}, {'$set': {'address': '1'}})
I am trying to update the document if it exists and insert if it does not exist in a collection. I am inserting pandas data frame records as documents to collection based on _id. The insert of new document is working fine, but the update of fields in the old document is not working.
bulk = pymongo.bulk.BulkOperationBuilder(pros_rides,ordered=False)
for doc in bookings_df:
    bulk.find({ "_id": doc["_id"] }).upsert().update({
        "$setOnInsert": doc
    })
response = bulk.execute()
What do I miss?
An upsert can either update or insert a document; the "$setOnInsert" operation is only executed when the document is inserted, not when it is updated. In order to update the document if it exists, you must provide some operations that will be executed when the document is updated.
Try something like this instead:
# Legacy bulk API: deprecated in PyMongo 3.5 and removed in 4.0.
bulk = pros_rides.initialize_unordered_bulk_op()
for doc in bookings_df:
    bulk.find({'_id': doc['_id']}).upsert().replace_one(doc)
bulk.execute()
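On PyMongo versions without the legacy bulk builder, the same "update if present, insert if absent" behaviour can be sketched with update_one and upsert=True. The helper name upsert_records is hypothetical. It assumes records is a list of dicts that each contain an _id and at least one other field (a DataFrame can be converted with bookings_df.to_dict("records")).

```python
def upsert_records(collection, records):
    """Update each record by _id, inserting it when no match exists."""
    for record in records:
        # Keep _id out of $set; on insert it is taken from the filter.
        fields = {k: v for k, v in record.items() if k != "_id"}
        # $set runs on both update and insert, unlike $setOnInsert.
        collection.update_one({"_id": record["_id"]}, {"$set": fields}, upsert=True)
```

This issues one round trip per record. For large batches, the same filter and update documents could instead be wrapped in UpdateOne(..., upsert=True) operations and sent with bulk_write.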
I am updating a collection in MongoDB, but the update can't find any matches. This is my code:
collection = MongoClient()["blog"]["users"]
client = MongoClient()
db = client.blog
result = db.test.update_many({"_id": '12345'}, {"$set": {"email": "dmitry"}})
print(result.matched_count)
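Your update filters on _id and sets email, so it does not modify _id itself. matched_count is 0 whenever no document matches the filter: the code queries db.test rather than the users collection, and '12345' is a string, which will not match an _id stored as an ObjectId or an integer. If the goal is to change the _id, note that the field is immutable; you will need to create a new entry and delete the old one by storing the document in a variable, saving it with the new _id, and then removing the old document.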
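If changing the _id really is the goal, the copy-and-delete procedure can be sketched as follows. The helper name change_id is hypothetical, collection is expected to be a pymongo Collection, and the two writes are not atomic, so a failure between them would leave both copies in place.

```python
def change_id(collection, old_id, new_id):
    """Re-create the document stored under old_id with new_id as its _id."""
    doc = collection.find_one({"_id": old_id})
    if doc is None:
        return False  # nothing stored under old_id
    doc["_id"] = new_id
    collection.insert_one(doc)              # write the copy first
    collection.delete_one({"_id": old_id})  # then remove the original
    return True
```

Inserting before deleting means a failure part-way leaves a duplicate rather than losing the document.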