Bulk upsert not working pymongo - python

I am trying to update a document if it exists and insert it if it does not exist in a collection. I am inserting pandas DataFrame records as documents into the collection, keyed on _id. Inserting new documents works fine, but updating the fields of existing documents does not.
bulk = pymongo.bulk.BulkOperationBuilder(pros_rides, ordered=False)
for doc in bookings_df:
    bulk.find({"_id": doc["_id"]}).upsert().update({
        "$setOnInsert": doc
    })
response = bulk.execute()
What am I missing?

An upsert can either update or insert a document; the "$setOnInsert" operation is only executed when the document is inserted, not when it is updated. In order to update the document if it exists, you must provide some operations that will be executed when the document is updated.
Try something like this instead:
bulk = pros_rides.initialize_unordered_bulk_op()
for doc in bookings_df:
    bulk.find({'_id': doc['_id']}).upsert().replace_one(doc)
bulk.execute()
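Note that this bulk builder API (BulkOperationBuilder / initialize_unordered_bulk_op) has since been removed from PyMongo. With PyMongo 3+ the same upsert-or-replace behaviour can be expressed with bulk_write and ReplaceOne; here is a minimal sketch, where the connection string, database name, and the conversion of the DataFrame to records are assumptions on top of the question:

from pymongo import MongoClient, ReplaceOne

client = MongoClient("mongodb://localhost:27017/")  # assumed connection string
pros_rides = client["mydb"]["pros_rides"]           # assumed database/collection names

# One ReplaceOne per record: replaces the document when _id already exists,
# inserts it otherwise (upsert=True).
requests = [
    ReplaceOne({"_id": doc["_id"]}, doc, upsert=True)
    for doc in bookings_df.to_dict("records")
]
result = pros_rides.bulk_write(requests, ordered=False)
print(result.upserted_count, result.modified_count)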

Related

Insert or upsert or delete row in Big Query based on condition with efficient way

Here is my input json format
{
"ID": "ABC34567",
"NUM": "45678",
"ORG_ID": "161",
"CONTACT_NUMBER": null,
"FLAG": "N"
}
I need to check whether the ID is already present. If it is, I need to update the data; otherwise insert a new row. If FLAG is "Del", I need to delete the row if the ID is already present in the BigQuery table, otherwise leave it as is.
I am struggling with how to achieve this in Python.
To insert JSON I know the code below, using the BigQuery API in Python:
rows_to_insert = [{"ID":"sdf1234","NUM":"4567890","ORG_ID":"567","CONTACT_NUMBER":"456789", "FLAG":"N"}]
errors = client.insert_rows_json(table_id,rows_to_insert)
I found the code below on the internet:
from google.cloud import bigquery
client = bigquery.Client()
query_job = client.query(
    """
    MERGE `my-dataset.json_table` T
    USING `my-dataset.json_table_source` S
    ON T.ID = S.ID
    WHEN MATCHED THEN
      UPDATE SET string_field_1 = S.string_field_1
    WHEN NOT MATCHED THEN
      INSERT (int64_field_0, string_field_1) VALUES (S.int64_field_0, S.string_field_1)
    """
)
results = query_job.result() # Waits for job to complete.
I am totally stuck on how to pass the JSON into the query above, or whether there is a better way to update the existing rows.
Please guide me on how to achieve this with the BigQuery API in Python.
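One way to cover all three cases (insert, update, delete) in a single statement is to load the incoming JSON rows into a staging table and then run a MERGE with an extra WHEN MATCHED ... THEN DELETE branch. The sketch below is only an outline: the staging table name and the idea of streaming the rows into it first are assumptions, and rows written through insert_rows_json can sit in the streaming buffer for a while, so a load job may be a safer way to populate the staging table.

from google.cloud import bigquery

client = bigquery.Client()
staging_table = "my-dataset.json_table_source"  # assumed staging table
target_table = "my-dataset.json_table"          # assumed target table

rows = [{"ID": "ABC34567", "NUM": "45678", "ORG_ID": "161",
         "CONTACT_NUMBER": None, "FLAG": "N"}]

# 1. Put the incoming JSON rows into the staging table.
client.insert_rows_json(staging_table, rows)

# 2. Merge staging into the target:
#    delete on FLAG = 'Del', update on a match, insert otherwise.
query_job = client.query(f"""
    MERGE `{target_table}` T
    USING `{staging_table}` S
    ON T.ID = S.ID
    WHEN MATCHED AND S.FLAG = 'Del' THEN
      DELETE
    WHEN MATCHED THEN
      UPDATE SET NUM = S.NUM, ORG_ID = S.ORG_ID,
                 CONTACT_NUMBER = S.CONTACT_NUMBER, FLAG = S.FLAG
    WHEN NOT MATCHED AND S.FLAG != 'Del' THEN
      INSERT (ID, NUM, ORG_ID, CONTACT_NUMBER, FLAG)
      VALUES (S.ID, S.NUM, S.ORG_ID, S.CONTACT_NUMBER, S.FLAG)
""")
query_job.result()  # waits for the merge to finish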

MongoDB: Fetching every document from a collection

I'm creating a Discord bot in Python, using MongoDB Atlas (NoSQL).
I have a user document which looks like this:
{
    "user": 12345,
    "created_at": 2012-12-31 01:48:24
}
I want to fetch every document in a collection and then take its created_at.
How can I do this? I tried db.inv.find({}), but it didn't work. I checked MongoDB's documentation, but it only covers the JavaScript shell. How can I fetch every document in my collection?
Make sure your db is actually a database object inside MongoDB. If your db comes from the client (as below), your code is right.
MONGODB_URI = "mongodb://user:password@host:port/"
client = pymongo.MongoClient(MONGODB_URI)
# database
db = client.db
# db.collection.find({}) returns a cursor over the matching documents
result = db.inv.find({})
db.inv.find() gives you a cursor over all the documents; you then need to iterate over it to get the field you want.
Make sure you have connected to the right collection.
result = db.inv.find()
for entry in result:
    print(entry["created_at"])
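If only created_at is needed, a projection keeps the cursor payload small; a quick sketch using the collection name from the question:

# Fetch only the created_at field of every document in the collection.
for entry in db.inv.find({}, {"created_at": 1, "_id": 0}):
    print(entry["created_at"])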

Can't delete mongodb document using pymongo

I'm trying to delete one specific document by its _id with pymongo and I can't do it. Any ideas?
Thanks.
I have this code:
s = "ISODate('{0}')".format(nom_fitxer_clean)
# this builds the string ISODate('2018-11-07 00:00:00')
myquery = { "_id": s }
# query dict: {'_id': "ISODate('2018-10-07 00:00:00')"}
mycol.delete_one(myquery)
I do not get any errors, but the document is not deleted.
UPDATE:
[screenshot of the document]
I think one possible solution could be to replace ISODate with ObjectId in your query string.
Also, delete_one deletes only the first document that matches your query; is it possible that multiple documents match it?
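Another thing worth checking: if the _id is actually stored as a BSON date (which is what ISODate means in the mongo shell), the filter has to be a Python datetime object rather than the string "ISODate('...')". A small sketch, assuming nom_fitxer_clean really has the format shown above:

from datetime import datetime

# Parse the cleaned filename into a datetime; adjust the format string if needed.
dt = datetime.strptime(nom_fitxer_clean, "%Y-%m-%d %H:%M:%S")
result = mycol.delete_one({"_id": dt})
print(result.deleted_count)  # 1 if a document was deleted, 0 otherwise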

Bulk insert MongoDB - Pymongo limit issue

I was trying to bulk insert documents into MongoDB with PyMongo only where there is no match on an already existing id (sourceID). However, most of my documents include quite long texts and therefore cannot be inserted; I am not sure whether this is due to a size limit or a character limit (it works for documents with little text). Instead I tried inserting with insert_many(), but to my knowledge this will not do the required task of only inserting documents when there is no match on the id (sourceID), leaving existing documents with a match unchanged.
Are there any alternative solutions to reach my desired goal?
This is the code that I have written for the bulk insert:
bulk = pymongo.bulk.BulkOperationBuilder(events, ordered=False)
for doc in test:
    bulk.find({ "sourceID": doc["sourceID"] }).upsert().update({
        "$setOnInsert": doc
    })
response = bulk.execute()
'events' is the database that I am inserting to and 'test' is the array that I am trying to insert.
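For reference, the same insert-only-if-no-match behaviour can be written with the newer bulk_write API and UpdateOne(..., upsert=True); PyMongo splits the batch into several messages on its own, although no single document may exceed the 16 MB BSON document limit. A sketch using the names from the question (and assuming events is the target collection rather than the database):

from pymongo import UpdateOne

# Insert each document only when no existing document has the same sourceID;
# documents that already match are left untouched, because $setOnInsert
# only applies on insert.
requests = [
    UpdateOne({"sourceID": doc["sourceID"]}, {"$setOnInsert": doc}, upsert=True)
    for doc in test
]
response = events.bulk_write(requests, ordered=False)
print(response.upserted_count)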

Get _id's of the updated documents in MongoDB - Python

I am bulk-updating data to a MongoDB database in python using pymongo with the following code:
# Update data
if data_to_be_updated:
    update = collection.bulk_write([
        UpdateOne(
            filter={'source_id': item['source_id']},
            update={'$set': item},
        ) for item in data_to_be_updated
    ])
How do I get the MongoDB _id's of the updated documents?
Your update variable has an attribute bulk_api_result containing information about what was changed in the bulk operation.
update.bulk_api_result['upserted'] will contain a list with the _id of each document that was upserted (inserted because no match was found); plain updates do not report their _ids there.
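To actually get something in that list the UpdateOne calls need upsert=True (the code in the question never upserts), and for documents that already existed MongoDB does not return the modified _ids, so they have to be looked up separately. A sketch along those lines, keeping the question's names:

from pymongo import UpdateOne

update = collection.bulk_write([
    UpdateOne(
        filter={'source_id': item['source_id']},
        update={'$set': item},
        upsert=True,  # only add this if inserting missing documents is wanted
    ) for item in data_to_be_updated
])

# _ids of documents inserted by the upsert:
upserted_ids = [entry['_id'] for entry in update.bulk_api_result.get('upserted', [])]

# _ids of documents that already existed and were (potentially) modified:
matched_ids = [
    doc['_id'] for doc in collection.find(
        {'source_id': {'$in': [item['source_id'] for item in data_to_be_updated]}},
        {'_id': 1},
    )
]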
