(PyMongo) - Parameters of function find_one_and_update()

I am receiving a generic Django error on the line of code listed below. I am having a hard time understanding from the PyMongo docs how the parameters of this function are supposed to be set up, and I think I wrote the call incorrectly. I have a collection of request documents. Each request document has a "request" key whose value is subreddit_name + "F"; this is what I would like to query on to find the document. Each document also has a "pdone" key whose value is the pdone variable; this is the key-value pair inside the document that I would like to change.
Line of code where the error occurs:
self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'pdone': pdone}, return_document=ReturnDocument.AFTER)
Here is an insert for a document of the collection:
collection_requests.insert({'request': subreddit_name + "F", 'pdone': 0})
Edit: I am still receiving the same error on the same line of code after changing it to:
self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'$set': {'pdone': pdone}}, return_document=ReturnDocument.AFTER)

It seems you forgot to specify the update operator.
Try something like:
self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'$set': {'pdone': pdone}}, return_document=ReturnDocument.AFTER)
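
A minimal sketch of the same call wrapped in a function, assuming `collection` is a PyMongo collection and the documents look like the insert shown in the question. The names `build_progress_update` and `mark_progress` are hypothetical, and the PyMongo import is deferred into the function only so that the small helper can be tried without a database.

```python
def build_progress_update(pdone):
    """Build the update document; its top-level key must be an update operator."""
    return {'$set': {'pdone': pdone}}


def mark_progress(collection, subreddit_name, pdone):
    """Set 'pdone' on the matching request document and return the updated document.

    `collection` is assumed to be a pymongo.collection.Collection.
    """
    # Imported here so the helper above can be used without PyMongo installed.
    from pymongo import ReturnDocument

    return collection.find_one_and_update(
        {'request': subreddit_name + 'F'},   # filter: which document to change
        build_progress_update(pdone),        # update: what to change
        return_document=ReturnDocument.AFTER,
    )
```

If no document matches the filter, find_one_and_update returns None. If the error persists with `$set` in place, one thing worth checking is that ReturnDocument is actually imported from pymongo in the module where the call is made.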

Related

Python requests.post does not force Elasticsearch to create missing index

I want to push data to my Elasticsearch server using :
requests.post('http://localhost:9200/_bulk', data=data_1 + data_2)
and it complains that the index does not exist. I try creating the index manually:
curl -X PUT http://localhost:9200/_bulk
and it complains that I am not feeding a body to it:
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
Seems like a bit of a chicken and egg problem here. How can I create that _bulk index, and then post my data?
EDIT:
My data is too large for me to even understand the schema. Here is a small snippet:
'{"create":{"_index":"products-guys","_type":"t","_id":"0"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/LG-TV-43-43LH630V.htm","title":"TV LG 43\\" 43LH630V LED Full HD Smart","description":"\\u039a\\u03b1\\u03b9 \\u03cc\\u03bc\\u03bf\\u03c1\\u03c6\\u03b7 \\u03ba\\u03b1\\u03b9 \\u03ad\\u03be\\u03c5\\u03c0\\u03bd\\u03b7, \\u03bc\\u03b5 \\u03b9\\u03c3\\u03c7\\u03c5\\u03c1\\u03cc \\u03b5\\u03c0\\u03b5\\u03be\\u03b5\\u03c1\\u03b3\\u03b1\\u03c3\\u03c4\\u03ae \\u03b5\\u03b9\\u03ba\\u03cc\\u03bd\\u03b1\\u03c2 \\u03ba\\u03b1\\u03b9 \\u03bb\\u03b5\\u03b9\\u03c4\\u03bf\\u03c5\\u03c1\\u03b3\\u03b9\\u03ba\\u03cc webOS 3.0 \\u03b5\\u03af\\u03bd\\u03b1\\u03b9 \\u03b7 \\u03c4\\u03b7\\u03bb\\u03b5\\u03cc\\u03c1\\u03b1\\u03c3\\u03b7 \\u03c0\\u03bf\\u03c5 \\u03c0\\u03ac\\u03b5\\u03b9 \\u03c3\\u03c4\\u03bf \\u03c3\\u03b1\\u03bb\\u03cc\\u03bd\\u03b9 \\u03c3\\u03bf\\u03c5","priceCurrency":"EUR","price":369.0}\n{"create":{"_index":"products-guys","_type":"t","_id":"1"}}\n{"url":"http://www.plaisio.gr/thleoraseis/tv/tileoraseis/Samsung-TV-43-UE43M5502.htm","title":"TV Samsung 43\\" UE43M5502 LED ...
This is essentially someone else's code, that I need to make work. It seems that the "data" object I am passing to the PUT method is a string.
When I use requests.post('http://localhost:9200/_bulk', data=data)
I get <Response [406]>.

If you want to do a bulk request using requests:
response = requests.post('http://localhost:9200/_bulk', data=data_1 + data_2, headers={'Content-Type': 'application/json; charset=UTF-8'})
Old Answer
I recommend using the bulk helper from the python library
from elasticsearch import Elasticsearch, helpers
client = Elasticsearch("localhost:9200")
def gendata():
    mywords = ['foo', 'bar', 'baz']
    for word in mywords:
        yield {
            "_index": "mywords",
            "word": word,
        }
resp = helpers.bulk(
    client,
    gendata(),
    index = "some_index",
)
If you haven't changed the Elasticsearch configuration, a new index is created automatically when the first document is indexed.
About the approaches you tried:
The request body is probably malformed. For a bulk ingest, the body has a different shape than a plain JSON array of documents.
You are doing a PUT instead of a POST, and you have to specify the documents you want to ingest.
There is no need to create an empty index first. If you do want to create one, you can run:
curl -X PUT http://localhost:9200/index_name
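
To illustrate the body shape mentioned above, here is a minimal sketch that builds a bulk body with only the standard library. The function name `build_bulk_body`, the index name and the sample documents are assumptions made for illustration; the point is that each document is preceded by an action line and the whole body ends with a newline.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize docs into the newline-delimited body expected by _bulk."""
    lines = []
    for doc_id, doc in enumerate(docs):
        # Action line: says what to do and where.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("mywords", [{"word": "foo"}, {"word": "bar"}])
print(body)
```

A body built this way could then be sent with something like requests.post('http://localhost:9200/_bulk', data=body, headers={'Content-Type': 'application/x-ndjson'}), which is the content type the Elasticsearch documentation names for bulk requests.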

PyMongo Atlas Search not returning anything

I'm trying to do a full text search using Atlas for MongoDB. I'm doing this through the PyMongo driver in Python. I'm using the aggregate pipeline, and doing a $search but it seems to return nothing.
cursor = db.collection.aggregate([
    {"$search": {"text": {"query": "hello", "path": "text_here"}}},
    {"$project": {"file_name": 1}}
])
for x in cursor:
    print(x)
What I'm trying to achieve with this code is to search a field in the collection called "text_here" for the term "hello", return all the results that contain that term, and list them by their "file_name". However, it returns nothing, and I'm quite confused because this is almost identical to the example code in the documentation. The only thing I can think of right now is that possibly the path isn't correct and it can't access the field I've specified. Also, this code raises no errors; it simply returns nothing, as I've confirmed by looping through the cursor.

I had the same issue. I solved it by also passing the name of the index in the query. For example:
{
    index: "name_of_the_index",
    text: {
        query: 'john doe',
        path: 'name'
    }
}
I followed the tutorials but couldn't get any result back without specifying the "index" name. I wish this was mentioned in the documentation as mandatory.
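
For the PyMongo code in the question, the same idea could look like the sketch below. The helper name `build_search_pipeline` and the index name are placeholders; use the name your Atlas Search index was actually created with. As far as I know, when "index" is omitted Atlas Search looks for an index named "default", which would explain empty results when the index has a different name.

```python
def build_search_pipeline(index_name, query, path, project_field):
    """Build an aggregation pipeline whose first stage is $search on a named index."""
    return [
        {"$search": {"index": index_name, "text": {"query": query, "path": path}}},
        {"$project": {project_field: 1}},
    ]


pipeline = build_search_pipeline("name_of_the_index", "hello", "text_here", "file_name")
# With a PyMongo collection this would be run as: db.collection.aggregate(pipeline)
```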

If you are only doing a find and project, you don't need an aggregate query, just a find(). The syntax you want is:
db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
Equivalent using aggregate:
cursor = db.collection.aggregate([
    {'$match': {'$text': {'$search': 'hello'}}},
    {'$project': {'file_name': 1}}])
Worked example:
from pymongo import MongoClient, TEXT
db = MongoClient()['mydatabase']
db.collection.create_index([('text_here', TEXT)])
db.collection.insert_one({"text_here": "hello, is it me you're looking for", "file_name": "foo.bar"})
cursor = db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
for item in cursor:
    print(item)
prints:
{'_id': ObjectId('5fc81ce9a4a46710459de610'), 'file_name': 'foo.bar'}

I am trying to update data but it doesn't get updated in the database

I am new to Python and MongoDB. I want to update data in my database, and the code seems to run fine, but the data still doesn't get updated in the database.
I have tried functions like update and update_one, but no luck so far.
@app.route("/users/update_remedy", methods = ['POST'])
def update_remedy():
    try:
        remedy = mongo.db.Home_Remedies
        name = request.get_json()['name']
        desc = request.get_json()['desc']
        print("S")
        status = remedy.update_one({"name" : name}, {"$set": {"desc" : desc}})
        print("h")
        return jsonify({"result" : "Remedy Updated Successfully"})
    except Exception:
        return 'error'

It's likely that your update_one call is looking for a document that doesn't exist. If the filter of a plain update (one without upsert) doesn't match any documents, no update is performed. Make sure that a document with the field {"name" : name} actually exists. You could also check the return value of update_one to confirm that an update happened. See UpdateResult for details.
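
A minimal sketch of such a check, using the matched_count and modified_count attributes of UpdateResult. The function name `describe_update` is hypothetical, and the SimpleNamespace object only stands in for a real result so the snippet runs without a database.

```python
from types import SimpleNamespace


def describe_update(result):
    """Turn an UpdateResult-like object into a short status message."""
    if result.matched_count == 0:
        return "no document matched the filter"
    if result.modified_count == 0:
        return "document matched but already had this value"
    return "document updated"


# Stand-in for the object returned by update_one(), for illustration only.
fake_result = SimpleNamespace(matched_count=0, modified_count=0)
print(describe_update(fake_result))  # prints "no document matched the filter"
```

In the view from the question, this would be called as describe_update(status) right after the update_one line, and the message could be returned instead of the fixed success string.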

Python 3 JSON/Dictionary help , not sure how to parse values?

I am attempting to write some JSON output to a CSV file, but first I am trying to understand how the data is structured. I am working from a sample script which connects to an API and pulls down data based on a specified query.
The json is returned from the server with this query:
response = api_client.get_search_results(search_id, 'application/json')
body = response.read().decode('utf-8')
body_json = json.loads(body)
If I run
print(body_json.keys())
I get the following output:
dict_keys(['events'])
So is it right to assume that the entries I am really interested in are another dictionary inside the events dictionary?
If so, how can I access them?
Sample JSON data that the search query returns into the variable above:
{
    "events":[
        {
            "source_ip":"10.143.223.172",
            "dest_ip":"104.20.251.41",
            "domain":"www.theregister.co.uk",
            "Domain Path":"NULL",
            "Domain Query":"NULL",
            "Http Method":"GET",
            "Protocol":"HTTP",
            "Category":"NULL",
            "FullURL":"http://www.theregister.co.uk"
        },
        {
            "source_ip":"10.143.223.172",
            "dest_ip":"104.20.251.41",
            "domain":"www.theregister.co.uk",
            "Domain Path":"/2017/05/25/windows_is_now_built_on_git/",
            "Domain Query":"NULL",
            "Http Method":"GET",
            "Protocol":"HTTP",
            "Category":"NULL",
            "FullURL":"http://www.theregister.co.uk/2017/05/25/windows_is_now_built_on_git/"
        }
    ]
}
Any help would be greatly appreciated.

keys() only returns the top-level keys of the dictionary, so you have to loop over the nested entries as well.
Here is the code:
for key in json_data.keys():
    for i in range(len(json_data[key])):
        key2 = json_data[key][i].keys()
        for k in key2:
            print(k + ":" + json_data[key][i][k])
Output:
Http Method:GET
Category:NULL
domain:www.theregister.co.uk
Protocol:HTTP
Domain Query:NULL
Domain Path:NULL
source_ip:10.143.223.172
FullURL:http://www.theregister.co.uk
dest_ip:104.20.251.41
Http Method:GET
Category:NULL
domain:www.theregister.co.uk
Protocol:HTTP
Domain Query:NULL
Domain Path:/2017/05/25/windows_is_now_built_on_git/
source_ip:10.143.223.172
FullURL:http://www.theregister.co.uk/2017/05/25/windows_is_now_built_on_git/
dest_ip:104.20.251.41

To answer your question: yes. Your body_json has returned a dictionary with a key of "events" which contains a list of dictionaries.
The best way to 'access' them would be to iterate over them.
A very rudimentary example:
for i in body_json['events']:
    print(i)
Of course, during the iteration you could access the specific data you need by replacing print(i) with print(i['FullURL']), saving it to a variable, and so on.
It's important to note that whenever you're working with APIs that return a JSON response, you're simply working with dictionaries, lists and other ordinary Python data structures.
Best of luck.
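
Since the stated goal is a CSV file, here is a minimal sketch of that last step using the standard csv module. The sample data is a trimmed copy of the events from the question, the function name `events_to_csv` is hypothetical, and the sketch assumes every event has the same flat set of keys.

```python
import csv
import io

# Trimmed copy of the sample response from the question.
body_json = {
    "events": [
        {"source_ip": "10.143.223.172", "domain": "www.theregister.co.uk", "Http Method": "GET"},
        {"source_ip": "10.143.223.172", "domain": "www.theregister.co.uk", "Http Method": "GET"},
    ]
}


def events_to_csv(events, out):
    """Write a list of flat dictionaries to `out` as CSV, one row per event."""
    fieldnames = list(events[0].keys())  # assumes every event has the same keys
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(events)


buffer = io.StringIO()  # swap for open("events.csv", "w", newline="") to write a file
events_to_csv(body_json["events"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```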

Remove attribute from all MongoDB documents using Python and PyMongo

In my MongoDB, a bunch of these documents exist:
{ "_id" : ObjectId("5341eaae6e59875a9c80fa68"),
"parent" : {
"tokeep" : 0,
"toremove" : 0
}
}
I want to remove the parent.toremove attribute in every single one.
Using the MongoDB shell, I can accomplish this using:
db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
But how do I do this within Python?
app = Flask(__name__)
mongo = PyMongo(app)
mongo.db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
returns the following error:
File "myprogram.py", line 46
mongo.db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
^
SyntaxError: invalid syntax

Put quotes around $unset, name the parameter you're including (multi) and use the correct syntax for true:
mongo.db.collection.update({}, {'$unset': {'parent.toremove':1}}, multi=True)

I found it odd to have to attach an arbitrary value to the field being removed, such as a small number (1) or an empty string (''), but it is indeed what the MongoDB docs describe, with a sample in JavaScript:
$unset
The $unset operator deletes a particular field. Consider the following syntax:
{ $unset: { field1: "", ... } }
The specified value in the $unset expression (i.e. "") does not impact the operation.
For Python/PyMongo, I prefer to use None as the value:
{'$unset': {'field1': None}}
So, for OP's question, it would be:
mongo.db.collection.update({}, {'$unset': {'parent.toremove': None}}, multi=True)
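
On newer PyMongo versions the same operation is written with update_many, because Collection.update was deprecated in PyMongo 3.0 and removed in 4.0. A minimal sketch, assuming `collection` is a PyMongo collection; the function name `remove_field_everywhere` is hypothetical, and the small recording class exists only so the snippet runs without a server.

```python
def remove_field_everywhere(collection, field_path):
    """Unset `field_path` in every document of `collection`.

    `collection` is assumed to be a pymongo.collection.Collection (PyMongo 3.0+).
    """
    # The empty filter matches all documents; the value given to $unset is ignored.
    return collection.update_many({}, {'$unset': {field_path: ''}})


class RecordingCollection:
    """Tiny stand-in that records the call, so the sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def update_many(self, filter, update):
        self.calls.append((filter, update))


fake = RecordingCollection()
remove_field_everywhere(fake, 'parent.toremove')
print(fake.calls)
```

With Flask-PyMongo as in the question, the real call would be remove_field_everywhere(mongo.db.collection, 'parent.toremove').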
