I'm trying to do a full text search using Atlas for MongoDB. I'm doing this through the PyMongo driver in Python. I'm using the aggregate pipeline, and doing a $search but it seems to return nothing.
cursor = db.collection.aggregate([
{"$search": {"text": {"query": "hello", "path": "text_here"}}},
{"$project": {"file_name": 1}}
])
for x in cursor:
print(x)
What I'm trying to achieve with this code is to search through a field in the collection called "text_here", and I'm searching for a term "hello" and returning all the results that contain that term and listing them by their "file_name". However, it returns nothing and I'm quite confused as this is almost identical to the example code on the documentation website. The only thing I could think of right now is that possible the path isn't correct and it can't access the field I've specified. Also, this code returns no errors, simply just returns nothing as I've tested by looping through cursor.
I had the same issue. I solved it by also passing the name of the index in the query. For example:
{
index: "name_of_the_index",
text: {
query: 'john doe',
path: 'name'
}
}
I followed the tutorials but couldn't get any result back without specifying the "index" name. I wish this was mentioned in the documentation as mandatory.
If you are only doing a find and project, you don't need an aggregate query, just a find(). The syntax you want is:
db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
Equivalent using aggregate:
cursor = db.collection.aggregate([
{'$match': {'$text': {'$search': 'hello'}}},
{'$project': {'file_name': 1}}])
Worked example:
from pymongo import MongoClient, TEXT
db = MongoClient()['mydatabase']
db.collection.create_index([('text_here', TEXT)])
db.collection.insert_one({"text_here": "hello, is it me you're looking for", "file_name": "foo.bar"})
cursor = db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
for item in cursor:
print(item)
prints:
{'_id': ObjectId('5fc81ce9a4a46710459de610'), 'file_name': 'foo.bar'}
Related
I'm working on this REST application in python Flask and a driver called pymongo. But if someone knows mongodb well he/she maybe able to answer my question.
Suppose Im inserting a new document in a collection say students. I want to get the whole inserted document as soon as the document is saved in the collection. Here is what i've tried so far.
res = db.students.insert_one({
"name": args["name"],
"surname": args["surname"],
"student_number": args["student_number"],
"course": args["course"],
"mark": args["mark"]
})
If i call:
print(res.inserted_id) ## i get the id
How can i get something like:
{
"name": "student1",
"surname": "surname1",
"mark": 78,
"course": "ML",
"student_number": 2
}
from the res object. Because if i print res i am getting <pymongo.results.InsertOneResult object at 0x00000203F96DCA80>
Put the data to be inserted into a dictionary variable; on insert, the variable will have the _id added by pymongo.
from pymongo import MongoClient
db = MongoClient()['mydatabase']
doc = {
"name": "name"
}
db.students.insert_one(doc)
print(doc)
prints:
{'name': 'name', '_id': ObjectId('60ce419c205a661d9f80ba23')}
Unfortunately, the commenters are correct. The PyMongo pattern doesn't specifically allow for what you are asking. You are expected to just use the inserted_id from the result and if you needed to get the full object from the collection later do a regular query operation afterwards
I'm currently trying to save some python objects (websites) via PyRavenDB in a RavenDB database. The problem is that data are saved properly, but when I test it by querying the results, some of the attributes are returned empty.
The code is simple, I can't properly find be the problem.
The JSON object in the database is the following (verified via the DB web UI).
{
"htmlCode": "<code>TEST HTML</code>",
"added": "2017-02-21",
"uniqueid": "262e4584f3e546afa2c67045a0096b54",
"url": "www.example.com",
"myHash": "d41d8cd98f00b204e9800998ecf8427e",
"lastaccessed": "2017-02-21"
}
When I use this code to query
from pyravendb.store import document_store
store = document_store.documentstore(url="http://somewhere:someport", database="websites")
store.initialize()
with store.open_session() as session:
query_result = list(session.query().where_equals("www.example.com", url))
print query_result
print type(query_result)
return query_result
It returns this object :
{
'uniqueid': 'f942e86f965d4709a2d69caca3001f2a',
'url': '',
'myHash': 'd41d8cd98f00b204e9800998ecf8427e',
'htmlCode': '',
'added': '2017-02-21',
'lastaccessed': '2017-02-21'
}
As you can see, url and html code are empty. They should be okey since in DB they are properly stored.
Thanks.
The problem here is that you don't use the where_equal right.
where_equal first argument is the field name you want to query with and then the value (def where_equals(self, field_name, value)).
Just change this line query_result = list(session.query().where_equals("www.example.com", url))
To this query_result = list(session.query().where_equals("url", "www.example.com"))
This will fix your problem
I am receiving a generic Django error on the line of code listed below. I am having a hard time understanding the pymongo docs on how the parameters are to be set up for this function. I am thinking i wrote it incorrectly. I have a collection of request documents. Each request document has a "request" key with a value(subreddit_name + "F"). This is what i would like to query and find the document by. Each document also have a "pdone" key with a value(pdone variable). This is the key-value inside the document that i would like to change.
Line of code where the error occurs:
self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'pdone': pdone}, return_document=ReturnDocument.AFTER)
Here is an insert for a document of the collection:
collection_requests.insert({'request': subreddit_name + "F", 'pdone': 0})
Edit: still receiving the same error at the same line of code after changing it to: self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'$set': {'pdone': pdone}}, return_document=ReturnDocument.AFTER)
hm... it seems you forget to specify the update operator
try something like:
self.collection_requests.find_one_and_update({'request': self.subreddit_name + "F"}, {'$set': {'pdone': pdone}}, return_document=ReturnDocument.AFTER)
In my MongoDB, a bunch of these documents exist:
{ "_id" : ObjectId("5341eaae6e59875a9c80fa68"),
"parent" : {
"tokeep" : 0,
"toremove" : 0
}
}
I want to remove the parent.toremove attribute in every single one.
Using the MongoDB shell, I can accomplish this using:
db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
But how do I do this within Python?
app = Flask(__name__)
mongo = PyMongo(app)
mongo.db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
returns the following error:
File "myprogram.py", line 46
mongo.db.collection.update({},{$unset: {'parent.toremove':1}},false,true)
^
SyntaxError: invalid syntax
Put quotes around $unset, name the parameter you're including (multi) and use the correct syntax for true:
mongo.db.collection.update({}, {'$unset': {'parent.toremove':1}}, multi=True)
Just found weird to have to attach an arbitrary value for the field to remove, such as a small number (1), an empty string (''), etc, but it's really mentioned in MongoDB doc, with sample in JavaScript:
$unset
The $unset operator deletes a particular field. Consider the following syntax:
{ $unset: { field1: "", ... } }
The specified value in the $unset expression (i.e. "") does not impact
the operation.
For Python/PyMongo, I'd like to put a value None:
{'$unset': {'field1': None}}
So, for OP's question, it would be:
mongo.db.collection.update({}, {'$unset': {'parent.toremove': None}}, multi=True)
I'm attempting to create a web service using MongoDB and Flask (using the pymongo driver). A query to the database returns documents with the "_id" field included, of course. I don't want to send this to the client, so how do I remove it?
Here's a Flask route:
#app.route('/theobjects')
def index():
objects = db.collection.find()
return str(json.dumps({'results': list(objects)},
default = json_util.default,
indent = 4))
This returns:
{
"results": [
{
"whatever": {
"field1": "value",
"field2": "value",
},
"whatever2": {
"field3": "value"
},
...
"_id": {
"$oid": "..."
},
...
}
]}
I thought it was a dictionary and I could just delete the element before returning it:
del objects['_id']
But that returns a TypeError:
TypeError: 'Cursor' object does not support item deletion
So it isn't a dictionary, but something I have to iterate over with each result as a dictionary. So I try to do that with this code:
for object in objects:
del object['_id']
Each object dictionary looks the way I'd like it to now, but the objects cursor is empty. So I try to create a new dictionary and after deleting _id from each, add to a new dictionary that Flask will return:
new_object = {}
for object in objects:
for key, item in objects.items():
if key == '_id':
del object['_id']
new_object.update(object)
This just returns a dictionary with the first-level keys and nothing else.
So this is sort of a standard nested dictionaries problem, but I'm also shocked that MongoDB doesn't have a way to easily deal with this.
The MongoDB documentation explains that you can exclude _id with
{ _id : 0 }
But that does nothing with pymongo. The Pymongo documentation explains that you can list the fields you want returned, but "(“_id” will always be included)". Seriously? Is there no way around this? Is there something simple and stupid that I'm overlooking here?
To exclude the _id field in a find query in pymongo, you can use:
db.collection.find({}, {'_id': False})
The documentation is somewhat missleading on this as it says the _id field is always included. But you can exclude it like shown above.
Above answer fails if we want specific fields and still ignore _id. Use the following in such cases:
db.collection.find({'required_column_A':1,'required_col_B':1, '_id': False})
You are calling
del objects['_id']
on the cursor object!
The cursor object is obviously an iterable over the result set and not single
document that you can manipulate.
for obj in objects:
del obj['_id']
is likely what you want.
So your claim is completely wrong as the following code shows:
import pymongo
c = pymongo.Connection()
db = c['mydb']
db.foo.remove({})
db.foo.save({'foo' : 42})
for row in db.foo.find():
del row['_id']
print row
$ bin/python foo.py
> {u'foo': 42}