Python mongoengine How to query an EmbeddedDocument - python

I am trying to query a document in MongoDB.
The schema is below:
class Book(EmbeddedDocument):
    name = StringField()
    description = StringField()

class Category(EmbeddedDocument):
    name = StringField()
    books = ListField(EmbeddedDocumentField(Book))

class Main(Document):
    category = EmbeddedDocumentField(Category)
What I need is to retrieve the Book with a given name, say "Python For Dummies".
I tried using
Main.objects(category__books__name="Python For Dummies")[0]
as well as
Main.objects(__raw__={'category.books.name': 'Python For Dummies'})[0]
Both return a single Main document whose books list contains a book named "Python For Dummies". But what I want is the Book embedded document alone, not the entire document. I only need to list that single book's information. As it stands, I have to traverse the books list of the Main document again and compare each book's name to find the right one. I think there must be a better way in mongoengine/Python to achieve this.
Please advise.

You can limit the output with only().
Query
Main.objects(category__books__name="Python For Dummies").only("category.books")
Result
[{"category": {"books": [{"name": "Python For Dummies", "description": "a test book"}]}}]
But that would not get you exactly what you want. To achieve this, you would need to use aggregate and $unwind.
Query
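list(Main.objects.aggregate(
    {"$match": {"category.books.name": "Python For Dummies"}},
    {"$unwind": "$category.books"},
    # after $unwind each document holds one book; keep only the matching one
    {"$match": {"category.books.name": "Python For Dummies"}},
    {"$group": {"_id": None, "books": {"$push": "$category.books"}}}
))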
Result
[{'_id': None, 'books': [{'description': 'a test book', 'name': 'Python For Dummies'}]}]
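If only the book itself is wanted, one option is to promote each unwound book to the top level with $replaceRoot instead of grouping. The following is a minimal sketch, not part of the original answer: book_pipeline and find_books are hypothetical helper names, and it assumes a pymongo-style collection object (for example the one returned by Main._get_collection()) and MongoDB 3.4 or later, where $replaceRoot is available.

```python
def book_pipeline(book_name):
    """Build an aggregation pipeline that yields only the matching books.

    Hypothetical helper: field paths follow the Main/Category/Book schema
    from the question.
    """
    return [
        # Cheap first filter: keep documents that contain the book at all.
        {"$match": {"category.books.name": book_name}},
        # One output document per element of category.books.
        {"$unwind": "$category.books"},
        # Drop the other books that lived in the same list.
        {"$match": {"category.books.name": book_name}},
        # Make the embedded book the whole result document.
        {"$replaceRoot": {"newRoot": "$category.books"}},
    ]


def find_books(collection, book_name):
    """Run the pipeline on a pymongo-style collection (assumed interface)."""
    return list(collection.aggregate(book_pipeline(book_name)))


# Possible usage, assuming the Main document class from the question:
# books = find_books(Main._get_collection(), "Python For Dummies")
```

Each result is then a plain dict holding the book's fields, so no second pass over the books list is needed on the Python side.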

Related

How do I get the most recently inserted document in MongoDB with all its fields?

I'm working on a REST application in Python with Flask and a driver called pymongo. Anyone who knows MongoDB well may be able to answer my question.
Suppose I'm inserting a new document into a collection, say students. I want to get the whole inserted document as soon as it is saved in the collection. Here is what I've tried so far.
res = db.students.insert_one({
    "name": args["name"],
    "surname": args["surname"],
    "student_number": args["student_number"],
    "course": args["course"],
    "mark": args["mark"]
})
If I call:
print(res.inserted_id)  # I get the id
How can I get something like:
{
    "name": "student1",
    "surname": "surname1",
    "mark": 78,
    "course": "ML",
    "student_number": 2
}
from the res object? If I print res, I get <pymongo.results.InsertOneResult object at 0x00000203F96DCA80>.
Put the data to be inserted into a dictionary variable; on insert, the variable will have the _id added by pymongo.
from pymongo import MongoClient
db = MongoClient()['mydatabase']
doc = {
    "name": "name"
}
db.students.insert_one(doc)
print(doc)
prints:
{'name': 'name', '_id': ObjectId('60ce419c205a661d9f80ba23')}
Unfortunately, the commenters are correct. The PyMongo pattern doesn't specifically allow for what you are asking. You are expected to use the inserted_id from the result and, if you need the full object from the collection later, run a regular query afterwards.
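The "insert, then query by inserted_id" approach described above can be sketched as follows. This is only an illustration: insert_and_fetch is a hypothetical helper name, and the collection argument is assumed to be a pymongo collection such as db.students.

```python
def insert_and_fetch(collection, doc):
    """Insert doc, then read the stored document back by its new _id.

    Hypothetical helper; relies only on insert_one() and find_one(),
    which pymongo collections provide.
    """
    result = collection.insert_one(doc)
    # inserted_id is the _id of the document that was just written.
    return collection.find_one({"_id": result.inserted_id})


# Possible usage with the students collection from the question:
# stored = insert_and_fetch(db.students, {"name": "student1", "mark": 78})
# print(stored)
```

This costs one extra round trip compared with reusing the dictionary that was passed to insert_one, but it returns the document exactly as the server stored it.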

How to search via production_id?

I am attempting to get a field from a manufacturing order to a related work order.
I have tried:
for record in records:
    if record.production_id:
        so = env['mrp.production'].search([('name', '=', record.production_id)])
        if so:
            record.write({"x_customer": so.x_customer_nick_name})
This does not work. However, if I hard-code the actual name of the manufacturing order, it works as intended:
for record in records:
    if record.production_id:
        so = env['mrp.production'].search([('name', '=', 'Boost/BoMO/73222')])
        if so:
            record.write({"x_customer": so.x_customer_nick_name})
I believe this is due to production_id: from the raw data I can see it is [11212, 'Boost/BoMO/73222'].
So I only need the second element (the name), however:
so = env['mrp.production'].search([('name', '=', record.production_id[1])])
does not return the string name of the production_id. How should I go about getting this data?
Error code:
pobjs = [adapt(o) for o in self._seq]
psycopg2.ProgrammingError: can't adapt type 'dict'
Please try this if production_id is a relation to mrp.production:
if record.production_id:
    so = record.production_id
    if so:
        record.write({"x_customer": so.x_customer_nick_name})
production_id is a Many2one field pointing to mrp.production, so you can use it to access any field on mrp.production. In your case you want the name of the manufacturing order, so it should be something like the following:
so = self.env['mrp.production'].search([('name', '=', record.production_id.name)])
I wasn't able to find a solution here or on the Odoo forums. For anyone else looking for a solution, I ended up using the web API. I pulled the work orders and the manufacturing orders for their production IDs, built a tuple, and wrote to each work order in a loop.
models.execute_kw(db, uid, password, 'mrp.workorder', 'write', [workorders_id, {
    'x_for_customer': x_customer_nick_name
}])

PyMongo Atlas Search not returning anything

I'm trying to do a full text search using Atlas for MongoDB. I'm doing this through the PyMongo driver in Python. I'm using the aggregate pipeline, and doing a $search but it seems to return nothing.
cursor = db.collection.aggregate([
    {"$search": {"text": {"query": "hello", "path": "text_here"}}},
    {"$project": {"file_name": 1}}
])
for x in cursor:
    print(x)
What I'm trying to achieve with this code is to search a field in the collection called "text_here" for the term "hello", and return all documents that contain that term, listed by their "file_name". However, it returns nothing, and I'm quite confused, as this is almost identical to the example code on the documentation website. The only thing I can think of right now is that possibly the path isn't correct and it can't access the field I've specified. Also, this code raises no errors; it simply returns nothing, as I've confirmed by looping through the cursor.
I had the same issue. I solved it by also passing the name of the index in the query. For example:
{
    index: "name_of_the_index",
    text: {
        query: 'john doe',
        path: 'name'
    }
}
I followed the tutorials but couldn't get any result back without specifying the "index" name. I wish this was mentioned in the documentation as mandatory.
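Applied to the PyMongo pipeline from the question, the same fix might look like the sketch below. search_pipeline is a hypothetical helper name and "name_of_the_index" is a placeholder; the index name has to match the Atlas Search index actually defined on the collection. When the index option is left out, Atlas Search looks for an index named "default", which would explain empty results if the index was created under another name.

```python
def search_pipeline(query, path, index_name):
    """Build a $search pipeline that names the Atlas Search index explicitly.

    Hypothetical helper; the $project stage mirrors the one in the question.
    """
    return [
        {
            "$search": {
                "index": index_name,
                "text": {"query": query, "path": path},
            }
        },
        {"$project": {"file_name": 1}},
    ]


# Possible usage with the collection from the question:
# cursor = db.collection.aggregate(
#     search_pipeline("hello", "text_here", "name_of_the_index")
# )
# for x in cursor:
#     print(x)
```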
If you are only doing a find and project, you don't need an aggregate query, just a find(). The syntax you want is:
db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
Equivalent using aggregate:
cursor = db.collection.aggregate([
    {'$match': {'$text': {'$search': 'hello'}}},
    {'$project': {'file_name': 1}}
])
Worked example:
from pymongo import MongoClient, TEXT
db = MongoClient()['mydatabase']
db.collection.create_index([('text_here', TEXT)])
db.collection.insert_one({"text_here": "hello, is it me you're looking for", "file_name": "foo.bar"})
cursor = db.collection.find({'$text': {'$search': 'hello'}}, {'file_name': 1})
for item in cursor:
    print(item)
prints:
{'_id': ObjectId('5fc81ce9a4a46710459de610'), 'file_name': 'foo.bar'}

Iterate over json on sqlalchemy query

I'm working on a project that has a questionnaire in it.
I'm using Python, Flask, Postgres and Sqlalchemy.
I need to build a search endpoint that filters the documents by the title or by any of the answers in the questionnaire.
The database is structured the following way:
[Client] - One to Many - [Document] - One to Many - [DocumentVersion]
So that one Client can have many Documents and each document may have many Document Versions.
### DocumentVersion Model
class DocumentVersion(db.Model):
    __tablename__ = 'document_version'
    id = db.Column(db.Integer, primary_key=True)
    document_id = db.Column(db.Integer, db.ForeignKey('document.id'), nullable=False)
    answers = db.Column(JSON, nullable=False)
    # ... other columns
    document = relationship('Document', back_populates='versions')

    @hybrid_method
    def answers_contain(self, text):
        '''returns True if the text appears in any of the answers'''
        contains_text = False
        for answer in self.answers:
            if text == str(answer['answer'].astext):
                contains_text = True
        return contains_text
Inside the [DocumentVersion] table, there is a JSONB field storing the questions and the answers.
The json is structured the following way:
[{
    "value": "question",
    "answer": "foo",
    ...
},
{
    "value": "question",
    "answer": "bar",
    ...
},
...
]
The filter document by document title is working fine, but I can't figure out a way to filter by the answers in the json.
I believe I have to iterate over the JSON to make the filter. So I tried to create a @hybrid_method called answers_contain to do so,
but when I do for answer in self.answers in the hybrid method, the loop never ends. I wonder if it's possible to iterate over the JSON
while making the query. If I try len(self.answers) inside the hybrid method, I get a
TypeError: object of type 'InstrumentedAttribute' has no len().
### Search endpoint
try:
    page = int(request.args.get('page', 1))
    per_page = int(request.args.get('per_page', 20))
    search_param = str(request.args.get('search', ''))
except:
    abort(400, "invalid parameters")
paginated_query = Document.query \
    .filter_by(client_id=current_user['client_id']) \
    .join(Document.versions) \
    .filter(or_(
        Document.title.ilike(f'%{search_param}%'),
        DocumentVersion.answers_contain(f'%{search_param}%'),
    )) \
    .order_by(desc(Document.created_at)) \
    .paginate(page=page, per_page=per_page)
I also tried to filter like this:
DocumentVersion.answers.ilike(f'%{search_param}%'), which gives me an error and a hint:
HINT: No operator matches the given name and argument types. You might need to add explicit type casts. If I added explicit type casts I would have to hardcode the questions, but I can't, since they can change.
What is the best way to do this filtering? I'd like to avoid bringing all the client documents to the backend server, if possible.
Is there a way to iterate over the json while making the query, on the db server?
Thanks in advance.

Mongoengine: Querying against reference field returns no results, but field is valid on object

Simplified, I have two Documents, a Package and a Target. A Package contains a list of DBRef()s to its Targets, and each Target contains a DBRef to its Package. It's set up like so:
class Target(Document):
    package = ReferenceField('Package')
    name = StringField()

class Package(Document):
    targets = ListField(ReferenceField('Target'))
    name = StringField()
When I create one set of Targets the standard way using mongoengine things work as expected:
# in a method on Package, self = current Package instance
target = Target(name="My Name", package=self).save()
self.targets.insert(0, target)
I can query across Targets by that package and it returns results as expected:
>>> p = Package.objects.get(id='53e4db7bc57d207fc3d70738')
>>> Target.objects(name="My Name", package=p)
[<Target: My Name>]
BUT for certain Targets we are creating a lot at once and instead use a bulk execute to create the documents (simplified):
bulk = Target._get_collection().initialize_unordered_bulk_op()
for line in csv_reader:
    entry = dict(zip(header, line))
    entry.update({'name': 'My Name', '_cls': 'Target', 'package': package.to_dbref()})
    bulk.find({'name': 'My Name', 'package': package.to_dbref()}).upsert().update({'$set': entry})
bulk.execute()
These objects appear totally normal at a glance; however, queries referencing their packages do not work, even though reading their values seems to work without issue.
(Pdb) self
<Package: Video Package 53e3edc7c57d2079436a14c9>
(Pdb) Target.objects(name="My Name", package=self)
[]
(Pdb) Target.objects(name="My Name")
[<Target: My Name>]
(Pdb) Target.objects(name="My Name")[0].package
<Package: Video Package 53e3edc7c57d2079436a14c9>
Even doing a to_json the objects seem to look comparable:
>>> bad_target.to_json()
'{"_id": {"$oid": "53e4334cd2f370649dce5e20"}, "_cls": "Target", "package": {"$oid": "53e4334cc57d207f0a86a3c7"}, "name": "Bad Target"}'
>>> good_target.to_json()
'{"_id": {"$oid": "53e4db7bc57d207fc3d70739"}, "_cls": "Target", "package": {"$oid": "53e4db7bc57d207fc3d70738"}, "name": "Good Target"}'
It seems fairly clear that the bulk operation is the likely culprit here, but as far as I can tell it's being done correctly and normal operations such as retrieving the reference Package object work as expected, but queries will just not return anything. Any ideas or suggestions would be very much appreciated!
Update with new info
Using a raw pymongo query yields some more interesting results:
>>> Target.objects(__raw__={"name": "My Name", "package": p.to_dbref()})
[]
>>> Target.objects(__raw__={"name": "My Name", "package": p.id})
[<KeywordTarget: My Name>]
This seems to give more indication that the way the Targets are saved is indeed the issue.
