I'm working on a project that includes a questionnaire.
I'm using Python, Flask, Postgres and SQLAlchemy.
I need to build a search endpoint that filters the documents by the title or by any of the answers in the questionnaire.
The database is structured the following way:
[Client] - One to Many - [Document] - One to Many - [DocumentVersion]
So that one Client can have many Documents and each document may have many Document Versions.
### DocumentVersion Model
class DocumentVersion(db.Model):
    __tablename__ = 'document_version'
    id = db.Column(db.Integer, primary_key=True)
    document_id = db.Column(db.Integer, db.ForeignKey('document.id'), nullable=False)
    answers = db.Column(JSON, nullable=False)
    # ... other columns
    document = relationship('Document', back_populates='versions')

    @hybrid_method
    def answers_contain(self, text):
        '''returns True if the text appears in any of the answers'''
        contains_text = False
        for answer in self.answers:
            if text == str(answer['answer'].astext):
                contains_text = True
        return contains_text
Inside the [DocumentVersion] table, there is a JSONB field storing the questions and the answers.
The json is structured the following way:
[{
    "value": "question",
    "answer": "foo",
    ...
},
{
    "value": "question",
    "answer": "bar",
    ...
},
...
]
Filtering documents by title works fine, but I can't figure out a way to filter by the answers in the JSON.
I believe I have to iterate over the JSON to build the filter, so I tried to create a @hybrid_method called answers_contain to do so,
but when I run for answer in self.answers in the hybrid method, the loop never ends. I wonder if it's possible to iterate over the JSON
while making the query. If I try len(self.answers) inside the hybrid method, I get:
TypeError: object of type 'InstrumentedAttribute' has no len().
### Search endpoint
try:
    page = int(request.args.get('page', 1))
    per_page = int(request.args.get('per_page', 20))
    search_param = str(request.args.get('search', ''))
except:
    abort(400, "invalid parameters")

paginated_query = Document.query \
    .filter_by(client_id=current_user['client_id']) \
    .join(Document.versions) \
    .filter(or_(
        Document.title.ilike(f'%{search_param}%'),
        DocumentVersion.answers_contain(f'%{search_param}%'),
    )) \
    .order_by(desc(Document.created_at)) \
    .paginate(page=page, per_page=per_page)
I also tried to filter like this:
DocumentVersion.answers.ilike(f'%{search_param}%'), which gives me an error and a hint:
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
If I added explicit type casts, I would have to hardcode the questions, but I can't, since they can change.
What is the best way to do this filtering? I'd like to avoid bringing all the client documents to the backend server, if possible.
Is there a way to iterate over the json while making the query, on the db server?
Thanks in advance.
I have the following problem.
I have a class User (simplified example):
class User:
    def __init__(self, name, lastname, status, id=None):
        self.id = id
        self.name = name
        self.lastname = lastname
        self.status = status

    def set_status(self, status):
        # call to the api to change status
        pass

    def get_data_from_db_by_id(self):
        # select data from db where id = self.id
        pass

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return (self.id, self.name, self.lastname, self.status) == \
               (other.id, other.name, other.lastname, other.status)
And I have a database structure like:
id, name, lastname, status
1, Alex, Brown, free
And json response from an API:
{
    "id": 1,
    "name": "Alex",
    "lastname": "Brown",
    "status": "Sleeping"
}
My question is:
What is the best way to compare JSON and SQL responses?
What for? It's only for testing purposes: I have to check that the API has changed the DB correctly.
How can I deserialize the JSON and the DB result to the same class? Are there any common/best practices?
For now, I'm trying to use marshmallow for JSON and SQLAlchemy for the DB, but have had no luck with it.
Convert the database row to a dictionary:
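def row2dict(row):
    d = {}
    for column in row.__table__.columns:
        # keep the native Python value; wrapping it in str() would turn the
        # integer id 1 into "1", which never equals the 1 parsed from JSON
        d[column.name] = getattr(row, column.name)
    return d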
Then convert json string to a dictionary:
d2 = json.loads(json_response)
And finally compare:
d2 == d
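For a test, a plain equality check only says that something differs, not what. Below is a minimal, self-contained sketch of the same dict-to-dict comparison that also reports the differing fields. It assumes an in-memory SQLite table standing in for the real database, and the helper names fetch_user and diff_dicts are made up for illustration. Note that the comparison is type-sensitive: the integer 1 does not equal the string "1".

```python
import json
import sqlite3

# Hypothetical stand-in for the real database: an in-memory SQLite table
# with the same columns as the example in the question.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, lastname TEXT, status TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'Alex', 'Brown', 'free')")


def fetch_user(conn, user_id):
    """Return the user row as a plain dict, or None if it does not exist."""
    row = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchone()
    return dict(row) if row is not None else None


def diff_dicts(expected, actual):
    """Return {key: (expected_value, actual_value)} for every key that differs."""
    keys = set(expected) | set(actual)
    return {
        k: (expected.get(k), actual.get(k))
        for k in keys
        if expected.get(k) != actual.get(k)
    }


api_response = '{"id": 1, "name": "Alex", "lastname": "Brown", "status": "Sleeping"}'
from_api = json.loads(api_response)
from_db = fetch_user(conn, 1)

# With the sample data from the question only the status differs.
mismatches = diff_dicts(from_api, from_db)
print(mismatches)
```

In a test, asserting that mismatches is empty gives a failure message that names the offending fields.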
If you are using SQLAlchemy for the database, then I would recommend using SQLAthanor (full disclosure: I am the library’s author).
SQLAthanor is a serialization and de-serialization library for SQLAlchemy that lets you configure robust rules for how to serialize / de-serialize your model instances to JSON. One way of checking your instance and JSON for equivalence is to execute the following logic in your Python code:
First, serialize your DB instance to JSON. Using SQLAthanor you can do that as simply as:
instance_as_json = my_instance.dump_to_json()
This will take your instance and dump all of its attributes to a JSON string. If you want more fine-grained control over which model attributes end up on your JSON, you can also use my_instance.to_json() which respects the configuration rules applied to your model.
Once you have your serialized JSON string, you can use the Validator-Collection to convert your JSON strings to dicts, and then check if your instance dict (from your instance JSON string) is equivalent to the JSON from the API (full disclosure: I’m also the author of the Validator-Collection library):
from validator_collection import checkers, validators
api_json_as_dict = validators.dict(api_json_as_string)
instance_json_as_dict = validators.dict(instance_as_json)
are_equivalent = checkers.are_dicts_equivalent(instance_json_as_dict, api_json_as_dict)
Depending on your specific situation and objectives, you can construct even more elaborate checks and validations as well, using SQLAthanor’s rich serialization and deserialization options.
Here are some links that you might find helpful:
SQLAthanor Documentation on ReadTheDocs
SQLAthanor on Github
.dump_to_json() documentation
.to_json() documentation
Validator-Collection Documentation
validators.dict() documentation
checkers.are_dicts_equivalent() documentation
Hope this helps!
Inside the expenses collection I have this Json:
{
    "_id" : ObjectId("5ad0870d2602ff20497b71b8"),
    "Hotel" : {}
}
I want to insert a document or another object if possible inside Hotel using Python.
My Python code:
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client['db']
collection_expenses = db['expenses']
#insert
d = int(input('Insert how many days did you stay?: '))
founded_expenses = collection_expenses.insert_one({'days':d})
The code above inserts a new document into the collection. What should I change to add the days inside the Hotel object?
Thanks in advance.
Instead of using insert_one, you may want to take a look at the save method, which is a little more permissive.
Assuming your document has already been created in the collection:
[...]
expenses = db['expenses']
# Find your document
expense = expenses.find_one({})
expense["Hotel"] = { "days": d }
# This will either update the expense dict or save it as a new document,
# depending on whether or not it already has an _id field
expenses.save(expense)
Since find_one returns None if no such document exists, you may want to upsert the document instead; save makes that easy.
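Note that save was deprecated in PyMongo 3 and removed in PyMongo 4. On current versions, one way to get the same effect is update_one with the $set operator and upsert=True. The sketch below is not the original answer's method: the helper name set_hotel_days is hypothetical, and it assumes you already know the _id of the document to change.

```python
def set_hotel_days(collection, document_id, days):
    """Set Hotel.days on one document, creating the document if it is missing.

    `collection` is expected to behave like a pymongo Collection.
    Returns whatever update_one returns (an UpdateResult with pymongo).
    """
    return collection.update_one(
        {"_id": document_id},            # which document to touch
        {"$set": {"Hotel.days": days}},  # dot notation reaches into the Hotel sub-document
        upsert=True,                     # insert a new document when nothing matches
    )


# Usage against a real server (assumes pymongo is installed and mongod is running):
#   from pymongo import MongoClient
#   expenses = MongoClient('localhost', 27017)['db']['expenses']
#   doc = expenses.find_one({})
#   set_hotel_days(expenses, doc['_id'], 3)
```

Because $set only touches the named field, any other keys already stored under Hotel are left alone, whereas assigning a whole new dict to expense["Hotel"] replaces them.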
I'm really new to Python & as new to Pyramid (this is the first thing I've written in Python) and am having trouble with a database query...
I have the following models (relevant to my question anyway):
MetadataRef (contains info about a given metadata type)
Metadata (contains actual metadata) -- this is a child of MetadataRef
User (contains users) -- this is linked to metadata. MetadataRef.model = 'User' and metadata.model_id = user.id
I need access to name from MetadataRef and value from Metadata.
Here's my code:
class User(Base):
    ...
    _meta = None

    def meta(self):
        if self._meta is None:
            self._meta = {}
            try:
                for item in DBSession.query(MetadataRef.key, Metadata.value).\
                        outerjoin(MetadataRef.meta).\
                        filter(
                            Metadata.model_id == self.id,
                            MetadataRef.model == 'User'
                        ):
                    self._meta[item.key] = item.value
            except DBAPIError:
                ##TODO: actually do something with this
                self._meta = {}
        return self._meta
The query SQLAlchemy is generating does return what I need (close enough anyway -- it needs to query model_id as part of the ON clause rather than the WHERE, but that's minor and I'm pretty sure I can figure that out myself):
SELECT metadata_refs.`key` AS metadata_refs_key, metadata.value AS metadata_value
FROM metadata_refs LEFT OUTER JOIN metadata ON metadata_refs.id = metadata.metadata_ref_id
WHERE metadata.model_id = %s AND metadata_refs.model = %s
However, when I access the objects I get this error:
AttributeError: 'KeyedTuple' object has no attribute 'metadata_value'
This leads me to think there's some other way I need to access it, but I can't figure out how. I've tried both .value and .metadata_value. .key does work as expected.
Any ideas?
You're querying separate attributes ("ORM-enabled descriptors" in SA docs):
DBSession.query(MetadataRef.key, Metadata.value)
In this case the query does not return full ORM-mapped objects but KeyedTuples; a KeyedTuple is a cross between a tuple and an object with attributes corresponding to the "labels" of the fields.
So, one way to access the data is by its index:
ref_key = item[0]
metadata_value = item[1]
Alternatively, to make SA use a specific name for a column, you can use the Column.label() method:
for item in DBSession.query(MetadataRef.key.label('ref_key'), Metadata.value.label('meta_value'))...
    self._meta[item.ref_key] = item.meta_value
For debugging, you can use the Query.column_descriptions attribute, which will tell you the names of the columns returned by the query.
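To see the two access styles side by side without a database, here is a small sketch that uses collections.namedtuple as a stand-in for KeyedTuple. That substitution is an assumption made for illustration: both support access by index and by label, but they are different classes. The sample rows are invented.

```python
from collections import namedtuple

# Stand-in for the rows a query with labelled columns would return;
# the field names mirror the labels 'ref_key' and 'meta_value' used above.
Row = namedtuple("Row", ["ref_key", "meta_value"])
rows = [Row("nickname", "alex"), Row("timezone", "UTC")]

# Access by position, as with item[0] and item[1].
by_index = {row[0]: row[1] for row in rows}

# Access by label, as with item.ref_key and item.meta_value.
by_label = {row.ref_key: row.meta_value for row in rows}

print(by_index == by_label)
```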
I am trying to update an already existing document by ID. My intention is to find the doc by its id, then change its "firstName" with new value coming in "json", then update it into the CouchDB database.
Here is my code:
def updateDoc(self, id, json):
    doc = self.db.get(id)
    doc["firstName"] = json["firstName"]
    doc_id, doc_rev = self.db.save(doc)
    print doc_id, doc_rev
    print "Saved"

# "json" is retrieved from the PUT request (request.json)
At self.db.save(doc) I'm getting the exception "too many values to unpack".
I am using the Bottle framework, Python 2.7 and Couch Query.
How do I update the document by id? What is the right way to do it?
In couchdb-python, the db.save(doc) method returns a tuple of _id and _rev. You're using couch-query, a slightly different project that also has a db.save(doc) method, but it returns a different result. So your code should look like this:
def updateDoc(self, id, json):
    doc = self.db.get(id)
    doc["firstName"] = json["firstName"]
    doc = self.db.save(doc)
    print doc['_id'], doc['_rev']
    print "Saved"
I am trying to use endpoints to update some JSON values in my datastore. I have the following Datastore in GAE...
class UsersList(ndb.Model):
    UserID = ndb.StringProperty(required=True)
    ArticlesRead = ndb.JsonProperty()
    ArticlesPush = ndb.JsonProperty()
In general what I am trying to do with the API is have the method take in a UserID and a list of articles read (with an article being represented by a dictionary holding an ID and a boolean field saying whether or not the user liked the article). My messages (centered on this logic) are the following...
class UserID(messages.Message):
    id = messages.StringField(1, required=True)

class Articles(messages.Message):
    id = messages.StringField(1, required=True)
    userLiked = messages.BooleanField(2, required=True)

class UserIDAndArticles(messages.Message):
    id = messages.StringField(1, required=True)
    items = messages.MessageField(Articles, 2, repeated=True)

class ArticleList(messages.Message):
    items = messages.MessageField(Articles, 1, repeated=True)
And my API/Endpoint method that is trying to do this update is the following...
@endpoints.method(UserIDAndArticles, ArticleList,
                  name='user.update',
                  path='update',
                  http_method='GET')
def get_update(self, request):
    userID = request.id
    articleList = request.items
    queryResult = UsersList.query(UsersList.UserID == userID)
    currentList = []

    #This query always returns only one result back, and this for loop is the only way
    # I could figure out how to access the query results.
    for thing in queryResult:
        currentList = json.loads(thing.ArticlesRead)

    for item in articleList:
        currentList.append(item)

    for blah in queryResult:
        blah.ArticlesRead = json.dumps(currentList)
        blah.put()

    for thisThing in queryResult:
        pushList = json.loads(thisThing.ArticlesPush)

    return ArticleList(items = pushList)
I am having two problems with this code. The first is that I can't seem to figure out (using the localhost Google APIs Explorer) how to send a list of articles to the endpoints method using my UserIDAndArticles class. Is it possible to have a messages.MessageField() as an input to an endpoint method?
The other problem is that I am getting an error on the 'blah.ArticlesRead = json.dumps(currentList)' line. When I try to run this method with some random inputs, I get the following error...
TypeError: <Articles
id: u'hi'
userLiked: False> is not JSON serializable
I know that I have to make my own JSON encoder to get around this, but I'm not sure what the format of the incoming request.items is like and how I should encode it.
I am new to GAE and endpoints (as well as this kind of server side programming in general), so please bear with me. And thanks so much in advance for the help.
A couple things:
http_method should definitely be POST, or better yet PATCH because you're not overwriting all existing values but only modifying a list, i.e. patching.
you don't need json.loads and json.dumps, NDB does it automatically for you.
you're mixing Endpoints messages and NDB model properties.
Here's the method body I came up with:
# get UsersList entity and raise an exception if none found.
uid = request.id
userlist = UsersList.query(UsersList.UserID == uid).get()
if userlist is None:
    raise endpoints.NotFoundException('List for user ID %s not found' % uid)

# update user's read articles list, which is actually a dict.
for item in request.items:
    userlist.ArticlesRead[item.id] = item.userLiked
userlist.put()

# assuming userlist.ArticlesPush is actually a list of article IDs.
pushItems = [Article(id=id) for id in userlist.ArticlesPush]
return ArticleList(items=pushItems)
Also, you should probably wrap this method in a transaction.
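The "is not JSON serializable" error goes away once only plain values (strings, booleans) are stored, never the message objects themselves. Here is a minimal sketch of that merge step in isolation. It assumes a namedtuple called ArticleItem as a stand-in for the Articles message class, and the helper name merge_read_articles is hypothetical. It runs without App Engine.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the Articles message class; only the two
# attributes read below (id and userLiked) matter for this sketch.
ArticleItem = namedtuple("ArticleItem", ["id", "userLiked"])


def merge_read_articles(articles_read, items):
    """Return a new dict of article ID -> liked flag with `items` merged in."""
    merged = dict(articles_read or {})  # tolerate a property that was never set
    for item in items:
        merged[item.id] = item.userLiked  # copy plain values, not the message object
    return merged


stored = {"a1": True}
incoming = [ArticleItem("hi", False), ArticleItem("a1", False)]
merged = merge_read_articles(stored, incoming)

# A dict of strings and booleans serializes without a custom encoder.
print(json.dumps(merged, sort_keys=True))
```

Keying the dict by article ID also means that sending the same article twice overwrites the earlier entry instead of appending a duplicate.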