Neomodel Most Efficient Way to Query Relationship Data - python

Suppose I have the following models from the neomodel documentation.
class FriendRel(StructuredRel):
    since = DateTimeProperty(
        default=lambda: datetime.now(pytz.utc)
    )
    met = StringProperty()

class Person(StructuredNode):
    name = StringProperty()
    friends = RelationshipTo('Person', 'FRIEND', model=FriendRel)
And I create the following data.
bob = Person(name='bob').save()
frank = Person(name='frank').save()
rel = bob.friends.connect(frank, {'since': datetime.now(pytz.utc), 'met': 'Germany'})
Now my question is how I should go about retrieving both the friends of an object and the corresponding FriendRel objects for those friendships.
The Neomodel docs seem to say to do the following.
>>> bob = Person.nodes.get(name='bob')
>>> frank = bob.friends[0] # get bob's friend frank using database query?
>>> rel = bob.friends.relationship(frank) # query database again?
>>> rel.met
'Germany'
When doing this, it really feels like there would be a better way of retrieving relationship objects without another database query. I would expect these relationship objects to already be cached when you retrieve a node's friends?
So in a loop, would this be the best way to retrieve all of a Person's friends and the FriendRel objects for those friendships?
# source: https://stackoverflow.com/questions/67821341/retrieve-the-relationship-object-in-neomodel
for friend in bob.friends:
    rel = bob.friends.relationship(friend)
This seems quite inefficient: doesn't it require another database query for each relationship? Or am I misunderstanding something?
With cypher, I would just do the following:
MATCH(i:Person{name: 'bob'})-[j:FRIEND]->(k) RETURN i,j,k
So my question: is there a way, using neomodel, to retrieve a node's relationships and the objects for those relationships both at the same time?

I've checked the neomodel source code and there doesn't seem to be a way to achieve what I want in a more efficient way than what I found in this stackoverflow answer.
But I now know how to do this using cypher queries like so:
from neomodel import db
from models import Person, FriendRel
bob = Person.nodes.get(name='bob')
# Only one database query. Yay!
results, cols = db.cypher_query(f"""MATCH (node)-[rel]-(neighbor)
WHERE id(node)={bob.id}
RETURN node, rel, neighbor""")
rels = {} # friendships mapped to neighbor node ids
neighbors = []
for row in results:
    neighbor = Person.inflate(row[cols.index('neighbor')])
    neighbors.append(neighbor)
    rel = FriendRel.inflate(row[cols.index('rel')])
    rels[neighbor.id] = rel
Then, now that you've stored all neighbors and the relationships between them, you can loop through them like so:
for neighbor_id, rel in rels.items():
    print(f"bob has a friendship with the node whose id is {neighbor_id}.")
    print(f"They've been friends since {rel.since}")
Or like so:
for neighbor in neighbors:
    rel = rels[neighbor.id]
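As a possible variation on the approach above, the node id can be passed as a query parameter instead of being formatted into the Cypher string, and the rows can be collected as (neighbor, relationship) pairs. This is only a sketch: FRIEND_QUERY and pair_rows are hypothetical names, the labels and relationship type are taken from the models in the question, and the rows below are stand-ins so the snippet runs without a Neo4j instance.

```python
# Sketch only: FRIEND_QUERY and pair_rows are hypothetical helpers, not part of neomodel.
FRIEND_QUERY = (
    "MATCH (node:Person)-[rel:FRIEND]->(neighbor:Person) "
    "WHERE id(node) = $node_id "
    "RETURN neighbor, rel"
)


def pair_rows(results, cols, inflate_node, inflate_rel):
    """Turn raw result rows into a list of (neighbor, relationship) pairs."""
    neighbor_idx = cols.index("neighbor")
    rel_idx = cols.index("rel")
    return [
        (inflate_node(row[neighbor_idx]), inflate_rel(row[rel_idx]))
        for row in results
    ]


# With a live database the calls would presumably look like this (assumed usage):
#   results, cols = db.cypher_query(FRIEND_QUERY, {"node_id": bob.id})
#   pairs = pair_rows(results, cols, Person.inflate, FriendRel.inflate)

# Stand-in rows and inflaters so the sketch runs on its own.
results = [[{"name": "frank"}, {"met": "Germany"}]]
cols = ["neighbor", "rel"]
pairs = pair_rows(results, cols, dict, dict)

for neighbor, rel in pairs:
    print(f"bob met {neighbor['name']} in {rel['met']}")
```

Keeping the pairs in a list avoids the dictionary lookup by neighbor id and preserves every row the query returns.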
Thanks to everyone's helpful advice!

Related

Retrieve the Relationship object in Neomodel

I am using Neomodel and Python for my project. I have a number of nodes defined and am storing relevant information on the relationships between them. However I can't seem to find a mechanism for retrieving the relationship object itself to be able to use the attributes - I can only filter by the relationship attribute to return the Nodes.
class MyRelationship(StructuredRel):
    source = StringProperty()

class Person(StructuredNode):
    uid = UniqueIdProperty()
    first_name = StringProperty()
    last_name = StringProperty()
    people = RelationshipTo('Person', "PERSON_RELATIONSHIP", model=MyRelationship)
I have a number of relationships of the same type [PERSON_RELATIONSHIP] between the same two nodes, but they differ by attribute. I want to be able to iterate through them and print out the to node and the attribute.
Given an Object person of type Person
for p in person.people:
gives me the Person objects
person.people.relationship(p).source always gives me the value for the first relationship only
A Traversal also seems to give me the Person objects as well
The only way it seems to get a Relationship object is on .connect.
Any clues? Thanks.
I just stumbled over the same problem and managed to solve it as shown below, but I am not sure if it is the most performant solution.
If you already have a Person node object in variable person:
for p in person.people:
    r = person.people.relationship(p)
Or iterating over all Person nodes:
for person in Person.nodes.all():
    for p in person.people:
        r = person.people.relationship(p)
I've checked the neomodel source code and there doesn't seem to be a way to achieve what you want in a more efficient way than what Roman said.
But you could always use Cypher queries:
from neomodel import db
from models import Person, MyRelationship
john = Person.nodes.get(name='John')
results, cols = db.cypher_query(f"""MATCH (node)-[rel]-(neighbor)
WHERE id(node)={john.id}
RETURN node, rel, neighbor""")
rels = {}
neighbors = []
for row in results:
    neighbor = Person.inflate(row[cols.index('neighbor')])
    neighbors.append(neighbor)
    rel = MyRelationship.inflate(row[cols.index('rel')])
    rels[neighbor.id] = rel
Then, now that you've stored all neighbors and the relationships between them, you can loop through them like so:
for neighbor in neighbors:
    rel = rels[neighbor.id]
    print(f"john has a friendship with {neighbor} which has the source {rel.source}")
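One caveat for the original question: a dictionary keyed by neighbor id keeps only the last relationship per neighbor, so parallel relationships between the same two nodes would be overwritten. A minimal sketch of one way around that is to group the relationships in lists. The helper name group_rels_by_neighbor and the stand-in rows are assumptions for illustration; in real code the rows would come from the inflated query results.

```python
from collections import defaultdict


def group_rels_by_neighbor(rows):
    """Group relationships by neighbor id, keeping every parallel relationship."""
    grouped = defaultdict(list)
    for neighbor_id, rel in rows:
        grouped[neighbor_id].append(rel)
    return dict(grouped)


# Stand-in data: (neighbor id, relationship properties) pairs, as they might
# look after inflating each result row.
rows = [
    (7, {"source": "work"}),
    (7, {"source": "school"}),
    (9, {"source": "family"}),
]
grouped = group_rels_by_neighbor(rows)

for neighbor_id, rels_for_neighbor in grouped.items():
    sources = [rel["source"] for rel in rels_for_neighbor]
    print(neighbor_id, sources)
```

With this shape, both relationships to neighbor 7 survive, which is what the question asks for when several PERSON_RELATIONSHIP edges differ only by attribute.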
Hope this helps!
Ethan

google-app-engine: How to apply filter in query when filter parameter is db.ReferenceProperty?

This is next step to this question, where I need to run queries and apply filters.
Here is my Model
from google.appengine.ext import db
class Person(db.Model):
    name = db.StringProperty()
    age = db.IntegerProperty()

class Car(db.Model):
    name = db.StringProperty()
    model = db.StringProperty()
    mileage = db.IntegerProperty()
    person = db.ReferenceProperty(Person, collection_name='person')
Question: For a Person, I want to get all the cars he owns.
Approach: Get all the cars and apply a filter on the person whose name is "Random".
I tried the following, but it doesn't work:
s = Car.all()
s.filter('person.name =', 'Random') # It fails here
result = s.fetch(1)[0] # just first result for now
print result.text
print result.votes
print result.page.language
How can I make this query run?
Thank you
You can only filter on indexed properties of the kind you are querying, and person.name belongs to another entity. In SQL you would use a join (which becomes impractical when the data grows big); in Google's Bigtable-based datastore, as in many other non-relational databases, joins are not possible. Luckily your case is very simple: you can select all cars if you know the person's key:
>>> person = Person.all().filter('name =', 'Mr. Random').fetch(1)[0]
>>> cars = Car.all().filter('person =', person.key())
If you had used a more reasonable value for collection_name
class Car(db.Model):
    ...
    person = db.ReferenceProperty(Person, collection_name='cars_collection')
you could access all cars like this:
>>> person = Person.all().filter('name =', 'Mr. Random').fetch(1)[0]
>>> mrs_randoms_cars = person.cars_collection

Help in constructing a GQL query for a one to many relationship

A] Problem Summary:
I have one to many data models in my project. I need help constructing a query to get the “many” side data object based on the key of the “one” side data object.
Please refer to "EDIT#1" for the code that has worked, but still inefficient.
B] Problem details:
1] I have “UserReportedCountry” (one) to “UserReportedCity” (many) , “UserReportedCity” (one) to “UserReportedStatus” (many) data model relationships.
2] The key for UserReportedCountry is a country_name , example -- “unitedstates”.
The key for UserReportedCity is country_name:city_name, example “unitedstates:boston”.
UserReportedStatus has no special key name.
3] I have the user's country and city name in my Python code and I want to retrieve all “UserReportedStatus” objects based on the city key name, which is “country_name:city_name”.
C] Code excerpts:
1] database models:
class UserReportedCountry(db.Model):
    country_name = db.StringProperty(required=True,
                                     choices=['Afghanistan', 'Åland Islands']
                                     )

class UserReportedCity(db.Model):
    country = db.ReferenceProperty(UserReportedCountry, collection_name='cities')
    city_name = db.StringProperty(required=True)

class UserReportedStatus(db.Model):
    city = db.ReferenceProperty(UserReportedCity, collection_name='statuses')
    status = db.BooleanProperty(required=True)
    date_time = db.DateTimeProperty(auto_now_add=True)
2] query I have tried so far:
def get_data_for_users_country_and_city(self, users_country, users_city):
    key_name_for_user_reported_city = users_country + ":" + users_city
    return UserReportedStatus.all().filter('city.__key__=', key_name_for_user_reported_city).fetch(limit=10)
D] Technologies being used
1] Python
2] Google App engine
3] Django
4] Django models.
[EDIT#1]
I have tried the following mechanism to query the status objects based on the given city and country. This has worked, but I believe it is an inefficient way to perform the task.
def get_data_for_users_country_and_city(self, users_country, users_city):
    key_name_for_user_reported_city = users_country + ":" + users_city
    city_object = UserReportedCity.get_by_key_name(key_name_for_user_reported_city)
    return UserReportedStatus.gql('WHERE city=:1', city_object)
Your last query looks correct and I don't see any inefficiency in it; in fact, fetching with get_by_key_name is fast and efficient.
I think the room for improvement is rather in your normalized, RDBMS-oriented model design: since GAE does not support JOINs, after the get_data_for_users_country_and_city call you will end up hitting the datastore repeatedly to fetch the city_name and country_name properties when the references are dereferenced.
What could you do? Denormalization and prefetching:
Add a country_name property to the UserReportedCity model definition.
Prefetch the ReferenceProperty to retrieve the UserReportedCity object for each UserReportedStatus object.
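The prefetching idea can be sketched without App Engine: collect the distinct referenced keys, resolve them in one batched lookup, and attach the results. Everything below is a stand-in (the dict-based "entities", the prefetch_cities helper, and fake_batch_get are hypothetical); on the datastore the batched lookup would typically be a single db.get() call with a list of keys.

```python
def prefetch_cities(statuses, batch_get):
    """Resolve every status's city with one batched lookup instead of one per status."""
    city_keys = {status["city_key"] for status in statuses}
    cities = batch_get(city_keys)  # a single round trip for all distinct keys
    for status in statuses:
        status["city"] = cities[status["city_key"]]
    return statuses


# Stand-in datastore: key name -> denormalized city record (note country_name is
# stored on the city itself, as suggested above).
CITIES = {
    "unitedstates:boston": {"city_name": "boston", "country_name": "unitedstates"},
}
calls = []


def fake_batch_get(keys):
    """Pretend batch fetch that records how often it is called."""
    calls.append(sorted(keys))
    return {key: CITIES[key] for key in keys}


statuses = [
    {"status": True, "city_key": "unitedstates:boston"},
    {"status": False, "city_key": "unitedstates:boston"},
]
prefetch_cities(statuses, fake_batch_get)
print(len(calls), statuses[0]["city"]["city_name"])
```

The point of the sketch is the call count: two statuses are resolved with one lookup, whereas dereferencing each reference individually would cost one fetch per status.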

SQLAlchemy/Elixir - querying to check entity's membership in a many-to-many relationship list

I am trying to construct a SQLAlchemy query to get the list of names of all professors who are assistant professors at MIT. Note that there can be multiple assistant professors associated with a certain course.
What I'm trying to do is roughly equivalent to:
uni_mit = University.get_by(name='MIT')
s = select([Professor.name],
           and_(Professor.in_(Course.assistants),
                Course.university == uni_mit))
session.execute(s)
This won't work, because in_ is only defined for an entity's fields, not for the whole entity. I can't use Professor.id.in_ either, as Course.assistants is a list of Professors, not a list of their ids. I also tried contains, but it didn't work either.
My Elixir model is:
class Course(Entity):
    id = Field(Integer, primary_key=True)
    assistants = ManyToMany('Professor', inverse='courses_assisted', ondelete='cascade')
    university = ManyToOne('University')
    ..

class Professor(Entity):
    id = Field(Integer, primary_key=True)
    name = Field(String(50), required=True)
    courses_assisted = ManyToMany('Course', inverse='assistants', ondelete='cascade')
    ..
This would be trivial if I could access the intermediate many-to-many entity (the condition would be and_(interm_table.prof_id == Professor.id, interm_table.course == Course.id)), but SQLAlchemy apparently hides this table from me.
I'm using Elixir 0.7 and SQLAlchemy 0.6.
Btw: This question is different from Sqlalchemy+elixir: How query with a ManyToMany relationship? in that I need to check the professors against all courses which satisfy a condition, not a single, static one.
You can find the intermediate table where Elixir has hidden it away, but note that it uses fully qualified column names (such as __package_path_with_underscores__course_id). To avoid this, define your ManyToMany using e.g.
class Course(Entity):
    ...
    assistants = ManyToMany('Professor', inverse='courses_assisted',
                            local_colname='course_id', remote_colname='prof_id',
                            ondelete='cascade')
and then you can access the intermediate table using
rel = Course._descriptor.find_relationship('assistants')
assert rel
table = rel.table
and can access the columns using table.c.prof_id, etc.
Update: Of course you can do this at a higher level, but not in a single query, because SQLAlchemy doesn't yet support in_ for relationships. For example, with two queries:
>>> mit_courses = set(Course.query.join(
... University).filter(University.name == 'MIT'))
>>> [p.name for p in Professor.query if set(
... p.courses_assisted).intersection(mit_courses)]
Or, alternatively:
>>> plist = [c.assistants for c in Course.query.join(
... University).filter(University.name == 'MIT')]
>>> [p.name for p in set(itertools.chain(*plist))]
The first step creates a list of lists of assistants. The second step flattens the list of lists and removes duplicates through making a set.
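The flatten-and-deduplicate step can be tried on its own with plain lists. The names below are made-up stand-ins for the per-course assistant lists; itertools.chain.from_iterable is used here as an equivalent of chain(*plist) that does not unpack the outer list.

```python
import itertools

# Stand-in for [c.assistants for c in <MIT courses>]: one list of names per course.
plist = [["Ada", "Grace"], ["Grace", "Alan"], []]

# chain.from_iterable flattens the list of lists lazily; set() then drops
# professors who assist on more than one course.
unique_assistants = set(itertools.chain.from_iterable(plist))

print(sorted(unique_assistants))  # ['Ada', 'Alan', 'Grace']
```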

Query on ReferenceProperty in Google App Engine

I have two entity kinds, one of which refers to the other, like this:
class Entity1(db.Expando):
    prop1 = db.StringProperty()
    prop2 = db.StringProperty()

class Entity2(db.Expando):
    prop3 = db.ReferenceProperty(Entity1)
    prop4 = db.StringProperty()
Can I write a query like:
q=Entity2.all().filter("prop3.prop1 =",somevalue)
Here prop3 has a reference and will be referring to some entity of kind Entity1 and I want to know all those entities of kind Entity2 which refer to those entities of Entity1 that have prop1 as somevalue.
In my example, I first fetch the entity that the ReferenceProperty points to, so that it can be used as the filter value for the query. You can then filter on that key, with one query for each key you found.
An example:
The models:
class Waiter(db.Model):
    firstname = db.StringProperty(required=True)
    lastname = db.StringProperty(required=True)
    uuid = db.StringProperty(required=True)

class Order(db.Model):
    waiter = db.ReferenceProperty(Waiter, required=True)
    date = db.DateTimeProperty(required=True)
    delivered = db.BooleanProperty(default=False)
The web request function:
def get(self):
    waiteruuid = self.request.get("waiter")
    q = Waiter.all()
    q.filter('uuid =', waiteruuid)
    waiterkey = q.get()
    result = {}
    result['orders'] = []
    q = Order.all()
    if waiteruuid:
        q.filter('waiter =', waiterkey)
    orders = q.run()
Google Datastore doesn't support joins. You can fetch all the entities of Entity2 and filter them in code to achieve what you described, somewhat like what @Mani suggested. You can do it like this:
entities2 = Entity2.all()
for entity2 in entities2:
    # Accessing a ReferenceProperty dereferences it, which costs a datastore fetch
    entity1 = entity2.prop3
    if entity1.prop1 == somevalue:
        pass  # Do your processing here
Define Entity 2 as:
class Entity2(db.Expando):
    entity_1_ref = db.ReferenceProperty(Entity1, collection_name="set_of_entity_2_elements")
    prop4 = db.StringProperty()
This defines a collection name which can be operated from the other side of reference. (Entity1 in this case). I have taken the liberty to rename prop3 as something more appropriate.
Now you can do q = entity_1_object.set_of_entity_2_elements (a new property for all your Entity1 objects) which will give you the results you want.
For more information, read this article in depth: http://code.google.com/appengine/articles/modeling.html
Update: Sorry, I got it wrong. The above suggestion doesn't return only those elements with entity_1_object.prop1 == somevalue.
You can still get it in a roundabout way as follows:
for obj in q:
    if obj.prop1 == somevalue:
        pass  # Do your processing here
or
for obj in q:
    if obj.prop1 != somevalue:
        pass  # Delete this object from the list 'q'
But obviously this is not the best way. Let's wait for a better answer!
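An alternative to scanning every Entity2 is to invert the lookup, much as the waiter example does: first find the Entity1 keys whose prop1 matches, then select the Entity2 entities that reference one of those keys. The sketch below only models that logic in plain Python; the dicts and the entity2_matching helper are hypothetical stand-ins, and on the datastore each step would be a query (for instance one filter on prop3 per matching key).

```python
def entity2_matching(entity1_by_key, entity2_rows, somevalue):
    """Two-step lookup: find matching Entity1 keys, then the Entity2 rows referencing them."""
    matching_keys = {
        key for key, entity1 in entity1_by_key.items()
        if entity1["prop1"] == somevalue
    }
    return [row for row in entity2_rows if row["prop3"] in matching_keys]


# Stand-in data: dicts play the role of entities, strings the role of keys.
entity1_by_key = {
    "e1-a": {"prop1": "red"},
    "e1-b": {"prop1": "blue"},
}
entity2_rows = [
    {"prop3": "e1-a", "prop4": "first"},
    {"prop3": "e1-b", "prop4": "second"},
    {"prop3": "e1-a", "prop4": "third"},
]
matches = entity2_matching(entity1_by_key, entity2_rows, "red")
print([row["prop4"] for row in matches])  # ['first', 'third']
```

The number of lookups then grows with the number of matching Entity1 entities rather than with the total number of Entity2 entities.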
