SQL/SqlAlchemy: Querying all objects in a dependency tree - python

I have a table with a self-referential, asymmetric many-to-many relationship of dependencies between objects. I use that relationship to build a dependency tree between objects.
Given a set of object IDs, I would like to fetch all objects that are somewhere in that dependency tree.
Here's an example objects table:
+----+------+
| ID | Name |
+----+------+
|  1 | A    |
|  2 | B    |
|  3 | C    |
|  4 | D    |
|  5 | E    |
+----+------+
And a table of relationships:
+------------+-----------+
| Dependency | Dependant |
+------------+-----------+
|          2 |         1 |
|          3 |         2 |
|          4 |         1 |
+------------+-----------+
This shows that A (ID: 1) depends on both B (2) and D (4), and that B (2) depends on C (3).
Now, I would like to construct a single SQL query that, given {1} as a set with a single ID, will return the four objects in A's dependency tree: A, B, D and C.
Alternatively, using one query to fetch all needed object IDs and another to fetch their actual data is also acceptable.
This should work regardless of the number of levels in the dependency tree.
I'll be happy with either an SQLAlchemy example or plain SQL for a PostgreSQL 10 database (which I'll see how to implement with SQLAlchemy later on).
Thanks!
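One way to approach this, sketched under assumptions: a recursive common table expression (WITH RECURSIVE), which PostgreSQL supports, can walk the relationship table from the starting IDs down through every level. The table names objects and dependencies, their column names, and the helper dependency_tree are hypothetical, since the question does not name them. The sketch runs against an in-memory SQLite database only so that it is self-contained; the query is plain SQL, and against PostgreSQL the placeholders would need to match your driver (for example %s with psycopg2).

```python
import sqlite3

# Hypothetical schema mirroring the two example tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dependencies (dependency INTEGER, dependant INTEGER);
    INSERT INTO objects VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E');
    INSERT INTO dependencies VALUES (2, 1), (3, 2), (4, 1);
""")


def dependency_tree(conn, root_ids):
    """Return (id, name) for the given objects and everything they depend on."""
    root_ids = list(root_ids)
    if not root_ids:
        return []
    placeholders = ", ".join("?" for _ in root_ids)
    sql = f"""
        WITH RECURSIVE tree(id) AS (
            -- Seed: the objects we start from.
            SELECT id FROM objects WHERE id IN ({placeholders})
            UNION
            -- Step: add the dependencies of every object found so far.
            SELECT d.dependency
            FROM dependencies AS d
            JOIN tree AS t ON d.dependant = t.id
        )
        SELECT o.id, o.name
        FROM objects AS o
        JOIN tree ON tree.id = o.id
        ORDER BY o.id
    """
    return conn.execute(sql, root_ids).fetchall()


rows = dependency_tree(conn, [1])
print(rows)
```

UNION (rather than UNION ALL) discards rows that were already found, so the recursion also terminates if the data happens to contain a cycle.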

Related

How to create a table from another table with GridDB?

I have a GridDB container where I have stored my database. I want to copy the table while excluding a few columns. The function I need should extract all columns matching a given keyword and then create a new table from them. It must always include the first column *id because it is needed on every table.
For example, in the table given below:
'''
-- | employee_id | department_id | employee_first_name | employee_last_name  | employee_gender |
-- |-------------|---------------|---------------------|---------------------|-----------------|
-- | 1           | 1             | John                | Matthew             | M               |
-- | 2           | 1             | Alexandra           | Philips             | F               |
-- | 3           | 2             | Hen                 | Lotte               | M               |
'''
Suppose I need to get the first column and every other column starting with "employee". How can I do this through a Python function?
I am using GridDB Python client on my Ubuntu machine and I have already stored the database.csv file in the container. Thanks in advance for your help!
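The column-selection step can be sketched independently of the database. The following is a minimal sketch and not GridDB API code: select_column_indexes and project_rows are hypothetical helpers that work on a plain list of column names and a list of rows. Reading the rows from the container and writing the result to a new one with the GridDB Python client is left out, because it depends on your setup.

```python
def select_column_indexes(column_names, keyword):
    """Indexes of the first column plus every column starting with keyword."""
    return [
        i for i, name in enumerate(column_names)
        if i == 0 or name.startswith(keyword)
    ]


def project_rows(column_names, rows, keyword):
    """Return the reduced column list and the rows cut down to those columns."""
    keep = select_column_indexes(column_names, keyword)
    new_columns = [column_names[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_columns, new_rows


# Data taken from the example table in the question.
columns = ["employee_id", "department_id", "employee_first_name",
           "employee_last_name", "employee_gender"]
rows = [
    [1, 1, "John", "Matthew", "M"],
    [2, 1, "Alexandra", "Philips", "F"],
    [3, 2, "Hen", "Lotte", "M"],
]

new_columns, new_rows = project_rows(columns, rows, "employee")
print(new_columns)
print(new_rows)
```

The first column is kept by position rather than by name, so it is included even when it does not match the keyword.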

Link lists that share common elements

I have an issue similar to this one, with a few differences/complications.
I have a list of groups containing members. Rather than merging the groups that share members, I need to preserve the groupings and create a new set of edges based on which groups have members in common, and do so conditionally based on attributes of the groups.
The source data looks like this:
+----------+------------+-----------+
| Group ID | Group Type | Member ID |
+----------+------------+-----------+
| A        | Type 1     |         1 |
| A        | Type 1     |         2 |
| B        | Type 1     |         2 |
| B        | Type 1     |         3 |
| C        | Type 1     |         3 |
| C        | Type 1     |         4 |
| D        | Type 2     |         4 |
| D        | Type 2     |         5 |
+----------+------------+-----------+
Desired output is this:
+----------+-----------------+
| Group ID | Linked Group ID |
+----------+-----------------+
| A        | B               |
| B        | C               |
+----------+-----------------+
A is linked to B because they have member 2 in common.
B is linked to C because they have member 3 in common.
C is not linked to D: they have a member in common but are of different types.
The number of shared members doesn't matter for my purposes; a single member in common means they're linked.
The output is being used as the edges of a graph, so if the output is a graph that fits the rules, that's fine.
The source dataset is large (hundreds of millions of rows), so performance is a consideration.
This poses a similar question; however, I'm new to Python and can't figure out how to get the source data to a point where I can use the answer, or how to work in the additional requirement that the group types match.
Try something like this:
df1 = df.groupby(['Group Type', 'Member ID'])['Group ID'].apply(','.join).reset_index()
df2 = df1[df1['Group ID'].str.contains(',')]
This might not handle the case of cyclic grouping.
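To get from the grouped members to actual edges, one possibility is to collect the groups per (group type, member) key and emit every pair within each key. This is a minimal standard-library sketch, not the answerer's method; linked_groups is a hypothetical helper, and it keeps everything in memory, so for hundreds of millions of rows the same idea would more likely be run as a self-join in a database or a dataframe engine.

```python
from collections import defaultdict
from itertools import combinations


def linked_groups(rows):
    """rows: iterable of (group_id, group_type, member_id) tuples.

    Returns sorted (group, linked_group) pairs for groups of the same type
    that share at least one member.
    """
    groups_by_key = defaultdict(set)
    for group_id, group_type, member_id in rows:
        # Keying on the type as well keeps groups of different types apart.
        groups_by_key[(group_type, member_id)].add(group_id)

    edges = set()
    for groups in groups_by_key.values():
        # Every pair of groups under the same key shares that member.
        edges.update(combinations(sorted(groups), 2))
    return sorted(edges)


# Data taken from the source table in the question.
source = [
    ("A", "Type 1", 1), ("A", "Type 1", 2),
    ("B", "Type 1", 2), ("B", "Type 1", 3),
    ("C", "Type 1", 3), ("C", "Type 1", 4),
    ("D", "Type 2", 4), ("D", "Type 2", 5),
]

edges = linked_groups(source)
print(edges)
```

Storing the pairs in a set means two groups that share several members still produce a single edge.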

Query Enum column in sqlalchemy leads to LookupError

I thought I was following the docs pretty closely setting up an ENUM field in a Postgres DB with sqlalchemy, but I'm clearly doing something (hopefully something simple) wrong.
My table has a type contact_type:
                                            List of data types
 Schema |     Name      | Internal name | Size |   Elements    |  Owner   | Access privileges | Description
--------+---------------+---------------+------+---------------+----------+-------------------+-------------
 public | contact_types | contact_types | 4    | unknown      +| postgres |                   |
        |               |               |      | incoming_text+|          |                   |
        |               |               |      | incoming_call+|          |                   |
        |               |               |      | outgoing_call |          |                   |
and in the table:
                                  Table "public.calls"
    Column    |           Type           |                     Modifiers
--------------+--------------------------+----------------------------------------------------
 contact_type | contact_types            |
In python I created a subclass of enum per the docs:
import enum

class contact_types(enum.Enum):
    unknown: 1
    incoming_text: 2
    incoming_call: 3
    outgoing_call: 4
and passed it to the model:
class Call(db.Model):
    contact_type = db.Column(db.Enum(contact_types))
It all looked good. Inserts work and I can see the values when looking at the table, but SQLAlchemy's validation seems to be unhappy when querying. This leads to an error:
calls = Call.query.order_by(Call.time.desc()).limit(pagesize).offset(offset)
for c in calls:
    print(c)
LookupError: "unknown" is not among the defined enum values
'unknown' is in the Enum. Am I missing a step somewhere to connect the query to the enum class?
There should be = in the enum definition, not :
class contact_types(enum.Enum):
    unknown = 1
    incoming_text = 2
    incoming_call = 3
    outgoing_call = 4
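Why the colon matters can be checked with the standard library alone. In a class body, name: value is only a type annotation and assigns nothing, so the enum ends up with no members at all, which is presumably why the lookup of "unknown" fails. The class names below are made up for this demonstration.

```python
import enum


class AnnotatedTypes(enum.Enum):
    # Annotations only: nothing is assigned, so no members are created.
    unknown: 1
    incoming_text: 2


class AssignedTypes(enum.Enum):
    # Assignments: each name becomes an enum member.
    unknown = 1
    incoming_text = 2


annotated_names = [member.name for member in AnnotatedTypes]
assigned_names = [member.name for member in AssignedTypes]
print(annotated_names)
print(assigned_names)
```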

Proper way to store ordered set of strings in database

First of all, I have an XML file that I need to save in a MySQL database. It has child elements that can occur from one to unbounded times. Are there any constraints I can use in the SQLAlchemy ORM, or do I have to save the order from the application?
The table should look like:
+------+-----------+-------+-----------+
| id   | name      | part  | parent_id |
+------+-----------+-------+-----------+
|    1 | foo       |     1 |       123 |
+------+-----------+-------+-----------+
|    2 | bar       |     2 |       123 |
+------+-----------+-------+-----------+
|    3 | baz       |     1 |       345 |
+------+-----------+-------+-----------+
In other words, what is the proper way to add explicit ordering to a many-to-many relationship?
Any ordering needs to be done in code. Once rows are inserted into a table and selected from it, the order is not guaranteed. So on retrieval you will also have to apply an order; adding ORDER BY in SQL is the handiest way to do that.
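The approach described above can be sketched as follows: the application writes an explicit position into the part column on insert, and every read applies ORDER BY on it. This is only an illustration that uses an in-memory SQLite database so that it runs on its own; the table name child and the helpers save_children and load_children are hypothetical. The UNIQUE constraint is an optional extra that prevents two rows from claiming the same position under one parent. On the ORM side, SQLAlchemy ships an ordering_list helper in sqlalchemy.ext.orderinglist that may be worth a look for maintaining such a position column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        part INTEGER NOT NULL,       -- position within the parent, set by the application
        parent_id INTEGER NOT NULL,
        UNIQUE (parent_id, part)     -- no two children share a position
    )
""")


def save_children(conn, parent_id, names):
    """Insert names for one parent, recording their order in part (1-based)."""
    conn.executemany(
        "INSERT INTO child (name, part, parent_id) VALUES (?, ?, ?)",
        [(name, position, parent_id)
         for position, name in enumerate(names, start=1)],
    )


def load_children(conn, parent_id):
    """Read the names back in their stored order."""
    cursor = conn.execute(
        "SELECT name FROM child WHERE parent_id = ? ORDER BY part",
        (parent_id,),
    )
    return [row[0] for row in cursor]


save_children(conn, 123, ["foo", "bar"])
save_children(conn, 345, ["baz"])
print(load_children(conn, 123))
```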

flask-sqlalchemy count function

Consider a table named result with the following schema
+----+-----+---------+
| id | tag | user_id |
+----+-----+---------+
|  0 | A   |       0 |
|  1 | A   |       0 |
|  2 | B   |       0 |
|  3 | B   |       0 |
+----+-----+---------+
For the user with id=0, I would like to count the number of times a result with tag=A has appeared. For now I have implemented it using a raw SQL statement:
db.session.execute('select tag, count(tag) from result where user_id = :id group by tag', {'id':user.id})
How can I write it using flask-sqlalchemy APIs?
Most of the results I find mention the SQLAlchemy function db.func.count(), which is not available in flask-sqlalchemy or has a different path that I am not aware of.
I was using PyCharm as my IDE and it was not showing module members correctly, hence I thought count was missing. Here is my solution for the above:
user.results.add_columns(Result.tag, db.func.count(Result.tag)).group_by(Result.tag).all()
