How to recursively traverse a Snowflake role grant hierarchy in Python?


Has anyone written Python code to traverse a Snowflake Role hierarchy?
A role can be granted a role, which can in turn be granted another role, so nested roles are possible.
In Python, I want to start with a specific role,
query all grants to that role that are themselves roles,
store each of those in a list of unique roles,
for each of those roles execute a SQL statement that queries all roles granted to it,
and traverse down the hierarchy, not missing a single role and not revisiting a role that has already been queried or is already in the list to be queried.
Something recursive to crawl this tree using Python would be helpful.
At this point, given a role, I have no way to execute a SQL statement that will give me all the first level sub-roles granted to the given role.
There is no way to select on fields and get the grant that are just of type ROLE.
Any help would be appreciated.
Once complete, I will post all contributions and the final code to a GitHub account.
Thank you.
I am expecting to end up with a list of Snowflake roles with no duplicates in the list.

You said "At this point, given a role, I have no way to execute a SQL statement that will give me all the first level sub-roles granted to the given role." That is not true.
You can do it this way:
SHOW GRANTS TO ROLE THE_ROLE; -- replace THE_ROLE with your role name
SELECT distinct "name" FROM table(result_scan(last_query_id())) where "privilege" = 'USAGE' and "granted_on" = 'ROLE';
Well, they are actually two queries.
But it can be achieved with one too:
select name
from snowflake.account_usage.grants_to_roles
where grantee_name = 'THE_ROLE' -- replace THE_ROLE with your role name
and granted_on = 'ROLE'
and privilege = 'USAGE'
and deleted_on is null;
You can also get the whole hierarchy with a recursive query, like this:
with cte as (
select *
from snowflake.account_usage.grants_to_roles
where grantee_name = 'THE_ROLE' -- replace THE_ROLE with your role name
and granted_on = 'ROLE'
and privilege = 'USAGE'
and deleted_on is null
UNION ALL
select gtr.*
from snowflake.account_usage.grants_to_roles gtr
join cte on gtr.grantee_name = cte.name
where gtr.granted_on = 'ROLE'
and gtr.privilege = 'USAGE'
and gtr.deleted_on is null
)
select * from cte;
Querying the SNOWFLAKE.ACCOUNT_USAGE views is annoyingly slow, so I recommend creating a copy in your database and using it for playing and testing.
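For the Python side of the question, the traversal could be sketched roughly as below. It is an iterative breadth-first walk rather than literal recursion, which gives the same result without hitting Python's recursion limit. The names `collect_roles`, `fetch_child_roles`, and `CHILD_ROLES_SQL` are hypothetical, the `%s` placeholder assumes a DB-API cursor using the pyformat parameter style, and the demo runs against an in-memory dictionary instead of a real Snowflake connection.

```python
from collections import deque

# Assumed query, adapted from the single-statement ACCOUNT_USAGE example above.
CHILD_ROLES_SQL = """
select name
from snowflake.account_usage.grants_to_roles
where grantee_name = %s
  and granted_on = 'ROLE'
  and privilege = 'USAGE'
  and deleted_on is null
"""


def fetch_child_roles(cursor, role):
    """Return the roles granted directly to `role` (one query per role)."""
    cursor.execute(CHILD_ROLES_SQL, (role,))
    return [row[0] for row in cursor.fetchall()]


def collect_roles(start_role, get_children):
    """Breadth-first walk: every role is queried once and listed once."""
    seen = {start_role}      # roles already queued or queried
    ordered = []             # de-duplicated result, in discovery order
    queue = deque([start_role])
    while queue:
        role = queue.popleft()
        for child in get_children(role):
            if child not in seen:
                seen.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


# Stand-in hierarchy (made-up role names) so the sketch runs without Snowflake.
demo_grants = {
    "ANALYST": ["READER", "REPORTING"],
    "REPORTING": ["READER", "BASE"],
    "READER": ["BASE"],
}
roles = collect_roles("ANALYST", lambda role: demo_grants.get(role, []))
print(roles)
```

With a real cursor `cur`, the same walk would be called as `collect_roles("THE_ROLE", lambda role: fetch_child_roles(cur, role))`.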

Related

Using top level item in nested subquery with SQLAlchemy

I have a query where I'm attempting to find a link between two tables, but I require a few checks against association tables within the same query. I think my problem stems from having to check across multiple levels of relationships, where I want to filter a subquery based on the top level item, but I've hit an issue and have no idea how to proceed.
More specifically I want to query Script using the name of an Application, but narrow the results down to when the Application's Language matches the Script's Language.
Tables: Script (id, language_id), Application (id, name), Language (id)
Association Tables: ApplicationLanguage (app_id, language_id), ScriptApplication (script_id, app_id)
Current attempt: (it's important this stays as a single query)
value = 'appname'
# Search applications for a value
app_search = select([Application.id]).where(Application.name==value).as_scalar()
# Search for applications matching the language of the script
lang_search = select([ApplicationLanguage.app_id]).where(
    ApplicationLanguage.language_id==Script.language_id
).as_scalar()
# Find the script based on which applications appear in both subqueries.
script_search = select([ScriptApplication.script_id]).where(and_(
    ScriptApplication.app_id.in_(app_search),
    ScriptApplication.app_id.in_(lang_search),
)).as_scalar()
# Turn it into an SQL expression
query = Script.id.in_(script_search)
Resulting SQL code:
SELECT script.id AS script_id
FROM script
WHERE script.id IN (SELECT script_application.script_id
FROM script_application
WHERE script_application.application_id IN (SELECT application.id
FROM application
WHERE application.name = ?) AND script_application.application_id IN (SELECT application_language.application_id
FROM application_language, script
WHERE script.language_id = application_language.language_id))
My theory
I believe the issue is on the line ApplicationLanguage.language_id==Script.language_id, because if I change it to ApplicationLanguage.language_id==3 (3 being the value I'm expecting), then it works perfectly. In the SQL code, I assume it's the FROM application_language, script which is overwriting the top level script.
How would I go about either rearranging or fixing this query? My current method seems to work fine if it's across a single relationship, just doesn't work if I try and do anything more complex.

I'd still love to know how I'd go about fixing the original query as I believe it'll come in useful in the future, but I managed to rearrange it.
I reversed the lang_search to grab languages for each application from app_search, and used that as part of the final query, instead of attempting to combine it in a subquery.
value = 'appname'
app_search = select([Application.id]).where(Application.name==value).as_scalar()
lang_search = select([ApplicationLanguage.language_id]).where(
    ApplicationLanguage.app_id.in_(app_search)
).as_scalar()
script_search = select([ScriptApplication.script_id]).where(and_(
    ScriptApplication.app_id.in_(app_search),
)).as_scalar()
query = and_(
    Script.id.in_(script_search),
    Script.language_id.in_(lang_search),
)
Final SQL query:
SELECT script.id AS script_id
FROM script
WHERE script.id IN (SELECT script_application.script_id
FROM script_application
WHERE script_application.application_id IN (SELECT application.id
FROM application
WHERE lower(application.name) = ?)) AND script.language_id IN (SELECT application_language.language_id
FROM application_language
WHERE application_language.application_id IN (SELECT application.id
FROM application
WHERE lower(application.name) = ?))
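The theory about the original query can be checked at the SQL level. The sketch below uses Python's built-in sqlite3 module with made-up rows (an assumption, not the asker's data) and runs the generated SQL twice: once with `script` repeated in the innermost FROM clause, and once without it so that `script.language_id` refers to the outer row. In SQLAlchemy, calling `.correlate(Script)` on `lang_search` is probably the way to get the second form, but that is untested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE script (id INTEGER PRIMARY KEY, language_id INTEGER);
CREATE TABLE application (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE application_language (application_id INTEGER, language_id INTEGER);
CREATE TABLE script_application (script_id INTEGER, application_id INTEGER);
-- Hypothetical data: 'appname' only supports language 1,
-- but both scripts are linked to it.
INSERT INTO script VALUES (10, 1), (11, 2);
INSERT INTO application VALUES (1, 'appname'), (2, 'other');
INSERT INTO application_language VALUES (1, 1), (2, 2);
INSERT INTO script_application VALUES (10, 1), (11, 1);
""")

QUERY = """
SELECT script.id FROM script
WHERE script.id IN (
    SELECT script_application.script_id FROM script_application
    WHERE script_application.application_id IN (
        SELECT application.id FROM application WHERE application.name = ?)
    AND script_application.application_id IN (
        SELECT application_language.application_id
        FROM {inner_from}
        WHERE script.language_id = application_language.language_id))
ORDER BY script.id
"""


def matching_scripts(inner_from):
    """Run the query with the given innermost FROM clause."""
    sql = QUERY.format(inner_from=inner_from)
    return [row[0] for row in conn.execute(sql, ("appname",))]


# A second `script` in the inner FROM shadows the outer row, so the language
# check compares against every script instead of the current one.
shadowed = matching_scripts("application_language, script")
# Without it, script.language_id is resolved from the outermost query.
correlated = matching_scripts("application_language")
print(shadowed, correlated)
```

With these rows the shadowed form also returns script 11, whose language the application does not support, while the correlated form returns only script 10.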

Check if entity already exists in a table

I want to check if an entity already exists in a table. I searched Google and found a result, but it didn't help me.
I want to return False if the entity already exists, but it always inserts the user.
def insert_admin(admin_name) -> Union[bool, None]:
    cursor.execute(f"SELECT name FROM admin WHERE name='{admin_name}'")
    print(cursor.fetchall()) # always return empty list []
    if cursor.fetchone():
        return False
    cursor.execute(f"INSERT INTO admin VALUES('{admin_name}')") # insert the name
def current_admins() -> list:
    print(cursor.execute('SELECT * FROM admin').fetchall()) # [('myname',)]
When I run the program again, I can still see that print(cursor.fetchall()) returns an empty list. Why is this happening if I already inserted one name into the table, and how can I check if the name already exists?

If you want to avoid duplicate names in the table, then let the database do the work -- define a unique constraint or index:
ALTER TABLE admin ADD CONSTRAINT unq_admin_name UNIQUE (name);
You can attempt to insert the same name multiple times. But it will only work once, returning an error on subsequent attempts.
Note that this is also much, much better than attempting to do this at the application level. In particular, the different threads could still insert the same name at (roughly) the same time -- because they run the first query, see the name is not there and then insert the same row.
When the database validates the data integrity, you don't have to worry about such race conditions.
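A minimal sketch of this approach, assuming the asker is using Python's sqlite3 module (the `cursor.execute(...).fetchall()` chaining suggests so). SQLite does not support ALTER TABLE ... ADD CONSTRAINT, so the constraint is declared when the table is created; the table layout here is an assumption. The sketch also commits the insert and uses a bound parameter instead of an f-string.

```python
import sqlite3


def insert_admin(conn, admin_name):
    """Insert admin_name; return False if the name already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO admin (name) VALUES (?)", (admin_name,))
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate name.
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admin (name TEXT NOT NULL UNIQUE)")

first = insert_admin(conn, "myname")
second = insert_admin(conn, "myname")
admins = [row[0] for row in conn.execute("SELECT name FROM admin")]
print(first, second, admins)
```

Without a commit, a file-backed database would not keep the row between runs, which would explain the empty list seen after restarting the program.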

How to design and set up user and role tables to retrieve values from both?

In Flask there is a function that loads the user from a session cookie or sets it before requests, but what is the proper way to determine what role/permissions this user has using SQL?
The tables:
CREATE TABLE app_user (
id SERIAL PRIMARY KEY,
username VARCHAR (50) UNIQUE NOT NULL,
password VARCHAR (500) NOT NULL
);
CREATE TABLE role (
role_id SERIAL PRIMARY KEY,
role_description VARCHAR (25)
);
INSERT INTO role (role_description) VALUES
('New'),
('Active'),
('Moderator');
CREATE TABLE user_role (
app_user_id INTEGER NOT NULL,
app_user_role_id INTEGER NOT NULL,
FOREIGN KEY (app_user_id) REFERENCES app_user (id),
FOREIGN KEY (app_user_role_id) REFERENCES role (role_id)
);
Flask, loading user:
@bp.before_app_request
def load_logged_in_user():
    user_id = session.get('user_id')
    if user_id is None:
        g.user = None
    else:
        db = get_db().cursor()
        load_user = db.execute(
            'SELECT * FROM app_user WHERE id = %s', (user_id,)
            # SELECT u.id, username, app_user_role_id
            # WHERE u.id = %s', (user_id,)
            # FROM app_user u JOIN user_role r ON u.id = app_user_id
            # Don't know how to do this, or how it is usually done
            # As it stands now it doesn't make any sense, as I have
            # been fiddling with it for too long.
        )
        load_user = db.fetchone()
        g.user = load_user
And is it better in terms of performance to put everything in the user table instead? Because there may be much info that is not always needed, so would several tables make it faster or just create a need for many more connections?
Is it normal to use ORMs in huge applications, or to write raw SQL? Is it true that writing raw SQL increases the performance up to three times in comparison to using an ORM?

To get the user and its privileges, you could use a join:
select app_user.username, role.role_description
from app_user
join user_role
on app_user.id = user_role.app_user_id
join role
on user_role.app_user_role_id = role.role_id
where app_user.id = %s
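A runnable sketch of loading a user together with their roles, using sqlite3 as a stand-in for PostgreSQL so it needs no server (with psycopg2 the placeholder would be `%s` instead of `?`). The helper name `load_user_with_roles` and the sample user are hypothetical. It uses LEFT JOIN rather than a plain join so that a user with no roles is still returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (
    id INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL,
    password TEXT NOT NULL
);
CREATE TABLE role (role_id INTEGER PRIMARY KEY, role_description TEXT);
CREATE TABLE user_role (
    app_user_id INTEGER NOT NULL REFERENCES app_user (id),
    app_user_role_id INTEGER NOT NULL REFERENCES role (role_id)
);
INSERT INTO role (role_description) VALUES ('New'), ('Active'), ('Moderator');
-- Hypothetical sample user holding the 'Active' and 'Moderator' roles.
INSERT INTO app_user (username, password) VALUES ('alice', 'not-a-real-hash');
INSERT INTO user_role VALUES (1, 2), (1, 3);
""")


def load_user_with_roles(conn, user_id):
    """Return a dict with the user's id, username and role descriptions."""
    rows = conn.execute(
        """
        SELECT u.id, u.username, r.role_description
        FROM app_user u
        LEFT JOIN user_role ur ON u.id = ur.app_user_id
        LEFT JOIN role r ON ur.app_user_role_id = r.role_id
        WHERE u.id = ?
        ORDER BY r.role_id
        """,
        (user_id,),
    ).fetchall()
    if not rows:
        return None  # no such user
    roles = [desc for _, _, desc in rows if desc is not None]
    return {"id": rows[0][0], "username": rows[0][1], "roles": roles}


user = load_user_with_roles(conn, 1)
print(user)
```

In the Flask hook, the returned dictionary could be assigned to `g.user`, so a permission check becomes a membership test on `g.user["roles"]`.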
As for your second question, I would certainly use a few tables as you did instead of just one, as relational databases are designed to work that way.
If you'd like to use a single table, you should probably look at document-oriented databases, like MongoDB or Elasticsearch (ES might be overkill for a small project).
https://www.mongodb.com
https://www.elastic.co/products/elasticsearch
Edit : for your new question
The advantage of using an ORM is that it makes the coding part much faster, easier and safer.
Of course, sometimes there will be complicated requests that an ORM won't optimize properly.
I'm thinking about Django's ORM; it's known that in a few cases of complicated requests it doesn't shine.
But then, for those specific cases, you can always write your own queries manually to get better performance.
There is an interesting article on the topic: https://medium.com/@hansonkd/performance-problems-in-the-django-orm-1f62b3d04785

retain only specific rows psycopg2

I have a table containing jobs like this
id owner collaborator privilege
90 "919886297050" "919886212378" "read"
90 "919886297050" "919886297052" "read"
88 "919886297050" "919886212378" "read"
88 "919886297050" "919886297052" "read"
primary key is a composite of id, owner and collaborator
I want to pass in details of only the collaborators I want to retain. For example, if my collaborator = "919886212378", it means I want to delete the row for "919886297052" and keep the row for "919886212378".
Is there a way to do this in one query / execution instead of fetching the details separately and then performing the delete after filtering the missing values?
EDIT: My use case might have new collaborators added and old ones deleted. However, my input will just have a set of chosen collaborators so I will need to cross check with the old list, retain existing, add new and delete missing collaborators.

This statement does the delete for the specific case you mentioned:
DELETE FROM table WHERE collaborator NOT IN ('919886212378', 'id ..')
But I don't know how you get these ids. You give too little information about your exact case.
If you can get these ids with a query, you could make it a subquery like:
DELETE FROM table WHERE collaborator NOT IN (SELECT ... FROM ...)

Comparing the old and new collaborator lists in Python did the trick for me:
original_list = ["C1","C2","C3","C4"]  # result from query
updated_list = ["C1","C6","C7"]  # list obtained from request
# compute the differences
to_be_added = set(updated_list).difference(set(original_list) )
to_be_deleted = set(original_list).difference(set(updated_list) )
Then I use an insert and delete statement within a transaction using the above two lists to make an update.
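The whole update could look roughly like the sketch below. It uses sqlite3 as a stand-in for PostgreSQL/psycopg2 so it runs without a server; the table name `jobs`, the helper `sync_collaborators`, the new collaborator number, and the default privilege of 'read' for new rows are all assumptions made for illustration.

```python
import sqlite3


def sync_collaborators(conn, job_id, owner, updated):
    """Make the stored collaborators for (job_id, owner) match `updated`."""
    current = {
        row[0]
        for row in conn.execute(
            "SELECT collaborator FROM jobs WHERE id = ? AND owner = ?",
            (job_id, owner),
        )
    }
    wanted = set(updated)
    to_add = wanted - current
    to_delete = current - wanted
    with conn:  # both statements commit together or not at all
        conn.executemany(
            "DELETE FROM jobs WHERE id = ? AND owner = ? AND collaborator = ?",
            [(job_id, owner, c) for c in to_delete],
        )
        conn.executemany(
            "INSERT INTO jobs (id, owner, collaborator, privilege) "
            "VALUES (?, ?, ?, 'read')",
            [(job_id, owner, c) for c in to_add],
        )
    return to_add, to_delete


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER, owner TEXT, collaborator TEXT, "
    "privilege TEXT, PRIMARY KEY (id, owner, collaborator))"
)
owner = "919886297050"
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, 'read')",
    [(90, owner, "919886212378"), (90, owner, "919886297052"),
     (88, owner, "919886212378"), (88, owner, "919886297052")],
)

added, deleted = sync_collaborators(
    conn, 90, owner, ["919886212378", "919886200000"]
)
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT collaborator FROM jobs WHERE id = 90 ORDER BY collaborator"
    )
]
print(added, deleted, remaining)
```

Existing collaborators that stay in the list are never touched, so their privilege values are preserved.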

SQLite Inner Join with Limit on Left Table

Firstly, let me describe a scenario similar to the one I am facing; to better explain my issue. In this scenario, I am creating a system which needs to select 'n' random blog posts from a table and then get all the replies for those selected posts.
Imagine my structure like so:
blog_posts(id INTEGER PRIMARY KEY, thepost TEXT)
blog_replies(id INTEGER PRIMARY KEY, postid INTEGER FOREIGN KEY REFERENCES blog_posts(id), thereply TEXT)
This is the current SQL I have, but am getting an error:
SELECT blog_post.id, blog_post.thepost, blog_replies.id, blog_replies.thereply
FROM (SELECT blog_post.id, blog_post.thepost FROM blog_post ORDER BY RANDOM() LIMIT ?)
INNER JOIN blog_replies
ON blog_post.id=blog_replies_options.postid;
Here is the error:
sqlite3.OperationalError: no such column: hmquestion.id

Your query needs an alias added to the subquery. This will allow you to reference the fields from your subquery within the outer query:
SELECT hmquestion.id, hmquestion.question,
hmquestion_options.id, hmquestion_options.option
FROM (SELECT hmquestion.id, hmquestion.question
FROM hmquestion ORDER BY RANDOM() LIMIT ?) AS hmquestion -- add alias here
INNER JOIN hmquestion_options
ON hmquestion.id=hmquestion_options.questionid;
As is, your outer query doesn't know what hmquestion references.
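The same fix applied to the blog_posts / blog_replies structure from the question can be sketched with sqlite3 as follows. The sample rows and the helper name `random_posts_with_replies` are made up for illustration, and the short aliases `p` and `r` are a stylistic choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, thepost TEXT);
CREATE TABLE blog_replies (
    id INTEGER PRIMARY KEY,
    postid INTEGER REFERENCES blog_posts (id),
    thereply TEXT
);
-- Hypothetical sample data: every post has at least one reply.
INSERT INTO blog_posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
INSERT INTO blog_replies VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'), (4, 3, 'd');
""")


def random_posts_with_replies(conn, n):
    """Pick n random posts, then join all of their replies."""
    return conn.execute(
        """
        SELECT p.id, p.thepost, r.id, r.thereply
        FROM (SELECT id, thepost FROM blog_posts
              ORDER BY RANDOM() LIMIT ?) AS p  -- alias for the subquery
        INNER JOIN blog_replies AS r ON p.id = r.postid
        """,
        (n,),
    ).fetchall()


rows = random_posts_with_replies(conn, 2)
print(rows)
```

Because the LIMIT sits inside the aliased subquery, it restricts the number of posts rather than the number of joined rows.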
