I have more than 100 google sheets that are shared with a lot of people. I am trying to remove inactive people from the access list. Is there a way in python to extract the list of people who have contributed to the google sheet from the version history?
I used gspread library to access the sheet but not sure how to get the list of contributing users.
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build
scope = ['https://www.googleapis.com/auth/drive.activity.readonly']
creds = ServiceAccountCredentials.from_json_keyfile_name('accessAPI.json', scopes = scope)
drive_service = build('driveactivity', 'v2', credentials=creds)
edit_activities = drive_service.activity().query(body={
    "filter": "detail.action_detail_case:EDIT",
    "itemName": "items/xyz",
    # In Drive Activity API v2 the consolidation strategy is an object, not a string
    "consolidationStrategy": {"legacy": {}},
}).execute()
# Call the People API
scope = ['https://www.googleapis.com/auth/contacts.readonly']
creds = ServiceAccountCredentials.from_json_keyfile_name('accessAPI.json', scopes = scope)
service = build('people', 'v1', credentials=creds)
results = service.people().get(resourceName='people/1080745054',personFields='names,emailAddresses').execute()
Running a people ID through people API gives back the below result. It doesn't contain the email address
{'resourceName': 'people/1080745054',
'etag': '%EgcBAj0JPjcuGgQBAgUH'}
Is the output being truncated?
Approach
Using Python, you can do this by combining the Drive Activity API, the People API, and the Drive API:
Get the EDIT activity for each of your Google Sheets with the Drive Activity API.
Get the editors' people IDs from the actors field of the Drive Activity API response body.
Get the editors' email addresses from the People API using those people IDs.
List all file permissions on your Google Sheets with the Drive API Permissions resource.
Update the permission if the user is not in the editor list, according to your own logic.
Here is the proposed script in pseudocode:
loop your_google_sheets:
    edit_activities = drive_service.activity().query(filter="detail.action_detail_case:EDIT", itemName="items/"+your_google_sheets.id)
    editors = edit_activities.get(actors_ids)
    loop editors:
        editors_emails += people_service.people().get(resourceName=editors.personName, personFields="emailAddresses")
    file_permissions = drive_service.permissions().list(fileId=your_google_sheets.id)
    loop file_permissions:
        update_if_not_editor(editors_emails, file_permissions.id) # Implement your own logic
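The two data-handling steps of the outline above (collecting actor IDs from the activity response, then deciding which permissions to look at) can be sketched in plain Python. This is only a minimal sketch, not the answer's own code. The helper names extract_editor_ids and permissions_to_review are hypothetical, and the sample dictionaries are hand-written to mimic the shape of a Drive Activity API v2 response (activities[].actors[].user.knownUser.personName) and of Drive API v3 permission resources. It assumes permissions().list was called with a fields value that includes emailAddress.

```python
def extract_editor_ids(activity_response):
    """Collect the people/... IDs of known users from a Drive Activity response."""
    editor_ids = set()
    for activity in activity_response.get("activities", []):
        for actor in activity.get("actors", []):
            # Only actors of the form {"user": {"knownUser": {...}}} carry an ID.
            known_user = actor.get("user", {}).get("knownUser", {})
            person_name = known_user.get("personName")
            if person_name:
                editor_ids.add(person_name)
    return editor_ids


def permissions_to_review(permissions, editor_emails):
    """Return non-owner user permissions whose e-mail is not among the editors."""
    active = {email.lower() for email in editor_emails}
    return [
        p for p in permissions
        if p.get("type") == "user"
        and p.get("role") != "owner"  # never touch the owner
        and p.get("emailAddress", "").lower() not in active
    ]


# Hand-written sample data (hypothetical values, not real API output).
sample_activity = {
    "activities": [
        {"actors": [{"user": {"knownUser": {"personName": "people/111"}}}]},
        {"actors": [{"user": {"knownUser": {"personName": "people/222"}}},
                    {"administrator": {}}]},
        {"actors": [{"user": {"deletedUser": {}}}]},
    ]
}
sample_permissions = [
    {"id": "perm-owner", "type": "user", "role": "owner",
     "emailAddress": "owner@example.com"},
    {"id": "perm-alice", "type": "user", "role": "writer",
     "emailAddress": "Alice@example.com"},
    {"id": "perm-bob", "type": "user", "role": "writer",
     "emailAddress": "bob@example.com"},
    {"id": "perm-anyone", "type": "anyone", "role": "reader"},
]

print(sorted(extract_editor_ids(sample_activity)))
print(permissions_to_review(sample_permissions, ["alice@example.com"]))
```

What to do with the returned permissions (downgrade to reader, delete, or just report) is left to your own logic, as in the pseudocode.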
References
People API get
Drive Activities API query
Drive API Permissions list
Drive API Permissions update
Related
How can I get a list of users in a GCP account using Python? I can't find out how to authorize against the account with Python and get the list. Can anybody help me?
I am assuming that you are just getting started with Google Cloud and the Python SDKs. If you are already experienced, skip to the bottom of my answer for the actual example code.
The documentation for the Google Cloud Python SDKs can be hard to figure out. The key detail is that Google documents the APIs using automated tools: Google publishes a document that SDKs can read to build their APIs automatically. This might appear strange at first, but it is very clever when you think about it, because the SDKs automatically update themselves to support the latest API implementation.
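To make the idea of a machine-readable API description concrete, here is a small sketch that walks a discovery-style document and lists the methods it describes. The function name list_methods is hypothetical, and sample_doc is a trimmed, hand-written fragment in the general shape of a discovery document (nested "resources" containing "methods"), not a verbatim copy of what Google publishes.

```python
def list_methods(discovery_doc, prefix=""):
    """Yield (dotted method name, HTTP verb, path) for every described method."""
    for name, method in discovery_doc.get("methods", {}).items():
        yield prefix + name, method.get("httpMethod"), method.get("path")
    # Resources can nest, so recurse with an extended prefix.
    for name, resource in discovery_doc.get("resources", {}).items():
        yield from list_methods(resource, prefix + name + ".")


# Hand-written, heavily trimmed fragment (an assumption for illustration).
sample_doc = {
    "name": "cloudresourcemanager",
    "version": "v3",
    "resources": {
        "projects": {
            "methods": {
                "get": {"httpMethod": "GET", "path": "v3/{+name}"},
                "getIamPolicy": {
                    "httpMethod": "POST",
                    "path": "v3/{+resource}:getIamPolicy",
                },
            }
        }
    },
}

for dotted_name, verb, path in list_methods(sample_doc):
    print(dotted_name, verb, path)
```

The client library does something similar in spirit when it turns resource and method entries into Python calls such as service.projects().getIamPolicy(...).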
Start with the root document: Google API Client Library for Python Docs
Near the bottom is the link for documentation:
Library reference documentation by API
For your case, listing users with IAM bindings in a project, scroll down to cloudresourcemanager. Sometimes there are multiple API versions. Usually, pick the latest version. In this case, v3.
Knowing which API to use is built from experience. As you develop more and more software in Google Cloud, the logic to the architecture becomes automatic.
Cloud Resource Manager API
The API provides multiple Instance Methods. In your case, the instance method is projects.
Cloud Resource Manager API - projects
Within projects are Instance Methods. In your case, getIamPolicy().
getIamPolicy(resource, body=None, x__xgafv=None)
Sometimes you need to review the REST API to understand parameters and returned values.
Resource Manager REST API: Method: projects.getIamPolicy
For example, to understand the response from the Python SDK API, review the response documented by the REST API which includes several examples:
Resource Manager REST API: Policy
Now that I have covered the basics of discovering how to use the documentation, let's create an example that will list the roles and IAM members.
Import the required Python libraries:
from google.oauth2 import service_account
import googleapiclient.discovery
Create a variable with your Project ID. Note: do not use Project Name.
PROJECT_ID='development-123456'
Note: In the following explanation, I use a service account. Later in this answer, I show an example using ADC (Application Default Credentials) set up by the Google Cloud CLI (gcloud).
Create a variable with the full pathname to your Google Cloud Service Account JSON Key file:
SA_FILE='/config/service-account.json'
Create a variable for the required Google Cloud IAM Scopes. Typically I use the following scope as I prefer to control permissions via IAM Roles assigned to the service account:
SCOPES=['https://www.googleapis.com/auth/cloud-platform']
Create OAuth credentials from the service account:
credentials = service_account.Credentials.from_service_account_file(
    filename=SA_FILE,
    scopes=SCOPES)
Now we are at the point to start using the API documentation. The following code builds the API discovery document and loads the APIs for cloudresourcemanager:
service = googleapiclient.discovery.build(
    'cloudresourcemanager',
    'v3',
    credentials=credentials)
Now call the API, which returns a JSON response detailing the roles and the members bound to them in the project:
resource = 'projects/' + PROJECT_ID
response = service.projects().getIamPolicy(resource=resource, body={}).execute()
The following is simple code to print part of the returned JSON:
for binding in response['bindings']:
    print('Role:', binding['role'])
    for member in binding['members']:
        print(member)
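Since the question asks for a list of users rather than a list of roles, the same response can be turned around so that each member maps to its roles. The following is a minimal sketch working on a hand-written sample policy. The helper names roles_by_member and user_emails are hypothetical, and it assumes the usual member prefixes such as user: and serviceAccount: in the bindings.

```python
from collections import defaultdict


def roles_by_member(policy):
    """Invert the policy bindings into {member: [roles]}."""
    members = defaultdict(list)
    for binding in policy.get("bindings", []):
        for member in binding.get("members", []):
            members[member].append(binding["role"])
    return dict(members)


def user_emails(policy):
    """Return the sorted e-mail addresses of members with the 'user:' prefix."""
    return sorted(
        member.split(":", 1)[1]
        for member in roles_by_member(policy)
        if member.startswith("user:")
    )


# Hand-written sample in the shape of a getIamPolicy response (hypothetical values).
sample_policy = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:alice@example.com"]},
        {"role": "roles/viewer",
         "members": ["user:alice@example.com",
                     "user:bob@example.com",
                     "serviceAccount:sa@development-123456.iam.gserviceaccount.com"]},
    ]
}

print(roles_by_member(sample_policy))
print(user_emails(sample_policy))
```

Note that this only covers members with bindings on the project itself; members who get access through a group or through a folder or organization policy do not appear in the project policy.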
Complete example that uses ADC (Application Default Credentials):
import googleapiclient.discovery
PROJECT_ID='development-123456'
service = googleapiclient.discovery.build('cloudresourcemanager', 'v3')
resource = 'projects/' + PROJECT_ID
response = service.projects().getIamPolicy(resource=resource, body={}).execute()
for binding in response['bindings']:
    print('Role:', binding['role'])
    for member in binding['members']:
        print(member)
Complete example using a service account:
from google.oauth2 import service_account
import googleapiclient.discovery
PROJECT_ID='development-123456'
SA_FILE='/config/service-account.json'
SCOPES=['https://www.googleapis.com/auth/cloud-platform']
credentials = service_account.Credentials.from_service_account_file(
    filename=SA_FILE,
    scopes=SCOPES)
service = googleapiclient.discovery.build(
    'cloudresourcemanager', 'v3', credentials=credentials)
resource = 'projects/' + PROJECT_ID
response = service.projects().getIamPolicy(resource=resource, body={}).execute()
for binding in response['bindings']:
    print('Role:', binding['role'])
    for member in binding['members']:
        print(member)
Hi
I'm having trouble keeping my google OAuth Refresh Token valid for a small application I'm writing. I need to get data from a spreadsheet to a server / desktop application.
I'm trying to authorize with OAuth, which works for a week, then stops.
According to this post, this is expected behaviour:
https://stackoverflow.com/a/67966982/16509954
Another answer in the same thread posts a method how to permanently give access and not get your token expired:
https://stackoverflow.com/a/66292541/16509954
I did this but my token still keeps expiring.
Any ideas what I'm doing wrong?
I'm using the python library, my code is pretty much identical to the example given in the documentation quickstart.py:
https://developers.google.com/sheets/api/quickstart/python
Refresh tokens can expire for a number of reasons, the main one these days being that your application is still in the testing phase.
Set your application to production in the Google Cloud console and have it verified; the refresh tokens will then no longer expire after a week.
from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
SCOPES = ['https://www.googleapis.com/auth/drive']
KEY_FILE_LOCATION = '<REPLACE_WITH_JSON_FILE>'
VIEW_ID = '<REPLACE_WITH_VIEW_ID>'
def initialize_sheets():
    """Initializes a Sheets service object.
    Returns:
        An authorized Sheets service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)
    # Build the service object.
    service = build('sheets', 'v4', credentials=credentials)
    return service
I have been trying to access some simple information on Google Shared Drive files from a Python 3.7 script:
The last time a Google Sheets file on a shared drive was modified.
I have created a service account in the GCP Drive API menu, and it can access, edit, etc. Google Sheets without any problem via the Sheets API.
However, when I use the same service account for the Drive API, it does not return any info on files outside its own folder (which contains only one file: "Getting Started"). The account has access to all Cloud APIs, has Domain-wide Delegation with all scopes related to Drive API included in the API control menu in GSuite.
The email address of the service account has been properly added to all folders in the shared drive.
Any idea? Basically all I need is to know when is the last time a sheet was modified by any given user.
secret_cred_file = ...
SCOPES = ['https://www.googleapis.com/auth/drive']
credentials = service_account.Credentials.from_service_account_file(secret_cred_file, scopes=SCOPES)
service = discovery.build('drive', 'v3', credentials=credentials)
results = service.files().list(pageSize=10, fields="nextPageToken, files(id, name,modifiedTime)").execute()
items = results.get('files', [])
PS: I have seen this: Getting files from shared folder but it does not help
I was able to list shared drive files without impersonating a user by adding some parameters to the list method, as stated in the Google documentation:
Implement shared drive support
Shared drives follow different organization, sharing, and ownership models from My Drive. If your app is going to create and manage files on shared drives, you must implement shared drive support in your app. To begin, you need to include the supportsAllDrives=true query parameter in your requests when your app performs these operations:
files.get, files.list, files.create, files.update, files.copy, files.delete, changes.list, changes.getStartPageToken, permissions.list, permissions.get, permissions.create, permissions.update, permissions.delete
Search for content on a shared drive
Use the files.list method to search for shared drives. This section covers shared drive-specific fields in the files.list method. To search for shared drive, refer to Search for files and folders.
The files.list method contains the following shared drive-specific fields and query modes:
driveId — ID of shared drive to search.
includeItemsFromAllDrives — Whether shared drive items should be included in results. If not present or set to false, then shared drive items are not returned.
corpora — Bodies of items (files/documents) to which the query applies. Supported bodies are user, domain, drive, and allDrives. Prefer user or drive to allDrives for efficiency.
supportsAllDrives — Whether the requesting application supports both My Drives and shared drives. If false, then shared drive items are not included in the response.
Example
service.files().list(includeItemsFromAllDrives=True, supportsAllDrives=True, pageSize=10, fields="nextPageToken, files(id, name,modifiedTime)").execute()
Remember that the folder or files need to be shared with the service account.
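With pageSize=10 the call above only returns the first page. A minimal sketch of following nextPageToken until every page has been read is shown below. The helper name list_all_files is hypothetical, and FakeDriveService is only a stand-in so that the sketch runs without credentials; with the real client you would pass the object returned by build('drive', 'v3', ...). It assumes the fields value includes nextPageToken, as in the example above.

```python
def list_all_files(service, **list_params):
    """Call files.list repeatedly, following nextPageToken, and merge the pages."""
    files = []
    page_token = None
    while True:
        response = service.files().list(
            pageToken=page_token, **list_params
        ).execute()
        files.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return files


class _FakeRequest:
    def __init__(self, response):
        self._response = response

    def execute(self):
        return self._response


class FakeDriveService:
    """Stand-in for the Drive client; maps a page token to a canned response."""

    def __init__(self, pages):
        self._pages = pages
        self.calls = []

    def files(self):
        return self

    def list(self, **params):
        self.calls.append(params)
        return _FakeRequest(self._pages[params.get("pageToken")])


# Hypothetical canned pages: the first has a token, the second ends the listing.
pages = {
    None: {"files": [{"name": "a"}], "nextPageToken": "t1"},
    "t1": {"files": [{"name": "b"}]},
}
fake = FakeDriveService(pages)
all_files = list_all_files(
    fake,
    includeItemsFromAllDrives=True,
    supportsAllDrives=True,
    pageSize=10,
    fields="nextPageToken, files(id, name, modifiedTime)",
)
print([f["name"] for f in all_files])
```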
You need to impersonate your users.
It is not possible to make an API call to get all the files in your domain in one go.
In the Service Accounts article it says:
Service accounts are not members of your Google Workspace domain, unlike user accounts. For example, if you share assets with all members in your Google Workspace domain, they will not be shared with service accounts...This doesn't apply when using domain-wide delegation, because API calls are authorized as the impersonated user, not the service account itself.
So unfortunately you can't just share a file with a service account. To get all the files in your domain you would need to:
Impersonate an admin account and get a list of all the users.
Impersonate each user and make Drive API request for each.
Here is a good quick start for the Python Library, specifically this section
Remember to set permissions in both the GCP console and the Admin console though it seems like you have done this correctly.
Example script
from google.oauth2 import service_account
from googleapiclient.discovery import build
def main():
    SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly',
              'https://www.googleapis.com/auth/admin.directory.user.readonly']
    SERVICE_ACCOUNT_FILE = 'credentials.json'
    credentials = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)
    # Admin SDK to get users
    admin_delegated_credentials = credentials.with_subject('[ADMIN_EMAIL]')
    admin_service = build(
        'admin',
        'directory_v1',
        credentials=admin_delegated_credentials
    )
    admin_results = admin_service.users().list(customer='my_customer', maxResults=10,
                                               orderBy='email').execute()
    users = admin_results.get('users', [])
    if not users:
        print('No users in the domain.')
    else:
        for user in users:
            print(u'{0} ({1})'.format(user['primaryEmail'],
                                      user['name']['fullName']))
            # Drive to get files for each user
            delegated_credentials = credentials.with_subject(user['primaryEmail'])
            drive_service = build(
                'drive',
                'v3',
                credentials=delegated_credentials
            )
            drive_results = drive_service.files().list(
                pageSize=10,
                fields="nextPageToken, files(id, name,modifiedTime)"
            ).execute()
            items = drive_results.get('files', [])
            if not items:
                print('No files found.')
            else:
                print('Files:')
                for item in items:
                    print(u'{0} ({1})'.format(item['name'],
                                              item['id']))
if __name__ == '__main__':
    main()
Explanation
This script has two scopes:
'https://www.googleapis.com/auth/drive.metadata.readonly'
'https://www.googleapis.com/auth/admin.directory.user.readonly'
The project initialized in the GCP Cloud console has also been granted these scopes from within the Admin console > Security > API Controls > Domain wide delegation > Add new
The first thing the script does is build the credentials using from_service_account_file:
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
Then it builds the delegated credentials, that is, the user to be impersonated:
admin_delegated_credentials = credentials.with_subject('[ADMIN_EMAIL]')
From there it can build the service as normal. It gets a list of the users, loops through the users and lists their files. You could adapt this to your needs.
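To come back to the original goal (knowing when a sheet was last modified), the modifiedTime values returned by files.list can be parsed and compared. The sketch below works on hand-written sample items. The helper names parse_drive_time and latest_modification are hypothetical, and it assumes the request's fields value asked for modifiedTime (and, for the e-mail address, lastModifyingUser).

```python
from datetime import datetime


def parse_drive_time(value):
    """Parse a timestamp such as 2021-03-01T12:34:56.789Z into an aware datetime."""
    # fromisoformat() on older Python versions does not accept a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_modification(files):
    """Return the most recently modified file dict, or None for an empty list."""
    return max(
        files,
        key=lambda f: parse_drive_time(f["modifiedTime"]),
        default=None,
    )


# Hand-written sample items in the shape of files.list results (hypothetical values).
sample_files = [
    {"name": "Budget", "modifiedTime": "2021-03-01T12:34:56.789Z",
     "lastModifyingUser": {"emailAddress": "alice@example.com"}},
    {"name": "Roadmap", "modifiedTime": "2021-04-15T08:00:00.000Z",
     "lastModifyingUser": {"emailAddress": "bob@example.com"}},
]

newest = latest_modification(sample_files)
print(newest["name"], newest["modifiedTime"],
      newest["lastModifyingUser"]["emailAddress"])
```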
References
Service Accounts
Using OAuth 2.0 for Server to Server Applications
I'm trying to use the Google Drive API. I created service account credentials and downloaded them from the Cloud console. The problem is that I'm part of a G Suite organization, and when I try to list my files the result is empty, even though I have files in my Drive.
from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    "credentials.json", scopes=['https://www.googleapis.com/auth/drive'])
service = build('drive', 'v3', credentials=credentials)
print(service.files().list().execute())
What could be the problem?
You have not provided much information, but make sure that, on the API credentials you issued, you selected the 'Other UI' option in the field 'Where will you be calling the API from' and chose 'User data' instead of 'Application data'. Also, the scope should be 'https://www.googleapis.com/auth/drive.metadata.readonly' for listing data.
'https://www.googleapis.com/auth/drive' is correct too, but given that it is a G Suite account there can be limitations on generic scopes, even for your own data.
Also, you should do files = service.files().list().execute().get('files', [])
for f in files:
    print(f['name'])
and enumerate that files array to get the files.
If that doesn't work, have a look at the API docs. If you can't figure it out, please post more details, try some debugging, and post the results here.
Edit: Try using the restapi too with the appropriate credentials and see if the files are fetched successfully there. https://developers.google.com/drive/api/v2/reference/files/list
I'm using the Google People API to access my contacts.
I activated it in the Google Developers Console and created a project, a service account (ending with ....iam.gserviceaccount.com) and a key for authentication which is stored in JSON format.
When I access the contacts, it seems to take the contacts of my service account address rather than my Google account which results in an empty list.
How can I tell the API to use my account rather than the service account?
This is the code I have so far:
from google.oauth2 import service_account
from googleapiclient.discovery import build
# pip install google-auth google-auth-httplib2 google-api-python-client
SCOPES = ['https://www.googleapis.com/auth/contacts.readonly']
KEY = '~/private.json'
credentials = service_account.Credentials.from_service_account_file(
    KEY, scopes=SCOPES)
service = build(
    serviceName='people', version='v1', credentials=credentials)
connections = service.people().connections().list(
    resourceName='people/me', personFields='names').execute()
print(connections)
# result: {}
A service account is NOT you. A service account is a dummy user: it has its own Google Drive account, Google Calendar, and apparently Google Contacts. The reason you are seeing an empty result set is that you have not added any contacts to the service account's account.
Service accounts are most often used to grant access to data that the developer owns. For example, you can take the service account's email address and share one of your Google Drive folders with it; it will then have access to that folder on your Google Drive account. You can do the same with Google Calendar.
Some APIs do not give you the ability to share your data with other users: YouTube, AdWords, Blogger, and Google Contacts, to name a few.
You can't use a service account to access your personal Google contacts. Your best bet would be to authenticate your application with OAuth2 and access them that way.
Note about Google Workspace
If you have a Google Workspace account, a service account can be configured to act on behalf of a user on the domain, but only a user on the domain. See "Perform Google Workspace domain-wide delegation of authority".
I'm not a Python expert, but I've just performed the task the OP is talking about in .NET, and I am pretty sure it's feasible with Python too.
It looks like all that needs to be done is to delegate domain-wide authority to the SA, i.e. assign the required scopes to your SA. In my case it was https://www.googleapis.com/auth/contacts.readonly.
Then you make your call and specify the account you're trying to impersonate (I took the Python example from here):
from google.oauth2 import service_account
SCOPES = ['https://www.googleapis.com/auth/sqlservice.admin']
SERVICE_ACCOUNT_FILE = '/path/to/service.json'
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
# this is the line you apparently were missing
delegated_credentials = credentials.with_subject('user@example.org')
Then you'll be able to do the people/me calls. Worked for me in .NET as I said.