Please could someone help me with a query related to permissions on the Google cloud platform? I realise that this is only loosely programming related so I apologise if this is the wrong forum!
I have a project ("ProjectA") written in Python that uses Google's Cloud Storage and Compute Engine. The project has various buckets that are accessed using Python code from both compute instances and from my home computer. This project uses a service account which is a project "owner"; I believe it has all APIs enabled, and the project works really well. The service account name is "master#projectA.iam.gserviceaccount.com".
Recently I started a new project that needs similar resources (storage, compute, etc.), but I want to keep it separate. The new project is called "ProjectB" and I set up a new master service account called master#projectB.iam.gserviceaccount.com. My code in ProjectB generates an error related to access permissions, which appears even if I strip the code down to these few lines:
The code from ProjectA looked like this:
from google.cloud import storage
client = storage.Client(project='projectA')
mybucket = storage.bucket.Bucket(client=client, name='projectA-bucket-name')
currentblob = mybucket.get_blob('somefile.txt')
The code from ProjectB looks like this:
from google.cloud import storage
client = storage.Client(project='projectB')
mybucket = storage.bucket.Bucket(client=client, name='projectB-bucket-name')
currentblob = mybucket.get_blob('somefile.txt')
Both buckets definitely exist, and obviously if "somefile.txt" does not exist then currentblob is None, which is fine. But when I execute this code I receive the following error:
Traceback (most recent call last):
File .... .py", line 6, in <module>
currentblob = mybucket.get_blob('somefile.txt')
File "C:\Python27\lib\site-packages\google\cloud\storage\bucket.py", line 599, in get_blob
_target_object=blob,
File "C:\Python27\lib\site-packages\google\cloud\_http.py", line 319, in api_request
raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://www.googleapis.com/storage/v1/b/<ProjectB-bucket>/o/somefile.txt: master#ProjectA.iam.gserviceaccount.com does not have storage.objects.get access to projectB/somefile.txt.
Notice how the error message says the "ProjectA" service account doesn't have ProjectB access. I would somewhat expect that, but I was expecting to be using the ProjectB service account!
I have read the documentation and several related links, but even after removing and reinstating the service account, or giving it limited scopes, nothing has helped. I have tried a few things:
1) Make sure that my new service account was "activated" on my local machine (where the code is being run for now):
gcloud auth activate-service-account master#projectB.iam.gserviceaccount.com --key-file="C:\my-path-to-file\123456789.json"
This appears to be successful.
2) Verify the list of credentialed accounts:
gcloud auth list
This lists two accounts: one is my email address (that I use for Gmail, etc.), and the other is master#projectB.iam.gserviceaccount.com, so it appears that my account is "registered" properly.
3) Set the service account as the active account:
gcloud config set account master#projectB.iam.gserviceaccount.com
When I look at the auth list again, there is an asterisk "*" next to the service account, so presumably this is good.
4) Check that the project is set to ProjectB:
gcloud config set project projectB
This also appears to be ok.
It's strange that when I run the Python code, it is "using" the service account from my old project even though I have seemingly changed everything to refer to ProjectB: I've activated the account, selected it, etc.
Please could someone point me in the direction of something that I might have missed? I don't recall going through this much pain when setting up my original project, and I'm finding it incredibly frustrating that something I thought would be simple is proving so difficult.
Thank you to anyone who can offer me any assistance.
I'm not entirely sure, but this answer is from a similar question on here:
Permission to Google Cloud Storage via service account in Python
Specify the account explicitly by pointing to the credentials in your code, as documented here:
Example from the documentation page:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'service_account.json')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
Don't you have a GOOGLE_APPLICATION_CREDENTIALS environment variable configured which points to project A's service account?
The default behavior of the Google SDK is to take the service account from the GOOGLE_APPLICATION_CREDENTIALS environment variable.
If you want to change the account you can do something like:
import os
from google.cloud import storage

credentials_json_file = os.environ.get('env_var_with_path_to_account_json')
client = storage.Client.from_service_account_json(credentials_json_file)
The above assumes you have created a JSON service account key file as described in https://cloud.google.com/iam/docs/creating-managing-service-account-keys, and that the path to the key file is stored in the environment variable env_var_with_path_to_account_json.
This way you can have 2 account files and decide which one to use.
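For instance, a minimal sketch along those lines (the environment variable name PROJECT_B_CREDENTIALS below is hypothetical) that explicitly selects ProjectB's key file:
import os
from google.cloud import storage

# PROJECT_B_CREDENTIALS is a hypothetical variable holding the path to ProjectB's
# key file, e.g. C:\my-path-to-file\123456789.json
credentials_json_file = os.environ.get('PROJECT_B_CREDENTIALS')
client = storage.Client.from_service_account_json(credentials_json_file)
mybucket = client.bucket('projectB-bucket-name')
currentblob = mybucket.get_blob('somefile.txt')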
Related
There is a GCP project that contains a bucket that I have read and write permissions to, but I don't know the name of the project nor am I part of the project. None of the contents of this bucket are public.
I have successfully authenticated my user locally using gcloud auth application-default login.
I can successfully download from this bucket using gsutil cat gs://BUCKET/PATH.
However, if I use the google.cloud.storage Python API, it fails at the point of identifying the project, presumably because I don't have access to the project itself:
from google.cloud import storage
client = storage.Client()
storage.Blob.from_string("gs://BUCKET/PATH", client=client).download_as_text()
This fails with the following error:
The billing account for the owning project is disabled in state closed: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)
I can't use storage.Client.create_anonymous_client(), since that is only relevant for public buckets, but I suspect I could fix this by changing the credentials argument passed to Client().
Can anyone help me download the file from Google Cloud in this case?
If you have permission, you can find the project number for a given bucket with the bucket get API call. See this guide for how to do it with various client libraries.
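For example, a minimal sketch with the Python client library, assuming your application-default credentials have read access to the bucket (the project passed to the client is only a client-side default and is a placeholder here):
from google.cloud import storage

# The bucket metadata returned by buckets.get includes the owning project's number.
client = storage.Client(project="any-project-id")  # placeholder project
bucket = client.get_bucket("BUCKET")
print(bucket.project_number)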
So I am trying to orchestrate a workflow in Airflow. One task reads from GCP Cloud Storage, which requires me to specify the Google application credentials.
I decided to create a new folder inside the dags folder and put the JSON key there. Then I specified this in the dag.py file:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "dags\support\keys\key.json"
Unfortunately, I am getting the error below:
google.auth.exceptions.DefaultCredentialsError: File dags\support\keys\dummy-surveillance-project-6915f229d012.json was not found
Can anyone help with how I should go about declaring the service account key?
Thank you.
You can create a connection to Google Cloud from the Airflow webserver Admin menu. In this menu you can pass the service account key file path.
For example, the Keyfile Path could be /usr/local/airflow/dags/gcp.json.
Beforehand, you need to mount your key file as a volume in your Docker container at that path.
You can also paste the key's JSON content directly into the Airflow connection, in the Keyfile JSON field.
You can check the following links:
Airflow-connections
Airflow-with-google-cloud
Airflow-composer-managing-connections
If you are trying to download data from Google Cloud Storage using Airflow, you should use the GCSToLocalFilesystemOperator described here. It is already provided as part of the standard Airflow library (if you have installed it), so you don't have to write the code yourself using the Python operator.
Also, if you use this operator, you can enter the GCP credentials on the connections screen (where they belong). This is a better approach than putting your credentials in a folder with your DAGs, as that could lead to your credentials being committed to your version control system, which could cause security issues.
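For example, a minimal sketch of such a task, assuming the Google provider package is installed; the bucket, object, filename and connection ID below are placeholders:
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_local import GCSToLocalFilesystemOperator

with DAG("gcs_download_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    download_file = GCSToLocalFilesystemOperator(
        task_id="download_file",
        bucket="my-bucket",                  # placeholder bucket name
        object_name="path/to/somefile.txt",  # placeholder object path
        filename="/tmp/somefile.txt",        # local destination on the worker
        gcp_conn_id="google_cloud_default",  # connection configured in the Airflow UI
    )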
I am trying to get Firestore working in emulator-mode with Python on my local Linux PC. Is it possible to use anonymous credentials for this so I don't have to create new credentials?
I have tried two methods of getting access to the Firestore database from within a Python Notebook, after having run firebase emulators:start from the command-line:
First method:
from firebase_admin import credentials, firestore, initialize_app
project_id = 'my_project_id'
cred = credentials.ApplicationDefault()
initialize_app(cred, {'projectId': project_id})
db = firestore.client()
This raises the following exception:
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Second method:
from google.auth.credentials import AnonymousCredentials
from google.cloud.firestore import Client
cred = AnonymousCredentials()
db = Client(project=project_id, credentials=cred)
This works, but then I try to access a document that I have manually inserted into the database using the web interface, with the following Python code:
doc = db.collection('my_collection').document('my_doc_id').get()
And then I get the following error, which perhaps indicates that the anonymous credentials don't work:
PermissionDenied: 403 Missing or insufficient permissions.
Thoughts
It is a surprisingly complicated system, and although I have read numerous doc pages, watched tutorial videos, etc., there seems to be an endless labyrinth of configuration that needs to be set up in order for it to work. It would be nice if I could get Firestore working on my local PC with minimal effort, to see if the database would work for my application.
Thanks!
Method 2 works if an environment variable is set. In the following, change localhost:8080 to the Firestore server address shown when the emulator is started using firebase emulators:start:
import os
os.environ['FIRESTORE_EMULATOR_HOST'] = 'localhost:8080'
I don't know how to make it work with Method 1 above using the firebase_admin Python package. Perhaps someone else knows.
Also note that the emulated Firestore database will discard all its data when the server is shut down. To persist and reuse the data, start the emulator with a command like this:
firebase emulators:start --import=./emulator_data --export-on-exit
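Putting the pieces of Method 2 together, a minimal sketch against a local emulator (this assumes the emulator reports localhost:8080 and that my_project_id matches the project ID the emulator was started with):
import os
# Must be set before the client is created so that it talks to the emulator.
os.environ['FIRESTORE_EMULATOR_HOST'] = 'localhost:8080'

from google.auth.credentials import AnonymousCredentials
from google.cloud.firestore import Client

# The emulator does not enforce IAM, so anonymous credentials are sufficient.
db = Client(project='my_project_id', credentials=AnonymousCredentials())
doc = db.collection('my_collection').document('my_doc_id').get()
print(doc.to_dict())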
I'm having issues executing a Cloud Function on GCP which tries to update some Google Sheets of mine. I got this script working in Jupyter but have struggled to deploy it as a Cloud Function. My issue seems to be authorizing the CF to post to Google Sheets.
I've tried many things over 6+ hours: most related questions on Stack Overflow, Medium articles, GitHub, but I haven't been able to find a working solution. I don't think it's a roles/permissions issue. I understand how some of these approaches may work outside Cloud Functions, but not inside one.
Ultimately, from what I've seen, the best way seems to be hosting my JSON secret key inside a storage bucket and reading it from there. I've tried this without success, and it seems somewhat convoluted given that everything is a Google service anyway.
I've honestly gone back to my original code, so I'm back to the first error, which is simply that my JSON key cannot be found; when I was running it in Jupyter the key was in the same directory, hence why I created a Google Storage bucket to try to link to.
import pandas as pd
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import google.cloud
from df2gspread import df2gspread as d2g
from df2gspread import gspread2df as g2d
import datetime
import time
import numpy as np
def myGet(event, context):
    scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
    credentials = ServiceAccountCredentials.from_json_keyfile_name('my-key-name.json', scope)
    gc = gspread.authorize(credentials)
    spreadsheet_key = '--removed actual key/id--'
ERROR:
File "/env/local/lib/python3.7/site-packages/oauth2client/service_account.py", line 219, in from_json_keyfile_name
    with open(filename, 'r') as file_obj:
FileNotFoundError: [Errno 2] No such file or directory: 'my-key-name.json'
Thanks very much for any guidance and support on this. I have thoroughly looked and tried to solve this on my own. EDIT: Please keep in mind, this is not a .py file living in a directory; that's part of my issue. I don't know where to link to, as it's an isolated "Cloud Function" as far as I can tell.
Some of the links I've looked at in my 20+ attempts to fix this issue, just to name a few:
How can I get access to Google Cloud Storage using an access and a secret key
Accessing google cloud storage bucket from cloud functions throws 500 error
https://cloud.google.com/docs/authentication/getting-started#auth-cloud-implicit-python
https://cloud.google.com/docs/authentication/getting-started#setting_the_environment_variable
UPDATE:
I realized you could upload a zip of your files so that three files show up in the inline editor. At the beginning I was not doing this, so I could not figure out where to put the JSON key. Now I have it viewable and need to figure out how to call it in the method.
When I do a test run of the CF, I get a non-descript error which doesn't show up in the logs, and I can't test it from Cloud Scheduler like I could previously. I found this on Stack Overflow and feel like I now need the same thing but for Python, and need to figure out what calls to make from the Google docs.
Cloud Functions: how to upload additional file for use in code?
My advice is not to use a JSON key file in your Cloud Functions (or in any GCP product). With Cloud Functions, as with other GCP products, you have the ability to load the service account automatically during your deployment.
The advantage of Cloud Function identity is that you have no key file to store secretly, you don't have to rotate your key file to increase security, and there is no risk of the key file leaking.
Instead, use the default service account in your code.
If you need to get the credentials object, you can use the google-auth Python library for this.
import google.auth
credentials, project_id = google.auth.default()
You'll need to specify a relative filename instead, e.g. ./my-key-name.json, assuming the file is in the same directory as your main.py file.
I had the same problem and solved it like this:
import google.auth
credentials, _ = google.auth.default()
gc = gspread.authorize(credentials)
That should work for you.
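For completeness, a fuller sketch assuming a gspread version that accepts google-auth credentials and that the function's runtime service account has been granted access to the spreadsheet (the spreadsheet key is a placeholder):
import google.auth
import gspread

# Scopes needed for Sheets/Drive access with the default service account.
scopes = ['https://www.googleapis.com/auth/spreadsheets',
          'https://www.googleapis.com/auth/drive']
credentials, project_id = google.auth.default(scopes=scopes)
gc = gspread.authorize(credentials)
sheet = gc.open_by_key('your-spreadsheet-key')  # placeholder spreadsheet key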
My Python server script, which is running on a Google Cloud VM instance, tries to save an image (JPEG) to storage, but it throws the following error:
File "/home/thamindudj_16/server/object_detection/object_detector.py",
line 109, in detect Hand
new_img.save("slicedhand/{}#sliced_image{}.jpeg".format(threadname,
i)) File
"/home/thamindudj_16/.local/lib/python3.5/site-packages/PIL/Image.py",
line 2004, in save
fp = builtins.open(filename, "w+b")
OSError: [Errno 5] Input/output error: 'slicedhand/thread_1#sliced_image0.jpeg'
All the files, including the Python scripts, are in a Google Storage bucket that has been mounted to the VM instance using gcsfuse. The app tries to save the new image in the slicedhand folder.
Python code snippet where the image saving happens:
from PIL import Image
...
...
i = 0
new_img = Image.fromarray(bounding_box_img) ## conversion to an image
new_img.save("slicedhand/{}#sliced_image{}.jpeg".format(threadname, i))
I think maybe the problem is with access permissions. The docs say to use --key_file, but what key file should I use, and where can I find it? I'm not clear whether this is the problem or something else.
Any help would be appreciated.
I understand that you are using gcsfuse on your Linux VM instance to access Google Cloud Storage.
The key file is a service account credentials key that allows you to initialize the Cloud SDK or a client library as another service account. You can download a key file from the Cloud Console. However, if you are using a VM instance, you are automatically using the Compute Engine default service account. You can check this with the console command: $ gcloud init.
To configure your credentials properly, please follow the documentation.
The Compute Engine default service account needs the access scope Storage > Full enabled. An access scope is the mechanism that limits the level of access to Cloud APIs. This can be set during machine creation or while the VM instance is stopped.
Please note that access scopes are defined explicitly for the service account that you select for the VM instance.
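As an illustration only (the instance name and zone below are placeholders, not taken from the question), the scopes of an existing, stopped instance can be changed with gcloud:
gcloud compute instances stop my-instance --zone=us-central1-a
gcloud compute instances set-service-account my-instance --zone=us-central1-a --scopes=storage-full
gcloud compute instances start my-instance --zone=us-central1-a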
Cloud Storage object names have naming requirements. It is strongly recommended to avoid using the hash symbol "#" in object names.