Python connection to Google BigQuery using ADC

I am trying to get data from a Google BigQuery table using Python. I don't have service account access, but I have individual access to BigQuery using gcloud, and I have an application default credentials JSON file. I need to know how to make a connection to BigQuery using ADC.
Code snippet:
from google.cloud import bigquery

conn = bigquery.Client()
query = "select * from my_data.test1"
conn.query(query)
When I run the above code snippet I get an error saying:
NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x83dh46bdu640>: Failed to establish a new connection: [Errno -2] Name or service not known
Note: the environment variable GOOGLE_APPLICATION_CREDENTIALS is not set (it is empty).

Your script works for me because I authenticated using end-user credentials from the Google Cloud SDK. Once you have the SDK installed, you can simply run:
gcloud auth application-default login
The credentials from your JSON file are not being passed to the BigQuery client; you need to pass them explicitly, e.g.:
client = bigquery.Client(project=project, credentials=credentials)
To set that up you can follow these steps: https://cloud.google.com/bigquery/docs/authentication/end-user-installed
Or this thread has some good details on setting the credentials environment variable: Setting GOOGLE_APPLICATION_CREDENTIALS for BigQuery Python CLI
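If you prefer to load the end-user ADC JSON file yourself and hand the credentials to the client, here is a minimal sketch (the file path and project ID are assumptions; adjust them to your setup):

from google.cloud import bigquery
from google.oauth2.credentials import Credentials

# Assumed path: where `gcloud auth application-default login` stores the file on Linux
adc_path = "/home/you/.config/gcloud/application_default_credentials.json"
credentials = Credentials.from_authorized_user_file(adc_path)

# End-user credentials carry no project, so pass one explicitly (hypothetical ID)
client = bigquery.Client(project="my-project", credentials=credentials)
for row in client.query("select * from my_data.test1").result():
    print(dict(row))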

Related

Flask web app on Cloud Run - google.auth.exceptions.DefaultCredentialsError:

I'm hosting a Flask web app on Cloud Run. I'm also using Secret Manager to store Service Account keys. (I previously downloaded a JSON file with the keys)
In my code, I'm accessing the payload and then using os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = payload to authenticate. When I deploy the app and try to visit the page, I get an Internal Server Error. Reviewing the logs, I see:
File "/usr/local/lib/python3.10/site-packages/google/auth/_default.py", line 121, in load_credentials_from_file
raise exceptions.DefaultCredentialsError(
google.auth.exceptions.DefaultCredentialsError: File {"
I can access the secret through gcloud just fine with: gcloud secrets versions access 1 --secret="<secret_id>" while acting as the Service Account.
Here is my Python code:
# Grabbing keys from Secret Manager
def access_secret_version():
    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()
    # Build the resource name of the secret version.
    name = f"projects/{project_id}/secrets/{secret_id}/versions/1"
    # Access the secret version.
    response = client.access_secret_version(request={"name": name})
    payload = response.payload.data.decode("UTF-8")
    return payload

@app.route('/page/page_two')
def some_random_func():
    # New way
    payload = access_secret_version()  # <---- calling the payload
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = payload
    # Old way
    # os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "service-account-keys.json"
I'm not technically accessing a JSON file like I was before; the payload variable is storing the entire key. Is this why it's not working?
Your approach is incorrect.
When you run on a Google compute service like Cloud Run, the code runs under the identity of the compute service.
In this case, by default, Cloud Run uses the Compute Engine default service account but, it's good practice to create a Service Account for your service and specify it when you deploy it to Cloud Run (see Service accounts).
This mechanism is one of the "legs" of Application Default Credentials. When your code is running on Google Cloud, you don't specify the environment variable (you also don't need to create a key); the Cloud Run service acquires the credentials from the metadata service:
import google.auth
credentials, project_id = google.auth.default()
See google.auth package
It is bad practice to define or set an environment variable within code. By their nature, environment variables should be provided by the environment. Doing this with GOOGLE_APPLICATION_CREDENTIALS means that your code always sets this value, when it should only be set when the code is running off Google Cloud.
For completeness, if you need to create credentials from a JSON string rather than from a file containing a JSON string, you can use from_service_account_info (see google.oauth2.service_account).
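A minimal sketch of that, assuming payload holds the JSON key string fetched from Secret Manager as in the question:

import json
from google.oauth2 import service_account

# payload is the JSON key string returned by access_secret_version()
info = json.loads(payload)
credentials = service_account.Credentials.from_service_account_info(info)

# Any Google client can then be built with explicit credentials, for example:
# client = secretmanager.SecretManagerServiceClient(credentials=credentials)

Again, on Cloud Run prefer the attached service account; this path is only for cases where you genuinely must consume a downloaded key.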

How does client= translate.TranslationServiceClient() work in conjunction with os.environ['GOOGLE_APPLICATION_CREDENTIALS']

I am using Python and an Azure Function App to send a document to be translated using the Google Cloud Translation API.
I am trying to load the credentials from a temp file (JSON) using the code below. The idea is to later download the JSON file from Blob Storage and store it in a temp file, but I am not thinking about the blob storage for now.
import json
import os
import tempfile
from google.cloud import translate

key = {cred info}
f = tempfile.NamedTemporaryFile(suffix='.json', mode='a+')
json.dump(key, f)
f.flush()
f.seek(0)
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = f.name
client = translate.TranslationServiceClient()
But when I run this I get the following error:
Exception: PermissionError: [Errno 13] Permission denied:
How can I correctly load the creds from a temp file? Also, what is the relationship between translate.TranslationServiceClient() and os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = f.name? Does TranslationServiceClient() get the creds from the environment variable?
I have been looking at this problem for a while now and I cannot find a good solution. Any help would be amazing!
Edit:
When I change it to
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = f.read()
I get a different error:
System.Private.CoreLib: Exception while executing function:
Functions.Trigger. System.Private.CoreLib: Result: Failure
Exception: DefaultCredentialsError:
EDIT 2:
It's really weird, but it works when I read the file just before, like so:
contents = f.read()
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = f.name
client = translate.TranslationServiceClient()
Any ideas why?
Any application which connects to any GCP product requires credentials to authenticate. There are many ways this authentication can work.
According to the Google doc
Additionally, we recommend you use Google Cloud Client Libraries for your application. Google Cloud Client Libraries use a library called Application Default Credentials (ADC) to automatically find your service account credentials. ADC looks for service account credentials in the following order:
If the environment variable GOOGLE_APPLICATION_CREDENTIALS is set, ADC uses the service account key or configuration file that the variable points to.
If the environment variable GOOGLE_APPLICATION_CREDENTIALS isn't set, ADC uses the service account that is attached to the resource that is running your code.
This service account might be a default service account provided by Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, or Cloud Functions. It might also be a user-managed service account that you created.
If ADC can't use any of the above credentials, an error occurs.
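In code, that resolution order is what a plain google.auth.default() call walks through; a minimal sketch:

import google.auth

# Checks the GOOGLE_APPLICATION_CREDENTIALS file first, then the service
# account attached to the running resource (Compute Engine, Cloud Run, ...),
# and raises google.auth.exceptions.DefaultCredentialsError if neither exists.
credentials, project = google.auth.default()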
There are also modules provided by Google that can be used to pass the credentials.
If you already have the JSON value as a dictionary, then you can simply pass the dictionary to from_service_account_info(key).
Example:
key = json.load(open("JSON File Path"))  # loading my JSON file into a dictionary
client = translate.TranslationServiceClient.from_service_account_info(key)
In your case you already have the key as a dictionary.
As for the error you are getting, I believe it has to do with the temp file, because GOOGLE_APPLICATION_CREDENTIALS needs a readable JSON file path to load the credentials from.
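One way to sidestep the temp file entirely, sketched under the assumption that key is the in-memory service account dictionary from the question:

from google.cloud import translate
from google.oauth2 import service_account

# Build credentials directly from the dictionary and pass them to the client,
# so neither GOOGLE_APPLICATION_CREDENTIALS nor the temp file is needed.
credentials = service_account.Credentials.from_service_account_info(key)
client = translate.TranslationServiceClient(credentials=credentials)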

google.cloud.bigquery.Client() ignoring provided scopes, resulting in Permission denied while getting Drive credentials

I am trying to query data stored in Drive via the google.cloud.bigquery Python library.
I've followed Google's guide for Querying Drive Data.
Thus, my code looks like this:
import google.auth
from google.cloud import bigquery
credentials, project = google.auth.default(
    scopes=[
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/bigquery",
    ]
)

client = bigquery.Client(project, credentials)

query = client.query("""MY SQL HERE""")
query_results = query.result()
The issue is: the credentials object and BigQuery client ignore the provided scopes, resulting in google.api_core.exceptions.Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials. To clarify, neither credentials nor client includes the Drive scope provided.
What can I do to properly pass the drive scope to my bigquery client?
My application default credentials for my local environment are my authorized user, which is the owner of the project.
Once you've set your application default credentials as an authorized user, you cannot request additional scopes.
To request additional scopes, do so during activation of your authorized user.
More plainly, when you run gcloud auth application-default login, provide the --scopes option, followed by your desired scopes. For me, that was gcloud auth application-default login --scopes=openid,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/bigquery
You should share the spreadsheet with the service account email you are authenticating as.

BigQuery: Permission denied while getting Drive credentials - Unable to resolve the error

I was hoping to get some help with this error code I have been coming across.
Context:
The company I work for use the GSUITE product.
My team have their own Cloud Project setup.
Google Drive isn't a "personal" drive.
We utilise Airflow to refresh our BigQuery tables on a daily/weekly/monthly basis.
I have followed these solutions
Access Denied: Permission denied while getting Drive credentials
"Encountered an error while globbing file pattern" error when using BigQuery API w/ Google Sheets
And also referenced
https://cloud.google.com/bigquery/external-data-drive#python_3
Problem
Cloud Composer : v 1.12.0
I have recently setup an external Bigquery table that reads a tab within a Google Sheet. My Airflow DAG has been failing to complete due to the access restriction to Drive.
I have added the following to the Airflow connection scopes:
[Screenshot: Airflow connection scopes]
And also added the service account e-mail address to the Google Sheet the table is referencing via Share. I have also updated the Service account IAM roles to BigQuery admin. After following these steps, I still receive the error BigQuery: Permission denied while getting Drive credentials.
Problem 2
Following the above, I found it easier to troubleshoot locally, so I created a venv on my machine because it's where I'm most comfortable troubleshooting. The goal is to simply query a BigQuery table that reads a Google Sheet. However, after following the same steps as above, I am still unable to get this to work.
My local code:
import dotenv
import pandas as pd
from google.cloud import bigquery
import google.auth

def run_BigQuery_table(sql):
    dotenv.load_dotenv()
    credentials, project = google.auth.default(
        scopes=[
            "https://www.googleapis.com/auth/cloud-platform",
            "https://www.googleapis.com/auth/drive",
            "https://www.googleapis.com/auth/bigquery",
        ]
    )
    bigquery.Client(project, credentials)
    output = pd.read_gbq(sql, project_id=project, dialect='standard')
    return output

script_variable = "SELECT * FROM `X` LIMIT 10"
bq_output = run_BigQuery_table(script_variable)
print(bq_output)
My error:
raise self._exception
google.api_core.exceptions.Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
raise GenericGBQException("Reason: {0}".format(ex))
pandas_gbq.gbq.GenericGBQException: Reason: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
Is anyone able to help?
Cheers
So a colleague suggested that I explore the default pandas_gbq credentials, as this might be using default credentials to access the data.
Turns out, it worked.
You can manually set the pandas-gbq credentials by following this:
https://pandas-gbq.readthedocs.io/en/latest/howto/authentication.html
https://pandas-gbq.readthedocs.io/en/latest/api.html#pandas_gbq.Context.credentials
I simply added the following to my code
pdgbq.context.credentials = credentials
The final output:
import dotenv
import pandas as pd
from google.cloud import bigquery
import google.auth
import pandas_gbq as pdgbq

def run_BigQuery_table(sql):
    dotenv.load_dotenv()
    credentials, project = google.auth.default(
        scopes=[
            "https://www.googleapis.com/auth/cloud-platform",
            "https://www.googleapis.com/auth/drive",
            "https://www.googleapis.com/auth/bigquery",
        ]
    )
    pdgbq.context.credentials = credentials
    bigquery.Client(project, credentials)
    output = pd.read_gbq(sql, project_id=project, dialect='standard')
    return output

script_variable = "SELECT * FROM `X` LIMIT 10"
bq_output = run_BigQuery_table(script_variable)
print(bq_output)
I often get these errors, and the vast majority were solved by creating and sharing service accounts. However, I recently had a case where our GSuite administrator updated security settings so that only our employees could access GSuite-related things (spreadsheets, storage etc). It was an attempt to plug a security gap, but in doing so, any email address or service account that did not have an @ourcompany.com domain was blocked from using BigQuery.
I recommend you explore your company's GSuite settings and see if external access is blocked. I cannot say this is the fix for your case, but it was for me, so it could be worth trying.

Google cloud speech api throwing 403 when trying to use it

I'm using Python with the Google Cloud Speech API. I did all the steps in "How to use google speech recognition api in python?" on Ubuntu and on Windows as well, and when I try to run the simple script from here - "https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api/speech_rest.py"
I get the following error:
<HttpError 403 when requesting https://speech.googleapis.com/$discovery/rest?version=v1beta1 returned "Google Cloud Speech API has not been used in project google.com:cloudsdktool before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=google.com:cloudsdktool then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.">
What is weird is that I don't have a project by the name "cloudsdktool".
I ran "gcloud init" and linked the JSON file that I got when I created the service account key with the "gcloud auth activate-service-account --key-file=jsonfile" command.
I tried in Linux to create the Google credentials environment variable and I still get the same message.
So I found two ways to fix that problem:
1 - If you are using the Google Cloud SDK and Cloud Speech is in beta, you need to run 'gcloud beta init' instead of 'gcloud init' and then provide the JSON file.
2 - If you don't want to use Google's Cloud SDK, you can pass the JSON file straight into the Python app.
Here is the method for this:
from oauth2client.client import GoogleCredentials
GoogleCredentials.from_stream('path/to/your/json')
Then you just create a scope on the creds and authorize, or if using gRPC (streaming) you pass it in the header just like in the example below.
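For the non-streaming (REST) path, a rough sketch of that scoping/authorizing step, assuming the now-deprecated oauth2client and httplib2 libraries the original sample relied on and the cloud-platform scope:

import httplib2
from oauth2client.client import GoogleCredentials

SPEECH_SCOPE = 'https://www.googleapis.com/auth/cloud-platform'

# Load the service account key and restrict it to the Speech scope
creds = GoogleCredentials.from_stream('path/to/your/json').create_scoped([SPEECH_SCOPE])

# Authorize an HTTP object; REST/discovery calls made with it will carry the token
http = creds.authorize(httplib2.Http())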
Here is the changed script for gRPC:
def make_channel(host, port):
    """Creates an SSL channel with auth credentials from the environment."""
    # In order to make an https call, use an ssl channel with defaults
    ssl_channel = implementations.ssl_channel_credentials(None, None, None)

    # Grab application default credentials from the environment
    creds = GoogleCredentials.from_stream('path/to/your/json').create_scoped([SPEECH_SCOPE])

    # Add a plugin to inject the creds into the header
    auth_header = (
        'Authorization',
        'Bearer ' + creds.get_access_token().access_token)
    auth_plugin = implementations.metadata_call_credentials(
        lambda _, cb: cb([auth_header], None),
        name='google_creds')

    # compose the two together for both ssl and google auth
    composite_channel = implementations.composite_channel_credentials(
        ssl_channel, auth_plugin)

    return implementations.secure_channel(host, port, composite_channel)
