Access data via BigQuery in Python

I am trying to access data in Python using the BigQuery API; here is my code.
I have placed the PEM file inside the same folder, but the script returns an error: "googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/digin-1086/queries?alt=json returned "Not found: Table digin-1086:dataset.my_table""
from bigquery import get_client

# BigQuery project id as listed in the Google Developers Console.
project_id = 'digin-1086'
# Service account email address as listed in the Google Developers Console.
service_account = '77441948210-4fhu1kc1driicjecriqupndkr60npnh@developer.gserviceaccount.com'
# PKCS12 or PEM key provided by Google.
key = 'Digin-d6387c00c5a'

client = get_client(project_id, service_account=service_account,
                    private_key_file=key, readonly=True)

# Submit an async query.
job_id, _results = client.query('SELECT * FROM dataset.my_table LIMIT 1000')

# Check if the query has finished running.
complete, row_count = client.check_job(job_id)

# Retrieve the results.
results = client.get_query_rows(job_id)

The error says it can't find your table, so it has nothing to do with the PEM file. You need to make sure the table exists in the dataset.
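As a quick sanity check before querying, you can verify that the table exists. A minimal sketch, using the newer google-cloud-bigquery client rather than the bigquery package from the question (project and table names are the ones from the error message):

from google.cloud import bigquery
from google.api_core.exceptions import NotFound

client = bigquery.Client(project='digin-1086')

try:
    # get_table raises NotFound if the table does not exist
    table = client.get_table('digin-1086.dataset.my_table')
    print('Table exists with {} rows'.format(table.num_rows))
except NotFound:
    print('Table digin-1086:dataset.my_table does not exist - create it or fix the name')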

To access data via BigQuery in Python you can do the following:

from google.cloud import bigquery
from google.oauth2 import service_account
from google.auth.transport import requests

credentials = service_account.Credentials.from_service_account_file(
    r'filelocation\xyz.json')
project_id = 'abc'
client = bigquery.Client(credentials=credentials, project=project_id)

query_job = client.query("""
    SELECT *
    FROM tablename
    LIMIT 10""")
results = query_job.result()
for row in results:
    print(row)

Related

azure-devops-python-api query for work item where field == string

I'm using the Azure Python API (https://github.com/microsoft/azure-devops-python-api) and I need to be able to query and find a specific work item based on a custom field value.
The closest thing I can find is the function create_query, but I'm hoping to be able to run a query such as
queryRsp = wit_5_1_client.run_query(
    posted_query='',
    project=project.id,
    query='Custom.RTCID=282739'
)
I just need to find my Azure DevOps work item where the custom field RTCID has a certain specific unique value.
Do I need to create a query with the API, run it, get the results, then delete the query? Or is there any way I can run this simple query and get the results using the Azure DevOps API?
Your requirement can be achieved.
For example, on my side, there are two work items that have the custom field 'RTCID':
Below is how to use Python to implement this (on my side, both the organization and the project are named 'BowmanCP'):
# Query work items from Azure DevOps
from azure.devops.connection import Connection
from msrest.authentication import BasicAuthentication
from azure.devops.v5_1.work_item_tracking.models import Wiql
import pprint

# Fill in with your personal access token and org URL
personal_access_token = '<Your Personal Access Token>'
organization_url = 'https://dev.azure.com/BowmanCP'

# Create a connection to the org
credentials = BasicAuthentication('', personal_access_token)
connection = Connection(base_url=organization_url, creds=credentials)

# Get a client (the "core" client provides access to projects, teams, etc)
core_client = connection.clients.get_core_client()

# Query work items where the custom field 'RTCID' has a certain specific unique value
work_item_tracking_client = connection.clients.get_work_item_tracking_client()
query = "SELECT [System.Id], [System.WorkItemType], [System.Title], [System.AssignedTo], [System.State], [System.Tags] FROM workitems WHERE [System.TeamProject] = 'BowmanCP' AND [Custom.RTCID] = 'xxx'"

# Convert the query string to WIQL
wiql = Wiql(query=query)
query_results = work_item_tracking_client.query_by_wiql(wiql).work_items

# Get the results and print each title
for item in query_results:
    work_item = work_item_tracking_client.get_work_item(item.id)
    pprint.pprint(work_item.fields['System.Title'])
Successfully got them on my side:
SDK source code is here:
https://github.com/microsoft/azure-devops-python-api/blob/451cade4c475482792cbe9e522c1fee32393139e/azure-devops/azure/devops/released/work_item_tracking/work_item_tracking_client.py#L704
You can refer to the source code above.
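If you need the full field values for many matching items, the work item tracking client also exposes a batch call instead of one get_work_item request per result. A small sketch, assuming the query_results list from the code above:

# Fetch all matched work items in one batch call
ids = [item.id for item in query_results]
if ids:
    work_items = work_item_tracking_client.get_work_items(ids=ids)
    for wi in work_items:
        print(wi.id, wi.fields.get('System.Title'), wi.fields.get('Custom.RTCID'))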

Permissions issues when querying a table created from Google Sheets

I have created a BigQuery table using a Google Sheet as a source.
I am trying to query the table with some Python script.
from google.cloud import bigquery
from google.oauth2 import service_account
from google.auth.transport import requests

credentials = service_account.Credentials.from_service_account_file(
    r"[key location]")
project_id = '[PROJECT]'
client = bigquery.Client(credentials=credentials, project=project_id)

query_job = client.query("""
    SELECT *
    FROM [TABLENAME]
    LIMIT 10""")
results = query_job.result()
However, I am receiving the following error.
Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
I have successfully used the above code to query another table (not from a Sheet source), so the issue is specifically to do with the table sourced from Sheets. I have tried running the code both on a cloud resource (using a service account) and locally.
Does anyone know the fix?
You must enable Drive access for BigQuery by using the code from this documentation to create credentials for both BigQuery and Google Drive. Please note that both the BigQuery API and the Google Drive API must be enabled before running the code.
I updated the code from your question and used it in my testing as shown below:
from google.cloud import bigquery
import google.auth

# Create credentials with Drive & BigQuery API scopes.
# Both APIs must be enabled for your project before running this code.
credentials, project = google.auth.default(
    scopes=[
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/bigquery",
    ]
)

project_id = '[PROJECT]'
client = bigquery.Client(credentials=credentials, project=project_id)

query_job = client.query("""
    SELECT *
    FROM [TABLENAME]
    LIMIT 10""")
results = query_job.result()

# Printing results for my testing
for row in results:
    row1 = row['string_field_0']
    row2 = row['string_field_1']
    print(f'{row1} | {row2}')
Output:

Error loading local CSV data to BigQuery using Python

I'm new to the data engineering field and want to create a table and insert data into BigQuery using Python, but in the process I got an error message.
Even though I already set the google_application_credential through the shell, the error message still appears.
Here is my code:
from google.cloud import bigquery
from google.cloud import language
from google.oauth2 import service_account
import os

os.environ["GOOGLE_APPLICATION_CREDENTIAL"] = r"C:/Users/Pamungkas/Downloads/testing-353407-a3c774efeb5a.json"

client = bigquery.Client()

table_id = "testing-353407.testing_field.sales"
file_path = r"C:\Users\Pamungkas\Downloads\sales.csv"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE  # added to have a truncate-and-insert load
)

with open(file_path, "rb") as source_file:
    job = client.load_table_from_file(source_file, table_id, job_config=job_config)

job.result()  # Waits for the job to complete.

table = client.get_table(table_id)  # Make an API request.
print(
    "Loaded {} rows and {} columns to {}".format(
        table.num_rows, len(table.schema), table_id
    )
)
As @p13rr0m suggested, you should use the environment variable GOOGLE_APPLICATION_CREDENTIALS instead of GOOGLE_APPLICATION_CREDENTIAL to resolve your issue.
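For illustration, a minimal sketch with the corrected variable name (the key file path is the one from the question); alternatively, the key file can be passed to the client directly so the environment variable isn't needed at all:

import os
from google.cloud import bigquery

# Correct variable name: GOOGLE_APPLICATION_CREDENTIALS (with the trailing S)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r"C:/Users/Pamungkas/Downloads/testing-353407-a3c774efeb5a.json"
client = bigquery.Client()

# Or skip the environment variable and point the client at the key file explicitly
client = bigquery.Client.from_service_account_json(
    r"C:/Users/Pamungkas/Downloads/testing-353407-a3c774efeb5a.json")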

Cannot connect to query from python using Service account with BigQuery Job User role (in IAM)

The cloud team granted me a service account (SA) with the BigQuery Job User role (IAM) for querying from Python.
But I got this issue:
403 request failed: the user does not have
'bigquery.readsessions.create' permission to project
But this SA works in Java. Looking for ideas to solve this, thanks ^^
My simple code here:
import pandas as pd
import google.cloud.bigquery as gbq

gbq_client = gbq.Client.from_service_account_json('credentials/credentials_bigquery.json')

# Function to run a query and return a DataFrame
def query_job(client, query):
    query_job = client.query(query)  # Make an API request.
    df_from_bq = query_job.to_dataframe()
    return df_from_bq

# Query
qr_user = """
SELECT user_id FROM `CUSTOMER_DATA_PLATFORM.CDP_TBL` LIMIT 1000
"""
user = query_job(gbq_client, qr_user)
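The 'bigquery.readsessions.create' permission belongs to the BigQuery Read Session User role, which the BigQuery Storage API requires, and to_dataframe() tries to use that API by default in recent client versions. Assuming that is what triggers the 403 here, a minimal sketch of a workaround is either to request that role for the SA or to skip the Storage API for the download:

# Fall back to the plain REST download instead of the BigQuery Storage API,
# so the SA only needs the BigQuery Job User role (assumption: the 403 comes
# from to_dataframe() creating a read session).
def query_job(client, query):
    query_job = client.query(query)
    return query_job.to_dataframe(create_bqstorage_client=False)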

Google Bigquery API: How to get the name of temporary table

I know that the result of a query is saved in a temporary table, but how do I get its name to use in another query?
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from oauth2client.client import GoogleCredentials

def test():
    project_id = "598330041668"

    credentials = GoogleCredentials.get_application_default()
    bigquery_service = build('bigquery', 'v2', credentials=credentials)

    # [START run_query]
    query_request = bigquery_service.jobs()
    query_data = {
        'query': (
            'SELECT * '
            'FROM [test.names];')
    }

    query_response = query_request.query(
        projectId=project_id,
        body=query_data).execute()
    # [END run_query]
Find the job ID of the query you ran. This is present in the response from jobs.query in the jobReference field.
Then look up the query configuration for that job using the jobs.get method and inspect the field configuration.query.destinationTable.
However, note that if you didn't specify your own destination table, BigQuery assigns a temporary one that should only be used with certain APIs, which do not include other queries:
The query results from this method are saved to a temporary table that is deleted approximately 24 hours after the query is run. You can read this results table by calling either bigquery.tabledata.list(table_id) or bigquery.jobs.getQueryResults(job_reference). The table and dataset name are non-standard, and cannot be used in any other APIs, as the behavior may be unpredictable.
If you want to use the results of a query in a subsequent query, I suggest setting configuration.query.destinationTable yourself to write to a table of your choice.
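To illustrate both steps with the same googleapiclient setup as in the question, here is a minimal sketch (the my_dataset and my_results names are placeholders, not from the original post):

# Look up the (temporary) destination table of the job that was just run
job_ref = query_response['jobReference']
job = query_request.get(projectId=job_ref['projectId'],
                        jobId=job_ref['jobId']).execute()
dest = job['configuration']['query']['destinationTable']
print(dest['projectId'], dest['datasetId'], dest['tableId'])

# Or set your own destination table up front via jobs.insert,
# so later queries can reference it directly (placeholder names)
insert_body = {
    'configuration': {
        'query': {
            'query': 'SELECT * FROM [test.names];',
            'destinationTable': {
                'projectId': project_id,
                'datasetId': 'my_dataset',
                'tableId': 'my_results'
            },
            'writeDisposition': 'WRITE_TRUNCATE'
        }
    }
}
insert_response = bigquery_service.jobs().insert(
    projectId=project_id, body=insert_body).execute()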
