How to run Python code with an impersonated service account - python

I am coming here after searching Google, but I was not able to find an answer I could understand. Kindly help me with my query.
If I want to access a GCP resource using an impersonated service account, I know I can do it with commands like the following, for example to list the buckets in a project:
gsutil -i service-account-id ls -p project-id
But how can I run Python code (e.g. test1.py) that accesses resources using an impersonated service account?
Is there a package or class that I need to use? If yes, how do I use it? Please find below the scenario and code:
I have a Pub/Sub topic hosted in project-A, whose owner is xyz@gmail.com, and I have Python code hosted in project-B, whose owner is abc@gmail.com.
In project-A I have created a service account with the Pub/Sub Admin role, and I have granted abc@gmail.com permission to impersonate it. Now how can I access the Pub/Sub topic from my Python code in project-B without using keys?
"""Publishes multiple messages to a Pub/Sub topic with an error handler."""
import os
from collections.abc import Callable
from concurrent import futures
from google.cloud import pubsub_v1
# os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\gcp_poc\key\my-GCP-project.JSON"
project_id = "my-project-id"
topic_id = "myTopic1"
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_id)
publish_futures = []
def get_callback(publish_future: pubsub_v1.publisher.futures.Future, data: str) -> Callable[[pubsub_v1.publisher.futures.Future], None]:
def callback(publish_future: pubsub_v1.publisher.futures.Future) -> None:
try:
# Wait 60 seconds for the publish call to succeed.
print(f"Printing the publish future result here: {publish_future.result(timeout=60)}")
except futures.TimeoutError:
print(f"Publishing {data} timed out.")
return callback
for i in range(4):
data = str(i)
# When you publish a message, the client returns a future.
publish_future = publisher.publish(topic_path, data.encode("utf-8"))
# Non-blocking. Publish failures are handled in the callback function.
publish_future.add_done_callback(get_callback(publish_future, data))
publish_futures.append(publish_future)
# Wait for all the publish futures to resolve before exiting.
futures.wait(publish_futures, return_when=futures.ALL_COMPLETED)
print(f"Published messages with error handler to {topic_path}.")

Create a service account with the appropriate role. Then create a service account key file and download it. Put the path of the key file in the GOOGLE_APPLICATION_CREDENTIALS environment variable; the client library will pick up that key file and use it for authentication and authorization. Please read the official documentation to learn more about how Application Default Credentials work.
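If you want to avoid key files entirely, as the question asks, the google-auth library also ships an impersonated_credentials module. A minimal sketch, assuming your own Application Default Credentials hold the Service Account Token Creator role on the target service account (the service account email below is a placeholder):

import google.auth
from google.auth import impersonated_credentials
from google.cloud import pubsub_v1

# Whatever credentials ADC finds for the caller (e.g. your gcloud login).
source_credentials, _ = google.auth.default()

# Exchange them for short-lived tokens of the target service account.
target_credentials = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal="my-sa@project-a.iam.gserviceaccount.com",  # placeholder
    target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Any client built with these credentials now acts as the service account.
publisher = pubsub_v1.PublisherClient(credentials=target_credentials)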

Get Cloud Storage upload response

I am uploading a file to a Cloud Storage bucket using the Python SDK:
import pandas as pd
from google.cloud import storage

bucket = storage.Client().get_bucket('mybucket')

df = ...  # pandas DataFrame to save
csv = df.to_csv(index=False)

output = 'test.csv'
blob = bucket.blob(output)
blob.upload_from_string(csv)
How can I get the response to know whether the file was uploaded successfully? I need to log the response to notify the user about the operation.
I tried:
response = blob.upload_from_string(csv)
but it always returns None, even when the operation has succeeded.
You can try the tqdm library to show upload progress:
import os

from google.cloud import storage
from tqdm import tqdm

def upload_function(client, bucket_name, source, dest, content_type=None):
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(dest)
    with open(source, "rb") as in_file:
        total_bytes = os.fstat(in_file.fileno()).st_size
        # Wrap the file's read() so tqdm renders a progress bar as bytes are consumed.
        with tqdm.wrapattr(in_file, "read", total=total_bytes, miniters=1,
                           desc="upload to %s" % bucket_name) as file_obj:
            blob.upload_from_file(
                file_obj,
                content_type=content_type,
                size=total_bytes,
            )
    return blob

if __name__ == "__main__":
    upload_function(storage.Client(), "bucket", r"C:\files\blob.txt", "blob.txt", "text/plain")
Regarding how to get notifications about changes made to buckets, there are a few approaches you could also try:
Using Pub/Sub - This is the recommended way: Pub/Sub notifications send information about changes to objects in your buckets to a Pub/Sub topic of your choice, in the form of messages (a short sketch follows this list). Here you will find an example using Python, as in your case, as well as other ways such as gsutil, other supported languages, or the REST APIs.
Object change notification with watchbucket: This creates a notification channel that sends notification events for the given bucket to the given application URL, using a gsutil command.
Cloud Functions with Google Cloud Storage triggers: event-driven functions handle events from Google Cloud Storage; you configure these notifications to trigger in response to various events inside a bucket, such as object creation, deletion, archiving, and metadata updates. Here is some documentation on how to implement it.
Another way is using Eventarc to build event-driven architectures; it offers a standardized solution to manage the flow of state changes, called events, between decoupled microservices. Eventarc routes these events to Cloud Run while managing delivery, security, authorization, observability, and error handling for you. Here there is a guide on how to implement it.
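As a quick illustration of the first option, a sketch using the google-cloud-storage client (the bucket and topic names are placeholders; the topic must already exist and the GCS service agent needs the Pub/Sub Publisher role on it):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")  # placeholder bucket name

# Attach a notification config: object changes in the bucket are then
# published as messages to the given Pub/Sub topic.
notification = bucket.notification(topic_name="my-topic")  # placeholder topic
notification.create()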
Here you'll be able to find related posts with the same issue and answers:
Using Storage-triggered Cloud Function.
With Object Change Notification and Cloud Pub/Sub Notifications for Cloud Storage.
Answer with a Cloud Pub/Sub topic example.
You can also verify whether the upload raised an error and use the exception's response attributes; note that upload_from_string returns None by design, so success is signaled by the absence of an exception:
def upload(blob, content):
    try:
        blob.upload_from_string(content)
    except Exception as e:
        # google-api-core exceptions carry the underlying HTTP response.
        status_code = e.response.status_code
        status_desc = e.response.json()['error']['message']
    else:
        status_code = 200
        status_desc = 'success'
    finally:
        return status_code, status_desc
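Usage would look like this, reusing the blob and csv from the question's snippet:

status_code, status_desc = upload(blob, csv)
print(status_code, status_desc)  # e.g. 200 success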
Refs:
https://googleapis.dev/python/google-api-core/latest/_modules/google/api_core/exceptions.html
https://docs.python.org/3/tutorial/errors.html

How to handle authentication with the Strava API using Python?

I am starting a small Python script (not an application) that can upload my *.fit activity files to Strava whenever they are created in a desired folder.
The main steps I plan to take are:
1. monitor *.fit file system modifications
2. authenticate to Strava so that my program can upload files
(This tool will be for personal use only, so I expect not to have to authenticate on every upload.)
3. upload the file to my Strava account
4. run this fixed routine automatically with the help of the Windows Task Scheduler
(For example, there will be 4-5 new riding activities generated in my computer folder; I expect this tool to automatically upload all of them once a week so that I do not need to do it manually.)
For step 2, I really have no idea how to implement it, even after reading through the Strava Authentication Documentation and several source codes other people have developed (e.g. toravir's "rk2s (RunKeeper 2 Strava)" project on GitHub). I gathered that some Python modules like stravalib, swagger_client, requests, and json, as well as concepts like OAuth2, may be relevant to step 2, but I still cannot put everything together...
Can anyone experienced give me some advice on implementing step 2? Any related reading would also be perfect!
Advice on other parts of this project is also very welcome and appreciated.
Thank you very much in advance :)
This is a code example of how you can access the Strava API; check out this gist or use the code below:
import time
import pickle

from fastapi import FastAPI
from fastapi.responses import RedirectResponse
from stravalib.client import Client

CLIENT_ID = 'GET FROM STRAVA API SITE'
CLIENT_SECRET = 'GET FROM STRAVA API SITE'
REDIRECT_URL = 'http://localhost:8000/authorized'

app = FastAPI()
client = Client()

def save_object(obj, filename):
    with open(filename, 'wb') as output:  # Overwrites any existing file.
        pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

def load_object(filename):
    with open(filename, 'rb') as in_file:
        loaded_object = pickle.load(in_file)
    return loaded_object

def check_token():
    # Refresh the access token if it has expired.
    if time.time() > client.token_expires_at:
        refresh_response = client.refresh_access_token(
            client_id=CLIENT_ID, client_secret=CLIENT_SECRET,
            refresh_token=client.refresh_token)
        access_token = refresh_response['access_token']
        refresh_token = refresh_response['refresh_token']
        expires_at = refresh_response['expires_at']
        client.access_token = access_token
        client.refresh_token = refresh_token
        client.token_expires_at = expires_at

@app.get("/")
def read_root():
    authorize_url = client.authorization_url(client_id=CLIENT_ID, redirect_uri=REDIRECT_URL)
    return RedirectResponse(authorize_url)

@app.get("/authorized/")
def get_code(state=None, code=None, scope=None):
    token_response = client.exchange_code_for_token(
        client_id=CLIENT_ID, client_secret=CLIENT_SECRET, code=code)
    access_token = token_response['access_token']
    refresh_token = token_response['refresh_token']
    expires_at = token_response['expires_at']
    client.access_token = access_token
    client.refresh_token = refresh_token
    client.token_expires_at = expires_at
    save_object(client, 'client.pkl')
    return {"state": state, "code": code, "scope": scope}

try:
    client = load_object('client.pkl')
    check_token()
    athlete = client.get_athlete()
    print("For {id}, I now have an access token {token}".format(id=athlete.id, token=client.access_token))
    # To upload an activity:
    # client.upload_activity(activity_file, data_type, name=None, description=None,
    #                        activity_type=None, private=None, external_id=None)
except FileNotFoundError:
    print("No access token stored yet, visit http://localhost:8000/ to get it")
    print("After visiting that url, a pickle file is stored, run this file again to upload your activity")
Download that file, install the requirements, and run it (assuming the file is named main.py):
pip install stravalib
pip install fastapi
pip install uvicorn
uvicorn main:app --reload
I believe you need to authenticate using OAuth in order to upload your activity, which pretty much requires you to have a web server set up that Strava can post back to after you click "Authorize". I just set the authentication piece up using Rails & Heroku.
This link has a pretty good flowchart of what needs to happen:
https://developers.strava.com/docs/authentication/
Actually, it looks like if you go to your API Settings page you can get your access token and refresh token there. I would also check out the Python Strava library (stravalib), with which you could do something like:
from stravalib.client import Client

access_token = 'your_access_token_from_your_api_application_settings_page'
refresh_token = 'your_refresh_token_from_your_api_application_settings_page'

# Pass the token in; a bare Client() has no credentials and get_athlete() would fail.
client = Client(access_token=access_token)
athlete = client.get_athlete()
You may need to dig in a little more to that library to figure out the upload piece.
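For the upload piece, the earlier answer's comment already shows the upload_activity signature; a sketch of using it, where the file name, token, and activity name are placeholders:

from stravalib.client import Client

client = Client(access_token='your_access_token')  # placeholder token

# upload_activity returns an uploader object; wait() polls Strava until the
# file has been processed and returns the resulting activity.
with open('ride.fit', 'rb') as activity_file:
    uploader = client.upload_activity(activity_file, data_type='fit', name='Morning ride')
    activity = uploader.wait()
print(activity.id)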

Switching IAM-user roles with Athena and boto3

I am writing a Python program using boto3 that grabs all of the queries made by a master account and pushes them out to all of the master account's sub accounts.
Grabbing the query IDs from the master instance is done, but I'm having trouble pushing them out to the sub accounts. With my authentication information AWS connects to the master account by default, but I can't figure out how to get it to connect to a sub account. Generally AWS services do this by switching roles, but Athena doesn't have a built-in method for this. I could manually create different profiles, but I'm not sure how to switch between them in the middle of code execution.
Here's Amazon's code example for switching roles via STS, which does support assuming different roles: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-api.html
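(For context, the assume-role pattern from that page looks roughly like the following sketch; the role ARN and session name below are placeholders:)

import boto3

# Ask STS for temporary credentials for a role in the sub account.
sts = boto3.client('sts')
assumed = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/SubAccountRole',  # placeholder
    RoleSessionName='athena-cross-account',
)
creds = assumed['Credentials']

# Build an Athena client that acts inside the sub account.
athena = boto3.client(
    'athena',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)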
Here's what my program looks like so far:
#!/usr/bin/env python3
import boto3

dev = boto3.session.Session(profile_name='dev')

# Client for executing Athena queries
client = dev.client('athena')

s3_input = 's3://dev/test/'
s3_output = 's3://dev/testOutput'
database = 'ex_athena_db'
table = 'test_data'

response = client.list_named_queries(
    MaxResults=50,
    WorkGroup='primary'
)
print(response)
So I have the "dev" profile, but I'm not sure how to differentiate this profile to indicate to AWS that I'd like to access one of the child accounts. Is it just the name, or do I need some other parameter? I don't think I can (or need to) generate a separate authentication token for this.
I solved this by creating a new profile for the sub account, pointing at the sub account's role ARN.
Sample config:
[default]
region = us-east-1
[profile ecr-dev]
role_arn = arn:aws:iam::76532435:role/AccountRole
source_profile = default
Sample code:
#!/usr/bin/env python3
import boto3

# Use the profile defined in the config above; boto3 assumes the role in
# role_arn automatically, using the source_profile credentials.
dev = boto3.session.Session(profile_name='ecr-dev', region_name="us-east-1")

# Client for executing Athena queries
client = dev.client('athena')

s3_input = 's3://test/'
s3_output = 's3://test'
database = 'ex_athena_db'

response = client.list_named_queries(
    MaxResults=50,
    WorkGroup='primary'
)
print(response)

boto3 add unix credentials, urllib2/3

Can we add the Unix user's credentials to the SQS client?
Situation:
I have a user on a Unix system (without sudo/root privileges), and this user's permissions do not allow me to connect to amazonaws.com.
Can I pass the username and password of this Unix user to the boto3 SQS client?
Or, can I route the boto3 receive-message call through urllib2/3?
What I have already tried:
import boto3

# Create SQS client
sqs = boto3.client('sqs', region_name='eu-west-1')
queue_url = 'https://sqs.eu-west-1.amazonaws.com/XXXX'

# Receive message from SQS queue
response = sqs.receive_message(
    QueueUrl=queue_url,
    AttributeNames=[
        'SentTimestamp'
    ],
    MaxNumberOfMessages=10,
    MessageAttributeNames=[
        'All'
    ],
    VisibilityTimeout=0,
    WaitTimeSeconds=0
)
I can access any URL using urllib2 to scrape it, and I am able to pass the Unix credentials while doing so.
As the root user, I am able to get messages from Amazon SQS using boto3, as in the code above.
No, that is not possible. From your code, it looks like the AWS credentials are stored under the root user's home directory, which cannot be accessed by a user without sudo privileges. This is a Linux permissions issue; it has nothing to do with boto3.
Find a way to let the user read the AWS credentials, for example a symlink (or a copy) of the credentials file with read permissions, as sketched below.
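For instance, if a readable copy of the credentials file can be placed somewhere the unprivileged user can access, boto3 can be pointed at it via the AWS_SHARED_CREDENTIALS_FILE environment variable; a sketch, where the path is a placeholder:

import os
import boto3

# Point boto3 at a readable copy of the shared credentials file instead of
# the default ~/.aws/credentials under the root user's home directory.
os.environ['AWS_SHARED_CREDENTIALS_FILE'] = '/opt/shared/aws_credentials'

sqs = boto3.client('sqs', region_name='eu-west-1')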

Stackdriver Google Python API Access Denied

When trying to create a sink using the Google Cloud Python3 API Client I get the error:
RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.PERMISSION_DENIED, The caller does not have permission)>)
The code I used was this one:
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_secrets.json'

from google.cloud.bigquery.client import Client as bqClient

bqclient = bqClient()
ds = bqclient.dataset('dataset_name')
print(ds.access_grants)
# []

ds.delete()
ds.create()
print(ds.access_grants)
# [<AccessGrant: role=WRITER, specialGroup=projectWriters>,
#  <AccessGrant: role=OWNER, specialGroup=projectOwners>,
#  <AccessGrant: role=OWNER, userByEmail=id_1@id_2.iam.gserviceaccount.com>,
#  <AccessGrant: role=READER, specialGroup=projectReaders>]

from google.cloud.logging.client import Client as lClient

lclient = lClient()
dest = 'bigquery.googleapis.com%s' % (ds.path)
sink = lclient.sink('sink_test', filter_='jsonPayload.project=project_name', destination=dest)
sink.create()
I don't quite understand why this is happening. When I use lclient.log_struct() I can see the logs arriving in the Logging console, so I do have access to Stackdriver Logging.
Is there any mistake in this setup?
Thanks in advance.
Creating a sink requires different permissions than writing a log entry. By default, service accounts are granted the project Editor role (not Owner), which does not have permission to create sinks.
See the list of required permissions in the access control docs.
Make sure the service account you're using has the logging.sinks.create permission. The simplest way to do this is to switch the service account from Editor to Owner, but it would be better to grant a narrower logging role so that it gets only the permission it needs.
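For example, the predefined role that includes logging.sinks.create is Logs Configuration Writer (roles/logging.configWriter); granting it would look roughly like this, where the project ID and service account email are placeholders:
gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:my-sa@my-project-id.iam.gserviceaccount.com" \
    --role="roles/logging.configWriter"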
