Accessing DynamoDB Local from boto3 - python

I am doing the AWS tutorial on Python and DynamoDB. I downloaded and installed DynamoDB Local. I got the access key and secret access key. I installed boto3 for Python. The only step I have left is setting up authentication credentials. I do not have the AWS CLI installed, so where should I include the access key and secret key, and also the region?
Do I include it in my Python code?
Do I make a file in my directory where I put this info? If so, do I need to write anything in my Python code so it can find it?

You can try passing the access key and secret key in your code like this:
import boto3

session = boto3.Session(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)

client = session.client('dynamodb')
OR
dynamodb = session.resource('dynamodb')
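Since you are talking to DynamoDB Local rather than the real service, you will also want to point the client at the local endpoint. A minimal sketch, assuming DynamoDB Local is running on its default port 8000; the dummy credentials and region here are placeholders and are not validated locally:

import boto3

# DynamoDB Local accepts any non-empty credentials and region
session = boto3.Session(
    aws_access_key_id='fakeMyKeyId',
    aws_secret_access_key='fakeSecretAccessKey',
    region_name='us-east-1',
)

# Point the resource at the local instance instead of the real AWS endpoint
dynamodb = session.resource('dynamodb', endpoint_url='http://localhost:8000')

# List the existing tables to verify the connection
print(list(dynamodb.tables.all()))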

From the AWS documentation:
Before you can access DynamoDB programmatically or through the AWS
Command Line Interface (AWS CLI), you must configure your credentials
to enable authorization for your applications. Downloadable DynamoDB
requires any credentials to work, as shown in the following example.
AWS Access Key ID: "fakeMyKeyId"
AWS Secret Access Key: "fakeSecretAccessKey"
You can use the aws configure command of the AWS
CLI to set up credentials. For more information, see Using the AWS
CLI.
So, you need to create an .aws folder in your home directory.
There, create the credentials and config files.
Here's how to do this:
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
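For reference, a minimal sketch of what those two files can look like (the values are the dummy credentials from the documentation quote above):

# ~/.aws/credentials
[default]
aws_access_key_id = fakeMyKeyId
aws_secret_access_key = fakeSecretAccessKey

# ~/.aws/config
[default]
region = us-east-1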

If you want to write portable code and keep in the spirit of developing 12-factor apps, consider using environment variables.
The advantage is that, locally, both the CLI and the boto3 Python library in your code (and pretty much all the other official AWS SDK languages: PHP, Go, etc.) are designed to look for these values.
An example using the official Docker image to quickly start DynamoDB local:
# Start a local DynamoDB instance on port 8000
docker run -p 8000:8000 amazon/dynamodb-local
Then in a terminal, set some defaults that the CLI and SDKs like boto3 are looking for.
Note that these will be available until you close your terminal session.
# The region doesn't matter, but the CLI will complain if one is not provided
export AWS_DEFAULT_REGION=us-east-1
# Set some dummy credentials, dynamodb local doesn't care what these are
export AWS_ACCESS_KEY_ID=abc
export AWS_SECRET_ACCESS_KEY=abc
You should then be able to run the following (in the same terminal session) if you have the CLI installed. Note the --endpoint-url flag.
# Create a new table in DynamoDB Local
aws dynamodb create-table \
--endpoint-url http://127.0.0.1:8000 \
--table-name tmp \
--attribute-definitions AttributeName=id,AttributeType=S \
--key-schema AttributeName=id,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
You should then be able to list the tables with:
aws dynamodb list-tables --endpoint-url http://127.0.0.1:8000
And get a result like:
{
    "TableNames": [
        "tmp"
    ]
}
So how do we get the endpoint URL that we've been specifying in the CLI to work in Python? Unfortunately, there isn't a default environment variable for the endpoint URL in the boto3 codebase, so we'll need to pass it in when the code runs. The docs for .NET and Java are comprehensive, but for Python they are a bit more elusive. Per the boto3 GitHub repo (and also see this great answer), we need to create a client or resource with the endpoint_url keyword argument. In the code below, we look for a custom environment variable called AWS_DYNAMODB_ENDPOINT_URL. The point is that if it is specified, it will be used; otherwise the code falls back to whatever the platform default is, making it portable.
# Run in the same shell as before
export AWS_DYNAMODB_ENDPOINT_URL=http://127.0.0.1:8000

# file test.py
import os
import boto3

# Get the environment variable if it's defined
# Make sure to set the environment variable before running
endpoint_url = os.environ.get('AWS_DYNAMODB_ENDPOINT_URL', None)

# Using the (high level) resource; the same keyword works for boto3.client
resource = boto3.resource('dynamodb', endpoint_url=endpoint_url)
tables = resource.tables.all()
for table in tables:
    print(table)
Finally, run this snippet with
# Run in the same shell as before
python3 test.py
# Should produce the following output:
# dynamodb.Table(name='tmp')

Related

Specify GOOGLE APPLICATION CREDENTIALS in Airflow

So I am trying to orchestrate a workflow in Airflow. One task reads from GCP Cloud Storage, which requires me to specify the Google Application Credentials.
I decided to create a new folder in the dag folder and put the JSON key there. Then I specified this in the dag.py file:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "dags\support\keys\key.json"
Unfortunately, I am getting the error below:
google.auth.exceptions.DefaultCredentialsError: File dags\support\keys\dummy-surveillance-project-6915f229d012.json was not found
Can anyone help with how I should go about declaring the service account key?
Thank you.
You can create a connection to Google Cloud from the Airflow webserver admin menu. In this menu you can pass the Service Account key file path.
For example, the Keyfile Path could be /usr/local/airflow/dags/gcp.json.
Beforehand, you need to mount your key file as a volume in your Docker container under that path.
You can also copy the key JSON content directly into the Airflow connection, in the Keyfile JSON field.
You can check from these following links :
Airflow-connections
Airflow-with-google-cloud
Airflow-composer-managing-connections
If you are trying to download data from Google Cloud Storage using Airflow, you should use the GCSToLocalFilesystemOperator operator described here. It is already provided as part of the standard Airflow library (if you installed it), so you don't have to write the code yourself using the Python operator.
Also, if you use this operator you can enter the GCP credentials on the connections screen (where they should be). This is a better approach than putting your credentials in a folder with your DAGs, as that could lead to your credentials being committed into your version control system, which could lead to security issues.
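A minimal sketch of how that operator might be used in a DAG, assuming the apache-airflow-providers-google package is installed; the bucket, object, file path, and connection id below are placeholders, and exact parameter names can vary between provider versions:

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_local import GCSToLocalFilesystemOperator

with DAG("gcs_download_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    download_key = GCSToLocalFilesystemOperator(
        task_id="download_key",
        bucket="my-bucket",                  # placeholder bucket name
        object_name="path/to/object.json",   # placeholder object path
        filename="/tmp/object.json",         # local path on the worker
        gcp_conn_id="google_cloud_default",  # the connection configured in the Airflow UI
    )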

Google cloud functions realtime database trigger setup in Python

I am trying to set up a Google Cloud Function with a Firebase Realtime Database trigger and cannot get the function to be triggered when I add a document to the database.
What I want to happen is that when there is a new entry in the Firebase database collection yyy under project xxx, the Cloud Function function-1 should be triggered. function-1 contains the default code (as per below) and a test run worked fine.
I am using the main console and created a function named function-1. I can see the function itself in the firebase console:
https://console.firebase.google.com/u/0/project/xxx/functions/list
The collection I set is under project xxx, named yyy, and I can access it under
https://console.firebase.google.com/u/0/project/xxx/database/firestore/data~2Fyyy
I am in the functions console:
https://console.cloud.google.com/functions/edit/us-central1/function-1?project=xxx
and the setup is as follows:
Trigger: Firebase Realtime Database (Beta)
Event Type: Create
Database: xxx
Path: /data/yyy
Runtime is Python 3.7
Code is default Google Cloud Functions code:
def hello_rtdb(event, context):
    """Triggered by a change to a Firebase RTDB reference.
    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    resource_string = context.resource
    # print out the resource string that triggered the function
    print(f"Function triggered by change to: {resource_string}.")
    # now print out the entire event object
    print(str(event))
requirements.txt is empty
I have used other triggers (HTTP or Pub/Sub) successfully in other Google Cloud Functions, but I cannot get the function to be triggered by a database event. I have tried a wide range of options for the path variable but couldn't make it work.
The options I tried for the path variable are:
/xxx/database/firestore/data/yyy
/database/firestore/data/yyy
/data/yyy
/yyy
yyy
etc...
I am sure I am making a basic mistake but sadly the documentation isn't helping (probably because this is such a basic thing). How can I set this up in the right way?
Are you using Google Firestore or Firebase Firestore? I know they are technically the same product under the covers, but I believe they trigger different events. It may depend on whether you created the DB from Google Cloud Platform or Firebase.
$ gcloud functions event-types list
EVENT_PROVIDER EVENT_TYPE EVENT_TYPE_DEFAULT RESOURCE_TYPE RESOURCE_OPTIONAL
google.firebase.database.ref providers/google.firebase.database/eventTypes/ref.write Yes firebase database No
google.firestore.document providers/cloud.firestore/eventTypes/document.write Yes firestore document No
I'm using event_provider=google.firestore.document and it works. Here's how I deploy Python functions onto Google Cloud. Assume main.py exists with your code above, but the function is named hello_firestore.
$ gcloud functions deploy hello_firestore --entry-point hello_firestore --runtime python37 --trigger-event providers/cloud.firestore/eventTypes/document.write --trigger-resource "projects/$GCP_PROJECT/databases/(default)/documents/$DOC_PATH"
For Firebase Firestore, it should be something like this, but this is not tested because I created my Firestore from GCP rather than Firebase.
$ gcloud functions deploy hello_rtdb --entry-point hello_rtdb --runtime python37 --trigger-event providers/google.firebase.database/eventTypes/ref.write --trigger-resource "projects/_/instances/$GCP_PROJECT/refs/$DOC_PATH"
Another thing to watch out for is that only Firestore native mode supports triggering of events, as mentioned here in the section "Limitations and guarantees".

Cloudformation wildcard search with boto3

I have been tasked with converting some bash scripting used by my team that performs various CloudFormation tasks into Python using the boto3 library. I am currently stuck on one item: I cannot seem to determine how to do a wildcard-type search where a CloudFormation stack name contains a string.
My bash version using the AWS CLI is as follows:
aws cloudformation --region us-east-1 describe-stacks --query "Stacks[?contains(StackName,'myString')].StackName" --output json > stacks.out
This works on the CLI, outputting the results to a JSON file, but I cannot find any examples online of doing a similar "contains" search using boto3 with Python. Is it possible?
Thanks!
Yes, it is possible. What you are looking for is the following:
import boto3
# create a boto3 client first
cloudformation = boto3.client('cloudformation', region_name='us-east-1')
# use client to make a particular API call
response = cloudformation.describe_stacks(StackName='myString')
print(response)
# as an aside, you'd need a different client to communicate
# with a different service
# ec2 = boto3.client('ec2', region_name='us-east-1')
# regions = ec2.describe_regions()
where response is a Python dictionary that, among other things, contains the description of the stack named "myString".
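If you need the contains-style match from the bash version (rather than an exact name lookup), one option is to list all stacks and filter client-side. A minimal sketch, using the 'myString' substring from the question:

import json
import boto3

cloudformation = boto3.client('cloudformation', region_name='us-east-1')

# Page through all stacks and keep the names that contain the substring
matching = []
paginator = cloudformation.get_paginator('describe_stacks')
for page in paginator.paginate():
    for stack in page['Stacks']:
        if 'myString' in stack['StackName']:
            matching.append(stack['StackName'])

# Write the result to a file, similar to the bash version
with open('stacks.out', 'w') as f:
    json.dump(matching, f, indent=2)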

How do I access the security token for the Python SDK boto3

I want to access the AWS Comprehend API from a Python script. I am not getting any leads on how to resolve this error. One thing I know is that I have to get a session security token.
import json
import boto3
from botocore.exceptions import ClientError

try:
    client = boto3.client(service_name='comprehend', region_name='us-east-1',
                          aws_access_key_id='KEY ID',
                          aws_secret_access_key='ACCESS KEY')
    text = "It is raining today in Seattle"
    print('Calling DetectEntities')
    print(json.dumps(client.detect_entities(Text=text, LanguageCode='en'), sort_keys=True, indent=4))
    print('End of DetectEntities\n')
except ClientError as e:
    print(e)
Error : An error occurred (UnrecognizedClientException) when calling the DetectEntities operation: The security token included in the request is invalid.
This error suggests that you have provided invalid credentials.
It is also worth noting that you should never put credentials inside your source code. This can lead to potential security problems if other people obtain access to the source code.
There are several ways to provide valid credentials to an application that uses an AWS SDK (such as boto3).
If the application is running on an Amazon EC2 instance, assign an IAM Role to the instance. This will automatically provide credentials that can be retrieved by boto3.
If you are running the application on your own computer, store credentials in the .aws/credentials file. The easiest way to create this file is with the aws configure command.
See: Credentials — Boto 3 documentation
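Once credentials are available through an IAM role or the ~/.aws/credentials file, you can drop the hard-coded keys entirely and let boto3 resolve them. A minimal sketch based on the code from the question:

import json
import boto3

# No keys passed here: boto3 picks up credentials from the instance role,
# environment variables, or ~/.aws/credentials automatically
client = boto3.client('comprehend', region_name='us-east-1')

text = "It is raining today in Seattle"
print(json.dumps(client.detect_entities(Text=text, LanguageCode='en'),
                 sort_keys=True, indent=4))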
Create a profile using aws configure or by updating ~/.aws/config. If you only have one profile to work with (default), you can omit the profile_name parameter from the Session() invocation (see the example below). Then create an AWS service-specific client using the session object. Example:
import boto3
session = boto3.session.Session(profile_name="test")
ec2_client = session.client('ec2')
ec2_client.describe_instances()
ec2_resource = session.resource('ec2')
One useful tool I use daily is this: https://github.com/atward/aws-profile/blob/master/aws-profile
This makes assuming a role so much easier!
After you set up your access keys in .aws/credentials and your profiles in .aws/config,
you can do something like:
AWS_PROFILE=your-profile aws-profile [python x.py]
The part in [] can be replaced with any command that needs AWS credentials, e.g. terraform plan.
Essentially, this utility simply puts your AWS credentials into OS environment variables. Then in your boto3 script, you don't need to worry about setting aws_access_key_id and so on.

Access to Amazon S3 Bucket from EC2 instance

I have an EC2 instance and an S3 bucket in different regions. The bucket contains some files that are used regularly by my EC2 instance.
I want to programmatically download the files to my EC2 instance (using Python).
Is there a way to do that?
There are lots of ways to do this from within Python.
Boto has S3 modules which will do this: http://boto.readthedocs.org/en/latest/ref/s3.html
You could also just use the Python requests library to download over HTTP.
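For example, using the newer boto3 library, a minimal sketch (the bucket name and key here are placeholders):

import boto3

# Credentials come from the instance role or ~/.aws/credentials
s3 = boto3.client('s3')

# Download s3://my-bucket/path/to/file.name to the local file file.name
s3.download_file('my-bucket', 'path/to/file.name', 'file.name')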
The AWS CLI also gives you an option to download from the shell:
aws s3 cp s3://bucket/folder/file.name file.name
Adding to what @joeButler has said above...
Your instances need permission to access S3 using the APIs.
So, you need to create an IAM role and instance profile. Your instance needs to have the instance profile assigned when it is being created. See page 183 of the AWS IAM User Guide (topic "Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances", as indicated at the bottom of the page) to understand the steps and procedure.
I work for Minio; it's open source, S3-compatible object storage written in Golang.
You can use the minio-py client library; it's open source and compatible with AWS S3. Below is a simple example of get_object.py:
from minio import Minio
from minio.error import ResponseError

client = Minio('s3.amazonaws.com',
               access_key='YOUR-ACCESSKEYID',
               secret_key='YOUR-SECRETACCESSKEY')

# Get a full object
try:
    data = client.get_object('my-bucketname', 'my-objectname')
    with open('my-testfile', 'wb') as file_data:
        for d in data:
            file_data.write(d)
except ResponseError as err:
    print(err)
You can also use the Minio client (aka mc); it comes with an mc mirror command to perform the same task. You can add it to cron.
$ mc mirror s3/mybucket localfolder
Note:
s3 is an alias
mybucket is your AWS S3 bucket
localfolder is the destination folder on the EC2 machine for the backup.
Installing Minio Client:
GNU/Linux
Download mc for:
64-bit Intel from
https://dl.minio.io/client/mc/release/linux-amd64/mc
32-bit Intel from https://dl.minio.io/client/mc/release/linux-386/mc
ARM from https://dl.minio.io/client/mc/release/linux-arm/mc
$ chmod 755 mc
$ ./mc --help
Adding your S3 credentials
$ ./mc config host add mys3 https://s3.amazonaws.com BKIKJAA5BMMU2RHO6IBB V7f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12
Note: Replace access & secret key with yours.
As mentioned above, you can do this with Boto. To make it more secure and avoid worrying about the user credentials, you could use IAM to grant the EC2 machine access to the specific bucket only. Hope that helps.
If you want to use Python, you may want to use the newer boto3 API. I personally like it more than the original boto package. It works with both Python 2 and Python 3, and the differences are minimal.
You can specify a region when you create a new bucket (see the boto3 client documentation), but bucket names are globally unique, so you shouldn't need a region to connect to an existing bucket. And you probably don't want to use a bucket in a different region than your instance, because you will pay for data transfer between regions.
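A minimal sketch of that idea with the higher-level resource API; the bucket name and object key here are placeholders:

import boto3

# Bucket names are globally unique, so the bucket can be referenced
# without specifying its region
s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')

# Download an object from the bucket to the local filesystem
bucket.download_file('path/to/file.name', '/tmp/file.name')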
