I am new to the AWS CLI and finally have my commands working through Git Bash with:
aws s3 ls --no-verify-ssl
I am now trying to run the same commands from Python.
I need to be able to do the following tasks in AWS S3 from Python:
Copy hundreds of local folders to the s3 bucket.
Update existing folders on the s3 bucket with changes made on local versions.
List contents of the s3 bucket.
In reading similar posts here, I see that --no-verify-ssl usually means there is a bigger problem; however, using it is the way our network people have set things up, and I have no control over that. This is the flag they require to be used to allow access to the AWS CLI.
I have tried using boto3 and running the equivalent commands from Python, but I get an authentication error because I don't know how to pass the --no-verify-ssl flag from Python.
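For what it's worth, boto3 can skip SSL certificate verification in the same way as the CLI's --no-verify-ssl flag, through the verify argument when creating a client. A minimal sketch (bucket and file names are placeholders); credentials are still read from the usual places such as ~/.aws/credentials or environment variables:

import boto3

# verify=False is boto3's counterpart to the CLI's --no-verify-ssl flag
s3 = boto3.client("s3", verify=False)

# List the buckets in the account
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# Upload a local file into a "folder" (key prefix) in a bucket -- placeholder names
s3.upload_file("local-folder/file.txt", "my-bucket", "remote-folder/file.txt")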
Related
I'm trying to run a Python script file from the AWS CLI. Does anyone have the syntax for that, please? I've tried a few variations but without success:
aws ssm send-command --document-name "AWS-RunShellScript" --parameters commands=["/Documents/aws_instances_summary.py"]
I'm not looking to connect to a particular EC2 instance, as the script gathers information about all instances.
aws ssm send-command runs the command on an EC2 instance, not on your local computer.
From your comments, it looks like you are actually trying to determine how to configure the AWS SDK for Python (Boto3) with AWS API credentials, so you can run the script from your local computer and get information about the AWS account.
You would not use the AWS CLI tool at all for this purpose. Instead you would simply run the Python script directly, having configured the appropriate environment variables, or ~/.aws/credentials file, on your local computer with the API credentials. Please see the official documentation for configuring AWS API credentials for Boto3.
A minimal example would look something like this:
export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
python aws_instances_summary.py
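As for what aws_instances_summary.py itself might contain, here is a rough sketch (an assumption, since the script isn't shown) that lists every instance in the configured region using Boto3:

import boto3

# Region and credentials are picked up from the environment or ~/.aws configuration
ec2 = boto3.client("ec2")

# describe_instances is paginated, so iterate over every page
paginator = ec2.get_paginator("describe_instances")
for page in paginator.paginate():
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            print(instance["InstanceId"], instance["InstanceType"], instance["State"]["Name"])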
I have code on an AWS EC2 instance. Right now, it accepts input and output files from S3. It's an inefficient process: I have to upload the input file to S3, copy it from S3 to EC2, run the program, copy the output files from EC2 to S3, and then download them locally.
Is there a way to run the code on ec2 and accept a local file as input and then have the output saved on my local machine?
It appears that your scenario is:
Some software on an Amazon EC2 instance is used to process data on the local disk
You are manually transferring that data to/from the instance via Amazon S3
An Amazon EC2 instance is just like any other computer. It runs the same operating system and the same software as you would on a server in your company. However, it does benefit from being in the cloud in that it has easy access to other services (such as Amazon S3) and resources can be turned off to save expense.
Optimize current process
If you stick with the current process, you could improve it with some simple automation (a boto3 sketch of the local side follows this list):
Upload your data to Amazon S3 via an AWS Command-Line Interface (CLI) command, such as: aws s3 cp file.txt s3://my-bucket/input/
Execute a script on the EC2 instance that will:
Download the file, eg: aws s3 cp s3://my-bucket/input/file.txt .
Process the file
Copy the results to S3, eg: aws s3 cp file.txt s3://my-bucket/output/
Download the results to your own computer, eg: aws s3 cp s3://my-bucket/output/file.txt .
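The boto3 sketch mentioned above covers the local side of this flow (bucket and file names are placeholders); the EC2-side script stays as described in the middle step:

import boto3

s3 = boto3.client("s3")

# Upload the input file (placeholder bucket and key names)
s3.upload_file("file.txt", "my-bucket", "input/file.txt")

# ... the EC2-side script downloads input/file.txt, processes it,
# and copies the result to output/file.txt ...

# Download the result once the EC2 script has finished
s3.download_file("my-bucket", "output/file.txt", "result.txt")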
Use scp to copy files
Assuming that you are connecting to a Linux instance, you could automate it via the following (a Python sketch follows this list):
Use scp to copy the file to the EC2 instance (which is very similar to the SSH command)
Use ssh with a [remote command](https://malcontentcomics.com/systemsboy/2006/07/send-remote-commands-via-ssh.html) parameter to trigger the remote process
Use scp to copy the file down once complete
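Those three steps could also be driven from a single Python script, as mentioned above. A sketch, where the host name, key file, and commands are all placeholders:

import subprocess

host = "ec2-user@ec2-203-0-113-25.compute-1.amazonaws.com"  # placeholder host
key = "my-key.pem"                                          # placeholder key file

# Copy the input file up to the instance
subprocess.run(["scp", "-i", key, "file.txt", f"{host}:/home/ec2-user/"], check=True)

# Trigger the remote process over ssh
subprocess.run(["ssh", "-i", key, host, "python3 process.py file.txt"], check=True)

# Copy the result back down once it is complete
subprocess.run(["scp", "-i", key, f"{host}:/home/ec2-user/output.txt", "."], check=True)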
Re-architect to use AWS Lambda
If the job that runs on the data is suitable for being run as an AWS Lambda function, then the flow would be:
Upload the data to Amazon S3
This automatically triggers the Lambda function, which processes the data and stores the result
Download the result from Amazon S3
Please note that an AWS Lambda function runs for a maximum of 15 minutes and has a limit of 512MB of temporary disk space. (This can be expanded by using Amazon EFS if needed.)
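A bare-bones sketch of such a Lambda function, assuming the processing can be wrapped in a process() helper (shown here as a placeholder) and that the result is written to a separate, hypothetical output bucket so the function does not re-trigger itself:

import boto3

s3 = boto3.client("s3")

def process(data: bytes) -> bytes:
    # Placeholder for the real processing logic
    return data.upper()

def lambda_handler(event, context):
    # The S3 event notification carries the bucket and key that triggered the function
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    result = process(body)

    # Write the result to a placeholder output bucket
    s3.put_object(Bucket="my-output-bucket", Key=key, Body=result)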
Something in-between
There are other ways to upload/download data, such as running a web server on the EC2 instance and interacting via a web browser, or using AWS Systems Manager Run Command to trigger the process on the EC2 instance. Such a choice would be based on how much you are permitted to modify what is running on the instance and your technical capabilities.
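For the Run Command option, the trigger from your own machine could be as small as this sketch (the instance ID and remote command are placeholders):

import boto3

ssm = boto3.client("ssm")

response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],          # placeholder instance ID
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["python3 /home/ec2-user/process.py"]},  # placeholder command
)
print(response["Command"]["CommandId"])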
@John Rotenstein, we have solved the problem of loading 60MB+ models into Lambda functions by attaching AWS EFS volumes via a VPC. This also solves the problem with large libraries such as TensorFlow, OpenCV, etc. Lambda layers basically become almost redundant, and you can really sit back and relax; this saved us days, if not weeks, of tweaking, building, and cherry-picking library components from source, allowing us to concentrate on the real problem. It beats loading from S3 every time, too. The EFS approach would require an EC2 instance, obviously.
I am an absolute beginner with AWS: I have created a key and an instance. The Python script I want to run in the EC2 environment needs to loop through around 80,000 filings, tokenize the sentences in them, and use these sentences for some unsupervised learning.
This might be a duplicate, but I can't find a way to copy these filings to the EC2 environment and run the Python script there, and I am also not very sure how to use boto3. I am using macOS. I am just looking for any way to speed things up. Thank you so much! I am forever grateful!
Here's what I tried recently:
Create the bucket and make it publicly accessible.
Create the role and add the HTTP option.
Upload all the files and make sure they are publicly accessible.
Get the HTTP link of the S3 file.
Connect to the instance through PuTTY.
Use wget to copy the file into the EC2 instance (a Python equivalent is sketched after this list).
If your files are in zip format, a single copy is enough to move all the files into the instance.
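The wget/download step mentioned above could also be done from Python once you have the public HTTP link (the URL below is a placeholder):

import urllib.request
import zipfile

# Placeholder public object URL
url = "https://my-bucket.s3.amazonaws.com/filings.zip"
urllib.request.urlretrieve(url, "filings.zip")

# If the files were zipped, unpack them next to the script
with zipfile.ZipFile("filings.zip") as archive:
    archive.extractall("filings")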
Here's one way that might help:
create a simple IAM role that allows S3 access to the bucket holding your files
apply that IAM role to the running EC2 instance (or launch a new instance with the IAM role)
install the awscli on the EC2 instance
SSH to the instance and sync the S3 files to the EC2 instance using aws s3 sync
run your app
I'm assuming you've launched the EC2 instance with enough disk space to hold the files.
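If you would rather pull the files down from inside Python instead of running aws s3 sync first, a rough one-way equivalent with boto3 looks like this (bucket name, prefix, and local folder are placeholders):

import os
import boto3

s3 = boto3.client("s3")
bucket = "my-filings-bucket"   # placeholder bucket
prefix = "filings/"            # placeholder key prefix

# list_objects_v2 is paginated, so walk every page under the prefix
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):
            continue  # skip "folder" placeholder objects
        local_path = os.path.join("data", os.path.relpath(key, prefix))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(bucket, key, local_path)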
Noob and beginner here. Just trying to learn the basics of GCP.
I have a series of Google Cloud Storage buckets that contain text files. I also have a VM instance that I've set up via GCP.
Now, I'm trying to write some code to extract the data from Google buckets and run the script via GCP's command prompt.
How can I extract data from GCS buckets in Python?
I think that you can use the Listing Objects and Downloading Objects GCS methods with Python; this way, you will be able to get a list of the objects stored in your Cloud Storage buckets and then download them onto your VM instance. Additionally, keep in mind that it is important to verify that the service account you use to perform these tasks has the required roles assigned in order to access your GCS buckets, and to provide the credentials to your application by using environment variables or by explicitly pointing to your service account file in code.
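As a small sketch of those two calls with the google-cloud-storage client library (the bucket name is a placeholder, and credentials are assumed to come from GOOGLE_APPLICATION_CREDENTIALS or the VM's service account):

import os
from google.cloud import storage

client = storage.Client()

# List every object in the bucket and download each one next to the script
for blob in client.list_blobs("my-bucket"):   # placeholder bucket name
    if blob.name.endswith("/"):
        continue  # skip "folder" placeholder objects
    destination = os.path.basename(blob.name)
    blob.download_to_filename(destination)
    print(f"Downloaded gs://my-bucket/{blob.name} to {destination}")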
Once you have your code ready, you can simply execute your Python program by using the python command. You can take a look at this link for instructions on installing Python in your new environment.
I have a CSV file stored in an AWS S3 bucket; it holds information that gets loaded into an HTML document via some jQuery.
I also have a Python script which is currently sat on my local machine ready to be used. This Python script scrapes another website and saves the information to the CSV file which I then upload to my AWS S3 bucket.
I am trying to figure out a way that I can have the Python script run nightly and overwrite the CSV stored in the S3 bucket. I cannot seem to find a similar solution to my problem online and am vastly out of my depth when it comes to AWS.
Does anyone have any solutions to this problem?
Cheapest way: Modify your Python script to work as an AWS Lambda function, then schedule it to run nightly.
Easiest way: Spin up an EC2 instance, copy the script to the instance, and schedule it to run nightly via cron.
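For the Lambda route, a rough sketch of what the function could look like, assuming the scraping logic can be wrapped in a helper (shown here as a stub) and that the bucket and key names are placeholders; the nightly schedule itself would come from an EventBridge (CloudWatch Events) rule:

import boto3

s3 = boto3.client("s3")

def scrape_to_csv() -> str:
    # Stub standing in for the existing scraping logic
    return "column1,column2\nvalue1,value2\n"

def lambda_handler(event, context):
    csv_body = scrape_to_csv()
    # Overwrite the CSV that the HTML page reads (placeholder bucket and key)
    s3.put_object(
        Bucket="my-website-bucket",
        Key="data/results.csv",
        Body=csv_body.encode("utf-8"),
        ContentType="text/csv",
    )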