I'm trying to use a python script to automate downloading a file from AWS s3 to my local machine. The python script itself is hosted on ubuntu (AWS ec2 instance), so it's not recognizing a directory on my local machine.
Here's my code:
import os
import boto3
from boto3.session import Session
print("this script downloads the file from s3 to local machine")
s3 = boto3.resource('s3')
BUCKET_NAME = 'sfbucket.myBucket'
KEY = 'sf_events.json'
s3.Bucket(BUCKET_NAME).download_file(KEY, '/Users/Documents/my_file.json')
print('end')
However, this gives me the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/Documents/my_file.json.CDC5FEf4'
Can anyone tell me what I'm doing wrong? If I replace the output directory with /home/ubuntu/ it works fine, but I want the file on my local machine. Thanks in advance.
The script has to run on your local machine, not on the EC2 instance.
Alternatively, you can simply use the AWS CLI:
aws s3 cp s3://sfbucket.myBucket/sf_events.json /Users/Documents/my_file.json
Since you are trying to download the file from the EC2 instance, that machine doesn't know your local path /Users/Documents/my_file.json.
You have two options:
Run this script directly on your local machine.
In this case, you have to run the script on your local machine and make sure it has credentials with access to the bucket (see the sketch after this list).
Run this script on the EC2 instance and then copy the file.
In this case, you have to download the file to somewhere like /home/ubuntu/ and afterwards, from your local machine, copy it over from the EC2 instance.
You can use SCP for that.
You could run an SCP/SSH server on your local machine, but the problem is that your local machine probably doesn't have a static public IP, so every time your local IP changes you would have to update the script. Maybe you should think about doing whatever you need to do with this file directly on the EC2 instance, or, if it's really a manual task, sending it via email, maybe?
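For the first option, a minimal sketch of the same download run from the local machine (assuming AWS credentials with access to the bucket are configured locally; the destination path is just an example):
import boto3

BUCKET_NAME = 'sfbucket.myBucket'   # bucket and key from the question
KEY = 'sf_events.json'

s3 = boto3.resource('s3')
# Because this runs on the local machine, a local destination path is valid.
s3.Bucket(BUCKET_NAME).download_file(KEY, '/Users/you/Documents/my_file.json')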
My API has to access an image located in a directory on the host server, store it in another directory, and access the whole directory on the host server multiple times.
Currently, when I run the API in a Docker container and it tries to access an image present in a directory on the host, I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'W:/datasets/tmp/users/1.jpg'
I understand this is because the API is running on an independent server. The directory cannot be mounted to the container as it is large and also will keep increasing in real-time.
How do I access the file system in this situation?
Will SSH using paramiko help in this case or is there another way to do it?
> Will SSH using paramiko help in this case
No, you want a file server, not a shell. If you had an SSH server on the host, then you could use SFTP, but you could also use NFS, as commented.
Ultimately, if you're trying to access directories on the same machine as the container, you should be using volume mounts into the container
> cannot be mounted to the container as it is large and also will keep increasing in real time.
So? Have you tried using the mount? What specific issues were you having with it "being large" and mounted?
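If you do go the bind-mount route, here is a minimal sketch using the Python Docker SDK (docker-py); the image name and mount paths are placeholders, and the CLI equivalent would be docker run -v W:/datasets:/datasets:ro my-api-image:
import docker

client = docker.from_env()
# Bind-mount the host dataset directory into the container read-only,
# so the API inside the container sees the same files as the host.
container = client.containers.run(
    'my-api-image:latest',                                    # placeholder image name
    detach=True,
    volumes={'W:/datasets': {'bind': '/datasets', 'mode': 'ro'}},
)
print(container.id)
Inside the container the API would then open /datasets/tmp/users/1.jpg rather than the W:/ path.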
A connection needs to be established between Azure Databricks and an on-prem Windows server. I tried the below Python code:
import os
filePath = "\\\\SERVER001\\folder\\"
fileExtension = ".xml"
def get_file_count(filePath, fileExtension):
    try:
        fileCount = len([name for name in os.listdir(filePath) if name.endswith(fileExtension)])
        print(fileCount)
    except Exception as e:
        print(str(e))

get_file_count(filePath, fileExtension)
but it gave me the error:
[Errno 2] No such file or directory: '\\\\SERVER001\\folder\\'
It is searching within the Databricks directories, I guess; the connection itself is not happening. I am a beginner in the Databricks domain. Any help will be appreciated.
It's not possible out of the box, because that server is on-premise and Databricks is in the cloud, with no knowledge of your on-premise environment.
You have two choices:
You can upload the files onto DBFS and then access them there (see the sketch below). You can do it, for example, via the UI - via the DBFS file browser (docs) or via the Upload Data UI (docs). If you have a lot of files, or they are huge, then you can use something like azcopy to upload the file(s) to Azure Storage.
Theoretically you can set up your network environment to connect to on-premise via VPN (you need a workspace with "Bring your own VNet") and then access the file share, but that could be challenging, as you need to make sure that you have all the necessary ports opened on the firewalls, etc.
I would recommend going with the first option.
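For the first option, once the files are on DBFS you can list them through the /dbfs FUSE mount on the cluster; a minimal sketch, assuming the XML files were uploaded to /FileStore/xml_folder (a placeholder path):
import os

# /dbfs exposes DBFS as a local filesystem path on a Databricks cluster.
filePath = "/dbfs/FileStore/xml_folder"   # placeholder DBFS path
fileExtension = ".xml"

fileCount = len([name for name in os.listdir(filePath) if name.endswith(fileExtension)])
print(fileCount)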
How do I copy a folder from a server (Linux) to a local machine (Windows) in Python?
I tried the following code but it did not work:
from distutils.dir_util import copy_tree
copy_tree("source_path ","destination_path")
I used the copy_tree command to copy a folder within my local machine, but when I used the same command to copy a folder from the server to the local machine, it did not work.
Is there any other method, or are any changes needed?
You need to use SSH, SCP, or SFTP to transfer files from host to host.
I do this a lot and like to use SSH and SCP. You can run an SSH server on your Windows machine using OpenSSH. Here is a good set of instructions from WinSCP: https://winscp.net/eng/docs/guide_windows_openssh_server.
I recommend using Paramiko for SSH with Python. Here is a good answer showing how this works with Python: https://stackoverflow.com/a/38556344/634627.
If you set up OpenSSH, you could also do this with SFTP; sometimes I find this more suitable than SCP. Here is a good answer showing how that works: https://stackoverflow.com/a/33752662/634627
The trick is getting OpenSSH running on your Windows host and setting up SSH keys so your server can authenticate to your localhost.
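As a rough illustration of the SFTP route with Paramiko, pushing a folder from the Linux server to the Windows machine once OpenSSH is running there (hostname, credentials, and paths are placeholders, and nested sub-folders are not handled):
import os
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# The Windows machine running the OpenSSH server; all values are placeholders.
ssh.connect('windows-host.example.com', username='me', key_filename='/home/user/.ssh/id_rsa')

sftp = ssh.open_sftp()
source_dir = '/home/user/source_folder'   # folder on the Linux server
dest_dir = 'C:/Users/me/destination'      # existing folder on the Windows machine

# Upload every file in the source folder to the destination folder.
for name in os.listdir(source_dir):
    sftp.put(os.path.join(source_dir, name), dest_dir + '/' + name)

sftp.close()
ssh.close()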
Using copytree should work if:
the folder on the server is made available to the Windows machine, e.g. as a mounted network share.
you have sufficient access permissions.
you use a raw string for the Windows path to prevent backslash escape interpretation.
Regarding point 3, try print('c:\test\robot') and note how \t and \r are interpreted as escape characters:
In [1]: print('c:\test\robot')
obot est
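With a raw string the backslashes stay literal:
print(r'c:\test\robot')    # raw string prints c:\test\robot
print('c:\\test\\robot')   # doubling the backslashes also works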
Is there any way to save an image from a Python script to a server, without saving it on the local machine first? For example, what would the last line in this code be:
path = 'path/to/folder/on/server'
imfile = frame.copy()  # get a screenshot from my webcam
magic_function(imfile, path)
Maybe I can do this with the paramiko library and ssh.exec_command? What should I do in this case?
I use Ubuntu 14.04 on both the local and server machines, and my code is in Python 2.7.
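One possibility along those lines is Paramiko's SFTP client; a minimal sketch, where the hostname and credentials are placeholders and the frame is assumed to be an OpenCV image so it can be encoded in memory with cv2.imencode:
import io
import cv2
import paramiko

def magic_function(imfile, path):
    # 'path' should be the full remote file name, not just the folder.
    # Encode the image in memory instead of writing it to the local disk.
    ok, buf = cv2.imencode('.png', imfile)
    if not ok:
        raise ValueError('could not encode image')

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect('server.example.com', username='user', password='secret')  # placeholders
    sftp = ssh.open_sftp()
    # putfo() uploads from a file-like object, so nothing is written locally.
    sftp.putfo(io.BytesIO(buf.tobytes()), path)
    sftp.close()
    ssh.close()
In the question's example you would call it as magic_function(imfile, path + '/frame.png').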
I understand almost nothing about how EC2 works. I created an Amazon Web Services (AWS) account. Then I launched an EC2 instance.
And now I would like to execute some Python code on this instance, and I don't know how to proceed. Is it necessary to upload the code somewhere on the instance? Or to Amazon S3 and link it to the instance?
Is there a guide that explains the possible ways to use an instance? I feel like someone in front of a flying saucer's dashboard without a user's guide.
Here's a very simple procedure to move your Python script from your local machine to an EC2 instance and run it:
> 1. scp -i <filepath to Pem> <filepath to Py file> ec2-user@<Public DNS>.compute-1.amazonaws.com:<filepath in EC2 instance where you want your file to be>
> 2. cd to the directory in the EC2 instance containing the file, then run python <Filename.py>. That executes it.
Here's a concrete example for those who like things shown step by step:
In your local directory, create a Python script with the following code: print("Hello AWS")
Assuming you already have AWS set up and you want to run this script on EC2, you need to SCP (Secure Copy Protocol) your file to a directory on the EC2 instance. So here's an example:
- My filepath to pem is ~/Desktop/random.pem.
- My filepath to py file is ~/Desktop/hello_aws.py
- My public DNS is ec22-34-12-888
- The ec2 directory where I want my script to be is in /home/ec2-user
- So the full command I run in my local terminal is:
scp -i ~/Desktop/random.pem ~/Desktop/hello_aws.py ec2-user@ec2-34-201-49-170.compute-1.amazonaws.com:/home/ec2-user
Now SSH into your EC2 instance, cd to /home/ec2-user (or wherever you put your file), and run python hello_aws.py.
You have a variety of options. You can browse through a large library of AMIs here.
You can import a VM; instructions are here.
This is a general article about AWS and Python.
And in this article, the author takes you through a more advanced system with a combination of datastores in Python using the highly recommended Django framework.
Launch your instance through Amazon's Management Console -> Instance Actions -> Connect
(More details in the getting started guide)
Launch the Java-based SSH client
Plugins -> SFTP File Transfer
Upload your files
Run your files in the background (with '&' at the end, or use nohup).
Be sure to select an AMI with Python included; you can check by typing 'python' in the shell.
If your app requires any unorthodox packages, you'll have to install them.
Running scripts on Linux EC2 instances
I had to run a script on Amazon EC2 and learned how to do it. Even though the question was asked years back, I thought I would share how easy it is today.
Setting up EC2 and SSH-ing to the EC2 host
Sign up and launch an EC2 instance with default settings (do not forget to save the certificate/key file that is generated while launching the instance).
Once the EC2 instance is up and running, give the certificate file the required permissions: chmod 400 /path/my-key-pair.pem (or .cer file).
Run the command: ssh -i /path/my-key-pair.pem(.cer) USER@<Public DNS> (the USER value changes based on the operating system you have launched; refer to the paragraph below for more details. The Public DNS can be obtained on the EC2 instance page.)
Use the ssh command to connect to the instance. You specify the private key (.pem) file and user_name@public_dns_name. For Amazon Linux, the user name is ec2-user. For RHEL, the user name is ec2-user or root. For Ubuntu, the user name is ubuntu or root. For CentOS, the user name is centos. For Fedora, the user name is ec2-user. For SUSE, the user name is ec2-user or root. Otherwise, if ec2-user and root don't work, check with your AMI provider.
Clone the script to EC2
In order to run the scripts on EC2, I would prefer storing the code on GitHub as a repo or as a gist (if you need to keep the code private) and cloning it onto the EC2 instance.
The above approach is very easy and not error-prone.
Running the Python script
I worked with a RHEL Linux instance and Python was already installed, so I could run the Python script directly after SSH-ing to the host. It depends on the operating system you choose; refer to the AWS manuals if Python is not installed already.
Reference: AWS Doc