AWS Glue Python Shell package import - python

We created a Python shell job that connects to Redshift and fetches data. The program below works fine on my local system.
The program and the steps are below.
Program:
import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker
#>>>>>>>> MAKE CHANGES HERE <<<<<<<<<<<<<
DATABASE = "#####"
USER = "#####"
PASSWORD = "#####"
HOST = "#####.redshift.amazonaws.com"
PORT = "5439"
SCHEMA = "test" #default is "public"
####### connection and session creation ##############
connection_string = "redshift+psycopg2://%s:%s@%s:%s/%s" % (USER,PASSWORD,HOST,str(PORT),DATABASE)
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
SetPath = "SET search_path TO %s" % SCHEMA
s.execute(SetPath)
###### All Set Session created using provided schema #######
################ write queries from here ######################
query = "SELECT * FROM test1 limit 2;"
rr = s.execute(query)
all_results = rr.fetchall()
def pretty(all_results):
    for row in all_results :
        print("row start >>>>>>>>>>>>>>>>>>>>")
        for r in row :
            print(" ----" , r)
        print("row end >>>>>>>>>>>>>>>>>>>>>>")
pretty(all_results)
########## close session in the end ###############
s.close()
Steps:-
sudo pip install psycopg2
sudo pip install sqlalchemy
sudo pip install sqlalchemy-redshift
I uploaded the files psycopg2-2.8.4-cp27-cp27m-win32.whl, Flask_SQLAlchemy-2.4.1-py2.py3-none-any.whl and sqlalchemy_redshift-0.7.5-py2.py3-none-any.whl to S3 (s3://####/lib/) and set that folder as the Python library path in the AWS Glue job.
When I run the job, the following error occurs.
Traceback (most recent call last):
File "/tmp/runscript.py", line 113, in <module>
download_and_install(args.extra_py_files)
File "/tmp/runscript.py", line 56, in download_and_install
download_from_s3(s3_file_path, local_file_path)
File "/tmp/runscript.py", line 81, in download_from_s3
s3.download_file(bucket_name, s3_key, new_file_path)
File "/usr/local/lib/python2.7/site-packages/boto3/s3/inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/usr/local/lib/python2.7/site-packages/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/usr/local/lib/python2.7/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/usr/local/lib/python2.7/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
PS: The Glue job role has full access to S3.
Please suggest how to make these libraries available to the program.

You can specify your own Python libraries, packaged as .egg or .whl files, with the "--extra-py-files" flag, as shown in the example below.
Command line example:
aws glue create-job --name python-redshift-test-cli --role role --command '{"Name" : "pythonshell", "ScriptLocation" : "s3://MyBucket/python/library/redshift_test.py"}' \
--connections Connections=connection-name --default-arguments '{"--extra-py-files" : ["s3://MyBucket/python/library/redshift_module-0.1-py2.7.egg", "s3://MyBucket/python/library/redshift_module-0.1-py2.7-none-any.whl"]}'
Reference: Create a glue job with extra python library
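The 404 from HeadObject in the question means S3 could not find an object at the requested key, which is what would likely happen if the library path names a folder (s3://####/lib/) instead of individual files. As a rough sketch, the value can be checked before the job is created. The function name check_extra_py_files and its messages are hypothetical, not part of any AWS tooling.

```python
from urllib.parse import urlparse


def check_extra_py_files(value):
    """Split a comma-separated library path into (bucket, key, problem) tuples.

    problem is None when the entry looks like a single .egg or .whl object.
    """
    results = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parsed = urlparse(entry)
        bucket, key = parsed.netloc, parsed.path.lstrip("/")
        if parsed.scheme != "s3" or not bucket:
            problem = "not an s3:// URL"
        elif not key or key.endswith("/"):
            problem = "points at a folder, not a file"
        elif not key.endswith((".egg", ".whl")):
            problem = "not an .egg or .whl file"
        else:
            problem = None
        results.append((bucket, key, problem))
    return results


for row in check_extra_py_files(
    "s3://MyBucket/lib/, s3://MyBucket/lib/pkg-0.1-py3-none-any.whl"
):
    print(row)
```

Any entry reported with a problem is a candidate for the "Not Found" error, since the job tries to download each entry as one object.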

There is a simple way to import Python dependencies using .whl files, which can be found on the Python package index for the particular module.
You can also add multiple wheel files from S3 by separating them with commas.
For example:
"s3://xxxxxxxxx/common/glue/glue_whl/fastparquet-0.4.1-cp37-cp37m-macosx_10_9_x86_64.whl,s3://xxxxxx/common/glue/glue_whl/packaging-20.4-py2.py3-none-any.whl,s3://xxxxxx/common/glue/glue_whl/s3fs-0.5.0-py3-none-any.whl"
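One thing worth checking with wheel files is the platform tag at the end of the file name. The question uploads a win32 wheel and the example above names a macosx wheel; assuming the Glue job runs on Linux, such wheels would probably not install there, while wheels tagged "any" are pure Python and portable. A minimal sketch that reads the tags from a file name follows (the function names are hypothetical).

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version-python-abi-platform, plus an optional build tag after version
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: %s" % filename)
    return tuple(parts[-3:])


def is_portable(filename):
    """True for pure-Python wheels, whose platform tag is 'any'."""
    return wheel_tags(filename)[2] == "any"


for name in (
    "psycopg2-2.8.4-cp27-cp27m-win32.whl",
    "fastparquet-0.4.1-cp37-cp37m-macosx_10_9_x86_64.whl",
    "s3fs-0.5.0-py3-none-any.whl",
):
    print(name, wheel_tags(name), is_portable(name))
```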

Related

error using pip search (pip search stopped working)

I am getting this error from pip search while learning Python.
The output below is the error I get when I run pip search. Can you tell me how to fix it?
$ pip search pdbx
ERROR: Exception:
Traceback (most recent call last):
File "*/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 224, in _main
status = self.run(options, args)
File "*/lib/python3.7/site-packages/pip/_internal/commands/search.py", line 62, in run
pypi_hits = self.search(query, options)
File "*/lib/python3.7/site-packages/pip/_internal/commands/search.py", line 82, in search
hits = pypi.search({'name': query, 'summary': query}, 'or')
File "/usr/lib/python3.7/xmlrpc/client.py", line 1112, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python3.7/xmlrpc/client.py", line 1452, in __request
verbose=self.__verbose
File "*/lib/python3.7/site-packages/pip/_internal/network/xmlrpc.py", line 46, in request
return self.parse_response(response.raw)
File "/usr/lib/python3.7/xmlrpc/client.py", line 1342, in parse_response
return u.close()
File "/usr/lib/python3.7/xmlrpc/client.py", line 656, in close
raise Fault(**self._stack[0])
xmlrpc.client.Fault: <Fault -32500: 'RuntimeError: This API has been temporarily disabled due to unmanageable load and will be deprecated in the near future. Please use the Simple or JSON API instead.'>
The pip search command queries PyPI's servers, and PyPI's maintainers have explained that the API endpoint it uses is very resource intensive and too expensive to keep open to the public at all times. Consequently they sometimes throttle access and plan to remove it completely.
See this GitHub issues thread ...
The solution I am using for now is to pip install pip-search (a utility created by GitHub user @victorgarric).
So, instead of 'pip search', I use pip_search. It definitely beats searching PyPI via a web browser.
Following the suggestion from JRK in the discussion on GitHub (last comment): the search command is temporarily disabled, so use your browser to search for packages in the meantime:
Check the thread on GitHub and give him a thumbs up ;)
Search on the website, https://pypi.org/,
then install the package you want.
The error says
Please use the Simple or JSON API instead
You can try pypi-simple to query the PyPI repository:
https://pypi.org/project/pypi-simple/
It gives an example too, which I tried to use here:
pypi-simple version 0.8.0 'DistributionPackage' object has no attribute 'get_digest':
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 11 17:40:03 2020
@author: Pietro
"""
from pypi_simple import PyPISimple
def simple():
    package=input('\npackage to be checked ')
    try:
        with PyPISimple() as client:
            requests_page = client.get_project_page(package)
    except Exception:
        print("\n SOMETHING WENT WRONG !!!!! \n\n",
              "CHECK INTERNET CONNECTION OR DON'T KNOW WHAT HAPPENED !!!\n")
        return  # nothing to show without a project page
    pkg = requests_page.packages[0]
    print(pkg)
    print(type(pkg))
    print('\n',pkg,'\n')
    print('\n'+pkg.filename+'\n')
    print('\n'+pkg.url+'\n')
    print('\n'+pkg.project+'\n')
    print('\n'+pkg.version+'\n')
    print('\n'+pkg.package_type+'\n')
    #print('\n'+pkg.get_digest()+'\n','ENDs HERE !!!!') #wasnt working
if __name__ == '__main__':
    simple()
I got -4 so far for this answer and don't know why. I figured out that I can check for a package with:
# package_name = input('insert package name : ')
package_name = 'numpy'
import requests
url = ('https://pypi.org/pypi/'+package_name+'/json')
r = requests.get(url)
try:
    data = r.json()
    for i in data:
        if i == 'info':
            print('ok')
            for j in data[i]:
                if j == 'name':
                    print((data[i])[j])
    print([k for k in (data['releases'])])
except Exception:
    print('something went south !!!!!!!!!!')
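The same JSON API lookup can be sketched with only the standard library, keeping the download separate from the part that reads the document. This is an illustrative sketch rather than a replacement for pip search: it looks up one exact project name and does not search. The function names fetch_project and summarize are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_project(package_name):
    """Download the JSON document for one project (needs network access)."""
    url = "https://pypi.org/pypi/%s/json" % package_name
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def summarize(data):
    """Pick the name, current version and release list out of the document."""
    info = data.get("info", {})
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        # plain string sort, so the order is lexical, not by version number
        "releases": sorted(data.get("releases", {})),
    }


if __name__ == "__main__":
    try:
        print(summarize(fetch_project("numpy")))
    except OSError as error:  # URLError and HTTPError are OSError subclasses
        print("lookup failed:", error)
```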

Python detect new connection to wifi

I saw a tutorial on YouTube (I can't link it because I can't find it anymore).
The code is supposed to detect devices that connect to my router, but I don't understand much about how the tutorial author's code works.
I also got this error in my console:
File "c:/Users/j/Desktop/Connection-Detection.py", line 6, in
IP_NETWORK = config('IP_NETWORK')
File "C:\Users\j\AppData\Local\Programs\Python\Python38-32\lib\site-packages\decouple.py", line 199, in call
return self.config(*args, **kwargs)
File "C:\Users\j\AppData\Local\Programs\Python\Python38-32\lib\site-packages\decouple.py", line 83, in call
return self.get(*args, **kwargs)
File "C:\Users\j\AppData\Local\Programs\Python\Python38-32\lib\site-packages\decouple.py", line 68, in get
raise UndefinedValueError('{} not found. Declare it as envvar or define a default value.'.format(option))
decouple.UndefinedValueError: IP_NETWORK not found. Declare it as envvar or define a default value.
PS C:\Users\j\Desktop\python\login>
That's "Detection.py":
import sys
import subprocess
import os
from decouple import config
IP_NETWORK = config('IP_NETWORK')
IP_DEVICE = config('IP_DEVICE')
proc = subprocess.Popen(['ping', IP_NETWORK],stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()  # readline must be called; without () the loop never reads a line
    if not line:
        break
    connected_ip = line.decode('utf-8').split()[3]
    if connected_ip == IP_DEVICE:
        subprocess.Popen(['say', 'Someone connected to network'])
You need to define these settings in a .env file in the same directory as Detection.py; python-decouple reads them from there (or from environment variables).
Steps
Install python-decouple: pip install python-decouple.
Create a file called .env.
Open the .env file and paste the following into it:
IP_NETWORK=YOUR_IP_NETWORK
IP_DEVICE=YOUR_IP_DEVICE
Replace YOUR_IP_NETWORK and YOUR_IP_DEVICE with your own network and device IP addresses.
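To make the file format concrete, here is a small standard-library sketch of how KEY=VALUE lines like these can be read. It is only an illustration of the format, not python-decouple's actual implementation, and the IP addresses are made-up placeholders.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Placeholder addresses, for illustration only.
sample = "IP_NETWORK=192.168.1.1\nIP_DEVICE=192.168.1.23\n"
settings = parse_env(sample)
print(settings["IP_NETWORK"])
print(settings["IP_DEVICE"])
```

If a key is missing from the file, a lookup fails, which corresponds to the UndefinedValueError shown in the question.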

How to install packages with python scripts

I want to install a package with a Python script. I have read the documentation for the PackageManager API (http://doc.aldebaran.com/2-4/naoqi/core/packagemanager-api.html):
So I packaged the app with Choregraphe as described in http://doc.aldebaran.com/2-4/naoqi/core/packagemanager.html and tried to install it with a Python script that looks like this:
import qi
import sys
if __name__ == '__main__':
    ip = "11.1.11.111"
    port = 9559
    session = qi.Session()
    try:
        session.connect("tcp://" + ip + ":" + str(port))
    except RuntimeError:
        print ("Can't connect to Naoqi at ip \"" + ip + "\" on port " + str(port))
        sys.exit(1)
    service = session.service("PackageManager")
    package = "C:\\test_package_handlers_01-835a92-1.0.0.pkg"
    # this is to see if the problem is that python can not locate the file
    with open(package) as f:
        print f
    service.install(package)
And here is what I receive as an error:
# provided package could be opened
<open file 'C:\\test_package_handlers_01-835a92-1.0.0.pkg', mode 'r' at 0x02886288>
Traceback (most recent call last):
File "C:/test.py", line 24, in <module>
service.install(package)
RuntimeError: C:\test_package_handlers_01-835a92-1.0.0.pkg: no such file
I guess this is because the package must be uploaded on the robot and the package file path must be the one that is on the robot.
EDITED
I added the package to a blank Choregraphe project and ran that project on the robot. This saved the package on the robot at /home/nao/.local/share/PackageManager/apps/.lastUploadedChoregrapheBehavior/test_package_handlers_01-835a92-1.0.0.pkg. When I changed the path in my script from "C:\\test_package_handlers_01-835a92-1.0.0.pkg" to "/home/nao/.local/share/PackageManager/apps/.lastUploadedChoregrapheBehavior/test_package_handlers_01-835a92-1.0.0.pkg", the script worked as intended and the package was installed on the robot.
So is there a way to install packages from my PC without first uploading them to the robot? Otherwise it is easier to just use Choregraphe to upload projects.
To explain what I want to achieve:
I have a folder on my PC with, for example, 20 packages.
I want to install all 20 packages with one Python script.
The script should install all the packages from the folder when it is invoked like this:
python package_installer.py path_to_packages_folder
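The folder-scanning part of such a script could be sketched as below. The names find_packages and install_all are hypothetical, and the install argument stands in for whatever actually uploads one .pkg file and calls PackageManager.install for it; that step is deliberately left out here.

```python
import os


def find_packages(folder):
    """Return the sorted full paths of all .pkg files directly inside folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".pkg")
    )


def install_all(folder, install):
    """Call install(path) for every package and collect failures."""
    failures = []
    for path in find_packages(folder):
        try:
            install(path)
        except Exception as error:  # keep going so one bad package does not block the rest
            failures.append((path, str(error)))
    return failures
```

In a package_installer.py, the folder would come from sys.argv[1], and the returned list shows which packages failed and why.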
EDITED_2
import qi
import ftplib
import os
ROBOT_URL = "10.80.129.90"
print "Uploading PKG"
pkg_file = "my-application-0.0.1.pkg"
pkg_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), pkg_file)
ftp = ftplib.FTP(ROBOT_URL)
ftp.login("nao", "nao")
with open(pkg_path, "rb") as pkg:  # storbinary expects a file opened in binary mode
    ftp.storbinary("STOR "+pkg_file, pkg)
print "Connecting NAOqi session"
app = qi.Application(url='tcp://'+ROBOT_URL+':9559')
app.start()
session = app.session
print "Installing app"
packagemgr = session.service("PackageManager")
packagemgr.install("/home/nao/"+pkg_file)
print "Cleaning robot"
ftp.delete(pkg_file)
ftp.quit()
print "End"
app.stop()
This piece of code ftp = ftplib.FTP(ROBOT_URL) throws the following exception:
Traceback (most recent call last):
File "C:/Stefan/DSK_PEPPER_clode_2/PythonScripts/_local_testing/uploading_and_installing_package.py", line 11, in <module>
ftp = ftplib.FTP(ROBOT_URL)
File "C:\Python27\lib\ftplib.py", line 120, in __init__
self.connect(host)
File "C:\Python27\lib\ftplib.py", line 135, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout)
File "C:\Python27\lib\socket.py", line 575, in create_connection
raise err
socket.error: [Errno 10061] No connection could be made because the target machine actively refused it
Also, when I connect to the robot with username 'nao' and password 'nao' as described in http://doc.aldebaran.com/2-5/dev/tools/opennao.html and try to create a folder in /home/nao/.local/share/PackageManager/apps/ with sudo mkdir, I am told: Sorry, user nao is not allowed to execute '/bin/mkdir dasdas' as root on Pepper. With plain mkdir I get: mkdir: cannot create directory 'new_folder': Permission denied
Using qibuild, you can also install directly with:
qipkg deploy-package /path/to/my-package.pkg --url nao@10.10.23.45
You do indeed need to upload the file first. You can use scp or sftp to do this. Once the .pkg is on the robot, you can use PackageManager.install.
Something like:
import qi
import paramiko
import os
ROBOT_URL = "10.80.129.90"
print "Uploading PKG"
pkg_file = "my-application-0.0.1.pkg"
pkg_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), pkg_file)
transport = paramiko.Transport((ROBOT_URL, 22))
transport.connect(username="nao", password="nao")
sftp = paramiko.SFTPClient.from_transport(transport)
sftp.put(pkg_path, pkg_file)
print "Connecting NAOqi session"
app = qi.Application(url='tcp://'+ROBOT_URL+':9559')
app.start()
session = app.session
print "Installing app"
packagemgr = session.service("PackageManager")
packagemgr.install("/home/nao/"+pkg_file)
print "Cleaning robot"
sftp.remove(pkg_file)
sftp.close()
transport.close()
print "End"
app.stop()
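When the script runs on Windows and the robot runs Linux, the path given to PackageManager.install has to be the robot-side path with forward slashes, as the earlier "no such file" error shows. Here is a small sketch of that mapping, assuming (as in the script above) that the upload lands in /home/nao. The function name remote_pkg_path is hypothetical.

```python
import ntpath
import posixpath


def remote_pkg_path(local_path, remote_home="/home/nao"):
    """Map a local .pkg path (Windows or POSIX style) to its path on the robot."""
    file_name = ntpath.basename(local_path)  # ntpath splits on both \ and /
    return posixpath.join(remote_home, file_name)


print(remote_pkg_path("C:\\packages\\my-application-0.0.1.pkg"))
print(remote_pkg_path("/tmp/packages/my-application-0.0.1.pkg"))
```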

Accessing OrientDB from Python

I want to convert a >1mn record MySQL database into a graph database, because it is heavily linked network-type data. The free version of Neo4J had some restrictions I thought I might bump up against, so I've installed OrientDB (Community 2.2.0) (on Ubuntu Server 16.04) and got it working. Now I need to access it from Python (3.5.1+), so I'm trying pyorient (1.5.2). (I tried TinkerPop since I eventually want to use Gremlin, and couldn't get the gremlin console to talk to the OrientDB.)
The following simple Python code, to connect to one of the test graphs in OrientDB:
import pyorient
username="user"
password="password"
client = pyorient.OrientDB("localhost", 2424)
session_id = client.connect( username, password )
print("SessionID=",session_id)
db_name="GratefulDeadConcerts"
if client.db_exists( db_name, pyorient.STORAGE_TYPE_MEMORY ):
    print("Database",db_name,"exists")
    client.db_open( db_name, username, password )
else:
    print("Database",db_name,"doesn't exist")
gives a weird error:
SessionID= 27
Database GratefulDeadConcerts exists
Traceback (most recent call last):
File "FirstTest.py", line 18, in <module>
client.db_open( db_name, username, password )
File "/home/tom/MyProgs/TestingPyOrient/env/lib/python3.5/site-packages/pyorient/orient.py", line 379, in db_open
.prepare((db_name, user, password, db_type, client_id)).send().fetch_response()
File "/home/tom/MyProgs/TestingPyOrient/env/lib/python3.5/site-packages/pyorient/messages/database.py", line 141, in fetch_response
info = OrientVersion(release)
File "/home/tom/MyProgs/TestingPyOrient/env/lib/python3.5/site-packages/pyorient/otypes.py", line 202, in __init__
self._parse_version(release)
File "/home/tom/MyProgs/TestingPyOrient/env/lib/python3.5/site-packages/pyorient/otypes.py", line 235, in _parse_version
self.build = int( self.build )
ValueError: invalid literal for int() with base 10: '0 (build develop@r79d281140b01c0bc3b566a46a64f1573cb359783; 2016'
Does anyone know what that is or how I can fix it? Should I really be using TinkerPop instead? If so I'll post a separate question about my struggles with that.
I got the same error at first, but after upgrading pyorient to the latest version, 1.5.4, I get no errors.
$ python test.py
('SessionID=', 6)
('Database', 'GratefulDeadConcerts', 'exists')
$ python --version
Python 2.7.11
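The traceback in the question shows int() being handed a version component that still carries the build description, which is why it fails. The sketch below shows a more tolerant way to read such a string. It is not pyorient's code, the function name parse_release is hypothetical, and the sample release string is a shortened guess at what the server reports.

```python
import re


def parse_release(release):
    """Return (major, minor, build) from a release string, ignoring trailing text."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % release)
    return tuple(int(part) for part in match.groups())


# Shortened, assumed example of a release string with a build suffix.
sample = "2.2.0 (build develop@r79d2; 2016"
print(parse_release(sample))

try:
    int(sample.split(".")[2])  # the kind of call that fails in the traceback
except ValueError as error:
    print("int() fails:", error)
```

Upgrading pyorient, as described above, remains the actual fix.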

Connect to Filemaker Database using JDBC, Python, and JayDeBeApi

I'm trying to write an AWS Lambda Python Package that will connect to a FileMaker database over JDBC. To test, I've launched an EC2 instance with the Lambda Linux AMI, and created a virtualenv (/venv) that I'm testing in. I've uploaded the fmjdbc.jar to the instance using WinSCP to /venv/lib/fmjdbc.jar. The code uses JayDeBeApi, following the usage example here: https://pypi.python.org/pypi/JayDeBeApi/#usage
My code so far is the following:
import jaydebeapi as jdb
driverclass = 'com.filemaker.jdbc.Driver'
jdbcURL = 'jdbc:filemaker://url:port;database'
jar = '/home/ec2-user/lambda-test-project/venv/lib/fmjdbc.jar'
print jar
conn = jdb.connect(driverclass,[jdbcURL,'username','password'],jar)
Which gives me the error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ec2-user/lambda-test-project/venv/local/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 359, in connect
jconn = _jdbc_connect(jclassname, jars, libs, *driver_args)
File "/home/ec2-user/lambda-test-project/venv/local/lib/python2.7/site-packages/jaydebeapi/__init__.py", line 183, in _jdbc_connect_jpype
return jpype.java.sql.DriverManager.getConnection(*driver_args)
jpype._jexception.SQLExceptionPyRaisable: java.sql.SQLException: No suitable driver found for jdbc:filemaker://<MY URL STUFF IS HERE>
How can I get the jdbc driver to be read by Python's virtual environment? I'd like to have this code work in a Lambda package eventually, so I'm hoping there's a solution that can be integrated to the Python code that will work repeatedly on newly created servers.
You can use the jpype package to put the driver on the JVM classpath. I have used it to connect to an Oracle DB before. Here is my sample code, which may be useful for you.
import jaydebeapi,jpype
classpath = "your jdbc jar driver path"
jvm_path = "/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.36.x86_64/jre/lib/amd64/server/libjvm.so" #your java vm path
jpype.startJVM(jvm_path, "-Djava.class.path=%s" % classpath) #start jvm based on the driver
conn = jaydebeapi.connect(xxxxxx)
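Since "No suitable driver found" often comes down to the jar not being on the classpath, it can help to build the classpath option in one place and confirm that the jar files exist before starting the JVM. The helper below is a hypothetical sketch using only the standard library; it produces the string that the sample code above passes to jpype.startJVM.

```python
import os


def jvm_classpath_option(jar_paths, check_exists=True):
    """Build the -Djava.class.path option for one or more driver jars."""
    if check_exists:
        missing = [path for path in jar_paths if not os.path.isfile(path)]
        if missing:
            raise FileNotFoundError("jar not found: " + ", ".join(missing))
    # os.pathsep is ':' on Linux and ';' on Windows
    return "-Djava.class.path=" + os.pathsep.join(jar_paths)


print(jvm_classpath_option(["/opt/drivers/fmjdbc.jar"], check_exists=False))
```

The path /opt/drivers/fmjdbc.jar is only a placeholder; in a Lambda package the jar would need to be shipped with the code and referenced by its deployed location.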
