How to run SQLAlchemy on AWS Lambda in Python

I prepared a very simple file for connecting to an external MySQL database server, like below:
from sqlalchemy import *

def run(event, context):
    # The same engine URL works locally on Windows
    sql = create_engine('mysql://root:root@127.0.0.1/scraper?charset=utf8')
    metadata = MetaData(sql)
    print(sql.execute('SHOW TABLES').fetchall())
It doesn't work on AWS, but locally on Windows it works perfectly.
Next, I installed the package with pip install sqlalchemy --target my/dir and prepared a ZIP file to upload the packages to AWS Lambda.
It runs, but fails with ModuleNotFoundError: No module named 'MySQLdb'.
Then I ran pip install mysqlclient --target my/dir, created the ZIP and uploaded it to AWS Lambda again.
It runs, but now fails with ImportError: cannot import name '_mysql'.
So, what should I do now?

SQLAlchemy includes many Dialect implementations for various backends.
Dialects for the most common databases are included with SQLAlchemy; a
handful of others require an additional install of a separate dialect.
The MySQL dialect uses mysql-python as the default DBAPI. There are
many MySQL DBAPIs available, including MySQL-connector-python and
OurSQL
Instead of mysql, you can use the mysql+mysqlconnector dialect in the connection URL:
sql = create_engine('mysql+mysqlconnector://root:root@127.0.0.1/scraper?charset=utf8')
Then install the driver into your package directory:
pip install mysql-connector --target my/dir
Create the ZIP and upload it to AWS Lambda again.
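Putting this together, a minimal sketch of the handler from the question rewritten to use mysql-connector; the credentials, host and database name are the placeholders from the question, not values to keep:

from sqlalchemy import create_engine, text

# Sketch only: the URL reuses the placeholder credentials from the question.
engine = create_engine('mysql+mysqlconnector://root:root@127.0.0.1/scraper?charset=utf8')

def run(event, context):
    # Open a connection per invocation and list the tables, as in the question
    with engine.connect() as conn:
        tables = conn.execute(text('SHOW TABLES')).fetchall()
    print(tables)
    return [row[0] for row in tables]

The packaging step stays the same as in the question: pip install sqlalchemy mysql-connector --target my/dir, zip that directory together with this file, and upload it to Lambda.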

Related

Installing postgresql db together with python package

I wrote a Python script that requires the use of a PostgreSQL DB. For test purposes, I installed PostgreSQL manually, along with the DB that comes with it. The script connects to it and does its job.
My question is about packaging: what is the best solution for the user to install this script, along with the DB and its schema, just by typing pip install xxx?
Is that possible ?
Thanks
Postgres is great, but often you can get away with SQLite. It's part of the standard library and comes bundled with Python.
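As a quick illustration of that suggestion, a minimal sketch using only the standard library (the file name and table are made up for this example):

import sqlite3

# SQLite ships with Python, so pip only needs to install your own package.
conn = sqlite3.connect('app.db')  # illustrative file name
conn.execute('CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)')
conn.execute('INSERT INTO items (name) VALUES (?)', ('example',))
conn.commit()
print(conn.execute('SELECT * FROM items').fetchall())
conn.close()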

Can not connect to mysql database using Django

I'm trying to connect my Django app to MySQL and I have successfully installed MySQL-connector-python-2.0.4 as well. But after that, whenever I run the server, the same error message shows up.
Here is my cmd ...
Here is the full message shown
You are installing the MySQL connector for Python (not recommended); you need the MySQLdb module. Refer to the official recommendation in the Django docs: https://docs.djangoproject.com/en/3.0/ref/databases/#mysql-db-api-drivers
Install mysqlclient instead: it is a native driver and it's the recommended choice.
pip install mysqlclient
For further information, have a look at this Medium post: https://medium.com/@omaraamir19966/connect-django-with-mysql-database-f946d0f6f9e3
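With mysqlclient installed, the usual DATABASES entry keeps Django's built-in MySQL backend; the name, user, password and host below are placeholders, not values from the question:

# settings.py sketch for mysqlclient (MySQLdb); all values are placeholders
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydatabase',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': '127.0.0.1',
        'PORT': '3306',
    }
}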

How to Connect to RDS Instance from AWS Glue Python Shell?

I am trying to access an RDS instance from AWS Glue. I have a few Python scripts running in EC2 instances and I currently use pyodbc to connect, but while trying to schedule jobs in Glue, I cannot import pyodbc as it is not natively supported by AWS Glue, and I'm not sure how drivers would work in the Glue shell either.
From: Introducing Python Shell Jobs in AWS Glue announcement:
Python shell jobs in AWS Glue support scripts that are compatible with Python 2.7 and come pre-loaded with libraries such as the Boto3, NumPy, SciPy, pandas, and others.
The module list doesn't include pyodbc module, and it cannot be provided as custom .egg file because it depends on libodbc.so.2 and pyodbc.so libraries.
I think you have 2 options:
Create a JDBC connection to your DB from Glue's console, and use Glue's internal methods to query it (see the sketch after this answer). This will require code changes of course.
Use Lambda function instead. You'll need to pack pyodbc and the required libs along with your code in a zip file. Someone has already compiled those libs for AWS Lambda, see here.
Hope it helps
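For the first option, a rough sketch of what "Glue's internal methods" can look like; note this runs in a Glue Spark (ETL) job rather than a Python shell job, and the connection type, URL and credentials below are placeholders for an SQL Server RDS instance:

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Placeholders: swap in the values from the JDBC connection you created in the console
dyf = glue_context.create_dynamic_frame.from_options(
    connection_type='sqlserver',
    connection_options={
        'url': 'jdbc:sqlserver://my-rds-endpoint:1433;databaseName=mydb',
        'dbtable': 'mytable',
        'user': 'admin',
        'password': 'secret',
    },
)
print(dyf.count())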
For AWS Glue, use either a DataFrame or a DynamicFrame and specify the SQL Server JDBC driver. AWS Glue already contains the JDBC driver for SQL Server in its environment, so you don't need to add any additional driver jar to the Glue job.
df1=spark.read.format("jdbc").option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver").option("url", url_src).option("dbtable", dbtable_src).option("user", userID_src).option("password", password_src).load()
If you are using a SQL query instead of a table, wrap it in parentheses and give it an alias:
df1=spark.read.format("jdbc").option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver").option("url", url_src).option("dbtable", "(your select statement here) A").option("user", userID_src).option("password", password_src).load()
As an alternative solution, you can also use the jTDS driver for SQL Server in your Python script running in AWS Glue.
If anyone needs a Postgres connection with SQLAlchemy in a Python shell job, it is possible by referencing the sqlalchemy, scramp and pg8000 wheel files. It's important to rebuild the pg8000 wheel after removing the scramp dependency from its setup.py.
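As an illustration of how the pieces fit once those wheels are attached to the job, a minimal sketch (host, credentials and database name are placeholders):

from sqlalchemy import create_engine, text

# Assumes the sqlalchemy, scramp and rebuilt pg8000 wheels are referenced by the job.
engine = create_engine('postgresql+pg8000://user:password@my-rds-endpoint:5432/mydb')
with engine.connect() as conn:
    print(conn.execute(text('SELECT 1')).scalar())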
I needed to do something similar and ended up creating another Glue job in Scala while using Python for everything else. I know it may not work for everyone, but I wanted to mention How to run DDL SQL statement using AWS Glue.
I was able to use the Python library psycopg2 even though it is not written in pure Python and does not come preloaded with the AWS Glue Python shell environment; this runs contrary to the AWS Glue documentation. So you might be able to use ODBC-related Python libraries in a similar way. I created .egg files for the psycopg2 library and used them successfully within the Glue Python shell environment. Following are the logs from the Glue Python shell when you have import psycopg2 in your script and the Glue job refers to the related psycopg2 .egg files:
Creating /glue/lib/installation/site.py
Processing psycopg2-2.8.3-py2.7.egg
Copying psycopg2-2.8.3-py2.7.egg to /glue/lib/installation
Adding psycopg2 2.8.3 to easy-install.pth file
Installed /glue/lib/installation/psycopg2-2.8.3-py2.7.egg
Processing dependencies for psycopg2==2.8.3
Searching for psycopg2==2.8.3
Reading https://pypi.org/simple/psycopg2/
Downloading https://files.pythonhosted.org/packages/5c/1c/6997288da181277a0c29bc39a5f9143ff20b8c99f2a7d059cfb55163e165/psycopg2-2.8.3.tar.gz#sha256=897a6e838319b4bf648a574afb6cabcb17d0488f8c7195100d48d872419f4457
Best match: psycopg2 2.8.3
Processing psycopg2-2.8.3.tar.gz
Writing /tmp/easy_install-dml23ld7/psycopg2-2.8.3/setup.cfg
Running psycopg2-2.8.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-dml23ld7/psycopg2-2.8.3/egg-dist-tmp-9qwen3l_
creating /glue/lib/installation/psycopg2-2.8.3-py3.6-linux-x86_64.egg
Extracting psycopg2-2.8.3-py3.6-linux-x86_64.egg to /glue/lib/installation
Removing psycopg2 2.8.3 from easy-install.pth file
Adding psycopg2 2.8.3 to easy-install.pth file
Installed /glue/lib/installation/psycopg2-2.8.3-py3.6-linux-x86_64.egg
Finished processing dependencies for psycopg2==2.8.3
These are the steps that I used to connect to an RDS instance from a Glue Python shell job:
Package your dependencies into an .egg file (these packages must be pure Python, if I remember correctly). Put it in S3.
Set your job to reference that .egg file under job configuration > Python library path.
Verify that your job can import the package/module.
Create a Glue connection to your RDS instance (it's under Databases > Tables, Connections), and test the connection to make sure it can reach your RDS instance.
Now set your job to reference/use this connection. It's in the required connections section when you configure or edit your job.
Once those steps are done and verified, you should be able to connect. In my sample I used pymysql (see the sketch below).
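A minimal sketch along the lines of that sample; the endpoint, credentials and database name are placeholders and would normally come from the Glue connection or job parameters:

import pymysql

# Placeholder connection details for illustration only
connection = pymysql.connect(
    host='my-rds-endpoint.rds.amazonaws.com',
    user='admin',
    password='secret',
    database='mydb',
    connect_timeout=10,
)
try:
    with connection.cursor() as cursor:
        cursor.execute('SELECT VERSION()')
        print(cursor.fetchone())
finally:
    connection.close()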

Azure App Service ImportError: libmysqlclient.so.18: cannot open shared object file: No such file or directory

On Azure App Service on Linux with Python, the MySQL module does not seem to work:
2018-12-24T19:11:38.215760010Z import _mysql
2018-12-24T19:11:38.215763810Z ImportError: libmysqlclient.so.18: cannot
open shared object file: No such file or directory
...
2018-12-24T19:11:27.536810347Z django.core.exceptions.ImproperlyConfigured:
Error loading MySQLdb module.
2018-12-24T19:11:27.536813747Z Did you install mysqlclient?
requirements:
django
mysqlclient
Has anyone ever managed to run Django on an Azure web app?
This is a common error. Using mysqlclient also requires native dependencies to be installed: either the MySQL client libraries or the MySQL-compatible MariaDB client libraries. The easiest way to address this is to change your project to use mysql-connector-python instead of mysqlclient. You will also have to update your settings so that any database engine that uses django.db.backends.mysql is updated to mysql.connector.django.
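In practice that is a one-line change in settings.py; a sketch with the other keys omitted, since they stay whatever they already are:

DATABASES = {
    'default': {
        'ENGINE': 'mysql.connector.django',  # was 'django.db.backends.mysql'
        # NAME, USER, PASSWORD, HOST, PORT stay as in your existing settings
    }
}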
It sounds like the native MySQL client library is not installed in your Azure App Service for Linux.
Here are two cases for building a custom image.
For a Debian or Ubuntu image, run apt install libmysqlclient-dev first to preinstall libmysqlclient.so in your Docker image.
For a Fedora or CentOS image, run yum install mysql-libs first to preinstall the same library.
Or you can directly use an existing image that has these required libs preinstalled, from Azure Container Registry or Docker Hub.
You can also try going to the App Service SCM site, finding the pip location, and then using pip to install the required module.

Flask-SQLAlchemy on App Engine connect to MSSQL database on Cloud Compute Engine using

I'm very new to the GCP as a whole, and I need to deploy a Flask app for a project with a client. Deploying an app is simple enough given all of the docs Google has provided, and since using the flexible app engine seems like the easiest way to do it, that's what I'm trying to use.
The issue I'm having though is in trying to connect to an MSSQL database that was setup on a Compute Engine. So far, I've connected to the database locally using pyodbc with some help from Connect to MSSQL Database using Flask-SQLAlchemy.
I was certain running gcloud app deploy would not work, and sure enough it wasn't able to install the pyodbc module. I figured that that wouldn't be the way to go anyway, and based on this page of the docs, it seems like I should be able to connect to the compute engine via its internal IP address.
I don't know how to proceed from here though, because everything in the docs wants me to use a Cloud SQL instance, but given that this data was provided by a client and I'm working on their GCP project, I'm a bit limited to the scenario I've described above.
This has since been resolved.
The issue was that there were no ODBC drivers downloaded on the server, so I needed to create a custom runtime with a Dockerfile in order to first install the drivers.
My solution was greatly aided by this solution: Connect docker python to SQL server with pyodbc
The steps are as follows:
Run gcloud beta app gen-config --custom in your flask app's directory.
Inside the newly created Dockerfile, add these lines before the pip requirements are installed.
# Install FreeTDS and dependencies for pyodbc
RUN apt-get update
RUN apt-get install -y tdsodbc unixodbc-dev
RUN apt install unixodbc-bin -y
RUN apt-get clean -y
ADD odbcinst.ini /etc/odbcinst.ini
The file odbcinst.ini should contain the following lines:
[FreeTDS]
Description=FreeTDS Driver
Driver=/usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup=/usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
After that gcloud app deploy should work just fine.
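For completeness, a hedged sketch of the Flask-SQLAlchemy side once FreeTDS is registered; the host, port, credentials, database and TDS version are illustrative placeholders, not values from the question:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
# Extra query parameters such as TDS_Version are passed through to the ODBC connection string
app.config['SQLALCHEMY_DATABASE_URI'] = (
    'mssql+pyodbc://user:password@10.128.0.2:1433/mydb'
    '?driver=FreeTDS&TDS_Version=7.3'
)
db = SQLAlchemy(app)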
