Connecting to CloudSQL from Dataflow in Python - python

I'm trying to connect to Cloud SQL from a Python pipeline.
Current situation
I can connect without any problem using DirectRunner.
I cannot connect using DataflowRunner.
Connection function
def cloudSQL(input):
    import pymysql
    connection = pymysql.connect(host='<server ip>',
                                 user='...',
                                 password='...',
                                 db='...')
    cursor = connection.cursor()
    cursor.execute("select ...")
    # Read the row before closing the connection.
    result = cursor.fetchone()
    connection.close()
    if result is not None:
        yield input
The error
This is the error message using DataflowRunner
OperationalError: (2003, "Can't connect to MySQL server on '<server ip>' (timed out)")
CloudSQL
The instance has a public IP (to test locally with DirectRunner). I have also tried enabling a private IP to see whether that was what prevented DataflowRunner from connecting.
Option2
I have also tried with
connection = pymysql.connect(unix_socket='/cloudsql/' + <INSTANCE_CONNECTION_NAME>,
                             user='...',
                             password='...',
                             db='...')
With the error:
OperationalError: (2003, "Can't connect to MySQL server on 'localhost' ([Errno 2] No such file or directory)")

Take a look at the Cloud SQL Proxy. It will create a local entrypoint (Unix socket or TCP port depending on what you configure) that will proxy and authenticate connections to your Cloud SQL instance.
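As a minimal sketch of what connecting through the proxy's Unix socket could look like from Python: the helper names below are hypothetical, and it assumes the proxy is already running on the worker and exposing sockets under /cloudsql (the directory used in the question). If the proxy is not running there, the socket file does not exist, which matches the "No such file or directory" error above.

```python
def proxy_connect_kwargs(instance_connection_name, user, password, database,
                         socket_dir="/cloudsql"):
    """Build keyword arguments for pymysql.connect() over a proxy Unix socket.

    instance_connection_name is expected in the form project:region:instance.
    socket_dir is an assumption: it must match the directory the proxy
    was started with.
    """
    return {
        "unix_socket": f"{socket_dir}/{instance_connection_name}",
        "user": user,
        "password": password,
        # `db` in the question is an alias of `database` in pymysql.
        "database": database,
    }


def connect_via_proxy(instance_connection_name, user, password, database):
    # Imported lazily, as in the question, so the import happens on the worker.
    import pymysql
    return pymysql.connect(
        **proxy_connect_kwargs(instance_connection_name, user, password, database)
    )
```

No host or port is passed: when unix_socket is given, the connection goes through the socket file and the proxy handles authentication to the instance.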

You would have to mimic the implementation of JdbcIO.read() in Python as explained in this StackOverflow answer

With this solution I was able to access Cloud SQL.
For testing purposes you can add 0.0.0.0/0 to the authorized networks of the Cloud SQL public IP, without using certificates.

I created an example that runs the Cloud SQL Proxy inside the Dataflow worker container and connects from the Python pipeline over Unix sockets, with no need for SSL or IP authorization.
This way the pipeline can connect to multiple Cloud SQL instances.
https://github.com/jccatrinck/dataflow-cloud-sql-python
The repository includes a screenshot of the log output listing the database tables as an example.

Related

Flask looks for db locally when it's on another server

I'm trying to set up my Flask application to work with a database hosted on a different server. The setup itself works: with a simple PyMySQL script I can connect to the database, but when I try to do the same from Flask I run into problems.
I'm keeping my db configuration in config.py:
SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://user:pass@external_ip/mydb'
But whenever I try to run a query, I get the following error:
SELECT command denied to user 'user'@'local_ip'
So Flask seems to be looking for the db locally, for some reason, even though I set it to point to an external server. Can anyone help me out with this?
On the same environment, the following will connect and allow me to make queries:
connection = pymysql.connect(host='external_ip', user='user', password='pass', db='mydb', charset='utf8mb4', cursorclass=pymysql.cursors.DictCursor)
This error looks more like a db server error than a local Flask client error.
If it were a local Flask error caused by being unable to connect to the MySQL server, you would end up with something like:
Unable to connect to the server
Hostname unreachable
Connection refused (if you reach the server but with the wrong port, for instance)
Reading this error I guess that you have reached the server, but for that user + IP + Database combination you have no read permissions.
See the GRANT Statement doc for further details
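The distinction drawn above can be sketched with MySQL error numbers. Codes in the 2000 range come from the client library when it cannot reach the server, while codes such as 1044, 1045 and 1142 are sent back by the server once the connection has reached it. The function name and the short descriptions are illustrative, not part of any library.

```python
# Raised by the client library: the server was never reached.
CLIENT_CONNECTION_ERRORS = {
    2002: "cannot connect through the local socket",
    2003: "cannot connect to host (refused, unreachable, or timed out)",
    2005: "unknown host name",
}

# Returned by the server: the connection arrived, but access was denied.
SERVER_PERMISSION_ERRORS = {
    1044: "access denied to the database",
    1045: "access denied for the user (credentials or host)",
    1142: "command denied on the table",
}


def classify_mysql_error(code):
    """Return (kind, description) for a MySQL error number."""
    if code in CLIENT_CONNECTION_ERRORS:
        return "connection", CLIENT_CONNECTION_ERRORS[code]
    if code in SERVER_PERMISSION_ERRORS:
        return "permission", SERVER_PERMISSION_ERRORS[code]
    return "unknown", "not covered by this sketch"
```

"SELECT command denied to user ..." is error 1142, so it falls on the permission side: the fix is a GRANT for that user and client host on the server, not a change to the Flask configuration.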

Database connection failed for local MSSQL server with pymssql

I have been using pyodbc for database connections in a Windows environment and it works fine, but now I want to switch to pymssql so that the code is easier to deploy to a Linux machine as well. However, I am getting this error:
(20009, b'DB-Lib error message 20009, severity 9:\nUnable to connect: Adaptive Server is unavailable or does not exist (localhost:1433)\nNet-Lib error during Unknown error (10060)\n')
My connection code for using both pyodbc and pymssql is:
import pyodbc
import pymssql

def connectODSDB_1():
    conn_str = (
        r"Driver={SQL Server};"
        r"Server=(local);"
        r"Database=populatedSandbox;"
        r"Trusted_Connection=yes;"
    )
    return pyodbc.connect(conn_str)

def connectODSDB_2():
    server = '(local)'
    database = 'populatedSandbox'
    conn = pymssql.connect(server=server, database=database)
    return conn
What could be the problem? And solution?
After browsing the internet for a while, it seems pymssql needs TCP/IP to be enabled on the server for communication.
Open Sql Server Configuration Manager
Expand SQL Server Network Configuration
Click on Protocols for instance_name
Enable TCP/IP
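After enabling TCP/IP, a quick way to confirm that something is actually listening is to open a plain TCP connection to the port from the client machine. This is a minimal sketch using only the standard library; the function name is hypothetical, and port 1433 is the one shown in the error message above.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


# Example: tcp_port_open("localhost", 1433)
```

If this returns False, the problem is at the network or server configuration level (protocol disabled, service not listening, firewall) and no change to the pymssql call will help.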
I faced the same issue while using RDS (an AWS database instance). The inbound and outbound rules need to be configured.
Follow these steps:
Services -> RDS -> DB Instances -> select the DB -> Connectivity & Security
Under the Security section:
VPC security groups -> click on the security group
Change the inbound rules.
Check the source IP and change it to Anywhere or to a specific IP.

SSHTunnel for remote access of postgres server

I'm new to using postgres as well as ssh and am having some trouble understanding what I need to do to get remote clients accessing a postgres server. Right now I've got one computer with a server running that I can access using psycopg2 but now I want to query the server using another computer. I've looked around and found examples using sshtunneler, but I feel like I'm missing some puzzle pieces.
import psycopg2
from sshtunnel import SSHTunnelForwarder
import time

with SSHTunnelForwarder(
        ('192.168.1.121', 22),
        ssh_password="????",
        ssh_username="????",
        remote_bind_address=('127.0.0.1', 5432)) as server:
    conn = psycopg2.connect(database="networkdb", port=server.local_bind_port)
    curs = conn.cursor()
    sql = "select * from Cars"
    curs.execute(sql)
    rows = curs.fetchall()
    print(rows)
My first confusion is I'm not sure what username/password should be. I downloaded putty and put the remote address info in the tunnel section using this tutorial but I have no idea if that's doing anything. When I try to start the server I get the error
2017-03-03 10:03:28,742| ERROR | Could not connect to gateway 192.168.1.121:22 : 10060
Any sort of help/explanation of what I need to do would be appreciated.
If I can do it without ssh then that would be better. Currently running this:
psycopg2.connect(dbname='networkinfodb', user='postgres', host='168.192.1.121', password='postgres', port=5432)
outputs...
OperationalError Traceback (most recent call last)
in ()
----> 1 psycopg2.connect(dbname='networkinfodb', user='postgres', host='168.192.1.121', password='postgres', port=5432)
OperationalError: could not connect to server: Connection timed out (0x0000274C/10060)
Is the server running on host "168.192.1.121" and accepting
TCP/IP connections on port 5432?
and I'm not sure where to go to figure out what the issue is.
So I didn't use ssh tunneling. That was only a backup as I was having trouble connecting to the database using psycopg2. I found that the firewall was blocking the port from being accessed externally so I was able to change that and now I can access the database from clients.

How to connect to MS SQL Server database remotely by IP in Python using mssql and pymssql

How can I connect to an MS SQL Server database remotely by IP in Python using the mssql and pymssql modules?
To connect locally I use link = mssql+pymssql://InstanceName/DataBaseName
I enabled TCP/IP in the network configuration.
But how can I get the connection link?
Thank you.
You need to create a Connection object
import pymssql
ip = '127.0.0.1'
# pymssql.connect() takes `user`, not `username`.
database_connection = pymysql_connection = pymssql.connect(host=ip, port=1433, user='foo', password='bar')
If you're using SQLAlchemy, or another ORM that supports connection strings, you can also use the following format for the connection string.
'mssql+pymssql://{user}:{password}@{host}:{port}'
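When filling in that format, the user name and password should be URL-encoded, otherwise characters such as @ or / in a password are read as part of the URL structure. A minimal sketch, assuming a hypothetical helper name and placeholder values, with the database name appended as a path segment:

```python
from urllib.parse import quote_plus


def build_db_url(driver, user, password, host, port, database):
    """Build a SQLAlchemy-style URL with user and password URL-encoded."""
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values for illustration only.
url = build_db_url("mssql+pymssql", "foo", "p@ss/word", "10.0.0.5", 1433, "mydb")
print(url)  # mssql+pymssql://foo:p%40ss%2Fword@10.0.0.5:1433/mydb
```

The resulting string can be passed wherever a connection string is expected, for example as SQLALCHEMY_DATABASE_URI in a Flask configuration.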

Problems in connecting to MusicBrainz database using psycopg2

I am trying to connect to the MusicBrainz database using the psycopg2 Python module. I have followed the instructions presented on http://musicbrainz.org/doc/MusicBrainz_Server/Setup, but I cannot connect. In particular I am using the following little script:
import psycopg2
conn = psycopg2.connect(database='musicbrainz_db', user='musicbrainz', password='musicbrainz', port=5000, host='10.16.65.250')
print("Connection established")
The problem is that when I launch it, it never reaches the print statement, and the console (I'm on Linux) blocks indefinitely. It does not even react to Ctrl-C, so I have to kill Python itself from another console. What can cause this?
You seem to be mistaking MusicBrainz-Server to be only the database.
What's running on port 5000 is the Web Server.
You can access http://10.16.65.250:5000 in the browser.
Postgres is also running, but listens on localhost:5432.
This works:
import psycopg2
conn = psycopg2.connect(database="musicbrainz_db",
                        user="musicbrainz", password="musicbrainz",
                        port="5432", host="localhost")
print("Connection established")
To make postgres listen on more than localhost, you need to change listen_addresses in /etc/postgresql/9.1/main/postgresql.conf and add an entry for your (client) host or network in /etc/postgresql/9.1/main/pg_hba.conf.
My VM is running in a 192.168.1.0/24 network, so I set listen_addresses='*' in postgresql.conf and added this line to pg_hba.conf:
host all all 192.168.1.0/24 trust
I can now connect from my local network to the DB in the VM.
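To check whether a given client would fall under such a pg_hba.conf entry, the address match can be reproduced with the standard ipaddress module. This sketch only models the address column; the database and user columns and the authentication method are not considered, and the function name is hypothetical.

```python
import ipaddress


def client_in_network(client_ip, cidr):
    """Return True if client_ip lies inside the network written as CIDR."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


# A client at 192.168.1.50 matches the 192.168.1.0/24 entry above,
# while the VM address from the question, 10.16.65.250, does not.
print(client_in_network("192.168.1.50", "192.168.1.0/24"))
print(client_in_network("10.16.65.250", "192.168.1.0/24"))
```

A client whose address does not match any entry is rejected by the server even when listen_addresses allows the connection to arrive.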
Depending on what you actually need, you might not want to connect to the MusicBrainz Server via postgres. There is a MusicBrainz web service you can access in the VM.
Example:
http://10.16.65.250:5000/ws/2/artist/c5c2ea1c-4bde-4f4d-bd0b-47b200bf99d6.
In that case you might be interested in a library to process the data:
python-musicbrainzngs.
EDIT:
You need to set musicbrainzngs.set_hostname("10.16.65.250:5000") for musicbrainzngs to connect to your local VM.
