Is it possible to make SQLAlchemy do cross server joins?
If I try to run something like
engine = create_engine('mssql+pyodbc://SERVER/Database')
query = sql.text('SELECT TOP 10 * FROM [dbo].[Table]')
with engine.begin() as connection:
data = connection.execute(query).fetchall()
It works as I'd expect. If I change the query to select from [OtherServer].[OtherDatabase].[dbo].[Table] I get an error message "Login failed for user 'NT AUTHORITY\\ANONYMOUS LOGON"
Looks like there's an issue with how you authenticate to SQL server.
I believe you can connect using the current Windows user, the URI syntax is then mssql+pyodbc://SERVER/Database?trusted_connection=yes (I have never tested this, but give it a try).
Another option is to create a SQL server login (ie. a username/password that is defined within SQL server, NOT a Windows user) and use the SQL server login when you connect.
The database URI then becomes: mssql+pyodbc://username:password#SERVER/Database.
mssql+pyodbc://SERVER/Database?trusted_connection=yes threw an error when I tried to it. It did point me in the right direction though.
from sqlalchemy import create_engine, sql
import urllib
string = "DRIVER={SQL SERVER};SERVER=server;DATABASE=db;TRUSTED_CONNECTION=YES"
params = urllib.quote_plus(string)
engine = create_engine('mssql+pyodbc:///?odbc_connect={0}'.format(params))
query = sql.text('SELECT TOP 10 * FROM [CrossServer].[datbase].[dbo].[Table]')
with engine.begin() as connection:
data = connection.execute(query).fetchall()
It's quite complicated if you suppose to alter different servers through one connection.
But if you need to perform a query to a different server under different credentials you should add linked server first with sp_addlinkedserver. Then it should be added credentials to the linked server with sp_addlinkedsrvlogin. Have you tried this?
Related
I am using a Python script to connect to a SQL Server database:
import pyodbc
import pandas
server = 'SQL'
database = 'DB_TEST'
username = 'USER'
password = 'My password'
sql='''
SELECT *
FROM [DB_TEST].[dbo].[test]
'''
cnxn = pyodbc.connect('DRIVER=SQL Server;SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
data = pandas.read_sql(sql,cnxn)
cnxn.close()
The script is launched everyday by an automatisation tools so there is no physical user.
The issue is how to replace the password field by a secure method?
The automated script is still ran by a windows user. Add this windows user to the SQL-Server users and give it the appropriate permissions, so you can use:
import pyodbc
import pandas
server = 'SQL'
database = 'DB_TEST'
sql='''
SELECT *
FROM [DB_TEST].[dbo].[test]
'''
cnxn = pyodbc.connect(
f'DRIVER=SQL Server;SERVER={server};DATABASE={database};Trusted_Connection=True;')
data = pandas.read_sql(sql,cnxn)
cnxn.close()
I am also interested in secure coding using Python .I did my own research to figure out available options, I would recommend reviewing this post as it summarize it all. Check on the listed options, and apply the one suits you better.
I'm trying to query an RDS (Postgres) database through Python, more specifically a Jupyter Notebook. Overall, what I've been trying for now is:
import boto3
client = boto3.client('rds-data')
response = client.execute_sql(
awsSecretStoreArn='string',
database='string',
dbClusterOrInstanceArn='string',
schema='string',
sqlStatements='string'
)
The error I've been receiving is:
BadRequestException: An error occurred (BadRequestException) when calling the ExecuteSql operation: ERROR: invalid cluster id: arn:aws:rds:us-east-1:839600708595:db:zprime
In the end, it was much simpler than I thought, nothing fancy or specific. It was basically a solution I had used before when accessing one of my local DBs. Simply import a specific library for your database type (Postgres, MySQL, etc) and then connect to it in order to execute queries through python.
I don't know if it will be the best solution since making queries through python will probably be much slower than doing them directly, but it's what works for now.
import psycopg2
conn = psycopg2.connect(database = 'database_name',
user = 'user',
password = 'password',
host = 'host',
port = 'port')
cur = conn.cursor()
cur.execute('''
SELECT *
FROM table;
''')
cur.fetchall()
I have been having major trouble connecting my python shell to my postgres. I am doing this on windows. I have downloaded psycopg2 and everything for this to process, however it still is not working.
import psycopg2
conn=psycopg2.connect("dbname = 'test' user ='postgres' host ='localhost' password = 'mypassword'")
It gives me an error telling me that the database "test" does not exist, however it does! If you guys have any advice at all on what I should test out, that would be amazing. Thank you!
You can layout connection parameters as a string and pass it to the connect() function as like:
conn = psycopg2.connect("dbname=test user=postgres password=postgres")
Or you can use a list of keyword arguments like
conn = psycopg2.connect(host="localhost",database="test", user="postgres", password="postgres")
If its still fails then you should check on PostgreSQL side. You should try to connect the db in question using command line and see if error re appears or not. if it appears then something is missing on DB server side.
I'm connecting Hive use pyhs2. But the Hive server required Kerberos authentication. Anyone knows how to convert the JDBC string to pyhs2 parameter? Like:
jdbc:hive2://biclient2.server.163.org:10000/default;principal=hive/app-20.photo.163.org#HADOOP.HZ.NETEASE.COM?mapred.job.queue.name=default
I think it will be something like this:
pyhs2.connect(host='biclient2.server.163.org',
port=10000,
authMechanism="KERBEROS",
password="something",
user='your_user#HADOOP.HZ.NETEASE.COM')
I'm also doing the same, I still not succeed, but at least having a meaningful errorcode:
(Server hive/xxx#yyy.COM not found in Kerberos database)
This connection string will work as long as the user running the script has a valid kerberos ticket:
import pyhs2
with pyhs2.connect(host='biclient2.server.163.org',
port=10000,
authMechanism="KERBEROS") as conn:
with conn.cursor() as cur:
print cur.getDatabases()
Username, password and any other configuration parameters are not
passed through the KDC.
I'm trying to connect to a SQL Server 2012 database using SQLAlchemy (with pyodbc) on Python 3.3 (Windows 7-64-bit). I am able to connect using straight pyodbc but have been unsuccessful at connecting using SQLAlchemy. I have dsn file setup for the database access.
I successfully connect using straight pyodbc like this:
con = pyodbc.connect('FILEDSN=c:\\users\\me\\mydbserver.dsn')
For sqlalchemy I have tried:
import sqlalchemy as sa
engine = sa.create_engine('mssql+pyodbc://c/users/me/mydbserver.dsn/mydbname')
The create_engine method doesn't actually set up the connection and succeeds, but
iIf I try something that causes sqlalchemy to actually setup the connection (like engine.table_names()), it takes a while but then returns this error:
DBAPIError: (Error) ('08001', '[08001] [Microsoft][ODBC SQL Server Driver][DBNETLIB]SQL Server does not exist or access denied. (17) (SQLDriverConnect)') None None
I'm not sure where thing are going wrong are how to see what connection string is actually being passed to pyodbc by sqlalchemy. I have successfully using the same sqlalchemy classes with SQLite and MySQL.
The file-based DSN string is being interpreted by SQLAlchemy as server name = c, database name = users.
I prefer connecting without using DSNs, it's one less configuration task to deal with during code migrations.
This syntax works using Windows Authentication:
engine = sa.create_engine('mssql+pyodbc://server/database')
Or with SQL Authentication:
engine = sa.create_engine('mssql+pyodbc://user:password#server/database')
SQLAlchemy has a thorough explanation of the different connection string options here.
In Python 3 you can use function quote_plus from module urllib.parse to create parameters for connection:
import urllib
params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};"
"SERVER=dagger;"
"DATABASE=test;"
"UID=user;"
"PWD=password")
engine = sa.create_engine("mssql+pyodbc:///?odbc_connect={}".format(params))
In order to use Windows Authentication, you want to use Trusted_Connection as parameter:
params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};"
"SERVER=dagger;"
"DATABASE=test;"
"Trusted_Connection=yes")
In Python 2 you should use function quote_plus from library urllib instead:
params = urllib.quote_plus("DRIVER={SQL Server Native Client 11.0};"
"SERVER=dagger;"
"DATABASE=test;"
"UID=user;"
"PWD=password")
I have an update info about the connection to MSSQL Server without using DSNs and using Windows Authentication. In my example I have next options:
My local server name is "(localdb)\ProjectsV12". Local server name I see from database properties (I am using Windows 10 / Visual Studio 2015).
My db name is "MainTest1"
engine = create_engine('mssql+pyodbc://(localdb)\ProjectsV12/MainTest1?driver=SQL+Server+Native+Client+11.0', echo=True)
It is needed to specify driver in connection.
You may find your client version in:
control panel>Systems and Security>Administrative Tools.>ODBC Data
Sources>System DSN tab>Add
Look on SQL Native client version from the list.
Just want to add some latest information here:
If you are connecting using DSN connections:
engine = create_engine("mssql+pyodbc://USERNAME:PASSWORD#SOME_DSN")
If you are connecting using Hostname connections:
engine = create_engine("mssql+pyodbc://USERNAME:PASSWORD#HOST_IP:PORT/DATABASENAME?driver=SQL+Server+Native+Client+11.0")
For more details, please refer to the "Official Document"
import pyodbc
import sqlalchemy as sa
engine = sa.create_engine('mssql+pyodbc://ServerName/DatabaseName?driver=SQL+Server+Native+Client+11.0',echo = True)
This works with Windows Authentication.
I did different and worked like a charm.
First you import the library:
import pandas as pd
from sqlalchemy import create_engine
import pyodbc
Create a function to create the engine
def mssql_engine(user = os.getenv('user'), password = os.getenv('password')
,host = os.getenv('SERVER_ADDRESS'),db = os.getenv('DATABASE')):
engine = create_engine(f'mssql+pyodbc://{user}:{password}#{host}/{db}?driver=SQL+Server')
return engine
Create a variable with your query
query = 'SELECT * FROM [Orders]'
Execute the Pandas command to create a Dataframe from a MSSQL Table
df = pd.read_sql(query, mssql_engine())