I am trying to connect to the snowflake database using Python. I have the .sql file in VS code that contains multiple SQL statements. For example:
select * from table1;
select * from table2:
select * from table3:
So, I tried this code to get the result but it returned an error:
"Multiple SQL statements in a single API call are not supported; use one API call per statement instead."
My Python code is
#!/usr/bin/env python
import snowflake.connector
# Gets the version
ctx = snowflake.connector.connect(
user='<user_name>',
password='<password>',
account='<account_identifier>'
)
cs = ctx.cursor()
try:
with open('<file_directory>') as f:
lines = f.readlines()
cs.execute(lines)
data_frame=cs.fetch_pandas_all()
data_frame.to_csv('filename.csv')
finally:
cs.close()
ctx.close()
What can I try next?
Perhaps do as the error suggest, and limit each API call to a single SQL statement?
dfs = []
for line in lines:
cs.execute(lines)
dfs.append(cs.fetch_pandas_all())
df = pd.concat(dfs).to_csv('filename.csv')
Related
I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL in a pandas DataFrame format by executing long SQL queries thru my python script.
Most of my SQL queries are long with multiple interim-temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get an error saying -
"pandas.io.sql.DatabaseError: Execution failed on sql"
Though they run absolutely fine in MS SQL Management Studio
I suspect this is due to the interim-temp tables, because if I split my long query into two pieces (with everything before the final SELECT in 1st section and final SELECT in the 2nd section) the two section sequentially, run fine
Can someone guide me why is it so or alternatively what is the best way to run long queries with temp tables/views and retrieve results in a pandas DataFrame?
Here is my sample Python code that ideally should take a fine name as an input and run the SQL to retrieve results in a data frame, however it fails in case of a query with temp tables
import pyodbc as db
import pandas as pd
filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+';
PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sqlcommand1 = sql
df_table = pd.read_sql(sqlcommand1, conn)
If I break my sql query in two pieces (one with all temp tables and 2nd with final Select), then it runs fine. Below is a modified function that splits the long Query after finding '/**/' and it works fine
"""
This Function Reads a SQL Script From an Extrenal File and Executes The
Script in SQL. If The SQL Script Has Bunch of Tem Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Esnure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+';
PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL put the line SET NOCOUNT ON at the beginning of your query.
Like #Parfait mentioned above the pandas read_sql method can only support one result set. However, when you generate a temp table in T-sql you do create a result set in the form "(XX row(s) affected)" which is what causes your original query to fail. By setting NOCOUNT you eliminate any early returns and only get the results from your final SELECT statement.
Alternatively, if using pyodbc cursor instead of pandas you can utilize nextset() to skip the result sets from the temp table(s). More info on pyodbc here.
I'm not experienced in Python.
I have the following Python code:
How can i import various values from files outside the file, and use them in a SQL request?
#!/usr/bin/env python
import MySQLdb
import Stamdata
from Stamdata import Varmekurve
K = Varmekurve
print K #this vorks, and the value 1.5 from Varmekurve is printed.
#Open database connection
db = MySQLdb.connect("localhost","root","Codename","MyDvoDb")
#prepare a cursor object using cursor method
cursor = db.cursor()
#Get SetTemp FROM SQL
sql = ("SELECT SetTemp FROM varmekurver WHERE kurvenummer = '1.5' AND TempSensor ='15'")
#Here i would like to import the value from Varmekurve instead of '1.5', and the data from a DS18b20 temp. sensor instead of '15'.
#The DS18B20 sensor are located in '/sys/bus/w1/devices/28-0316007914ff/w1_slave'
cursor.execute(sql)
results = cursor.fetchall()
for row in results:
print row[0]
db.close()
Only the Stamdata file are in the same library.
The Script shall control a motorvalve by calling the SetTemp and open/close a mix-valve if the temp. is to high or low (within 2-3 degrees)
But i haven't come that far yet :0)
to dynamically change the value in the string from a variable do:
SELECT SetTemp FROM varmekurver WHERE kurvenummer = '{}' AND TempSensor ='{}'.format(val1, val2)
If you want to import these values from an external source, like a flat file, you can do it in a number of ways. For example using Pandas.
I was able to connect to SQL server Analysis service in Python using Microsoft.AnalysisServices.dll, and now I can't execute query on cube.
I've tried Execute method same as following:
amoServer.Execute('select from finance')
After issuing Execute method I have this error:
<Microsoft.AnalysisServices.XmlaError object at 0x000000000000002B [Microsoft.AnalysisServices.XmlaError]>
Note: I'm using IronPython with Python 2.7 on Windows Server 64Bit.
What's the problem?
its better use Microsoft.AnalysisServices.AdomdClient.dll and mdx query.
and set query result in Datasets in Ststem.Data assembly
something like this:
clr.AddReference ("Microsoft.AnalysisServices.AdomdClient.dll")
clr.AddReference ("System.Data")
from Microsoft.AnalysisServices.AdomdClient import AdomdConnection , AdomdDataAdapter
from System.Data import DataSet
conn = AdomdConnection("Data Source=0.0.0.0;Catalog=MyCatalog;")
conn.Open()
cmd = conn.CreateCommand()
cmd.CommandText = "your mdx query" # in your case 'select from finance'
adp = AdomdDataAdapter(cmd)
datasetParam = DataSet()
adp.Fill(datasetParam)
conn.Close();
# datasetParam hold your result as collection a\of tables
# each tables has rows
# and each row has columns
print datasetParam.Tables[0].Rows[0][0]
I am running hive 0.12, and I'd like to run several queries and get the result back as a python array.
for example:
result=[]
for col in columns:
sql='select {c} as cat,count(*) as cnt from {t} group by {c} having cnt > 100;'.format(t=table,c=col)
result.append(hive.query(sql))
result=dict(result)
What I'm missing, is the hive class to run SQL queries.
How can this be done ?
One quick and dirty way to do this, is to automate hive from the command line
hive -e "sql command"
Something like this should work
def query(self,cmd):
"""Run a hive expression"""
cmd='hive -e "'+cmd+'"';
prc = subprocess.Popen(cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE, shell=True)
ret=stdout.split('\n')
ret=[r for r in ret if len(r)]
if (len(ret)==0):
return []
if (ret[0].find('\t')>0):
return [[t.strip() for t in r.split('\t')] for r in ret]
return ret
You could also access Hive using Thrift. https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-Python. It looks like pyhs2 is mostly a wrapper around using Thrift directly.
One alternative is to use the pyhs2 library to open a connection to Hive natively from within a Python process. The following is some sample code I had cobbled together to test a different use case, but it should hopefully illustrate use of this library.
# Python 2.7
import pyhs2
from pyhs2.error import Pyhs2Exception
hql = "SELECT * FROM my_table"
with pyhs2.connect(
host='localhost', port=10000, authMechanism="PLAIN", user="root" database="default"
# Use your own credentials and connection info here of course
) as db:
with db.cursor() as cursor:
try:
print "Trying default database"
cursor.execute(hql)
for row in cursor.fetch(): print row
except Pyhs2Exception as error:
print(str(error))
Depending on what is or is not already installed on your box, you may need to also install the development headers for both libpython and libsasl2.
My simple test code is listed below. I created the table already and can query it using the SQLite Manager add-in on Firefox so I know the table and data exist. When I run the query in python (and using the python shell) I get the no such table error
def TroyTest(self, acctno):
conn = sqlite3.connect('TroyData.db')
curs = conn.cursor()
v1 = curs.execute('''
SELECT acctvalue
FROM balancedata
WHERE acctno = ? ''', acctno)
print v1
conn.close()
When you pass SQLite a non-existing path, it'll happily open a new database for you, instead of telling you that the file did not exist before. When you do that, it'll be empty and you'll instead get a "No such table" error.
You are using a relative path to the database, meaning it'll try to open the database in the current directory, and that is probably not where you think it is..
The remedy is to use an absolute path instead:
conn = sqlite3.connect('/full/path/to/TroyData.db')
You need to loop over the cursor to see results:
curs.execute('''
SELECT acctvalue
FROM balancedata
WHERE acctno = ? ''', acctno)
for row in curs:
print row[0]
or call fetchone():
print curs.fetchone() # prints whole row tuple
The problem is the SQL statment. you must specify the db name and after the table name...
'''SELECT * FROM db_name.table_name WHERE acctno = ? '''