I was able to connect to SQL Server Analysis Services in Python using Microsoft.AnalysisServices.dll, but now I can't execute a query on the cube.
I've tried the Execute method, like this:
amoServer.Execute('select from finance')
After calling the Execute method I get this error:
<Microsoft.AnalysisServices.XmlaError object at 0x000000000000002B [Microsoft.AnalysisServices.XmlaError]>
Note: I'm using IronPython with Python 2.7 on Windows Server 64Bit.
What's the problem?
It's better to use Microsoft.AnalysisServices.AdomdClient.dll with an MDX query,
and to load the query result into a DataSet from the System.Data assembly.
Something like this:
import clr
clr.AddReference("Microsoft.AnalysisServices.AdomdClient.dll")
clr.AddReference("System.Data")
from Microsoft.AnalysisServices.AdomdClient import AdomdConnection , AdomdDataAdapter
from System.Data import DataSet
conn = AdomdConnection("Data Source=0.0.0.0;Catalog=MyCatalog;")
conn.Open()
cmd = conn.CreateCommand()
cmd.CommandText = "your mdx query" # in your case 'select from finance'
adp = AdomdDataAdapter(cmd)
datasetParam = DataSet()
adp.Fill(datasetParam)
conn.Close()
# datasetParam holds your result as a collection of tables;
# each table has rows, and each row has columns
print datasetParam.Tables[0].Rows[0][0]
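If you need more than the first cell, a minimal sketch for walking the whole result (same IronPython/Python 2 print syntax as above):
for table in datasetParam.Tables:
    for row in table.Rows:
        for col in range(table.Columns.Count):
            print row[col]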
Related
I am trying to connect to the Snowflake database using Python. I have a .sql file in VS Code that contains multiple SQL statements, for example:
select * from table1;
select * from table2;
select * from table3;
So, I tried this code to get the result, but it returned an error:
"Multiple SQL statements in a single API call are not supported; use one API call per statement instead."
My Python code is
#!/usr/bin/env python
import snowflake.connector
# Gets the version
ctx = snowflake.connector.connect(
    user='<user_name>',
    password='<password>',
    account='<account_identifier>'
)
cs = ctx.cursor()
try:
    with open('<file_directory>') as f:
        lines = f.readlines()
    cs.execute(lines)
    data_frame = cs.fetch_pandas_all()
    data_frame.to_csv('filename.csv')
finally:
    cs.close()
    ctx.close()
What can I try next?
Perhaps do as the error suggests, and limit each API call to a single SQL statement?
import pandas as pd

dfs = []
for line in lines:
    cs.execute(line)  # one statement per API call
    dfs.append(cs.fetch_pandas_all())
pd.concat(dfs).to_csv('filename.csv')
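Note that readlines() gives you physical lines, not SQL statements, so a statement spanning several lines would still fail. A sketch that splits the whole file on ';' instead (assuming no semicolons occur inside string literals in your script):
with open('<file_directory>') as f:
    statements = [s.strip() for s in f.read().split(';') if s.strip()]
dfs = []
for stmt in statements:
    cs.execute(stmt)
    dfs.append(cs.fetch_pandas_all())
pd.concat(dfs).to_csv('filename.csv')
The Snowflake connector also offers ctx.execute_string() for running multi-statement text, if you would rather not split it yourself.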
Trying to run a script that contains a SQL query:
import example_script
example_script.df.describe()
example_script.df.info()
q1 = '''
example_script.df['specific_column'])
'''
job_config = bigquery.QueryJobConfig()
query_job = client.query(q1, job_config= job_config)
q = query_job.to_dataframe()
The issue I'm having: when I import it, how do I get that specific column name used as text? It would then run the query from GBQ, but instead it's stuck in pandas formatting that Google doesn't want to read. Are there other options?
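If the goal is just to inject a column name from the imported module into the SQL text, one possible sketch (the project, dataset, and table names here are placeholders, not from the question):
from google.cloud import bigquery

client = bigquery.Client()
column = 'specific_column'  # hypothetical: the column name you want as text
q1 = 'SELECT {col} FROM `my_project.my_dataset.my_table`'.format(col=column)
q = client.query(q1).to_dataframe()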
I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL in a pandas DataFrame format by executing long SQL queries thru my python script.
Most of my SQL queries are long with multiple interim-temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get an error saying -
"pandas.io.sql.DatabaseError: Execution failed on sql"
Though they run absolutely fine in MS SQL Management Studio.
I suspect this is due to the interim temp tables, because if I split my long query into two pieces (everything before the final SELECT in the first section, the final SELECT in the second) and run the two sections sequentially, they run fine.
Can someone explain why this is, or alternatively what is the best way to run long queries with temp tables/views and retrieve the results in a pandas DataFrame?
Here is my sample Python code that ideally should take a file name as input and run the SQL to retrieve results in a DataFrame; however, it fails for a query with temp tables:
import pyodbc as db
import pandas as pd
filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER=' + driver + ';SERVER=' + server + ';PORT=1433;' +
                  'DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sqlcommand1 = sqlfile
df_table = pd.read_sql(sqlcommand1, conn)
If I break my SQL query into two pieces (one with all the temp tables and a second with the final SELECT), it runs fine. Below is a modified version that splits the long query at '/**/' and works fine:
"""
This Function Reads a SQL Script From an Extrenal File and Executes The
Script in SQL. If The SQL Script Has Bunch of Tem Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Esnure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER=' + driver + ';SERVER=' + server + ';PORT=1433;' +
                  'DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL, put the line SET NOCOUNT ON at the beginning of your query.
Like @Parfait mentioned above, the pandas read_sql method can only support one result set. However, when you generate a temp table in T-SQL you do create a result set of the form "(XX row(s) affected)", which is what causes your original query to fail. By setting NOCOUNT ON you eliminate those early returns and only get the results from your final SELECT statement.
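With the first script above, that could look like:
sqlcommand1 = 'SET NOCOUNT ON;\n' + sqlfile
df_table = pd.read_sql(sqlcommand1, conn)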
Alternatively, if using a pyodbc cursor instead of pandas, you can use nextset() to skip the result sets from the temp table(s), as sketched below. More info on pyodbc here.
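A rough sketch of that approach (variable names follow the earlier script; it assumes the file ends with a row-returning SELECT):
cursor = conn.cursor()
cursor.execute(sqlfile)            # run the whole script, temp tables and all
while cursor.description is None and cursor.nextset():
    pass                           # skip row-count-only results from the temp tables
columns = [c[0] for c in cursor.description]
rows = cursor.fetchall()
df_table = pd.DataFrame.from_records([tuple(r) for r in rows], columns=columns)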
I have about 40 MS Access databases and have some trouble when I need to create or transfer an MS Access query (the saved query object) from one db to the other dbs.
So I tried to solve this problem with pyodbc, but as far as I can tell pyodbc doesn't support creating a new, permanent MS Access query (object).
I can connect to a db and create or delete tables/rows, but I can't create and save a new query.
import pyodbc
odbc_driver = r"{Microsoft Access Driver (*.mdb, *.accdb)}"
db_test1 = r'''..\Test #1.accdb'''
db_test2 = r'''..\Test #2.accdb'''
db_test3 = r'''..\Test #3.accdb'''
db_test4 = r'''..\Test #4.accdb'''
db_test_objects = [db_test1, db_test2, db_test3, db_test4]
for db_file in db_test_objects:
    odbc_conn_str = "Driver=%s;DBQ=%s;" % (odbc_driver, db_file)
    print(odbc_conn_str)
    conn = pyodbc.connect(odbc_conn_str)
    odbc_cursor = conn.cursor()
    NewQuery = "CREATE TABLE TestTable(symbol varchar(15), leverage double)"
    odbc_cursor.execute(NewQuery)
    conn.commit()
    conn.close()
So, how can I create and save MS Access query objects from Python?
I tried searching Google, but the answers were all about running SQL code.
In VBA this code looks like:
Public Sub CreateQueryDefX()
    Dim base(1 To 4) As String
    base(1) = "..\Test #1.accdb"
    base(2) = "..\Test #2.accdb"
    base(3) = "..\Test #3.accdb"
    base(4) = "..\Test #4.accdb"
    For i = LBound(base) To UBound(base)
        CurrentBase = base(i)
        Set dbo = OpenDatabase(CurrentBase)
        With dbo
            Set QueryNew = .CreateQueryDef("TestQuery", _
                "SELECT * FROM TestTable")
            RefreshDatabaseWindow
            .Close
        End With
    Next i
    RefreshDatabaseWindow
End Sub
Sorry for my English, it's not my native language :)
By the way, I know how to solve this with VBA, but I'm interested in solving it with Python.
Thank you.
You can use a CREATE VIEW statement to create a saved Select Query in Access. The pyodbc equivalent to your VBA example would be:
crsr = conn.cursor()
sql = """\
CREATE VIEW TestQuery AS
SELECT * FROM TestTable
"""
crsr.execute(sql)
To delete that saved query you could simply execute a DROP VIEW statement.
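For example:
crsr.execute("DROP VIEW TestQuery")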
For more information on DDL in Access see
Data Definition Language
Consider the Python equivalent of the VBA running exactly what VBA uses: a COM interface to the Access Object library. With Python's win32com third-party module, you can call the CreateQueryDef method. Do note: this COM interfacing can be applied in other languages such as PHP and R!
Below uses a try/except/finally block to ensure the Access application process closes regardless of error or success of code (similar to VBA's On Error handling):
import win32com.client
# OPEN ACCESS APP AND DATABASE
dbases = [r"..\Test #1.accdb", r"..\Test #2.accdb", r"..\Test #3.accdb", r"..\Test #4.accdb"]
try:
    oApp = win32com.client.Dispatch("Access.Application")
    # CREATE QUERYDEF
    for db in dbases:
        oApp.OpenCurrentDatabase(db)
        currentdb = oApp.CurrentDb()
        currentdb.CreateQueryDef("TestQuery", "SELECT * FROM TestTable")
        currentdb = None
        oApp.DoCmd.CloseDatabase()
except Exception as e:
    print(e)
finally:
    currentdb = None
    oApp.Quit()
    oApp = None
Also, if you need to run DML statements via pyodbc and not a COM interface, consider distributed queries as Access can query other databases directly in SQL. Below should work in Python (be sure to escape the backslash):
SELECT t.* FROM [C:\Path\To\Other\Database.accdb].TestTable t
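A sketch of running that through the pyodbc connection from the earlier snippet (a raw string sidesteps the backslash escaping; the path is the placeholder from above):
sql = r"SELECT t.* FROM [C:\Path\To\Other\Database.accdb].TestTable t"
for row in odbc_cursor.execute(sql):
    print(row)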
I am running Hive 0.12, and I'd like to run several queries and get the results back as a Python array.
for example:
result = []
for col in columns:
    sql = 'select {c} as cat, count(*) as cnt from {t} group by {c} having cnt > 100;'.format(t=table, c=col)
    result.append(hive.query(sql))
result = dict(result)
What I'm missing is the hive class to run SQL queries. How can this be done?
One quick and dirty way to do this is to automate Hive from the command line:
hive -e "sql command"
Something like this should work:
import subprocess

def query(self, cmd):
    """Run a Hive expression and return the rows"""
    cmd = 'hive -e "' + cmd + '"'
    prc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, shell=True)
    stdout, stderr = prc.communicate()  # wait for hive and capture its output
    ret = stdout.split('\n')
    ret = [r for r in ret if len(r)]
    if len(ret) == 0:
        return []
    if ret[0].find('\t') > 0:
        # tab-separated output: split each row into columns
        return [[t.strip() for t in r.split('\t')] for r in ret]
    return ret
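Assuming the method above is attached to a small wrapper class (hypothetical, just to match the question's hive.query(sql) calls):
class Hive(object):
    pass

Hive.query = query  # attach the function above as a method

hive = Hive()
for row in hive.query('show databases'):
    print row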
You could also access Hive using Thrift. https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-Python. It looks like pyhs2 is mostly a wrapper around using Thrift directly.
One alternative is to use the pyhs2 library to open a connection to Hive natively from within a Python process. The following is some sample code I had cobbled together to test a different use case, but it should hopefully illustrate use of this library.
# Python 2.7
import pyhs2
from pyhs2.error import Pyhs2Exception
hql = "SELECT * FROM my_table"
with pyhs2.connect(
    host='localhost', port=10000, authMechanism="PLAIN",
    user="root", database="default"
    # Use your own credentials and connection info here, of course
) as db:
    with db.cursor() as cursor:
        try:
            print "Trying default database"
            cursor.execute(hql)
            for row in cursor.fetch(): print row
        except Pyhs2Exception as error:
            print(str(error))
Depending on what is or is not already installed on your box, you may need to also install the development headers for both libpython and libsasl2.
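On a Debian/Ubuntu box, for instance, that would be something like (package names may differ on your distribution):
sudo apt-get install python-dev libsasl2-dev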