I am attempting to query a SQL Server 2012 database using the following code:
import pyodbc

class sqlserverConnector:
    def __init__(self, connectionString):
        """
        this is a typical connection string using windows authentication and the DSN manager:
        'DSN=python;Trusted_Connection=yes'
        """
        self._conn = pyodbc.connect(connectionString)
        self._curs = self._conn.cursor()
        self.fetchall = self._curs.fetchall
        self.description = self._curs.description
        self.columns = dict()

    def __del__(self):
        self._conn.close()

    def __iter__(self):
        return self._curs.__iter__()

    # executes SQL statements
    def execute(self, statement, **params):
        if params is None:
            self._curs.execute(statement)
        else:
            self._curs.execute(statement, params)
        # creates a dictionary of column names and positions
        if self._curs.description != None:
            self.columns = dict((field[0], pos) for pos, field in enumerate(self._curs.description))
        else:
            None
And:
from sqlutil import *

sqlcnxn = sqlserverConnector('DSN=python;Trusted_Connection=yes')
rows = sqlcnxn.execute("select * from demographics", params=None)
for row in rows:
    print row
    break
The goal is to print out a single row (the table has 80k+ rows). However, I always get this error message:
pyodbc.ProgrammingError: ('The SQL contains 0 parameter markers, but 1 parameters were supplied', 'HY000')
I have googled around, and it seems like this error pops up for different people for different reasons, but none of the solutions I have found fit my case. I think what is happening is that the execute method is going to the else branch instead of the if branch.
When you use the **params notation, params is always a dictionary.
Calling that function with params=None means you now have a dictionary with one key:
>>> def func(**params):
... print params
...
>>> func()
{}
>>> func(params=None)
{'params': None}
>>> func(foo='bar')
{'foo': 'bar'}
The syntax is meant to accept arbitrary keyword parameters, as illustrated by the foo keyword argument above.
Either remove the ** or test for an empty dictionary, and don't set params=None when calling .execute():
def execute(self, statement, **params):
    if not params:
        self._curs.execute(statement)
    else:
        self._curs.execute(statement, params)
    # ...
and:
rows = sqlcnxn.execute("select * from demographics")
Note that your execute() function has no return statement, which means that rows will be set to None (the default return value for functions). Add return self if you meant to return the connection object so that it can be iterated over.
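For example, with return self added and the params=None keyword dropped, the loop from the question works as intended (a sketch only, assuming the rest of the class stays as posted):

sqlcnxn = sqlserverConnector('DSN=python;Trusted_Connection=yes')
rows = sqlcnxn.execute("select * from demographics")  # returns self, which is iterable
for row in rows:
    print(row)  # first of the 80k+ rows
    break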
I am trying to use the find_one method to fetch results from MongoDB.
My document structure is as below:
{"_id":{"$oid":"600e6f592944ccc5790f1a9e"},
"user_id":"user_1",
"device_access":[
{"device_id":"DT002","access_type":"r"},
{"device_id":"DT007","access_type":"rm"},
{"device_id":"DT009","access_type":"rt"},
]
}
I have created my filter query as below:
filter={'user_id': 'user_1','device_access.device_id': 'DT002'},
{'device_access': {'$elemMatch': {'device_id': 'DT002'}}}
But PyMongo returns None when it is used in a function as below:
# Model.py
# this function in turn calls the pymongo find_one function
def test(self):
    doc = self.__find(filter)
    print(doc)

def __find(self, key):
    device_document = self._db.get_single_data(COLLECTION_NAME, key)
    return device_document

# Database.py
def get_single_data(self, collection, key):
    db_collection = self._db[collection]
    document = db_collection.find_one(key)
    return document
Could you let me know what might be wrong here?
Your brackets are incorrect. Try:
filter = {'user_id': 'user_1','device_access.device_id': 'DT002', 'device_access': {'$elemMatch': {'device_id': 'DT002'}}}
Also, filter is a built-in function in Python, so you're better off using a different variable name.
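For instance, the corrected filter could be used like this (a sketch; the db_collection handle is borrowed from the Database.py snippet in the question):

# Renamed so it no longer shadows the built-in filter()
query = {
    'user_id': 'user_1',
    'device_access.device_id': 'DT002',
    'device_access': {'$elemMatch': {'device_id': 'DT002'}},
}
doc = db_collection.find_one(query)
print(doc)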
Finally figured out why the above query was returning None.
The **filter** variable, as you can see, is comma-separated. Python treats this internally as a *tuple of dictionary values*, and when the filter variable is passed on to another function it travels as that single tuple rather than as two separate arguments, though it never throws an exception in this case.
To overcome this, I added *args to all of the functions the value is passed through.
# Model.py
# this function in turn calls the pymongo find_one function
def test(self):
    doc = self.__find(filter)
    print(doc)

def __find(self, key, *args):
    device_document = self._db.get_single_data(COLLECTION_NAME, key, *args)
    return device_document

# Database.py
def get_single_data(self, collection, key, *args):
    db_collection = self._db[collection]
    document = db_collection.find_one(key, *args)
    return document
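The reason unpacking helps is that find_one() accepts a filter and an optional projection as its first two arguments, so spreading the tuple's two dictionaries maps them onto exactly those parameters. A minimal standalone sketch of that idea (connection details and names are placeholders, not from the code above):

from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')  # placeholder connection
collection = client['test_db']['devices']          # placeholder database/collection names

# The trailing comma in the original made this a (filter, projection) tuple:
query = ({'user_id': 'user_1', 'device_access.device_id': 'DT002'},
         {'device_access': {'$elemMatch': {'device_id': 'DT002'}}})

doc = collection.find_one(*query)  # same as find_one(filter, projection)
print(doc)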
class Translator(object):
    def __init__(self, tracking_col='tracking_id', coding_col='coding', qualifying_code_col='qualifying_code',
                 translation_col='translation'):
        self._results = []
        self.tracking_col = tracking_col
        self.data_col = coding_col
        self.definition_col = qualifying_code_col
        self.translation_col = translation_col
        self.__validate_parameters(self.__dict__)

    def __validate_parameters(self, variable_values):
        class_values = {}
        for key, value in variable_values.items():
            if type(value) is str:
                class_values.setdefault(value, set()).add(key)
        for key, values in class_values.items():
            # If there is more than one value, there is a duplicate
            if len(values) > 1:
                raise Exception('Duplicate column names exist in parameters. \'{}\' are set to \'{}\'. '
                                'Do not use duplicate column names.'.format(values, key))
This class cannot have duplicate values for any of the 'col' variables. If duplicate values exist, logic further down in the class may not crash but will produce unpredictable results.
Upon instantiation, my __validate_parameters method detects duplicate values and raises an Exception. The problem is that I am dumping all the values out to a dictionary, iterating to create another dictionary, and finally manually raising an exception (which, from what I've been told, is the wrong thing to do in any situation). It's also rather verbose.
Is there a shorter and more concise way to validate for duplicates while propagating an error up, without the complexity above?
There is nothing wrong with manually raising an exception. Collecting your cols in some collection will make validation easier:
class Translator(object):
    def __init__(self, tracking_col=..., coding_col=..., qualifying_code_col=...,
                 translation_col=...):
        self._results = []
        self.cols = [tracking_col, coding_col, qualifying_code_col, translation_col]
        self.validate_cols()

    def validate_cols(self):
        if len(self.cols) > len(set(self.cols)):
            raise ...

    @property
    def tracking_col(self):
        return self.cols[0]

    # ...
You could make the constructor take a dictionary instead of individual variables, e.g.
class Translator(object):
    def __init__(self, cols=None):
        defaults = {"tracking_col": "tracking_id",
                    "coding_col": "coding",
                    "qualifying_code_col": "qualifying_code",
                    "translation_col": "translation"}
        cols = dict(cols) if cols else {}
        for d in defaults:
            if d not in cols:
                cols[d] = defaults[d]
        self.__validate_parameters(cols)

    def __validate_parameters(self, d):
        from collections import Counter
        c = Counter(d.values())
        if any(cnt > 1 for cnt in c.values()):
            raise Exception("Duplicate values found: '%s'" % str(c))
(Code not tested)
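As a quick illustrative check of that approach (hypothetical usage, assuming the dictionary-based constructor above):

Translator()  # defaults are all distinct, so no exception

try:
    Translator({"coding_col": "tracking_id"})  # collides with tracking_col's default value
except Exception as exc:
    print(exc)  # Duplicate values found: ...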
Using SQLAlchemy, I have defined my own TypeDecorator for storing pandas DataFrames in a database, encoded as a JSON string.
class db_JsonEncodedDataFrameWithTimezone(db.TypeDecorator):
    impl = db.Text

    def process_bind_param(self, value, dialect):
        if value is not None and isinstance(value, pd.DataFrame):
            timezone = value.index.tz.zone
            df_json = value.to_json(orient="index")
            data = {'timezone': timezone, 'df': df_json, 'index_name': value.index.name}
            value = json.dumps(data)
        return value

    def process_result_value(self, value, dialect):
        if value is not None:
            data = json.loads(value)
            df = pd.read_json(data['df'], orient="index")
            df.index = df.index.tz_localize('UTC')
            df.index = df.index.tz_convert(data['timezone'])
            df.index.name = data['index_name']
            value = df
        return value
This works fine for a first-time database save, and loading is fine too.
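For context, a column using this type might be declared on a model like this (the model and column names here are illustrative, not from my actual code):

# Hypothetical model; 'prices' holds a timezone-aware DataFrame via the custom type
class Backtest(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    prices = db.Column(db_JsonEncodedDataFrameWithTimezone)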
The problem comes when I augment the value, i.e. change the DataFrame and try to alter the database. When I invoke
db.session.add(entity)
db.session.commit()
I get a traceback which points to comparing values being the problem:
x == y
ValueError: Can only compare identically-labeled DataFrame Objects.
So I suspect my problem has something to do with coercing comparators. I have tried three things, all of which have failed, and I really don't know what to do next:
# 1st failed solution attempt: inserting
coerce_to_is_types = (pd.DataFrame,)

# 2nd failed solution attempt: inserting
def coerce_compared_value(self, op, value):
    return self.impl.coerce_compared_value(op, value)

# 3rd failed solution attempt
class comparator_factory(db.Text.comparator_factory):
    def __eq__(self, other):
        try:
            value = (self == other).all().all()
        except ValueError:
            value = False
        return value
On my fourth attempt, I think I found the answer: I directly created my own compare function and inserted it in the type class above. This avoids the 'x == y' operator being applied to my DataFrames:
def compare_values(self, x, y):
    from pandas.util.testing import assert_frame_equal
    try:
        assert_frame_equal(x, y, check_names=True, check_like=True)
        return True
    except (AssertionError, ValueError, TypeError):
        return False
Another problem of this nature later appeared in my code. The solution was to attempt the natural comparison first and, if that failed, fall back to a comparison like the one above:

try:
    value = x == y
except ValueError:
    # fall back to some other comparison method, such as the assert_frame_equal approach above
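A minimal sketch of how the combined compare_values could look inside the type class (using pandas.testing here, and assuming the same check_names/check_like options as above):

def compare_values(self, x, y):
    # SQLAlchemy calls this during flush to decide whether the stored value changed.
    try:
        return bool(x == y)  # fine for scalars and None
    except ValueError:
        # DataFrame comparisons either raise or are ambiguous here,
        # so fall back to a structural comparison.
        from pandas.testing import assert_frame_equal
        try:
            assert_frame_equal(x, y, check_names=True, check_like=True)
            return True
        except (AssertionError, ValueError, TypeError):
            return False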
I'm new to Python, but I know that self is automatically passed. I'm unable to understand why I am getting this error, and I get the same error with the getGraph function as well: 2 required, 1 given.
What is going wrong here?
createDoc is in CeleryTasks.py and insert_manager is in MongoTriggers.py:
@app.task
def createDoc(self):
    print("CeleryTasks:CreateDoc")
    if 'refs' not in self.data:
        return
    print(self.data['refs'])
    for id in self.data['refs']:
        doc = self.db[self.collName].find_one({'_id': id})
        if doc is None:
            insertedID = self.db[self.collName].insert_one({
                "_id": id
            })
            print(insertedID)

# Trigger on Mongo operations
def insert_manager(op_document):
    print("Data Inserted")
    # pprint.pprint(op_document)
    data = op_document['o']
    ns = op_document['ns'].split('.')
    # pprint.pprint(data)
    docID = op_document['o']['_id']
    tasks = CeleryTasks(port, docID, dbName, collectionName, data)
    tasks.createDoc()
    tasks.getGraph.delay(docID)
self is always passed when it is a method of a class.
Celery tasks are independent functions. You can give them a self argument by adding bind=True to the task decorator, but that self is used for a different purpose: bound tasks.
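For illustration, a bound task looks like this; here self is the Task instance, not an instance of your own class (the broker URL and names are placeholders):

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')  # placeholder broker URL

@app.task(bind=True)
def create_doc(self, data):
    # `self` is the bound Task instance, useful for things like self.retry() or self.request.id
    print(self.request.id)
    return data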
I'm using Python 3.5.2 for my project. I installed MySQLdb via pip for the MySQL connection.
Code:
import MySQLdb

class DB:
    def __init__(self):
        self.con = MySQLdb.connect(host="127.0.0.1", user="*", passwd="*", db="*")
        self.cur = self.con.cursor()

    def query(self, q):
        r = self.cur.execute(q)
        return r

def test():
    db = DB()
    result = db.query('SELECT id, name FROM test')
    return print(result)

def test1():
    db = DB()
    result = db.query('SELECT id, name FROM test').fetchall()
    for res in result:
        id, name = res
        print(id, name)

test()
test1()

# test output >>> '3'
# test1 output >>> AttributeError: 'int' object has no attribute 'fetchall'
Test table:
id | name
1 | 'test'
2 | 'test2'
3 | 'test3'
Please read this link: http://mysql-python.sourceforge.net/MySQLdb.html
At this point your query has been executed and you need to get the results. You have two options:

r = db.store_result()

...or...

r = db.use_result()

Both methods return a result object. What's the difference? store_result() returns the entire result set to the client immediately. If your result set is really large, this could be a problem. One way around this is to add a LIMIT clause to your query, to limit the number of rows returned. The other is to use use_result(), which keeps the result set in the server and sends it row-by-row when you fetch. This does, however, tie up server resources, and it ties up the connection: you cannot do any more queries until you have fetched all the rows. Generally I recommend using store_result() unless your result set is really huge and you can't use LIMIT for some reason.
def test1():
    db = DB()
    db.query('SELECT id, name FROM test')
    result = db.cur.fetchall()
    for res in result:
        id, name = res
        print(id, name)
cursor.execute() returns the number of rows modified or retrieved, just like in PHP, so you cannot chain fetchall() onto its return value. Have you tried returning the fetchall() result from query(), like so?

def query(self, q):
    self.cur.execute(q)
    return self.cur.fetchall()
See here for more documentation: https://ianhowson.com/blog/a-quick-guide-to-using-mysql-in-python/