I have a function which, when passed server, database, table and access details, connects to a table in SQL Server and reads all of its contents into a pandas DataFrame.
import pypyodbc
import pandas as pd

def GET_DATA(source_server, source_database, source_table, source_username, source_password):
    print('******* GETTING DATA ', source_server, '.', source_database, '.', source_table, '.', source_username, '*******')
    data_collected = []
    #SOURCE
    connection = pypyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                                  'Server=' + source_server + ';'
                                  'Database=' + source_database + ';'
                                  'uid=' + source_username + ';pwd=' + source_password)
    #OPEN THE CONNECTION
    cursor = connection.cursor()
    #BUILD THE COMMAND
    SQLCommand = ("SELECT * FROM " + source_database + ".dbo." + source_table)
    #RUN THE QUERY
    cursor.execute(SQLCommand)
    #GET RESULTS
    results = cursor.fetchone()
    columnList = [column[0] for column in cursor.description]
    #print(type(columnList))
    while results:
        data_collected.append(results)
        results = cursor.fetchone()
    df_column = pd.DataFrame(columnList)
    df_column = df_column.transpose()
    df_result = pd.DataFrame(data_collected)
    frames = [df_column, df_result]
    df = pd.concat(frames)
    print('GET_DATA COMPLETE!')
    return df
Most of the time this works fine; however, for reasons I can't identify, I get this error:
sequence item 0: expected str instance, bytes found
What is causing this, and how do I account for it?
Thanks!
I found a much better way of extracting data from SQL Server into pandas:
import pyodbc
import pandas as pd

def GET_DATA_TO_PANDAS(source_server, source_database, source_table, source_username, source_password):
    print('***** STARTING DATA TO PANDAS ********* ')
    con = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                         'Server=' + source_server + ';'
                         'Database=' + source_database + ';'
                         'uid=' + source_username + ';pwd=' + source_password)
    #BUILD QUERY
    query = "SELECT * FROM " + source_database + ".dbo." + source_table
    df = pd.read_sql(query, con)
    return df
Used this link - https://www.quora.com/How-do-I-get-data-directly-from-databases-DB2-Oracle-MS-SQL-Server-into-Pandas-DataFrames-using-Python
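A hypothetical call, with placeholder server, database, table and credentials:

# All arguments below are placeholders.
df = GET_DATA_TO_PANDAS('myserver', 'mydb', 'mytable', 'user', 'secret')
print(df.head())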
I experienced a similar issue in one of my projects. This exception was raised by the Microsoft ODBC driver, most likely while fetching the results from the DB, i.e. at the line
cursor.fetchone()
As far as I understand it, the cause is the size and content of the data that SQL Server sends back to Python. There may be one specific huge row in the DB that triggers it: when a row contains unicode or other non-ASCII characters, the driver can exceed its buffer length and fail to convert the NVARCHAR data from a bytes object back to a string. It then hands the raw bytes object to Python, which is what raises the exception.
Digging a bit deeper into that specific data row might help you.
I also found another report of a similar issue elsewhere, and the Microsoft ODBC driver has a known issue documented along these lines that might help too.
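If fixing the data itself isn't an option, a defensive workaround is to decode the bytes yourself. This is a minimal sketch assuming the pyodbc driver (pypyodbc does not expose output converters); SQL Server sends NVARCHAR data over the wire as UTF-16-LE:

import pyodbc

# Sketch only: register an output converter so NVARCHAR values that the
# driver returns as raw bytes are decoded to str instead of failing later.
def decode_nvarchar(raw):
    # raw is the driver's raw value (bytes or None)
    return raw.decode('utf-16-le') if isinstance(raw, bytes) else raw

connection.add_output_converter(pyodbc.SQL_WVARCHAR, decode_nvarchar)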
I got the same error using Python 3, as follows: I defined an MS SQL column as nchar, stored an empty string (which in Python 3 is unicode), then retrieved the row with the pypyodbc call cursor.fetchone(). It failed on this line:
if raw_data_parts != []:
    if py_v3:
        if target_type != SQL_C_BINARY:
            raw_value = ''.join(raw_data_parts)
            # FAILS WITH "sequence item 0: expected str instance, bytes found"
            ....
Changing the column datatype to nvarchar in the database fixed it.
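For reference, a minimal sketch of that fix (table and column names are placeholders):

# Hypothetical names: widen the NCHAR column to NVARCHAR so the driver
# returns str data that pypyodbc can join.
cursor.execute("ALTER TABLE my_table ALTER COLUMN my_col NVARCHAR(50)")
connection.commit()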
After reading several answers I still can't get this to work.
Most likely I'm doing it all wrong, but I've tried several different approaches.
What I'm trying to do is extract data from a CSV and insert it into my newly created database/table.
My CSV input looks like this:
NodeName,NeId,Object,Time,Interval,Direction,NeAlias,NeType,Position,AVG,MAX,MIN,percent_0-5,percent_5-10,percent_10-15,percent_15-20,percent_20-25,percent_25-30,percent_30-35,percent_35-40,percent_40-45,percent_45-50,percent_50-55,percent_55-60,percent_60-65,percent_65-70,percent_70-75,percent_75-80,percent_80-85,percent_85-90,percent_90-95,percent_95-100,IdLogNum,FailureDescription
X13146PAZ,5002,1/11/100,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,69684,217287,772,10563,8055,10644,15147,16821,13610,7658,2943,784,152,20,3,0,0,0,0,0,0,0,0,0,-
...
X13146PAZ,5002,1/11/102,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,3056,28315,215,86310,90,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-
...
X13146PAZ,5002,1/11/103,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,769,7195,11,86400,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-
The MySQL table is created, but possibly that might be the issue, as some are varchar columns and some are integer columns.
My server runs Ubuntu, if that is of any use.
My code:
# -*- coding: utf-8 -*-
#Imports
from datetime import date, timedelta
import sys
import MySQLdb as mdb
import csv
import os

#Vars
Yesterday = date.today() - timedelta(1)

#Opening document
RX_Document = open('./reports/X13146PAZ_TN_WAN_ETH_BAND_RX_' + Yesterday.strftime("%Y%m%d") + "_231500.csv", 'r')
RX_Document_Str = './reports/X13146PAZ_TN_WAN_ETH_BAND_RX_' + Yesterday.strftime("%Y%m%d") + "_231500.csv"
csv_data = csv.reader(file(RX_Document_Str))

con = mdb.connect('localhost', 'username', 'password', 'tn_rx_utilization')

counter = 0
for row in csv_data:
    if counter == 0:
        print row
        counter = 1
        continue
    if counter == 1:
        cur = con.cursor()
        cur.execute('INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1,Direction,NeAlias,NeType,Position,AVG,MAX,MIN,percent_5-10,percent_10-15,percent_15-20,percent_20-25,percent_25-30,percent_30-35,percent_35-40,percent_40-45,percent_45-50,percent_50-55,percent_55-60,percent_60-65,percent_65-70,percent_70-75,percent_75-80,percent_80-85,percent_85-90,percent_90-95,percent_95-100,IdLogNum,FailureDescription)' 'VALUES("%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s")',tuple(row[:34]))
        con.commit()

#cur.execute("SELECT VERSION()")
#ver = cur.fetchone()

con.commit()
con.close()
You should not put the placeholder %s in quotes (note also that MySQL quotes identifiers with backticks, not double quotes):
cur.execute('''INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1, Direction,
                NeAlias, NeType, Position, AVG, MAX, MIN, `percent_5-10`, `percent_10-15`,
                `percent_15-20`, `percent_20-25`, `percent_25-30`, `percent_30-35`,
                `percent_35-40`, `percent_40-45`, `percent_45-50`, `percent_50-55`,
                `percent_55-60`, `percent_60-65`, `percent_65-70`, `percent_70-75`,
                `percent_75-80`, `percent_80-85`, `percent_85-90`, `percent_90-95`,
                `percent_95-100`, IdLogNum, FailureDescription)
                VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,
                %s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)''', tuple(row[:33]))
You are missing percent_0-5 from your INSERT.
Remove the quotes from the %s references; the query needs to be in string format, but the underlying data types will be passed through.
There may be issues with data types resulting from the CSV reader, which yields every field as a string. Have Python eval() the CSV data to convert numeric fields to an INT. Here is some more information from another post:
Read data from csv-file and transform to correct data-type
cur.execute('INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1, Direction, NeAlias, '
            'NeType, Position, AVG, MAX, MIN, `percent_0-5`, `percent_5-10`, `percent_10-15`, '
            '`percent_15-20`, `percent_20-25`, `percent_25-30`, `percent_30-35`, `percent_35-40`, '
            '`percent_40-45`, `percent_45-50`, `percent_50-55`, `percent_55-60`, `percent_60-65`, '
            '`percent_65-70`, `percent_70-75`, `percent_75-80`, `percent_80-85`, `percent_85-90`, '
            '`percent_90-95`, `percent_95-100`, IdLogNum, FailureDescription) '
            'VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,'
            '%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)',
            tuple(row[:34]))
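If you would rather avoid eval() on untrusted CSV text, here is a minimal coercion sketch (the helper name is illustrative):

# Coerce numeric CSV fields to int; leave everything else as a string.
def coerce(value):
    try:
        return int(value)
    except ValueError:
        return value

row = [coerce(field) for field in row]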
I have a JSON object in Python. I am using the Python DB-API and SimpleJson. I am trying to insert the JSON into a MySQL table.
At the moment I am getting errors, and I believe it is due to the single quotes '' in the JSON objects.
How can I insert my JSON object into MySQL using Python?
Here is the error message I get:
error: uncaptured python exception, closing channel
<twitstream.twitasync.TwitterStreamPOST connected at
0x7ff68f91d7e8> (<class '_mysql_exceptions.ProgrammingError'>:
(1064, "You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for
the right syntax to use near ''favorited': '0',
'in_reply_to_user_id': '52063869', 'contributors':
'NULL', 'tr' at line 1")
[/usr/lib/python2.5/asyncore.py|read|68]
[/usr/lib/python2.5/asyncore.py|handle_read_event|390]
[/usr/lib/python2.5/asynchat.py|handle_read|137]
[/usr/lib/python2.5/site-packages/twitstream-0.1-py2.5.egg/
twitstream/twitasync.py|found_terminator|55] [twitter.py|callback|26]
[build/bdist.linux-x86_64/egg/MySQLdb/cursors.py|execute|166]
[build/bdist.linux-x86_64/egg/MySQLdb/connections.py|defaulterrorhandler|35])
Another error for reference
error: uncaptured python exception, closing channel
<twitstream.twitasync.TwitterStreamPOST connected at
0x7feb9d52b7e8> (<class '_mysql_exceptions.ProgrammingError'>:
(1064, "You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right
syntax to use near 'RT #tweetmeme The Best BlackBerry Pearl
Cell Phone Covers http://bit.ly/9WtwUO''' at line 1")
[/usr/lib/python2.5/asyncore.py|read|68]
[/usr/lib/python2.5/asyncore.py|handle_read_event|390]
[/usr/lib/python2.5/asynchat.py|handle_read|137]
[/usr/lib/python2.5/site-packages/twitstream-0.1-
py2.5.egg/twitstream/twitasync.py|found_terminator|55]
[twitter.py|callback|28] [build/bdist.linux-
x86_64/egg/MySQLdb/cursors.py|execute|166] [build/bdist.linux-
x86_64/egg/MySQLdb/connections.py|defaulterrorhandler|35])
Here is a link to the code that I am using http://pastebin.com/q5QSfYLa
#!/usr/bin/env python

try:
    import json as simplejson
except ImportError:
    import simplejson

import twitstream
import MySQLdb

USER = ''
PASS = ''
USAGE = """%prog"""

conn = MySQLdb.connect(host="",
                       user="",
                       passwd="",
                       db="")

# Define a function/callable to be called on every status:
def callback(status):
    twitdb = conn.cursor()
    twitdb.execute("INSERT INTO tweets_unprocessed (text, created_at, twitter_id, user_id, user_screen_name, json) VALUES (%s,%s,%s,%s,%s,%s)",
                   (status.get('text'), status.get('created_at'), status.get('id'), status.get('user', {}).get('id'), status.get('user', {}).get('screen_name'), status))
    # print status
    #print "%s:\t%s\n" % (status.get('user', {}).get('screen_name'), status.get('text'))

if __name__ == '__main__':
    # Call a specific API method from the twitstream module:
    # stream = twitstream.spritzer(USER, PASS, callback)
    twitstream.parser.usage = USAGE
    (options, args) = twitstream.parser.parse_args()
    if len(args) < 1:
        args = ['Blackberry']
    stream = twitstream.track(USER, PASS, callback, args, options.debug, engine=options.engine)
    # Loop forever on the streaming call:
    stream.run()
Use json.dumps(json_value) to convert your JSON object (a Python object) into a JSON string that you can insert into a text field in MySQL:
http://docs.python.org/library/json.html
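Applied to the callback from the question, a minimal sketch (the column list is shortened for illustration):

import json

# Serialize the status dict to a JSON string before binding it;
# passing the raw dict is what produces the broken SQL.
twitdb.execute("INSERT INTO tweets_unprocessed (text, json) VALUES (%s, %s)",
               (status.get('text'), json.dumps(status)))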
To expand on the other answers, you basically need to make sure of two things:
1. That you have room for the full amount of data that you want to insert in the field where you are trying to place it. Different database field types can fit different amounts of data. See: MySQL String Datatypes. You probably want the "TEXT" or "BLOB" types.
2. That you are safely passing the data to the database. Some ways of passing data can cause the database to "look" at the data, and it will get confused if the data looks like SQL. It's also a security risk. See: SQL Injection.
The solution for #1 is to check that the database is designed with correct field type.
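For example, a hypothetical sketch for #1 (the table and column come from the question; TEXT holds up to 64 KB, LONGTEXT up to 4 GB):

# Assumption: the JSON currently overflows a shorter column type.
db.execute("ALTER TABLE tweets_unprocessed MODIFY `json` TEXT")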
The solution for #2 is to use parameterized (bound) queries. For instance, instead of:
# Simple, but naive, method.
# Notice that you are passing in 1 large argument to db.execute()
db.execute("INSERT INTO json_col VALUES (" + json_value + ")")
Better, use:
# Correct method. Uses parameter/bind variables.
# Notice that you are passing in 2 arguments to db.execute()
db.execute("INSERT INTO json_col VALUES (%s)", (json_value,))
Hope this helps. If so, let me know. :-)
If you are still having a problem, then we will need to examine your syntax more closely.
The most straightforward way to insert a Python map into a MySQL JSON field:
import json

python_map = { "foo": "bar", "items": [ "baz", "biz" ] }

sql = "INSERT INTO your_table (json_column_name) VALUES (%s)"
cursor.execute( sql, (json.dumps(python_map),) )
You should be able to insert into a text or blob column easily:
db.execute("INSERT INTO json_col VALUES (%s)", (json_value,))
You need to get a look at the actual SQL string. Try something like this:
sqlstr = "INSERT INTO tweets_unprocessed (text, created_at, twitter_id, user_id, user_screen_name, json) VALUES (%s,%s,%s,%s,%s,%s)" % (status.get('text'), status.get('created_at'), status.get('id'), status.get('user', {}).get('id'), status.get('user', {}).get('screen_name'), status)
print "about to execute(%s)" % sqlstr
twitdb.execute(sqlstr)
I imagine you are going to find some stray quotes, brackets or parentheses in there.
from datetime import datetime
from json import dumps

from bottle import route, request, response, abort
from bson import json_util  # assumed import: json_util.default is used below

# db is assumed to be an open MySQL connection created elsewhere

@route('/shoes', method='POST')
def createorder():
    cursor = db.cursor()
    data = request.json
    p_id = request.json['product_id']
    p_desc = request.json['product_desc']
    color = request.json['color']
    price = request.json['price']
    p_name = request.json['product_name']
    q = request.json['quantity']
    createDate = datetime.now().isoformat()
    print (createDate)
    response.content_type = 'application/json'
    print (data)
    if not data:
        abort(400, 'No data received')
    sql = "insert into productshoes (product_id, product_desc, color, price, product_name, quantity, createDate) values ('%s', '%s','%s','%d','%s','%d', '%s')" % (p_id, p_desc, color, price, p_name, q, createDate)
    print (sql)
    try:
        # Execute the DML and commit changes
        cursor.execute(sql)
        db.commit()
        cursor.close()
    except:
        # Roll back changes on failure
        db.rollback()
    return dumps(("OK"), default=json_util.default)
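Note that interpolating request values straight into the SQL string invites SQL injection; here is a hedged sketch of the same INSERT with bind parameters instead:

# Same columns as the handler above; the driver handles quoting.
sql = ("insert into productshoes (product_id, product_desc, color, price, "
       "product_name, quantity, createDate) values (%s, %s, %s, %s, %s, %s, %s)")
cursor.execute(sql, (p_id, p_desc, color, price, p_name, q, createDate))
db.commit()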
One example of how to add a JSON file into MySQL using Python. The JSON must be converted to an SQL INSERT statement, and if there are several JSON objects it is better to issue one single INSERT call than multiple calls, i.e. one INSERT INTO per object.
# import Python's JSON lib
import json

# use JSON loads to create a list of records
test_json = json.loads('''
[
    {
        "COL_ID": "id1",
        "COL_INT_VAULE": 7,
        "COL_BOOL_VALUE": true,
        "COL_FLOAT_VALUE": 3.14159,
        "COL_STRING_VAULE": "stackoverflow answer"
    },
    {
        "COL_ID": "id2",
        "COL_INT_VAULE": 10,
        "COL_BOOL_VALUE": false,
        "COL_FLOAT_VALUE": 2.71828,
        "COL_STRING_VAULE": "http://stackoverflow.com/"
    },
    {
        "COL_ID": "id3",
        "COL_INT_VAULE": 2020,
        "COL_BOOL_VALUE": true,
        "COL_FLOAT_VALUE": 1.41421,
        "COL_STRING_VAULE": "GIRL: Do you drink? PROGRAMMER: No. GIRL: Have Girlfriend? PROGRAMMER: No. GIRL: Then how do you enjoy life? PROGRAMMER: I am Programmer"
    }
]
''')

# create a nested list of the records' values
values = [list(x.values()) for x in test_json]
# print(values)

# get the column names
columns = [list(x.keys()) for x in test_json][0]

# value string for the SQL string
values_str = ""

# enumerate over the records' values
for i, record in enumerate(values):
    # declare empty list for values
    val_list = []
    # append each value to a new list of values
    for v, val in enumerate(record):
        if type(val) == str:
            val = "'{}'".format(val.replace("'", "''"))
        val_list += [ str(val) ]
    # put parentheses around each record string
    values_str += "(" + ', '.join( val_list ) + "),\n"

# remove the last comma and end the SQL with a semicolon
values_str = values_str[:-2] + ";"

# concatenate the SQL string
table_name = "json_data"
sql_string = "INSERT INTO %s (%s)\nVALUES\n%s" % (
    table_name,
    ', '.join(columns),
    values_str
)

print("\nSQL string:\n\n")
print(sql_string)
output:
SQL string:
INSERT INTO json_data (COL_ID, COL_INT_VAULE, COL_BOOL_VALUE, COL_FLOAT_VALUE, COL_STRING_VAULE)
VALUES
('id1', 7, True, 3.14159, 'stackoverflow answer'),
('id2', 10, False, 2.71828, 'http://stackoverflow.com/'),
('id3', 2020, True, 1.41421, 'GIRL: Do you drink? PROGRAMMER: No. GIRL: Have Girlfriend? PROGRAMMER: No. GIRL: Then how do you enjoy life? PROGRAMMER: I am Programmer');
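To actually run the generated statement, a minimal sketch (assumes an open MySQLdb connection and an existing json_data table):

# Hypothetical execution; con is an open MySQLdb connection.
cur = con.cursor()
cur.execute(sql_string)
con.commit()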
The error may be due to an overflow of the size of the field in which you try to insert your JSON. Without any code, it is hard to help you.
Have you considered a NoSQL database system such as CouchDB, which is a document-oriented database relying on the JSON format?
Here's a quick tip if you want to write some inline code, say for a small JSON value, without importing json.
You can escape quotes in SQL by doubling them, i.e. use '' or "" to produce a literal ' or ".
Sample Python code (not tested):
q = 'INSERT INTO `table`(`db_col`) VALUES ("{k:""some data"";}")'
db_connector.execute(q)