sqlite database file unreadable symbols - python

We have an assignment where we have to write and query from a database file directly without the use of any sqlite api functions. We were given the site https://www.sqlite.org/fileformat.html as guidance but I can't seem to get the database file to look like anything readable.
Here's a basic example of what I'm trying to do with the sqlite3 library from python
import sqlite3
con = sqlite3.connect("test.db")
cur = con.cursor()
cur.execute("PRAGMA page_size = 4096")
cur.execute("PRAGMA encoding = 'UTF-8'")
dropTable = "DROP TABLE IF EXISTS Employee"
createTable = "CREATE TABLE Employee(first int, second int, third int)"
cur.execute(dropTable)
cur.execute(createTable)
cur.execute("INSERT INTO Employee VALUES (1, 2, 3)")
con.commit()
con.close()
When I open the database file, it starts with "SQLite format 3", followed by a bunch of weird symbols. The same thing happened when I made the database from the actual csv file we were given. There are some readable parts, but most of it is unreadable symbols that are in no way similar to the format specified by the website. I'm a little overwhelmed right now, so I would appreciate anyone pointing me in the right direction on how I would begin fixing this mess.

Here is an example of reading that binary file using FileIO and struct:
from io import FileIO
from struct import *
def read_header(db):
    # read the entire 100-byte database header into one bytearray
    header = bytearray(100)
    db.readinto(header)
    # print out a few of the values in the header
    # any number that is more than one byte requires unpacking
    # strings require decoding
    print('header string: ' + header[0:15].decode('utf-8'))  # note that this ignores the null byte at header[15]
    # page size is a 2-byte big-endian unsigned integer ('>H'); a signed '>h'
    # would misread a page size of 32768
    page_size = unpack('>H', header[16:18])[0]
    print('page_size = ' + str(page_size))
    print('write version: ' + str(header[18]))
    print('read version: ' + str(header[19]))
    print('reserved space: ' + str(header[20]))
    print('Maximum embedded payload fraction: ' + str(header[21]))
    print('Minimum embedded payload fraction: ' + str(header[22]))
    print('Leaf payload fraction: ' + str(header[23]))
    file_change_counter = unpack('>i', header[24:28])[0]
    print('File change counter: ' + str(file_change_counter))
    sqlite_version_number = unpack('>i', header[96:])[0]
    print('SQLITE_VERSION_NUMBER: ' + str(sqlite_version_number))

db = FileIO('test.db', mode='r')
read_header(db)
This only reads the database header, and ignores most of the values in the header.
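If you want to go further, the next thing on page 1 is the b-tree page header, which starts at offset 100, right after the file header. Here is a minimal sketch of reading it with the same approach (field offsets are taken from the same file-format page; the function name is just for illustration):
def read_page1_btree_header(db):
    # the b-tree page header of page 1 begins right after the 100-byte file header
    db.seek(100)
    btree_header = bytearray(8)
    db.readinto(btree_header)
    page_type = btree_header[0]  # 13 = table leaf, 5 = table interior, 10/2 = index pages
    num_cells = unpack('>H', btree_header[3:5])[0]
    content_start = unpack('>H', btree_header[5:7])[0]  # 0 means 65536
    print('page type: ' + str(page_type))
    print('number of cells: ' + str(num_cells))
    print('cell content starts at: ' + str(content_start))

read_page1_btree_header(db)
From there, the cell pointer array and the record format section of the documentation tell you how to walk the actual rows.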

Related

how to ingest a table specification file in .txt form and create a table in sqlite?

I tried a few different ways (below) but am having trouble a) removing the width and b) replacing the \n with a comma. I have a txt file like the one below and I want to take that information and create a table in sqlite (all using python)
"field",width, type
name, 15, string
revenue, 10, decimal
invoice_date, 10, string
amount, 2, integer
Current python code - trying to read in the file, and get the values to pass in the sql statement below
import os
import pandas as pd
dir_path = os.path.dirname(os.path.realpath(__file__))
file = open(str(dir_path) + '/revenue/revenue_table_specifications.txt','r')
lines = file.readlines()
table = lines[2::]
s = ''.join(str(table).split(','))
x = s.replace("\n", ",").strip()
print(x)
sql I want to pass in
c = sqlite3.connect('rev.db')  # connect to DB
try:
    c.execute('''CREATE TABLE
        revenue_table (information from txt file,
        information from txt file,
        ....)''')
except sqlite3.OperationalError:  # i.e. table exists already
    pass
This produces something that will work.
def makesql(filename):
    s = []
    for row in open(filename):
        if row[0] == '"':
            continue
        parts = row.strip().split(", ")
        s.append( f"{parts[0]} {parts[2]}" )
    return "CREATE TABLE revenue_table (\n" + ",\n".join(s) + ");"

sql = makesql( 'x.csv' )
print(sql)
c.execute( sql )
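If the spec file ever contains quoted values or extra spaces, a variant built on the csv module avoids the hand-splitting. This is only a sketch along the same lines (same hypothetical x.csv file, and it still ignores the width column):
import csv

def makesql_csv(filename):
    cols = []
    with open(filename, newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        next(reader)  # skip the "field", width, type header row
        for field, width, coltype in reader:
            cols.append(f"{field} {coltype}")
    return "CREATE TABLE revenue_table (\n" + ",\n".join(cols) + ");"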

psycopg2 dictionary dump to json

Sorry if this is a noob question, but I am trying to dump a psycopg2 dictionary directly into a JSON string. I do get a return value in the browser, but it isn't formatted like most of the other JSON examples I see. The idea is to dump the result of a select statement into a JSON string and unbundle it on the other end to add into a database on the client side. The code is below, along with a sample of the return. Is there a better way to do this operation with json and psycopg2?
# initializing variables
location_connection = location_cursor = 0
sql_string = coordinate_return = data = ""
# opening connection and setting cursor
location_connection = psycopg2.connect("dbname='' user='' password=''")
location_cursor = location_connection.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
# setting sql string and executing query
sql_string = "select * from " + tablename + " where left(alphacoordinate," + str(len(coordinate)) + ") = '" + coordinate + "' order by alphacoordinate;"
location_cursor.execute(sql_string)
data = json.dumps(location_cursor.fetchall())
# closing database connection
location_connection.close()
# returning coordinate string
return data
sample return
"[{\"alphacoordinate\": \"nmaan-001-01\", \"xcoordinate\":
3072951151886, \"planetarydiameter\": 288499, \"planetarymass\":
2.020936938e+27, \"planetarydescription\": \"PCCGQAAA\", \"planetarydescriptionsecondary\": 0, \"moons\": 1}]"
You could create the JSON string directly in Postgres using row_to_json:
# setting sql string and executing query
sql_string = "select row_to_json(" + tablename + ") from " + tablename + " where left(alphacoordinate," + str(len(coordinate)) + ") = '" + coordinate + "' order by alphacoordinate;"
location_cursor.execute(sql_string)
data = location_cursor.fetchall()
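Since tablename and coordinate are still interpolated straight into the string, this keeps the original injection risk. Here is a sketch of the same query using psycopg2.sql for the identifier and a bound parameter for the value (assuming the same tablename and coordinate variables as above):
from psycopg2 import sql

query = sql.SQL(
    "select row_to_json({tbl}) from {tbl} "
    "where left(alphacoordinate, %s) = %s order by alphacoordinate"
).format(tbl=sql.Identifier(tablename))
location_cursor.execute(query, (len(coordinate), coordinate))
data = location_cursor.fetchall()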

python sql connection - pypyodbc - sequence item 0: expected str instance, bytes found

I have a function which, when passed server, database, table and access details, connects to a table in SQL Server and reads all of its contents into a pandas dataframe
def GET_DATA(source_server, source_database, source_table, source_username, source_password):
    print('******* GETTING DATA ', source_server, '.', source_database, '.', source_table, '.', source_username, '*******')
    data_collected = []
    # SOURCE
    connection = pypyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                                  'Server=' + source_server + ';'
                                  'Database=' + source_database + ' ;'
                                  'uid=' + source_username + ';pwd=' + source_password + '')
    # OPEN THE CONNECTION
    cursor = connection.cursor()
    # BUILD THE COMMAND
    SQLCommand = ("SELECT * FROM " + source_database + ".dbo." + source_table)
    # RUN THE QUERY
    cursor.execute(SQLCommand)
    # GET RESULTS
    results = cursor.fetchone()
    columnList = [column[0] for column in cursor.description]
    # print(type(columnList))
    while results:
        data_collected.append(results)
        results = cursor.fetchone()
    df_column = pd.DataFrame(columnList)
    df_column = df_column.transpose()
    df_result = pd.DataFrame(data_collected)
    frames = [df_column, df_result]
    df = pd.concat(frames)
    print('GET_DATA COMPLETE!')
    return df
Most of the time this works fine, however, for reasons I can't identify I get this error
sequence item 0: expected str instance, bytes found
What is causing this and how do I account for it?
thx !
I found a much better way of extracting data from SQL to pandas
import pyodbc
import pandas as pd

def GET_DATA_TO_PANDAS(source_server, source_database, source_table, source_username, source_password):
    print('***** STARTING DATA TO PANDAS ********* ')
    con = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                         'Server=' + source_server + ';'
                         'Database=' + source_database + ' ;'
                         'uid=' + source_username + ';pwd=' + source_password + '')
    # BUILD QUERY
    query = "SELECT * FROM " + source_database + ".dbo." + source_table
    df = pd.read_sql(query, con)
    return df
Used this link - https://www.quora.com/How-do-I-get-data-directly-from-databases-DB2-Oracle-MS-SQL-Server-into-Pandas-DataFrames-using-Python
I experienced a similar issue in one of my projects. This exception is raised by the Microsoft ODBC driver. In my case the issue occurred while fetching the results from the DB, probably at this line:
cursor.fetchone()
The reason for this exception, as far as I understood it, is the size and content of the data coming back from SQL Server to Python. There might be one specific huge row in the DB causing it. If the row has unicode or non-ASCII characters, the driver can exceed its buffer length and fail to convert the nvarchar data to bytes and then from the bytes object back to a string. When the driver encounters certain special characters it sometimes cannot convert the bytes object back to a string, so it hands the bytes object to Python as-is. I think that is the reason for the exception.
Digging a bit deeper into that specific data row might help you.
I also found another similar issue here - Click here
May be this URL (Microsoft ODBC driver's known issue) might help too - Click here
I got the same error using python 3, as follows: I defined a MS SQL column as nchar, stored an empty string (which in python 3 is unicode), then retrieved the row with the pypyodbc call cursor.fetchone(). It failed on this line:
if raw_data_parts != []:
    if py_v3:
        if target_type != SQL_C_BINARY:
            raw_value = ''.join(raw_data_parts)
            # FAILS WITH "sequence item 0: expected str instance, bytes found"
            ....
Changing the column datatype to nvarchar in the database fixed it.
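If changing the column type is not possible, a workaround on the Python side is to decode any stray bytes values after fetching. This is only a sketch, and the utf-8 guess has to match what your driver and column collation actually return:
def decode_row(row):
    # convert any bytes values the driver hands back into strings
    return tuple(
        v.decode('utf-8', errors='replace') if isinstance(v, bytes) else v
        for v in row
    )

rows = [decode_row(r) for r in cursor.fetchall()]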

CSV - MYSQL Using Python

After reading several inputs I still can't get this to work.
Most likely I'm doing it all wrong but I've tried several different approaches
What I'm trying to do is extract data from a CSV and add it into my newly created database/table
My csv input looks like this
NodeName,NeId,Object,Time,Interval,Direction,NeAlias,NeType,Position,AVG,MAX,MIN,percent_0-5,percent_5-10,percent_10-15,percent_15-20,percent_20-25,percent_25-30,percent_30-35,percent_35-40,percent_40-45,percent_45-50,percent_50-55,percent_55-60,percent_60-65,percent_65-70,percent_70-75,percent_75-80,percent_80-85,percent_85-90,percent_90-95,percent_95-100,IdLogNum,FailureDescription
X13146PAZ,5002,1/11/100,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,69684,217287,772,10563,8055,10644,15147,16821,13610,7658,2943,784,152,20,3,0,0,0,0,0,0,0,0,0,-
...
X13146PAZ,5002,1/11/102,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,3056,28315,215,86310,90,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-
...
X13146PAZ,5002,1/11/103,2016-05-16 00:00:00,24,Near End,GE0097-TN01.1,AMM 20PB,-,769,7195,11,86400,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-
The MySQL table is created, but possibly that might be part of the issue, as some columns are varchar and some are integer.
My server runs Ubuntu, if that is of any use.
My Code
# -*- coding: utf-8 -*-
# Imports
from datetime import date, timedelta
import sys
import MySQLdb as mdb
import csv
import os

# Vars
Yesterday = date.today() - timedelta(1)

# Opening document
RX_Document = open('./reports/X13146PAZ_TN_WAN_ETH_BAND_RX_' + Yesterday.strftime("%Y%m%d") + "_231500.csv", 'r')
RX_Document_Str = './reports/X13146PAZ_TN_WAN_ETH_BAND_RX_' + Yesterday.strftime("%Y%m%d") + "_231500.csv"
csv_data = csv.reader(file(RX_Document_Str))
con = mdb.connect('localhost', 'username', 'password', 'tn_rx_utilization')
counter = 0
for row in csv_data:
    if counter == 0:
        print row
        continue
    counter = 1
    if counter == 1:
        cur = con.cursor()
        cur.execute('INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1,Direction,NeAlias,NeType,Position,AVG,MAX,MIN,percent_5-10,percent_10-15,percent_15-20,percent_20-25,percent_25-30,percent_30-35,percent_35-40,percent_40-45,percent_45-50,percent_50-55,percent_55-60,percent_60-65,percent_65-70,percent_70-75,percent_75-80,percent_80-85,percent_85-90,percent_90-95,percent_95-100,IdLogNum,FailureDescription) VALUES("%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s","%s")', tuple(row[:34]))
        con.commit()

# cur.execute("SELECT VERSION()")
# ver = cur.fetchone()
con.commit()
con.close()
You should not put the %s placeholders in quotes ("):
cur.execute('''INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1,Direction,
NeAlias,NeType,Position,AVG,MAX,MIN,"percent_5-10","percent_10-15",
"percent_15-20","percent_20-25","percent_25-30","percent_30-35",
"percent_35-40","percent_40-45","percent_45-50","percent_50-55",
"percent_55-60","percent_60-65","percent_65-70","percent_70-75",
"percent_75-80","percent_80-85","percent_85-90","percent_90-95",
"percent_95-100",IdLogNum,FailureDescription)
VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,
%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)''', tuple(row[:33]))
You are missing Percent_0-5 from your Insert
Remove the quotes from the %s references; the statement itself is a string, but the underlying data type of each value will be passed through by the driver.
There may be issues with data types resulting from the csv reader, since every value is read as a string. Have Python eval() or otherwise convert the csv data to the correct type (e.g. int) before inserting. Here is some more information from another post:
Read data from csv-file and transform to correct data-type
cur.execute('INSERT INTO RX_UTIL(NodeName, NeId, Object, Time, Interval1,Direction,NeAlias,NeType,Position,AVG,MAX,MIN,percent_0-5,percent_5-10,percent_10-15,percent_15-20,percent_20-25,percent_25-30,percent_30-35,percent_35-40,percent_40-45,percent_45-50,percent_50-55,percent_55-60,percent_60-65,percent_65-70,percent_70-75,percent_75-80,percent_80-85,percent_85-90,percent_90-95,percent_95-100,IdLogNum,FailureDescription)' 'VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)',tuple(row[:34]))
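Putting the pieces together, here is a sketch of the same load using executemany and backticks around the hyphenated column names (this assumes the RX_UTIL table, the con connection and the RX_Document_Str csv file from the question; untested against the real schema):
columns = ["NodeName", "NeId", "Object", "Time", "Interval1", "Direction", "NeAlias",
           "NeType", "Position", "AVG", "MAX", "MIN",
           "percent_0-5", "percent_5-10", "percent_10-15", "percent_15-20", "percent_20-25",
           "percent_25-30", "percent_30-35", "percent_35-40", "percent_40-45", "percent_45-50",
           "percent_50-55", "percent_55-60", "percent_60-65", "percent_65-70", "percent_70-75",
           "percent_75-80", "percent_80-85", "percent_85-90", "percent_90-95", "percent_95-100",
           "IdLogNum", "FailureDescription"]

# backticks protect the hyphenated column names; %s placeholders let the driver do the quoting
col_list = ", ".join("`%s`" % c for c in columns)
placeholders = ", ".join(["%s"] * len(columns))
insert_sql = "INSERT INTO RX_UTIL (%s) VALUES (%s)" % (col_list, placeholders)

with open(RX_Document_Str) as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    cur = con.cursor()
    cur.executemany(insert_sql, [tuple(row[:34]) for row in reader])
con.commit()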

Inserting JSON into MySQL using Python

I have a JSON object in Python. I am using the Python DB-API and SimpleJson. I am trying to insert the JSON into a MySQL table.
At the moment I am getting errors, and I believe it is due to the single quotes ' in the JSON objects.
How can I insert my JSON Object into MySQL using Python?
Here is the error message I get:
error: uncaptured python exception, closing channel
<twitstream.twitasync.TwitterStreamPOST connected at
0x7ff68f91d7e8> (<class '_mysql_exceptions.ProgrammingError'>:
(1064, "You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for
the right syntax to use near ''favorited': '0',
'in_reply_to_user_id': '52063869', 'contributors':
'NULL', 'tr' at line 1")
[/usr/lib/python2.5/asyncore.py|read|68]
[/usr/lib/python2.5/asyncore.py|handle_read_event|390]
[/usr/lib/python2.5/asynchat.py|handle_read|137]
[/usr/lib/python2.5/site-packages/twitstream-0.1-py2.5.egg/
twitstream/twitasync.py|found_terminator|55] [twitter.py|callback|26]
[build/bdist.linux-x86_64/egg/MySQLdb/cursors.py|execute|166]
[build/bdist.linux-x86_64/egg/MySQLdb/connections.py|defaulterrorhandler|35])
Another error for reference
error: uncaptured python exception, closing channel
<twitstream.twitasync.TwitterStreamPOST connected at
0x7feb9d52b7e8> (<class '_mysql_exceptions.ProgrammingError'>:
(1064, "You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right
syntax to use near 'RT #tweetmeme The Best BlackBerry Pearl
Cell Phone Covers http://bit.ly/9WtwUO''' at line 1")
[/usr/lib/python2.5/asyncore.py|read|68]
[/usr/lib/python2.5/asyncore.py|handle_read_event|390]
[/usr/lib/python2.5/asynchat.py|handle_read|137]
[/usr/lib/python2.5/site-packages/twitstream-0.1-
py2.5.egg/twitstream/twitasync.py|found_terminator|55]
[twitter.py|callback|28] [build/bdist.linux-
x86_64/egg/MySQLdb/cursors.py|execute|166] [build/bdist.linux-
x86_64/egg/MySQLdb/connections.py|defaulterrorhandler|35])
Here is a link to the code that I am using http://pastebin.com/q5QSfYLa
#!/usr/bin/env python
try:
    import json as simplejson
except ImportError:
    import simplejson
import twitstream
import MySQLdb

USER = ''
PASS = ''
USAGE = """%prog"""

conn = MySQLdb.connect(host = "",
                       user = "",
                       passwd = "",
                       db = "")

# Define a function/callable to be called on every status:
def callback(status):
    twitdb = conn.cursor()
    twitdb.execute("INSERT INTO tweets_unprocessed (text, created_at, twitter_id, user_id, user_screen_name, json) VALUES (%s,%s,%s,%s,%s,%s)", (status.get('text'), status.get('created_at'), status.get('id'), status.get('user', {}).get('id'), status.get('user', {}).get('screen_name'), status))
    # print status
    # print "%s:\t%s\n" % (status.get('user', {}).get('screen_name'), status.get('text'))

if __name__ == '__main__':
    # Call a specific API method from the twitstream module:
    # stream = twitstream.spritzer(USER, PASS, callback)
    twitstream.parser.usage = USAGE
    (options, args) = twitstream.parser.parse_args()
    if len(args) < 1:
        args = ['Blackberry']
    stream = twitstream.track(USER, PASS, callback, args, options.debug, engine=options.engine)
    # Loop forever on the streaming call:
    stream.run()
Use json.dumps(json_value) to convert your JSON object (Python object) into a JSON string that you can insert into a text field in MySQL:
http://docs.python.org/library/json.html
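Applied to the callback above, that would look roughly like this (a sketch; only the last parameter changes, using the simplejson name imported in the question's script):
twitdb.execute(
    "INSERT INTO tweets_unprocessed (text, created_at, twitter_id, user_id, user_screen_name, json) "
    "VALUES (%s,%s,%s,%s,%s,%s)",
    (status.get('text'), status.get('created_at'), status.get('id'),
     status.get('user', {}).get('id'), status.get('user', {}).get('screen_name'),
     simplejson.dumps(status)))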
To expand on the other answers:
Basically you need to make sure of two things:
That the field you are inserting into has room for the full amount of data you want to place in it. Different database field types can fit different amounts of data.
See: MySQL String Datatypes. You probably want the "TEXT" or "BLOB" types.
That you are safely passing the data to database. Some ways of passing data can cause the database to "look" at the data and it will get confused if the data looks like SQL. It's also a security risk. See: SQL Injection
The solution for #1 is to check that the database is designed with correct field type.
The solution for #2 is to use parameterized (bound) queries. For instance, instead of:
# Simple, but naive, method.
# Notice that you are passing in 1 large argument to db.execute()
db.execute("INSERT INTO json_col VALUES (" + json_value + ")")
Better, use:
# Correct method. Uses parameter/bind variables.
# Notice that you are passing in 2 arguments to db.execute()
db.execute("INSERT INTO json_col VALUES %s", json_value)
Hope this helps. If so, let me know. :-)
If you are still having a problem, then we will need to examine your syntax more closely.
The most straightforward way to insert a python map into a MySQL JSON field...
python_map = { "foo": "bar", [ "baz", "biz" ] }
sql = "INSERT INTO your_table (json_column_name) VALUES (%s)"
cursor.execute( sql, (json.dumps(python_map),) )
You should be able to insert into a text or blob column easily
db.execute("INSERT INTO json_col VALUES (%s)", (json_value,))
You need to get a look at the actual SQL string; try something like this:
sqlstr = "INSERT INTO tweets_unprocessed (text, created_at, twitter_id, user_id, user_screen_name, json) VALUES (%s,%s,%s,%s,%s,%s)"
args = (status.get('text'), status.get('created_at'), status.get('id'), status.get('user', {}).get('id'), status.get('user', {}).get('screen_name'), status)
print "about to execute(%s, %s)" % (sqlstr, args)
twitdb.execute(sqlstr, args)
I imagine you are going to find some stray quotes, brackets or parentheses in there.
@route('/shoes', method='POST')
def createorder():
    cursor = db.cursor()
    data = request.json
    p_id = request.json['product_id']
    p_desc = request.json['product_desc']
    color = request.json['color']
    price = request.json['price']
    p_name = request.json['product_name']
    q = request.json['quantity']
    createDate = datetime.now().isoformat()
    print (createDate)
    response.content_type = 'application/json'
    print(data)
    if not data:
        abort(400, 'No data received')
    sql = "insert into productshoes (product_id, product_desc, color, price, product_name, quantity, createDate) values ('%s', '%s','%s','%d','%s','%d', '%s')" % (p_id, p_desc, color, price, p_name, q, createDate)
    print (sql)
    try:
        # Execute dml and commit changes
        cursor.execute(sql, data)
        db.commit()
        cursor.close()
    except:
        # Rollback changes
        db.rollback()
    return dumps(("OK"), default=json_util.default)
One example of how to add a JSON file into MySQL using Python. This requires converting the JSON file into an SQL INSERT; if there are several JSON objects, it is better to build a single INSERT than to issue a separate INSERT INTO call for each object.
# import Python's JSON lib
import json
# use JSON loads to create a list of records
test_json = json.loads('''
[
{
"COL_ID": "id1",
"COL_INT_VAULE": 7,
"COL_BOOL_VALUE": true,
"COL_FLOAT_VALUE": 3.14159,
"COL_STRING_VAULE": "stackoverflow answer"
},
{
"COL_ID": "id2",
"COL_INT_VAULE": 10,
"COL_BOOL_VALUE": false,
"COL_FLOAT_VALUE": 2.71828,
"COL_STRING_VAULE": "http://stackoverflow.com/"
},
{
"COL_ID": "id3",
"COL_INT_VAULE": 2020,
"COL_BOOL_VALUE": true,
"COL_FLOAT_VALUE": 1.41421,
"COL_STRING_VAULE": "GIRL: Do you drink? PROGRAMMER: No. GIRL: Have Girlfriend? PROGRAMMER: No. GIRL: Then how do you enjoy life? PROGRAMMER: I am Programmer"
}
]
''')
# create a nested list of the records' values
values = [list(x.values()) for x in test_json]
# print(values)
# get the column names
columns = [list(x.keys()) for x in test_json][0]
# value string for the SQL string
values_str = ""
# enumerate over the records' values
for i, record in enumerate(values):
    # declare empty list for values
    val_list = []
    # append each value to a new list of values
    for v, val in enumerate(record):
        if type(val) == str:
            val = "'{}'".format(val.replace("'", "''"))
        val_list += [ str(val) ]
    # put parentheses around each record string
    values_str += "(" + ', '.join( val_list ) + "),\n"

# remove the last comma and end SQL with a semicolon
values_str = values_str[:-2] + ";"
# concatenate the SQL string
table_name = "json_data"
sql_string = "INSERT INTO %s (%s)\nVALUES\n%s" % (
table_name,
', '.join(columns),
values_str
)
print("\nSQL string:\n\n")
print(sql_string)
output:
SQL string:
INSERT INTO json_data (COL_ID, COL_INT_VAULE, COL_BOOL_VALUE, COL_FLOAT_VALUE, COL_STRING_VAULE)
VALUES
('id1', 7, True, 3.14159, 'stackoverflow answer'),
('id2', 10, False, 2.71828, 'http://stackoverflow.com/'),
('id3', 2020, True, 1.41421, 'GIRL: Do you drink? PROGRAMMER: No. GIRL: Have Girlfriend? PROGRAMMER: No. GIRL: Then how do you enjoy life? PROGRAMMER: I am Programmer.');
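If you already have a live connection, the same records can also be inserted without any hand-escaping by letting the driver bind the values. A sketch, assuming a MySQLdb/mysql-connector style cursor named cur, a connection named conn, and the columns and values lists built above:
placeholders = ', '.join(['%s'] * len(columns))
insert_sql = "INSERT INTO json_data (%s) VALUES (%s)" % (', '.join(columns), placeholders)
# values is the nested list of record values built earlier; the driver quotes each one
cur.executemany(insert_sql, [tuple(record) for record in values])
conn.commit()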
The error may be due to the value overflowing the size of the field in which you try to insert your JSON. Without any code, it is hard to help you.
Have you considered a NoSQL database system such as CouchDB, which is a document-oriented database relying on the JSON format?
Here's a quick tip if you want to write some inline code, say for a small JSON value, without importing json.
You can escape quotes in SQL by doubling them, i.e. use '' or "" to enter ' or ".
Sample Python code (not tested):
q = 'INSERT INTO `table`(`db_col`) VALUES ("{k:""some data"";}")'
db_connector.execute(q)
