Storing numpy array in sqlite3 database with Python

I have a problem storing a numpy array in an sqlite database. I have one table with Name and Data columns.
import sqlite3 as sql
from DIP import dip # function to calculate numpy array
name = input('Enter your full name\t')
data = dip()
con = sql.connect('Database.db')
c = con.cursor()
c.execute("CREATE TABLE IF NOT EXISTS database(Name text, Vein real )")
con.commit()
c.execute("INSERT INTO database VALUES(?,?)", (name, data))
con.commit()
c.execute("SELECT * FROM database")
df = c.fetchall()
print(data)
print(df)
con.close()
Everything runs fine, but when Data is stored, instead of this:
[('Name', 0.03908678 0.04326234 0.18298542 ..., 0.15228545 0.09972548 0.03992807)]
I have this:
[('Name', b'\xccX+\xa8.\x03\xa4?\xf7\xda[\x1f ..., x10l\xc7?\xbf\x14\x12\)]
What is the problem with this? Thank you.
P.S. I tried the solution from Python insert numpy array into sqlite3 database but it didn't work. Also, my numpy array is calculated with HOG (histogram of oriented gradients) from the skimage (scikit-image) library. Maybe that's the problem...
I also tried calculating and storing it with OpenCV 3, but I get the same issue.

On the assumption that it is saving data.tostring() to the database, I tried decoding it with fromstring.
Using your displayed string, and trimming off a few bytes, I got:
In [79]: np.fromstring(b'\xccX+\xa8.\x03\xa4?\xf7\xda[\x1f\x10l\xc7?', float)
Out[79]: array([ 0.03908678, 0.18298532])
There's at least one matching number, so this looks promising.
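To make the round trip explicit, here is a minimal sketch of my own (not from the question), assuming a 1-D float64 array; np.frombuffer is the modern replacement for the deprecated fromstring:
import sqlite3 as sql
import numpy as np

data = np.array([0.03908678, 0.04326234, 0.18298542])
con = sql.connect(':memory:')
c = con.cursor()
c.execute("CREATE TABLE IF NOT EXISTS database(Name text, Vein blob)")
c.execute("INSERT INTO database VALUES(?,?)", ('Name', data.tobytes()))
blob = c.execute("SELECT Vein FROM database").fetchone()[0]
# the dtype passed to frombuffer must match what tobytes() wrote
print(np.frombuffer(blob, dtype=np.float64))
con.close()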

I had a similar issue and found out that sqlite has a problem storing custom numpy float types (np.float32 in my case).
Convert the values to Python's built-in float and it will work fine:
[float(x) for x in data]
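Alternatively, you can register an adapter once and let sqlite3 do the conversion for you. A minimal sketch, assuming the values really are np.float32 (the table and column names here are illustrative):
import sqlite3
import numpy as np

# store np.float32 values as plain REALs from now on
sqlite3.register_adapter(np.float32, float)

con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE t(v real)")
con.execute("INSERT INTO t VALUES (?)", (np.float32(0.5),))
print(con.execute("SELECT v FROM t").fetchone())  # (0.5,)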

Related

Connecting Python to Oracle - input contains NaN, infinity or a value too large for dtype('float64') Error

I am new to Oracle and Python and I connected my Python to Oracle. I have a table in my Oracle database and I wanted to write a simple query to see my results, but it gave me this error:
Input contains NaN, infinity or a value too large for dtype('float64').
My code:
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

SQL_Query2 = pd.read_sql_query('''select Province_name, cnt from Provincepartnercnt''', conn)
x_test = pd.DataFrame(SQL_Query2, columns=['Province_name', 'cnt'])
SQL_Query = pd.read_sql_query('''select Province_name, cnt from Provincepartnercnt''', conn)
x_train = pd.DataFrame(SQL_Query, columns=['Province_name', 'cnt'])
myKNN = KNeighborsClassifier(n_neighbors=1)
myKNN.fit(x_test, x_train)
Also, my data types are not float: one of my columns is VARCHAR2(150 BYTE) and the other is NUMBER(38,0). I should also mention that none of my rows are null.
I found a way to read my data without using that query. Maybe the problem is using sklearn. (I just simply use this code.)
My code:
cursor = conn.cursor()
cursor.execute("select ostan_name, cnt from ostanpartnercnt")
for row in cursor:  # iterate the cursor directly; the original "for row in re" referenced an undefined name
    print(row)

Psycopg2 Postgres Unnesting a very long array to insert it into a tables column

My database is on postgres and is local
I have an array that is in the form of:
[1,2,3,...2600]
As you can see, it is a very long array, so I can't type the elements one by one to insert them.
So I wanted to use the unnest() function to make it like this:
1
2
3
...
2600
and maybe go from there.
However, I would still have to write out the whole array, like unnest(array[1,...,2600]), and of course that didn't work.
So how do I insert an array as rows of the same column all at once?
You can use execute_values to bulk-insert all your data into your table:
import psycopg2
from psycopg2.extras import execute_values
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
insert_query = "insert into table_name (col_name) values %s"
# create payload as list of tuples
data = [(i,) for i in range(1, 2601)]
execute_values(cursor, insert_query, data)
conn.commit()
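For long inputs like this, note that execute_values sends the rows in batches (100 per statement by default; tune with the page_size keyword argument), which is what makes it much faster than row-by-row inserts.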

How to store numpy.array in sqlite3 to take advantage of the sum function?

I am trying to use sqlite3 to compute the average of a numpy.array and I would like to take advantage of the sum function.
So far I have taken advantage of this post:
stackoverflow numpy.array
which helped me easily store and retrieve the arrays I need.
import sqlite3
import numpy
import io

def adapt_array(arr):
    out = io.BytesIO()
    numpy.save(out, arr)
    out.seek(0)
    return out.read()  # Python 3: return bytes; the old buffer() is Python 2 only

def convert_array(blob):
    out = io.BytesIO(blob)
    out.seek(0)
    return numpy.load(out)

sqlite3.register_adapter(numpy.ndarray, adapt_array)
sqlite3.register_converter("array", convert_array)

x1 = numpy.arange(12)
x2 = numpy.arange(12, 24)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test (idx int, arr array)")
cur.execute("insert into test (idx, arr) values (?, ?)", (1, x1))
cur.execute("insert into test (idx, arr) values (?, ?)", (2, x2))
cur.execute("select idx, sum(arr) from test")
data = cur.fetchall()
print(data)
but unfortunately the query output does not give me the sum of the arrays:
[(2, 0.0)]
I would like to go one step further and get the result I want directly from an SQL request. Thanks.
Edit: after reading stackoverflow: manipulation of numpy.array with sqlite3, I am more sceptical about the feasibility of this. Any way to get a result close to what I want would be appreciated.
Edit 2: in other words, what I am trying to do is redefine the sum function for the particular kind of data I am using. Is it doable? That's what was done to compress/uncompress the numpy.array.
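It should be doable with a custom aggregate: sqlite3 lets you register one via con.create_aggregate(). Continuing the snippet above, here is a minimal sketch (the class and the aggregate name "arrsum" are my own, not a library API); the aggregate sees the raw blobs, so it decodes with numpy.load and re-encodes with numpy.save:
class ArraySum:
    # sums arrays stored as .npy blobs by adapt_array above
    def __init__(self):
        self.total = None
    def step(self, blob):
        arr = numpy.load(io.BytesIO(blob))
        self.total = arr if self.total is None else self.total + arr
    def finalize(self):
        out = io.BytesIO()
        numpy.save(out, self.total)
        return out.getvalue()

con.create_aggregate("arrsum", 1, ArraySum)
cur.execute("select arrsum(arr) from test")
summed = numpy.load(io.BytesIO(cur.fetchone()[0]))
print(summed)  # expected: x1 + x2, i.e. [12 14 16 ...]
The result of arrsum() comes back as a blob (expression results bypass the declared-type converter), hence the manual numpy.load at the end.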

How to convert numpy array to postgresql list

I am trying to use python to insert 2 columns of a numpy array into a postgresql table as two arrays.
The postgresql table is DOS:
primary_key
energy integer[]
dos integer[]
I have a 2D numpy array made up of two 1D arrays:
finArray = np.array([energy,dos])
I am trying to use the following script to insert into the database, and I keep getting errors with the insert. I can't figure out how to format the array so that it ends up in the form: INSERT INTO dos VALUES(1,'{1,2,3}','{1,2,3}')
Script:
import psycopg2
import argparse
import sys
import re
import numpy as np
import os

con = None
try:
    con = psycopg2.connect(database='bla', user='bla')
    cur = con.cursor()
    cur.execute("INSERT INTO dos VALUES(1,'{%s}')", [str(finArray[0:3,0].tolist())[1:-1]])
    con.commit()
except psycopg2.DatabaseError as e:
    # Python 3 syntax; the original "except ..., e" form only works in Python 2
    if con:
        con.rollback()
    print('Error %s' % e)
    sys.exit(1)
finally:
    if con:
        con.close()
The part I can't figure out is that I get errors like this:
Error syntax error at or near "0.31691105000000003"
LINE 1: INSERT INTO dos VALUES(1,'{'0.31691105000000003, -300.0, -19...
I can't figure out where that inner quote inside the bracket is coming from.
Too late, but putting this out anyway.
I was trying to insert a numpy array into Redshift today. After trying odo, df.to_sql() and what not, I finally got this to work at a pretty fast speed (~3k rows/minute). I won't talk about the issues I faced with those tools but here's something simple that works:
cursor = conn.cursor()
args_str = b','.join(cursor.mogrify("(%s,%s,...)", x) for x in tuple(map(tuple, np_data)))
cursor.execute("insert into table (a,b,...) VALUES " + args_str.decode("utf-8"))
conn.commit()  # commit on the connection; psycopg2 cursors have no commit()
cursor.close()
The 2nd line will need some work based on the dimensions of your array.
You might want to check these answers too:
Converting from numpy array to tuple
Multiple row inserts in psycopg2
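For reference, a concrete version of that pattern, assuming a flat two-column table with one number per row (which is what the Redshift snippet above is doing); the table name dos_flat and the sample data are illustrative, not from the question:
import numpy as np
import psycopg2

finArray = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
conn = psycopg2.connect("dbname=bla user=bla")
cursor = conn.cursor()
# .tolist() yields plain Python floats, which psycopg2 can adapt
rows = list(zip(*finArray.tolist()))  # pair energy[i] with dos[i]
args = b','.join(cursor.mogrify("(%s,%s)", row) for row in rows)
cursor.execute("insert into dos_flat (energy, dos) values " + args.decode("utf-8"))
conn.commit()
cursor.close()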
You probably have an array of strings; try changing your command by adding astype(float), like:
cur.execute("INSERT INTO dos VALUES(1,'{%s}')", [str(finArray[0:3,0].astype(float).tolist())[1:-1]])
The quotes appear during numpy.ndarray.tolist() and appear because you actually have strings. If you don't want to assume the data is float-typed, as @Saullo Castro suggested, you could also do a simple str(finArray[0:3,0].tolist()).replace("'","")[1:-1] to get rid of them.
However, more appropriately, if you treat the data in finArray as numbers anywhere in your script, you should make sure the values are imported into the array as numbers to start with.
You can require the array to have a certain datatype when initiating it, e.g. finArray = np.array(..., dtype=float), and then work backwards to where it is suitable to enforce the type.
Psycopg will adapt a Python list to an array, so you just have to convert the numpy array to a Python list and pass it to the execute method:
import psycopg2
import numpy as np
energy = [1, 2, 3]
dos = [1, 2, 3]
finArray = np.array([energy,dos])
insert = """
insert into dos (pk, energy) values (1, %s);
;"""
conn = psycopg2.connect("host=localhost4 port=5432 dbname=cpn")
cursor = conn.cursor()
cursor.execute(insert, (list(finArray[0:3,0]),))
conn.commit()
conn.close()
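This works because psycopg2 adapts a Python list to a PostgreSQL ARRAY, and the elements of a float64 numpy array come out as np.float64, which subclasses Python's float, so they adapt cleanly too.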
You need to convert the numpy array to a list, for example:
import numpy as np
import psycopg2

fecha = 12
tipo = 1
precau = np.array([20.35, 25.34, 25.36978])

conn = psycopg2.connect("dbname='DataBase' user='Administrador' host='localhost' password='pass'")
cur = conn.cursor()

# make a list
vec1 = []
for k in precau:
    vec1.append(k)

# make a query
query = cur.mogrify("""UPDATE prediccioncaudal SET fecha=%s, precaudal=%s WHERE idprecau=%s;""", (fecha, vec1, tipo))
# execute the query
cur.execute(query)
# save changes
conn.commit()
# close connection
cur.close()
conn.close()
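(precau.tolist() would build the same list in one call, and it also converts the numpy scalars to plain Python numbers.)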

multidimensional array to database in python

I am trying to figure out a way to put a two-dimensional array (using Python) into an sqlite database. My array:
['hello', 'hello', 'hello'], ['hello', 'hello', 'hello']
What I want is for each tuple of 'hello's to become a new row, with each 'hello' being its own attribute. I'm not sure what I'm trying to do is even possible (I hope it is). I tried following a few other posts but I keep getting the error:
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.
Does anyone know how to insert a multidimensional array into a sqlite database? Any help would be appreciated. Here is my code:
import sqlite3
array2d = [['hello' for x in range(3)] for x in range(3)]
var_string = ', '.join('?' * len(array2d))
conn = sqlite3.connect('sample.db')
c = conn.cursor()
c.execute('''CREATE TABLE sample (Name TEXT, Line_1 TEXT, Line_2 TEXT)''')
query_string = 'INSERT INTO sample VALUES (%s);' % var_string
c.execute(query_string, array2d)
Use the cursor.executemany() method to insert a sequence of rows in one go:
query_string = 'INSERT INTO sample VALUES (?, ?, ?)'
c.executemany(query_string, array2d)
conn.commit()
Don't forget to conn.commit() the transaction.
I've not bothered with formatting the SQL parameters here, for demonstration purposes; you don't really want to do that here anyway, as your number of columns is fixed as well (by the CREATE TABLE definition).
