I need help with a Python 3.7 sqlite3 database. I've created it and inserted some data, but then I can't work with that data.
For example:
test = db.execute("SELECT MESSAGE from TEST")
for row in test:
    print(row[0])
This is the only usage I've found. But what if I want to work with the data? What if I now want to do something like:

if row[0] == 1:
    ...

I could not make that work. Can you help me? Thank you.
A query returns its results as a sequence of rows, and each row is a sequence of the selected column values. More precisely, the rows are tuples, but let's keep it simple. In your example there could be many rows, each with a single column of data.
First materialize the results into a list (a sqlite3 cursor itself is not indexable), then index into it. To access the first row:

rows = test.fetchall()
rows[0]

To access the data in the second row:

rows[1][0]

Your example:

if rows[0][0] == 1:
    ...
I hope that helped.
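To make the answer above concrete, here is a minimal, self-contained sketch using an in-memory database; the TEST table and MESSAGE column mirror the question, and the inserted values are made up:

```python
import sqlite3

# In-memory database standing in for the question's file database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE TEST (MESSAGE INTEGER)")
db.executemany("INSERT INTO TEST (MESSAGE) VALUES (?)", [(1,), (2,), (3,)])

# fetchall() turns the cursor into a plain list of tuples that can
# be indexed and compared like any other Python data.
rows = db.execute("SELECT MESSAGE FROM TEST").fetchall()

if rows[0][0] == 1:
    print("first message is 1")
```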
Related
I have a Python program I am trying to convert from CSV to SQLite. I have managed to do everything except removing duplicates when counting entries. My query JOINs tables. I'm reading the database like this:
df = pd.read_sql_query("SELECT d.id AS is, mac.add AS mac etc etc
I have tried

df.drop_duplicates('tablename1', 'tablename2')

and

df.drop_duplicates('row[1],row[3]')

but neither seems to work.
The code below is what I used with the CSV version, and I would like to replicate it for the SQLite script.

for row in reader:
    key = (row[1], row[2])
    if key not in entries:
        writer.writerow(row)
        entries.add(key)
del writer
Have you tried running SELECT DISTINCT col1, col2 FROM table first? In your case it might be as simple as placing the DISTINCT keyword before your column names.
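A quick sketch of the DISTINCT suggestion, using a hypothetical two-column table with made-up values:

```python
import sqlite3

# Hypothetical table and values for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [("a", "x"), ("a", "x"), ("b", "y")])

# DISTINCT drops duplicate (col1, col2) pairs on the database side,
# before the data ever reaches Python or pandas.
unique_rows = conn.execute(
    "SELECT DISTINCT col1, col2 FROM readings").fetchall()
```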
You need to use the subset parameter
df.drop_duplicates(subset=['tablename1','tablename2'])
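A short sketch of the subset= behavior, with hypothetical column names standing in for the question's JOINed columns; only the columns named in subset= decide what counts as a duplicate:

```python
import pandas as pd

df = pd.DataFrame({
    "id":  [1, 1, 2],
    "mac": ["aa", "aa", "bb"],
    "val": [10, 11, 12],   # differs between rows, but ignored by subset=
})

# Row 1 is dropped: it matches row 0 on both subset columns.
deduped = df.drop_duplicates(subset=["id", "mac"])
```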
Thank you piRSquared, the missing subset parameter was all I needed.
I will also look into SELECT DISTINCT, but for now subset works.
I am trying to pull data from a database and assign it to different lists. This specific error is giving me a lot of trouble: "TypeError: tuple indices must be integers, not str". I tried converting to float and so on, but with no success.
The code goes as below:

conn = MySQLdb.connect(*details*)
cursor = conn.cursor()
ocs = {}
oltv = {}
query = "select pool_number, average_credit_score as waocs, average_original_ltv as waoltv from *tablename* where as_of_date = *date*"
cursor.execute(query)
result = cursor.fetchall()
for row in result:
    print row
    ocs[row["pool_number"]] = int(row["waocs"])
    oltv[row["pool_number"]] = int(row["waoltv"])
Sample output of the print statement:
('MA3146', 711L, 81L)
('MA3147', 679L, 83L)
('MA3148', 668L, 86L)
And this is the exact error I am getting:
ocs[row["pool_number"]]=int(row["waocs"])
TypeError: tuple indices must be integers, not str
Any help would be appreciated! Thanks people!
Like the error says, row is a tuple, so you can't do row["pool_number"]. You need to use the index: row[0].
I think you should do

for index, row in enumerate(result):

if you also want an index alongside each row.
TL;DR: add the parameter cursorclass=MySQLdb.cursors.DictCursor at the end of your MySQLdb.connect.
I had a working code and the DB moved, I had to change the host/user/pass. After this change, my code stopped working and I started getting this error. Upon closer inspection, I copy-pasted the connection string on a place that had an extra directive. The old code read like:
conn = MySQLdb.connect(host="oldhost",
user="olduser",
passwd="oldpass",
db="olddb",
cursorclass=MySQLdb.cursors.DictCursor)
Which was replaced by:
conn = MySQLdb.connect(host="newhost",
user="newuser",
passwd="newpass",
db="newdb")
The cursorclass=MySQLdb.cursors.DictCursor parameter at the end is what let Python access the rows using the column names as indices. The sloppy copy-paste dropped it, yielding the error.

So, as an alternative to the solutions already presented, you can also add this parameter and access the rows the way you originally wanted. ^_^ I hope this helps others.
I know this is not specific to the question, but for anyone coming in from a Google search: this error can also be caused by a trailing comma after an object, which creates a tuple rather than a dictionary:
Tuple:

>>> tuple_ = {'key': 'value'},
>>> type(tuple_)
<class 'tuple'>

Dictionary:

>>> dict_ = {'key': 'value'}
>>> type(dict_)
<class 'dict'>
Just adding the parameter below worked for me:

cursor = conn.cursor(dictionary=True)

I hope this is helpful as well.
The problem is how you access row — specifically row["waocs"] and row["pool_number"] in ocs[row["pool_number"]] = int(row["waocs"]). If you look up the official documentation of fetchall(), you find:

    The method fetches all (or all remaining) rows of a query result set and returns a list of tuples.

Therefore you have to access the row values with an integer index, like row[0].
sqlite3 has an attribute named row_factory. Setting it (for example to sqlite3.Row) lets you access values by column name.
https://www.kite.com/python/examples/3884/sqlite3-use-a-row-factory-to-access-values-by-column-name
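A minimal sketch of the row_factory approach with sqlite3; the column names (pool_number, waocs) mirror the question's aliases, and the data is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row   # rows now support name-based lookup
cur = conn.cursor()
cur.execute("CREATE TABLE pools (pool_number TEXT, waocs INTEGER)")
cur.execute("INSERT INTO pools VALUES ('MA3146', 711)")

row = cur.execute("SELECT pool_number, waocs FROM pools").fetchone()
# Both styles work with sqlite3.Row: by name and by position.
print(row["pool_number"], row[1])
```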
I see that you're trying to index a row by name. If you are looking for a specific column within a row, you can do row[integer][column name]. For example, to iterate through each row and only pull out the value in the "pool number" column:
for row in df_updated.iterrows():
    cell = row[1]['pool number']
    print(cell)
The code will then iterate through each row but only print out the value that matches the "pool number" column
I have a problem with this code:

cur.execute('SELECT Balance FROM accounts')
print(cur.fetchone())

It outputs (0,) instead of the 0 I want. Can anyone help fix this? Any help is much appreciated!
fetchone() returns a single table row, which may contain multiple columns. In your case it is a single column value returned in a tuple. Just get it by index:

data = cur.fetchone()
print(data[0])
There could be more than one value in your query result, so it always returns a tuple (you wouldn't want an interface whose return type changes depending on the data you pass it, would you?).
You can unpack the tuple:
value, = cur.fetchone()
See the last paragraph of the documentation on tuples and sequences for information about sequence unpacking
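Reproducing the question with an in-memory table (the accounts table and Balance column come from the question; the stored balance is made up) shows both answers in action:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (Balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (0)")
cur = conn.cursor()
cur.execute("SELECT Balance FROM accounts")

balance, = cur.fetchone()   # the trailing comma unpacks the 1-tuple
print(balance)              # prints 0, not (0,)
```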
I am attempting to insert parsed .dta data into a PostgreSQL database, with each row becoming a separate variable table, and it was working until I added the second column, recodeid_fk. The error I now get when running this code is: pg8000.errors.ProgrammingError: ('ERROR', '42601', 'syntax error at or near "imp"').
Eventually I want to be able to parse multiple files at the same time and insert the data into the database, but if anyone could help me understand what's going on now, that would be fantastic. I am using Python 2.7.5, the statareader from the pandas 0.12 development branch, and I have very little experience in Python.
dr = statareader.read_stata('file.dta')
a = 2
t = 1
for t in range(1,10):
    z = str(t)
for date, row in dr.iterrows():
    cur.execute("INSERT INTO tblv00{} (data, recodeid_fk) VALUES({}, {})".format(z, str(row[a]), 29))
    a += 1
    t += 1
conn.commit()
cur.close()
conn.close()
To your specific error...

The syntax error probably comes from string values {} that need quotes around them. execute() can take care of this for you automatically. Replace

cur.execute("INSERT INTO tblv00{} (data, recodeid_fk) VALUES({}, {})".format(z, str(row[a]), 29))

with

cur.execute("INSERT INTO tblv00{} (data, recodeid_fk) VALUES(%s, %s)".format(z), (row[a], 29))

The table name is completed the same way as before, but the values will be filled in by execute, which inserts quotes where they are needed. Maybe execute could fill in the table name too, and we could drop format entirely, but that would be unusual usage, and I'm guessing execute might (wrongly) put quotes in the middle of the name.
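The same pattern can be sketched against sqlite3 (whose placeholder is ? where pg8000 uses %s); the table suffix and inserted values are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
z = "1"
cur.execute("CREATE TABLE tblv00{} (data TEXT, recodeid_fk INTEGER)".format(z))

# The table name still goes through format(), but the values travel
# separately, so the driver quotes the string safely -- no more
# 'syntax error at or near "imp"'.
cur.execute("INSERT INTO tblv00{} (data, recodeid_fk) VALUES (?, ?)".format(z),
            ("imp", 29))

row = cur.execute("SELECT data, recodeid_fk FROM tblv001").fetchone()
```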
But there's a nicer approach...
Pandas includes a function for writing DataFrames to SQL tables. PostgreSQL is not yet supported, but in simple cases you should be able to pretend you are connected to a sqlite or MySQL database and have no trouble.

What do you intend with z here? As written, you loop z from '1' to '9' before proceeding to the next for loop. Should the loops be nested? That is, did you mean to insert the contents of dr into nine different tables called tblv001 through tblv009?

If you meant that loop to put different parts of dr into different tables, please check the indentation of your code and clarify it.
In either case, the link above should take care of the SQL insertion.
Response to Edit
It seems like t, z, and a are doing redundant things. How about:
import pandas as pd
import string
...
# Loop through columns of dr, and count them as we go.
for i, col in enumerate(dr):
    table_name = 'tblv' + string.zfill(str(i), 3)  # e.g., tblv001 or tblv010
    df1 = pd.DataFrame(dr[col]).reset_index()
    df1.columns = ['data', 'recodeid_fk']
    pd.io.sql.write_frame(df1, table_name, conn)
I used reset_index to make the index into a column. The new (sequential) index will not be saved by write_frame.
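In current pandas, write_frame has since been replaced by DataFrame.to_sql; the same idea, sketched against sqlite3 with made-up data and a hypothetical table name:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
df1 = pd.DataFrame({"data": [1.5, 2.5], "recodeid_fk": [29, 29]})

# index=False skips writing the DataFrame index as its own column,
# mirroring the reset_index trick in the answer above.
df1.to_sql("tblv001", conn, index=False)

rows = conn.execute("SELECT data, recodeid_fk FROM tblv001").fetchall()
```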
So I'm using MySQL to grab data from a database and feed it into a Python function. I import MySQLdb, connect to the database, and run a query like this:

conn.query('SELECT info FROM bag')
x = conn.store_result()
for row in x.fetch_row(100):
    print row

But my problem is that the data comes out like this: (1.234234,)(1.12342,)(3.123412,)
when I really want it to come out like this: 1.23424, 1.1341234, 5.1342314 (i.e. without parentheses), so I can feed it into a Python function. Does anyone know how I can grab the data from the database in a way that doesn't include parentheses?
Rows are returned as tuples, even if the query selects only one column. You can access the first (and only) item as row[0]. The first time through the for loop, row refers to the first row; the second time, the second row; and so on.

By the way, you say you are using MySQLdb, but the methods you are calling belong to the underlying _mysql library (low level, scarcely portable)... why?
You could also simply use this as your for loop:

for (info, ) in x.fetch_row(100):
    print info
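The same unpack-in-the-loop-header idiom can be sketched with sqlite3 and made-up values (the question uses the low-level _mysql API, but any sequence of one-element tuples behaves the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bag (info REAL)")
conn.executemany("INSERT INTO bag VALUES (?)", [(1.25,), (2.5,)])

values = []
for (info,) in conn.execute("SELECT info FROM bag"):
    values.append(info)   # info is the bare float, no parentheses
```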