Summing a database column in Python

I recently ran into the problem of adding up the elements of a database column. Here is my code:
import sqlite3
con = sqlite3.connect("values.db")
cur = con.cursor()
cur.execute('SELECT objects FROM data WHERE firm = "sony"')
As you can see, I connect to the SQLite database and tell Python to select the column "objects".
The problem is that I do not know the appropriate command for summing the selected objects.
Any ideas or advice would be highly appreciated.
Thank you in advance!

If you can, have the database do the sum, as that reduces data transfer and lets the database do what it's good at.
cur.execute("SELECT sum(objects) FROM data WHERE firm = 'sony'")
or, if you're really just looking for the total count of objects:
cur.execute("SELECT count(objects) FROM data WHERE firm = 'sony'")
Either way, your result is simply:
count = cur.fetchall()[0][0]
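A minimal, self-contained sketch of the aggregate approach, using an in-memory SQLite database. Only the table and column names come from the question; the sample rows are made up for illustration. It passes the firm as a bound parameter and uses fetchone(), since an aggregate query without GROUP BY returns exactly one row.

```python
import sqlite3

# In-memory database with hypothetical sample rows, for illustration only.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE data (firm TEXT, objects INTEGER)")
cur.executemany(
    "INSERT INTO data (firm, objects) VALUES (?, ?)",
    [("sony", 3), ("sony", 5), ("acme", 7)],
)

# Let the database add the values; the firm is passed as a bound parameter.
cur.execute("SELECT sum(objects) FROM data WHERE firm = ?", ("sony",))
total = cur.fetchone()[0]

# sum() yields NULL (None in Python) when no rows match, so fall back to 0.
cur.execute("SELECT sum(objects) FROM data WHERE firm = ?", ("nobody",))
missing_total = cur.fetchone()[0] or 0

print(total, missing_total)
con.close()
```

Note that sum() returns NULL when no row matches, which is why the second query falls back to 0.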

Try the following line:
print sum([ row[0] for row in cur.fetchall()])
If you want the items themselves instead of adding them together:
print ([ row[0] for row in cur.fetchall()])

Related

How to escape a #/# (for example 6/8) in the name of a table from a database

I am currently trying to get a list of values from a table inside an SQL database. The problem is appending the values, because of the table's name, which I can't change. The table's name is something like Value123/123.
I tried making a variable with the name like
x = 'Value123/123'
then doing
row.append(x)
but that just prints Value123/123 and not the values from the database
cursor = conn.cursor()
cursor.execute("select Test, Value123/123 from db")
Test = []
Value = []
Compiled_Dict = {}
for row in cursor:
    Test.append(row.Test)
    Value.append(row.Value123/123)
Compiled_Dict = {'Date&Time': Test}
Compiled_Dict['Value'] = Value
conn.close()
df = pd.DataFrame(Compiled_Dict)
The problem occurs in this line
Value.append(row.Value123/123)
When I run it I get an error saying that the database doesn't have a table named 'Value123'. I think it's treating the slash as a division by 123. Unfortunately the table in the database is named like this and I cannot change it, so how do I pull the values from this table?
Edit:
cursor.execute("select Test, Value123/123 as newValue from db")
I tried this, as suggested by Yu Jiaao, and it worked. Thanks for the solutions.
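Whether the alias alone is enough may depend on the database and driver. If the name really contains a slash, standard SQL lets you quote the identifier with double quotes (SQL Server also accepts square brackets, and MySQL uses backticks). Below is a small sketch of that idea using sqlite3 with an in-memory table; the table contents and the alias new_value are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
# Hypothetical table mirroring the question; the quoted name contains a slash.
cursor.execute('CREATE TABLE db (Test TEXT, "Value123/123" REAL)')
cursor.executemany(
    "INSERT INTO db VALUES (?, ?)",
    [("2020-01-01 00:00", 1.5), ("2020-01-01 00:01", 2.5)],
)

# Quote the identifier so the slash is not parsed as division,
# and alias it to a name that is easy to use from Python.
cursor.execute('SELECT Test, "Value123/123" AS new_value FROM db')
rows = cursor.fetchall()

compiled = {
    "Date&Time": [row[0] for row in rows],
    "Value": [row[1] for row in rows],
}
print(compiled)
conn.close()
```

The resulting dictionary can then be passed to pd.DataFrame as in the question.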

How to store and query hex values in mysqldb

I want to use a thermal printer with raspberry pi. I want to receive the printer vendor id and product id from mysql database. My columns are of type varchar.
My code is
import MySQLdb
from escpos.printer import Usb
db= MySQLdb.connect(host=HOST, port=PORT,user=USER, passwd=PASSWORD, db=database)
cursor = db.cursor()
sql = ("select * from printerdetails")
cursor.execute(sql)
result = cursor.fetchall()
db.close()
for row in result:
    printer_vendor_id = row[2]
    printer_product_id = row[3]
    input_end_point = row[4]
    output_end_point = row[5]
print printer_vendor_id,printer_product_id,input_end_point,output_end_point
Printer = Usb(printer_vendor_id,printer_product_id,0,input_end_point,output_end_point)
Printer.text("Hello World")
Printer.cut()
But it does not work, because the IDs are strings. The print statement shows 0x154f 0x0517 0x82 0x02. In my case,
Printer = Usb(0x154f,0x0517,0,0x82,0x02)
works fine. How can I store the same IDs in the database and use them to configure the printer?
Your problem is that your call to Usb expects integers, which works if you call it like this:
Printer = Usb(0x154f,0x0517,0,0x82,0x02)
but your database call is returning tuples of hexadecimal values stored as strings. So you need to convert those strings to integers, like this:
for row in result:
    printer_vendor_id = int(row[2],16)
    printer_product_id = int(row[3],16)
    input_end_point = int(row[4],16)
    output_end_point = int(row[5],16)
Now if you do
print printer_vendor_id,printer_product_id,input_end_point,output_end_point
you will get
5455 1303 130 2
which might look wrong, but isn't, which you can check by asking for the integers to be shown in hex format:
print ','.join('0x{0:04x}'.format(i) for i in (printer_vendor_id,printer_product_id,input_end_point,output_end_point))
0x154f,0x0517,0x0082,0x0002
I should point out that this only works because your database table contains only one row. for row in result loops through all of the rows in your table, but there happens to be only one, which is okay. If there were more, your code would always get the last row of the table, because it doesn't check the identifier of the row and so will repeatedly assign values to the same variables until it runs out of data.
The way to fix that is to put a where clause in your SQL select statement. Something like
"select * from printerdetails where id = '{0}'".format(printer_id)
Now, because I don't know what your database table looks like, the column name id is almost certainly wrong. And very likely the datatype also: it might very well not be a string.
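Here is a rough sketch that combines both points: selecting one row by id and converting the hex strings. It uses sqlite3 as a stand-in for MySQLdb so that it runs on its own, and the table layout, the column names (id, vendor_id, product_id, in_ep, out_ep) and the second sample row are all hypothetical. It binds the id as a query parameter instead of formatting it into the SQL string; with MySQLdb the placeholder would be %s rather than ?.

```python
import sqlite3

# sqlite3 stands in for MySQLdb here; table layout and column names are hypothetical.
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute(
    "CREATE TABLE printerdetails ("
    "id INTEGER PRIMARY KEY, vendor_id TEXT, product_id TEXT, "
    "in_ep TEXT, out_ep TEXT)"
)
cursor.executemany(
    "INSERT INTO printerdetails VALUES (?, ?, ?, ?, ?)",
    [
        (1, "0x154f", "0x0517", "0x82", "0x02"),
        (2, "0x04b8", "0x0202", "0x81", "0x01"),
    ],
)

printer_id = 1
# Bind the id as a parameter instead of formatting it into the SQL string.
cursor.execute(
    "SELECT vendor_id, product_id, in_ep, out_ep "
    "FROM printerdetails WHERE id = ?",
    (printer_id,),
)
row = cursor.fetchone()
db.close()

# int(text, 16) accepts an optional "0x" prefix.
vendor_id, product_id, in_ep, out_ep = (int(text, 16) for text in row)
print(hex(vendor_id), hex(product_id), hex(in_ep), hex(out_ep))
```

The four integers could then be passed to Usb(vendor_id, product_id, 0, in_ep, out_ep) as in the question.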

Python foreach not looping properly

I'm writing a script that formats a bunch of csv files into one csv file.
To do this, I'm using a couple of cursor tables in python via sqlite.
Here is my code - currently I'm just trying to get every row in gsap that is associated with a code that is in gsap_locs to print
data = c.execute("SELECT * from gsap_locs")
for row in data:
    print row[0]
    d2 = c.execute("select date, cardtype, volume, transactions from gsap where gsaploc=?", (row[0],))
    for r2 in d2:
        print r2
However, my code is only returning one row. I know that the problem isn't in the first for because when I take out everything after print row[0] it prints out all of the values from the first select.
Why is it escaping out of my first for after my second for runs without satisfying the conditions of the first for?
You are missing the fetchall or fetchone calls.
It's a common mistake: we assume that execute has already retrieved the data, but you still need to fetch it.
To retrieve data after executing a SELECT statement, you can either treat the cursor as an iterator, call the cursor’s fetchone() method to retrieve a single matching row, or call fetchall() to get a list of the matching rows.
import sqlite3
conn = sqlite3.connect('gasp.sqlite')
c = conn.cursor()
c.execute("SELECT * FROM gsap_locs")
rows = c.fetchall()
for row in rows:
    print row[0]
    c.execute("select * from gsap where loc=?", (row[0],))
    d2 = c.fetchall()
    for r2 in d2:
        print r2
conn.close()
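It looks like a cursor can only track one result set at a time: calling execute again on the same cursor discards whatever rows of the first query have not been read yet, so the outer loop ends early. You might want to keep the results of the first query in memory by calling tuple on it: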
data = tuple(c.execute("SELECT * from gsap_locs"))
for row in data:
    ...
Be sure to have enough memory to hold all the results from the first query.
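If holding the first result in memory is a concern, another option is to use two cursors on the same connection, one per query, so that neither execute call disturbs the other's iteration. The following is a small sketch with an in-memory database; the column name code and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gsap_locs (code TEXT)")
conn.execute("CREATE TABLE gsap (gsaploc TEXT, volume INTEGER)")
conn.executemany("INSERT INTO gsap_locs VALUES (?)", [("A",), ("B",)])
conn.executemany(
    "INSERT INTO gsap VALUES (?, ?)", [("A", 10), ("A", 20), ("B", 30)]
)

outer = conn.cursor()  # iterates over the locations
inner = conn.cursor()  # runs the per-location query

seen = []
for (code,) in outer.execute("SELECT code FROM gsap_locs ORDER BY code"):
    for loc, volume in inner.execute(
        "SELECT gsaploc, volume FROM gsap WHERE gsaploc = ? ORDER BY volume",
        (code,),
    ):
        seen.append((loc, volume))

print(seen)
conn.close()
```

A single JOIN between the two tables would also avoid the nested loop altogether.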

cleaning a Postgres table of bad rows

I have inherited a Postgres database, and am currently in the process of cleaning it. I have created an algorithm to find the rows where the data is bad. The algorithm is encoded in a function called checkProblem(). Using this, I am able to find the rows that contain bad data, as shown below ...
schema = findTables(dbName)
conn = psycopg2.connect("dbname='%s' user='postgres' host='localhost'"%dbName)
cur = conn.cursor()
results = []
for t in tqdm(sorted(schema.keys())):
    n = 0
    cur.execute('select * from %s'%t)
    for i, cs in enumerate(tqdm(cur)):
        if checkProblem(cs):
            n += 1
    results.append({
        'tableName': t,
        'totalRows': i+1,
        'badRows' : n,
    })
cur.close()
conn.close()
print pd.DataFrame(results)[['tableName', 'badRows', 'totalRows']]
Now, I need to delete the rows that are bad. I can think of two ways of doing it. First, I can write the clean rows to a temporary table, and then rename the table. I think that this option is too memory-intensive. It would be much better if I could just delete the specific record at the cursor. Is this even an option?
Otherwise, what is the best way of deleting a record under such circumstances? I am guessing that this should be a relatively common thing that database administrators do ...
Deleting the specific record at the cursor is certainly the better option. You can do something like:
for i, cs in enumerate(tqdm(cur)):
    if checkProblem(cs):
        # assuming cs is a tuple with cs[0] being the record id
        cur.execute('delete from %s where id=%d'%(t, cs[0]))
Or you can store the ids of the bad records and then do something like
DELETE FROM table WHERE id IN (id1,id2,id3,id4)
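One thing to watch out for: running the DELETE on the same cursor that is still being iterated is likely to disturb the iteration, because execute replaces that cursor's result set. Collecting the ids first avoids this. Here is a sketch of that second approach, using sqlite3 as a stand-in so that it runs without a Postgres server. The table readings, its columns and check_problem are hypothetical; with psycopg2 the placeholder would be %s instead of ?.

```python
import sqlite3

def check_problem(row):
    # Hypothetical stand-in for the question's checkProblem(): flag negative values.
    return row[1] < 0

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
cur.executemany(
    "INSERT INTO readings (id, value) VALUES (?, ?)",
    [(1, 10), (2, -1), (3, 7), (4, -5)],
)

# First pass: collect the ids of the bad rows without modifying the table.
cur.execute("SELECT id, value FROM readings")
bad_ids = [(row[0],) for row in cur if check_problem(row)]

# Second pass: delete them, binding each id as a parameter.
cur.executemany("DELETE FROM readings WHERE id = ?", bad_ids)
conn.commit()

remaining = [row[0] for row in cur.execute("SELECT id FROM readings ORDER BY id")]
print(remaining)
conn.close()
```

Only the ids are held in memory, not the rows themselves, and the deletes take effect once the transaction is committed.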

Variable in list name

I have this code :
cur.execute("SELECT * FROM foo WHERE date=?",(date,))
for row in cur:
    list_foo.append(row[2])
cur.execute("SELECT * FROM bar WHERE date=?",(date,))
for row in cur:
    list_bar.append(row[2])
It works fine, but I'd like to automate this. I have made a list of the tables in my sqlite database, and I'd like something like this:
table_list = ['foo','bar']
for t in table_list:
    cur.execute("SELECT * FROM "+t+" WHERE date=?",(date,))
    for row in cur:
        # and here I'd like to append to the list whose name depends on t (list_foo, then list_bar, etc.)
But I don't know how to do that. Any ideas?
Use a dictionary to collect your data. Don't try to set new local names for each list.
You could use string templating too, and a list comprehension to turn your result rows into lists:
data = {}
for t in table_list:
    cur.execute("SELECT * FROM {} WHERE date=?".format(t), (date,))
    data[t] = [row[2] for row in cur]
One caveat: only do this with a pre-defined list of table names; don't ever interpolate untrusted input like that without hefty escaping to prevent SQL injection attacks.
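A runnable sketch of the dictionary approach, with the caveat made explicit as a whitelist check. The schema (id, date, value), the sample rows and the name ALLOWED_TABLES are assumptions for illustration; the question only shows that the third column (index 2) is the one being collected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema: the question only shows that column index 2 is wanted.
for name in ("foo", "bar"):
    cur.execute(
        "CREATE TABLE {} (id INTEGER, date TEXT, value INTEGER)".format(name)
    )
cur.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, "2020-01-01", 10), (2, "2020-01-02", 20)],
)
cur.executemany("INSERT INTO bar VALUES (?, ?, ?)", [(1, "2020-01-01", 30)])

ALLOWED_TABLES = {"foo", "bar"}  # fixed whitelist; never built from user input
table_list = ["foo", "bar"]
date = "2020-01-01"

data = {}
for t in table_list:
    # Table names cannot be bound as parameters, so check them explicitly.
    if t not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: {}".format(t))
    cur.execute("SELECT * FROM {} WHERE date=?".format(t), (date,))
    data[t] = [row[2] for row in cur]

print(data)
conn.close()
```

The values for each table are then available as data['foo'], data['bar'], and so on, instead of separate list_foo and list_bar variables.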
