How to add headers when selecting items from an SQL table? - python

My SQL query simply selects all of the items in the table and prints each entry one under the other. I am wondering if there is a way to show headers above the items so the user can tell what each item in the table means. Here is my current query, if needed:
c.execute("SELECT * FROM outflows1")
data = c.fetchall()
print(data)
print("Currently in database:")
for row in data:
print(row)
conn.commit()
This query outputs all of the items in my database, but I am wondering if there is a way to put headers above the output that label what each of the items means. Thanks.
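One way to do this (a sketch, not from the original thread): sqlite3 exposes the column names of the most recent SELECT through cursor.description, so you can print a header line before the rows.
c.execute("SELECT * FROM outflows1")
data = c.fetchall()

# each entry in c.description is a 7-tuple whose first item is the column name
headers = [col[0] for col in c.description]
print(" | ".join(headers))

print("Currently in database:")
for row in data:
    print(" | ".join(str(item) for item in row))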

Related

Why is my second database column being populated with 'none'?

I am trying to create a database with three columns: URL, which is the location of the data I am aiming to scrape; TICKER, which is the ticker symbol of the stock; and STATUS, which will be used to record whether the data has been acquired yet or not.
import sqlite3
conn = sqlite3.connect('tickers.db')
conn.execute('''CREATE TABLE TAB(URL, TICKER, STATUS default "Not started");''')
for i in url_list:
    conn.execute("INSERT INTO TAB(URL) VALUES(?)", (i,))
for j in ticklist:
    conn.execute("INSERT INTO TAB(TICKER) VALUES(?)", (j,))
for row in conn.execute("SELECT URL, TICKER, STATUS from TAB"):
    print('URL={i}'.format(i=row[0]))
    print('TICKER={i}'.format(i=row[1]))
    print('STATUS={i}'.format(i=row[2]))
To populate the URL column I have used a list of URLs; similarly, I am trying to do the same thing with TICKER, but when I run the code the column is only populated with None for all rows.
Output
URL=https://api.pushshift.io/reddit/search/submission/?q=$AACG&subreddit=wallstreetbets&metadata=true&size=0&after=1610928000&before=1613088000
TICKER=None
STATUS=Not started
URL=https://api.pushshift.io/reddit/search/submission/?q=$AACIU&subreddit=wallstreetbets&metadata=true&size=0&after=1610928000&before=1613088000
TICKER=None
STATUS=Not started
Instead of trying to populate the columns one at a time, why not insert both values as a single row directly? Each INSERT creates a new row, so inserting URL and TICKER separately produces two half-empty rows. Assuming url_list and ticklist are of equal length (zip simply stops at the shorter one otherwise), you can try this:
for i, j in zip(url_list, ticklist):
    conn.execute("INSERT INTO TAB(URL, TICKER) VALUES(?,?)", (i, j))
That way you are adding both values into the same row as expected, rather than creating a new partly empty row with every insert.
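Side note, not from the original answer: sqlite3's executemany accepts an iterable of parameter tuples, so the same paired insert can be done in a single call:
# executemany runs the statement once per (url, ticker) pair
conn.executemany("INSERT INTO TAB(URL, TICKER) VALUES(?,?)", zip(url_list, ticklist))
conn.commit()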

SQL: SELECT where one of many columns contains 'x' and result is not "NULL"

I have a piece of code that I realized is probably quite inefficient, though I'm not sure how to improve it.
Basically, I have a database table like this:
Example DB table
Any or several of the columns A-G might match my search query. If so, I want to retrieve VALUE from that row, but only when VALUE is not NULL; if it is NULL, the search should keep looking. For example, if my query were 'abc', I'd want to obtain 'correct'.
Below is my current code, using a database connection named db and a table named table.
cur = db.cursor()
data = "123"
fields_to_check = ["A", "B", "C", "D", "E", "F", "G"]
for field in fields_to_check:
    cur.execute("SELECT Value FROM table WHERE {}='{}'".format(field, data))
    row = cur.fetchone()
    if row and row[0] is not None:  # SQL NULL arrives in Python as None, not the string "NULL"
        break
db.close()
I think that performing seven separate queries like this is likely very inefficient.
You can combine the checks into a single query. Build the WHERE clause from the field names, and bind the value as a parameter rather than formatting it into the string:
cur = db.cursor()
data = "123"
fields_to_check = ["A", "B", "C", "D", "E", "F", "G"]
# one placeholder per field, OR-ed together ('?' assumes sqlite3; MySQL drivers use %s)
conditions = " OR ".join("{}=?".format(field) for field in fields_to_check)
query = "SELECT Value FROM table WHERE (" + conditions + ") AND Value IS NOT NULL;"
cur.execute(query, [data] * len(fields_to_check))
rows = cur.fetchall()
for row in rows:
    print(row)

Python foreach not looping properly

I'm writing a script that formats a bunch of csv files into one csv file.
To do this, I'm using a couple of cursors in Python via sqlite.
Here is my code. Currently I'm just trying to print every row in gsap that is associated with a code that is in gsap_locs:
data = c.execute("SELECT * from gsap_locs")
for row in data:
    print row[0]
    d2 = c.execute("select date, cardtype, volume, transactions from gsap where gsaploc=?", (row[0],))
    for r2 in d2:
        print r2
However, my code is only returning one row. I know that the problem isn't in the first for loop, because when I take out everything after print row[0] it prints out all of the values from the first select.
Why is it escaping out of my first for loop after my second for loop runs, without satisfying the conditions of the first?
You are missing the fetchall or fetchone instructions.
It's a common mistake: we assume that execute has done the job of getting the data, but you still have to fetch it.
To retrieve data after executing a SELECT statement, you can either treat the cursor as an iterator, call the cursor’s fetchone() method to retrieve a single matching row, or call fetchall() to get a list of the matching rows.
import sqlite3
conn = sqlite3.connect('gasp.sqlite')
c = conn.cursor()
c.execute("SELECT * FROM gsap_locs")
rows = c.fetchall()  # pull the outer result set into memory before reusing the cursor
for row in rows:
    print row[0]
    c.execute("select * from gsap where gsaploc=?", (row[0],))
    d2 = c.fetchall()
    for r2 in d2:
        print r2
conn.close()
It looks like a cursor can only track one result set at a time: calling execute again discards whatever the previous execute was still iterating over. You might want to keep the results of the first operation in memory by calling tuple on it:
data = tuple(c.execute("SELECT * from gsap_locs"))
for row in data:
    ...
Be sure to have enough memory to hold all the results from the first query.
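Another option, not from the original answers: a sqlite3 connection can hand out multiple cursors, so giving the inner query its own cursor keeps the outer iteration intact. A sketch, assuming the table and column names from the question:
import sqlite3

conn = sqlite3.connect('gasp.sqlite')
outer = conn.cursor()  # drives the outer loop
inner = conn.cursor()  # reused for each per-row lookup

for row in outer.execute("SELECT * FROM gsap_locs"):
    print(row[0])
    inner.execute("select date, cardtype, volume, transactions from gsap where gsaploc=?", (row[0],))
    for r2 in inner:
        print(r2)

conn.close()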

Put retrieved data from MySQL query into DataFrame pandas by a for loop

I have one database with two tables, both of which have a column called barcode. The aim is to retrieve barcodes from one table and search the other table for the entries where extra information about that barcode is stored. I would like both sets of retrieved data to be saved in a DataFrame. The problem is that when I insert the retrieved data from the second query into the DataFrame, it stores only the last entry:
import mysql.connector
import pandas as pd

cnx = mysql.connector.connect(user=user, password=password, host=host, database=database)
query_barcode = "SELECT barcode FROM barcode_store"
cursor = cnx.cursor()
cursor.execute(query_barcode)
data_barcode = cursor.fetchall()
Up to this point everything works smoothly. Here is the part with the problem:
query_info = ("SELECT product_code FROM product_info WHERE barcode=%s")
for each_barcode in data_barcode:
cursor.execute(query_info % each_barcode)
pro_info = pd.DataFrame(cursor.fetchall())
pro_info contains only the last matching barcode's information, while I want to retrieve the information for every data_barcode match.
That's because you are overwriting the existing pro_info with new data on each loop iteration. You should rather do something like:
query_info = ("SELECT product_code FROM product_info")
cursor.execute(query_info)
pro_info = pd.DataFrame(cursor.fetchall())
Making so many SELECTs is redundant, since you can get all the records in one SELECT and insert them into your DataFrame at once.
Edit: however, if you need the WHERE clause to fetch only specific products, you have to store the records in a list until you insert them into the DataFrame. So your code will eventually look like:
pro_list = []
query_info = "SELECT product_code FROM product_info WHERE barcode=%s"
for each_barcode in data_barcode:
    cursor.execute(query_info, each_barcode)  # each_barcode is already a tuple, so the driver can bind it safely
    pro_list.append(cursor.fetchone())
pro_info = pd.DataFrame(pro_list)
Cheers!
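If the goal is just the product codes whose barcode appears in barcode_store, the lookup can also be pushed into a single JOIN and read straight into pandas. A sketch, assuming both tables share a barcode column and reusing the cnx connection from above:
query = """
    SELECT p.product_code
    FROM product_info AS p
    INNER JOIN barcode_store AS b ON b.barcode = p.barcode
"""
pro_info = pd.read_sql(query, cnx)  # one round trip instead of one query per barcode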

MySQL - Match two tables containing huge data and find the similar data

I have two tables in my SQL database. Table 1 contains a lot of data, but Table 2 is huge.
Here's the code I implemented using Python:
import MySQLdb

db = MySQLdb.connect(host="localhost", user="root", passwd="", db="fak")
cursor = db.cursor()

# Execute SQL statement:
cursor.execute("SELECT invention_title FROM auip_wipo_sample WHERE invention_title IN (SELECT invention_title FROM us_pat_2005_to_2012)")

# Get the result set as a tuple:
result = cursor.fetchall()

# Iterate through results and print:
for record in result:
    print record
print "Finish."

# Finish dealing with the database and close it
db.commit()
db.close()
However, it takes very long. I have run the Python script for an hour, and it still hasn't given me any results. Please help me.
Do you have an index on invention_title in both tables? If not, then create one:
ALTER TABLE auip_wipo_sample ADD KEY (`invention_title`);
ALTER TABLE us_pat_2005_to_2012 ADD KEY (`invention_title`);
Then combine your query into a single one that doesn't use a subquery:
SELECT invention_title FROM auip_wipo_sample
INNER JOIN us_pat_2005_to_2012 ON auip_wipo_sample.invention_title = us_pat_2005_to_2012.invention_title
And let me know about your results.
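For reference, running the JOIN from the Python script would look roughly like this (a sketch reusing the cursor from the question):
cursor.execute("""
    SELECT auip_wipo_sample.invention_title
    FROM auip_wipo_sample
    INNER JOIN us_pat_2005_to_2012
        ON auip_wipo_sample.invention_title = us_pat_2005_to_2012.invention_title
""")
for record in cursor.fetchall():
    print record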
