Python-Sqlite3: trying to export strings. Problem: contain "\N\n" - python

Hello and thank you for taking the time to read this. I want to extract large amounts of text from one database to another. The problem is that when I read the text from the database it contains a lot of "\N" and "\n". I would really appreciate it if someone could point me in the right direction or tell me what exactly is going on here and why fetchall is behaving like this...
Thank you in advance!
Example of the text in the TABLE:ourdata (I want to read from):
Example of text in the TABLE product_d I am writing to:
This is the code I am using:
import sqlite3
database_path = "E:/Thesis stuff/Python/database/"
conn = sqlite3.connect(database_path + 'test2.db')
c = conn.cursor()
current_number = 0
# creates the table to which the descriptions go
c.execute("CREATE TABLE IF NOT EXISTS product_d (product_description TEXT)")
conn.commit()
c.execute("SELECT description FROM ourdata")
text_to_format = c.fetchall()
print(text_to_format[0])
text_list = []
# make a list of the descriptions
for text in text_to_format:
    text = str(text)
    text_list.append(text)
# put all the elements of the list into the new table
for item in text_list:
    c.execute("INSERT INTO product_d (product_description) VALUES (?)", (text_list[current_number],))
    print("at number " + str(current_number))
    current_number += 1
conn.commit()
c.close()
conn.close()

The giveaway is the parentheses around the string in the second example. fetchall() returns a list of tuples, and in your "make a list of the descriptions" block, you're explicitly converting those tuples to strings. Instead, what you want to do is simply grab the first (and only, in this case) element of the tuple. It should be as simple as changing this line:
text = str(text)
to this:
text = text[0]
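To see why the extra characters appear, here is a minimal sketch using an in-memory database. The sample text and the in-memory connection are assumptions for illustration, not the asker's data. It shows that str() on a row produces the tuple's repr, with parentheses and an escaped newline, while indexing the row returns the original string:

```python
import sqlite3

# In-memory stand-in for test2.db, with one made-up description.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE ourdata (description TEXT)")
c.execute("CREATE TABLE product_d (product_description TEXT)")
c.execute("INSERT INTO ourdata VALUES (?)", ("first line\nsecond line",))

c.execute("SELECT description FROM ourdata")
rows = c.fetchall()      # a list of 1-tuples, one tuple per row

wrong = str(rows[0])     # the tuple's repr: parentheses, quotes, escaped newline
right = rows[0][0]       # the text exactly as it was stored

# Each row is already a parameter tuple, so it can be inserted as-is.
c.executemany("INSERT INTO product_d (product_description) VALUES (?)", rows)
conn.commit()
copied = [r[0] for r in c.execute("SELECT product_description FROM product_d")]
```

Because fetchall() already returns parameter tuples, executemany can consume the result directly, which also removes the need for the manual counter.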

Related

How to execute sqlite3 insert statement with a tuple, receiving TypeError: tuple indices must be integers or slices, not tuple

So, I have data pulled from a CSV that is not in the right order for the execute Insert Statement and will need to be inserted into two different tables.
I'm trying to insert into the database using a Tuple to access each specific piece accurately.
Such as FirstName comes from row[2] etc. I'm a total newb when it comes to this and have spent days researching on this site and many others... What is the right way to do a For Loop from CSV data, inserting it into a Database using SQLITE3 with a tuple?
Currently, I get the error:
line 49, in
values(?,?,?,?,?,?,?,?)''',r[0],r[2],r[1],r[3],r[4],r[7],r[8,],r[9])
TypeError: tuple indices must be integers or slices, not tuple
Here is my current code:
#!/usr/bin/python
# import modules
import sqlite3
import csv
import sys
# Get input and output names
#inFile = sys.argv[1]
#outFile = sys.argv[2]
inFile = 'mod5csv.csv'
outFile = 'test.db'
# Create DB
newdb = sqlite3.connect(str(outFile))
# If tables exist already then drop them
curs = newdb.cursor()
curs.execute('''drop table if exists courses''')
curs.execute('''drop table if exists people''')
#Create table
curs.execute('''create table people
(id text, lastname text, firstname text, email text, major text, city text, state text, zip text)''')
#curs.execute('''create table courses
# (id text, subjcode txt, coursenumber text, termcode text)''')
# Try to read in CSV
try:
    reader = csv.reader(open(inFile, 'r'), delimiter = ',', quotechar='"')
except:
    print("Sorry " + str(inFile) + " is not a valid CSV file.")
    exit(1)
counter = 0
for row in reader:
    counter += 1
    if counter == 1:
        continue
    r = (row[0],row[1],row[2],row[3],row[4],row[5],row[6],row[7],row[8],row[9])
    if counter == 5:
        print(row[1])
        print(counter)
    #print(row[counter][2])
    curs.execute('''insert into people (id,firstname,lastname,email,major,city,state,zip)
    values(?,?,?,?,?,?,?,?)''',r[0],r[2],r[1],r[3],r[4],r[7],r[8,],r[9])
Notice r[8,]. It's easy to mistype a comma, which turns an integer index into a 1-tuple holding that integer. When the error is "tuple indices must be integers or slices, not tuple" and you are sure you typed an integer literal, that's usually the problem.
Adding whitespace to your code can help spot the problem. Note also that execute() takes the values as a single sequence (its second argument), not as separate arguments, so they are wrapped in a tuple here:
curs.execute('''insert into people
    (id,firstname,lastname,email,major,city,state,zip)
    values(?,?,?,?,?,?,?,?)''',
    (r[0], r[2], r[1], r[3], r[4], r[7], r[8], r[9]))
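A small self-contained sketch of both points, using an in-memory database and a made-up CSV row (the row contents are assumptions; only the column positions follow the question). It reproduces the error from the stray comma and then performs the insert with the values passed as one tuple:

```python
import sqlite3

# Hypothetical CSV row: last name at index 1, first name at index 2.
row = ["1", "Lovelace", "Ada", "ada@example.com", "Math",
       "unused5", "unused6", "London", "UK", "00000"]
r = tuple(row)

# r[8,] means r[(8,)]: the index is a 1-tuple, not an integer.
try:
    r[8,]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute('''create table people
    (id text, lastname text, firstname text, email text,
     major text, city text, state text, zip text)''')

# The values go in as ONE sequence, the second argument of execute().
curs.execute('''insert into people
    (id,firstname,lastname,email,major,city,state,zip)
    values(?,?,?,?,?,?,?,?)''',
    (r[0], r[2], r[1], r[3], r[4], r[7], r[8], r[9]))
conn.commit()

stored = curs.execute(
    "select id, firstname, lastname, state from people").fetchone()
```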

Using python psycopg2 to get a count on all variables in the database that contain a specific string

I have a list, words = [word1, word2, word3, ...]
I want to use sql to return the number of times each word appears in Column A of an sql file. I can't figure out how to pass a variable into my sql query. Any help would be appreciated! My code so far looks like:
import psycopg2 as sql
for word in words:
    conn = sql.connect(**params)
    c = conn.cursor()
    #Create query and parameters to get usernames and ids
    Query = """ SELECT COUNT(Column A) FROM file
    WHERE Column A SIMILAR TO '% **VARIABLE WORD** %'
    LIMIT 1000; """
    try:
        c.execute(Query)
    except:
        conn.commit()
        print("Error in Query")
    Result = c.fetchall()
Also, will this count return the total number of times the word appears or just the number of lines of column A in which it appears? (Will the count of the in "the team won the game" return one or two?)
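The replaceable parameter flag used by psycopg2 is "%s", and to use a plain "%" in a query with replaceable parameters you need to double it (i.e., "%%"). The placeholder itself must stay outside the quotes, because psycopg2 quotes the value when it substitutes it, and the parameters must be passed as a sequence, here the 1-tuple (word,). So your code should look like:
Query = """SELECT COUNT(Column_A) FROM file
WHERE Column_A SIMILAR TO '%%' || %s || '%%'
LIMIT 1000;"""
try:
    c.execute(Query, (word,))
except sql.Error:
    conn.rollback()
    print("Error in Query")
This should return the number of rows in which the word appears, not the total number of occurrences of the word in all rows.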
Your example has a space in the column name used; I've substituted an underscore, but if the column name really contains a space, the name should be double-quoted in this query.
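The difference between counting rows and counting occurrences can be checked with a small sketch. It uses an in-memory SQLite table as a stand-in for the PostgreSQL database, an assumption made only so the example runs without a server; sqlite3 uses "?" where psycopg2 uses "%s", and LIKE stands in for SIMILAR TO. The table contents are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (column_a TEXT)")
conn.executemany(
    "INSERT INTO file VALUES (?)",
    [("the team won the game",), ("no match here",), ("the end",)],
)

word = "the"

# COUNT with a pattern match counts matching ROWS, once per row.
rows_matching = conn.execute(
    "SELECT COUNT(column_a) FROM file WHERE column_a LIKE ?",
    ("%" + word + "%",),
).fetchone()[0]

# Total occurrences have to be counted inside each row's text.
total_occurrences = sum(
    text.count(word) for (text,) in conn.execute("SELECT column_a FROM file")
)
```

Here "the team won the game" contributes one to the row count but two to the occurrence count. Note that this substring match would also count "the" inside longer words such as "other".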

How to fix Arabic unicode in a list

I made a database containing Arabic words and when I fetch the data and print it it's OK and works well and prints:
مشاعر‬
مودة
But when I loop over that database, put the words into a list, and then print the list to see what's happening, I get this:
['\u202b\u202bمشاعر\u202c', '\u202b\u202bالمودة\u202c']
Here is the code:
cors.execute("SELECT * FROM DictContents") # Selecting from database
self.AraList = [] # empty list to put arabic words in
for raw in cors.fetchall(): # fetching data from database
    rawAra = raw[1] # the database includes more than that, so this index refers to the Arabic column
    print(rawAra) # here is the first print; it works fine, as I said
    self.AraList.append(rawAra)
print(self.AraList) # here is the other print, of the whole list
I tried more than one way to fix it before I ask but none of them worked for me.
Found ...
import re
cors.execute("SELECT * FROM DictContents")
self.AraList = []
cleanit = re.compile(r'\w+.*') # raw string, so \w reaches the regex engine unchanged
for raw in cors.fetchall():
    rawAra = raw[1]
    cleanone = cleanit.search(rawAra) # skips the leading non-word characters
    if cleanone:
        print(cleanone.group()) # prints the clean strings : مشاعر‬ مودة
        self.AraList.append(cleanone.group()) # add the string to the list to see how it looks
print(self.AraList) # prints a much cleaner list than the first one
['مشاعر\u202c - ', 'المودة\u202c']
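The escapes show up only in the list output because printing a list shows the repr of each element, and repr escapes invisible characters. \u202b and \u202c are Unicode directional formatting characters (general category "Cf"), which is also why the regex above still leaves a trailing \u202c. A possible alternative, sketched here with a hypothetical helper name, is to drop every "Cf" character:

```python
import unicodedata

def strip_format_chars(text):
    """Return text without Unicode format characters (general category Cf)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# The two values from the question, as they came out of the database.
raw_words = ["\u202b\u202bمشاعر\u202c", "\u202b\u202bالمودة\u202c"]
clean_words = [strip_format_chars(word) for word in raw_words]
print(clean_words)
```

This is a sketch under the assumption that no format characters in the data are meaningful. Category "Cf" also includes the zero-width joiner and non-joiner, so text that relies on those would need a narrower filter, for example removing only the specific characters found in the data.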

Give a list of strings to a sql-query [Python]

I use Python 2.7 on Windows 7 and a mysql server, connected through pymssql.
My problem: I have a very big database, and I would like to select the IDs of objects that match any one of several words (strings) from a list I give to my program.
The query also needs a LIKE %...% expression for each word in my list.
So far I have connected my Python script to my database and defined a cursor.
Then I made a small list of the words I am searching for and created some placeholders for my query:
wortliste = ['Bruch', 'Verwerfung']
placeholders = ','.join(['%s'] * len(wortliste))
Here is my Query:
query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE BO_BEMERKUNG IN ({})""".format(placeholders)
When I am searching for a single word, here for example for the word 'Bruch', my query would look like this:
query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE BO_BEMERKUNG LIKE '%Bruch%'"""
This query for a single word matches the right Id's (=BO_INDEX).
The query with the placeholders doesn't crash, but it doesn't match anything :(
What I would like is to loop over my database for several words and collect the matching IDs for every word (string) in my list (=wortliste) in a new list.
I really don't know how to solve this problem!
I am grateful for any new way to solve this challenge!
Thanks!
EDIT 2:
If you want to loop over your list and append to the output (using your example):
words = ['ab', 'cd', 'ef']
abfrage_list = []
for w in words:
    # Generate a query
    query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
    WHERE BO_BEMERKUNG LIKE '%%%s%%' """ % w
    # Execute that query and get results
    cur.execute(query)
    result_all = cur.fetchall()
    # Add those results to your final list
    for i in result_all:
        abfrage_list.append(i)
EDIT:
For your example with multiple likes:
query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE BO_BEMERKUNG LIKE '%ab%'
OR BO_BEMERKUNG LIKE '%cd%'
OR BO_BEMERKUNG LIKE '%ef%' """
query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE {params}""".format(
params=" OR ".join("BO_BEMERKUNG LIKE '%%%s%%' \n" % w for w in wortliste)
)
print(query)
Prints:
SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE BO_BEMERKUNG LIKE '%Bruch%'
OR BO_BEMERKUNG LIKE '%Verwerfung%'
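Pasting the words straight into the SQL text breaks as soon as a word contains a quote, and it is open to SQL injection. A possible variant, sketched below, builds one "LIKE %s" clause per word and hands the patterns to the driver as parameters. The helper name build_like_query is hypothetical, and the sketch assumes the driver accepts %s placeholders with a tuple of parameters, as pymssql does:

```python
def build_like_query(words, column="BO_BEMERKUNG",
                     table="GeoTest.dbo.Tabelle_Bohrung"):
    """Return (query, params) matching rows whose column contains any word."""
    if not words:
        raise ValueError("words must not be empty")
    # One placeholder per word; the word itself never enters the SQL text.
    where = " OR ".join("{} LIKE %s".format(column) for _ in words)
    query = "SELECT BO_INDEX FROM {} WHERE {}".format(table, where)
    # The wildcards belong to the parameter values, not to the query string.
    params = tuple("%{}%".format(w) for w in words)
    return query, params

wortliste = ['Bruch', 'Verwerfung']
query, params = build_like_query(wortliste)
print(query)
print(params)
# With an open cursor this would then be run as: cur.execute(query, params)
```

This scales to hundreds of words without a Python-side loop over queries, although very long lists may run into the server's limit on the number of parameters per statement.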
Your placeholders don't contain any of the items from your word list. Use:
placeholders = ','.join("'%s'" % w for w in wortliste)
For example:
wortliste = ['Bruch', 'Verwerfung']
print(','.join(['%s'] * len(wortliste)))
print(','.join("'%s'" % w for w in wortliste))
Prints:
%s,%s
'Bruch','Verwerfung'
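Even with the words filled in, IN only matches rows whose remark is exactly equal to one of the words, which is why it finds nothing when the words occur inside longer remarks. A small sketch of the difference, using an in-memory SQLite table as a stand-in for the real server (an assumption so the example is self-contained; sqlite3 uses "?" placeholders) and made-up remarks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tabelle_Bohrung (BO_INDEX INTEGER, BO_BEMERKUNG TEXT)")
conn.executemany(
    "INSERT INTO Tabelle_Bohrung VALUES (?, ?)",
    [(1, "kleiner Bruch im Kern"), (2, "Verwerfung"), (3, "keine Auffaelligkeit")],
)
wortliste = ["Bruch", "Verwerfung"]

# IN compares the whole remark with each word (exact equality).
in_sql = ("SELECT BO_INDEX FROM Tabelle_Bohrung WHERE BO_BEMERKUNG IN ({}) "
          "ORDER BY BO_INDEX").format(",".join(["?"] * len(wortliste)))
in_ids = [row[0] for row in conn.execute(in_sql, wortliste)]

# LIKE with wildcards matches the word anywhere inside the remark.
like_sql = ("SELECT BO_INDEX FROM Tabelle_Bohrung WHERE {} "
            "ORDER BY BO_INDEX").format(
                " OR ".join(["BO_BEMERKUNG LIKE ?"] * len(wortliste)))
like_ids = [row[0] for row in conn.execute(
    like_sql, ["%" + w + "%" for w in wortliste])]
```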
For the example of the following list = ['ab','cd','ef']
query = """ SELECT BO_INDEX FROM GeoTest.dbo.Tabelle_Bohrung
WHERE BO_BEMERKUNG LIKE '%ab%'
OR BO_BEMERKUNG LIKE '%cd%'
OR BO_BEMERKUNG LIKE '%ef%' """
cur.execute(query)
result_all = cur.fetchall()
abfrage_list = []
for i in result_all:
    abfrage_list.append(i)
But I need this procedure for possibly hundreds of strings in this list.
I need to loop over this list and I need the LIKE expression in the query, otherwise it won't catch anything.

Read txt file and add a variable to the text - Python

In order to simplify some of my code I have decided to move queries and HTML code to txt files. However, an issue has come up: most of my queries and HTML that I normally keep inside the code have variable in the middle. For example, I have this in my code:
count = 0
for x in reviewers:
    query = """select *
    from mytable
    where reviewer = """ + reviewers[count]
    cur.execute(query)
    count = count + 1
    #do more stuff
The question is, how do I save queries or HTML code in txt files and then add variables in the middle of the strings?
Thanks!!
OK, here is the solution I came up with; I hope it helps.
You can save the queries in text files in the form
SELECT * from %s where id = %d
Once you have read the query, you can substitute your variables into it. I am assuming here that the query has already been read from the file.
query = "SELECT * from %s where id = %d"
completeQuery = query % ('myTable', 21)
print(completeQuery)
The output will be
SELECT * from myTable where id = 21
Reference
I'm still not sure what you want, but here's a way to read a file and append a variable to the text:
with open("query_file", 'r') as f:
    base_query = f.read() # read the query file into a string
for reviewer in reviewers:
    # build a fresh query each time instead of appending to the previous one
    query = base_query + reviewer # assumes reviewer is a string
    cur.execute(query)
    #do more stuff
EDIT
An important point: strings in Python are immutable. If you want to modify a string, you have to create a new string.
For example, given
query = "Select from Table"
suppose you want to make it "Select Col from Table". Here is what you do:
add_me = "Col"
new_string = query[:-10] + add_me + query[6:]
Now new_string holds "Select Col from Table".
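For values such as the reviewer name, another option is to keep a driver placeholder in the stored query and pass the value when executing, so no string surgery is needed. The sketch below is one way this might look. It assumes SQLite as a stand-in database (its placeholder is "?", other drivers use "%s"), a temporary file in place of the real query file, and made-up table contents. For HTML snippets it uses string.Template from the standard library:

```python
import os
import sqlite3
import tempfile
from string import Template

# Stand-in for a query kept in a text file; "?" marks where the value goes.
with tempfile.NamedTemporaryFile("w", suffix=".sql", delete=False) as f:
    f.write("SELECT review FROM mytable WHERE reviewer = ?")
    query_path = f.name

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mytable (reviewer TEXT, review TEXT)")
cur.executemany("INSERT INTO mytable VALUES (?, ?)",
                [("alice", "good"), ("bob", "bad")])

# Read the query text once, then reuse it for every reviewer.
with open(query_path) as f:
    query = f.read()
os.remove(query_path)

reviewers = ["alice", "bob"]
results = {}
for reviewer in reviewers:
    cur.execute(query, (reviewer,))   # the driver fills in the value
    results[reviewer] = cur.fetchall()

# HTML kept in a file can use $name placeholders in the same spirit.
html = Template("<p>Reviews by $reviewer</p>").substitute(reviewer="alice")
```

Placeholders only work for values. A table or column name that varies still has to be inserted into the query text itself, for example with the % formatting shown above, and should come from a fixed list rather than from user input.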
