SQL SELECT where cell is a certain length and includes specific characters - python

I'm trying to create a SELECT statement that selects rows where NAME is at most 5 characters long and contains a "." character.
I only want the first match, so I'm adding LIMIT 1 to the statement.
So far I have the following:
searchstring = "."
sql = "SELECT * FROM Table WHERE NAME LIKE %s LIMIT 1"
val = (("%"+searchstring+"%"),)
cursor.execute(sql, val)
But I'm not sure how to incorporate the length of NAME in my statement.
My "Table" is as follows:
ID NAME
1 Jim
2 J.
3 Jonathan
4 Jack M.
5 M.S.
So based on the table above, I would expect row 2 and 5 to be selected.
I could select all, and loop through them. But as I only want the first, I'm thinking I would prefer a SQL statement?
Thanks in advance.

You can use CHAR_LENGTH function along with LIKE:
SELECT * FROM Table WHERE name LIKE '%.%' AND CHAR_LENGTH(name) <= 5 LIMIT 1
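To try the combined condition from Python without a MySQL server, here is a minimal sketch using the built-in sqlite3 module. It rests on several assumptions: SQLite spells the function LENGTH() and uses ? as the placeholder (MySQL drivers use %s, as in the question), the table name People is made up for the demo, and the ORDER BY ID is added so that "the first row" is well defined.

```python
import sqlite3

# In-memory database holding the sample rows from the question
# ("People" is a hypothetical table name used only for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany(
    "INSERT INTO People (ID, NAME) VALUES (?, ?)",
    [(1, "Jim"), (2, "J."), (3, "Jonathan"), (4, "Jack M."), (5, "M.S.")],
)

searchstring = "."
max_length = 5

# Both the LIKE pattern and the maximum length are passed as bind parameters.
sql = (
    "SELECT ID, NAME FROM People "
    "WHERE NAME LIKE ? AND LENGTH(NAME) <= ? "
    "ORDER BY ID LIMIT 1"
)
first_match = conn.execute(sql, ("%" + searchstring + "%", max_length)).fetchone()
print(first_match)
conn.close()
```

With a MySQL driver the same idea should carry over by swapping ? for %s and LENGTH for CHAR_LENGTH.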

Try LEN()
Select LEN(result string);
This returns the length of the string, but it counts spaces as well. Try removing them with LTRIM().

Oracle SQL
SELECT * FROM Table WHERE name LIKE '%.%' AND LENGTH(name) < 6 and rownum < 2
Depending on the SQL dialect (Oracle, MySQL, SQL Server, etc.), use:
length() or char_length()
rownum or limit

Related

Get the most common word in a MySQL table using Python

I have a table full of movie genres, like this:
id | genre
---+----------------------------
1 | Drama, Romance, War
2 | Drama, Musical, Romance
3 | Adventure, Biography, Drama
I'm looking for a way to get the most common word in the whole genre column and store it in a variable for a further step in Python.
I'm new to Python, so I really don't know how to do it. Currently I have these lines to connect to the database, but I don't know how to get the most common word mentioned above.
conn = mysql.connect()
cursor = conn.cursor()
most_common_word = cursor.execute()
cursor.close()
conn.close()
First you need to get the list of words in each row, i.e. create another table like
genre_words(genre_id bigint, word varchar(50))
For hints on how to do that, you can check this question:
SQL split values to multiple rows
You can make it a temporary table if you wish, or use a transaction and roll back. Which one to choose depends on your data size and on the machine the DB is running on.
After that the query is really simple:
select count(*) as c, word from genre_words group by word order by count(*) desc limit 1;
You can also do it in Python, but then it is not a MySQL question at all. You need to read the table and build a simple mapping of word to counter: if a word is new, add it; if it already exists, increase its counter.
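The "word plus counter" idea described above can be sketched as follows. This is only an illustration: the rows list stands in for whatever the cursor would return (its dict shape is an assumption), and the whitespace stripping is needed because the sample values put a space after each comma.

```python
# Hypothetical stand-in for rows fetched from the table.
rows = [
    {"id": 1, "genre": "Drama, Romance, War"},
    {"id": 2, "genre": "Drama, Musical, Romance"},
    {"id": 3, "genre": "Adventure, Biography, Drama"},
]

counts = {}
for row in rows:
    for word in row["genre"].split(","):
        word = word.strip()  # drop the space that follows each comma
        if word in counts:
            counts[word] += 1  # word already seen: increase its counter
        else:
            counts[word] = 1   # new word: add it

# Pick the word with the highest counter.
most_common_word = max(counts, key=counts.get)
print(most_common_word, counts[most_common_word])
```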
from collections import Counter
# Connect to database and get rows from table
rows = ...
# Create a list to hold all of the genres
genres = []
# Loop through each row and split the genre string by the comma character
# to create a list of individual genres, stripping the space after each comma
# (row['genre'] assumes the cursor returns dict-like rows)
for row in rows:
    genre_list = [genre.strip() for genre in row['genre'].split(',')]
    genres.extend(genre_list)
# Use a Counter to count the number of occurrences of each genre
genre_counts = Counter(genres)
# Get the most common genre as a list holding one (genre, count) tuple
most_common_genre = genre_counts.most_common(1)
# Print the most common genre
print(most_common_genre)

psycopg2 Syntax errors at or near "' '"

I have a dataframe named Data2 and I wish to put values of it inside a postgresql table. For reasons, I cannot use to_sql as some of the values in Data2 are numpy arrays.
This is Data2's schema:
cursor.execute(
"""
DROP TABLE IF EXISTS Data2;
CREATE TABLE Data2 (
time timestamp without time zone,
u bytea,
v bytea,
w bytea,
spd bytea,
dir bytea,
temp bytea
);
"""
)
My code segment:
for col in Data2_mcw.columns:
    for row in Data2_mcw.index:
        value = Data2_mcw[col].loc[row]
        if type(value).__module__ == np.__name__:
            value = pickle.dumps(value)
        cursor.execute(
            """
            INSERT INTO Data2_mcw(%s)
            VALUES (%s)
            """
            ,
            (col.replace('\"',''),value)
        )
Error generated:
psycopg2.errors.SyntaxError: syntax error at or near "'time'"
LINE 2: INSERT INTO Data2_mcw('time')
How do I rectify this error?
Any help would be much appreciated!
There are two problems I see with this code.
The first problem is that you cannot use bind parameters for column names, only for values. The first of the two %s placeholders in your SQL string is invalid. You will have to use string concatenation to set column names, something like the following (assuming you are using Python 3.6+):
cursor.execute(
f"""
INSERT INTO Data2_mcw({col})
VALUES (%s)
""",
(value,))
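Because the column name is now pasted into the SQL text rather than bound as a parameter, it is worth checking it against a known list before building the statement. The following is a minimal sketch of that idea; ALLOWED_COLUMNS and build_insert_sql are hypothetical names introduced here, and the column set is taken from the CREATE TABLE statement in the question.

```python
# Column names taken from the question's CREATE TABLE statement.
ALLOWED_COLUMNS = {"time", "u", "v", "w", "spd", "dir", "temp"}


def build_insert_sql(col):
    """Return a single-column INSERT statement, rejecting unknown column names."""
    if col not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column name: {col!r}")
    # The value itself is still passed separately as a bind parameter (%s).
    return f"INSERT INTO Data2_mcw({col}) VALUES (%s)"


print(build_insert_sql("time"))
```

The returned string would then be passed to cursor.execute together with a one-element tuple holding the value.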
The second problem is that a SQL INSERT statement inserts an entire row. It does not insert a single value into an already-existing row, as you seem to be expecting it to.
Suppose your dataframe Data2_mcw looks like this:
   a  b  c
0  1  2  7
1  3  4  9
Clearly, this dataframe has six values in it. If you were to run your code on this dataframe, then it would insert six rows into your database table, one for each value, and the data in your table would look like the following:
a  b  c
1
3
   2
   4
      7
      9
I'm guessing you don't want this: you'd rather your database table contained the following two rows instead:
a b c
1 2 7
3 4 9
Instead of inserting one value at a time, you will have to insert one entire row at a time. This means you have to swap your two loops around, build the SQL string once beforehand, and collect all the values for a row before passing them to the database. Something like the following should hopefully work (please note that I don't have a Postgres database to test this against):
column_names = ",".join(Data2_mcw.columns)
placeholders = ",".join(["%s"] * len(Data2_mcw.columns))
sql = f"INSERT INTO Data2_mcw({column_names}) VALUES ({placeholders})"
for row in Data2_mcw.index:
    values = []
    for col in Data2_mcw.columns:
        value = Data2_mcw[col].loc[row]
        if type(value).__module__ == np.__name__:
            value = pickle.dumps(value)
        values.append(value)
    cursor.execute(sql, values)
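The row-at-a-time pattern can be tried out without Postgres. The sketch below is an assumption-laden stand-in: it uses the built-in sqlite3 module (so the placeholder is ? instead of %s), a hypothetical table named demo, and plain tuples in place of the dataframe.

```python
import sqlite3

columns = ["a", "b", "c"]
data = [(1, 2, 7), (3, 4, 9)]  # one tuple per row, mirroring the example dataframe

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (a INTEGER, b INTEGER, c INTEGER)")

# Build the SQL string once, with one placeholder per column.
column_names = ",".join(columns)
placeholders = ",".join(["?"] * len(columns))
sql = f"INSERT INTO demo({column_names}) VALUES ({placeholders})"

# Execute one INSERT per row, passing all of that row's values together.
for values in data:
    conn.execute(sql, values)

stored = conn.execute("SELECT a, b, c FROM demo ORDER BY a").fetchall()
print(stored)
conn.close()
```

The table ends up with two complete rows rather than six partially filled ones.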

Format SQL query to Python

So far I have copied simple queries from SQL into Python using the following format:
sql = ("SELECT column1, column2, column3, column4 "
       "FROM table1 "
       "LEFT OUTER JOIN table2 ON x = y "
       "LEFT OUTER JOIN table3 ON table3.z = table1.y ")
However, I have now started to copy larger and more complicated SQL queries into Python, and I find it quite difficult to keep using the format above as columns start to contain subqueries. I have seen some Python packages that format SQL code for use in Python, and I was wondering which one you would suggest, or what the best and quickest way to overcome this situation is.
You can use Python multiline strings, which start and end with three quotation marks:
"""This is a
multi
line
string"""
and not worry about formatting. This is what I generally use for such purposes, but ideally you should go with an ORM.
For reference please check
https://www.w3schools.com/python/python_strings.asp
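As a small illustration of the multiline-string approach, here is the query from the question written as a triple-quoted string. The use of textwrap.dedent is an addition of this sketch, not part of the answer: it removes the shared leading indentation so the SQL can be indented along with the surrounding Python code.

```python
import textwrap

# The backslash after the opening quotes suppresses the initial newline;
# dedent() then strips the four spaces common to every line.
sql = textwrap.dedent("""\
    SELECT column1, column2, column3, column4
    FROM table1
    LEFT OUTER JOIN table2 ON x = y
    LEFT OUTER JOIN table3 ON table3.z = table1.y
""")
print(sql)
```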
For readability, you can try this
For example:
sql = """ SELECT country, product, SUM(profit) FROM sales left join
x on x.id=sales.k GROUP BY country, product having f > 7 and fk=9
limit 5; """
will result in:
sql = """
SELECT
country,
product,
SUM(profit)
FROM
sales
LEFT JOIN x ON
x.id = sales.k
GROUP BY
country,
product
HAVING
f > 7
AND fk = 9
LIMIT 5; """

Selecting rows from sqlite table with three criterions

I am new to Sqlite. In the following line of code c is a cursor:
c.execute(
    "SELECT rowid,* from statements "
    "WHERE [row1] = (?*) AND [row2] = (?) AND [row3] = 'text3'",
    (text1,text2,)
)
items = c.fetchall()
In the above I was trying to find rows where the text in row1 is anything that begins with text1, e.g. I would like to select rows where
Row 1 is "the cow", "the horse" or "thermometer" where text1="the"
and
Row 2 is "elephant" where text2="elephant"
and
Row 3 is "text3"
I was using the (?) operator because this is part of a function that will be called with different parameters.
You must use the operator LIKE instead of = for the 1st condition and for this to work, you must concatenate the placeholder ? with the wildcard '%':
c.execute("SELECT rowid,* from statements WHERE [row1] LIKE ? || '%' AND [row2] = ? AND [row3] = 'text3'", (text1,text2,))
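To see the prefix match in action, here is a self-contained sketch against an in-memory database. The table layout and sample rows are assumptions made up for the demo (the question does not show them), and the ORDER BY rowid is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE statements (row1 TEXT, row2 TEXT, row3 TEXT)")
c.executemany(
    "INSERT INTO statements VALUES (?, ?, ?)",
    [
        ("the cow", "elephant", "text3"),
        ("the horse", "elephant", "text3"),
        ("thermometer", "elephant", "text3"),
        ("a cat", "elephant", "text3"),   # row1 does not start with "the"
        ("the dog", "mouse", "text3"),    # row2 does not match
    ],
)

text1, text2 = "the", "elephant"
# ? || '%' appends the wildcard to the bound value inside SQL.
c.execute(
    "SELECT rowid, * FROM statements "
    "WHERE [row1] LIKE ? || '%' AND [row2] = ? AND [row3] = 'text3' "
    "ORDER BY rowid",
    (text1, text2),
)
items = c.fetchall()
print(items)
conn.close()
```

Note that any % or _ characters inside text1 would themselves act as LIKE wildcards.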

Comparing multiple tables in Sqlite 3 using python

I am quite new to SQLite3 as well as Python. I am a complete beginner in SQLite and am learning as I go for my project. I am working on a project where I have one database with about 20 tables in it. One table is for user input and the other tables are pre-loaded with values. How can I check which values in a pre-loaded table match the values in the user table? For example:
Users Table:
Barcode: Item:
1234 milk
4321 cheese
5678 butter
8765 water
9876 sugar
Pre-Loaded Table:
Barcode: Availability:
1234 1
5678 1
9876 1
1111 1
Now, I want to be able to compare each row in the Pre-Loaded Table to each row in the Users Table. They both have the Barcode column in common to be able to compare. As a result, during the query process, it should check each row:
1234 - milk - 1 (those columns are equal )
5678 - butter - 1 ( those columns are equal)
9876 - sugar - 1 (those columns are equal)
1111 - - 1 (this barcode does not exist in the Users Table)
So when a barcode, in this case 1111, doesn't exist in the Users Table, the code should print: You don't have all the items for the Pre-Loaded Table. How can I get the code to do this?
So far I have this (this code does work, by the way):
import sqlite3 as sq
connect = sq.connect('Food_Data.db')
con = connect.cursor()
sql = ("SELECT Users_Food.Barcode, Users_Food.Item, Recipe1.Ham_Swiss_Omelet FROM Users_Food INNER JOIN Recipe1 ON Users_Food.Barcode = Recipe1.Barcode WHERE Recipe1.Ham_Swiss_Omelet = '1'")
con.execute(sql)
data = con.fetchall()
print("You can make: Ham Swiss Omelet")
formatted_row = '{:<10} {:<9} {:>9} '
print(formatted_row.format("Barcode", "Ingredients", "Availability"))
for row in data:
    print(formatted_row.format(*row))
#print (row[:])
#connect.commit()
It prints:
You can make: Ham Swiss Omelet
Barcode Ingredients Availability
9130849874 butter 1
2870896881 eggs 1
5501066727 water 1
1765023029 salt 1
9118188735 pepper 1
4087256674 ham 1
3009527296 cheese 1
The SQLite code:
sql = ("SELECT Users_Food.Barcode, Users_Food.Item, Recipe1.Ham_Swiss_Omelet FROM Users_Food INNER JOIN Recipe1 ON Users_Food.Barcode = Recipe1.Barcode WHERE Recipe1.Ham_Swiss_Omelet = '1'")
It combines the two tables on the Barcode they have in common, along with the corresponding food names and availability. However, if one of the barcode values is not present in the Pre-Loaded table, how can I detect that during the comparison while still displaying what the two tables have in common? It is like checking whether the tables are identical.
Perhaps try your luck with LEFT JOIN and a CASE statement.
From sqlite doc
If the join-operator is a "LEFT JOIN" or "LEFT OUTER JOIN", then after
the ON or USING filtering clauses have been applied, an extra row is
added to the output for each row in the original left-hand input
dataset that corresponds to no rows at all in the composite dataset
(if any).
You need the Recipe1 table to be the left-hand table, because you need to select every row in that table. All columns from Users_Food will be null in the extra row. The sample query adds another column "status", which you can use in the python. With a little rearranging:
SELECT Users_Food.Barcode, Users_Food.Item, Recipe1.Ham_Swiss_Omelet,
CASE WHEN Users_Food.Barcode IS NULL THEN 'You cannot make this recipe' ELSE ' ' END AS status
FROM Recipe1
LEFT JOIN Users_Food ON Users_Food.Barcode = Recipe1.Barcode
WHERE Recipe1.Ham_Swiss_Omelet = '1'
In python you might not want to print("You can make: Ham Swiss Omelet") since you won't know whether that is true until you fetch all the returned rows.
After you get the SQL to return the rows that you want, you can play around with the python to get the desired output.
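One possible way to handle the result on the Python side is sketched below. It runs against an in-memory database filled with the small example tables from the question; the column types are assumptions, and the query selects Recipe1.Barcode (rather than Users_Food.Barcode) so that the missing barcode is still visible in the unmatched row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Users_Food (Barcode INTEGER, Item TEXT)")
cur.execute("CREATE TABLE Recipe1 (Barcode INTEGER, Ham_Swiss_Omelet TEXT)")
cur.executemany(
    "INSERT INTO Users_Food VALUES (?, ?)",
    [(1234, "milk"), (4321, "cheese"), (5678, "butter"), (8765, "water"), (9876, "sugar")],
)
cur.executemany(
    "INSERT INTO Recipe1 VALUES (?, ?)",
    [(1234, "1"), (5678, "1"), (9876, "1"), (1111, "1")],
)

# Recipe1 is the left-hand table, so every recipe row is returned;
# Item is NULL (None in Python) when the user does not have that barcode.
cur.execute(
    "SELECT Recipe1.Barcode, Users_Food.Item, Recipe1.Ham_Swiss_Omelet "
    "FROM Recipe1 "
    "LEFT JOIN Users_Food ON Users_Food.Barcode = Recipe1.Barcode "
    "WHERE Recipe1.Ham_Swiss_Omelet = '1' "
    "ORDER BY Recipe1.Barcode"
)
rows = cur.fetchall()
conn.close()

# Decide what to print only after all rows have been fetched.
missing = [barcode for barcode, item, _ in rows if item is None]
if missing:
    print("You don't have all the items for the Pre-Loaded Table. Missing:", missing)
else:
    print("You can make: Ham Swiss Omelet")
for barcode, item, flag in rows:
    if item is not None:
        print(barcode, item, flag)
```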
