I am trying to store some tables I create in my code in an RDS instance using psycopg2. The script runs without issue and I can see the table being created correctly in the DB. However, if I try to query the table, I only see the columns, but no data:
import pandas as pd
import psycopg2
test=pd.DataFrame({'A':[1,1],'B':[2,2]})
#connect is a function to connect to the RDS instance
connection= connect()
cursor=connection.cursor()
query='CREATE TABLE test (A varchar NOT NULL,B varchar NOT NULL);'
cursor.execute(query)
connection.commit()
cursor.close()
connection.close()
This script runs without issues. Printing out file_check from the following script:
connection=connect()
# check if file already exists in SQL
sql = """
SELECT "table_name","column_name", "data_type", "table_schema"
FROM INFORMATION_SCHEMA.COLUMNS
WHERE "table_schema" = 'public'
ORDER BY table_name
"""
file_check=pd.read_sql(sql, con=connection)
connection.close()
I get:
table_name column_name data_type table_schema
0 test a character varying public
1 test b character varying public
which looks good.
Running the following however:
read='select * from public.test'
df=pd.read_sql(read,con=connection)
returns:
Empty DataFrame
Columns: [a, b]
Index: []
Does anybody have any idea why this is happening? I cannot seem to get around this.
Erm, your first script has a test dataframe, but it's never referred to after it's defined.
You'll need to
test.to_sql("test", connection)
or similar to actually write it.
A minimal example:
$ createdb so63284022
$ python
>>> import sqlalchemy as sa
>>> import pandas as pd
>>> test = pd.DataFrame({'A':[1,1],'B':[2,2], 'C': ['yes', 'hello']})
>>> engine = sa.create_engine("postgresql://localhost/so63284022")
>>> with engine.connect() as connection:
... test.to_sql("test", connection)
...
>>>
$ psql so63284022
so63284022=# select * from test;
 index | A | B |   C
-------+---+---+-------
     0 | 1 | 2 | yes
     1 | 1 | 2 | hello
(2 rows)
so63284022=# \d+ test
Table "public.test"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------+-----------+----------+---------+----------+--------------+-------------
index | bigint | | | | plain | |
A | bigint | | | | plain | |
B | bigint | | | | plain | |
C | text | | | | extended | |
Indexes:
"ix_test_index" btree (index)
Access method: heap
so63284022=#
I was able to solve this:
As was pointed out by @AKX, I was only creating the table structure, but I was not filling in the table.
I now import psycopg2.extras as well and, after this:
query='CREATE TABLE test (A varchar NOT NULL,B varchar NOT NULL);'
cursor.execute(query)
I add something like:
update_query = 'INSERT INTO test (A, B) VALUES (%s, %s) ON CONFLICT DO NOTHING'
psycopg2.extras.execute_batch(cursor, update_query, test.values)
connection.commit()
cursor.close()
connection.close()
My table is now correctly filled, as verified with pd.read_sql.
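Putting it all together, the corrected write path might look roughly like this (a sketch, assuming connect() is the question's helper that returns an open psycopg2 connection; the astype(str) cast is my addition, to hand psycopg2 plain strings matching the varchar columns):
import pandas as pd
import psycopg2
import psycopg2.extras

test = pd.DataFrame({'A': [1, 1], 'B': [2, 2]})

connection = connect()  # the question's helper for the RDS instance
cursor = connection.cursor()
cursor.execute('CREATE TABLE test (A varchar NOT NULL, B varchar NOT NULL);')
update_query = 'INSERT INTO test (A, B) VALUES (%s, %s) ON CONFLICT DO NOTHING'
# cast to str so the values match the varchar columns
psycopg2.extras.execute_batch(cursor, update_query, test.astype(str).values.tolist())
connection.commit()
cursor.close()
connection.close()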
I am trying to see if there's any way I can implement this piece of code using only SQL (Redshift):
a = '''
SELECT to_char(DATE '2022-01-01'
+ (interval '1 day' * generate_series(0,365)), 'YYYY_MM_DD') AS ym
'''
dfa = pd.read_sql(a, conn)
b = f'''
select account_no, {','.join('"' + str(x) + '"' for x in dfa.ym)}
from loan__balance_table
where account_no =
'''
dfb = pd.read_sql(b, conn)
The first query will yield something like this:
| ym |
| ---------- |
| 2022_01_01 |
| 2022_01_02 |
...
| 2022_12_31 |
Then I used string concatenation to combine the dates together and use them in the second query to select all the columns in ym. The result of the second query should be something like this:
| account_no | 2022_01_01 | 2022_01_02 | ...
| ---------- | ---------- | ---------- | ...
| 1234 | 234,987.09 | 233,989.19 | ...
I just want to know if there's a way I can combine both queries together as one in SQL, without using Python to concatenate the column names.
I tried using a CTE but I can't seem to get it right, and I don't even know if this is the right approach. The database is Redshift.
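In the meantime, the two-step Python approach can at least be wrapped into a single helper (a sketch, assuming conn is an open Redshift connection as above; the %s parameter for the account number is my addition, since the WHERE value was left out of the question):
import pandas as pd

def pivoted_balances(conn, account_no):
    # step 1: generate the column names, one per day
    dates = pd.read_sql('''
        SELECT to_char(DATE '2022-01-01'
             + (interval '1 day' * generate_series(0, 365)), 'YYYY_MM_DD') AS ym
    ''', conn)
    # step 2: interpolate the quoted column list into the second query
    cols = ', '.join('"' + d + '"' for d in dates.ym)
    sql = 'select account_no, {} from loan__balance_table where account_no = %s'.format(cols)
    return pd.read_sql(sql, conn, params=(account_no,))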
Consider a table users of 6 rows
+--------+---------+
| userid | name    |
+--------+---------+
| 1      | john    |
| 2      | steve   |
| 3      | joe     |
| 4      | jason   |
| 5      | abraham |
| 6      | leonard |
+--------+---------+
I am using the below SQL query:
SELECT userid,name FROM users where userid IN (2,3,4,5);
which returns 4 rows -
| 2 | steve |
| 3 | joe |
| 4 | jason |
| 5 | abraham |
Pymysql equivalent code is as below:
def get_username(user_ids):
    data = []
    conn = init_db()
    cur = conn.cursor(pymysql.cursors.DictCursor)
    cur.executemany("SELECT userid,name from users WHERE userid IN (%s)", user_ids)
    rows = cur.fetchall()
    for row in rows:
        data.append([row['userid'], row['name']])
    cur.close()
    conn.close()
    return data

user_ids = [2, 3, 4, 5]
get_username(user_ids)
This code just returns the last row, [[5, 'abraham']]. How can I fetch all those rows?
That's the (partly documented) behaviour of .executemany():
Help on method executemany in module pymysql.cursors:

executemany(self, query, args) method of pymysql.cursors.Cursor instance
    Run several data against one query

    :param query: query to execute on server
    :param args: Sequence of sequences or mappings. It is used as parameter.
    :return: Number of rows affected, if any.

    This method improves performance on multiple-row INSERT and
    REPLACE. Otherwise it is equivalent to looping over args with
    execute().
So what you want here is cursor.execute() - but then, you have a bit more work to build your SQL query:
user_ids = (2, 3, 4, 5)
placeholders = ", ".join(["%s"] * len(user_ids))
sql = "SELECT userid,name from users WHERE userid IN ({})".format(placeholders)
cursor.execute(sql, user_ids)
data = list(cursor)
Note that cursors are iterables, so you don't need to explicitly call cursor.fetchall() and then iterate on the result; you can iterate directly on the cursor. Also note that if you want a list of (id, name) tuples, using a DictCursor is just a double waste of CPU cycles (once for building the dicts and once for rebuilding tuples out of them); you could just use a default cursor and return list(cursor) instead.
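For illustration, the whole function could then look like this (a sketch, assuming init_db() returns an open connection as in the question):
def get_username(user_ids):
    conn = init_db()
    cur = conn.cursor()  # a default cursor already yields (userid, name) tuples
    placeholders = ", ".join(["%s"] * len(user_ids))
    sql = "SELECT userid,name from users WHERE userid IN ({})".format(placeholders)
    cur.execute(sql, user_ids)
    data = list(cur)  # iterate the cursor directly instead of calling fetchall()
    cur.close()
    conn.close()
    return data

user_ids = [2, 3, 4, 5]
get_username(user_ids)  # [(2, 'steve'), (3, 'joe'), (4, 'jason'), (5, 'abraham')]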
My first guess is that it is something related to the SELECT statement.
Could you try this way of generating the query?
def get_username(user_ids):
    data = []
    conn = init_db()
    cur = conn.cursor(pymysql.cursors.DictCursor)
    # the ids are interpolated directly into the statement, so plain execute() is enough
    cur.execute("SELECT userid,name from users WHERE userid IN (" + ','.join(str(e) for e in user_ids) + ")")
    rows = cur.fetchall()
    for row in rows:
        data.append([row['userid'], row['name']])
    cur.close()
    conn.close()
    return data

user_ids = [2, 3, 4, 5]
get_username(user_ids)
I have a MySQL database that contains a table named commands with the following structure:
+-----------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| input | varchar(3000) | NO | | NULL | |
| inputhash | varchar(66) | YES | UNI | NULL | |
+-----------+---------------+------+-----+---------+----------------+
I am trying to insert rows in it, but only if the inputhash field does not already exist. I thought INSERT IGNORE was the way to do this, but I am still getting warnings.
For instance, suppose that the table already contains
+----+---------+------------------------------------------------------------------+
| id | input | inputhash |
+----+---------+------------------------------------------------------------------+
| 1 | enable | 234a86bf393cadeba1bcbc09a244a398ac10c23a51e7fd72d7c449ef0edaa9e9 |
+----+---------+------------------------------------------------------------------+
Then when using the following Python code to insert a row
import MySQLdb
db = MySQLdb.connect(host='xxx.xxx.xxx.xxx', user='xxxx', passwd='xxxx', db='dbase')
c = db.cursor()
c.execute('INSERT IGNORE INTO `commands` (`input`, `inputhash`) VALUES (%s, %s)', ('enable', '234a86bf393cadeba1bcbc09a244a398ac10c23a51e7fd72d7c449ef0edaa9e9',))
I am getting the warning
Warning: Duplicate entry '234a86bf393cadeba1bcbc09a244a398ac10c23a51e7fd72d7c449ef0edaa9e9' for key 'inputhash'
c.execute('INSERT IGNORE INTO `commands` (`input`, `inputhash`) VALUES (%s, %s)', ('enable','234a86bf393cadeba1bcbc09a244a398ac10c23a51e7fd72d7c449ef0edaa9e9',))
Why does this happen? I thought that the whole point of using INSERT IGNORE on a table with UNIQUE fields is to suppress the error and simply ignore the write attempt?
What is the proper way to resolve this? I suppose I can suppress the warning in Python with warnings.filterwarnings('ignore') but why does the warning appear in the first place?
I hope it will help you!
import MySQLdb
db = MySQLdb.connect(host='xxx.xxx.xxx.xxx', user='xxxx', passwd='xxxx',
                     db='dbase')
c = db.cursor()
c.execute('INSERT INTO `commands` (`input`, `inputhash`) VALUES (%s, %s) '
          'ON DUPLICATE KEY UPDATE `inputhash` = `inputhash`',
          ('enable', '234a86bf393cadeba1bcbc09a244a398ac10c23a51e7fd72d7c449ef0edaa9e9'))
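As for suppressing the warning on the Python side, filtering only MySQL's own warning class keeps other warnings visible (a short sketch; the INSERT IGNORE statement itself stays unchanged):
import warnings
import MySQLdb

# silence only MySQL warnings, such as the duplicate-entry one
warnings.filterwarnings('ignore', category=MySQLdb.Warning)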
Question: how do I insert a datetime value into MS SQL server, given the code below?
Context:
I have a 2-D list (i.e., a list of lists) in Python that I'd like to upload to a table in Microsoft SQL Server 2008. For this project I am using Python's pymssql package. Each value in each list is a string except for the very first element, which is a datetime value.
Here is how my code reads:
import pymssql
db_connect = pymssql.connect( # these are just generic names
server = server_name,
user = db_usr,
password = db_pwd,
database = db_name
)
my_cursor = db_connect.cursor()
for individual_list in list_of_lists:
# the first value in the parentheses should be datetime
my_cursor.execute("INSERT INTO [DB_Table_Name] VALUES (%s, %s, %s, %s, %s, %s, %s, %s)", tuple(individual_list))
db_connect.commit()
The Python interpreter is having a tough time inserting my datetime values. I understand that I am currently using %s, which is a string placeholder, but I'm unsure what I should use for datetime, which is the type of the database's first column.
The "list of lists" looks like this (after each list is converted into a tuple):
[(datetime.datetime(2012, 4, 1), '1', '4.1', 'hip', 'A1', 'J. Smith', 'B123', 'XYZ'),...]
Here is an illustration of what the table should look like:
+-----------+------+------+--------+-------+-----------+---------+---------+
| date | step | data | type | ID | contact | notif. | program |
+-----------+------+------+--------+-------+-----------+---------+---------+
|2012-04-01 | 1 | 4.1 | hip | A1 | J. Smith | B123 | XYZ |
|2012-09-05 | 2 | 5.1 | hip | A9 | B. Armst | B123 | ABC |
|2012-01-16 | 5 | 9.0 | horray | C6 | F. Bayes | P995 | XYZ |
+-----------+------+------+--------+-------+-----------+---------+---------+
Thank you in advance.
I would try formatting the datetime as "yyyymmdd hh:mm:ss" before inserting. With what you are doing, SQL will be parsing the string anyway, so I would also build the entire statement and then execute it as one string. See below:
for individual_list in list_of_lists:
    # format the leading datetime, then quote every value into one VALUES list
    date_time = individual_list[0].strftime("%Y%m%d %H:%M:%S")
    values = "', '".join([date_time] + [str(v) for v in individual_list[1:]])
    insert_str = "INSERT INTO [DB_Table_Name] VALUES ('" + values + "');"
    print insert_str
    my_cursor.execute(insert_str)
db_connect.commit()
I apologize for the crude Python, but SQL should accept that insert statement as long as all the fields match up. If not, you may want to specify which fields those values go to in your insert statement.
Let me know if that works.
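Alternatively, pymssql can usually bind values through %s placeholders once the leading datetime is rendered as a string, which avoids hand-quoting entirely (a sketch under that assumption, keeping the question's table and column order):
for individual_list in list_of_lists:
    row = list(individual_list)
    row[0] = row[0].strftime("%Y-%m-%d %H:%M:%S")  # e.g. '2012-04-01 00:00:00'
    my_cursor.execute("INSERT INTO [DB_Table_Name] VALUES (%s, %s, %s, %s, %s, %s, %s, %s)",
                      tuple(row))
db_connect.commit()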
Thanks for taking the time to read this. It's going to be a long post to explain the problem. I haven't been able to find an answer in all the usual sources.
Problem:
I am having an issue with using the select statement with python to recall data from a table in a mysql database.
System and versions:
Linux ubuntu 2.6.38-14-generic #58-Ubuntu SMP Tue Mar 27 20:04:55 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Python: 2.7.1+
MySql: Server version: 5.1.62-0ubuntu0.11.04.1 (Ubuntu)
Here's the table:
mysql> describe hashes;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id | varchar(20) | NO | PRI | NULL | |
| hash | varbinary(4) | NO | MUL | NULL | |
+-------+--------------+------+-----+---------+-------+
Here are responses that I want via a normal mysql query:
mysql> SELECT id FROM hashes WHERE hash='f';
+------+
| id |
+------+
| 0x67 |
+------+
mysql> SELECT id FROM hashes WHERE hash='ff';
+--------+
| id |
+--------+
| 0x6700 |
+--------+
As before, these are the responses that are expected and how I designed the DB.
My code:
import mysql.connector
from database import login_info
import sys
db = mysql.connector.Connect(**login_info)
cursor = db.cursor()
data = 'f'
cursor.execute("""SELECT
* FROM hashes
WHERE hash=%s""",
(data))
rows = cursor.fetchall()
print rows
for row in rows:
print row[0]
This returns the result I expect:
[(u'0x67', 'f')]
0x67
If I change data to:
data = 'ff'
I receive the following error:
Traceback (most recent call last):
File "test.py", line 11, in <module>
(data))
File "/usr/local/lib/python2.7/dist-packages/mysql_connector_python-0.3.2_devel- py2.7.egg/mysql/connector/cursor.py", line 310, in execute
"Wrong number of arguments during string formatting")
mysql.connector.errors.ProgrammingError: Wrong number of arguments during string formatting
OK. So, I add a string formatting character to my SQL statement like so:
cursor.execute("""SELECT
* FROM hashes
WHERE hash=%s%s""",
(data))
And I get the following response:
[(u'0x665aa6', "f'f")]
0x665aa6
and it should be 0x6700.
I know that I should be passing the data with one %s character. That is how I built my database table, using one %s per variable:
cursor.execute("""
INSERT INTO hashes (id, hash)
VALUES (%s, %s)""", (k, hash))
Any ideas how to fix this?
Thanks.
Your execute statement doesn't seem quite correct. My understanding is that it should follow the pattern cursor.execute(<select statement string>, <tuple>), and by putting only a single value in the tuple location you actually just have a parenthesized string. To make the second argument the correct data type you need to put a trailing comma in there, so your statement would look like:
cursor.execute("""SELECT
* FROM hashes
WHERE hash=%s""",
(data, ))
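The distinction is easy to check in the interpreter (Python 2 shown, matching the versions above):
>>> data = 'ff'
>>> type((data))   # parentheses alone don't make a tuple
<type 'str'>
>>> type((data,))  # the trailing comma does
<type 'tuple'>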