I am using a Jupyter notebook to access a Teradata database.
Assume I have a dataframe
Name Age
Sam 5
Tom 6
Roy 7
I want to use the contents of the whole "Name" column as the WHERE condition of a SQL query.
query = '''select Age
from xxx
where Name in (Sam, Tom, Roy)'''
age = pd.read_sql(query,conn)
How can I format the column so that its contents are inserted into the SQL statement automatically, instead of manually pasting them?
Join the Name column and insert it into the query with an f-string. Each name has to be quoted, or the generated SQL is invalid:
query = f'''select Age
from xxx
where Name in ({", ".join(f"'{n}'" for n in df.Name)})'''
print(query)
select Age
from xxx
where Name in ('Sam', 'Tom', 'Roy')
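Interpolating raw values into the SQL string also invites quoting bugs and SQL injection. A safer sketch builds a placeholder list and passes the values separately (this assumes the Teradata driver accepts qmark-style ? placeholders; some drivers want %s instead):

```python
names = ["Sam", "Tom", "Roy"]  # or list(df.Name)
placeholders = ", ".join("?" for _ in names)
query = f"select Age from xxx where Name in ({placeholders})"
# query == "select Age from xxx where Name in (?, ?, ?)"
# then: age = pd.read_sql(query, conn, params=names)
```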
CREATE TABLE temp (
id UINTEGER,
name VARCHAR,
age UINTEGER
);
CREATE SEQUENCE serial START 1;
Inserting a single row with the sequence works just fine:
INSERT INTO temp VALUES(nextval('serial'), 'John', 13)
How can I use the sequence with a pandas dataframe?
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Name Age
0 Alex 10
1 Bob 12
2 Clarke 13
con.execute("INSERT INTO temp SELECT * FROM df")
RuntimeError: Binder Error: table temp has 3 columns but 2 values were supplied
I don't want to iterate item by item. The goal is to efficiently insert thousands of rows from Python into the DB. I'm OK with switching from pandas to something else.
Can't you have nextval('serial') as part of your select query when reading the df?
e.g.,
con.execute("INSERT INTO temp SELECT nextval('serial'), Name, Age FROM df")
I have a list of string values new_values and I want to insert it as a new column in the companies table in my MySQL database. Since I have hundreds of rows, I cannot manually type them using the ? syntax that I came across on SO.
import MySQLdb
db = MySQLdb.connect(...)  # connection details omitted
cursor = db.cursor()
cursor.execute("INSERT INTO companies ....")
lst_to_add = ["name1", "name2", "name3"]
db.commit()
db.close()
However, I am not sure what query I should use to pass in my list, and what the correct syntax is to include the new column name (e.g. "newCol") in the query.
Edit:
current table:
id originalName
1 Hannah
2 Joi
3 Kale
expected output:
id originalName fakeName
1 Hannah name1
2 Joi name2
3 Kale name3
Use ALTER TABLE to add the new column.
cursor.execute("ALTER TABLE companies ADD COLUMN fakeName VARCHAR(100)")
Then loop through the data, updating each row. You need Python data that maps each original name to its fake name; a dictionary works well, for example:
names_to_add = {
"Hannah": "name1",
"Joi": "name2",
"Kale": "name3"
}
for oldname, newname in names_to_add.items():
    cursor.execute("UPDATE companies SET fakeName = %s WHERE originalName = %s", (newname, oldname))
db.commit()
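For hundreds of rows, the per-row loop can be collapsed into a single executemany call. A minimal sketch of the same pattern, demonstrated here with the stdlib sqlite3 driver so it runs without a server (MySQLdb's cursor.executemany works the same way, with %s placeholders instead of ?):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE companies (id INTEGER, originalName TEXT)")
cur.executemany("INSERT INTO companies VALUES (?, ?)",
                [(1, "Hannah"), (2, "Joi"), (3, "Kale")])
cur.execute("ALTER TABLE companies ADD COLUMN fakeName TEXT")

names_to_add = {"Hannah": "name1", "Joi": "name2", "Kale": "name3"}
# one batched call instead of one UPDATE statement per row
cur.executemany("UPDATE companies SET fakeName = ? WHERE originalName = ?",
                [(new, old) for old, new in names_to_add.items()])
con.commit()

rows = cur.execute(
    "SELECT originalName, fakeName FROM companies ORDER BY id").fetchall()
# rows == [('Hannah', 'name1'), ('Joi', 'name2'), ('Kale', 'name3')]
```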
I am trying to get column names from my PostgreSQL table using psycopg2, but it returns the columns in a different order from how they appear in the table.
This is how the database table looks when saved as a pandas dataframe:
cur.execute("Select * from actor")
tupples = cur.fetchall()
cur.execute("select column_name from information_schema.columns where table_name = 'actor'")
column_name = cur.fetchall()
df = pd.DataFrame(tupples,columns = column_name)
(actor_id,) (last_update,) (first_name,) (last_name,)
1 PENELOPE GUINESS 2006-02-15 04:34:33
2 NICK WAHLBERG 2006-02-15 04:34:33
3 ED CHASE 2006-02-15 04:34:33
4 JENNIFER DAVIS 2006-02-15 04:34:33
5 JOHNNY LOLLOBRIGIDA 2006-02-15 04:34:33
This is how the database table looks when I view it in pgadmin2 (screenshot omitted):
I just want column_name to hold the column names of the SQL table in the same order as they appear in the table.
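A likely cause: information_schema.columns gives no guaranteed row order without an ORDER BY (adding "order by ordinal_position" fixes that query). A simpler route is cursor.description, which DB-API drivers, psycopg2 included, populate in the column order of the result set you just selected. A sketch of that pattern, demonstrated here with the stdlib sqlite3 driver:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE actor (actor_id INTEGER, first_name TEXT, "
            "last_name TEXT, last_update TEXT)")
cur = con.execute("SELECT * FROM actor")

# description is a sequence of tuples, one per result column;
# item 0 of each tuple is the column name, in result-set order
colnames = [desc[0] for desc in cur.description]
# colnames == ['actor_id', 'first_name', 'last_name', 'last_update']
# these can then be passed to pd.DataFrame(cur.fetchall(), columns=colnames)
```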
I have a dataframe like this
Name age city
John 31 London
Pierre 35 Paris
...
Kasparov 40 NYC
I would like to select data from the Redshift city table using SQL, where the city is one of the cities in the dataframe:
query = select * from city where ....
Can you help me to accomplish this query?
Thank you
Jeril's answer is going in the right direction but is not complete. df['city'].unique() does not return a string; it returns an array. You need a string in your WHERE clause:
# create a string for cities to use in sql, the way sql expects the string
unique_cities = ','.join("'{0}'".format(c) for c in list(df['city'].unique()))
# output
'London','Paris'
#sql query would be
query = f"select * from city where name in ({unique_cities})"
The code above assumes you are using Python 3.6+ (for the f-string).
Please let me know if this solves your issue
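For reference, a self-contained run of the pattern above against the sample dataframe from the question:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Pierre", "Kasparov"],
                   "age": [31, 35, 40],
                   "city": ["London", "Paris", "NYC"]})

# quote each distinct city the way SQL expects it
unique_cities = ",".join("'{0}'".format(c) for c in df["city"].unique())
query = f"select * from city where name in ({unique_cities})"
# query == "select * from city where name in ('London','Paris','NYC')"
```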
You can try the following:
unique_cities = df['city'].unique()
# sql query
select * from city where name in unique_cities
If I have a SQL script, is there a way to parse it and extract the columns and tables referenced in the script into a table-like structure?
Script:
Select t1.first, t1.last, t2.car, t2.make, t2.year
from owners t1
left join cars t2
on t1.owner_id = t2.owner_id
Output:
Table Column
owners first
owners last
owners owner_id
cars car
cars make
cars year
cars owner_id
Old question but an interesting one, so here it goes: temporarily turn your script into a stored procedure, which forces SQL Server to map the dependencies, and then retrieve them with:
SELECT referenced_entity_name, referenced_minor_name
FROM sys.dm_sql_referenced_entities('dbo.stp_ObjectsToTrack', 'Object')
This is what you want in SQL Server:
select t.name as [Table], c.name as [Column]
from sys.columns c
inner join sys.tables t
on c.object_id = t.object_id