How to write csv file into sql database with python - python

I have a CSV file that contains information about computers, such as ostype, ram, and cpu values. I also have an SQL database that already holds the same information, and I want to update that database table with a Python script. The database table and the CSV file share a unique "id" parameter.
import csv
with open("Hypersanal.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        print(row)

Depending on what type of database, there will be some slight adjustments to make to the code.
For this example, I'll use SQLAlchemy with the pymysql driver. To find out what the first part of the connection String should be (depends on the kind of database you want to connect to), check SQLAlchemy Doc about Dialects.
First, we import the necessary modules
from sqlalchemy import *
from sqlalchemy.orm import create_session
from sqlalchemy.ext.declarative import declarative_base
Then, we create the connection string
dialect_part = "mysql+pymysql://"
# username is the username we'll use to connect to the db, and password the corresponding password
# server_name is the name of the server where the db is. It INCLUDES the port number (e.g. 'localhost:9080')
# database is the name of the db on the server we'll work on
connection_string = dialect_part+username+":"+password+"@"+server_name+"/"+database
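One detail plain concatenation skips: a username or password containing characters such as "@" or "/" has to be URL-escaped, otherwise the URL is parsed incorrectly. A minimal sketch using only the standard library follows; the function name build_connection_string and the sample credentials are hypothetical, not part of the original answer.

```python
from urllib.parse import quote_plus

def build_connection_string(username, password, server_name, database,
                            dialect_part="mysql+pymysql://"):
    """Assemble dialect://user:password@host/database with escaped credentials."""
    # quote_plus percent-encodes characters like '@' and '/' that would
    # otherwise be mistaken for URL delimiters.
    return (
        f"{dialect_part}{quote_plus(username)}:{quote_plus(password)}"
        f"@{server_name}/{database}"
    )

# Hypothetical credentials, for illustration only.
url = build_connection_string("admin", "p@ss/word", "localhost:3306", "inventory")
print(url)
```

The resulting string can be passed to create_engine in place of the concatenated one.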
Some more setups needed for SQLAlchemy :
Base = declarative_base()
engine = create_engine(connection_string)
metadata = MetaData(bind=engine)
Now, we have a link to the db, but need some more work before being able to do anything to it.
We create a class corresponding to the table of the db we'll hit. This class will 'autofill' according to how the table is in the db. You can also fill it manually.
class TableWellHit(Base):
    __table__ = Table(name_of_the_table, metadata, autoload=True)
Now, to be able to interact with the table, we need to create a session :
session = create_session(bind=engine)
Now, we need to begin the session, and we'll be set.
Your code will now be used.
import csv
with open("Hypersanal.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        # print row
        # I chose to push each value to the db one by one
        # If you're sure there won't be any duplicates for the primary key, you can put the session.begin() before the for loop
        session.begin()
        # I create an element for the db
        new_element = TableWellHit(field_in_table=row[field])
As an example, imagine you have required fields 'username' and 'password' in the table, and row is a dictionary with 'user' and 'pass' as keys.
The element would then be created by: TableWellHit(username=row['user'], password=row['pass'])
        # I add the element to the table
        # I choose to merge instead of add, so as to prevent duplicates, one more time
        session.merge(new_element)
        # Now, we commit our changes to the db
        # This also closes the session
        # if you put the session.begin() outside of the loop, do the same for the session.commit()
        session.commit()
Hope this answers your question, and if it does not, just let me know so I can correct my answer.
edit :
For MSSQL :
- Install pymssql (pip install pymssql)
- The connection_string should be of the following form, according to this SQLAlchemy page : mssql+pymssql://<username>:<password>@<freetds_name>/?charset=utf8
Using merge allows you to create or update a value, depending on whether or not it already exists.
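To illustrate the same create-or-update behaviour without SQLAlchemy, here is a rough sketch that swaps in the standard-library sqlite3 module and SQLite's INSERT OR REPLACE. The table name computers, its columns, and the sample CSV text are assumptions made up for this example; adapt them to the real table.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for Hypersanal.csv (semicolon-delimited).
SAMPLE_CSV = "id;ostype;ram;cpu\n1;linux;16;8\n2;windows;8;4\n"

def upsert_csv(conn, csv_text):
    """Insert each CSV row, or overwrite the existing row with the same id."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    rows = [(int(r["id"]), r["ostype"], int(r["ram"]), int(r["cpu"])) for r in reader]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO computers (id, ostype, ram, cpu) VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE computers (id INTEGER PRIMARY KEY, ostype TEXT, ram INTEGER, cpu INTEGER)"
)
conn.execute("INSERT INTO computers VALUES (1, 'linux', 4, 2)")  # existing, outdated row
conn.commit()

count = upsert_csv(conn, SAMPLE_CSV)
result = conn.execute("SELECT id, ostype, ram, cpu FROM computers ORDER BY id").fetchall()
print(count, result)
```

Row 1 is replaced by the CSV values and row 2 is inserted, which mirrors what session.merge does per element.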

Related

Python - Reading specific column from SQL output stored in a variable

I have a basic question here. I am pulling a SQL output as below:
cur = connection.cursor()
cur.execute("""select store_name,count(*) from stores group by store_name""")
data = cur.fetchall()
The output of the above SQL is as below:
Store_1,23
Store_2,13
Store_3,43
Store_4,2
I am trying to read column 1 (store_name) in the above output.
Expected Output:
Store_1
Store_2
Store_3
Store_4
Could anyone advise on how I could get this done? Thanks.
If I correctly understand your question, I guess just correcting your SQL will give you the desired result. Fetch distinct store_name
select distinct store_name from stores
Edit
Response to comment:
Try following:
from operator import itemgetter
data = cur.fetchall()
list(map(itemgetter(0), data)) # your answer
In your code, you can simply append the following lines:
for rows in data:
    print(rows[0])
hope this helps.
BTW: I am not on the computer and have not crosschecked the solution.
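Since the answer above was not cross-checked, here is a small self-contained sketch of both approaches, using an in-memory SQLite database in place of the asker's connection. The sample rows are invented for illustration.

```python
import sqlite3
from operator import itemgetter

# In-memory stand-in for the asker's database; the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stores (store_name TEXT)")
conn.executemany(
    "INSERT INTO stores (store_name) VALUES (?)",
    [("Store_1",), ("Store_1",), ("Store_2",)],
)

cur = conn.cursor()
cur.execute(
    "SELECT store_name, COUNT(*) FROM stores GROUP BY store_name ORDER BY store_name"
)
data = cur.fetchall()  # list of (store_name, count) tuples

# Two equivalent ways to keep only the first column of every row.
names = list(map(itemgetter(0), data))
names_alt = [row[0] for row in data]

for name in names:
    print(name)
```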
Harvard's CS50 web class has the following, which I think helps you in its last 3 lines.
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
engine = create_engine(os.getenv("DATABASE_URL")) # database engine object from SQLAlchemy that manages connections to the database
# DATABASE_URL is an environment variable that indicates where the database lives
db = scoped_session(sessionmaker(bind=engine)) # create a 'scoped session' that ensures different users' interactions with the
# database are kept separate
flights = db.execute("SELECT origin, destination, duration FROM flights").fetchall() # execute this SQL command and return all of the results
for flight in flights:
    print(f"{flight.origin} to {flight.destination}, {flight.duration} minutes.") # for every flight, print out the flight info
So in your case I suppose it's:
results = db.execute( <PUT YOUR SQL QUERY HERE> )
for row in results:
    print(row.store_name)

Specifying pyODBC options (fast_executemany = True in particular) using SQLAlchemy

I would like to switch on the fast_executemany option for the pyODBC driver while using SQLAlchemy to insert rows into a table. By default it is off and the code runs really slowly. Could anyone suggest how to do this?
Edits:
I am using pyODBC 4.0.21 and SQLAlchemy 1.1.13, and a simplified sample of the code I am using is presented below.
import sqlalchemy as sa
def InsertIntoDB(self, tablename, colnames, data, create = False):
    """
    Inserts data into given db table
    Args:
    tablename - name of db table with dbname
    colnames - column names to insert to
    data - a list of tuples, a tuple per row
    """
    # reflect table into a sqlalchemy object
    meta = sa.MetaData(bind=self.engine)
    reflected_table = sa.Table(tablename, meta, autoload=True)
    # prepare an input object for sa.connection.execute
    execute_inp = []
    for i in data:
        execute_inp.append(dict(zip(colnames, i)))
    # Insert values
    self.connection.execute(reflected_table.insert(),execute_inp)
Try this for pyodbc
crsr = cnxn.cursor()
crsr.fast_executemany = True
Starting with version 1.3, SQLAlchemy has directly supported fast_executemany, e.g.,
engine = create_engine(connection_uri, fast_executemany=True)

Insert on mongodb if not duplicated

I wrote this script to insert a document into MongoDB if it is not a duplicate:
import tldextract
from pymongo import MongoClient
client = MongoClient()
db = client.my_domains
collection = db.domain
with open('inputcut.csv', 'r') as f:
    for line in f:
        ext = tldextract.extract(line)
        domain = {"domain":ext.registered_domain}
        collection.update(domain,{'upsert':True})
When I run the script, no domains are inserted into the database.
I would like to insert a domain if it is not yet present in mongodb.
If the domain is already present, we do not insert it and we go to the next one ...
Thank you in advance for your help.
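collection.update takes the query and the update document as its first two arguments; upsert is an option, not part of the update document. Your call passes {'upsert': True} as the update itself, so nothing is inserted when no document matches. Rewrite the call as follows:
collection.update(domain, {'$set': domain}, upsert=True)
On PyMongo 3 and later, prefer collection.update_one(domain, {'$set': domain}, upsert=True), since update is deprecated there and was removed in PyMongo 4.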

List database tables with SQLAlchemy

I want to implement a function that gives information about all the tables (and their column names) that are present in a database (not only those created with SQLAlchemy). While reading the documentation it seems to me that this is done via reflection but I didn't manage to get something working. Any suggestions or examples on how to do this?
start with an engine:
from sqlalchemy import create_engine
engine = create_engine("postgresql://u:p@host/database")
quick path to all table /column names, use an inspector:
from sqlalchemy import inspect
inspector = inspect(engine)
for table_name in inspector.get_table_names():
    for column in inspector.get_columns(table_name):
        print("Column: %s" % column['name'])
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html?highlight=inspector#fine-grained-reflection-with-inspector
alternatively, use MetaData / Tables:
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
for table in m.tables.values():
    print(table.name)
    for column in table.c:
        print(column.name)
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html#reflecting-all-tables-at-once
First set up the sqlalchemy engine.
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.engine import url
connect_url = url.URL(
    'oracle',
    username='db_username',
    password='db_password',
    host='db_host',
    port='db_port',
    query=dict(service_name='db_service_name'))
engine = create_engine(connect_url)
try:
    engine.connect()
except Exception as error:
    print(error)
    return
Like others have mentioned, you can use the inspect method to get the table names.
But in my case, the list of tables returned by the inspect method was incomplete.
So, I found out another way to find table names by using pure SQL queries in sqlalchemy.
query = text("SELECT table_name FROM all_tables where owner = '%s'"%str('db_username'))
table_name_data = self.session.execute(query).fetchall()
For the sake of completeness, here's the code to fetch table names with the inspect method (if it works well in your case).
inspector = inspect(engine)
table_names = inspector.get_table_names()
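The "query the catalog directly" idea above is specific to Oracle's all_tables view, but most databases expose something similar. As a rough sketch of the same technique on SQLite, using only the standard library rather than SQLAlchemy: table names come from sqlite_master and column names from PRAGMA table_info. The function name list_tables_and_columns and the sample tables are hypothetical.

```python
import sqlite3

def list_tables_and_columns(conn):
    """Return {table_name: [column_name, ...]} read from SQLite's own catalog."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in names:
        if table.startswith("sqlite_"):
            continue  # skip SQLite's internal bookkeeping tables
        # Names come from the catalog itself; double any embedded quotes anyway.
        quoted = table.replace('"', '""')
        info = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[table] = [column[1] for column in info]  # index 1 is the column name
    return schema

# Hypothetical sample schema for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
schema = list_tables_and_columns(conn)
print(schema)
```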
Hey I created a small module that helps easily reflecting all tables in a database you connect to with SQLAlchemy, give it a look: EZAlchemy
from EZAlchemy.ezalchemy import EZAlchemy
DB = EZAlchemy(
    db_user='username',
    db_password='pezzword',
    db_hostname='127.0.0.1',
    db_database='mydatabase',
    d_n_d='mysql' # stands for dialect+driver
)
# this function loads all tables in the database to the class instance DB
DB.connect()
# List all associations to DB, you will see all the tables in that database
dir(DB)
I'm proposing another solution as I was not satisfied by any of the previous in the case of postgres which uses schemas. I hacked this solution together by looking into the pandas source code.
from sqlalchemy import MetaData, create_engine
from typing import List
def list_tables(pg_uri: str, schema: str) -> List[str]:
    with create_engine(pg_uri).connect() as conn:
        meta = MetaData(conn, schema=schema)
        meta.reflect(views=True)
        return list(meta.tables.keys())
To get a list of all tables in your schema, you need your postgres database URI pg_uri (e.g. "postgresql://u:p@host/database" as in zzzeek's answer) as well as the schema's name, schema. So with the example URI and the typical schema public, we would get all the tables and views with:
list_tables("postgresql://u:p@host/database", "public")
While reflection/inspection is useful, I had trouble getting the data out of the database. I found sqlsoup to be much more user-friendly. You create the engine using sqlalchemy and pass that engine to sqlsoup.SQLSoup, i.e.:
import sqlsoup
def create_engine():
    from sqlalchemy import create_engine
    return create_engine(f"mysql+mysqlconnector://{database_username}:{database_pw}@{database_host}/{database_name}")
def test_sqlsoup():
    engine = create_engine()
    db = sqlsoup.SQLSoup(engine)
    # Note: database must have a table called 'users' for this example
    users = db.users.all()
    print(users)
if __name__ == "__main__":
    test_sqlsoup()
If you're familiar with sqlalchemy then you're familiar with sqlsoup. I've used this to extract data from a wordpress database.

add column to SQLAlchemy Table

I made a table using SQLAlchemy and forgot to add a column. I basically want to do this:
users.addColumn('user_id', ForeignKey('users.user_id'))
What's the syntax for this? I couldn't find it in the docs.
I have the same problem, and the thought of using a migration library only for this trivial thing makes me tremble. Anyway, this is my attempt so far:
def add_column(engine, table_name, column):
    column_name = column.compile(dialect=engine.dialect)
    column_type = column.type.compile(engine.dialect)
    engine.execute('ALTER TABLE %s ADD COLUMN %s %s' % (table_name, column_name, column_type))
column = Column('new_column_name', String(100), primary_key=True)
add_column(engine, table_name, column)
Still, I don't know how to insert primary_key=True into raw SQL request.
This is referred to as database migration (SQLAlchemy doesn't support migration out of the box). You can look at using sqlalchemy-migrate to help in these kinds of situations, or you can just ALTER TABLE through your chosen database's command line utility.
See this section of the SQLAlchemy documentation: http://docs.sqlalchemy.org/en/latest/core/metadata.html#altering-schemas-through-migrations
Alembic is the latest software to offer this type of functionality and is made by the same author as SQLAlchemy.
I have a database called "ncaaf.db" built with sqlite3 and a table called "games". So I would CD into the same directory on my linux command prompt and do
sqlite3 ncaaf.db
alter table games add column q4 float;
and that is all it takes! Just make sure you update your definitions in your sqlalchemy code.
from sqlalchemy import create_engine
engine = create_engine('sqlite:///db.sqlite3')
engine.execute('alter table table_name add column column_name String')
I had the same problem, I ended up just writing my own function in raw sql. If you are using SQLITE3 this might be useful.
Then if you add the column to your class definition at the same time it seems to do the trick.
import sqlite3
def add_column(database_name, table_name, column_name, data_type):
    connection = sqlite3.connect(database_name)
    cursor = connection.cursor()
    if data_type == "Integer":
        data_type_formatted = "INTEGER"
    elif data_type == "String":
        data_type_formatted = "VARCHAR(100)"
    base_command = ("ALTER TABLE '{table_name}' ADD column '{column_name}' '{data_type}'")
    sql_command = base_command.format(table_name=table_name, column_name=column_name, data_type=data_type_formatted)
    cursor.execute(sql_command)
    connection.commit()
    connection.close()
I've recently had this same issue so I took a point from AlexP in an earlier answer. The problem was in getting the new column into my program's metadata. Using sqlAlchemy's append_column functionality had some unexpected downstream effects ('str' object has no attribute 'dialect impl'). I corrected this by adding the column with DDL (MySQL database in this case) and then reflecting the table back from the DB into my metadata.
The code is roughly as follows (modified slightly from what I have, to reduce it to its minimal essence; I apologize for any mistakes, which should be minor if present):
try:
    # Use back quotes as a protection against SQL Injection Attacks. Can we do more?
    common.qry_engine.execute('ALTER TABLE %s ADD COLUMN %s %s' %
                              ('`' + self.tbl.schema + '`.`' + self.tbl.name + '`',
                               '`' + self.outputs[new_col] + '`', 'VARCHAR(50)'))
except exc.SQLAlchemyError as msg:
    raise GRError(desc='Unable to physically add derived column to table. Contact support.',
                  data=str(self.outputs), other_info=str(msg))
try:  # Refresh the metadata to show the new column
    self.tbl = sqlalchemy.Table(self.tbl.name, self.tbl.metadata, extend_existing=True, autoload=True)
except exc.SQLAlchemyError as msg:
    raise GRError(desc='Unable to establish metadata for new column. Contact support.',
                  data=str(self.outputs), other_info=str(msg))
Yes you can
Install sqlalchemy-migrate (pip install sqlalchemy-migrate) and use it in your script to call Table and Column create() method:
from sqlalchemy import String, MetaData, create_engine
from migrate.versioning.schema import Table, Column
db_engine = create_engine(app.config.get('SQLALCHEMY_DATABASE_URI'))
db_meta = MetaData(bind=db_engine)
table = Table('table_name' , db_meta)
col = Column('new_column_name', String(20), default='foo')
col.create(table)
Continuing the simple approach proposed by chasmani, with a small improvement:
'''
# simple migration
# columns to add:
# last_status_change = Column(BigInteger, default=None)
# last_complete_phase = Column(String, default=None)
# complete_percentage = Column(DECIMAL, default=0.0)
'''
import sqlite3
from config import APP_STATUS_DB
from sqlalchemy import types
def add_column(database_name: str, table_name: str, column_name: str, data_type: types, default=None):
    ret = False
    if default is not None:
        try:
            float(default)
            ddl = ("ALTER TABLE '{table_name}' ADD column '{column_name}' '{data_type}' DEFAULT {default}")
        except:
            ddl = ("ALTER TABLE '{table_name}' ADD column '{column_name}' '{data_type}' DEFAULT '{default}'")
    else:
        ddl = ("ALTER TABLE '{table_name}' ADD column '{column_name}' '{data_type}'")
    sql_command = ddl.format(table_name=table_name, column_name=column_name, data_type=data_type.__name__,
                             default=default)
    try:
        connection = sqlite3.connect(database_name)
        cursor = connection.cursor()
        cursor.execute(sql_command)
        connection.commit()
        connection.close()
        ret = True
    except Exception as e:
        print(e)
        ret = False
    return ret
add_column(APP_STATUS_DB, 'procedures', 'last_status_change', types.BigInteger)
add_column(APP_STATUS_DB, 'procedures', 'last_complete_phase', types.String)
add_column(APP_STATUS_DB, 'procedures', 'complete_percentage', types.DECIMAL, 0.0)
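Because the raw-SQL helpers above build the ALTER TABLE statement by string formatting, it is worth validating the identifiers first and skipping columns that already exist. The following is a minimal sketch of that idea for SQLite; the function name add_column_if_missing, the ALLOWED_TYPES whitelist, and the identifier pattern are assumptions for illustration, not part of any answer above.

```python
import re
import sqlite3

# Hypothetical whitelist; extend it with the column types you actually need.
ALLOWED_TYPES = {"INTEGER", "REAL", "TEXT", "VARCHAR(100)"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def add_column_if_missing(conn, table_name, column_name, sql_type):
    """Add a column unless it already exists. Returns True if it was added."""
    # DDL cannot take bound parameters, so the names are validated before formatting.
    if not (IDENTIFIER.fullmatch(table_name) and IDENTIFIER.fullmatch(column_name)):
        raise ValueError("unsafe table or column name")
    if sql_type not in ALLOWED_TYPES:
        raise ValueError("unsupported column type")
    existing = [row[1] for row in conn.execute(f'PRAGMA table_info("{table_name}")')]
    if column_name in existing:
        return False
    conn.execute(f'ALTER TABLE "{table_name}" ADD COLUMN "{column_name}" {sql_type}')
    conn.commit()
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE procedures (id INTEGER PRIMARY KEY)")
first = add_column_if_missing(conn, "procedures", "last_status_change", "INTEGER")
second = add_column_if_missing(conn, "procedures", "last_status_change", "INTEGER")
columns = [row[1] for row in conn.execute('PRAGMA table_info("procedures")')]
print(first, second, columns)
```

Making the call idempotent means the migration script can be re-run safely; remember to add the column to the SQLAlchemy model as well.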
If using docker:
- go to the terminal of the container holding your DB
- get into the db: psql -U usr [YOUR_DB_NAME]
- now you can alter tables using raw SQL: alter table [TABLE_NAME] add column [COLUMN_NAME] [TYPE]
Note you will need to have mounted your DB for the changes to persist between builds.
Adding the column "manually" (not using python or SQLAlchemy) is perhaps the easiest?
Same problem over here. What I will do is iterate over the db and add each entry to a new database with the extra column, then delete the old db and rename the new one to take its place.
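The copy-and-rename approach can also be done at table level inside a single database. The sketch below shows it with the standard-library sqlite3 module; the table games and its columns are borrowed loosely from an earlier answer and the rows are invented, so treat the names as placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY, home TEXT)")
conn.executemany("INSERT INTO games (id, home) VALUES (?, ?)", [(1, "A"), (2, "B")])
conn.commit()

with conn:  # one transaction: either the whole rebuild happens or none of it
    # 1. Create the replacement table with the extra column.
    conn.execute("CREATE TABLE games_new (id INTEGER PRIMARY KEY, home TEXT, q4 REAL)")
    # 2. Copy every existing row; the new column is left NULL.
    conn.execute("INSERT INTO games_new (id, home) SELECT id, home FROM games")
    # 3. Drop the old table and give the new one its name.
    conn.execute("DROP TABLE games")
    conn.execute("ALTER TABLE games_new RENAME TO games")

rows = conn.execute("SELECT id, home, q4 FROM games ORDER BY id").fetchall()
print(rows)
```

For simply adding a column, ALTER TABLE ... ADD COLUMN is less work; a rebuild like this is mainly useful when the change is one that ALTER TABLE cannot express.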
