My MS SQL DB table has the following structure:
create table TEMP
(
MyXMLFile XML
)
Using Python, I am trying to load a locally stored .xml file into the MS SQL DB (no XML parsing required).
Following is Python code:
import pyodbc
import xlrd
import xml.etree.ElementTree as ET
print("Connecting..")
# Establish a connection between Python and SQL Server
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=TEST;'
                      'Database=test;'
                      'Trusted_Connection=yes;')
print("DB Connected..")
# Get XMLFile
XMLFilePath = open('C:\\HelloWorld.xml')
# Create Table in DB
CreateTable = """
create table test.dbo.TEMP
(
XBRLFile XML
)
"""
# execute create table
cursor = conn.cursor()
try:
    cursor.execute(CreateTable)
    conn.commit()
except pyodbc.ProgrammingError:
    pass
print("Table Created..")
InsertQuery = """
INSERT INTO test.dbo.TEMP (
XBRLFile
) VALUES (?)"""
# Assign values from each row
values = (XMLFilePath)
# Execute SQL Insert Query
cursor.execute(InsertQuery, values)
# Commit the transaction
conn.commit()
# Close the database connection
conn.close()
But the code stores the XML path in the MyXMLFile column, not the XML content. I looked at the lxml library and other tutorials, but I did not come across a straightforward approach to storing the file.
Can anyone please help me with this? I have just started working with Python.
Here is a solution to load an .xml file directly into the MS SQL DB using Python.
import pyodbc
from lxml import etree  # lxml provides the etree.parse/etree.tostring used below
print("Connecting..")
# Establish a connection between Python and SQL Server
conn = pyodbc.connect('Driver={SQL Server};'
'Server=TEST;'
'Database=test;'
'Trusted_Connection=yes;')
print("DB Connected..")
# Get XMLFile
XMLFilePath = open('C:\\HelloWorld.xml')
x = etree.parse(XMLFilePath)  # Updated code line: parse the file into an XML tree
with open("FileName", "wb") as f:  # Updated code line
    f.write(etree.tostring(x))  # Updated code line
# Create Table in DB
CreateTable = """
create table test.dbo.TEMP
(
XBRLFile XML
)
"""
# execute create table
cursor = conn.cursor()
try:
    cursor.execute(CreateTable)
    conn.commit()
except pyodbc.ProgrammingError:
    pass
print("Table Created..")
InsertQuery = """
INSERT INTO test.dbo.TEMP (
XBRLFile
) VALUES (?)"""
# Assign values from each row
values = (etree.tostring(x),)  # Updated code line: the serialised XML content, not the file path
# Execute SQL Insert Query
cursor.execute(InsertQuery, values)
# Commit the transaction
conn.commit()
# Close the database connection
conn.close()
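As a follow-up: if no XML processing is needed at all, the parse/serialise round-trip can be skipped entirely; reading the file contents and passing the string as the parameter is enough. A minimal sketch, reusing the connection and table from the answer above:
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=TEST;'
                      'Database=test;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()
# read the raw file contents instead of passing the file object
with open('C:\\HelloWorld.xml', 'r') as f:
    xml_content = f.read()
cursor.execute("INSERT INTO test.dbo.TEMP (XBRLFile) VALUES (?)", (xml_content,))
conn.commit()
conn.close()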
I am using Python to establish a DB connection and read a CSV file. For each line in the CSV I want to run a PostgreSQL query and get the value corresponding to that line.
The DB connection and file reading are working fine. Also, if I run the query for a hardcoded value then it works fine. But if I try to run the query for each row in the CSV file using a Python variable, I do not get the correct value.
cursor.execute("select team from users.teamdetails where p_id = '123abc'")
The query above works fine,
but when I try it with multiple values fetched from the CSV file, I do not get the correct value:
cursor.execute("select team from users.teamdetails where p_id = queryPID")
Complete code for reference:
import psycopg2
import csv
conn = psycopg2.connect(dbname='', user='', password='', host='', port='')
cursor = conn.cursor()
with open('playerid.csv','r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for line in csv_reader:
        queryPID = line[0]
        cursor.execute("select team from users.teamdetails where p_id = queryPID")
        team = cursor.fetchone()
        print(team[0])
conn.close()
DO NOT concatenate the csv data. Use a parameterised query.
Use %s inside your string, then pass the additional variable:
cursor.execute('select team from users.teamdetails where p_id = %s', (queryPID,))
Concatenation of text leaves your application vulnerable to SQL injection.
https://www.psycopg.org/docs/usage.html
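Putting that together with the original loop, a corrected version might look like this (a sketch based on the question's table and CSV file; the %s placeholder lets psycopg2 quote the value safely):
import psycopg2
import csv

conn = psycopg2.connect(dbname='', user='', password='', host='', port='')
cursor = conn.cursor()
with open('playerid.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for line in csv_reader:
        queryPID = line[0]
        # parameterised query: the value is passed separately, never concatenated
        cursor.execute('select team from users.teamdetails where p_id = %s', (queryPID,))
        team = cursor.fetchone()
        if team:  # fetchone() returns None when no row matches
            print(team[0])
conn.close()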
I want to load OSM data into a PostgreSQL database using a Python script. When I try this Python script, it doesn't load the OSM data into the database. Can anyone guide me? I know how to load data into PostgreSQL using osmosis, but currently I am looking for something in Python.
import psycopg2
from osgeo import ogr
# connect to the database
connection = psycopg2.connect(user="postgres",
password="password",
host="localhost",
database="example")
# create cursor
cursor = connection.cursor()
cursor.execute("DROP TABLE IF EXISTS trial1")
cursor.execute("CREATE TABLE trial1 (id SERIAL PRIMARY KEY, geom Geometry)")
cursor.execute("CREATE INDEX trial1_index ON trial1 USING GIST(geom)")
print("Successfully created ")
connection.commit()
# define OSMfile path
osm = ogr.Open("pedestrian.osm")
layer = osm.GetLayer(1)
# delete the existing contents of the table
cursor.execute("DELETE FROM trial1")
print(str(layer.GetFeatureCount()))
for i in range(layer.GetFeatureCount()):
    feature = layer.GetFeature(i)
    # Get feature geometry
    geometry = feature.GetGeometryRef()
    wkt = geometry.ExportToWkt()
    # Insert data into database
    cursor.execute("INSERT INTO trial1 (geom) VALUES (ST_GeomFromText(" + "'" + wkt + "', 4326))")
    print("Data inserted successfully")
connection.commit()
Expected result: this code should create a table in the database and also import data into it from the given file. Currently the code just creates the table; after table creation it doesn't give an error, it just says "Process finished with exit code 0".
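Two things in this snippet are worth checking. GetLayer(1) selects the OSM driver's "lines" layer (points are layer 0), and the driver uses OSM ids as feature ids, so GetFeature(i) with sequential indices from range() typically returns nothing, which would explain the empty table. A hedged sketch of the loop, iterating the layer directly and using a parameterised insert (connection details and names taken from the question):
import psycopg2
from osgeo import ogr

connection = psycopg2.connect(user="postgres", password="password",
                              host="localhost", database="example")
cursor = connection.cursor()

osm = ogr.Open("pedestrian.osm")
layer = osm.GetLayer(1)  # 1 = lines; use 0 if the point features are wanted

# iterate the layer instead of indexing into it with GetFeature(i)
for feature in layer:
    geometry = feature.GetGeometryRef()
    wkt = geometry.ExportToWkt()
    # parameterised insert; also avoids quoting problems in the WKT string
    cursor.execute("INSERT INTO trial1 (geom) VALUES (ST_GeomFromText(%s, 4326))", (wkt,))

connection.commit()
connection.close()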
I'm trying to update a MySQL table based on my CSV data, where the sha1 in my CSV should update or insert the suggestedName on duplicate. What part am I doing wrong here? It gives me this error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'where sha1=#col1' at line 1
Here is my table structure:
date_sourced, sha1, suggested, vsdt, trendx, falcon, notes, mtf
CSV structure:
SHA1,suggestedName
Code:
import mysql.connector
mydb = mysql.connector.connect(user='root', password='',
                               host='localhost', database='jeremy_db')
cursor = mydb.cursor()
query = "LOAD DATA INFILE %s IGNORE INTO TABLE jeremy_table_test FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (#col1,#col2) set suggested=#col2 where sha1=#col1"
cursor.execute(query, (fullPath))
mydb.commit()
LOAD DATA INFILE cannot take a condition like that. You can try reading the file through pandas and then inserting the values into the table, but you need to set up a unique index on sha1 in advance; otherwise the script below will not work (reason).
import pandas as pd
import mysql.connector as mysql
path = "1.xls"
df = pd.read_excel(path)
_sha1 = df["SHA1"].tolist()
_suggestedName = df["suggestedName"].tolist()
conn = mysql.connect(user="xx",passwd="xx",db="xx")
cur = conn.cursor()
sql = """INSERT INTO jeremy_table_test (sha1,suggested) VALUES (%s,%s) ON DUPLICATE KEY UPDATE suggested=VALUES(suggested)"""
try:
    cur.executemany(sql, list(zip(_sha1, _suggestedName)))
    conn.commit()
except Exception as e:
    conn.rollback()
    raise e
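For the unique index the answer depends on, something like this would set it up in advance (a sketch; the index name uniq_sha1 is arbitrary, the table and column names come from the question):
import mysql.connector as mysql

conn = mysql.connect(user="xx", passwd="xx", db="xx")
cur = conn.cursor()
# without a UNIQUE key on sha1, ON DUPLICATE KEY UPDATE never triggers
cur.execute("ALTER TABLE jeremy_table_test ADD UNIQUE INDEX uniq_sha1 (sha1)")
conn.commit()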
I am on a Linux platform with a Cassandra database. I want to insert image data into the Cassandra database using Python code from a remote server. Previously, I had written Python code that inserts images' data into a MySQL database from a remote server. Please see the MySQL code below:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import MySQLdb as mdb
import psycopg2
import sys
import MySQLdb
def read_image(i):
    filename = "/home/faban/Downloads/Python/Python-Mysql/images/im"
    filename = filename + str(i) + ".jpg"
    print(filename)
    fin = open(filename)
    img = fin.read()
    return img
con = MySQLdb.connect("192.168.50.12","root","faban","experiments" )
with con:
    print('connecting to database')
    range_from = input('Enter range from:')
    range_till = input('Enter range till:')
    for i in range(range_from, range_till):
        cur = con.cursor()
        data = read_image(i)
        cur.execute("INSERT INTO images VALUES(%s, %s)", (i, data,))
        cur.close()
con.commit()
con.close()
This code successfully inserts data into the MySQL database which is located at .12.
I want to modify the same code to insert data into the Cassandra database, which is also located at .12.
Please help me out in this regard.
If I create a simple table like this:
CREATE TABLE stackoverflow.images (
name text PRIMARY KEY,
data blob);
I can load those images with Python code that is similar to yours, but with some minor changes to use the DataStax Python Cassandra driver (pip install cassandra-driver):
#imports for DataStax Cassandra driver and sys
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.query import SimpleStatement
import sys
#reading my hostname, username, and password from the command line; defining my Cassandra keyspace as a variable
hostname=sys.argv[1]
username=sys.argv[2]
password=sys.argv[3]
keyspace="stackoverflow"
#adding my hostname to an array, setting up auth, and connecting to Cassandra
nodes = []
nodes.append(hostname)
auth_provider = PlainTextAuthProvider(username=username, password=password)
ssl_opts = {}
cluster = Cluster(nodes,auth_provider=auth_provider,ssl_options=ssl_opts)
session = cluster.connect(keyspace)
#setting my image name, loading the file, and reading the data
name = "IVoidWarranties.jpg"
fileHandle = open("/home/aploetz/Pictures/" + name, 'rb')  # binary mode so the bytes are read unchanged
imgData = fileHandle.read()
#preparing and executing my INSERT statement
strCQL = "INSERT INTO images (name,data) VALUES (?,?)"
pStatement = session.prepare(strCQL)
session.execute(pStatement,[name,imgData])
#closing my connection
session.shutdown()
Hope that helps!
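To sanity-check the insert, the blob can be read back with a second prepared statement (a sketch that reuses the session and name variables from the answer above; the verification step itself is hypothetical):
# fetch the image back and check its size
strCQL = "SELECT data FROM images WHERE name=?"
pStatement = session.prepare(strCQL)
for row in session.execute(pStatement, [name]):
    print("stored %d bytes for %s" % (len(row.data), name))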
I have a .sql file with multiple insert statements (1000+) and I want to run the statements in this file against my Oracle database.
For now, I'm using Python with ODBC to connect to my database, with the following:
import pyodbc
import sys
from ConfigParser import SafeConfigParser

def db_call(cfgFile, sql):
    parser = SafeConfigParser()
    parser.read(cfgFile)
    dsn = parser.get('odbc', 'dsn')
    uid = parser.get('odbc', 'user')
    pwd = parser.get('odbc', 'pass')
    try:
        con = pyodbc.connect('DSN=' + dsn + ';PWD=' + pwd + ';UID=' + uid)
        cur = con.cursor()
        cur.execute(sql)
        con.commit()
    except pyodbc.DatabaseError, e:
        print 'Error %s' % e
        sys.exit(1)
    finally:
        if con and cur:
            cur.close()
            con.close()
with open('theFile.sql','r') as f:
    cfgFile = 'c:\\dbinfo\\connectionInfo.cfg'
    # here goes the code to insert the contents into the database using db_call_many
    statements = f.read()
    db_call(cfgFile, statements)
But when I run it, I receive the following error:
pyodbc.Error: ('HY000', '[HY000] [Oracle][ODBC][Ora]ORA-00911: invalid character\n (911) (SQLExecDirectW)')
But the entire content of the file is only:
INSERT INTO table (movie,genre) VALUES ('moviename','horror');
Edit
Adding print '<{}>'.format(statements) before the db_call(cfgFile,statements) call, I get these results (100+):
<INSERT INTO table (movie,genre) VALUES ('moviename','horror');INSERT INTO table (movie,genre) VALUES ('moviename_b','horror');INSERT INTO table (movie,genre) VALUES ('moviename_c','horror');>
Thanks for your time reading this.
Now it's somewhat clarified - you have a lot of separate SQL statements, such as INSERT INTO table (movie,genre) VALUES ('moviename','horror');
You're effectively after something like cur.executescript() rather than a single execute() (I have no idea whether pyodbc supports that part of the DB API). Is there any reason you can't just send the statements to the database one at a time?
When you read a file using the read() function, the end-of-line character (\n) at the end of the file is read too. I think you should use db_call(cfgFile, statements[:-1]) to eliminate the trailing newline.
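Building on both answers, here is a minimal sketch of the per-statement approach, assuming the statements are ';'-separated as the printed output above shows, and reusing the question's db_call helper and cfgFile variable:
with open('theFile.sql', 'r') as f:
    contents = f.read()

# split on ';', dropping whitespace-only fragments such as the trailing newline;
# each statement is then sent without the ';' that triggers ORA-00911
statements = [s.strip() for s in contents.split(';') if s.strip()]
for statement in statements:
    db_call(cfgFile, statement)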