I have multiple unstructured txt files in a directory and I want to insert all of them into MySQL; basically, the entire content of each text file should go into one row. The table has two columns: ID (auto increment) and LastName (nvarchar(45)). I used Python to connect to MySQL and LOAD DATA LOCAL INFILE to insert the whole content, but when I run the code I see messages in the Python console, and when I check MySQL I see nothing but a bunch of empty rows with automatically generated IDs.
Here is the code:
import MySQLdb
import sys
import os

result = os.listdir("C:\\Users\\msalimi\\Google Drive\\s\\Discharge_Summary")
for x in result:
    db = MySQLdb.connect("localhost", "root", "Pass", "myblog")
    cursor = db.cursor()
    file1 = os.path.join(r'C:\\Discharge_Summary\\' + x)
    cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE clamp_test" % (file1,))
    db.commit()
db.close()
Can someone please tell me what is wrong with the code? What is the right way to achieve my goal?
I edited my code with:
cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE clamp_test LINES TERMINATED BY '\r' (Lastname) SET id = NULL" % (file1,))
and it worked :)
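For reference, here is a consolidated sketch of the working version. This is a sketch under assumptions, not a verbatim fix: it assumes the server permits LOAD DATA LOCAL (the client must also pass local_infile=1) and that the table is clamp_test with the id and LastName columns described above.

import MySQLdb
import os

src_dir = r"C:\Users\msalimi\Google Drive\s\Discharge_Summary"

# local_infile=1 enables LOAD DATA LOCAL INFILE on the client side;
# the server's local_infile setting must allow it too.
db = MySQLdb.connect("localhost", "root", "Pass", "myblog", local_infile=1)
cursor = db.cursor()

for name in os.listdir(src_dir):
    # MySQL accepts forward slashes in paths, which sidesteps backslash escaping.
    path = os.path.join(src_dir, name).replace("\\", "/")
    # Passing the path as a query parameter lets the driver quote it safely.
    cursor.execute(
        "LOAD DATA LOCAL INFILE %s INTO TABLE clamp_test "
        "LINES TERMINATED BY '\r' (LastName) SET id = NULL",
        (path,))

db.commit()
db.close()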
I am trying to set up a Python script that fetches some data and stores it in an SQLite database. However, when I run the script a .fuse_hidden file is created.
On Windows no .fuse_hidden file is observed, but on Ubuntu one is generated on each call. The .fuse_hidden file seems to contain some form of SQL query with input and tables.
I can delete the files without error during runtime, but they are not deleted automatically. I make sure to close my connection to the DB when I am finished with the query.
lsof gives no information.
I am out of ideas on what to try next to get the files removed automatically. Any suggestions?
Testing
To confirm that there is nothing wrong with the code, I made a simple script
(Assume there is an empty error.db)
import sqlite3

conn = sqlite3.connect("error.db")
cur = conn.cursor()

create_query = """
CREATE TABLE Errors (
    name TEXT
);"""

try:
    cur.execute(create_query)
except:
    pass

cur.execute("INSERT INTO Errors (name) VALUES(?)", ["Test2"])
conn.commit()
cur.close()
conn.close()
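One thing worth ruling out (an assumption on my part, not a confirmed fix): FUSE keeps a deleted-but-still-open file around as .fuse_hidden until the last handle to it is closed, so a connection that is not closed promptly could explain the leftovers. A minimal sketch that guarantees the connection is closed even on errors:

import sqlite3
from contextlib import closing

# closing() calls conn.close() even if an exception is raised; the inner
# "with conn" block commits on success and rolls back on failure.
with closing(sqlite3.connect("error.db")) as conn:
    with conn:
        conn.execute("INSERT INTO Errors (name) VALUES (?)", ["Test2"])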
I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL into a pandas DataFrame by executing long SQL queries from my Python script.
Most of my SQL queries are long, with multiple interim temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get an error saying:
"pandas.io.sql.DatabaseError: Execution failed on sql"
even though the same query runs absolutely fine in SQL Server Management Studio.
I suspect this is due to the interim temp tables, because if I split the long query into two pieces (everything before the final SELECT in the first section, the final SELECT in the second) and run the two sections sequentially, they run fine.
Can someone explain why this happens, or alternatively, what is the best way to run long queries with temp tables/views and retrieve the results in a pandas DataFrame?
Here is my sample Python code that should take a file name as input and run the SQL to retrieve the results in a DataFrame; however, it fails for a query with temp tables:
import pyodbc as db
import pandas as pd

filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver = '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'

conn = db.connect('DRIVER=' + driver + ';PORT=1433;SERVER=' + server +
                  ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)

fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()

sqlcommand1 = sqlfile
df_table = pd.read_sql(sqlcommand1, conn)
If I break my SQL query into two pieces (one with all the temp tables and a second with the final SELECT), it runs fine. Below is a modified script that splits the long query at '/**/', and it works fine:
"""
This Function Reads a SQL Script From an Extrenal File and Executes The
Script in SQL. If The SQL Script Has Bunch of Tem Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Esnure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+'; PORT=1433; SERVER='+server+';
PORT=1443; DATABASE='+database+'; UID='+username+'; PWD='+ password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL, put the line SET NOCOUNT ON at the beginning of your query.
As @Parfait mentioned above, the pandas read_sql method can only support one result set. However, when you populate a temp table in T-SQL you create a result set of the form "(XX row(s) affected)", which is what causes your original query to fail. By setting NOCOUNT ON you eliminate those early returns and only get the results from your final SELECT statement.
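A minimal sketch of that fix, reusing conn and the file-reading code from the question:

# Prepending SET NOCOUNT ON suppresses the "(N row(s) affected)" result sets
# from the temp-table statements, so pandas sees only the final SELECT.
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()

df_table = pd.read_sql("SET NOCOUNT ON;\n" + sqlfile, conn)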
Alternatively, if using a pyodbc cursor instead of pandas, you can utilize nextset() to skip the result sets from the temp table(s). More info on pyodbc here.
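A sketch of that cursor-based alternative (assumption: the row-count-only results carry no column description, so we advance until one appears):

cur = conn.cursor()
cur.execute(sqlfile)  # full script: temp tables plus the final SELECT

# Skip result sets that have no columns (the row-count messages).
while cur.description is None:
    if not cur.nextset():
        break

rows = cur.fetchall()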
I'm not experienced in Python.
How can I import values from files outside this one and use them in an SQL query? Here is my Python code:
#!/usr/bin/env python
import MySQLdb
import Stamdata
from Stamdata import Varmekurve

K = Varmekurve
print K  # this works, and the value 1.5 from Varmekurve is printed

# Open database connection
db = MySQLdb.connect("localhost", "root", "Codename", "MyDvoDb")

# Prepare a cursor object using the cursor method
cursor = db.cursor()

# Get SetTemp from SQL.
# Here I would like to use the value from Varmekurve instead of '1.5', and the
# reading from a DS18B20 temperature sensor instead of '15'. The DS18B20 sensor
# is located at '/sys/bus/w1/devices/28-0316007914ff/w1_slave'.
sql = "SELECT SetTemp FROM varmekurver WHERE kurvenummer = '1.5' AND TempSensor = '15'"
cursor.execute(sql)

results = cursor.fetchall()
for row in results:
    print row[0]

db.close()
Only the Stamdata file is in the same directory.
The script shall control a motor valve by reading SetTemp and opening/closing a mixing valve if the temperature is too high or low (within 2-3 degrees).
But I haven't come that far yet :0)
To dynamically insert values into the string from variables, do:
"SELECT SetTemp FROM varmekurver WHERE kurvenummer = '{}' AND TempSensor = '{}'".format(val1, val2)
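Safer than string formatting is to let the driver substitute the values (a sketch; val1 and val2 stand for the curve number and the sensor reading):

sql = "SELECT SetTemp FROM varmekurver WHERE kurvenummer = %s AND TempSensor = %s"
cursor.execute(sql, (val1, val2))  # MySQLdb quotes and escapes the values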
If you want to import these values from an external source, like a flat file, you can do it in a number of ways, for example using Pandas.
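For the sensor reading specifically, here is a minimal sketch of parsing the w1_slave file from the question, assuming the standard DS18B20 format where the second line ends in t=<thousandths of a degree C>:

def read_ds18b20(path='/sys/bus/w1/devices/28-0316007914ff/w1_slave'):
    with open(path) as f:
        lines = f.read().splitlines()
    # The second line ends with e.g. "t=21562" (millidegrees Celsius).
    millideg = int(lines[1].split('t=')[1])
    return millideg / 1000.0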
I am trying to load rows of data into Postgres in a CSV-like structure using copy_from (the psycopg2 function that wraps the Postgres COPY command). My data is delimited with commas (and unfortunately, since I am not the data owner, I cannot just change the delimiter). I run into a problem when I try to load a row that has a quoted value containing a comma (i.e. a comma that should not be treated as a delimiter).
For example, this row of data is fine:
",Madrid,SN,,SEN,,,SN,173,157"
This row of data is not fine:
","Dominican, Republic of",MC,,YUO,,,MC,65,162",
Some code:
conn = get_psycopg_conn()
cur = conn.cursor()
_io_buffer.seek(0) #This buffer is holding the csv-like data
cur.copy_from(_io_buffer, str(table_name), sep=',', null='', columns=column_names)
conn.commit()
It looks like copy_from doesn't expose the CSV mode or quoting options that are available from the underlying PostgreSQL COPY command, so you'll need to either patch psycopg2 to add them or use copy_expert.
I haven't tried it, but something like
curs.copy_expert("""COPY mytable FROM STDIN WITH (FORMAT CSV)""", _io_buffer)
might be sufficient.
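If you need the explicit column list from the question's copy_from call, it can be folded into the COPY statement; a sketch reusing table_name, column_names, and _io_buffer from the question's code:

# Build "COPY table (col1, col2, ...) FROM STDIN WITH (FORMAT CSV)"
sql = "COPY {} ({}) FROM STDIN WITH (FORMAT CSV)".format(
    table_name, ', '.join(column_names))
cur.copy_expert(sql, _io_buffer)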
I had this same error and was able to get close to a fix based on the single line of code listed by Craig Ringer. The other thing I needed was to quote the initial object by using df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','), specifically quoting=csv.QUOTE_NONNUMERIC.
The full example of pulling one data source from MySQL and storing it in Postgres is below:
# run in python 3.6
import MySQLdb
import psycopg2
from io import StringIO
import pandas as pd
import csv

mysql_db = MySQLdb.connect(host="host_address",  # your host, usually localhost
                           user="user_name",     # your username
                           passwd="source_pw",   # your password
                           db="source_db")       # name of the database

postgres_db = psycopg2.connect(
    "host=dest_address dbname=dest_db_name user=dest_user password=dest_pw")

my_list = ['1', '2', '3', '4']

# You must create a Cursor object. It will let you execute all the queries you need.
mysql_cur = mysql_db.cursor()
postgres_cur = postgres_db.cursor()

for item in my_list:
    # Pull cbi data for each state and write it to postgres
    print(item)
    mysql_sql = 'select * from my_table t \
                 where t.important_feature = \'' + item + '\';'
    # Do something to create your dataframe here...
    df = pd.read_sql_query(mysql_sql, mysql_db)

    # Initialize a string buffer
    sio = StringIO()
    # Write the DataFrame as CSV; QUOTE_NONNUMERIC puts quotes around strings
    sio.write(df.to_csv(index=False, header=False, quoting=csv.QUOTE_NONNUMERIC, sep=','))
    sio.seek(0)  # Be sure to reset the position to the start of the stream

    # Copy the string buffer to the database, as if it were an actual file
    with postgres_db.cursor() as c:
        c.copy_expert("COPY schema.new_table FROM STDIN WITH (FORMAT CSV)", sio)
    postgres_db.commit()

mysql_db.close()
postgres_db.close()
My simple test code is listed below. I have already created the table and can query it using the SQLite Manager add-on in Firefox, so I know the table and data exist. But when I run the query in Python (and in the Python shell) I get a "no such table" error.
def TroyTest(self, acctno):
    conn = sqlite3.connect('TroyData.db')
    curs = conn.cursor()
    v1 = curs.execute('''
        SELECT acctvalue
        FROM balancedata
        WHERE acctno = ? ''', acctno)
    print v1
    conn.close()
When you pass SQLite a non-existent path, it will happily create a new database for you instead of telling you that the file did not exist. That new database is empty, so you get a "no such table" error instead.
You are using a relative path to the database, meaning it will try to open the database in the current working directory, and that is probably not where you think it is.
The remedy is to use an absolute path instead:
conn = sqlite3.connect('/full/path/to/TroyData.db')
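If the database is meant to live next to the script, one option (a sketch, assuming that layout) is to build the path from the script's own location so the current working directory no longer matters:

import os
import sqlite3

# Resolve TroyData.db relative to this script's directory, not the CWD.
db_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'TroyData.db')
conn = sqlite3.connect(db_path)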
You need to loop over the cursor to see results:
curs.execute('''
    SELECT acctvalue
    FROM balancedata
    WHERE acctno = ? ''', (acctno,))
for row in curs:
    print row[0]
or call fetchone():
print curs.fetchone() # prints whole row tuple
The problem is the SQL statement: you must specify the DB name and then the table name:
'''SELECT * FROM db_name.table_name WHERE acctno = ? '''
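Note that in SQLite the db_name.table_name form refers to an attached database, so this only applies if a second database file is attached; a sketch of what that looks like (otherdb is a hypothetical name):

conn = sqlite3.connect('/full/path/to/TroyData.db')
conn.execute("ATTACH DATABASE '/full/path/to/other.db' AS otherdb")
rows = conn.execute(
    "SELECT acctvalue FROM otherdb.balancedata WHERE acctno = ?",
    (acctno,)).fetchall()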