I have a Python script that reads a CSV file and inserts its contents into a MySQL database.
When I run the script, this statement produces a SQL error:
insert_sql = "INSERT INTO billing_info
(InvoiceId, PayerAccountId, LinkedAccountId,
RecordType, RecordId, ProductName, RateId,
SubscriptionId, PricingPlanId, UsageType, Operation,
AvailabilityZone, ReservedInstance, ItemDescription,
UsageStartDate, UsageEndDate, UsageQuantity,
BlendedRate, BlendedCost, UnBlendedRate, UnBlendedCost,
ResourceId, Engagement, Name, Owner, Parent)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
The script prints the row and then fails with this error:
['198715713', '188087670762', '200925899732', 'LineItem', '200176724883754717735329322', 'AWS CloudTrail', '300551402', '214457068', '34915176', 'EUC1-FreeEventsRecorded', 'None', '', 'N', '0.0 per free event recorded in EU (Frankfurt) region', '2019-03-01 00:00:00', '2019-03-01 01:00:00', '4.0000000000', '0.0000000000', '0.0000000000', '0.0000000000', '0.0000000000', '', '', '', '', '']
Exception: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, ' at line 1
But when I take that exact SQL statement that the error is complaining about, and insert it into the database myself, it works!
mysql> INSERT INTO billing_info (InvoiceId, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName, RateId, SubscriptionId, PricingPlanId, UsageType, Operation, AvailabilityZone, ReservedInstance, ItemDescription, UsageStartDate, UsageEndDate, UsageQuantity, BlendedRate, BlendedCost, UnBlendedRate, UnBlendedCost, ResourceId, Engagement, Name, Owner, Parent) VALUES ('198715713', '188087670762', '200925899732', 'LineItem', '200176724883754717735329322', 'AWS CloudTrail', '300551402', '214457068', '34915176', 'EUC1-FreeEventsRecorded', 'None', '', 'N', '0.0 per free event recorded in EU (Frankfurt) region', '2019-03-01 00:00:00', '2019-03-01 01:00:00', '4.0000000000', '0.0000000000', '0.0000000000', '0.0000000000', '0.0000000000', '', '', '', '', '');
Query OK, 1 row affected (0.01 sec)
What the heck could be going on?
Here is all of my code (it's very small):
#!/usr/bin/env python3
import os
import mysql.connector
import csv
source = os.path.join('source_files', 'aws_bills', 'march-bill-original-2019.csv')
destination = os.path.join('output_files', 'aws_bills', 'test_out.csv')
mydb = mysql.connector.connect(user='xxx', password='xxx',
                               host='xxx',
                               database='aws_bill')
cursor = mydb.cursor()
def read_csv_to_sql():
    try:
        with open(source) as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=',')
            next(csv_reader)
            insert_sql = "INSERT INTO billing_info (InvoiceId, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName, RateId, SubscriptionId, PricingPlanId, UsageType, Operation, AvailabilityZone, ReservedInstance, ItemDescription, UsageStartDate, UsageEndDate, UsageQuantity, BlendedRate, BlendedCost, UnBlendedRate, UnBlendedCost, ResourceId, Engagement, Name, Owner, Parent) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
            for row in csv_reader:
                print(row)
                cursor.execute(insert_sql.format(row))
            mydb.commit()
            print("Done importing data.")
    except Exception as e:
        print("Exception:", e)
        mydb.rollback()
    finally:
        mydb.close()
def main():
    read_csv_to_sql()
if __name__ == "__main__":
    main()
What am I doing wrong?
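One likely cause, sketched here under assumptions: str.format only substitutes {} fields, so insert_sql.format(row) returns the statement unchanged and the literal %s markers reach the server. That would match the "near '%s, %s, ..." text in the error. The shortened two-column statement and row below are made up for illustration; with mysql.connector the usual approach is to pass the row as the second argument to cursor.execute so the driver binds the values.

```python
# Shortened, hypothetical version of the statement from the question.
insert_sql = "INSERT INTO billing_info (InvoiceId, PayerAccountId) VALUES (%s, %s)"
row = ['198715713', '188087670762']

# str.format() only replaces {} fields; there are none here, so the
# statement comes back unchanged, with the %s markers still in it.
formatted = insert_sql.format(row)
print(formatted == insert_sql)  # True

# What the script probably wants instead: hand the values to the driver
# as the second argument and let it bind them (needs a live connection):
#     cursor.execute(insert_sql, row)
```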
I have a python script that reads a large (4GB!!!) CSV file into MySQL. It works as is, but is DOG slow. The CSV file has over 4 million rows. And it is taking forever to insert all the records into the database.
Could I get an example of how I would use executemany in this situation?
Here is my code:
source = os.path.join('source_files', 'aws_bills', 'march-bill-original-2019.csv')
try:
    with open(source) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        next(csv_reader)
        insert_sql = """ INSERT INTO billing_info (InvoiceId, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName, RateId, SubscriptionId, PricingPlanId, UsageType, Operation, AvailabilityZone, ReservedInstance, ItemDescription, UsageStartDate, UsageEndDate, UsageQuantity, BlendedRate, BlendedCost, UnBlendedRate, UnBlendedCost, ResourceId, Engagement, Name, Owner, Parent) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) """
        #for row in csv_reader:
        for row_idx, row in enumerate(csv_reader):
            try:
                cursor.execute(insert_sql,row)
                #cursor.executemany(insert_sql, 100)
                mydb.commit()
                print('row', row_idx, 'inserted with LinkedAccountId', row[2], 'at', datetime.now().isoformat())
            except Exception as e:
                print("MySQL Exception:", e)
        print("Done importing data.")
Again, that code works to insert the records into the database. But I am hoping to speed this up with executemany if I can get an example of how to do that.
Good Night
I saw that the question is a little old and I don't know if you still need it.
I did something similar recently. I first converted the CSV into a list of rows, because executemany expects a sequence of parameter sequences, and then passed the INSERT statement together with that list. In your case it would look like this:
import pandas as pd
df = pd.read_csv(r'path_your_csv')
df1=pd.DataFrame(df)
df1=df1.astype(str)
List_Values=df1.values.tolist()
insert_sql = """ INSERT INTO billing_info (InvoiceId, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName, RateId, SubscriptionId, PricingPlanId, UsageType, Operation, AvailabilityZone, ReservedInstance, ItemDescription, UsageStartDate, UsageEndDate, UsageQuantity, BlendedRate, BlendedCost, UnBlendedRate, UnBlendedCost, ResourceId, Engagement, Name, Owner, Parent) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) """
cursor.executemany(insert_sql, List_Values)
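With over 4 million rows, loading the whole file into a single list may use a lot of memory. A minimal sketch of an alternative that streams the rows and calls executemany on fixed-size batches follows. The helper name insert_in_batches and the batch size of 1000 are assumptions, not part of the original answer, and cursor stands for any DB-API cursor such as the mysql.connector one from the question.

```python
from itertools import islice

def insert_in_batches(cursor, insert_sql, rows, batch_size=1000):
    """Send rows to cursor.executemany() in chunks of batch_size.

    Hypothetical helper: `cursor` is any DB-API cursor and `rows` is any
    iterable of row sequences (for example a csv.reader).
    Returns the total number of rows sent.
    """
    rows = iter(rows)
    total = 0
    while True:
        batch = list(islice(rows, batch_size))  # next chunk, may be short
        if not batch:
            break
        cursor.executemany(insert_sql, batch)
        total += len(batch)
    return total

# Usage sketch (needs csv, the connection and insert_sql from the question):
#     with open(source, newline='') as csv_file:
#         csv_reader = csv.reader(csv_file, delimiter=',')
#         next(csv_reader)  # skip the header row
#         insert_in_batches(cursor, insert_sql, csv_reader)
#     mydb.commit()  # commit once at the end instead of once per row
```

Committing once (or once per batch) rather than after every row is usually a large part of the speedup, since each commit forces the server to flush.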
Can anyone help? I have no idea why it keeps returning this error:
ERROR: Not enough arguments for format string
The script reads from a CSV whose headers are named Property ID, Reference Number, etc. The only difference is that the table column names use underscores instead of spaces.
Here is my script for reference:
import csv
import pymysql
mydb = pymysql.connect(host='127.0.0.1', user='root', passwd='root', db='jupix', unix_socket="/Applications/MAMP/tmp/mysql/mysql.sock")
cursor = mydb.cursor()
with open("activeproperties.csv") as f:
    reader = csv.reader(f)
    # next(reader) # skip header
    data = []
    for row in reader:
        cursor.executemany('INSERT INTO ACTIVE_PROPERTIES(Property_ID, Reference_Number,Address_Name,Address_Number,Address_Street,Address_2,Address_3,Address_4,Address_Postcode,Owner_Contact_ID,Owner_Name,Owner_Number_Type_1,Owner_Contact_Number_1,Owner_Number_Type_2,Owner_Contact_Number_2,Owner_Number_Type_3,Owner_Contact_Number_3,Owner_Email_Address,Display_Property_Type,Property_Type,Property_Style,Property_Bedrooms,Property_Bathrooms,Property_Ensuites,Property_Toilets,Property_Reception_Rooms,Property_Kitchens,Floor_Area_Sq_Ft,Acres,Rent,Rent_Frequency,Furnished,Next_Available_Date,Property_Status,Office_Name,Negotiator,Date_Created)''VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', row)
mydb.commit()
cursor.close()
print"Imported!"
The error occurs because the statement names 37 columns but contains 38 %s placeholders. The driver fills the placeholders from the values you pass in, so with more placeholders than values the formatting step fails with "Not enough arguments for format string". Either a column is missing from your INSERT INTO list, or the statement has one %s too many.
To fix it, remove one %s from the SQL statement, or add the missing column to the column list (and to the table, if it does not exist there) so that the last value has somewhere to go.
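A quick way to catch this kind of mismatch before the query runs is to count both sides. The sketch below is an illustration only: the helper name count_columns_and_placeholders is hypothetical, and it assumes a simple INSERT INTO table(col, ...) VALUES(%s, ...) statement in which the first pair of parentheses holds the column list.

```python
import re

def count_columns_and_placeholders(sql):
    """Return (number of columns, number of %s placeholders) for a simple
    INSERT INTO table(col, ...) VALUES(%s, ...) statement."""
    # The text between the first pair of parentheses is the column list.
    match = re.search(r"\(([^)]*)\)", sql)
    columns = [c.strip() for c in match.group(1).split(",") if c.strip()]
    return len(columns), sql.count("%s")

# Made-up three-column statement with one placeholder too many.
sql = "INSERT INTO t(a, b, c) VALUES(%s, %s, %s, %s)"
n_columns, n_placeholders = count_columns_and_placeholders(sql)
print(n_columns, n_placeholders)  # 3 4
```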
When you have a large number of columns, it can be a challenge to make sure you have one %s for each column. There's an alternative syntax for INSERT that makes this easier.
Instead of this:
INSERT INTO ACTIVE_PROPERTIES(Property_ID, Reference_Number,
Address_Name, Address_Number, Address_Street, Address_2, Address_3,
Address_4, Address_Postcode, Owner_Contact_ID, Owner_Name,
Owner_Number_Type_1, Owner_Contact_Number_1, Owner_Number_Type_2,
Owner_Contact_Number_2, Owner_Number_Type_3, Owner_Contact_Number_3,
Owner_Email_Address, Display_Property_Type, Property_Type,
Property_Style, Property_Bedrooms, Property_Bathrooms, Property_Ensuites,
Property_Toilets, Property_Reception_Rooms, Property_Kitchens,
Floor_Area_Sq_Ft, Acres, Rent, Rent_Frequency, Furnished,
Next_Available_Date, Property_Status, Office_Name, Negotiator,
Date_Created)
VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
%s, %s, %s, %s, %s)
Try the following, to make it easier to match up columns with %s parameters, so you don't miscount:
INSERT INTO ACTIVE_PROPERTIES
SET Property_ID = %s,
Reference_Number = %s,
Address_Name = %s,
Address_Number = %s,
Address_Street = %s,
Address_2 = %s,
Address_3 = %s,
Address_4 = %s,
Address_Postcode = %s,
Owner_Contact_ID = %s,
Owner_Name = %s,
Owner_Number_Type_1 = %s,
Owner_Contact_Number_1 = %s,
Owner_Number_Type_2 = %s,
Owner_Contact_Number_2 = %s,
Owner_Number_Type_3 = %s,
Owner_Contact_Number_3 = %s,
Owner_Email_Address = %s,
Display_Property_Type = %s,
Property_Type = %s,
Property_Style = %s,
Property_Bedrooms = %s,
Property_Bathrooms = %s,
Property_Ensuites = %s,
Property_Toilets = %s,
Property_Reception_Rooms = %s,
Property_Kitchens = %s,
Floor_Area_Sq_Ft = %s,
Acres = %s,
Rent = %s,
Rent_Frequency = %s,
Furnished = %s,
Next_Available_Date = %s,
Property_Status = %s,
Office_Name = %s,
Negotiator = %s,
Date_Created = %s;
This is not standard SQL, but MySQL supports it. Internally it does the same thing as the VALUES form; it is simply a more convenient syntax, at least when you are inserting a single row at a time.
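If the column names already live in a Python list, the SET form can also be generated, so that each column is paired with its own %s by construction. This is a minimal sketch: the function name build_insert_set is hypothetical, and the column names are assumed to come from trusted code rather than user input, because identifiers cannot be passed as query parameters.

```python
def build_insert_set(table, columns):
    """Build an INSERT ... SET statement with one %s per column."""
    assignments = ",\n    ".join(f"{col} = %s" for col in columns)
    return f"INSERT INTO {table}\nSET {assignments}"

# First three columns from the statement above, for brevity.
columns = ["Property_ID", "Reference_Number", "Address_Name"]
sql = build_insert_set("ACTIVE_PROPERTIES", columns)
print(sql)

# The placeholder count always equals the column count:
print(sql.count("%s") == len(columns))  # True
```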
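Here is a simple example that you can adapt to your needs. It builds the column list from the dictionary keys and passes the values as query parameters, so the driver handles the quoting.
import pymysql
# Keys must be valid column names in your table; these are placeholders.
myDict = {'key-1': 193699, 'key-2': 206050, 'key-3': 0, 'key-N': 9999999}
columns = ', '.join(myDict.keys())
# One %s per value; the driver escapes and quotes the values itself.
placeholders = ', '.join(['%s'] * len(myDict))
sql = f"INSERT INTO YourTableName ({columns}) VALUES ({placeholders});"
conn = pymysql.connect(autocommit=True, host="localhost", database='DbName', user='UserName', password='PassWord')
with conn.cursor() as cursor:
    cursor.execute(sql, list(myDict.values()))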
This is my code, and I'm getting the error "not all arguments converted during string formatting" in Python 2.7. I can't understand what I am doing wrong here. Can anyone help? Thanks!
import csv
import pymysql
mydb = pymysql.connect(host='127.0.0.1', user='root', passwd='root', db='jupix', unix_socket="/Applications/MAMP/tmp/mysql/mysql.sock")
cursor = mydb.cursor()
csv_data = csv.reader(open('activeproperties.csv'))
print("Connect Success")
for row in csv_data:
    cursor.execute('INSERT INTO ALL_ACTIVE_PROPERTIES(Property ID, Reference_Number,Address_Name,Address_Number,Address_Street,Address_2,Address_3,Address_4,Address_Postcode,Owner_Contact_ID,Owner_Name,Owner_Number_Type_1,Owner_Contact_Number_1,Owner_Number_Type_2,Owner_Contact_Number_2,Owner_Number_Type_3,Owner_Contact_Number_3,Owner_Email_Address,Display_Property_Type,Property_Type,Property_Style,Property_Bedrooms,Property_Bathrooms,Property_Ensuites,Property_Toilets,Property_Reception_Rooms,Property_Kitchens,Floor_Area_Sq_Ft,Acres,Rent,Rent_Frequency,Furnished,Next_Available_Date,Property_Status,Office_Name,Negotiator,Date_Created)' 'VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', row)
mydb.commit()
cursor.close()
print"Imported!"
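For context, this message is Python's own string-formatting error, which drivers such as pymysql can surface when the number of values passed does not match the number of %s markers in the statement. Here is a small illustration with plain strings and no database involved; the two-placeholder template is made up. A common cause in a script like this is a CSV row with more fields than the statement has placeholders, though that is only an assumption about the data here.

```python
template = "VALUES(%s, %s)"  # two placeholders

def try_format(values):
    """Apply %-formatting and return the result or the error message."""
    try:
        return template % tuple(values)
    except TypeError as exc:
        return str(exc)

print(try_format(["a", "b"]))       # VALUES(a, b)
print(try_format(["a", "b", "c"]))  # not all arguments converted during string formatting
print(try_format(["a"]))            # not enough arguments for format string
```

Printing len(row) next to the number of %s markers in the statement is a quick way to check which side is off.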
I want to decode a JSON-formatted string and insert it into a MySQL database.
import json
import sys
import MySQLdb as mdb
my_json = '{"timestamp":1479132183,"sn":"B59EC63F","u":[3346,3346,3347,3346],"soc":96,"i":-32,"cc":345351,"ccMax":360000}'
parsed_stuff = json.loads(my_json)
con = None
try:
    con = mdb.connect('localhost', 'name', 'password', 'mysql')
    cur = con.cursor()
    cur.execute("SELECT * FROM database")
    cur.execute("INSERT INTO database (timestamp, sn, u0, u1, u2, u3, soc, i, cc, ccMax) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)" % (parsed_stuff['timestamp'], parsed_stuff['sn'], parsed_stuff['u'[1]], parsed_stuff['u'[2]], parsed_stuff['u'[3]], parsed_stuff['u'[4]], parsed_stuff['soc'], parsed_stuff['i'], parsed_stuff['cc'], parsed_stuff['ccMax']))
    con.commit()
except mdb.Error as e:
    print("Error %d: %s" % (e.args[0], e.args[1]))
    sys.exit(1)
if con:
    con.close()
It throws the error:
"Unknown column 'B59EC63F' in 'field list'"
It also throws:
"IndexError: string index out of range".
I changed
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
to
VALUES (%s, '%s', %s, %s, %s, %s, %s, %s, %s, %s)
and it works now.
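Quoting the placeholder works because % formatting pastes B59EC63F into the SQL text as-is, and without quotes MySQL reads it as a column name. A sketch of an alternative follows, under the assumption that the driver's own parameter binding is acceptable here: build a tuple of values and pass it as the second argument to execute, so the driver does the quoting. The sketch also unpacks the list stored under 'u' directly; the original parsed_stuff['u'[1]] indexes the one-character string 'u', which is what raises the IndexError.

```python
import json

my_json = '{"timestamp":1479132183,"sn":"B59EC63F","u":[3346,3346,3347,3346],"soc":96,"i":-32,"cc":345351,"ccMax":360000}'
parsed_stuff = json.loads(my_json)

# Unpack the list stored under 'u' rather than indexing the string 'u'.
u0, u1, u2, u3 = parsed_stuff['u']

params = (
    parsed_stuff['timestamp'], parsed_stuff['sn'],
    u0, u1, u2, u3,
    parsed_stuff['soc'], parsed_stuff['i'],
    parsed_stuff['cc'], parsed_stuff['ccMax'],
)
print(params)

# With a live connection, pass params separately instead of using %:
#     cur.execute(insert_sql, params)
# where insert_sql is the INSERT statement with plain, unquoted %s markers.
```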
I have two lists.
The first is a list of output strings from a sensor, e.g.:
output = "1,2,3,4,5,6"
output = output.split(',')
The second list holds the column headers:
columns = "date,time,name,age,etc,etc2"
columns = columns.split(',')
The order of the columns variable changes, as do the output values.
I want to upload the data to MySQL, using the columns variable to dictate the fields and the output variable for the values.
This is my code, which treats columns as a single value rather than as a list:
#!/usr/bin/python
from device import output
from device import columns
mydb = MySQLdb.connect(host="localhost", user="", passwd="", db="")
cursor = mydb.cursor()
with mydb:
    cursor.execute('INSERT INTO logging(columns) '\
                   'VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', output)
Error given:
Traceback (most recent call last):
File "./upload4.py", line 27, in <module>
'VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)', output)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1136, "Column count doesn't match value count at row 1")
Any help appreciated!
The query should be built in this way:
'INSERT INTO logging({}) VALUES({})'.format(','.join(columns), ','.join(['%s'] * len(columns)))
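Put together with the lists from the question, the idea might look like the sketch below. The ALLOWED_COLUMNS whitelist and the build_insert helper are additions that are not in the original answer: column names are spliced into the SQL text rather than bound as parameters, so checking them against known names seems a reasonable precaution when they come from a device.

```python
ALLOWED_COLUMNS = {"date", "time", "name", "age", "etc", "etc2"}  # hypothetical whitelist

def build_insert(table, columns):
    """Return an INSERT statement with one %s placeholder per column."""
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"unexpected column names: {unknown}")
    return 'INSERT INTO {}({}) VALUES({})'.format(
        table, ','.join(columns), ','.join(['%s'] * len(columns)))

columns = "date,time,name,age,etc,etc2".split(',')
output = "1,2,3,4,5,6".split(',')

query = build_insert("logging", columns)
print(query)
# INSERT INTO logging(date,time,name,age,etc,etc2) VALUES(%s,%s,%s,%s,%s,%s)

# With a live connection: cursor.execute(query, output)
```

Because the placeholder list is derived from len(columns), the column count and value count stay in step even when the column order changes.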