import csv
from cs50 import SQL

db = SQL("sqlite:///roster.db")

with open("students.csv", "r") as file:
    reader = csv.DictReader(file)
    record = {}
    same = []
    for row in reader:
        n = db.execute("INSERT INTO houses(house_id, house) VALUES (?, ?)", row['id'], row['house'])

a = db.execute("SELECT * FROM houses")
print(a)
The program above keeps giving me error messages that I do not really understand, and I do not know how to fix them. I did try putting the variable row['id'] directly into the VALUES parentheses, but I got an empty table with nothing in it.
The second screenshot shows what I got when I ran ".schema" to see the table. The "name" table was created on the sqlite3 command line rather than from Python code; is that why the error above mentions the "name" table?
Assuming the second image is the schema (better to post it as text, not an image!), there is a typo in the REFERENCES clauses of the house and head CREATE statements. Read them carefully and critically. The CREATE itself will not fail; the typo only surfaces when trying to insert into either of the tables.
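To see the failure mode in isolation, here is a minimal sketch (the real schema was only posted as an image, so the table and column names below are invented): SQLite accepts a CREATE TABLE whose REFERENCES clause names a nonexistent table, and only complains at insert time.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE students(id INTEGER PRIMARY KEY, name TEXT)")
# Typo in the REFERENCES clause: "studemts" instead of "students".
# The CREATE statement still succeeds...
con.execute("CREATE TABLE houses(house_id INTEGER REFERENCES studemts(id), house TEXT)")
# ...but the INSERT fails with: sqlite3.OperationalError: no such table: main.studemts
con.execute("INSERT INTO houses VALUES (1, 'Gryffindor')")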
I'm new to psycopg2 and I have a question that I cannot really find an answer to on the Internet: is there any difference (for example in performance) between the copy_xxx() methods and the execute() + fetchxxx() combination when writing the result of a query to a CSV file?
...
query_str = "SELECT * FROM mytable"
cursor.execute(query_str)

with open("my_file.csv", "w+") as file:
    writer = csv.writer(file)
    while True:
        rows = cursor.fetchmany()
        if not rows:
            break
        writer.writerows(rows)
vs
...
query_str = "SELECT * FROM mytable"
# note: COPY requires parentheses around a SELECT statement
output_query = f"COPY ({query_str}) TO STDOUT WITH CSV HEADER"

with open("my_file.csv", "w+") as file:
    cursor.copy_expert(output_query, file)
And if I run a very complex query (assume, for example, that it cannot be simplified any further) with psycopg2, which method should I use? Or do you have any other advice?
Many thanks!
COPY is faster, but if query execution time is dominant or the file is small, it won't matter much.
You don't show us how the cursor was declared. If it is an anonymous cursor, then execute/fetch will read all query data into memory upfront, leading to out-of-memory conditions for very large queries. If it is a named cursor, then you will individually request every row from the server, leading to horrible performance; this can be overcome by specifying a count argument to fetchmany, as the default is bizarrely set to 1.
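For concreteness, a minimal sketch of the named-cursor route with a sane batch size (the DSN and table name are placeholders):

import csv
import psycopg2

conn = psycopg2.connect("dbname=testdb")  # placeholder DSN
with conn.cursor(name="export_cur") as cursor:  # named => server-side cursor
    cursor.execute("SELECT * FROM mytable")
    with open("my_file.csv", "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            rows = cursor.fetchmany(2000)  # override the default fetch size of 1
            if not rows:
                break
            writer.writerows(rows)
conn.close()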
I started creating a database with PostgreSQL and I am currently facing a problem when I want to copy the data from my CSV file into my database.
Here is my code:
connexion = psycopg2.connect(dbname="db_test", user="postgres", password="passepasse")
connexion.autocommit = True
cursor = connexion.cursor()

cursor.execute("""CREATE TABLE vocabulary(
    fname integer PRIMARY KEY,
    label text,
    mids text
)""")

with open(r'C:\mypathtocsvfile.csv', 'r') as f:
    next(f)  # skip the header row
    cursor.copy_from(f, 'vocabulary', sep=',')

connexion.commit()
I allocated four columns to store my CSV data; the problem is that the data in my CSV is stored like this:
fname,labels,mids,split
64760,"Electric_guitar,Guitar,Plucked_string_instrument,Musical_instrument,Music","/m/02sgy,/m/0342h,/m/0fx80y,/m/04szw,/m/04rlf",train
16399,"Electric_guitar,Guitar,Plucked_string_instrument,Musical_instrument,Music","/m/02sgy,/m/0342h,/m/0fx80y,/m/04szw,/m/04rlf",train
...
There are commas inside my label and mids columns, which is why I get the following error:
BadCopyFileFormat: ERROR: additional data after the last expected column
Which alternative should I use to copy the data from this CSV file?
Thanks!
If the file is small, the easiest way is to open it in LibreOffice and save it with a new separator; I usually use ^.
If the file is large, write a script to replace ," and "," with ^" and "^", respectively.
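A safer way to implement the same separator swap is to let the csv module do the parsing, since it already understands the quoted fields; a sketch, assuming ^ never occurs in the data (the file names are placeholders):

import csv

with open('in.csv', newline='') as src, open('out.csv', 'w', newline='') as dst:
    writer = csv.writer(dst, delimiter='^')
    for row in csv.reader(src):  # csv.reader handles the embedded commas
        writer.writerow(row)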
COPY supports csv as a format, which already does what you want. But to access it via psycopg2, I think you will need to use copy_expert rather than copy_from.
cursor.copy_expert('copy vocabulary from stdin with csv', f)
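In context, a sketch of the full call using the connection from the question; note that HEADER lets COPY skip the header row itself, and that COPY still expects one table column per CSV field, so the sample file's fourth split column would need a matching column in vocabulary:

with open(r'C:\mypathtocsvfile.csv', 'r') as f:
    cursor.copy_expert("COPY vocabulary FROM STDIN WITH (FORMAT csv, HEADER true)", f)
connexion.commit()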
I'm using the following code to query a SQL Server DB and store the returned results in a CSV file.
import pypyodbc
import csv

connection = pypyodbc.connect('Driver={SQL Server};'
                              'Server=localhost;'
                              'Database=testdb;')
cursor = connection.cursor()

SQLCommand = ("""SELECT A AS First,
                        B AS Second
                 FROM AB""")
cursor.execute(SQLCommand)
results = cursor.fetchall()

myfile = open('test.csv', 'w')
wr = csv.writer(myfile, dialect='excel')
wr.writerow(results)
connection.close()
The SQL command is just a sample; my real query contains a lot more columns.
With this code, my CSV comes out wrong: all of the results land on a single line. But I want each record on its own row, and I want the headers to show as well. I'm guessing the formatting needs to be done within the csv.writer part of the code, but I can't seem to figure it out. Can someone please guide me?
You are seeing that strange output because fetchall returns multiple rows of output but you are using writerow instead of writerows to dump them out. You need to use writerow to output a single line of column headings, followed by writerows to output the actual results:
with open(r'C:\Users\Gord\Desktop\test.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow([x[0] for x in cursor.description])  # column headings
    wr.writerows(cursor.fetchall())
cursor.close()
connection.close()
I want to modify one column in a .dbf file using Python with this library: http://pythonhosted.org/dbf/. When I print out a column, it works just fine. But when I try to modify the column, I get the error
unable to modify fields individually except in with or Process()
My code:
table = dbf.Table('./path/to/file.dbf')
table.open()
for record in table:
    record.izo = sys.argv[2]
table.close()
In the docs, they recommend doing it like
for record in Write(table):
But I also get an error:
name 'Write' is not defined
And:
record.write_record(column=sys.argv[2])
also gives me an error:
write_record - no such field in table
Thanks!
My apologies for the state of the docs. Here are a couple of options that should work:
table = dbf.Table('./path/to/file.dbf')
# Process will open and close a table if not already opened
for record in dbf.Process(table):
    record.izo = sys.argv[2]
or
with dbf.Table('./path/to/file.dbf') as table:
    # with opens the table, and closes it when done
    for record in table:
        with record:
            record.izo = sys.argv[2]
I had been trying to make a change to my dbf file for several days, and after searching and browsing several websites, this page was the only one that gave me a solution that worked. Just to add a little more information so that whoever lands here understands the piece of code that Ethan Furman shared above:
import dbf

table = dbf.Table('your_dbf_filename.dbf')
# Process will open and close a table if not already opened
for record in dbf.Process(table):
    record.your_column_name = 'New_Value_of_that_column'
Now, because there is no condition here, you would end up updating all the rows of that column. Remember, this statement immediately writes the new value to the file, so the advice is to save a copy of your dbf file before making any edits to it.
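For instance, a sketch of a conditional update (the column name and values are hypothetical):

import dbf

table = dbf.Table('your_dbf_filename.dbf')
for record in dbf.Process(table):
    if record.your_column_name == 'Old_Value':  # hypothetical condition
        record.your_column_name = 'New_Value_of_that_column'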
I also tried the 2nd solution that Ethan mentions, but as originally posted the with line was missing the "as table" clause, so it threw an error that 'table' is not defined.
I am writing a Python script that will copy files from one directory to another and record each filename in a doc archive PostgreSQL table. The error I receive is below:
can't call .execute() on named cursors more than once
Below is my code:
cursor = conn.cursor('cur', cursor_factory=psycopg2.extras.DictCursor)
cursor.execute('SELECT * FROM doc_archive.table LIMIT 4821')
row_count = 0
for row in cursor:
    row_count += 1
    print "row: %s %s\r" % (row_count, row),
    pathForListFiles = srcDir
    files = os.listdir(pathForListFiles)
    for file in files:
        print file
        try:
            # Perform an insert with the docid
            cursor.execute("INSERT INTO doc_archive.field_photo_vw VALUES)
Is this the actual code? You've got unmatched quotes in the second execute.
When iterating through results, I normally use

for var in range(int(cursor.rowcount)):
    row = cursor.fetchone()
Without trouble.
for var in cursor:
Seems wrong to me.
results = cur.fetchall()
for var in enumerate(results):
is basically the same thing, but it would allow you to close your cursor in case you have to do another execute while iterating the first set of results. Generally I just declare another cursor in those instances.
In either case, your current code doesn't seem to be fetching the results of the execute, which is important if you need to process that data.
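Since the error itself comes from calling execute a second time on a named cursor, the usual fix is exactly that second cursor: keep the named cursor for reading and use a regular, unnamed cursor for the INSERTs. A sketch, assuming an open connection conn (the single-column INSERT is hypothetical, since the original statement was cut off):

import psycopg2
import psycopg2.extras

read_cur = conn.cursor('cur', cursor_factory=psycopg2.extras.DictCursor)
read_cur.execute('SELECT * FROM doc_archive.table LIMIT 4821')

write_cur = conn.cursor()  # regular client-side cursor, reusable
for row in read_cur:
    # hypothetical single-column insert; adapt to the real column list
    write_cur.execute("INSERT INTO doc_archive.field_photo_vw VALUES (%s)", (row[0],))
conn.commit()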