Related
I'm sure this is a really easy question but I can't seem to find any information on it.
I have a very large CSV file which I need to insert a row directly after the header which helps with another code that reads the csv and joins it to a parcel shapefile.
I have the code to append the row of data that I want, but it will only go to the last line. I cannot figure out how to get the code to insert my row immediately after the header row. Here is my code:
import os
import csv
insert_row = '"AAAAAAAAAAAAAAAAAAA","**********","**********","**********","**********","**********","**********","**","**********","**********","****","**********",999999,9999,00'
os.chdir(r"D:\PROPERTY\PINELLAS\Data_20201001_t")
with open("owner_mail.csv", 'r') as csv_file, open("owner_mail.csv", 'a', newline = "") as new_file:
csv_reader = csv.reader(csv_file)
csv_writer = csv.writer(new_file)
csv_writer.writerow(insert_row)
So that's it. I just need the insert_row line of data to be in row position number 2 instead of at the end of the file. Thank you.
You can't just insert a row in the middle of a file unless replacing data of exactly the same length. You have to read the entire file, edit it, and re-write it.
Something like this should work:
import csv
# This must be an iterable not a string
insert_row = "AAAAAAAAAAAAAAAAAAA","**********","**********","**********","**********","**********","**********","**","**********","**********","****","**********",999999,9999,00
with open("owner_mail.csv", 'r') as csv_file, open("owner_mail_updated.csv", 'w', newline = "") as new_file:
csv_reader = csv.reader(csv_file)
csv_writer = csv.writer(new_file)
header = next(csv_reader)
csv_writer.writerow(header)
csv_writer.writerow(insert_row)
for line in csv_reader:
csv_writer.writerow(line)
If the CSV file is not too large to fit entirely in memory than you can read all the lines at once, edit them, and write them back out to the same file. It's riskier if there is a problem. Safer to write to a new file, then delete original and rename if no errors:
import csv
# This must be an iterable not a string
insert_row = "AAAAAAAAAAAAAAAAAAA","**********","**********","**********","**********","**********","**********","**","**********","**********","****","**********",999999,9999,00
with open("owner_mail.csv", 'r') as csv_file:
rows = list(csv.reader(csv_file))
rows.insert(1,insert_row) # insert after header row
with open("owner_mail.csv", 'w') as csv_file:
w = csv.writer(csv_file)
w.writerows(rows)
Please try this:
import os
import csv
insert_row = '"AAAAAAAAAAAAAAAAAA","**********","**********","**********","**********","**********","**********","**","**********","**********","****","**********",999999,9999,00'
with open("owner_mail.csv", 'r') as csv_file, open("owner_mail.csv", 'w') as new_file:
csv_reader = csv.reader(csv_file)
reader = list(csv_reader)
reader.insert(1,insert_row)
csv_writer = csv.writer(new_file)
csv_writer.writerows(reader)
I have a csv with two columns of data. I want to extract data from one column and write to a text file with single-quote on each element and separated by a comma. For example, I have this..
taxable_entity_id,id
45efc167-9254-406c-b5a8-6aef91a73dd9,331999
5ae97680-f489-4182-9dcb-eb07a73fab15,103507
00018d93-ae71-4367-a0da-f252cea4dfa2,32991
I want all the taxable_entity_ids in a text file like this
'45efc167-9254-406c-b5a8-6aef91a73dd9','5ae97680-f489-4182-9dcb-eb07a73fab15','00018d93-ae71-4367-a0da-f252cea4dfa2'
without any space between two elements, separated by a comma.
Edit:
This is what i tried..
import csv
with open("Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv", 'r') as csv_File:
reader = csv.DictReader(csv_File)
with open("te_id.csv", 'w') as text_file:
writer = csv.writer(text_file, quotechar='\'', quoting=csv.QUOTE_MINIMAL)
for row in reader:
writer.writerow(row["taxable_entity_id"])
# print(row["taxable_entity_id"])
text_file.close()
csv_File.close()
and this is what I got..
4,5,e,f,c,1,6,7,-,9,2,5,4,-,4,0,6,c,-,b,5,a,8,-,6,a,e,f,9,1,a,7,3,d,d,9
5,a,e,9,7,6,8,0,-,f,4,8,9,-,4,1,8,2,-,9,d,c,b,-,e,b,0,7,a,7,3,f,a,b,1,5
0,0,0,1,8,d,9,3,-,a,e,7,1,-,4,3,6,7,-,a,0,d,a,-,f,2,5,2,c,e,a,4,d,f,a,2
You were close. Simply as you want one single line in the output file, you should write it at once by using a comprehension:
import csv
with open("Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv", 'r') as csv_File:
reader = csv.DictReader(csv_File)
with open("te_id.csv", 'w') as text_file:
# use QUOTE_ALL to force the quoting
writer = csv.writer(text_file, quotechar='\'', quoting=csv.QUOTE_ALL)
writer.writerow((row["taxable_entity_id"] for row in reader))
And do not use close as you have (correctly) used with.
try that
import pandas as pd
df = pd.read_csv('nameoffile.csv',delimiter = ',')
X = df[0].values
f = open('newfile.txt','w')
for i in X:
f.write(X[i] + ',')
f.close()
It's seems a little odd that you basically want a one row csv file for the taxable_entity_ids, but certain possible. You also don't need to explicitly close() the open files because the with context manager will do it for you automatically.
You also need to open the CSV file with newline='' as shown in all the examples in the csv module's documentation.
Lastly, if you want the all the fields to be quoted you need to use quoting=csv.QUOTE_ALL instead of quoting=csv.QUOTE_MINIMAL.
import csv
inp_filename = "Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv"
outp_filename = "te_id.csv"
with open(outp_filename, 'w', newline='') as text_file, \
open(inp_filename, 'r', newline='') as csv_File:
reader = csv.DictReader(csv_File)
writer = csv.writer(text_file, quotechar="'", quoting=csv.QUOTE_ALL)
taxable_entity_ids = (row["taxable_entity_id"] for row in reader)
writer.writerow(taxable_entity_ids)
print('done')
import csv
with open("somecities.csv") as f:
reader = csv.DictReader(f)
data = [r for r in reader]
Contents of somecities.csv:
Country,Capital,CountryPop,AreaSqKm
Canada,Ottawa,35151728,9984670
USA,Washington DC,323127513,9833520
Japan,Tokyo,126740000,377972
Luxembourg,Luxembourg City,576249,2586
New to python and I'm trying to read and append a csv file. I've spent some time experimenting with some responses to similar questions with no luck--which is why I believe the code above to be pretty useless.
What I am essentially trying to achieve is to store each row from the CSV in memory using a dictionary, with the country names as keys, and values being tuples containing the other information in the table in the sequence they are in within the CSV file.
And from there I am trying to add three more cities to the csv(Country, Capital, CountryPop, AreaSqKm) and view the updated csv. How should I go about doing all of this?
The desired additions to the updated csv are:
Brazil, Brasília, 211224219, 8358140
China, Beijing, 1403500365, 9388211
Belgium, Brussels, 11250000, 30528
EDIT:
Import csv
with open("somecities.csv", "r") as csvinput:
with open(" somecities_update.csv", "w") as csvresult:
writer = csv.writer(csvresult, lineterminator='\n')
reader = csv.reader(csvinput)
all = []
headers = next(reader)
for row in reader:
all.append(row)
# Now we write to the new file
writer.write(headers)
for record in all:
writer.write(record)
#row.append(Brazil, Brasília, 211224219, 8358140)
#row.append(China, Beijing, 1403500365, 9388211)
#row.append(Belgium, Brussels, 11250000, 30528)
So assuming you can use pandas for this I would go about it this way:
import pandas as pd
df1 = pd.read_csv('your_initial_file.csv', index_col='Country')
df2 = pd.read_csv('your_second_file.csv', index_col='Country')
dfs = [df1,df2]
final_df = pd.concat(dfs)
DictReader will only represent each row as a dictionary, eg:
{
"Country": "Canada",
...,
"AreaSqKm": "9984670"
}
If you want to store the whole CSV as a dictionary you'll have to create your own:
import csv
all_data = {}
with open("somecities.csv", "r") as f:
reader = csv.DictReader(f)
for row in reader:
# Key = country, values = tuple containing the rest of the data.
all_data[row["Country"]] = (row["Capital"], row["CountryPop"], row["AreaSqKm"])
# Add the new cities to the dictionary here...
# To write the data to a new CSV
with open("newcities.csv", "w") as f:
writer = csv.writer(f)
for key, values in all_data.items():
writer.writerow([key] + list(values))
As others have said, though, the pandas library could be a good choice. Check out its read_csv and to_csv functions.
Just another idea with creating and list and appending the new values through list construct as below, not tested:
import csv
with open("somecities.csv", "r") as csvinput:
with open("result.csv", "w") as csvresult:
writer = csv.writer(csvresult, lineterminator='\n')
reader = csv.reader(csvinput)
all = []
row = next(reader)
row.append(Brazil, Brasília, 211224219, 8358140)
row.append(China, Beijing, 1403500365, 9388211)
all.append(row)
for row in reader:
row.append(row[0])
all.append(row)
writer.writerows(all)
The simplest Form i see, tested in python 3.6
Opening a file with the 'a' parameter allows you to append to the end of the file instead of simply overwriting the existing content. Try that.
>>> with open("somecities.csv", "a") as fd:
... fd.write("Brazil, Brasília, 211224219, 8358140")
OR
#!/usr/bin/python3.6
import csv
fields=['Brazil', 'Brasília', '211224219','8358140']
with open(r'somecities.csv', 'a') as f:
writer = csv.writer(f)
writer.writerow(fields)
I need to read each csv file from a predefined dir, for each csv in that dir I need to take each row and write it to a new csv file.
Currently, I have this code snippet that reads a specific csv file and loops on each row.
import csv
with open('E:\EE\EE\TocsData\CSAT\csat_20140331.csv', 'rb') as csvfile:
reader = csv.reader(csvfile, delimiter=',', quotechar='|')
for row in reader:
# write row to a seperate csv file
Try this:
import glob
import csv
with open('somefile.out', 'wb') as out:
writer = csv.writer(out, delimiter=',')
for inf in glob.glob(r'E:\EE\EE\TocsData\CSAT\*.csv'):
with open(inf, 'rb') as csv_input:
reader = csv.reader(csv_input, delimiter=',', quotechar='|')
for row in reader:
writer.write(row)
Something similar to this should work. Loop through all files in the directory, read from each and write all rows to the new file.
myDirectory = "E:\EE\EE\TocsData\CSAT\\"
myNewCSV= "E:\myNewCSV.csv"
with open(myNewCSV,'ab') as w: #append bytes:'ab', write bytes:'wb' - whichever you want.
for myFile in os.listdir(myDirectory):
absolutePath = myDirectory + str(myFile)
writer = csv.writer(w)
with open(absolutePath, 'rb') as r:
reader = csv.reader(r)
for row in reader:
writer.writerow(row)
I already have written what I need for identifying and parsing the value I am seeking, I need help writing a column to the csv file (or a new csv file) with the parsed value. Here's some pseudocode / somewhat realistic Python code for what I am trying to do:
# Given a CSV file, this function creates a new CSV file with all values parsed
def handleCSVfile(csvfile):
with open(csvfile, 'rb') as file:
reader = csv.reader(file, delimiter=',', lineterminator='\n')
for row in reader:
for field in row:
if isWhatIWant(field):
parsedValue = parse(field)
# write new column to row containing parsed value
I've already written the isWhatIWant and parse functions. If I need to write a completely new csv file, then I am not sure how to have both open simultaneously and read and write from one into the other.
I'd do it like this. I'm guessing that isWhatIWant() is something that is supposed to replace a field in-place.
import csv
def handleCSVfile(infilename, outfilename):
with open(infilename, 'rb') as infile:
with open(outfilename, 'wb') as outfile:
reader = csv.reader(infile, lineterminator='\n')
writer = csv.writer(outfile, lineterminator='\n')
for row in reader:
for field_index, field in enumerate(row):
if isWhatIWant(field):
row[field_index] = parse(field)
writer.writerow(row)
This sort of pattern occurs a lot and results in really long lines. It can sometimes be helpful to break out the logic from opening and files into a different function, like this:
import csv
def load_save_csvfile(infilename, outfilename):
with open(infilename, 'rb') as infile:
with open(outfilename, 'wb') as outfile:
reader = csv.reader(infile, lineterminator='\n')
writer = csv.writer(outfile, lineterminator='\n')
read_write_csvfile(reader, writer)
def read_write_csvfile(reader, writer)
for row in reader:
for field_index, field in enumerate(row):
if isWhatIWant(field):
row[field_index] = parse(field)
writer.writerow(row)
This modularizes the code, making it easier for you to change the way the files and formats are handled from the logic independently from each other.
Additional hints:
Don't name variables file as that is a built-in function. Shadowing those names will bite you when you least expect it.
delimiter=',' is the default so you don't need to specify it explicitly.