I have a text file that contains this data (each line is code,entry1,entry2):
a,1,2
b,2,3
c,4,5
....
....
Here a, b, c, ... will always be unique.
Every time I read this file in Python, it is either to create a new entry, for example d,6,7, or to update existing values, say a,1,2 to a,4,3.
I use the following code:
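# The snippet uses return, so it is assumed to sit inside a function
# (the name save_entry is illustrative).
def save_entry(data):
    # join the fields with commas, without a trailing comma
    datastring = ','.join(str(d) for d in data)
    try:
        # the with block closes the file; no explicit close() is needed
        with open("opfile.txt", "a") as f:
            f.write(datastring + '\n')
        return True
    except IOError:
        return False

data = ['a',5,6]
save_entry(data)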
This appends any data as a new entry.
I am trying something like this which checks the first character of each line:
f = open("opfile.txt", "r")
for x in f:
    if x[0] == username:
        pass
I don't know how to combine these two so that the first character (call it the id) is checked: if an entry with that id is already in the file, it should be replaced with the new data while all other lines stay the same; otherwise the data should be added as a new line.
Read the file into a dictionary that uses the first field as the key. Update the appropriate dictionary entry, then write the dictionary back to the file.
Use the csv module to parse and format the file.
import csv

data = ['a',5,6]
with open("opfile.txt", "r", newline='') as infile:
    incsv = csv.reader(infile)
    # key each row by its first field; skip blank lines
    d = {row[0]: row for row in incsv if len(row) != 0}
d[data[0]] = data  # replaces an existing entry or adds a new one
with open("opfile.txt", "w", newline='') as outfile:
    outcsv = csv.writer(outfile)
    outcsv.writerows(d.values())
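As a rough sketch, the same idea can be wrapped in a reusable function. The name upsert_record and its path parameter are illustrative and not part of the answer above. The sketch assumes Python 3.7+, where dictionaries keep insertion order, so untouched lines stay in their original order, and it treats a missing file as empty (also an assumption).

```python
import csv


def upsert_record(path, record):
    """Replace the row whose first field equals record[0], or append it."""
    try:
        with open(path, "r", newline="") as infile:
            # Key each non-empty row by its first field (the unique code).
            rows = {row[0]: row for row in csv.reader(infile) if row}
    except FileNotFoundError:
        rows = {}  # assumption: a missing file is treated as empty
    # Existing key: the row is replaced in place. New key: it is added last.
    rows[str(record[0])] = [str(field) for field in record]
    with open(path, "w", newline="") as outfile:
        csv.writer(outfile).writerows(rows.values())
```

Calling upsert_record("opfile.txt", ["a", 4, 3]) would rewrite the a line, while upsert_record("opfile.txt", ["d", 6, 7]) would add a new line at the end.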
First, append all new rows to the file.
Second, to update an existing row, rewrite the file with write, replacing the matching line:
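def update_record(file_name, field1, field2, field3):
    with open(file_name, 'r') as f:
        lines = f.readlines()
    with open(file_name, 'w') as f:
        for line in lines:
            # compare the first field only; "field1 in line" would also
            # match the id anywhere else on the line
            if line.split(',')[0] == field1:
                f.write(','.join(str(x) for x in (field1, field2, field3)) + '\n')
            else:
                f.write(line)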
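The two steps (append when the id is new, rewrite when it exists) can also be folded into one pass over the lines. This is only a sketch: replace_or_append is a hypothetical helper, and it assumes every existing line ends with a newline and that the id is the text before the first comma.

```python
def replace_or_append(lines, record):
    """Return a new list of lines with record replaced or appended."""
    key = str(record[0])
    new_line = ",".join(str(field) for field in record) + "\n"
    result = []
    replaced = False
    for line in lines:
        # Compare the whole first field, not just a substring of the line.
        if line.split(",")[0].strip() == key:
            result.append(new_line)
            replaced = True
        else:
            result.append(line)
    if not replaced:
        # No line carried this id, so the record becomes a new entry.
        result.append(new_line)
    return result
```

The returned list can be written back with f.writelines(...) after reading the original with f.readlines().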
I would like to do the following:
read a CSV file, add a new first column, rename some of the columns,
and then load the records from the CSV file.
Ultimately, I would like the first column to be populated with the file
name.
I'm fairly new to Python. I've more or less worked out how to change the fieldnames; however, loading the data is a problem, as it's looking for the original fieldnames, which no longer match.
Code snippet
import csv
import os

inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"

with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.DictReader(inFile)
    fieldnames = ['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype']
    w = csv.DictWriter(outfile,fieldnames)
    w.writeheader()

    # *** Here is where I start to go wrong
    # copy the rest
    for node, row in enumerate(r,1):
        w.writerow(dict(row))
Error
File "D:\Apps\Python27\ArcGIS10.3\lib\csv.py", line 148, in _dict_to_list
+ ", ".join([repr(x) for x in wrong_fields]))
ValueError: dict contains fields not in fieldnames: 'Databases [xsi:type]', 'Resources [xsi:type]', 'ID'
I would like some assistance, not just to learn but to truly understand what I need to do.
Cheers and thanks
Peter
Update..
I think I've worked it out
import csv
import os

inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"

with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.reader(inFile)
    w = csv.writer(outfile)

    header = next(r)  # read (and thereby skip) the old header row
    header.insert(0, 'MapSvcName')
    #w.writerow(header)
    # The old header was already consumed by next(r) above; calling
    # next(r, None) again here would drop the first data row.

    # write new header
    w.writerow(['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype'])

    prevRow = next(r)
    prevRow.insert(0, '0')
    w.writerow(prevRow)

    for row in r:
        if prevRow[-1] == row[-1]:
            val = '0'
        else:
            val = prevRow[-1]

        row.insert(0,val)
        prevRow = row

        w.writerow(row)
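For comparison, here is a minimal Python 3 sketch of the DictReader/DictWriter route. The ValueError arises because each row dict still uses the old header names as keys, so the keys have to be translated before the row is written. The RENAMES mapping is only inferred from the error message and the new header list, and the function name, the parameter names, and the choice to put the input file's base name into MapSvcName are all assumptions. It also assumes no row has more fields than the header.

```python
import csv
import os

# Assumed mapping from old header names to new ones, inferred from the
# error message; adjust it to match the real file.
RENAMES = {
    "Databases [xsi:type]": "Databasetype",
    "Resources [xsi:type]": "Resourcestype",
    "ID": "ID_A",
}


def add_filename_column(in_path, out_path, renames=RENAMES,
                        first_column="MapSvcName"):
    """Copy a CSV, renaming columns and prepending a file-name column."""
    with open(in_path, newline="") as in_file, \
            open(out_path, "w", newline="") as out_file:
        reader = csv.DictReader(in_file)
        # Translate the old header into the new one, keeping column order.
        new_names = [renames.get(name, name) for name in reader.fieldnames]
        writer = csv.DictWriter(out_file,
                                fieldnames=[first_column] + new_names)
        writer.writeheader()
        for row in reader:
            # Re-key each row so its keys match the writer's fieldnames.
            new_row = {renames.get(key, key): value
                       for key, value in row.items()}
            new_row[first_column] = os.path.basename(in_path)
            writer.writerow(new_row)
```

Under Python 2.7, as used in the question, the files would be opened with 'rb' and 'wb' instead of newline="".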
The objective of this script is to take an incoming CSV file, read it with a DictReader,
take the keys that were read, check whether they match any of the pre-designated values in the fieldMap dictionary, and, if they do, append those keys to my hdrlist. Then it writes the header list to an output file called ofp.
The issue I am having is that when I don't find a key that matches one of the pre-designated values in the fieldMap, I need to insert a blank (' ').
I've tried appending blank values to the hdrlist in an else statement and adding a blank key-value pair to my fieldMap dictionary:
if row.has_key(ft_test):
    hdrlist.append(ft_test)
else:
    hdrlist.append('')
'':[''] #blank key:value pair
but then my:
if hdrlen != len(hdrlist)-1:
    print "Cannot find a key for %s in file %s" % (ft,fn)
error-handling statement prints more messages than I think it should, and I'm not sure why.
If anyone can shed some light on how to insert a blank into my ofp.write(fmtstring), it would be greatly appreciated.
Also, if anyone could explain why I get more print statements than I expect with the above else statement, that would be greatly appreciated as well.
My whole script is below; if any other information is needed to help me with this code, I will gladly provide it.
Here is a sample of an input file that would produce too many print statements.
input_file.csv = {'cust_no':1, 'streetaddr':'2103 Union Ave','address2':' ','city':'Chicago'}
#!/usr/bin/env python
import sys, csv, glob
fieldMap = {'zipcode':['Zip5', 'zip9','zipcode','ZIP','zip_code','zip','ZIPCODE'],
'firstname':['firstname','FIRSTNAME'],
'lastname':['lastname','LASTNAME'],
'cust_no':['cust_no','CUST_NO'],
'user_name':['user_name','USER_NAME'],
'status':['status','STATUS'],
'cancel_date':['cancel_date','CANCEL_DATE'],
'reject_date':['REJECT_DATE','reject_date'],
'streetaddr':['streetaddr','STREETADDR','ADDRESS','address'],
'streetno':['streetno','STREETNO'],
'streetnm':['streetnm','STREETNM'],
'suffix':['suffix','SUFFIX'], #suffix of street name: dr, ave, st
'city':['city','CITY'],
'state':['state','STATE'],
'phone_home':['phone_home','PHONE_HOME'],
'email':['email','EMAIL'],
'':['']
}

def readFile(fn,ofp):
    count = 0
    CSVreader = csv.DictReader(open(fn,'rb'), dialect='excel', delimiter=',')
    for row in CSVreader:
        count+= 1
        if count == 1:
            hdrlist = []
            for ft in fieldMap.keys():
                hdrlen = len(hdrlist)
                for ft_test in fieldMap[ft]:
                    if row.has_key(ft_test):
                        hdrlist.append(ft_test)
                if hdrlen != len(hdrlist)-1:
                    print "Cannot find a key for %s in file %s" % (ft,fn)
            if len(hdrlist) != 16:
                print "Another error. Not all header's have been assigned new values."
        if count < 5:
            x=len(hdrlist)
            fmtstring = "%s\t" * len(hdrlist) % tuple(row[x] for x in hdrlist)
            ofp.write(fmtstring)
            break

if __name__ == '__main__':
    filenames = glob.glob(sys.argv[1])
    ofp = sys.stdout
    ofp.write("zipcode\tfirstname\tlastname\tcust_no\tuser_name\tstatus\t"
              "cancel_date\treject_date\tstreetaddr\tstreetno\tstreetnm\t"
              "suffix\tcity\tstate\tphone_home\temail")
    for filename in filenames:
        readFile(filename,ofp)
Sample data:
cust_no,status,streetaddr,address2,city,state,zipcode,billaddr,servaddr,title,latitude,longitude,custsize,telemarket,dirmail,nocredhold,email,phone_home,phone_work,phone_fax,phone_page,phone_cell,phone_othr,taxrate1,taxrate2,taxrate3,taxtot,company,firstname,lastname,user_name,dpbc,container,seq,paytype_am,paytype_di,paytype_mc,paytype_vi
0,0,'123 fake st.',,'chicago','il',60185,'123 billaddr st.','123 servaddr st.','mr.',43.123,54.234 ,2000,'TRUE','TRUE','TRUE','email#email.com',(666)555-6666,,,,,,,,,,,'bob','smith','bob smith',,,,'TRUE','TRUE','TRUE','TRUE'
0,0,'123 fake st.','','chicago','il',60185,'123 billaddr st.','123 servaddr st.','mr.',43.123,54.234 ,2000,'TRUE','TRUE','TRUE','email#email.com',(666)555-6666,'','','','','','','','','','','bob','smith','bob smith','','','','TRUE','TRUE','TRUE','TRUE'
If all you want is a hdrlist of the recognized field names in the CSV file being processed, you can create it by comparing the values in the DictReader.fieldnames attribute to the contents of fieldMap immediately after creating the DictReader, because a DictReader created without a fieldnames argument takes its field names from the header row of the file.
I also changed your fieldMap dictionary into an OrderedDict so that it preserves the order of the keys.
import glob
from collections import OrderedDict
import csv
import sys
fieldMap = OrderedDict([
('zipcode', ['zipcode', 'ZIPCODE', 'Zip5', 'zip9', 'ZIP', 'zip_code', 'zip']),
('firstname', ['firstname', 'FIRSTNAME']),
('lastname', ['lastname', 'LASTNAME']),
('cust_no', ['cust_no', 'CUST_NO']),
('user_name', ['user_name', 'USER_NAME']),
('status', ['status', 'STATUS']),
('cancel_date', ['cancel_date', 'CANCEL_DATE']),
('reject_date', ['reject_date', 'REJECT_DATE']),
('streetaddr', ['streetaddr', 'STREETADDR', 'ADDRESS', 'address']),
('streetno', ['streetno', 'STREETNO']),
('streetnm', ['streetnm', 'STREETNM']),
('suffix', ['suffix', 'SUFFIX']), # suffix of street name: dr, ave, st
('city', ['city', 'CITY']),
('state', ['state', 'STATE']),
('phone_home', ['phone_home', 'PHONE_HOME']),
('email', ['email', 'EMAIL']),
])

def readFile(fn,ofp):
    with open(fn, 'rb') as csvfile:
        # the following reads the header line into csvReader.fieldnames
        csvReader = csv.DictReader(csvfile, dialect='excel', delimiter=',')

        # create a list of recognized fieldnames in the csv file
        hdrlist = []
        for ft in fieldMap:
            for ft_test in fieldMap[ft]:
                if ft_test in csvReader.fieldnames:
                    hdrlist.append(ft_test)
                    break
            else:  # no break: none of this field's names is in the file
                hdrlist.append(None)  # placeholder (could also be '')

        hdrlen = len(hdrlist)
        ofp.write('hdrlist: {}\n'.format(hdrlist))
        # hdrlist always has one entry per field, so test for placeholders
        if None in hdrlist:
            print "Note that not all field names were present in file."

        ofp.write("\t".join(fieldMap) + '\n')
        for row in csvReader:
            fmtstring = "%s\t" * hdrlen % tuple(
                row[field] if field else 'NA' for field in hdrlist)
            ofp.write(fmtstring+'\n')

if __name__ == '__main__':
    # sys.argv = [sys.argv[0], 'ofp_input.csv']  # hardcode for testing
    if len(sys.argv) != 2:
        print "Error: Filename argument missing!"
        sys.exit(-1)
    filenames = glob.glob(sys.argv[1])
    ofp = sys.stdout
    for filename in filenames:
        readFile(filename, ofp)
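The matching step on its own can be sketched as a small pure function, written here for Python 3. The names match_headers and format_row are hypothetical, and the blank default is an assumption about what the output should contain. The point is that the placeholder is appended once per field (in the else of the inner for loop), not once per alias, so the result always has exactly one entry per field.

```python
def match_headers(fieldnames, field_map):
    """Return one entry per canonical field: the matching alias or None."""
    matched = []
    for canonical, aliases in field_map.items():
        for alias in aliases:
            if alias in fieldnames:
                matched.append(alias)
                break
        else:
            # Runs only when the inner loop finished without break,
            # i.e. no alias of this field is present in the file.
            matched.append(None)
    return matched


def format_row(row, matched, blank=""):
    """Tab-join the row values, using blank where no column matched."""
    return "\t".join(
        row[name] if name is not None else blank for name in matched
    )
```

An else attached to the inner if instead would append a placeholder for every alias that fails to match, which makes the list longer than the number of fields and would trip a length check like the one in the question.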
I'm trying to iterate over a CSV file that holds a 'master list' of names and compare it to another CSV file that contains only the names of people who were present and made phone calls.
For each name in the master list, I want to look up the number of calls that person made in the other CSV file and write a new CSV file containing the number of calls. If the name isn't found, or if the count is 0, that column should contain 0.
I'm not sure whether it's something incredibly simple I'm overlooking or whether I am truly going about this incorrectly.
import csv
import sys

masterlst = open('masterlist.csv')
comparelst = open(sys.argv[1])

masterrdr = csv.DictReader(masterlst, dialect='excel')
comparerdr = csv.DictReader(comparelst, dialect='excel')

headers = comparerdr.fieldnames

with open('callcounts.csv', 'w') as outfile:
    wrtr = csv.DictWriter(outfile, fieldnames=headers, dialect='excel', quoting=csv.QUOTE_MINIMAL, delimiter=',', escapechar='\n')
    wrtr.writerow(dict((fn,fn) for fn in headers))
    for lines in masterrdr:
        for row in comparerdr:
            if lines['Names'] == row['Names']:
                print(lines['Names'] + ' has ' + row['Calls'] + ' calls')
                wrtr.writerow(row)
            elif lines['Names'] != row['Names']:
                row['Calls'] = ('%s' % 0)
                wrtr.writerow(row)
                print(row['Names'] + ' had 0 calls')

masterlst.close()
comparelst.close()
Here's how I'd do it, assuming the file sizes do not prove to be problematic:
import csv
import sys

with open(sys.argv[1]) as comparelst:
    comparerdr = csv.DictReader(comparelst, dialect='excel')
    headers = comparerdr.fieldnames
    names_and_counts = {}
    for line in comparerdr:
        names_and_counts[line['Names']] = line['Calls']
        # or, if you're sure you only want the ones with 0 calls, just use a set
        # and only add the line['Names'] values where line['Calls'] == '0'

with open('masterlist.csv') as masterlst:
    masterrdr = csv.DictReader(masterlst, dialect='excel')
    with open('callcounts.csv', 'w') as outfile:
        wrtr = csv.DictWriter(outfile, fieldnames=headers, dialect='excel', quoting=csv.QUOTE_MINIMAL, delimiter=',', escapechar='\n')
        wrtr.writerow(dict((fn,fn) for fn in headers))
        # or if you're on 2.7, wrtr.writeheader()
        for line in masterrdr:
            # a name missing from the compare file counts as 0 calls
            if names_and_counts.get(line['Names'], '0') == '0':
                row = {'Names': line['Names'], 'Calls': '0'}
                wrtr.writerow(row)
That writes just the rows with 0 calls, which is what your text description said - you could tweak it if you wanted to write something else for non-0 calls.
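If the goal is instead one output row per master name, with 0 filled in for anyone missing from the compare file, the same dictionary lookup can supply a default. This is a Python 3 sketch under that assumption. The function names are hypothetical, and it assumes both files have a Names column and the compare file has a Calls column, as in the question.

```python
import csv


def merge_call_counts(master_rows, compare_rows):
    """Return one {'Names', 'Calls'} dict per master row, defaulting to '0'."""
    counts = {row["Names"]: row["Calls"] for row in compare_rows}
    return [
        {"Names": row["Names"], "Calls": counts.get(row["Names"], "0")}
        for row in master_rows
    ]


def write_call_counts(master_path, compare_path, out_path):
    """Read both CSV files and write the merged counts to out_path."""
    with open(compare_path, newline="") as compare_file:
        compare_rows = list(csv.DictReader(compare_file))
    with open(master_path, newline="") as master_file:
        master_rows = list(csv.DictReader(master_file))
    with open(out_path, "w", newline="") as out_file:
        writer = csv.DictWriter(out_file, fieldnames=["Names", "Calls"])
        writer.writeheader()
        writer.writerows(merge_call_counts(master_rows, compare_rows))
```

Each file is read once, so the cost grows with the sum of the two file sizes rather than their product.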
Thanks everyone for the help. I was able to nest another with statement inside of my outer loop, and add a variable to test whether or not the name from the master list was found in the compare list. This is my final working code.
import csv
import sys

masterlst = open('masterlist.csv')
comparelst = open(sys.argv[1])

masterrdr = csv.DictReader(masterlst, dialect='excel')
comparerdr = csv.DictReader(comparelst, dialect='excel')

headers = comparerdr.fieldnames

with open('callcounts.csv', 'w') as outfile:
    wrtr = csv.DictWriter(outfile, fieldnames=headers, dialect='excel', quoting=csv.QUOTE_MINIMAL, delimiter=',', escapechar='\n')
    wrtr.writerow(dict((fn,fn) for fn in headers))

    for line in masterrdr:
        found = False
        # reopen the compare file so the inner reader starts from the top
        with open(sys.argv[1]) as loopfile:
            looprdr = csv.DictReader(loopfile, dialect='excel')
            for row in looprdr:
                if row['Names'] == line['Names']:
                    line['Calls'] = row['Calls']
                    wrtr.writerow(line)
                    found = True
                    break
        if not found:
            line['Calls'] = '0'
            wrtr.writerow(line)

masterlst.close()
comparelst.close()
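The reason the compare file has to be reopened for every master row is that a csv reader is a one-shot iterator: once the inner loop has run through it, later passes see nothing. A small Python 3 illustration follows, using an in-memory string in place of the real file (the sample data is made up):

```python
import csv
import io

# Made-up stand-in for the compare file.
compare_csv = "Names,Calls\r\nBob,3\r\nCy,0\r\n"

reader = csv.DictReader(io.StringIO(compare_csv))
first_pass = [row["Names"] for row in reader]
second_pass = [row["Names"] for row in reader]  # reader is already exhausted

# Materialising the rows once gives a list that can be scanned repeatedly.
rows = list(csv.DictReader(io.StringIO(compare_csv)))
scan_one = [row["Names"] for row in rows]
scan_two = [row["Names"] for row in rows]
```

Here second_pass ends up empty while scan_one and scan_two hold the same names, so reading the compare file into a list (or a dictionary keyed by name) once is an alternative to reopening it inside the loop.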