I am writing to a CSV file and it works well, except some of the rows have commas in their names, and when I write to the CSV those commas throw the fields off. How do I write to a CSV and ignore the commas in the fields?
header = "Id, FACID, County, \n"
row = "{},{},{}\n".format(label2, facidcsv, County)
with open('example.csv', 'a') as wildcsv:
    if z == 0:
        wildcsv.write(header)
        wildcsv.write(row)
    else:
        wildcsv.write(row)
Strip any commas from each field that you write to the row, e.g.:
label2 = ''.join(label2.split(','))
facidcsv = ''.join(facidcsv.split(','))
County = ''.join(County.split(','))
row = "{},{},{}\n".format(label2,facidcsv,County)
Generalized to format a row with any number of fields:
def format_row(*fields):
    row = ''
    for field in fields:
        if row:
            row = row + ', ' + ''.join(field.split(','))
        else:
            row = ''.join(field.split(','))
    return row
label2 = 'label2, label2'
facidcsv = 'facidcsv'
county = 'county, county'
print(format_row(label2, facidcsv, county))
wildcsv.write(format_row(label2, facidcsv, county))
Output
label2 label2, facidcsv, county county
As @TomaszPlaskota and @quapka allude to in the comments, Python's csv writers and readers by default wrap fields that contain the delimiter in '"'. Most applications that work with CSV files follow the same convention, so the following is the preferred approach if you want to keep the commas in the output fields:
import csv

label2 = 'label2, label2'
facidcsv = 'facidcsv'
county = 'county, county'
with open('out.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow((label2, facidcsv, county))
out.csv
"label2, label2",facidcsv,"county, county"
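The quoting is symmetric: csv.reader gives back the original field values, commas included. A minimal round-trip sketch using an in-memory buffer:

```python
import csv
import io

# Write a row whose fields contain commas, then read it back.
buf = io.StringIO()
csv.writer(buf).writerow(['label2, label2', 'facidcsv', 'county, county'])

buf.seek(0)  # rewind the buffer before reading
row = next(csv.reader(buf))
print(row)  # ['label2, label2', 'facidcsv', 'county, county']
```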
Related
I need to read a csv file and fill the empty/null values in the columns "Phone" and "Email" based on the person's address, then write to a new csv file. Ex: if a person "Jonas Kahnwald" doesn't have a phone number or an email address but has the same address as the person above or below, say "Hannah Kahnwald", then we should fill the empty/null values with that person's details.
I won't be able to use pandas as the rest of the code/programs are purely based on Python 2.7 (unfortunately), so I just need to write a function or some logic to capture this information alone.
Input format/table looks like below with empty cells (csv file):
FirstName,LastName,Phone,Email,Address
Hannah,Kahnwald,1457871452,hannkahn@gmail.com,145han street
Micheal,Kahnwald,6231897383,,145han street
Jonas,Kahnwald,,,145han street
Mikkel,Nielsen,4509213887,mikneil@yahoo.com,887neil ave
Magnus,Nielsen,,magnusneil@kyle.co,887neil ave
Ulrich,Nielsen,,,887neil ave
katharina,Nielsen,,,887neil ave
Elisabeth,Doppler,5439001211,elsisop@amaz.com,211elis park
Peter,Doppler,,,211elis park
bartosz,Tiedmannn,6263172828,tiedman@skype.com,828alex street
Alexander,washington,,,321notsame street
claudia,Tiedamann,,,828alex street
Output format should be like below:
Hannah,Kahnwald,1457871452,hannkahn@gmail.com,145han street
Micheal,Kahnwald,6231897383,hannkahn@gmail.com,145han street
Jonas,Kahnwald,1457871452,hannkahn@gmail.com,145han street
Mikkel,Nielsen,4509213887,mikneil@yahoo.com,887neil ave
Magnus,Nielsen,4509213887,magnusneil@kyle.co,887neil ave
Ulrich,Nielsen,4509213887,mikneil@yahoo.com,887neil ave
katharina,Nielsen,4509213887,mikneil@yahoo.com,887neil ave
Elisabeth,Doppler,5439001211,elsisop@amaz.com,211elis park
Peter,Doppler,5439001211,elsisop@amaz.com,211elis park
bartosz,Tiedmannn,6263172828,tiedman@skype.com,828alex street
Alexander,washington,,,321notsame street
claudia,Tiedamann,6263172828,tiedman@skype.com,828alex street
import csv, os

def get_info(file_path):
    data = []
    with open(file_path, 'rb') as fin:
        csv_reader = csv.reader(fin)
        next(csv_reader)  # skip the header row
        for each in csv_reader:
            FirstName = each[0]
            LN = each[1]
            Phone = "some function or logic"
            email = "some function or logic"
            Address = each[4]
            login = ""
            logout = ""
            data.append([FirstName, LN, Phone, email, Address, login, logout])
    return data
Here's a significantly updated version that attempts to fill in missing data from other entries in the file, but only if they have the same Address field. To make the searching faster it builds an internal dictionary called addr_dict that maps each address to the records at that address. It also uses namedtuples internally to make the code a little more readable.
Note that when retrieving missing information, it uses the data from the first entry it finds stored in this internal dictionary for that Address. In addition, I don't think the sample data you provided covers every possible case, so you will need to do additional testing.
import csv
from collections import namedtuple

def get_info(file_path):
    # Read data from file and convert to list of namedtuples, also create address
    # dictionary to use to fill in missing information from others at same address.
    with open(file_path, 'rb') as fin:
        csv_reader = csv.reader(fin, skipinitialspace=True)
        header = next(csv_reader)
        Record = namedtuple('Record', header)
        newheader = header + ['Login', 'Logout']  # Add names of new columns.
        NewRecord = namedtuple('NewRecord', newheader)
        addr_dict = {}
        data = [newheader]
        for rec in (Record._make(row) for row in csv_reader):
            if rec.Email or rec.Phone:  # Worth saving?
                addr_dict.setdefault(rec.Address, []).append(rec)  # Remember it.
            login, logout = "", ""  # Values for new columns.
            data.append(NewRecord._make(rec + (login, logout)))

    # Try to fill in missing data from any other records with same Address.
    for i, row in enumerate(data[1:], 1):
        if not (row.Phone and row.Email):  # Info missing?
            # Try to copy it from others at same address.
            updated = False
            for other in addr_dict.get(row.Address, []):
                if not row.Phone and other.Phone:
                    row = row._replace(Phone=other.Phone)
                    updated = True
                if not row.Email and other.Email:
                    row = row._replace(Email=other.Email)
                    updated = True
                if row.Phone and row.Email:  # Info now filled in?
                    break
            if updated:
                data[i] = row

    return data
INPUT_FILE = 'null_cols.csv'
OUTPUT_FILE = 'fill_cols.csv'

data = get_info(INPUT_FILE)
with open(OUTPUT_FILE, 'wb') as fout:
    writer = csv.DictWriter(fout, data[0])  # First elem has column names.
    writer.writeheader()
    for row in data[1:]:
        writer.writerow(row._asdict())
print('Done')
I wrote a Python script that accesses a site and runs a certain search so I can scrape the search results.
I write the results out as txt:
clear = self.browser.find_elements_by_class_name('Whitebackground')
for scrape in clear:
    with open('result.txt', 'a') as writer:
        writer.write(scrape.text)
        writer.write('\n')
I want to write the result as CSV instead, so I can open it in Excel:
clear = self.browser.find_elements_by_class_name('Whitebackground')
for scrape in clear:
    with open('result.csv', 'a') as writer:
        writer.write(scrape.text)
        writer.write('\n')
My problem is that I have to fill 4 columns.
Currently I get my result like this:
656473930362736
The car needs to change the oil
Model: sedan
type of fuel: Gasoline
I want to receive my result in CSV in this way
'Number'; 'description'; 'Model'; 'Type of fuel'
6564...; The car needs..; sedan ; Gasoline
'Number', 'description', 'Model', 'Type of fuel' would be the column titles,
and '6564...', 'The car needs...', 'sedan', 'Gasoline' would be the row values.
Does anyone have any idea how I can do this?
If you can convert your data into dictionaries, it's super easy:
import csv

data = []
datapoint = {}
datapoint['Number'] = 656473930362736
datapoint['Description'] = "The car needs to change the oil."
datapoint['Model'] = "Sedan"
datapoint['Type of Fuel'] = "Gasoline"
data.append(datapoint)

fieldnames = ['Number', 'Description', 'Model', 'Type of Fuel']

def filewriter(filename, data, fieldnames):
    with open(filename, "w", newline='', encoding='utf-8-sig') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=',', dialect='excel')
        writer.writeheader()
        for row in data:
            writer.writerow(row)

filewriter("out.csv", data, fieldnames)
Converting your data into dictionaries is a separate problem, but it should be no big deal if your data is well structured.
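For the four-line blocks shown in the question, a conversion sketch (assuming every record always has exactly these four lines, in this order) could look like:

```python
def parse_record(text):
    """Turn one scraped four-line block into a dict. Assumes the fixed
    line order from the question: number, description, 'Model: ...',
    'type of fuel: ...'."""
    lines = text.strip().split('\n')
    return {
        'Number': lines[0].strip(),
        'Description': lines[1].strip(),
        'Model': lines[2].split(':', 1)[1].strip(),
        'Type of Fuel': lines[3].split(':', 1)[1].strip(),
    }

sample = ("656473930362736\n"
          "The car needs to change the oil\n"
          "Model: sedan\n"
          "type of fuel: Gasoline")
print(parse_record(sample))
```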
Simply parse your text by splitting it into elements:
txt = "656473930362736\n" \
      "The car needs to change the oil\n" \
      "Model: sedan\n" \
      "type of fuel: Gasoline"
list_of_elements = txt.split('\n')
required_text = (list_of_elements[0] + ';' + list_of_elements[1] + ';' +
                 list_of_elements[2].split(':')[1].strip() + ';' +
                 list_of_elements[3].split(':')[1].strip())
file.write(required_text + '\n')
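Alternatively, you can let the csv module handle the ';' delimiter and any quoting for you. A sketch, assuming the same four-line input and writing to a file named result.csv:

```python
import csv

txt = ("656473930362736\n"
       "The car needs to change the oil\n"
       "Model: sedan\n"
       "type of fuel: Gasoline")
parts = txt.split('\n')
# Keep the first two lines whole, take the value after ':' for the rest.
row = [parts[0], parts[1],
       parts[2].split(':', 1)[1].strip(),
       parts[3].split(':', 1)[1].strip()]

with open('result.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(['Number', 'description', 'Model', 'Type of fuel'])
    writer.writerow(row)
```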
I have a Python function that creates a CSV file using a PostgreSQL COPY statement. I need to add a new column to this spreadsheet called 'UAL' with an example value of, say, 30,000, but without editing the COPY statement. This is the current code:
copy_sql = '''COPY (
    SELECT
        e.name AS "Employee Name",
        e.title AS "Job Title",
        e.gross AS "Total Pay",
        e.total AS "Total Pay & Benefits",
        e.year AS "Year",
        e.notes AS "Notes",
        j.name AS "Agency",
        e.status AS "Status"
    FROM employee_employee e
    INNER JOIN jurisdiction_jurisdiction j ON e.jurisdiction_id = j.id
    WHERE
        e.year = 2011 AND
        j.id = 4479
    ORDER BY "Agency" ASC, "Total Pay & Benefits" DESC
) TO STDOUT WITH CSV HEADER'''
with open(path, 'w') as csvfile:
    self.cursor.copy_expert(copy_sql, csvfile)
What I am trying to do is use something like csv.writer to add content like this:
with open(path, 'w') as csvfile:
    self.cursor.copy_expert(copy_sql, csvfile)
    writer = csv.writer(csvfile)
    writer.writerow(['test123'])
But this is adding the text to the last row. I am also unsure how to add a new header column. Any advice?
Adding a header is easy: write the header before the call to copy_expert.
with open(path, 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["my", "super", "header"])
    self.cursor.copy_expert(copy_sql, csvfile)
But a column cannot be added without re-reading the file and appending the extra value to each row, so the above alone doesn't help much.
If the file isn't too big and fits in memory, you could write the sql output to a "fake" file:
import io
fakefile = io.StringIO()
self.cursor.copy_expert(copy_sql, fakefile)
Now rewind the file and parse it as CSV, adding the extra column when writing it back:
import csv

fakefile.seek(0)
with open(path, 'w', newline="") as csvfile:
    writer = csv.writer(csvfile)
    reader = csv.reader(fakefile)  # works if copy_expert uses "," as separator, else change it
    writer.writerow(["my", "super", "header", "UAL"])
    for row in reader:
        writer.writerow(row + [30000])
or instead of the inner loop:
writer.writerows(row+[30000] for row in reader)
And if the file is too big, write it to a temp file first and proceed the same way (less performant).
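A sketch of that temp-file variant. The two hard-coded rows stand in for what self.cursor.copy_expert(copy_sql, tmp) would produce, and the column and file names are illustrative:

```python
import csv
import tempfile

with tempfile.TemporaryFile('w+', newline='') as tmp:
    # In the real code this write is replaced by:
    #   self.cursor.copy_expert(copy_sql, tmp)
    tmp.write('Alice,Engineer,90000\nBob,Analyst,70000\n')

    tmp.seek(0)  # rewind before re-reading
    with open('out_with_ual.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        reader = csv.reader(tmp)
        writer.writerow(['Employee Name', 'Job Title', 'Total Pay', 'UAL'])
        writer.writerows(row + [30000] for row in reader)
```

The temp file is deleted automatically when the with block exits, so no cleanup is needed.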
I'm attempting to get a series of weather reports from a website. The code below creates the URLs for the XML files I want; what would be the best way to save each returned XML file under a different name?
with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = "http://api.met.no/weatherapi/locationforecast/1.9/?"
        lat = "lat=" + format(y)
        lon = "lon=" + format(x)
        text = url + lat + ";" + lon
I have been saving single XML files with this code:
response = requests.get(text)
xml_text = response.text
winds = bs4.BeautifulSoup(xml_text, "xml")
f = open('test.xml', "w")
f.write(winds.prettify())
f.close()
The first column of the CSV file has city names on it, I would ideally like to use those names to save each XML file as it is created. I'm sure another for loop would do, I'm just not sure how to create it.
Any help would be great, thanks again stack.
You have done most of the work already. Just use rows[0] as your filename: assuming rows[0] is 'mumbai', then rows[0] + '.xml' gives you 'mumbai.xml'. You might want to check whether city names have spaces which need to be removed, etc.
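If the city names may contain spaces or other characters that are awkward in file names, a small helper can sanitize them first. This is a sketch (the helper name is mine; adjust the allowed characters to taste):

```python
import re

def safe_filename(name, ext='.xml'):
    """Lowercase the name, replace runs of non-alphanumeric characters
    with underscores, and append the extension."""
    cleaned = re.sub(r'[^A-Za-z0-9]+', '_', name.strip()).strip('_')
    return cleaned.lower() + ext

print(safe_filename('Mumbai'))      # mumbai.xml
print(safe_filename('New Delhi '))  # new_delhi.xml
```

You would then open `safe_filename(rows[0])` instead of `rows[0] + '.xml'` in the loop.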
with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = "http://api.met.no/weatherapi/locationforecast/1.9/?"
        lat = "lat=" + format(y)
        lon = "lon=" + format(x)
        text = url + lat + ";" + lon
        response = requests.get(text)
        xml_text = response.text
        winds = bs4.BeautifulSoup(xml_text, "xml")
        f = open(rows[0] + '.xml', "w")
        f.write(winds.prettify())
        f.close()
I need to get information from a list and add a year column derived from the file name. I'm still not sure how to add a 'year' field to each record. Can I use append?
And for the output file, do I just need to use outputcsv.writerow(records)?
This is the part of the code where I'm stuck:
filenames = ('babyQld2010.csv',
             'babyQld2011.csv',
             'babyQld2012.csv',
             'babyQld2012.csv',
             'babyQld2014.csv')

outFile = open('babyQldAll.csv', 'w')
csvFile_out = csv.writer(outFile, delimiter=',')
for filename in filenames:
    name, ext = filename.split('.')
    year = name[-4:]  # extract year from file names
    records = extract_names(filename)
    # Get (name, count, gender) from list "records",
    # and add value of "year" and write into output file (using "for" loop)
The output file should look like:
2010,Lola,69,Girl
For input I have five files, babyQld2010.csv, babyQld2011.csv, babyQld2012.csv, babyQld2012.csv, babyQld2014.csv, which contain rows like:
Mia,425,William,493
I have to sort these into a format, which I have already done, saving the rows in the list 'records':
Lola,69,Girl
Now I need to add a 'year' field to each record in the 'records' list and export the csv file.
This is my full code:
import csv
def extract_names(filename):
''' Extract babyname, count, gender from a csv file,
and return the data in a list.
'''
inFile = open(filename, 'rU')
csvFile = csv.reader(inFile, delimiter=',')
# Initialization
records = []
rowNum = 0
for row in csvFile:
if rowNum != 0:
# +++++ You code here ++++
# Read each row of csv file and save information in list 'records'
# as (name, count, gender)
records.append([row[0], row[1], "Female"])
records.append([row[2], row[3], "Male"])
print('Process each row...')
rowNum += 1
inFile.close()
return(records)
#### Start main program #####
filenames = ('babyQld2010.csv',
             'babyQld2011.csv',
             'babyQld2012.csv',
             'babyQld2012.csv',
             'babyQld2014.csv')

with open('babyQldAll.csv', 'w') as outFile:
    csvFile_out = csv.writer(outFile, delimiter=',')
    for filename in filenames:
        name, ext = filename.split('.')
        year = name[-4:]  # extract year from file names
        records = extract_names(filename)
        for record in records:
            csvFile_out.writerow([year] + record)
            print("Write in csv file...")
To get the year from the csv file name you can simply split the string at '.' and then take the last four characters of the first part. Example -
>>> s = 'babyQld2010.csv'
>>> s.split('.')[0][-4:]
'2010'
Then simply iterate over your list of records, which you say is correct, and for each list within it use list concatenation to create a new list with the year at the start, and write that to the csv file.
I would also suggest that you use a with statement for opening the file to write to (and even in the function where you are reading from the other csv files). Example -
filenames = ('babyQld2010.csv',
             'babyQld2011.csv',
             'babyQld2012.csv',
             'babyQld2012.csv',
             'babyQld2014.csv')

with open('babyQldAll.csv', 'w') as outFile:
    csvFile_out = csv.writer(outFile, delimiter=',')
    for filename in filenames:
        name, ext = filename.split('.')
        year = name[-4:]  # extract year from file names
        records = extract_names(filename)
        for record in records:
            csvFile_out.writerow([year] + record)
Yes, you can just append the year column to each row as you read it in from your source files. You can read in & write out each row as a dictionary so that you can use your existing column headers to address the data if you need to massage it on the way through.
Using the csv.DictWriter() method you specify your headers (fieldnames) when you set it up. You can then write them out with the writeheader() method.
import csv

file_list = ['babyQld2010.csv',
             'babyQld2011.csv',
             'babyQld2012.csv',
             'babyQld2012.csv',
             'babyQld2014.csv']

outFile = open('babyQldAll.csv', 'wb')
csv_writer = csv.DictWriter(outFile,
                            fieldnames=['name', 'count', 'gender', 'year'])
csv_writer.writeheader()

for a_file in file_list:
    name, ext = a_file.split('.')
    year = name[-4:]
    with open(a_file, 'rb') as inFile:
        csv_read_in = csv.DictReader(inFile)
        for row in csv_read_in:
            row['year'] = year
            csv_writer.writerow(row)

outFile.close()
Hope this helps.