Writing a CSV File with Pandas Python - python

I wrote a Python script that opens a site, runs a search there, and scrapes the search results.
I currently write the results to a text file:
clear = self.browser.find_elements_by_class_name('Whitebackground')
for scrape in clear:
    with open('result.txt', 'a') as writer:
        writer.write(scrape.text)
        writer.write('\n')
        # no explicit writer.close() needed: the with block closes the file
I want to write the results to a CSV file instead, so that I can open it in Excel:
clear = self.browser.find_elements_by_class_name('Whitebackground')
for scrape in clear:
    with open('result.csv', 'a') as writer:
        writer.write(scrape.text)
        writer.write('\n')
        # no explicit writer.close() needed: the with block closes the file
My problem is that I need to fill four columns.
My current output looks like this:
656473930362736
The car needs to change the oil
Model: sedan
type of fuel: Gasoline
I want the CSV output to look like this:
'Number'; 'description'; 'Model'; 'Type of fuel'
6564...; The car needs..; sedan ; Gasoline
'Number', 'description', 'Model', 'Type of fuel' would be the column headers.
'6564...', 'The car needs...', 'sedan', 'Gasoline' would be the values in each row.
Does anyone have any idea how I can do this?

If you can convert your data into dictionaries, it's super easy:
import csv
data = []
datapoint = {}
datapoint['Number'] = 656473930362736
datapoint['Description'] = "The car needs to change the oil."
datapoint['Model'] = "Sedan"
datapoint['Type of Fuel'] = "Gasoline"
data.append(datapoint)
fieldnames = ['Number','Description','Model','Type of Fuel']
def filewriter(filename, data, fieldnames):
    # utf-8-sig writes a byte order mark, which helps Excel detect the encoding
    with open(filename, "w", newline='', encoding='utf-8-sig') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=',', dialect='excel')
        writer.writeheader()
        for row in data:
            writer.writerow(row)
filewriter("out.csv", data, fieldnames)
Converting your data into dictionaries is a separate problem, but it should be no big deal if your data is well structured.
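As a rough sketch of that conversion step, the four-line block from the question could be turned into a dictionary as shown here. The function name parse_result is hypothetical, and the sketch assumes every scraped block has exactly the four lines shown in the question, with "Model" and "type of fuel" each followed by a colon.

```python
def parse_result(text):
    """Turn one four-line scraped block into a dict keyed by column name."""
    # drop empty lines and surrounding whitespace
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    number, description, model_line, fuel_line = lines[:4]
    return {
        'Number': number,
        'Description': description,
        # keep only the part after the first colon, e.g. "Model: sedan" -> "sedan"
        'Model': model_line.split(':', 1)[1].strip(),
        'Type of Fuel': fuel_line.split(':', 1)[1].strip(),
    }

sample = "656473930362736\nThe car needs to change the oil\nModel: sedan\ntype of fuel: Gasoline"
datapoint = parse_result(sample)
```

Each dictionary built this way can be appended to the data list and passed to a csv.DictWriter.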

Simply parse your text by splitting it into elements:
txt = "656473930362736\n" \
      "The car needs to change the oil\n" \
      "Model: sedan\n" \
      "type of fuel: Gasoline"
list_of_elements = txt.split('\n')
# for the last two lines, keep only the value after the colon
required_text = list_of_elements[0] + ';' + list_of_elements[1] + ';' + list_of_elements[2].split(':')[1].strip() + ';' + list_of_elements[3].split(':')[1].strip()
with open('result.csv', 'a') as file:
    file.write(required_text + '\n')
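Joining fields by hand breaks as soon as a description contains a semicolon. A minimal sketch of the same idea using csv.writer with delimiter=';' follows; the helper name write_rows is made up for illustration, and an in-memory buffer stands in for the real file (for a file on disk, open it with newline='').

```python
import csv
import io

def write_rows(out, rows):
    """Write a header line and the given rows, separated by semicolons."""
    writer = csv.writer(out, delimiter=';')
    writer.writerow(['Number', 'Description', 'Model', 'Type of fuel'])
    writer.writerows(rows)

buffer = io.StringIO()  # stand-in for open('result.csv', 'w', newline='')
write_rows(buffer, [['656473930362736', 'The car needs to change the oil', 'sedan', 'Gasoline']])
output = buffer.getvalue()
```

The writer quotes any field that contains the delimiter, so the column count stays stable.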

Related

Add data to new column and first row in CSV file

I have a Python function that creates a CSV file using a PostgreSQL COPY statement. I need to add a new column called 'UAL' to this spreadsheet, with an example value in the first row of, say, 30,000, but without editing the COPY statement. This is the current code:
copy_sql = '''COPY (
    SELECT
        e.name AS "Employee Name",
        e.title AS "Job Title",
        e.gross AS "Total Pay",
        e.total AS "Total Pay & Benefits",
        e.year AS "Year",
        e.notes AS "Notes",
        j.name AS "Agency",
        e.status AS "Status"
    FROM employee_employee e
    INNER JOIN jurisdiction_jurisdiction j on e.jurisdiction_id = j.id
    WHERE
        e.year = 2011 AND
        j.id = 4479
    ORDER BY "Agency" ASC, "Total Pay & Benefits" DESC
)'''
with open(path, 'w') as csvfile:
    self.cursor.copy_expert(copy_sql, csvfile)
What I am trying to do is use something like csv.writer to add content like this:
with open(path, 'w') as csvfile:
    self.cursor.copy_expert(copy_sql, csvfile)
    writer = csv.writer(csvfile)
    writer.writerow('test123')
But this adds the text to the last row. I am also unsure how to add a new header column. Any advice?
Adding a header is easy: write the header before the call to copy_expert.
with open(path, 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["my","super","header"])
    self.cursor.copy_expert(copy_sql, csvfile)
But adding a column cannot be done without re-reading the file and adding your info to each row, so the above solution doesn't help much.
If the file isn't too big and fits in memory, you could write the sql output to a "fake" file:
import io
fakefile = io.StringIO()
self.cursor.copy_expert(copy_sql, fakefile)
Now rewind the file and parse it as CSV, adding the extra column when writing it back:
import csv
fakefile.seek(0)
with open(path, 'w', newline="") as csvfile:
    writer = csv.writer(csvfile)
    reader = csv.reader(fakefile)  # works if copy_expert uses "," as separator, else change it
    writer.writerow(["my","super","header","UAL"])
    for row in reader:
        writer.writerow(row+[30000])
Or, instead of the for loop:
    writer.writerows(row+[30000] for row in reader)
And if the file is too big, write it to a temporary file and proceed the same way (less performant).
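The temporary-file variant could look roughly like the sketch below. The names add_column and fake_dump are hypothetical; fake_dump stands in for the call to self.cursor.copy_expert(copy_sql, tmp), and the sketch assumes the dump is comma-separated and contains no header row of its own.

```python
import csv
import io
import tempfile

def add_column(dump, out, header, value):
    """Let dump() fill a temp file, then copy it to out with one extra column."""
    with tempfile.TemporaryFile('w+', newline='') as tmp:
        dump(tmp)    # in the real code: self.cursor.copy_expert(copy_sql, tmp)
        tmp.seek(0)  # rewind before reading the rows back
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(row + [value] for row in csv.reader(tmp))

def fake_dump(f):
    # stand-in for the database export, for demonstration only
    f.write('Ann,Clerk\r\nBob,Chief\r\n')

out = io.StringIO()
add_column(fake_dump, out, ['Employee Name', 'Job Title', 'UAL'], 30000)
result = out.getvalue()
```

The rows are streamed one at a time, so only the temporary file on disk, not the process memory, has to hold the full export.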

How to save multiple xml files in python

I'm trying to get a series of weather reports from a website. The code below builds the URLs for the XML files I want. What would be the best way to save the returned XML files under different names?
with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = ("http://api.met.no/weatherapi/locationforecast/1.9/?")
        lat = "lat="+format(y)
        lon = "lon="+format(x)
        text = url + format(lat) + ";" + format(lon)
I have been saving single XML files with this code:
response = requests.get(text)
xml_text=response.text
winds= bs4.BeautifulSoup(xml_text, "xml")
f = open('test.xml', "w")
f.write(winds.prettify())
f.close()
The first column of the CSV file contains city names, and I would ideally like to use those names to save each XML file as it is created. I'm sure another for loop would do it; I'm just not sure how to write it.
Any help would be great, thanks again Stack.
You have done most of the work already. Just use rows[0] as your filename. Assuming rows[0] is 'mumbai', then rows[0]+'.xml' will give you 'mumbai.xml' as the filename. You might want to check if city names have spaces which need to be removed, etc.
import csv
import bs4
import requests
with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = ("http://api.met.no/weatherapi/locationforecast/1.9/?")
        lat = "lat="+format(y)
        lon = "lon="+format(x)
        text = url + format(lat) + ";" + format(lon)
        response = requests.get(text)
        xml_text=response.text
        winds= bs4.BeautifulSoup(xml_text, "xml")
        # rows[0] is the city name, so each city gets its own file
        f = open(rows[0]+'.xml', "w")
        f.write(winds.prettify())
        f.close()
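The check on city names mentioned above could be sketched like this. The helper name safe_filename and the choice of replacing unusual characters with an underscore are assumptions for illustration, not something the question prescribes.

```python
import re

def safe_filename(city, extension='.xml'):
    """Build a file name from a city name, replacing unusual characters with '_'."""
    # \w is Unicode-aware in Python 3, so letters such as 'ø' are kept
    cleaned = re.sub(r'[^\w-]+', '_', city.strip()).strip('_')
    # fall back to a fixed name if nothing usable is left
    return (cleaned or 'unnamed') + extension

names = [safe_filename(c) for c in ('mumbai', ' New York ', "St. John's")]
```

In the loop above, open(safe_filename(rows[0]), "w") would then replace open(rows[0]+'.xml', "w").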

Python write to csv ignore commas

I am writing to a CSV file and it works well, except that some of the rows have commas in their names, and when I write to the CSV those commas throw the fields off. How do I write to a CSV file and ignore the commas in the rows?
header = "Id, FACID, County, \n"
row = "{},{},{}\n".format(label2,facidcsv,County)
with open('example.csv', 'a') as wildcsv:
    if z==0:
        wildcsv.write(header)
        wildcsv.write(row)
    else:
        wildcsv.write(row)
Strip any commas from each field that you write to the row, e.g.:
label2 = ''.join(label2.split(','))
facidcsv = ''.join(facidcsv.split(','))
County = ''.join(County.split(','))
row = "{},{},{}\n".format(label2,facidcsv,County)
Generalized to format a row with any number of fields:
def format_row(*fields):
    row = ''
    for field in fields:
        if row:
            row = row + ', ' + ''.join(field.split(','))
        else:
            row = ''.join(field.split(','))
    return row
label2 = 'label2, label2'
facidcsv = 'facidcsv'
county = 'county, county'
print(format_row(label2, facidcsv, county))
wildcsv.write(format_row(label2, facidcsv, county))
Output
label2 label2, facidcsv, county county
As @TomaszPlaskota and @quapka allude to in the comments, Python's csv writers and readers by default wrap any field that contains a delimiter in double quotes ('"') when writing, and expect the same when reading. Most applications that work with CSV files follow the same format. So the following is the preferred approach if you want to keep the commas in the output fields:
import csv
label2 = 'label2, label2'
facidcsv = 'facidccv'
county = 'county, county'
# newline='' lets the csv module control line endings itself
with open('out.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow((label2, facidcsv, county))
out.csv
"label2, label2",facidccv,"county, county"

python rename header row after w.writerow is finished

In the below script, I cannot figure out how to either rename or "faux-rename" the headers.
import csv,time,string,os
print "rendering report. This will take a few minutes..."
raw_report = "\\\\network\\x\\RAWREPORT.csv"
today = time.strftime("%Y-%m-%d")
fields = ["As of Date", "EB", "Cycle", "Col", "APP Name", "Home Country" ]
with open(raw_report) as infile, open("c:\\upload\\test_" + today + ".csv", "wb") as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, fields, extrasaction="ignore")
    w.writeheader()
    for row in r:
        w.writerow(row)
This script works fine: it takes 6 columns out of a .csv with about 90 columns. But in order to write only those 6 columns in fields to my output file, I need to refer to them by name.
However, I ultimately need them to be named something different (e.g. "order_date", "phone_number"... instead of "As of Date", "EB").
I tried the approach of just skipping the first row and writing my own:
r = csv.DictReader(infile)
w = csv.DictWriter(outfile, fields, extrasaction="ignore")
next(r, None)
w.writerow(["order_date","phone_number",...])
But then Python doesn't know which columns to copy into the new file, because the names don't match.
How would I go about this? Can I reference the columns I want to copy by number instead of by name, or is there a way to go back and change the first row once everything has been copied?
I was thinking about this incorrectly. I can define fields as the columns to pull from the original file, but I don't necessarily need to write those names to the output file, since the two are separate files.
This code works:
fields = ["As of Date", "EB", "Cycle", "Col", "APP Name", "Home Country" ]
with open(raw_report) as infile, open("c:\\upload\\test_" + today + ".csv", "wb") as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, fields, extrasaction="ignore")
    #w.writeheader() #remove the writeheader command
    #write our custom header
    wtr = csv.writer( outfile )
    wtr.writerow(["order_date", "phone_number", etc....])
    #then, write the rest of the file
    for row in r:
        w.writerow(row)
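A related way to express the same idea in Python 3 is to keep the old and new names side by side in one mapping. This is only a sketch: the function name copy_renamed is hypothetical, and the pairing of "As of Date" with "order_date" and "EB" with "phone_number" is assumed from the examples in the question. In-memory buffers stand in for the real files.

```python
import csv
import io

# assumed pairing: source column name -> output column name
RENAMES = {"As of Date": "order_date", "EB": "phone_number"}

def copy_renamed(infile, outfile, renames):
    """Copy only the mapped columns, writing them under their new names."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=list(renames.values()))
    writer.writeheader()
    for row in reader:
        # look each value up by its old name and store it under the new one
        writer.writerow({new: row[old] for old, new in renames.items()})

source = io.StringIO("As of Date,EB,Cycle\r\n2020-01-01,555-0100,7\r\n")
target = io.StringIO()
copy_renamed(source, target, RENAMES)
result = target.getvalue()
```

Columns that are not in the mapping, such as Cycle in this sample, are simply left out of the output.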

How to not just add a new first column to csv but alter the header names

I would like to do the following: read a CSV file, add a new first column, rename some of the columns, and then load the records from the CSV file.
Ultimately, I would like the first column to be populated with the file name.
I'm fairly new to Python. I've more or less worked out how to change the field names, but loading the data is a problem because it looks for the original field names, which no longer match.
Code snippet
import csv
import os
inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"
with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.DictReader(inFile)
    fieldnames = ['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype']
    w = csv.DictWriter(outfile,fieldnames)
    w.writeheader()
    # *** Here is where I start to go wrong
    # copy the rest
    for node, row in enumerate(r,1):
        w.writerow(dict(row))
Error
File "D:\Apps\Python27\ArcGIS10.3\lib\csv.py", line 148, in _dict_to_list
+ ", ".join([repr(x) for x in wrong_fields]))
ValueError: dict contains fields not in fieldnames: 'Databases [xsi:type]', 'Resources [xsi:type]', 'ID'
I would like some assistance, not just to learn but to truly understand what I need to do.
Cheers and thanks
Peter
Update..
I think I've worked it out
import csv
import os
inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"
with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.reader(inFile)
    w = csv.writer(outfile)
    header = next(r)
    header.insert(0, 'MapSvcName')
    #w.writerow(header)
    # note: next(r) above already consumed the old header, so this call also skips the first data row
    next(r, None) # skip the first row from the reader, the old header
    # write new header
    w.writerow(['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype'])
    prevRow = next(r)
    prevRow.insert(0, '0')
    w.writerow(prevRow)
    for row in r:
        if prevRow[-1] == row[-1]:
            val = '0'
        else:
            val = prevRow[-1]
        row.insert(0,val)
        prevRow = row
        w.writerow(row)
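For the stated goal of putting the file name in the first column, a Python 3 sketch along the following lines may be closer to what was asked. The function name add_name_column is hypothetical, and the old-to-new pairs in RENAMES are guessed from the column names in the error message above, so they would need checking against the real file.

```python
import csv
import io

# guessed pairs: old header name -> new header name
RENAMES = {
    'Databases [xsi:type]': 'Databasetype',
    'Resources [xsi:type]': 'Resourcestype',
    'ID': 'ID_A',
}

def add_name_column(infile, outfile, name, renames):
    """Prefix every row with name and rename header cells found in renames."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    header = next(reader)  # consume the old header exactly once
    writer.writerow(['MapSvcName'] + [renames.get(col, col) for col in header])
    for row in reader:
        writer.writerow([name] + row)

source = io.StringIO("ID,Name\r\n1,roads\r\n")
target = io.StringIO()
add_name_column(source, target, 'manifest1', RENAMES)
result = target.getvalue()
```

Working with plain lists instead of DictReader and DictWriter avoids the mismatch between old and new field names, because rows are copied by position.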
