I want to read a CSV file that is present on a Unix machine at a path, say /var/lib/Folder/abc.csv.
I am using the code below to read this file, but it looks like it isn't returning any rows, and hence it never goes inside the for loop.
file_path = "/var/lib/Folder/abc.csv"
with open(file_path, newline='') as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        logging.debug(str(datetime.datetime.now()) + " Checking rows...")
        logging.debug(str(datetime.datetime.now()) + " Row(" + str(count) + ") = " + row)
The CSV file looks something like this:
"Account ID","Detail","Description","Date created"
"123456","Customer","Savings","2017/10/24"
I am using Python 2.7
This works when I try it locally. But I am actually using Jenkins to run this, and the file is placed on my Jenkins master server. I have copied that file from the code server to Jenkins using the code below:

with ssh_shell.open(path + fileName, "rb") as remote_file:
    with open(path + fileName, "wb") as local_file:
        shutil.copyfileobj(remote_file, local_file)

After this I am trying to read the file, but it's not working, i.e. it does not go inside that for loop. Any idea on that?
You could try the following:
reader = csv.reader(csv_file, delimiter=',', quotechar='"')
Also, please check what "reader" actually contains. If the for loop is never entered, that means the contents have not been read correctly. Start debugging from there.
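For example, a minimal sanity check along these lines (reusing the file_path from the question; this is just a debugging sketch, not part of the original answer) can show whether the file is empty or unreadable before it ever reaches csv.reader:

import logging

file_path = "/var/lib/Folder/abc.csv"
with open(file_path) as csv_file:
    raw = csv_file.read()
    # If this logs 0, the copy step produced an empty file and csv.reader
    # has nothing to iterate over.
    logging.debug("File length in characters: %d" % len(raw))
    logging.debug("First 200 characters: %r" % raw[:200])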
Try this: change row to str(row).
file_path = "/var/lib/Folder/abc.csv"
with open(file_path, newline='') as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        logging.debug(str(datetime.datetime.now()) + " Checking rows...")
        logging.debug(str(datetime.datetime.now()) + " Row(" + str(count) + ") = " + str(row))
So I wrote a little program in Python that lets me take a .csv file, filter out the lines I need, and then export those into a new .txt file.
This worked quite well, so I decided to make it more user friendly by letting the user select the file to be converted through the console (command line).
My problem: the file is imported as a .csv file but not exported as a .txt file, which leads to my program overwriting the original file. The original ends up empty because of a step in my program that deletes the first two lines of the output text.
Does anyone know a solution for this?
Thanks :)
import csv
import sys

userinput = raw_input('List:')
saveFile = open(userinput, 'w')
with open(userinput, 'r') as file:
    reader = csv.reader(file)
    count = 0
    for row in reader:
        print(row[2])
        saveFile.write(row[2] + ' ""\n')
saveFile.close()

saveFile = open(userinput, 'r')
data_list = saveFile.readlines()
saveFile.close()

del data_list[1:2]

saveFile = open(userinput, 'w')
saveFile.writelines(data_list)
saveFile.close()
Try This:
userinput = raw_input('List:')
f_extns = userinput.split(".")
saveFile = open(f_extns[0]+'.txt', 'w')
I think you probably just want to save the file with a new name. Extracting extension from filename in Python talks about splitting out the extension, so then you can just add your own extension.
You would end up with something like:
name, ext = os.path.splitext(userinput)
saveFile = open(name + '.txt', 'w')
You probably just need to change the extension of the output file. Here is a solution that sets the output file extension to .txt; if the input file is also .txt then there will be a problem, but for all other extensions of the input file this should work.
import csv
import os

file_name = input('Name of file:')

# https://docs.python.org/3/library/os.path.html#os.path.splitext
# https://stackoverflow.com/questions/541390/extracting-extension-from-filename-in-python
file_name, file_ext_r = os.path.splitext(file_name)
file_ext_w = '.txt'

file_name_r = '{}{}'.format(file_name, file_ext_r)
file_name_w = '{}{}'.format(file_name, file_ext_w)
print('File to read:', file_name_r)
print('File to write:', file_name_w)

with open(file_name_r, 'r') as fr, open(file_name_w, 'w') as fw:
    reader = csv.reader(fr)
    for i, row in enumerate(reader):
        print(row[2])
        if i >= 2:
            fw.write(row[2] + ' ""\n')
I also simplified your logic to avoid writing the first two lines to the output file; there is no need to read and rewrite the output file afterwards.
Does this work for you?
I am pretty new to Python and trying to run a script to edit CSV files. The problem I am facing is that I need to split the CSV files into smaller pieces (they are large files and I was getting memory errors) and then run another script to edit them. But when I try to combine these two scripts and run the test, the script reads only the first small file and not the rest of them.
For example: when I split the main CSV file, the pieces are named big-1.csv, big-2.csv, and so on. Then, when the script picks up the files to edit, only big-1.csv gets edited and the rest do not.
The script is:
import csv
from csv import DictWriter

divisor = 990
outfileno = 1
outfile = None

with open('MOCK_DATA.csv', 'r', newline='') as infile:
    infile_iter = csv.reader(infile, delimiter='\t')
    header = next(infile_iter)
    for index, row in enumerate(infile_iter):
        if index % divisor == 0:
            if outfile:
                outfile.close()
            outfilename = 'big-{}.csv'.format(outfileno)
            outfile = open(outfilename, 'w', newline='')
            outfileno += 1
            writer = csv.writer(outfile, delimiter='\t', quoting=csv.QUOTE_NONE)
            writer.writerow(header)
        writer.writerow(row)

# Don't forget to close the last file
if outfile:
    outfile.close()
# export the data
for i in range(1, 2):
    with open("big-" + str(i) + ".csv") as people_file:
        next(people_file)
        corrected_people = []
        for person_line in people_file:
            chomped_person_line = person_line.rstrip()
            person_tokens = chomped_person_line.split(",")
            # check that each field has the expected type
            try:
                corrected_person = {
                    "id": person_tokens[0],
                    "first_name": person_tokens[1],
                    "last_name": "".join(person_tokens[2:-3]),
                    "email": person_tokens[-3],
                    "gender": person_tokens[-2],
                    "ip_address": person_tokens[-1]
                }
                if not corrected_person["ip_address"].startswith(
                        "") and corrected_person["ip_address"] != "n/a":
                    raise ValueError
                corrected_people.append(corrected_person)
            except (IndexError, ValueError):
                # print the ignored lines, so manual correction can be performed later.
                print("Could not parse line: " + chomped_person_line)
    with open("fix-" + str(i) + ".csv", "w") as corrected_people_file:
        writer = DictWriter(
            corrected_people_file,
            fieldnames=[
                "id", "first_name", "last_name", "email", "gender", "ip_address"
            ], delimiter=',')
        writer.writeheader()
        writer.writerows(corrected_people)
I think this may be an issue with how the smaller files are read in the for loop. The script runs without any error. Please help.
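(A minimal illustrative sketch, not from the original post: range(1, 2) only yields 1, so one way to touch every piece is to look for the files on disk instead of hard-coding the range.)

import glob

# Hypothetical sketch: loop over every split file rather than only big-1.csv.
for filename in sorted(glob.glob("big-*.csv")):
    with open(filename) as people_file:
        next(people_file)              # skip the header row
        for person_line in people_file:
            pass                       # same correction logic as in the script above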
I've got a CSV of over 500 entries and I'm trying to generate redirect files. The formatting of the CSV is:
/contact,/contact-us,
/about,/about-us,
The /contact is the old URL and the /contact-us is the new URL.
The formatting of the desired .htm file is:
url = "/contact"
is_hidden = 0
==
<?php
function onStart(){return Redirect::to("/contact-us");}
?>
==
The filename for the .htm files are unimportant (could be 1.htm, 2.htm, etc.).
I haven't really touched Python in several years and I'm not sure if it's the best option, but from what I've been reading, it seems like it's a solid choice for CSV parsing.
Any help would be greatly appreciated.
Edit:
This is what I have so far
import pip
import csv

with open('redirects.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        print 'url = "' + row[0] + '"\nis_hidden = 0\n==\n\n<?php\nfunction onStart(){return Redirect::to("' + row[1] + '");}\n?>\n=='
This prints out exactly what I need. I just need to put each entry into a .htm file (auto-incremented filename).
Edit #2:
I got what I was looking for with this code:
import pip
import csv

count = 0
with open('redirects.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        count += 1
        count_str = str(count)
        file = open('redirects/' + count_str + '.htm', 'w')
        file.write('url = "' + row[0] + '"\nis_hidden = 0\n==\n\n<?php\nfunction onStart(){return Redirect::to("' + row[1] + '");}\n?>\n==')
        file.close()
If I understand correctly, something like below might work.
counter = 0
with open('redirects.csv', 'r') as directories:
    for line in directories:
        if not line.strip():
            continue
        # each line looks like "/contact,/contact-us," -> old URL, new URL
        old_url, new_url = [x.strip() for x in line.split(",")[:2]]
        f = open(str(counter) + '.htm', 'w')
        f.writelines([
            'url = "' + old_url + '"\n',
            'is_hidden = 0\n',
            '==\n',
            '\n',
            '<?php\n',
            'function onStart(){return Redirect::to("' + new_url + '");}\n',
            '?>\n',
            '=='])
        f.close()
        counter += 1
Essentially the data is temperatures from four different states over the course of 12 months, so there are 48 files to be populated into a folder on my desktop. But I am not sure how to take the data being pulled from the web and then send the files my program saves to the desktop directory. That's what I am confused about: how to take the files being created in my program and send them to a folder on my desktop.
I am copying the data from the web, cleaning it up, saving it into a file, and then I want to save that file to a folder on my desktop.
Here is the code:
import urllib

def accessData(Id, Month):
    url = "https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=" + str(Id) + "&year=2017&month=" + str(Month) + "&graphspan=month&format=1"
    infile = urllib.urlopen(url)
    readLineByLine = infile.readlines()
    infile.close()
    return readLineByLine

f = open('stations.csv', 'r')
for line in f.readlines():
    vals = line.split(',')
    for j in range(1, 13):  # accessing months here from 1 to 12, b/c 13 exclusive
        data = accessData(line, j)
        filename = "{}-0{}-2017.csv".format(vals[0], j)
        print(str(filename))
        row_count = len(data)
        for i in range(2, row_count):
            if(data[i] != '<br>\n' and data[i] != '\n'):
                writeFile = open(filename, 'w')
                writeFile.write(data[i])
        openfile = open(Desktop, writeFile , 'r')
        file.close()
Have you tried running the script from your desktop? It looks like you haven't specified a directory, so running it from your desktop should output the results to your current working directory.
Alternatively, you could try using the built-in os library.
import os
os.getcwd() # to get the current working directory
os.chdir(pathname) # change your working directory to the path specified.
This would change your working directory to the place you want to save your files.
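For example (a hypothetical sketch; the "weather_data" folder name below is an assumption about where you want the files to land), you could change into the folder before the write loop:

import os

# Assumed destination folder on the desktop.
desktop_folder = os.path.join(os.path.expanduser("~"), "Desktop", "weather_data")
os.chdir(desktop_folder)  # files opened with bare names now land here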
Also, regarding the last four lines of your code: file is not an open file object, so you cannot close it. I also do not believe you need the openfile statement.
writeFile = open(filename, 'w')
writeFile.write(data[i])
openfile = open(Desktop, writeFile , 'r')
file.close()
Try this instead.
with open(filename, 'w') as writeFile:
    for i in range(2, row_count):
        if(data[i] != '<br>\n' and data[i] != '\n'):
            writeFile.write(data[i])
Using this approach you shouldn't need to close the file. 'w' writes as if to a new file; change this to 'a' if you need to append to the file instead.
You just need to open the output file with the full path to your destination, rather than just the filename (which would otherwise be saved into your current working directory).
Try something like:
f = open('stations.csv', 'r')
target_dir = "/path/to/your/Desktop/folder/"

for line in f.readlines():
    ...
    # We can open the file outside your inner "row" loop
    # using the combination of the path to your Desktop
    # and your filename
    with open(target_dir + filename, 'w') as writeFile:
        for i in range(2, row_count):
            if(data[i] != '<br>\n' and data[i] != '\n'):
                writeFile.write(data[i])
    # The "writeFile" object will close automatically outside the
    # "with ..." block
As others have mentioned, you could approach this in two different ways:
1) Run the script directly from the directory to which you would like to save the files. Then you would just need to specify the full path to the .csv file you are reading.
2) Provide the full path to where you would like to save the files when you write them; however, this seems more involved and unnecessary.
On another note, when opening files for reading or writing, use with to keep the file open only as long as you need it; when you exit the with block, the file is closed automatically.
Here is an example of Option 1 with some clean-up:
import urllib

def accessData(Id, Month):
    url = "https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=" + str(Id) + "&year=2017&month=" + str(Month) + "&graphspan=month&format=1"
    infile = urllib.urlopen(url)
    readLineByLine = infile.readlines()
    infile.close()
    return readLineByLine

with open('Path to File' + 'stations.csv', 'r') as f:
    for line in f.readlines():
        vals = line.split(',')
        for j in range(1, 13):
            data = accessData(line, j)
            filename = "{}-0{}-2017.csv".format(vals[0], j)
            with open(filename, 'w') as myfile:
                for i in range(2, len(data)):
                    if data[i] != '<br>\n' and data[i] != '\n':
                        myfile.write(data[i])
            print(filename + ' - Completed')
I am new to Python. I am working on GPS files and need to convert a CSV file containing all the GPS data to a KML file. Below is the Python code I am using:
import csv
#Input the file name.
fname = raw_input("Enter file name WITHOUT extension: ")
data = csv.reader(open(fname + '.csv'), delimiter = ',')
#Skip the 1st header row.
data.next()
#Open the file to be written.
f = open('csv2kml.kml', 'w')
#Writing the kml file.
f.write("<?xml version='1.0' encoding='UTF-8'?>\n")
f.write("<kml xmlns='http://earth.google.com/kml/2.1'>\n")
f.write("<Document>\n")
f.write(" <name>" + fname + '.kml' + "</name>\n")
for row in data:
    f.write(" <Placemark>\n")
    f.write(" <name>" + str(row[1]) + "</name>\n")
    f.write(" <description>" + str(row[0]) + "</description>\n")
    f.write(" <Point>\n")
    f.write(" <coordinates>" + str(row[3]) + "," + str(row[2]) + "," + str(row[4]) + "</coordinates>\n")
    f.write(" </Point>\n")
    f.write(" </Placemark>\n")
f.write("</Document>\n")
f.write("</kml>\n")
print "File Created. "
print "Press ENTER to exit. "
raw_input()
The CSV file I am using is available here: dip12Sep11newEdited.csv
The KML file generated is available here: csv2kml.kml
But the KML file is not being created correctly. Apparently, after some rows of the CSV, the code is not able to generate more Placemarks; it stops iterating. You can see that by scrolling to the last part of the generated KML file.
Can anyone help me find the error in the code? For some smaller CSV files it worked correctly and created the KML files fully.
Thanks.
You didn't answer the query above, but my guess is that the error is that you're not closing your output file (which would flush your output).
f.close()
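Alternatively (a small sketch, not from the original answer), opening the output file in a with block flushes and closes it for you automatically:

with open('csv2kml.kml', 'w') as f:
    f.write("<?xml version='1.0' encoding='UTF-8'?>\n")
    # ... all the remaining f.write(...) calls from the script go here ...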
Use etree to create your file:
http://docs.python.org/library/xml.etree.elementtree.html
It's included with Python and protects you from generating broken XML (e.g. because fname contained &, which has special meaning in XML).
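A minimal sketch of the idea (illustrative only; the tag layout mirrors the hand-written KML above, and the row values are placeholders):

import xml.etree.ElementTree as ET

# Build the same Placemark structure as above; ElementTree escapes
# special characters such as & automatically when writing.
kml = ET.Element("kml", xmlns="http://earth.google.com/kml/2.1")
doc = ET.SubElement(kml, "Document")
ET.SubElement(doc, "name").text = "csv2kml.kml"
placemark = ET.SubElement(doc, "Placemark")
ET.SubElement(placemark, "name").text = "placeholder name"         # e.g. row[1]
ET.SubElement(placemark, "description").text = "placeholder text"  # e.g. row[0]
point = ET.SubElement(placemark, "Point")
ET.SubElement(point, "coordinates").text = "lon,lat,alt"           # e.g. row[3], row[2], row[4]
ET.ElementTree(kml).write("csv2kml.kml", encoding="UTF-8", xml_declaration=True)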
This code is well written, thank you for the post. I got it to work by putting my CSV in the same directory as the .py code.
I made a few edits to bring it up to Python 3.3:
import csv
#Input the file name."JoeDupes3_forearth"
fname = input("Enter file name WITHOUT extension: ")
data = csv.reader(open(fname + '.csv'), delimiter = ',')
#Skip the 1st header row.
#data.next()
#Open the file to be written.
f = open('csv2kml.kml', 'w')
#Writing the kml file.
f.write("<?xml version='1.0' encoding='UTF-8'?>\n")
f.write("<kml xmlns='http://earth.google.com/kml/2.1'>\n")
f.write("<Document>\n")
f.write(" <name>" + fname + '.kml' + "</name>\n")
for row in data:
    f.write(" <Placemark>\n")
    f.write(" <name>" + str(row[1]) + "</name>\n")
    f.write(" <description>" + str(row[3]) + "</description>\n")
    f.write(" <Point>\n")
    f.write(" <coordinates>" + str(row[10]) + "," + str(row[11]) + "," + str() + "</coordinates>\n")
    f.write(" </Point>\n")
    f.write(" </Placemark>\n")
f.write("</Document>\n")
f.write("</kml>\n")
print ("File Created. ")
print ("Press ENTER to exit. ")
input()
f.close()
Hope it helps if you are trying to convert your data.
One answer mentions etree; one advantage is that you do not have to hard-code the XML format.
Below is one of my examples. Of course you have to adjust it to your case, but you may get the general idea of how etree works.
To get something like this:
<OGRVRTDataSource>
    <OGRVRTLayer name="GW1AM2_201301010834_032D_L1SGRTBR_1110110_channel89H">
        <SrcDataSource>G:\AMSR\GW1AM2_201301010834_032D_L1SGRTBR_1110110_channel89H.csv</SrcDataSource>
        <GeometryType>wkbPoint</GeometryType>
        <GeometryField encoding="PointFromColumns" x="lon" y="lat" z="brightness" />
    </OGRVRTLayer>
</OGRVRTDataSource>
you can use this code:
import xml.etree.cElementTree as ET
[....]
root = ET.Element("OGRVRTDataSource")
OGRVRTLayer = ET.SubElement(root, "OGRVRTLayer")
OGRVRTLayer.set("name", AMSRcsv_shortname)
SrcDataSource = ET.SubElement(OGRVRTLayer, "SrcDataSource")
SrcDataSource.text = AMSRcsv
GeometryType = ET.SubElement(OGRVRTLayer, "GeometryType")
GeometryType.text = "wkbPoint"
GeometryField = ET.SubElement(OGRVRTLayer,"GeometryField")
GeometryField.set("encoding", "PointFromColumns")
GeometryField.set("x", "lon")
GeometryField.set("y", "lat")
GeometryField.set("z", "brightness")
tree = ET.ElementTree(root)
tree.write(AMSRcsv_vrt)
also some more info here
The simplekml package works very well, and makes easy work of such things.
To install on Ubuntu, download the latest version and run the following from the directory containing the archive contents.
sudo python setup.py install
There are also some tutorials to get you started.
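A small sketch of the kind of thing it enables (illustrative only; the point name and coordinates are made up):

import simplekml

# One Placemark per CSV row; simplekml takes care of the KML markup.
kml = simplekml.Kml()
kml.newpoint(name="Station 1", description="example row",
             coords=[(77.5946, 12.9716)])  # (lon, lat)
kml.save("csv2kml.kml")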
Just use the simplekml library to create the KML easily, instead of writing out the KML data by hand; I achieved it directly with simplekml. Read the simplekml documentation for details.

import csv
import simplekml
import kml2geojson.main
with open(arguments + '.csv', 'r') as f:
    datam = [(str(line['GPSPosLongitude']), str(line['GPSPosLatitude'])) for line in csv.DictReader(f)]
kml = simplekml.Kml()
linestring = kml.newlinestring(name='linename')
linestring.coords = datam
linestring.altitudemode = simplekml.AltitudeMode.relativetoground
linestring.style.linestyle.color = simplekml.Color.lime
linestring.style.linestyle.width = 2
linestring.extrude = 1
kml.save('file.kml')
kml.savekmz('file.kmz', format=False)
kml2geojson.main.convert('file.kml', '')