I'm having a problem with Python's csv reader. I want to open and read several different csv files, but it keeps reading the same one.
from csv import reader

alphabet = ["a", "b", "c"]
for letter in alphabet:
    csv_file = open('/home/desktop/csv/' + letter + '.csv', 'r')
    csv_data = reader(csv_file)
It seems to open the other files, but it keeps reading from the first one.
Is there a way to "reset" the reader or make it read another file? I even tried closing the csv file, but that didn't help.
The full code is this:
from csv import reader

# Orario
orario_csv_file = '/home/andrea/Scrivania/orario.csv'
orario_csv = open(orario_csv_file)
orario_data = reader(orario_csv)
orario = []

# Corsi
corsi = ["EDILIZIA", "EDILE-ARCHIT", "ELETTRONICA", "TECNOLOGIE_DI_INTERNET", "INFORMATICA", "GESTIONALE", "ENERGETICA", "MECCANICA", "CIVILE_ED_AMBIENTALE", "MEDICA", "ENGINEERING_SCIENCES"]
giorni = ["Lun", "Mar", "Mer", "Gio", "Ven"]

for row in orario_data:
    orario.append(row)

for corso in corsi:
    nome_corso_file = '/home/andrea/Scrivania/xml/' + corso + '.xml'
    nome_corso_xml = open(nome_corso_file, 'wt')
    nome_corso_xml.write('<?xml version="1.0"?>' + "\n")
    nome_corso_xml.write('<orario>' + "\n")
    nome_csv = corso + '_csv'
    nome_csv = '/home/andrea/Scrivania/csv/' + corso + '.csv'
    nome_corso_csv = open(nome_csv, 'rt')
    corso_data = reader(nome_corso_csv)
    nome_corso_xml.write('    <corso name="' + corso + '">' + "\n")
    for a in range(0, 3):
        nome_corso_xml.write('        <anno num="' + str(a + 1) + '">' + "\n")
        for j in range(1, 6):
            nome_corso_xml.write('            <giorno name="' + orario[2][j] + '">' + "\n")
            for i in range(3, 12):
                lez = orario[i + a*12][j]
                if lez == "":
                    nome_corso_xml.write('                <lezione>' + "-" + '</lezione>' + "\n")
                else:
                    for riga in corso_data:
                        if riga[0] == lez:
                            if riga[2] == "":
                                nome_corso_xml.write('                <lezione name="' + lez + '">' + riga[1] + '</lezione>' + "\n")
                            else:
                                for g in range(0, len(riga)):
                                    if riga[g].lower() == orario[2][j].lower():
                                        nome_corso_xml.write('                <lezione name="' + lez + '">' + riga[g+1] + '</lezione>' + "\n")
                    nome_corso_csv.seek(0)
            nome_corso_xml.write('            </giorno>' + "\n")
        nome_corso_xml.write('        </anno>' + "\n")
    nome_corso_xml.write('    </corso>' + "\n")
    nome_corso_xml.write('</orario>' + "\n")
    nome_corso_xml.close()
It opens "EDILIZIA.csv" and writes "EDILIZIA.xml"; then it should open "EDILE-ARCHIT.csv" and write its xml, but when it reads, it keeps reading from "EDILIZIA.csv".
Here are the .csv files you need:
http://pastebin.com/kJhL8HpK
If you make it read EDILIZIA.csv first and then EDILE-ARCHIT.csv, it keeps using EDILIZIA.csv to build the xml, when it should first open EDILIZIA.csv and write EDILIZIA.xml, then read EDILE-ARCHIT.csv and write EDILE-ARCHIT.xml.
If you take a look at the final xmls, you'll see that EDILE-ARCHIT.xml only contains the subjects common to EDILIZIA.csv and EDILE-ARCHIT.csv.
It took a long time to figure out what you are doing here. To tell the truth, your code is a mess: there are many unused variables and lines that make no sense at all. Anyway, your code reads the appropriate csv file each time, so the error is not where you thought it was.
If I am right, orario.csv contains the timetable of each course (stored in the corsi list) for three semesters or years, and the corso .csv files contain the rooms where subjects are held. So you want to merge the information into an XML file.
You only forgot one thing: to advance through orario.csv. Your code merges the very first three anno blocks with every corso. To fix it, you have to make two changes.
First in this for loop header:
for corso in corsi:
Modify to:
for num, corso in enumerate(corsi):
And when you assign lez:
lez = orario[i + a*12][j]
Modify to:
lez = orario[i + a*12*(num+1)][j]
Now it should work.
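As an aside, the seek(0) in the posted code works because csv.reader is just a thin iterator over the file object it wraps: once the file is exhausted the reader yields nothing more, and rewinding the file rewinds the reader. A minimal sketch:

from csv import reader

with open('/home/andrea/Scrivania/csv/EDILIZIA.csv') as f:
    data = reader(f)
    first_pass = [row for row in data]    # reads every row
    second_pass = [row for row in data]   # empty list: the file is exhausted
    f.seek(0)                             # rewind the underlying file...
    third_pass = [row for row in data]    # ...and the same reader works again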
This code produces exactly the same result, but it uses Python's XML module to build the output file:
from csv import reader
import xml.etree.cElementTree as ET
import xml.dom.minidom as DOM

corsi = ["EDILIZIA", "EDILE-ARCHIT", "ELETTRONICA", "TECNOLOGIE_DI_INTERNET", "INFORMATICA", "GESTIONALE", "ENERGETICA", "MECCANICA", "CIVILE_ED_AMBIENTALE", "MEDICA", "ENGINEERING_SCIENCES"]

with open('orario.csv', 'r') as orario_csv:
    orario = reader(orario_csv)
    orario_data = [row for row in orario]

for num, corso in enumerate(corsi):
    with open(corso + '.csv', 'r') as corso_csv:
        corso_raw = reader(corso_csv)
        corso_data = [row for row in corso_raw]
    root_elem = ET.Element('orario')
    corso_elem = ET.SubElement(root_elem, 'corso')
    corso_elem.set('name', corso)
    for anno in range(0, 3):
        anno_elem = ET.SubElement(corso_elem, 'anno')
        anno_elem.set('num', str(anno + 1))
        for giorno in range(1, 6):
            giorno_elem = ET.SubElement(anno_elem, 'giorno')
            giorno_elem.set('name', orario_data[2][giorno])
            for lezione in range(3, 12):
                lez = orario_data[lezione + anno * 12 * (num + 1)][giorno]
                if lez == '':
                    lezione_elem = ET.SubElement(giorno_elem, 'lezione')
                    lezione_elem.text = '-'
                else:
                    for riga in corso_data:
                        if riga[0] == lez:
                            if riga[2] == '':
                                lezione_elem = ET.SubElement(giorno_elem, 'lezione')
                                lezione_elem.set('name', lez)
                                lezione_elem.text = riga[1]
                            else:
                                for g in range(0, len(riga)):
                                    if riga[g].lower() == orario_data[2][giorno].lower():
                                        lezione_elem = ET.SubElement(giorno_elem, 'lezione')
                                        lezione_elem.set('name', lez)
                                        lezione_elem.text = riga[g + 1]
    with open(corso + '_new.xml', 'w') as corso_xml:
        xml_data = DOM.parseString(ET.tostring(root_elem, method='xml')).toprettyxml(indent='    ')
        corso_xml.write(xml_data)
Cheers.
I think I may have spotted the cause of your problem.
The second item in your corsi list ends with a full stop. This means you will be looking for the file "EDILE-ARCHIT..csv", which almost certainly does not exist. When you try to open the file, the open() call will throw an exception, and your program will terminate.
Try removing the trailing full stop and running it again.
I'm writing a program that turns music albums into files you can search for, and for that I need a string in the file to hold a specific value that is created after the list is complete. Can you go back into that list and replace a blank string with a new value?
I have searched online and found something called str.replace, but it doesn't work; I get an AttributeError.
def create_album():
    global idnumber, current_information
    file_information = []
    if current_information[0] != 'N/A':
        save()
    file_information.append(idnumber)
    idnumber += 1
    print('Type c at any point to abort creation')
    for i in creation_list:
        value = input('\t' + i)
        if value.upper == 'C':
            menu()
        else:
            file_information.append('')   # -1
            file_information.append(value)
    file_information.append('Album created - ' + file_information[2] + '\nSongs:')
    file_information = [w.replace(file_information[1], str(file_information[0]) + '-' + file_information[2]) for w in file_information]   # -2
    current_information = file_information
    save_name = open(save_path + str(file_information[0]) + '-' + str(file_information[2]) + '.txt', 'w')
    for i in file_information:
        save_name.write(str(i) + '\n')
    current_files_ = open(information_file + 'files.txt', 'w')
    filenames.append(file_information[0])
    for i in filenames:
        current_files_.write(str(i) + '\n')
    id_file = open(information_file + 'albumid.txt', 'w')
    id_file.write(str(idnumber))
-1 is where I have put aside a blank row.
-2 is where I try to replace row 1 in the list with the value of row 0 and row 2.
The error message I receive is 'int' object has no attribute 'replace'.
Did you try this?
file_information = [w.replace(str(file_information[1]), str(file_information[0]) + '-' + file_information[2]) for w in file_information]   # -2
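Note that the 'int' object has no attribute 'replace' message comes from the comprehension calling .replace on every element, and the first element is idnumber, an int. If the goal is just to fill in the blank row put aside at -1, assigning by index avoids .replace entirely; a minimal sketch with made-up stand-in values:

# made-up stand-ins: an id number (int), the blank row, and an album title
file_information = [7, '', 'MyAlbum']

# the blank row sits at a known index, so it can be filled in directly,
# without calling .replace on the int at index 0
file_information[1] = str(file_information[0]) + '-' + file_information[2]
print(file_information)   # [7, '7-MyAlbum', 'MyAlbum']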
I am trying to use python to automate the analysis of hundreds of excel files. Currently I can open, write to, and save files, but I need to insert calculations into cells that contain keywords. I am using python 2.7; here is a snippet from my code where I am struggling:
def run_analysis():
    excel = Dispatch('Excel.Application')
    excel.DisplayAlerts = False
    keyword = "Total Traffic"
    x = 0
    t = data_file_location_list
    while x < len(t):
        #for files in data_file_location_list:
        wb = excel.Workbooks.Open(t[x].root_dir + "\\" + t[x].file_name)
        ws = wb.Sheets('Bandwidth Over Time')
        keyword_range = #here is where I am stuck
        ws.Range(keyword_range).Value = 'write something'
        wb.SaveAs(Filename=str(t[x].root_dir + "\\" + t[x].file_name))
        wb.Close()
        excel.Quit()
        print x
        x += 1
Just in case anyone was following this, I figured it out and thought I would share the answer. The FindCell function is what I came up with, although there is probably a much more elegant way to do this.
def run_analysis():
    excel = Dispatch('Excel.Application')
    excel.DisplayAlerts = False
    x = 0
    t = data_file_location_list
    while x < len(t):
        # MATCHES ROW AND COL INPUT FOR CELL ADDRESS OUTPUT
        def FindCell(descriptor, banner):
            return (ws.Cells(excel.WorksheetFunction.Match(descriptor, ws.Range("B:B"), 0),
                             excel.WorksheetFunction.Match(banner, ws.Range("A:A"), 0)).Address)
        try:
            print 'opening: ' + t[x].root_dir + "\\" + t[x].file_name
            wb = excel.Workbooks.Open(t[x].root_dir + "\\" + t[x].file_name)
            ws = wb.Sheets('Bandwidth Over Time')
            # find the cell below the cell containing "Total Traffic"
            total_traffic_calc_range = FindCell("Traffic", 'Bandwidth Over Time')
            total_traffic_calc_range_delimiter = int(total_traffic_calc_range.strip('$A$'))
            total_traffic_equation_cell = "C" + str(total_traffic_calc_range_delimiter)
            # add the equations for calculation
            ws.Range(total_traffic_equation_cell).Value = '=Sum(' + 'B' + str(
                total_traffic_calc_range_delimiter + 1) + ':B' + str(
                total_traffic_calc_range_delimiter - 2) + ')'
            wb.SaveAs(Filename=str(t[x].root_dir + "\\" + t[x].file_name))
        except Exception as e:
            print e
        finally:
            wb.Close()
            excel.Quit()
            print x
            x += 1
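As an aside, if the analysis does not strictly need Excel itself, the same keyword search can be done with the openpyxl library instead of COM automation. A rough sketch (the sheet name comes from the code above; the function name and the summed range bounds are placeholders):

from openpyxl import load_workbook

def add_sum_below_keyword(path, keyword='Total Traffic'):
    wb = load_workbook(path)
    ws = wb['Bandwidth Over Time']
    # scan column B for the keyword, much like FindCell's Match on "B:B"
    for row in ws.iter_rows(min_col=2, max_col=2):
        cell = row[0]
        if cell.value == keyword:
            # drop a SUM formula into column C of the same row
            # (the row bounds here are placeholders, not the real data range)
            ws.cell(row=cell.row, column=3,
                    value='=SUM(B%d:B%d)' % (cell.row + 1, cell.row + 10))
            break
    wb.save(path)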
My code works and increments the filename, but only for the first two files; after that it just adds new lines to the existing second file. Please help me fix the code so the increment continues past that.
text = 'some text'
file_path = '/path/to/file'
filename = 'textfile'
i = 1
txtfile = self.file_path + filename + str(i) + '.txt'
if not os.path.exists(txtfile):
    text_file = open(txtfile, "a")
    text_file.write(self.text)
    text_file.close()
elif os.path.exists(txtfile) and i >= 1:
    i += 1
    text_file1 = open(self.file_path + filename + str(i) + '.txt', "a")
    text_file1.write(self.text)
    text_file1.close()
If your example is part of a loop, you're resetting i to 1 in every iteration. Put the i = 1 outside of this part.
It will also start at 1 when you restart your program, which is sometimes not what you want.
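A minimal sketch of that idea (next_free_path is a made-up helper; the path and filename are the ones from the question, concatenated the same way as the original):

import os

def next_free_path(file_path, filename):
    # count up until a numbered filename that does not exist yet is found
    i = 1
    while os.path.exists(file_path + filename + str(i) + '.txt'):
        i += 1
    return file_path + filename + str(i) + '.txt'

# the counter restarts at 1 on every run, but existing files are skipped,
# so each call writes to a brand-new file
with open(next_free_path('/path/to/file', 'textfile'), 'a') as text_file:
    text_file.write('some text')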
I'm trying to write some code that outputs some text to a file. output is a string variable holding the name of the file to be written. However, whenever I look at the file, nothing has been written.
with open(output, 'w') as f:
    f.write("Negative numbers mean the empty space was moved to the left and positive numbers means it was moved to the right" + '\n')
    if A == True:
        the_h = node.h
    elif A == False:
        the_h = 0
    f.write("Start " + str(node.cargo) + " " + str(node.f) + " " + str(the_h) + " " + '\n')
    if flag == 0:
        flag = len(final_solution)
    for i in range(1, flag):
        node = final_solution[i]
        f.write(str(node.e_point - node.parent.e_point) + str(node.cargo) + " " + str(node.f) + '\n')
    f.close()
The program looks OK. Check whether output is set correctly; I set it to a dummy filename and it worked, assuming the code inside the with block has no interpreter errors. The output file should be in the same directory as the source.
output = "aa.txt"
with open(output, 'w') as f:
f.write("Negative numbers mean the empty space was moved to the left and positive numbers means it was moved to the right" + '\n')
if A == True:
the_h = node.h
elif A== False:
the_h = 0
f.write("Start " + str(node.cargo) + " " + str(node.f) +" " +str(the_h)+" " + '\n')
if flag == 0:
flag = len(final_solution)
for i in range (1,flag):
node = final_solution[i]
f.write(str(node.e_point - node.parent.e_point) + str(node.cargo) + " " + str(node.f) +'\n')
f.close()
You should not add f.close(); the with statement will do it for you. Also ensure you don't reopen the file elsewhere with open(output, 'w'), as that will erase the file.
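To see that last point in isolation (a tiny sketch, reusing the dummy filename from above):

output = "aa.txt"

# mode 'w' truncates the file on every open; mode 'a' preserves its content
with open(output, 'w') as f:
    f.write('first line\n')
with open(output, 'w') as f:    # reopening with 'w' erased the first line
    f.write('second line\n')
with open(output, 'a') as f:    # 'a' appends instead of erasing
    f.write('third line\n')
# aa.txt now contains: second line, third line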
I'm getting json data from the facebook-graph-api about:
my relationship with my friends
my friends' relationships with each other.
Right now my program looks like this (in python pseudo code; please note some variables have been changed for privacy):
import json
import requests

# protected
_accessCode = "someAccessToken"
_accessStr = "?access_token=" + _accessCode
_myID = "myIDNumber"

r = requests.get("https://graph.facebook.com/" + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)
terminate = len(raw["data"])

# list used to store the friend/friend relationships
a = list()

for j in range(0, terminate + 1):
    # calculate terminating displacement:
    term_displacement = terminate - (j + 1)
    print("Currently processing: " + str(j) + " of " + str(terminate))
    for dj in range(1, term_displacement + 1):
        # construct urls based on the raw data:
        url = ("https://graph.facebook.com/" + raw["data"][j]["id"]
               + "/friends/" + raw["data"][j + dj]["id"] + "/" + _accessStr)
        # visit site *THIS IS THE BOTTLENECK*:
        reqTemp = requests.get(url)
        rawTemp = json.loads(reqTemp.text)
        if len(rawTemp["data"]) != 0:
            # data dumps to list which dumps to file
            a.append(str(raw["data"][j]["id"]) + "," + str(rawTemp["data"][0]["id"]))

outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")

# write all me/friend relationships to file
for k in range(0, terminate):
    output.write(_myID + "," + raw["data"][k]["id"] + "\n")

# write all friend/friend relationships to file
for i in range(0, len(a)):
    output.write(a[i])

output.close()
So what it's doing is: first it calls my page and gets my friend list (this is allowed through the facebook api using an access_token). Calling a friend's friend list is NOT allowed, but I can work around that by requesting the relationship between a friend on my list and another friend on my list. So in part two (indicated by the double for loop) I'm making another request to see if some friend, a, is also a friend of b (both of which are on my list); if so, there will be a json object of length one with friend a's name.
But with about 357 friends there are literally thousands of page requests that need to be made; in other words, the program spends a lot of time just waiting around for the json requests.
My question is: can this be rewritten to be more efficient? Currently, due to security restrictions, calling a friend's friend list attribute is disallowed, and it doesn't look like the api will allow it. Are there any python tricks that can make this run faster? Maybe parallelism?
Update: the modified code is pasted below in the answers section.
Update: this is the solution I came up with. Thanks @DMCS for the FQL suggestion, but I just decided to use what I had. I will post the FQL solution when I get a chance to study the implementation. As you can see, this method just makes use of more condensed API calls.
Incidentally, for future reference, the API call limit is 600 calls per 600 seconds, per token and per IP, so for every unique IP address with a unique access token, the number of calls is limited to 1 call per second. I'm not sure what that means for asynchronous calling @Gerrat, but there is that.
import json
import requests

# protected
_accessCode = "someaccesscode"
_accessStr = "?access_token=" + _accessCode
_myID = "someidnumber"

r = requests.get("https://graph.facebook.com/"
                 + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)
terminate = len(raw["data"])

a = list()
for k in range(0, terminate - 1):
    friendID = raw["data"][k]["id"]
    friendName = raw["data"][k]["name"]
    url = ("https://graph.facebook.com/me/mutualfriends/"
           + friendID + _accessStr)
    req = requests.get(url)
    temp = json.loads(req.text)
    print("Processing: " + str(k + 1) + " of " + str(terminate))
    for j in range(0, len(temp["data"])):
        a.append(friendID + "," + temp["data"][j]["id"] + ","
                 + friendName + "," + temp["data"][j]["name"])

# dump contents to file:
outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")
print("Dumping to file...")

# write all me/friend relationships to file
for k in range(0, terminate):
    output.write(_myID + "," + raw["data"][k]["id"]
                 + ",me," + str(raw["data"][k]["name"].encode("utf-8", "ignore")) + "\n")

# write all friend/friend relationships to file
for i in range(0, len(a)):
    output.write(str(a[i].encode("utf-8", "ignore")) + "\n")

output.close()
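Given the 1-call-per-second limit mentioned above, requests can also be spaced out client-side; a minimal throttling sketch (throttled_get is a hypothetical helper, not part of the solution above):

import time
import requests

_last_call = [0.0]

def throttled_get(url, min_interval=1.0):
    # sleep just long enough to make at most one call per second,
    # matching the 600-calls-per-600-seconds limit
    wait = _last_call[0] + min_interval - time.time()
    if wait > 0:
        time.sleep(wait)
    _last_call[0] = time.time()
    return requests.get(url)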
This isn't likely optimal, but I tweaked your code a bit to use the requests async module (untested):
import json
import requests
from requests import async

# protected
_accessCode = "someAccessToken"
_accessStr = "?access_token=" + _accessCode
_myID = "myIDNumber"

r = requests.get("https://graph.facebook.com/" + _myID + "/friends/" + _accessStr)
raw = json.loads(r.text)
terminate = len(raw["data"])

# list used to store the friend/friend relationships
a = list()

def add_to_list(reqTemp):
    rawTemp = json.loads(reqTemp.text)
    if len(rawTemp["data"]) != 0:
        # data dumps to list which dumps to file
        a.append(str(raw["data"][j]["id"]) + "," + str(rawTemp["data"][0]["id"]))

async_list = []
for j in range(0, terminate + 1):
    # calculate terminating displacement:
    term_displacement = terminate - (j + 1)
    print("Currently processing: " + str(j) + " of " + str(terminate))
    for dj in range(1, term_displacement + 1):
        # construct urls based on the raw data:
        url = ("https://graph.facebook.com/" + raw["data"][j]["id"]
               + "/friends/" + raw["data"][j + dj]["id"] + "/" + _accessStr)
        req = async.get(url, hooks={'response': add_to_list})
        async_list.append(req)

# gather up all the results
async.map(async_list)

outputFile = "C:/Users/franklin/Documents/gen/friendsRaw.csv"
output = open(outputFile, "w")