How to add another loop to a Python nested loop? - python
I am new to Python and am having trouble adding a loop to some nested-loop code.
I am using Python 3.8 on a Windows 7 machine.
What the code does in a single run: it reads from multiple CSV files, row by row and file by file, and uses the data from each row (within a given range) to run a function, until there are no CSV files left. Each CSV file has 4 columns and one header row.
There is a delay of a few seconds between reading each row.
Because the code was written for one-time use, running it again reads the same rows; it does not move on to other rows.
So I want to add another loop, so that each time the file is run it somehow remembers the last row that was used and starts from the next row.
For example, assume it has been set to a range of 2 rows:
first run: uses rows 1 and 2 to run the function
second run: uses rows 3 and 4 to run the function, and so on
I would appreciate your help in making this work.
Example CSV
img_url,desc_1 title_1,link_1
site.com/image22.jpg;someTitle;description1;site1.com
site.com/image32.jpg;someTitle;description2;site2.com
site.com/image44.jpg;someTitle;description3;site3.com
Here is the working code I have:
import time

from abc.zzz import xyz  # placeholder for the real module that provides func1

path_id_map = [
    {'path': 'file1.csv', 'id': '12345678'},
    {'path': 'file2.csv', 'id': '44556677'},
    {'path': 'file3.csv', 'id': '33377799'},
    {'path': 'file4.csv', 'id': '66221144'}]
s_id = None
for pair in path_id_map:
    with open(pair['path'], 'r') as f:
        next(f)  # skip first header line
        for _ in range(1, 3):
            line = next(f)
            img_url, title_1, desc_1, link_1 = map(str.strip, line.split(';'))
            zzz.func1(img_url=img_url, title_1=title_1, desc_1=desc_1,
                      link_1=link_1, B_id=pair['id'], s_id=s_id)
            time.sleep(25)
**** Update ****
After a few days of looking for a solution, some code was posted (UPDATE 2 below),
but there is a major problem with it.
It works the way I want only when using the print function.
I adapted my function to it, but when it runs a second time or more, it does not advance to the next rows (it only advances correctly on the last CSV file).
The author of the code could not correct it, and I cannot figure out what is wrong with it.
I checked the CSV files and tested them with the print function; they are OK.
Perhaps someone can help correct the problem, or suggest another solution altogether.
Hi, I hope I have understood what you're asking. I think the code below might guide you if you adjust it a little for your case. You can store the number of the last line read in a text file. I also assume that the semicolon is used as the delimiter.
UPDATE 1:
Okay, I think I have come up with a solution to your problem. The only prerequisite is a text file that contains the row number you want to begin with on the first run (e.g. 1).
# imports
import csv
import time
import subprocess
import os
import itertools

# txt file that contains the number of line to start the next time
dir_txt = './'
fname_txt = 'number_of_last_line.txt'
path = os.path.join(dir_txt, fname_txt)

# assign line number to variable after reading relevant txt
with open(path, 'r', newline='') as f:
    n = int(f.read())

# define path of csv file
fpath = './file1.csv'

# open csv file
with open(fpath, 'r', newline='') as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=';')
    # Iterate every row of csv. csv_reader row number starts from 1,
    # csv_reader generator starts from 0
    for row in itertools.islice(csv_reader, n, n+3):
        print('row {0} contains {1}'.format(csv_reader.line_num, row))
        time.sleep(3)

# Store the number of line to start the next time
n = csv_reader.line_num
# Bash (or cmd) command execution, optional. You can do this with python also
sh_command = 'echo {0} > {1}'.format(n, path)
subprocess.run(sh_command, shell=True)
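As the comment in the code says, the counter file can also be written from Python instead of through a shell command. A minimal sketch of that idea, assuming the same one-number text file; the helper names `read_start_row` and `save_start_row` are made up for illustration and are not part of the original answer:

```python
import os


def read_start_row(path, default=1):
    """Return the row index stored in the counter file, or `default` if it is missing."""
    if not os.path.exists(path):
        return default
    with open(path, 'r') as f:
        return int(f.read().strip())


def save_start_row(path, row_number):
    """Overwrite the counter file with the row index to start from next time."""
    with open(path, 'w') as f:
        f.write(str(row_number))
```

With these, `n = read_start_row(path)` could stand in for the first read, and `save_start_row(path, csv_reader.line_num)` for the `subprocess.run` call. This also avoids depending on how `echo` behaves in a particular shell.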
UPDATE 2:
Here's a revision that works for multiple files, using the input of @Error - Syntactical Remorse. The first thing you need to do is open the metadata.json file and enter the row number at which to begin each file, for the first run only. You also need to change the file paths to match your setup.
# Imports
import csv, json
import time
import os
import itertools


# define function
def get_json_metadata(json_fpath):
    """Read json file
    Args:
        json_fpath -- string (filepath)
    Returns:
        json_list -- list"""
    with open(json_fpath, mode='r') as json_file:
        json_str = json_file.read()
        json_list = json.loads(json_str)
    return json_list


# json file that contains the number of line to start the next time
dir_json = './'
fname_json = 'metadata.json'
json_fpath = os.path.join(dir_json, fname_json)
# csv filenames, IDs and number of row to start reading are extracted
path_id_map = get_json_metadata(json_fpath)
# iterate over csvfiles
for nfile in path_id_map:
    print('\n------ Reading {} ------\n'.format(nfile['path']))
    with open(nfile['path'], 'r', newline='') as csvfile:
        csv_reader = csv.reader(csvfile, delimiter=';')
        # Iterate every row of csv. csv_reader row number starts from 1,
        # csv_reader generator starts from 0
        for row in itertools.islice(csv_reader, nfile['nrow'], nfile['nrow']+5):
            # skip empty line (list)
            if not row:
                continue
            # assign values to variables
            img_url, title_1, desc_1, link_1 = row
            B_id = nfile['id']
            print('row {0} contains {1}'.format(csv_reader.line_num, row))
            time.sleep(3)
        # Store the number of line to start the next time. This must stay
        # inside the loop over files so that every file's position is updated.
        nfile['nrow'] = csv_reader.line_num
with open(json_fpath, mode='w') as json_file:
    json_str = json.dumps(path_id_map, indent=4)
    json_file.write(json_str)
This is how the metadata.json format should be:
[
    {
        "path": "file1.csv",
        "id": "12345678",
        "nrow": 1
    },
    {
        "path": "file2.csv",
        "id": "44556677",
        "nrow": 1
    },
    {
        "path": "file3.csv",
        "id": "33377799",
        "nrow": 1
    },
    {
        "path": "file4.csv",
        "id": "66221144",
        "nrow": 1
    }
]
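One way to plug a real function into this scheme is to wrap the loop in a function that takes a callback. The following is only a sketch under assumptions: `process_next_rows` and `handle_row` are hypothetical names, the JSON layout is the one shown above, and the position is tracked as a row index (so `"nrow": 1` skips the header) rather than through `line_num`. Progress is written back after each file, so every file advances on the next run.

```python
import csv
import itertools
import json


def process_next_rows(json_fpath, handle_row, rows_per_run=2):
    """Process the next `rows_per_run` rows of every CSV listed in the JSON file.

    `handle_row(row, file_id)` is a hypothetical callback standing in for
    the real function that consumes a row.
    """
    with open(json_fpath, 'r') as json_file:
        path_id_map = json.load(json_file)

    for nfile in path_id_map:
        start = nfile['nrow']
        with open(nfile['path'], 'r', newline='') as csvfile:
            csv_reader = csv.reader(csvfile, delimiter=';')
            rows = itertools.islice(csv_reader, start, start + rows_per_run)
            for offset, row in enumerate(rows):
                if row:  # skip empty lines
                    handle_row(row, nfile['id'])
                # remember the index of the next unread row
                nfile['nrow'] = start + offset + 1
        # save progress after every file, so an interruption does not lose it
        with open(json_fpath, 'w') as json_file:
            json.dump(path_id_map, json_file, indent=4)
```

A call such as `process_next_rows('metadata.json', lambda row, file_id: print(file_id, row))` would print two rows per file on the first run and the following two on the next run. When a file has no rows left, its `nrow` simply stays where it is.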
Related
How to read a csv file and create a new csv file after every nth number of rows?
I'm trying to write a function that reads a sheet of an existing .csv file and copies every 20 rows to a newly created csv file. Therefore, it needs to be designed like a file counter "file_01, file_02, file_03, ...", where the first 20 rows are copied to file_01, the next 20 to file_02.csv, and so on.
Currently I have this code, which hasn't worked for me so far.
import csv
import os.path
from itertools import islice

N = 20
new_filename = ""
filename = ""
with open(filename, "rb") as file:  # the a opens it in append mode
    reader = csv.reader(file)
    for i in range(N):
        line = next(file).strip()
        #print(line)
with open(new_filename, 'wb') as outfh:
    writer = csv.writer(outfh)
    writer.writerow(line)
    writer.writerows(islice(reader, 2))
I have attached a file for testing. https://1drv.ms/u/s!AhdJmaLEPcR8htYqFooEoYUwDzdZbg
32.01,18.42,58.98,33.02,55.37,63.25,12.82,-32.42,33.99,179.53,
41.11,33.94,67.85,57.61,59.23,94.69,19.43,-19.15,21.71,-161.13,
49.80,54.12,72.78,100.74,56.97,128.84,26.95,-6.76,10.07,-142.62,
55.49,81.02,68.93,148.17,49.25,157.32,34.94,5.39,0.44,-123.32,
56.01,112.81,59.27,177.87,38.50,179.63,43.43,18.42,-5.81,-102.24,
50.79,142.87,48.06,-162.32,26.60,-161.21,52.38,34.37,-7.42,-79.64,
41.54,167.36,37.12,-145.93,15.01,-142.84,60.90,57.05,-4.47,-56.54,
30.28,-172.09,27.36,-130.24,5.11,-123.66,66.24,91.12,-0.76,-35.44,
18.64,-153.20,19.52,-114.09,-1.54,-102.96,64.77,131.32,5.12,-21.68,
7.92,-134.07,14.24,-96.93,-3.79,-80.91,57.10,162.35,12.51,-9.21,
-0.34,-113.74,11.80,-78.73,-2.49,-58.46,46.75,-175.86,20.81,2.87,
-4.81,-91.85,11.78,-60.28,0.59,-39.26,35.75,-158.12,29.79,15.71,
-4.76,-68.67,13.79,-43.84,6.82,-24.69,25.27,-141.56,39.05,30.71,
-1.33,-46.42,18.44,-30.23,14.53,-11.95,16.21,-124.45,47.91,50.25,
4.14,-29.61,24.89,-18.02,23.01,0.10,9.59,-106.05,54.46,77.07,
11.04,-15.39,32.33,-6.66,31.92,12.48,6.24,-86.34,55.72,110.53,
18.69,-2.32,40.46,4.57,41.11,26.87,6.07,-65.68,50.25,142.78,
26.94,10.56,49.18,16.67,49.92,45.39,8.06,-46.86,40.13,168.29,
35.80,24.58,58.45,31.99,56.83,70.92,12.96,-31.90,28.10,-171.07,
44.90,41.72,67.41,55.89,59.21,103.94,19.63,-18.67,15.97,-152.40,
-5.41,-77.62,11.40,-63.21,4.80,-29.06,31.33,-151.44,43.00,37.25,
-2.88,-54.38,13.08,-46.00,12.16,-15.86,21.21,-134.62,51.25,59.16,
1.69,-35.73,17.44,-32.01,20.37,-3.78,13.06,-117.10,56.18,88.98,
8.15,-20.80,23.70,-19.66,29.11,8.29,7.74,-98.22,54.91,123.30,
15.52,-7.45,31.04,-8.22,38.22,21.78,5.76,-77.99,47.34,153.31,
23.53,5.38,39.07,2.98,47.29,38.71,6.58,-57.45,36.18,176.74,
32.16,18.76,47.71,14.88,55.08,61.71,9.76,-40.52,23.99,-163.75,
41.27,34.36,56.93,29.53,59.23,92.75,15.53,-26.40,12.16,-145.27,
49.92,54.65,66.04,51.59,57.34,126.97,22.59,-13.65,2.14,-126.20,
55.50,81.56,72.21,90.19,49.88,155.84,30.32,-1.48,-4.71,-105.49,
55.92,113.45,70.26,139.40,39.23,178.48,38.55,10.92,-7.09,-83.11,
50.58,143.40,61.40,172.50,27.38,-162.27,47.25,24.86,-4.77,-60.15,
41.30,167.74,50.34,-166.33,15.74,-143.93,56.21,43.14,-0.54,-38.22,
30.03,-171.78,39.24,-149.48,5.71,-124.87,63.77,70.19,4.75,-24.15,
18.40,-152.91,29.17,-133.78,-1.18,-104.31,66.51,108.81,11.86,-11.51,
7.69,-133.71,20.84,-117.74,-3.72,-82.28,61.95,146.15,20.05,0.65,
-0.52,-113.33,14.97,-100.79,-2.58,-59.75,52.78,172.46,28.91,13.29,
-4.91,-91.36,11.92,-82.84,0.34,-40.12,41.93,-167.91,38.21,27.90,
These are some of the problems with your current solution:
- You created a csv.reader object but then did not use it.
- You read each line but did not store it anywhere.
- You are not keeping track of 20 rows, which was supposed to be your requirement.
- You created the output file in a separate with block, which no longer has access to the lines that were read or to the csv.reader object.
Here's a working solution:
import csv

inp_file = "input.csv"
out_file_pattern = "file_{:{fill}2}.csv"
max_rows = 20

with open(inp_file, "r") as inp_f:
    reader = csv.reader(inp_f)
    all_rows = []
    cur_file = 1
    for row in reader:
        all_rows.append(row)
        if len(all_rows) == max_rows:
            with open(out_file_pattern.format(cur_file, fill="0"), "w") as out_f:
                writer = csv.writer(out_f)
                writer.writerows(all_rows)
            all_rows = []
            cur_file += 1
The flow is as follows:
- Read each row of the CSV using a csv.reader.
- Store each row in an all_rows list.
- Once that list holds 20 rows, open a file and write all the rows to it, using the csv.writer's writerows method.
- Use a cur_file counter to format the filename.
- Every time 20 rows are dumped to a file, empty the list and increment the file counter.
This solution includes blank lines as part of the 20 rows. Your test file actually has 19 rows of CSV data and 1 blank line. If you need to skip the blank line, just add a simple check: if not row: continue
Also, as I mentioned in a comment, I assume that the input file is an actual CSV file, meaning a plain text file with CSV-formatted data. If the input is actually an Excel file, then solutions like this won't work, because you'll need special libraries to read Excel files, even if the contents visually look like CSV or you rename the file to .csv.
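Note that the loop above writes a file only once 20 rows have accumulated, so a final group of fewer than 20 rows is never written. A variant that also writes the remainder could look like the sketch below. It swaps the list-and-counter approach for `itertools.islice`; the function name `split_csv` and its return value are assumptions for illustration.

```python
import csv
from itertools import islice


def split_csv(inp_file, out_file_pattern="file_{:02d}.csv", max_rows=20):
    """Write every `max_rows` rows of `inp_file` to a new numbered file.

    Returns the list of file names written. The final file may hold
    fewer than `max_rows` rows.
    """
    written = []
    with open(inp_file, "r", newline="") as inp_f:
        reader = csv.reader(inp_f)
        cur_file = 1
        while True:
            chunk = list(islice(reader, max_rows))  # next block of rows
            if not chunk:
                break  # input exhausted
            out_name = out_file_pattern.format(cur_file)
            with open(out_name, "w", newline="") as out_f:
                csv.writer(out_f).writerows(chunk)
            written.append(out_name)
            cur_file += 1
    return written
```

Calling `split_csv("input.csv")` on a 45-row file would be expected to produce file_01.csv and file_02.csv with 20 rows each and file_03.csv with the remaining 5.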
Without using any special CSV libraries (e.g. csv; you could use one, but I don't know how to and don't think it is necessary in this case), you could:
with open(r"<file_name>", "r", encoding="utf-8") as excel_csv_fp:  # Check proper encoding for your file
    csv_data = excel_csv_fp.readlines()

file_counter = 0
new_fp = None
for line in csv_data:
    if line.strip() == "":  # readlines() keeps the newline, so strip before comparing
        # A blank line ends the current output file
        if new_fp is not None:
            new_fp.close()
            new_fp = None
        continue
    if new_fp is None:
        file_counter += 1
        new_file_name = "file_" + "{:02d}".format(file_counter)  # 1 turns into 01 and 10 remains 10
        new_fp = open("<some_path>/" + new_file_name + ".csv", "w", encoding="utf-8")  # Makes a new CSV file to start writing to
    new_fp.write(line)  # Write each line that follows a blank line
if new_fp is not None:
    new_fp.close()
If you have any questions about the code (how it works, why I chose what, etc.), just ask in the comments and I'll try to reply as soon as possible.
How do I add a column to a newly created CSV file?
I would like to create files in real time and add the values for each column to the corresponding existing CSV file as they are produced. How can I add to each of the CSV files that my program generates? Here is the code I'm using now.
import csv

for i in range(10):
    SD = "Save datas(Angle)" + str(i)  ## assign an array for each angle
    SDArray1 = str(SD)  ## create the file
    f = open(SDArray1 + ".csv", "a+t")  ## create the file with the generated name
    csv_writer = csv.writer(f)
    csv_writer.writerow([SD])
    print("One loop has started")
    f.close()
    for i in range(1, 5):
        cdata = [i]
        f = open(SDArray1 + ".csv", "a+t")
        csv_writer = csv.writer(f)
        csv_writer.writerow(cdata)
        print(cdata)
        f.close()
    print("loop's finished!")
If you look at the code above, a certain file is created. I completed the next file, but I was wondering how to add columns to the file.
csv.writer.writerow() takes a complete row of columns. If you need more, add them to your cdata=[i], e.g. cdata=[i,i*2,i*3,i*4]. You should use with open() as f: for file manipulation; it is more resilient against errors and automatically closes the file when leaving the with block.
Fixed:
import csv

# do not use i here and down below, that's confusing; better names are a plus
for fileCount in range(10):
    filename = "filename{}.csv".format(fileCount)  # creates filename0.csv ... filename9.csv
    with open(filename, "w", newline="") as f:  # create file new
        csv_writer = csv.writer(f)
        # write headers
        csv_writer.writerow(["data1", "data2", "data3"])
        # write 4 rows of data
        for i in range(1, 5):
            cdata = [(fileCount*100000 + i*1000 + k) for k in range(3)]  # create 3 datapoints
            # write one row of data [1000,1001,1002] up to [904000,904001,904002]
            # for last i and fileCount
            csv_writer.writerow(cdata)
    # no file.close - leaving the with open() scope autocloses
Check what we have written:
import os

for d in sorted(os.listdir("./")):
    if d.endswith("csv"):
        print(d, ":")
        print("*" * (len(d) + 2))
        with open(d, "r") as f:
            print(f.read())
        print("")
Output:
filename0.csv :
***************
data1,data2,data3
1000,1001,1002
2000,2001,2002
3000,3001,3002
4000,4001,4002

filename1.csv :
***************
data1,data2,data3
101000,101001,101002
102000,102001,102002
103000,103001,103002
104000,104001,104002

filename2.csv :
***************
data1,data2,data3
201000,201001,201002
[...snip the rest - you get the idea ...]

filename9.csv :
***************
data1,data2,data3
901000,901001,901002
902000,902001,902002
903000,903001,903002
904000,904001,904002
To add a new column to an existing file:
- open the old file to read
- open a new file to write
- read the old file's header, add the new column header and write it to the new file
- read all rows, add the new column's value to each row and write it to the new file
Example: adding the sum of the column values to the file and writing the result as a new file:
import csv

filename = "filename0.csv"
newfile = "filename0new.csv"
# open one file to read, open other (new one) to write
with open(filename, "r", newline="") as r, open(newfile, "w", newline="") as w:
    reader = csv.reader(r)
    writer = csv.writer(w)
    newHeader = next(reader)  # read the header
    newHeader.append("Sum")  # append new column-header
    writer.writerow(newHeader)  # write header
    # for each row:
    for row in reader:
        row.append(sum(map(int, row)))  # read it, sum the converted int values
        writer.writerow(row)  # write it
# output the newly created file:
with open(newfile, "r") as n:
    print(n.read())
Output:
data1,data2,data3,Sum
1000,1001,1002,3003
2000,2001,2002,6003
3000,3001,3002,9003
4000,4001,4002,12003
Compare rows of csv and work out percentage
I'm relatively new to Python. I'm trying to create a script that looks at a CSV file called "data_old" from a previous month, compares it with the data for a more recent month called "data_new", and finally outputs the result to a new CSV, "data_compare".
The files are laid out consistently each month and look like this (example):
Month 1
Company, StaffNumber, NeedToPass, Passed, %age meeting requirement
xxxxxxxx, 100, 80, 30, 30%
Month 3
Company, StaffNumber, NeedToPass, Passed, %meeting requirement
xxxxxxxx, 101, 81, 54, 60%
I'm trying to get the output file to compare the data from all rows and show me "Percentage improved" instead of "Percentage meeting requirement". Nothing I try seems to work. As the numbers change all the time, the only common data will be the company name.
I need a simple, explanatory way with comments, as I'd like to understand the logic so I can modify it and add functions. Much appreciated.
Here is a Python code example which might do what you want. This script assumes that the two input CSV files have the same number of lines. The function test uses zip, which stops as soon as one of the lists is exhausted. If your files have different numbers of lines, you have to loop over both manually. But I think it is a good starting point.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import csv


def parse_csv(filename, sort_row=0, as_dict=False, delimiter=","):
    r = list()
    with open(filename, "r", newline="") as f:
        # make csv reader object
        reader = csv.reader(f, delimiter=delimiter)
        # read the header line separately so it is not sorted with the data
        header = [h.strip() for h in next(reader)]
        for row in reader:
            if as_dict:
                # make dict if desired
                r.append(dict(zip(header, [h.strip() for h in row])))
            else:
                # strip each item in the row and append it to the return list
                r.append([h.strip() for h in row])
    # sort the rows by the chosen column (company name in this example)
    if as_dict:
        r.sort(key=lambda x: x[header[sort_row]])
    else:
        r.sort(key=lambda x: x[sort_row])
        # put the header back as the first row
        r.insert(0, header)
    return r


def write_csv(filename, fieldnames, rows, delimiter=","):
    with open(filename, "w", newline="") as f:
        # make csv writer object
        writer = csv.writer(f, delimiter=delimiter)
        # write the first header line
        writer.writerow(fieldnames)
        for row in rows:
            # write each row
            writer.writerow(row)


def test():
    data_old = parse_csv("m1.csv")
    data_new = parse_csv("m2.csv")
    #write_csv("data_compare.csv", data_old[:1][0], data_old[1:])
    result = list()
    # loop over the items (skipping the first header row)
    for o, n in zip(data_old[1:], data_new[1:]):
        # calculate the improvement (or whatever needs to be calculated)
        value = float(n[4].replace("%", "")) - float(o[4].replace("%", ""))
        # create the row
        result.append([o[0], "%s%%" % value, o[4], n[4]])
        #result.append(["%s%%" % value])
    header = ["Company", "Percentage improved", "old", "new"]
    #header = ["Company", "Percentage improved"]
    write_csv("data_compare.csv", header, result)


if __name__ == '__main__':
    test()
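For files whose rows do not line up, one option is to key each month on the company name, since the question says that is the only common data. The sketch below is a different technique from the zip-based pairing above. It assumes the percentage sits in the last column of each row; `load_percentages` and `compare` are illustrative names.

```python
import csv


def load_percentages(filename):
    """Map company name -> percentage (as a float) taken from the last column."""
    result = {}
    with open(filename, "r", newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header line
        for row in reader:
            if not row:
                continue  # skip blank lines
            company = row[0].strip()
            result[company] = float(row[-1].strip().rstrip("%"))
    return result


def compare(old_file, new_file, out_file):
    """Write the change in percentage for every company found in both files."""
    old = load_percentages(old_file)
    new = load_percentages(new_file)
    with open(out_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Company", "Percentage improved"])
        # only companies present in both months can be compared
        for company in sorted(old.keys() & new.keys()):
            writer.writerow([company, "%s%%" % (new[company] - old[company])])
```

With the sample rows from the question (30% in the old file, 60% in the new one), `compare("data_old.csv", "data_new.csv", "data_compare.csv")` would write a row ending in 30.0%. Companies that appear in only one month are left out rather than guessed at.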
Loop retrieve data from csv, append to file
I have created a Python 2.7 script that does the following:
1. Gets a list of filenames from a folder and writes them to a CSV file, one per row.
2. Enters data into a search box on the web.
3. Writes the result from the search box into another CSV file.
What I would like now is for the CSV data from (1) to act as the input for (2), i.e. for each filename in the CSV file, it conducts a search for that cell.
Additionally, instead of just writing the results into a second CSV file in (3), I would like to append the result to the first CSV file, or generate a new one with both columns.
I can provide the code, but since it's already 50 lines, I've tried to keep this question descriptive.
Update: Proposed retrieval and append:
with open("file.csv","a+") as f:
    r = csv.reader(f)
    wr = csv.writer(f, delimiter="\n")
    result = []
    for line in r:
        searchbox = driver.find_element_by_name("searchbox")
        searchbox.send_keys(line)
        sleep(8)
        search_reply = driver.find_element_by_class_name("search_reply")
        result = re.findall("((?<=\()[0-9]*)", search_reply.text)
    wr.writerow(result)
Open for reading and appending, store the output, then write at the end:
import csv

with open("first.csv","a+") as f:
    f.seek(0)  # "a+" may position the file at its end, so rewind before reading
    r = csv.reader(f)
    wr = csv.writer(f,delimiter="\n")
    result = []
    for line in r:
        # process lines/step 2
        # append to result
        pass
    wr.writerow(result)
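The other option mentioned in the question, generating a new file with both columns, avoids reading and writing the same file at once. A rough sketch, written for Python 3 rather than the 2.7 used in the question; `add_result_column` is an illustrative name and `lookup` is a hypothetical stand-in for the web search step:

```python
import csv


def add_result_column(in_path, out_path, lookup):
    """Copy each row of `in_path` to `out_path` with `lookup(first_cell)` appended.

    `lookup` is a hypothetical callable standing in for the search step.
    """
    with open(in_path, "r", newline="") as src, \
            open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            writer.writerow(row + [lookup(row[0])])
```

Each output row then holds the original filename followed by whatever `lookup` returned for it, so the first file is left untouched.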
Building list of lists from CSV file
I have an Excel file (that I am exporting as a csv) that I want to parse, but I am having trouble finding the best way to do it. The csv is a list of computers in my network, and which accounts are in the local administrator group for each one. I have done something similar with tuples, but the number of accounts for each computer ranges from 1 to 30. I want to build a list of lists, then go through each list to find the accounts that should be there (Administrator, etc.) and delete them, so that I can then export a list of only the accounts that shouldn't be a local admin, but are.
The csv file is formatted as follows:
"computer1" Administrator localadmin useraccount
"computer2" localadmin Administrator
"computer3" localadmin Administrator user2account
Any help would be appreciated.
EDIT: Here is the code I am working with:
import csv
import sys #used for passing in the argument

file_name = sys.argv[1] #filename is argument 1
with open(file_name, 'rU') as f: #opens PW file
    reader = csv.reader(f)
    data = list(list(rec) for rec in csv.reader(f, delimiter=',')) #reads csv into a list of lists
    f.close() #close the csv

for i in range(len(data)):
    print data[i][0] #this alone will print all the computer names
    for j in range(len(data[i])) #Trying to run another for loop to print the usernames
        print data[i][j]
The issue is with the second for loop. I want to be able to read across each line and, for now, just print them.
This should get you on the right track:
import csv
import sys #used for passing in the argument

file_name = sys.argv[1] #filename is argument 1
with open(file_name, 'rU') as f: #opens PW file
    reader = csv.reader(f)
    data = list(list(rec) for rec in csv.reader(f, delimiter=',')) #reads csv into a list of lists

for row in data:
    print row[0] #this alone will print all the computer names
    for username in row: #Trying to run another for loop to print the usernames
        print username
The last two lines will print all of the row (including the "computer"). Do
for x in range(1, len(row)):
    print row[x]
to avoid printing the computer twice.
Note that f.close() is not required when using the "with" construct, because the resource is closed automatically when the "with" block is exited.
Personally, I would just do:
import csv
import sys #used for passing in the argument

file_name = sys.argv[1] #filename is argument 1
with open(file_name, 'rU') as f: #opens PW file
    reader = csv.reader(f)
    # Print every value of every row.
    for row in reader:
        for value in row:
            print value
That's a reasonable way to iterate through the data and should give you a firm basis to add whatever further logic is required.
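Building on that iteration, the filtering step described in the question (dropping the accounts that are supposed to be there) could be sketched as follows. This is written for Python 3, unlike the Python 2 snippets above, and assumes a comma-delimited file with the computer name in the first column. The `EXPECTED` set and the name `unexpected_admins` are assumptions for illustration.

```python
import csv

# Accounts that are allowed to be local admins; an assumed list for illustration.
EXPECTED = {"Administrator", "localadmin"}


def unexpected_admins(file_name, expected=EXPECTED):
    """Return {computer: [accounts not in `expected`]} for computers that have any."""
    result = {}
    with open(file_name, "r", newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            computer, accounts = row[0], row[1:]
            extra = [a.strip() for a in accounts
                     if a.strip() and a.strip() not in expected]
            if extra:
                result[computer] = extra
    return result
```

For the three sample computers in the question, this would be expected to report useraccount on computer1 and user2account on computer3, and to leave computer2 out.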
This is how I opened a .csv file and imported columns of data as numpy arrays. Naturally, you don't need numpy arrays, but...
import sys
import numpy as np
# QApplication and QFileDialog come from the Qt bindings in use (import not shown in the original)

data = {}
app = QApplication( sys.argv )
fname = unicode ( QFileDialog.getOpenFileName() )
app.quit()
filename = fname.strip('.csv') + ' for release.csv'
#open the file and skip the first two rows of data
imported_array = np.loadtxt(fname, delimiter=',', skiprows = 2)
data = {'time_s':imported_array[:,0]}
data['Speed_RPM'] = imported_array[:,1]
It can be done using the pandas library.
import pandas as pd

df = pd.read_csv(filename)
list_of_lists = df.values.tolist()
This approach applies to other kinds of data like .tsv, etc.