duplicate values in new csv file - python
I'm working on a program and want to write my result into a comma separated file, like a CSV.
new_throughput =[]
self.t._interval = 2
self.f = open("output.%s.csv"%postfix, "w")
self.f.write("time, Byte_Count, Throughput \n")
cur_throughput = stat.byte_count
t_put.append(cur_throughput)
b_count = (cur_throughput/131072.0) #calculating bits
b_count_list.append(b_count)
L = [y-x for x,y in zip(b_count_list, b_count_list[1:])] #subtracting current value - previous, saves value into list
for i in L:
    new_throughput.append(i/self.t._interval)
self.f.write("%s,%s,%s,%s \n"%(self.experiment, b_count, b_count_list,new_throughput)) #write to file
When I run this code, I get this in my CSV file:
(screenshot of the CSV output)
It somehow prints out the previous value every time.
What I want is a new row for each new value:
time , byte_count, throughput
20181117013759,0.0,0.0
20181117013759,14.3157348633,7.157867431640625
20181117013759,53.5484619141,19.616363525390625
I don't have a working minimal example, but your last line should refer to the last member of each list, not the whole list. Something like this:
self.f.write("%s,%s,%s\n" % (self.experiment, b_count_list[-1], new_throughput[-1])) #write only the latest values, one per column
Edit: ...although if you want this simple solution to work, then you should initialize the lists with one initial value, e.g. [0], otherwise you'd get a "list index out of range error" at the first iteration according to your output.
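To make the idea concrete, here is a minimal sketch that writes one row per sample instead of dumping the whole lists. It is not the asker's actual class: the function name write_samples, the sample byte counts, and the use of csv.writer are assumptions for illustration; only the 131072.0 divisor and the 2-second interval come from the question.

```python
import csv
import io


def write_samples(out, timestamp, byte_counts, interval=2):
    """Write one CSV row per sample: time, byte_count, throughput.

    write_samples is a hypothetical helper; byte_counts stands in for the
    successive stat.byte_count readings from the question.
    """
    writer = csv.writer(out)
    writer.writerow(["time", "byte_count", "throughput"])
    previous = 0.0  # plays the role of initializing the list with [0]
    for raw in byte_counts:
        b_count = raw / 131072.0
        # Only the difference to the previous sample is needed per row.
        throughput = (b_count - previous) / interval
        writer.writerow([timestamp, b_count, throughput])
        previous = b_count


buf = io.StringIO()
write_samples(buf, "20181117013759", [0, 131072, 393216])
rows = buf.getvalue().splitlines()
```

Keeping only the previous value avoids both the growing lists and the index error on the first iteration.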
Related
Indexing error after removing line from 2D array
I am facing a 'list index out of range' error when trying to iterate a for-loop over a table I've created from a CSV extract, but cannot figure out why, even after trying many different methods. Here is the step-by-step description of how the error happens.
I'm removing the first line of an imported CSV file, as this line contains the columns' names but no data. The CSV has the following structure:
columnName1, columnName2, columnName3, columnName4
This, is, some, data
I, have, in, this
very, interesting, CSV, file
After storing the CSV in a first array called oldArray, I want to populate a newArray that will get all values from oldArray but not the first line, which is the column name line, as previously mentioned. My newArray should then look like this:
This, is, some, data
I, have, in, this
very, interesting, CSV, file
To create this newArray, I'm using the following code with the append() function:
tempList = []
newArray = []
for i in range(len(oldArray)):
    if i > 0: #my ugly way of skipping line 0...
        for j in range(len(oldArray[0])):
            tempList.append(oldArray[i][j])
        newArray.append(tempList)
        tempList = []
I also stored the columns in their own separate list:
i = 0
for i in range(len(oldArray[0])):
    my_columnList[i] = oldArray[0][i]
And the error comes up next: I now want to populate a treeview table from this newArray, using a for-loop and insert (in a function). But I always get the 'list index out of range' error and I cannot figure out why.
def populateTable(my_tree, newArray, my_columnList):
    i = 0
    for i in range(len(newArray)):
        my_tree.insert('','end', text=newArray[i][0], values = (newArray[i][1:len(newArray[0])])) #(im using the text option to bypass treeview's column 0 problem)
    return my_tree
Error message:
File "(...my working directory...)", line 301, in populateTable
    my_tree.insert(parent='', index='end', text=data[i][0], values=(data[i][1:len(data[0])]))
IndexError: list index out of range
Using that same function with different datasets and columns worked fine, but not for this newArray. I'm fairly certain that the error comes strictly from this newArray and is not linked to another parameter. I've tested the validity of the columns list and of the CSV import in oldArray through some print() functions, and everything seems normal: values, row dimension, column dimension. This is a great mystery to me... Thank you all very much for your help and time.
You can find the problem from your error message:
File "(...my working directory...)", line 301, in populateTable
    my_tree.insert(parent='', index='end', text=data[i][0], values=(data[i][1:len(data[0])]))
IndexError: list index out of range
It means an index is out of range in line 301, in data[i][0] or in data[0]: either i is past the end of data, or index 0 is past the end of data[i] (or of data itself).
My guess is that there is an empty list somewhere in data (maybe data[-1]?). If data[i] is [], then data[i][0] tries to access a first item which does not exist. The slice data[i][1:len(data[0])] cannot raise this error by itself, because a slice past the end simply returns a shorter list.
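A minimal sketch of how such an empty row can be filtered out before the table is populated. The helper name data_rows and the sample table are made up for illustration; the assumption is that rows shorter than the header (for example the empty list csv.reader yields for a blank line) should simply be skipped.

```python
def data_rows(table):
    """Return the rows after the header, skipping rows too short to index.

    data_rows is a hypothetical helper, not part of the question's code.
    """
    if not table:
        return []
    width = len(table[0])  # number of columns in the header row
    return [row for row in table[1:] if len(row) >= width]


# A trailing blank line in a CSV file shows up as an empty list.
old_array = [
    ["columnName1", "columnName2"],
    ["This", "is"],
    [],
    ["some", "data"],
]
new_array = data_rows(old_array)
```

With the empty row removed, new_array[i][0] is safe for every i in range(len(new_array)).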
There is no problem with your "ugly" way of skipping line 0, but I recommend having a look at this way:
new_array = old_array.copy()
new_array.remove(new_array[0])
As for the indexing: when you loop over range(len(oldArray[0])) and the length is 4, it is just like saying for i in range(4), which yields 0, 1, 2, 3. Those are exactly the valid indexes, so the loop bounds themselves are fine; subtracting 1 from the length would only skip the last element. The i = 0 before each loop is redundant, because the for statement assigns i itself.
What can fail in the column loop is the assignment my_columnList[i] = oldArray[0][i]: if my_columnList is shorter than the header row, assigning to an index that does not exist yet raises an IndexError. Copying the header avoids that:
my_columnList = list(oldArray[0])
In populateTable the same reasoning applies: range(len(newArray)) is correct, so the error has to come from a row of newArray that has no element at index 0.
Using an if statement to pass through variables to further functions for python
I am a biologist who is just trying to use Python to automate a ton of calculations, so I have very little experience. I have a very large array that contains values that are formatted into two columns of observations. Sometimes the observations will be the same between the columns:
v1,v2
x,y
a,b
a,a
x,x
In order to save time and effort I wanted to make an if statement that just prints 0 if the two columns are the same and then moves on. If the values are the same there is no need to run those instances through the downstream analyses. This is what I have so far just to test out the if statement. It has yet to recognize any instances where the columns are equivalent.
Script:
mylines=[]
with open('xxxx','r') as myfile:
    for myline in myfile:
        mylines.append(myline) ##reads the data into the two column format mentioned above
rang=len(open('xxxxx','r').readlines()) ##returns the number of lines in the file
for x in range(1, rang):
    li = mylines[x] ##selected row as defined by x and the number of lines in the file
    spit = li.split(',',2) ##splits the selected values so they can be accessed separately
    print(spit[0]) ##first value
    print(spit[1]) ##second value
    if spit[0] == spit[1]:
        print(0)
    else:
        print('Issue')
Output:
192Alhe52
192Alhe52
Issue ##should be 0
188Alhe48
192Alhe52
Issue
191Alhe51
192Alhe52
Issue
How do I get Python to recognize that certain observations are actually equal?
When you read the values and store them in the array, you are storing '\n' as well, which is the line break character, so your array actually looks like this:
print(mylines)
['x,y\n', 'a,b\n', 'a,a\n', 'x,x\n']
To work around this issue, use strip(), which removes this character and any stray blank spaces at the end of the string that would also affect the comparison:
mylines.append(myline.strip())
You shouldn't use rang=len(open('xxxxx','r').readlines()), because you are reading the file again. Use the list you already have:
rang=len(mylines)
There is a more readable, Pythonic way to write your for loop:
for li in mylines[1:]:
    spit = li.split(',')
    if spit[0] == spit[1]:
        print(0)
    else:
        print('Issue')
Or even:
for spit in (li.split(',') for li in mylines[1:]):
    if spit[0] == spit[1]:
        print(0)
    else:
        print('Issue')
Both loops iterate over the array mylines starting from the second element, which skips the header.
Also, if you're interested in Python packages, you should have a look at pandas. Assuming you have a CSV file:
import pandas as pd
df = pd.read_csv('xxxx')
for i, elements in df.iterrows():
    if elements['v1'] == elements['v2']:
        print('Equal')
    else:
        print('Different')
will do the trick. If you need to modify values and write another file:
df.to_csv('nameYouWant')
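The effect of the trailing newline can be reproduced without any file. The sketch below is only an illustration: the function name same_columns and the sample lines are invented, and it assumes every data line has at least two comma-separated fields.

```python
def same_columns(lines):
    """Return True/False per data line: do the first two fields match?

    same_columns is a hypothetical helper; the first line is treated as
    the header and skipped.
    """
    results = []
    for line in lines[1:]:
        # strip() drops the trailing "\n" before the line is split.
        first, second = line.strip().split(",")[:2]
        results.append(first == second)
    return results


raw = ["v1,v2\n", "x,y\n", "a,a\n", "192Alhe52,192Alhe52\n"]
flags = same_columns(raw)

# Without strip(), the second field keeps its newline and never matches.
unstripped = raw[2].split(",")
```

Here unstripped is ['a', 'a\n'], which is why the original comparison printed 'Issue' for identical values.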
For one, your issue with the equals test might be because iterating over lines like this also yields the newline character. There is a string method that can get rid of that, .strip(). Also, your argument to split is 2, which splits your row into at most three groups, but that probably doesn't show here. You can avoid having to parse it yourself by using the csv module, as your file presumably is a CSV:
import csv
with open("yourfile.txt") as file:
    reader = csv.reader(file)
    next(reader) # skip header
    for first, second in reader:
        print(first)
        print(second)
        if first == second:
            print(0)
        else:
            print("Issue")
How to find matches between csv files based on two columns within a range
I'm currently struggling to put together some code that will find the matches of values in two different columns in two csv files within a range. I have tried using the code below, but it doesn't output what I am trying to accomplish. Basically, I want to output a new file that contains all of the lines in the second file that have matches to the same columns in the first file, not merge them together. I've added more detailed clarification below my code. I feel like what I've done so far is probably completely wrong. What do I need to change in order for my code to produce the results I am looking for?
import csv
with open('F435W.csv') as csvF435:
    readCSV1 = csv.reader(csvF435, delimiter=',')
    with open("F550Mnew.csv", "w") as new_F550M:
        pass
    with open("F550Mnew.csv", "a") as new_F550M:
        for header in readCSV1:
            new_F550M.write(','.join(header)+'\n')
            break
        for l435 in readCSV1:
            with open('F550M.csv') as csvF550:
                readCSV2 = csv.reader(csvF550, delimiter=',')
                for l550 in readCSV2:
                    if isfloat(l435[12]) and isfloat(l550[12]) and abs(float(l435[12])-float(l550[12])) < 0.002778:
                        if isfloat(l435[13]) and isfloat(l550[13]) and abs(float(l435[13])-float(l550[13])) < 0.002778:
                            new_F550M.write(','.join(l550)+'\n')
For clarification, each file has an X column and a Y column so basically each row corresponds to an (X,Y) point. In addition, there are 21 other columns of data that are not necessary for finding matches, but need to be included in the final output file. I am trying to find points in the second file that match the points in the first file within a radius. This is because I know that none of my points will be exact matches. In my data, my X is column 13 and my Y is column 14. The way I have tried to accomplish this is by finding the differences between every X in the first file and every X in the second file (e.g. X1-X2), and the differences between every Y in the first file and every Y in the second file (e.g. Y1-Y2).
Then, every row in the second file which corresponds to differences for both X and Y which are less than my radius value (0.0002778) would be considered a match to the first file. Unfortunately, my code produces a file with over 300,000 points when my original files only have 7000 points. There should be less data, not more data. It also includes many repeats of data, when there should not be any repeats at all. Thank you for your time! Sample of what the data looks like: I apologize for the length, but I am afraid they will not contain enough matches to be useful if I don't include enough of the data. F435W.csv (file 1) 1,2017.013,0.01242859,-8.2618,0,51434.12,0.3269918,-11.7781,0,0.01957931,1387.9406,541.916,49.9898514,41.5266996,8.81E+01,1.63E+03,1.44E+02,40.535,8.65,84.72,0.00061,0.00035,62.14 2,84.73392,0.01245409,-4.8201,0.0002,112.9723,0.04012135,-5.1324,0.0004,-0.002142646,150.306,146.7986,49.9942613,41.5444392,4.92E+00,5.60E+00,-2.02E-01,2.379,2.206,-74.69,0.00339,0.0029,88.88 3,215.1939,0.01242859,-5.8321,0.0001,262.2751,0.03840466,-6.0469,0.0002,-0.002961465,3248.686,52.8478,50.003155,41.5019044,4.77E+00,5.05E+00,-1.63E-01,2.263,2.166,-65.29,0.002,0.0019,-66.78 4,0.3796681,0.01240305,1.0515,0.0355,0.5823653,0.05487975,0.587,0.1023,-0.00425157,3760.344,11.113,50.0051049,41.4949256,1.93E+00,1.02E+00,-7.42E-02,1.393,1.007,-4.61,0.05461,0.03818,-6.68 5,0.9584663,0.01249223,0.0461,0.0142,1.043696,0.0175857,-0.0464,0.0183,-0.004156116,4013.2063,9.1225,50.0057256,41.4914444,1.12E+00,9.75E-01,1.09E-01,1.085,0.957,28.34,0.01934,0.01745,44.01 6,2.379565,0.01249223,-0.9412,0.0057,0.231205,0.02710035,1.59,0.1273,-0.004135321,3824.3706,9.0756,50.0052903,41.4940468,7.81E-01,6.99E-02,4.27E-02,0.885,0.26,3.42,0.01265,0.00622,15.52 7,0.3171223,0.01250492,1.2469,0.0428,0.5233852,0.05406558,0.7029,0.1122,-0.00399635,4097.3604,7.0301,50.0059585,41.4902884,9.61E-01,1.63E+00,-3.94E-01,1.346,0.883,-65.16,0.06171,0.04005,-65.05 
8,0.289245,0.0125176,1.3468,0.047,0.2744479,0.02238134,1.4039,0.0886,-0.004173243,3904.7402,7.3912,50.0055069,41.4929422,7.90E-01,2.38E-01,7.13E-02,0.894,0.479,7.24,0.04501,0.02071,8.29 9,0.3543034,0.01247953,1.1266,0.0383,0.7666836,0.06376094,0.2885,0.0903,-0.004009248,4107.0684,3.259,50.0060503,41.4901611,3.53E+00,1.28E+00,-4.60E-01,1.903,1.09,-11.12,0.06873,0.03955,-11.22 10,1.308331,0.01250492,-0.2918,0.0104,-0.005209296,0.004877397,99,99,-0.004193406,3933.9834,6,50.0056001,41.4925416,5.78E-01,8.33E-02,0.00E+00,0.76,0.289,0,0.01272,0.00424,0 11,3.995717,0.01250492,-1.504,0.0034,0.1589517,0.007450347,1.9968,0.0509,-0.003990021,4069.0469,3.0234,50.0059668,41.4906855,8.03E-01,2.29E-02,1.02E-02,0.896,0.151,0.75,0.00888,0.00361,5.59 12,1.067634,0.01250492,-0.0711,0.0127,0.1260926,0.02787585,2.2483,0.2401,-0.004042602,4048.9148,4,50.0059023,41.4909612,7.40E-01,8.33E-02,0.00E+00,0.86,0.289,0,0.02449,0.00576,0 13,0.2808423,0.01162418,1.3788,0.0449,0.4633991,0.02235104,0.8351,0.0524,-0.004015559,4114.6655,2.0641,50.0060898,41.4900585,9.65E-01,5.88E-01,-9.47E-02,0.994,0.752,-13.34,0.05405,0.03814,-15.13 14,1.067291,0.01245409,-0.0707,0.0127,1.081617,0.01516444,-0.0852,0.0152,-0.004168633,3960.8787,18.0524,50.0054405,41.4921501,6.84E-01,8.29E-01,-6.18E-02,0.923,0.813,-69.77,0.01468,0.01229,-78.83 15,0.5216251,0.0125176,0.7066,0.0261,0.584776,0.01824955,0.5825,0.0339,-0.003026338,2661.6533,58.4563,50.0016952,41.5099844,8.51E-01,1.17E+00,-7.27E-02,1.089,0.914,-77.72,0.03244,0.02498,-81.68 16,0.6062042,0.01249223,0.5435,0.0224,0.8726375,0.05509822,0.1479,0.0686,-0.003950399,4149.8169,31.0127,50.0056384,41.489524,9.30E-01,3.48E+00,2.03E-01,1.87,0.956,85.48,0.05307,0.0241,86.01 17,0.1324067,0.01242859,2.1952,0.1019,0.1208224,0.01290438,2.2946,0.116,-0.004166729,3911.6807,12.661,50.005426,41.4928374,2.17E-01,2.24E-01,-1.08E-01,0.574,0.335,-45.89,0.0721,0.04162,-44.98 
18,0.2136006,0.01247953,1.676,0.0634,0.3511444,0.02471001,1.1363,0.0764,-0.003978713,4096.9111,15.6285,50.0057993,41.4902797,1.00E+00,4.37E-01,2.85E-01,1.058,0.564,22.64,0.07548,0.03957,23.17 19,0.1470979,0.01244135,2.081,0.0919,0.1216703,0.0168958,2.287,0.1508,-0.004147241,3695.311,13.7044,50.004907,41.4958173,2.14E-01,2.08E-01,9.20E-02,0.551,0.345,44.05,0.07073,0.04115,45.12 20,0.5434682,0.01250492,0.6621,0.025,0.5819249,0.01592951,0.5878,0.0297,-0.004136056,3866.6416,24.8316,50.0050981,41.493437,8.34E-01,9.96E-01,2.74E-01,1.096,0.793,53.22,0.02966,0.02055,58.08 21,0.2259093,0.01249223,1.6152,0.0601,0.2848583,0.01867901,1.3634,0.0712,-0.00409535,3645.521,20.0162,50.0046759,41.4964926,5.71E-01,4.26E-01,-1.11E-02,0.756,0.652,-4.34,0.03735,0.0305,0.08 22,0.9499883,0.01247953,0.0557,0.0143,0.9711754,0.01891141,0.0318,0.0211,-0.003134006,3378.7927,19.5305,50.0040686,41.5001691,8.66E-01,4.09E-01,3.57E-03,0.931,0.639,0.45,0.01623,0.01142,-1.19 23,1.125635,0.01240305,-0.1285,0.012,1.050538,0.02402694,-0.0535,0.0248,-0.003295973,3132.9458,24.9024,50.0034018,41.5035477,9.65E-01,7.83E-01,-1.44E-01,1.022,0.839,-28.88,0.01702,0.01288,-21 24,0.168302,0.01249223,1.9348,0.0806,0.2447732,0.01930529,1.5281,0.0857,-0.004140488,3904.7268,27.0386,50.0051454,41.4929084,4.47E-01,4.56E-01,-1.28E-02,0.682,0.662,-54.61,0.04399,0.04068,89.66 25,0.0542859,0.01244135,3.1633,0.2489,0.08799078,0.007964755,2.6389,0.0983,-0.003241792,3454.2612,25.2749,50.0041373,41.4991191,1.93E-01,1.99E-01,-7.18E-02,0.518,0.353,-46.27,0.06408,0.03839,-44.76 26,0.4379335,0.01242859,0.8965,0.0308,0.4661828,0.01542368,0.8286,0.0359,-0.00336337,3478.7058,32.3355,50.0040639,41.4987701,6.15E-01,8.96E-01,-2.91E-02,0.948,0.782,-84.15,0.02891,0.02521,-70.04 27,0.1515608,0.01249223,2.0485,0.0895,0.1935181,0.01712885,1.7832,0.0961,-0.002904789,2982.0017,29.9904,50.0029594,41.505619,3.46E-01,3.61E-01,1.55E-05,0.601,0.588,89.94,0.05241,0.05241,-80.48 
28,0.6658883,0.01250492,0.4415,0.0204,0.718064,0.01780974,0.3596,0.0269,-0.00324104,3408.0103,36.2539,50.0038284,41.4997375,9.45E-01,1.11E+00,1.98E-01,1.115,0.902,56.45,0.02706,0.02147,51.52 29,0.7244126,0.01244135,0.35,0.0187,1.030102,0.02744665,-0.0322,0.0289,-0.00280412,3259.0889,37.3165,50.0034648,41.5017879,8.65E-01,1.01E+00,5.85E-02,1.017,0.919,70.87,0.02225,0.02011,55.79 30,0.1651701,0.01247953,1.9552,0.0821,0.163293,0.01641976,1.9676,0.1092,-0.003909466,3595.4846,31.9761,50.0043403,41.4971614,2.50E-01,4.42E-01,2.21E-01,0.766,0.324,56.75,0.08087,0.03087,58.28 F550M.csv (file 2) 2,1921.566,0.01258874,-8.2091,0,37128.06,0.2618096,-11.4243,0,0.01455503,4617.5225,554.576,49.9887896,41.5264699,6.09E+01,8.09E+02,1.78E+01,28.459,7.779,88.63,0.00054,0.00036,77.04 3,1.055918,0.01256313,-0.0591,0.0129,9.834856,0.1109255,-2.4819,0.0122,-0.002955142,3936.4946,85.3255,49.9949149,41.5370016,3.98E+01,1.23E+01,1.54E+01,6.83,2.336,24.13,0.06362,0.01965,23.98 4,151.2355,0.01260153,-5.4491,0.0001,184.0693,0.03634057,-5.6625,0.0002,-0.002626019,3409.2642,76.9891,49.9931935,41.5442109,4.02E+00,4.35E+00,-1.47E-03,2.086,2.005,-89.75,0.00227,0.00198,66.61 5,0.3506025,0.01258874,1.138,0.039,0.3466277,0.01300407,1.1503,0.0407,-0.002441164,3351.9893,8.9147,49.9942299,41.5451727,4.97E-01,5.07E-01,7.21E-03,0.715,0.702,62.75,0.02,0.01989,82.88 6,1.166133,0.01257594,-0.1669,0.0117,0.005819145,0.009692424,5.5879,1.8089,-0.003201006,3476.9932,10,49.9946543,41.5434658,5.88E-01,8.33E-02,0.00E+00,0.767,0.289,0,0.01497,0.00499,0 7,0.1372164,0.0125503,2.1565,0.0993,0.1238123,0.02608246,2.2681,0.2288,-0.003556473,3535.5281,13.4586,49.9947993,41.5426587,2.49E-01,2.48E-01,-7.69E-03,0.506,0.491,-43.27,0.05264,0.05237,-55.87 8,0.6174777,0.01260153,0.5234,0.0222,0.6206718,0.01300407,0.5178,0.0228,-0.002441164,3357.0044,20.0487,49.9940449,41.5450748,5.10E-01,5.22E-01,-6.28E-03,0.724,0.712,-66.7,0.01194,0.01192,84.82 
9,1.46848,0.01260153,-0.4172,0.0093,0.001897994,0.009688255,6.8043,5.5435,-0.003612399,3584.0171,16,49.9949252,41.5419909,5.87E-01,8.33E-02,0.00E+00,0.766,0.289,0,0.01175,0.00392,0 10,1.452348,0.01258874,-0.4052,0.0094,3.124427,0.04807406,-1.2369,0.0167,-0.003148756,3805.6069,39.5791,49.9952831,41.5389075,2.25E+00,3.87E+00,-6.77E-01,2.03,1.416,-70.08,0.0302,0.01891,-67.61 11,0.1548658,0.01260153,2.0251,0.0884,0.1777253,0.01630147,1.8756,0.0996,-0.002919044,3459.7681,25.6248,49.9943085,41.5436591,4.64E-01,2.34E-01,8.40E-02,0.701,0.455,18.09,0.05739,0.03321,18.33 12,0.5046132,0.01253746,0.7426,0.027,0.7798272,0.04462456,0.27,0.0621,-0.00261193,3418.9119,65.5326,49.9934365,41.5441099,6.87E-01,2.77E+00,-2.92E-01,1.678,0.804,-82.19,0.05363,0.02182,-83.28 13,0.380733,0.01260153,1.0484,0.0359,0.4313257,0.01605258,0.913,0.0404,-0.003497544,3548.8484,34.5602,49.9944623,41.542421,8.27E-01,8.51E-01,8.92E-02,0.964,0.865,48.75,0.03776,0.03252,30.61 14,0.1643925,0.01258874,1.9603,0.0832,0.2181225,0.01839054,1.6532,0.0916,-0.003121084,3710.6785,33.3215,49.9950598,41.5402182,2.18E-01,2.18E-01,1.03E-01,0.567,0.339,45,0.0757,0.04376,45 15,0.3959635,0.01260153,1.0059,0.0346,0.9984215,0.0763398,0.0017,0.083,-0.003106286,3805.9988,48.3363,49.995125,41.5388789,1.87E+00,3.12E+00,4.86E-01,1.813,1.304,71.09,0.0559,0.04105,67.61 16,0.1625628,0.01260153,1.9724,0.0842,0.3490304,0.02234424,1.1428,0.0695,-0.002472953,3410.77,38.0388,49.9939083,41.544294,1.77E-01,4.75E-01,8.92E-03,0.689,0.421,88.29,0.0769,0.04707,89.86 17,0.1725209,0.01260153,1.9079,0.0793,0.2965718,0.02357189,1.3197,0.0863,-0.003454017,3629.0247,40.9706,49.9946304,41.541311,3.73E-01,7.91E-01,-3.73E-01,1.004,0.393,-59.65,0.09781,0.03734,-58.27 18,0.3034717,0.01260153,1.2947,0.0451,0.5031242,0.02774418,0.7458,0.0599,-0.003073985,4079.0825,42,49.9962105,41.5351731,6.68E-01,8.33E-02,0.00E+00,0.818,0.289,0,0.06348,0.02106,0 
19,1.593927,0.01260153,-0.5062,0.0086,1.860803,0.0219809,-0.6743,0.0128,-0.003038161,4065.9434,58.3703,49.9958657,41.5353087,1.75E+00,1.41E+00,-7.15E-03,1.323,1.188,-1.21,0.01697,0.01464,-0.43 20,0.5464995,0.01258874,0.656,0.025,0.5661472,0.0144696,0.6177,0.0278,-0.003053429,4045.0474,54.439,49.9958631,41.535604,5.43E-01,8.46E-01,-1.22E-03,0.92,0.737,-89.77,0.02257,0.01649,-89.72 21,1.303251,0.01253746,-0.2876,0.0104,1.296672,0.01418861,-0.2821,0.0119,-0.00259741,4240.1406,55.2714,49.9965409,41.5329423,6.05E-01,6.81E-01,7.89E-03,0.826,0.777,84.15,0.00892,0.00852,69.62 22,0.5174786,0.01260153,0.7153,0.0264,0.5260691,0.01390194,0.6974,0.0287,-0.003019847,3828.95,55.19,49.9950817,41.5385478,5.18E-01,7.56E-01,-6.34E-02,0.879,0.709,-75.96,0.0236,0.01643,-75.02 23,0.1551826,0.01260153,2.0229,0.0882,0.166565,0.01726119,1.946,0.1125,-0.003271136,3504.7439,52.7386,49.9939745,41.5429739,1.91E-01,6.86E-01,1.89E-01,0.866,0.356,71.33,0.10376,0.04235,71.56 24,0.2214222,0.01260153,1.6369,0.0618,0.2389908,0.01360924,1.554,0.0618,-0.00285033,3750.3167,54.0027,49.994824,41.5396229,4.32E-01,5.51E-01,1.68E-03,0.742,0.657,89.18,0.04862,0.04505,89.94 25,0.1336059,0.01253746,2.1854,0.1019,0.1320868,0.009830156,2.1979,0.0808,-0.002921393,3459.6851,51.7091,49.9938331,41.5435908,2.16E-01,2.06E-01,-9.16E-02,0.55,0.345,-43.52,0.06231,0.03626,-45.19 26,0.1703959,0.01260153,1.9214,0.0803,0.1577456,0.0152816,2.0051,0.1052,-0.002779523,3446.95,49,49.9938372,41.5437717,7.29E-01,8.33E-02,0.00E+00,0.854,0.289,0,0.11183,0.03721,0 27,1.896325,0.01258874,-0.6948,0.0072,1.941203,0.0152816,-0.7202,0.0085,-0.00306097,3809.6836,57.8143,49.9949655,41.5388035,7.38E-01,6.80E-01,7.46E-03,0.86,0.824,7.18,0.00713,0.00678,59.71 28,0.6522877,0.01260153,0.4639,0.021,0.1713469,0.01312423,1.9153,0.0832,-0.002447558,4271.9614,52,49.9967135,41.5325172,5.92E-01,8.33E-02,0.00E+00,0.77,0.289,0,0.0274,0.00913,0 
29,0.1370073,0.0125503,2.1581,0.0995,0.101415,0.02614047,2.4847,0.2799,-0.002207851,4324.667,55.3374,49.99684,41.5317898,2.22E-01,2.24E-01,1.12E-01,0.579,0.332,45.18,0.07753,0.04476,45 30,0.2240251,0.01253746,1.6243,0.0608,0.2254432,0.01360924,1.6174,0.0656,-0.003037372,3960.3042,58.9024,49.9954807,41.5367473,4.18E-01,4.81E-01,-1.07E-02,0.695,0.645,-80.65,0.03802,0.03492,-88.86
You are complicating the program by nesting all the loops and conditionals. Break it down into simple steps. Do the following:
1. Read both CSV files and convert them into 2D lists.
2. Compare the columns/values of the lists within a loop based on the given index, and add the matching rows from the second list to a new output list.
3. Write the output list to a CSV file.
import csv

def read_file(filepath):
    with open(filepath, 'r') as f:
        return list(csv.reader(f))

l435 = read_file('F435W.csv')
l550 = read_file('F550M.csv')
new_F550M = []
r = 0.002778
for i in l550:
    for j in l435:
        # I didn't exactly get your if condition, so I am writing it down based on what I understood; if it is wrong, modify it accordingly.
        # isfloat is the helper from your own script.
        if isfloat(i[12]) and isfloat(j[12]) and abs(float(i[12]) - float(j[12])) < r:
            if isfloat(i[13]) and isfloat(j[13]) and abs(float(i[13]) - float(j[13])) < r:
                new_F550M.append(i)
                break  # one match is enough; avoids adding the same row twice
with open('new_F550M.csv', 'w', newline='') as f:
    out = csv.writer(f)
    out.writerows(new_F550M)
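The three steps can also be packed into a small function that is easy to test on toy data. This is a sketch under assumptions: is_float is a guess at what the question's undefined isfloat helper does, match_rows and its parameters are invented names, and the toy rows use columns 0 and 1 rather than the real indexes 12 and 13.

```python
def is_float(text):
    """Guess at the question's isfloat helper: can text be parsed as a float?"""
    try:
        float(text)
    except ValueError:
        return False
    return True


def match_rows(reference, candidates, x_col=12, y_col=13, radius=0.002778):
    """Return each candidate row at most once if any reference point is
    within radius in both X and Y (hypothetical helper)."""
    ref_points = [
        (float(r[x_col]), float(r[y_col]))
        for r in reference
        if len(r) > y_col and is_float(r[x_col]) and is_float(r[y_col])
    ]
    matched = []
    for row in candidates:
        if len(row) <= y_col:
            continue  # row too short to hold both coordinates
        if not (is_float(row[x_col]) and is_float(row[y_col])):
            continue  # e.g. a header row
        x, y = float(row[x_col]), float(row[y_col])
        # any() stops at the first hit, so a row is never appended twice.
        if any(abs(x - rx) < radius and abs(y - ry) < radius
               for rx, ry in ref_points):
            matched.append(row)
    return matched


file1 = [["X", "Y"], ["1.0", "1.0"], ["1.001", "1.001"]]
file2 = [["X", "Y"], ["1.0005", "1.0005"], ["5.0", "5.0"]]
result = match_rows(file1, file2, x_col=0, y_col=1)
```

The first row of file2 is close to two reference points but still appears only once in result, which is the behaviour the question asks for.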
Python - Reading a CSV, won't print the contents of the last column
I'm pretty new to Python, and put together a script to parse a csv and ultimately output its data into a repeated html table. I got most of it working, but there's one weird problem I haven't been able to fix. My script will find the index of the last column, but won't print out the data in that column. If I add another column to the end, even an empty one, it'll print out the data in the formerly-last column, so it's not a problem with the contents of that column.
Abridged (but still grumpy) version of the code:
import os
os.chdir('C:\\Python34\\andrea')
import csv
csvOpen = open('my.csv')
exampleReader = csv.reader(csvOpen)
tableHeader = next(exampleReader)
if 'phone' in tableHeader:
    phoneIndex = tableHeader.index('phone')
else:
    phoneIndex = -1
for row in exampleReader:
    row[-1] =''
    print(phoneIndex)
    print(row[phoneIndex])
csvOpen.close()
my.csv:
stuff,phone
1,3235556177
1,3235556170
Output:
1
1
Same script, small change to the CSV file.
my.csv:
stuff,phone,more
1,3235556177,
1,3235556170,
Output:
1
3235556177
1
3235556170
I'm using Python 3.4.3 via Idle 3.4.3. I've had the same problem with CSVs generated directly by mysql, ones that I've opened in Excel first then re-saved as CSVs, and ones I've edited in Notepad++ and re-saved as CSVs. I tried adding several different modes to the open function (r, rU, b, etc.) and either it made no difference or gave me an error (for example, it didn't like 'b'). My workaround is just to add an extra column to the end, but since this is a frequently used script, it'd be much better if it just worked right. Thank you in advance for your help.
row[-1] =''
The CSV reader returns a list representing the row from the file. On this line you set the last value in the list to an empty string, and then you print it afterwards. Delete this line if you don't want the last column to be set to an empty string.
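For comparison, a short sketch that reads the phone column by name and never modifies the row. It uses csv.DictReader from the standard library; the function name phone_numbers and the in-memory sample text are assumptions standing in for my.csv.

```python
import csv
import io


def phone_numbers(text):
    """Return the 'phone' column of CSV text, looked up by header name.

    phone_numbers is a hypothetical helper; it assumes the header row
    contains a column called exactly 'phone'.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [row["phone"] for row in reader]


sample = "stuff,phone\n1,3235556177\n1,3235556170\n"
numbers = phone_numbers(sample)

# The same lookup keeps working when a trailing column is added.
numbers_wide = phone_numbers("stuff,phone,more\n1,3235556177,\n1,3235556170,\n")
```

Looking the column up by name means its position in the file no longer matters.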
If you know it is the last column, you can count the columns and then use that value minus 1. Likewise, you can use your string comparison method if you know the column will always be called "phone". If you are using the string comparison, I recommend converting the value from the CSV to lower case so that you don't have to worry about capitalization. In the code below I created functions that show how to use either method.
import os
import csv

os.chdir('C:\\temp')
csvOpen = open('my.csv')
exampleReader = csv.reader(csvOpen)
tableHeader = next(exampleReader)
phoneColIndex = None  # init to a value that can imply state
lastColIndex = None  # init to a value that can imply state

def getPhoneIndex(header):
    for i, col in enumerate(header):  # use this syntax to get the index of each item
        if col.lower() == 'phone':
            return i
    return -1  # not found (note that -1 would index the last column)

def findLastColIndex(header):
    return len(header) - 1

# methods to check for the phone col: 1. by string comparison and 2. by assuming it's the last col
if len(tableHeader) > 1:  # if only one column or less, why go any further?
    phoneColIndex = getPhoneIndex(tableHeader)
    lastColIndex = findLastColIndex(tableHeader)
    for row in exampleReader:
        print(row[phoneColIndex])
        print('----------')
        print(row[lastColIndex])
        print('----------')
csvOpen.close()
Python row.replace issue
Started fiddling with Python for the first time a week or so ago and have been trying to create a script that will replace instances of a string in a file with a new string. The actual reading and creation of a new file with intended strings seems to be successful, but error checking at the end of the file displays output suggesting that there is an error. I checked a few other threads but couldn't find a solution or alternative that fit what I was looking for or was at a level I was comfortable working with. Apologies for messy/odd code structure, I am very new to the language. Initial four variables are example values.
editElement = "Testvalue"
newElement = "Testvalue2"
readFile = "/Users/Euan/Desktop/Testfile.csv"
writeFile = "/Users/Euan/Desktop/ModifiedFile.csv"
editelementCount1 = 0
newelementCount1 = 0
editelementCount2 = 0
newelementCount2 = 0
#Reading from file
print("Reading file...")
file1 = open(readFile,'r')
fileHolder = file1.readlines()
file1.close()
#Creating modified data
fileHolder_replaced = [row.replace(editElement, newElement) for row in fileHolder]
#Writing to file
file2 = open(writeFile,'w')
file2.writelines(fileHolder_replaced)
file2.close()
print("Modified file generated!")
#Error checking
for row in fileHolder:
    if editElement in row:
        editelementCount1 +=1
for row in fileHolder:
    if newElement in row:
        newelementCount1 +=1
for row in fileHolder_replaced:
    if editElement in row:
        editelementCount2 +=1
for row in fileHolder_replaced:
    if newElement in row:
        newelementCount2 +=1
print(editelementCount1 + newelementCount1)
print(editelementCount2 +newelementCount2)
Expected output would be the last two instances of 'print' displaying the same value, however... The first instance of print returns the value of A + B as expected. The second line only returns the value of B (from fileHolder), and from what I can see, A has indeed been converted to B (in fileHolder_replaced).
Edit: For example, if the first two counts show A and B to be 2029 and 1619 respectively (fileHolder), the last two counts show A as 0 and B as 2029 (fileHolder_replace). Obviously this is missing the original value of B.
Here is a more extended version of my comment. If you look for "Testvalue" in the modified file, the check will find the string even in places where you assume it is "Testvalue2". That's because the original value is a substring of the modified value. Therefore it should find twice the number of occurrences, or more precisely, the number of lines in which the string occurs. If you query
if newElement in row
it checks whether the string newElement is contained anywhere in the string row.
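A small sketch of the difference between a substring test and an exact field test. The two helper names and the sample rows are made up, and splitting on commas is only an assumption about how the CSV fields are laid out.

```python
def count_lines_containing(rows, needle):
    """Count lines in which needle occurs anywhere, as `needle in row` does."""
    return sum(needle in row for row in rows)


def count_lines_with_field(rows, value):
    """Count lines in which one comma-separated field equals value exactly."""
    return sum(value in row.rstrip("\n").split(",") for row in rows)


rows = ["Testvalue,1\n", "Testvalue2,2\n", "other,3\n"]

substring_hits = count_lines_containing(rows, "Testvalue")
exact_hits = count_lines_with_field(rows, "Testvalue")
```

The substring test also counts the "Testvalue2" line, while the exact field test counts only the line whose field is "Testvalue", so the two kinds of counts cannot simply be added and compared.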