A dynamic list/array which contains matrices and vectors - python

so I have the following setup. I have a folder with two different file types. One file type (.m) are files of matrices and the other file type are the corresponding vectors (.txt). They have the same name i.e. matrix1.m and matrix1.txt
I want to create a list or an array such that I can save each matrix and each corresponding vector. I thought about a panda data frame, but I do not know if this is useful here.
Is there a way to do this?
for file1 in os.listdir(path):
file1name = os.fsdecode(file1)
if file1.endswith('.m'):
file2 = file1name.replace('.m', '.txt')
file2name = os.fsdecode(file2)
matrix = np.matrix(scipy.io.mmread(os.path.join(path, file1name)).toarray())
vector = np.loadtxt(os.path.join(path, file2name)
continue

This is how I would approach something like this given-
I need to do and store some result for each pair of files
Each pair is independent
The reason the code above isn't enough is because you wanted to try and speed things up.
Don't know if this will help, but it is one approach.
pool = ThreadPool()
op_names = {name.split('.')[0] for name in os.listdir(path)}
def operation(name):
matrix_filename = f'name.m'
vector_filename = f'name.txt'
return name, do_op(matrix_filename, vector_filename)
results = pool.map(operation, op_names)

Related

How to read in multiple documents with same code?

So I have a couple of documents, of which each has a x and y coordinate (among other stuff). I wrote some code which is able to filter out said x and y coordinates and store them into float variables.
Now Ideally I'd want to find a way to run the same code on all documents I have (number not fixed, but let's say 3 for now), extract x and y coordinates of each document and calculate an average of these 3 x-values and 3 y-values.
How would I approach this? Never done before.
I successfully created the code to extract the relevant data from 1 file.
Also note: In reality each file has more than just 1 set of x and y coordinates but this does not matter for the problem discussed at hand.
I'm just saying that so that the code does not confuse you.
with open('TestData.txt', 'r' ) as f:
full_array = f.readlines()
del full_array[1:31]
del full_array[len(full_array)-4:len(full_array)]
single_line = full_array[1].split(", ")
x_coord = float(single_line[0].replace("1 Location: ",""))
y_coord = float(single_line[1])
size = float(single_line[3].replace("Size: ",""))
#Remove unecessary stuff
category= single_line[6].replace(" Type: Class: 1D Descr: None","")
In the end I'd like to not have to write the same code for each file another time, especially since the amount of files may vary. Now I have 3 files which equals to 3 sets of coordinates. But on another day I might have 5 for example.
Use os.walk to find the files that you want. Then for each file do you calculation.
https://docs.python.org/2/library/os.html#os.walk
First of all create a method to read a file via it's file name and do the parsing in your way. Now iterate through the directory,I guess files are in the same directory.
Here is the basic code:
import os
def readFile(filename):
try:
with open(filename, 'r') as file:
data = file.read()
return data
except:
return ""
for filename in os.listdir('C:\\Users\\UserName\\Documents'):
#print(filename)
data=readFile( filename)
print(data)
#parse here
#do the calculation here

How to find matches between csv files based on two columns within a range

I'm currently struggling to put together some code that will find the matches of values in two different columns in two csv files within a range. I have tried using the code below, but it doesn't output what I am trying to accomplish. Basically, I want to output a new file that contains all of the lines in the second file that have matches to the same columns in the first file, not merge them together. I've added more detailed clarification below my code. I feel like what I've done so far is probably completely wrong. What do I need to change in order for my code to produce the results I am looking for?
import csv
with open('F435W.csv') as csvF435:
readCSV1 = csv.reader(csvF435, delimiter=',')
with open("F550Mnew.csv", "w") as new_F550M:
pass
with open("F550Mnew.csv", "a") as new_F550M:
for header in readCSV1:
new_F550M.write(','.join(header)+'\n')
break
for l435 in readCSV1:
with open('F550M.csv') as csvF550:
readCSV2 = csv.reader(csvF550, delimiter=',')
for l550 in readCSV2:
if isfloat(l435[12]) and isfloat(l550[12]) and abs(float(l435[12])-float(l550[12])) < 0.002778:
if isfloat(l435[13]) and isfloat(l550[13]) and abs(float(l435[13])-float(l550[13])) < 0.002778:
new_F550M.write(','.join(l550)+'\n')
For clarification, each file has an X column and a Y column so basically each row corresponds to an (X,Y) point. In addition, there are 21 other columns of data that are not necessary for finding matches, but need to be included in the final output file. I am trying to find points in the second file that match the points in the first file within a radius. This is because I know that none of my points will be exact matches. In my data, my X is column 13 and my Y is column 14.
The way I have tried to accomplish this is by finding the differences between every X in the first file and every X in the second file (eg. X1-X2), and the differences between every Y in the first file and every Y in the second file (eg. Y1-Y2). Then, every row in the second file which corresponds to differences for both X and Y which are less than my radius value (0.0002778) would be considered a match to the first file.
Unfortunately, my code produces a file with over 300,000 points when my original files only have 7000 points. There should be less data, not more data. It also includes many repeats of data, when there should not be any repeats at all.
Thank you for your time!
Sample of what the data looks like: I apologize for the length, but I am afraid they will not contain enough matches to be useful if I don't include enough of the data.
F435W.csv (file 1)
1,2017.013,0.01242859,-8.2618,0,51434.12,0.3269918,-11.7781,0,0.01957931,1387.9406,541.916,49.9898514,41.5266996,8.81E+01,1.63E+03,1.44E+02,40.535,8.65,84.72,0.00061,0.00035,62.14
2,84.73392,0.01245409,-4.8201,0.0002,112.9723,0.04012135,-5.1324,0.0004,-0.002142646,150.306,146.7986,49.9942613,41.5444392,4.92E+00,5.60E+00,-2.02E-01,2.379,2.206,-74.69,0.00339,0.0029,88.88
3,215.1939,0.01242859,-5.8321,0.0001,262.2751,0.03840466,-6.0469,0.0002,-0.002961465,3248.686,52.8478,50.003155,41.5019044,4.77E+00,5.05E+00,-1.63E-01,2.263,2.166,-65.29,0.002,0.0019,-66.78
4,0.3796681,0.01240305,1.0515,0.0355,0.5823653,0.05487975,0.587,0.1023,-0.00425157,3760.344,11.113,50.0051049,41.4949256,1.93E+00,1.02E+00,-7.42E-02,1.393,1.007,-4.61,0.05461,0.03818,-6.68
5,0.9584663,0.01249223,0.0461,0.0142,1.043696,0.0175857,-0.0464,0.0183,-0.004156116,4013.2063,9.1225,50.0057256,41.4914444,1.12E+00,9.75E-01,1.09E-01,1.085,0.957,28.34,0.01934,0.01745,44.01
6,2.379565,0.01249223,-0.9412,0.0057,0.231205,0.02710035,1.59,0.1273,-0.004135321,3824.3706,9.0756,50.0052903,41.4940468,7.81E-01,6.99E-02,4.27E-02,0.885,0.26,3.42,0.01265,0.00622,15.52
7,0.3171223,0.01250492,1.2469,0.0428,0.5233852,0.05406558,0.7029,0.1122,-0.00399635,4097.3604,7.0301,50.0059585,41.4902884,9.61E-01,1.63E+00,-3.94E-01,1.346,0.883,-65.16,0.06171,0.04005,-65.05
8,0.289245,0.0125176,1.3468,0.047,0.2744479,0.02238134,1.4039,0.0886,-0.004173243,3904.7402,7.3912,50.0055069,41.4929422,7.90E-01,2.38E-01,7.13E-02,0.894,0.479,7.24,0.04501,0.02071,8.29
9,0.3543034,0.01247953,1.1266,0.0383,0.7666836,0.06376094,0.2885,0.0903,-0.004009248,4107.0684,3.259,50.0060503,41.4901611,3.53E+00,1.28E+00,-4.60E-01,1.903,1.09,-11.12,0.06873,0.03955,-11.22
10,1.308331,0.01250492,-0.2918,0.0104,-0.005209296,0.004877397,99,99,-0.004193406,3933.9834,6,50.0056001,41.4925416,5.78E-01,8.33E-02,0.00E+00,0.76,0.289,0,0.01272,0.00424,0
11,3.995717,0.01250492,-1.504,0.0034,0.1589517,0.007450347,1.9968,0.0509,-0.003990021,4069.0469,3.0234,50.0059668,41.4906855,8.03E-01,2.29E-02,1.02E-02,0.896,0.151,0.75,0.00888,0.00361,5.59
12,1.067634,0.01250492,-0.0711,0.0127,0.1260926,0.02787585,2.2483,0.2401,-0.004042602,4048.9148,4,50.0059023,41.4909612,7.40E-01,8.33E-02,0.00E+00,0.86,0.289,0,0.02449,0.00576,0
13,0.2808423,0.01162418,1.3788,0.0449,0.4633991,0.02235104,0.8351,0.0524,-0.004015559,4114.6655,2.0641,50.0060898,41.4900585,9.65E-01,5.88E-01,-9.47E-02,0.994,0.752,-13.34,0.05405,0.03814,-15.13
14,1.067291,0.01245409,-0.0707,0.0127,1.081617,0.01516444,-0.0852,0.0152,-0.004168633,3960.8787,18.0524,50.0054405,41.4921501,6.84E-01,8.29E-01,-6.18E-02,0.923,0.813,-69.77,0.01468,0.01229,-78.83
15,0.5216251,0.0125176,0.7066,0.0261,0.584776,0.01824955,0.5825,0.0339,-0.003026338,2661.6533,58.4563,50.0016952,41.5099844,8.51E-01,1.17E+00,-7.27E-02,1.089,0.914,-77.72,0.03244,0.02498,-81.68
16,0.6062042,0.01249223,0.5435,0.0224,0.8726375,0.05509822,0.1479,0.0686,-0.003950399,4149.8169,31.0127,50.0056384,41.489524,9.30E-01,3.48E+00,2.03E-01,1.87,0.956,85.48,0.05307,0.0241,86.01
17,0.1324067,0.01242859,2.1952,0.1019,0.1208224,0.01290438,2.2946,0.116,-0.004166729,3911.6807,12.661,50.005426,41.4928374,2.17E-01,2.24E-01,-1.08E-01,0.574,0.335,-45.89,0.0721,0.04162,-44.98
18,0.2136006,0.01247953,1.676,0.0634,0.3511444,0.02471001,1.1363,0.0764,-0.003978713,4096.9111,15.6285,50.0057993,41.4902797,1.00E+00,4.37E-01,2.85E-01,1.058,0.564,22.64,0.07548,0.03957,23.17
19,0.1470979,0.01244135,2.081,0.0919,0.1216703,0.0168958,2.287,0.1508,-0.004147241,3695.311,13.7044,50.004907,41.4958173,2.14E-01,2.08E-01,9.20E-02,0.551,0.345,44.05,0.07073,0.04115,45.12
20,0.5434682,0.01250492,0.6621,0.025,0.5819249,0.01592951,0.5878,0.0297,-0.004136056,3866.6416,24.8316,50.0050981,41.493437,8.34E-01,9.96E-01,2.74E-01,1.096,0.793,53.22,0.02966,0.02055,58.08
21,0.2259093,0.01249223,1.6152,0.0601,0.2848583,0.01867901,1.3634,0.0712,-0.00409535,3645.521,20.0162,50.0046759,41.4964926,5.71E-01,4.26E-01,-1.11E-02,0.756,0.652,-4.34,0.03735,0.0305,0.08
22,0.9499883,0.01247953,0.0557,0.0143,0.9711754,0.01891141,0.0318,0.0211,-0.003134006,3378.7927,19.5305,50.0040686,41.5001691,8.66E-01,4.09E-01,3.57E-03,0.931,0.639,0.45,0.01623,0.01142,-1.19
23,1.125635,0.01240305,-0.1285,0.012,1.050538,0.02402694,-0.0535,0.0248,-0.003295973,3132.9458,24.9024,50.0034018,41.5035477,9.65E-01,7.83E-01,-1.44E-01,1.022,0.839,-28.88,0.01702,0.01288,-21
24,0.168302,0.01249223,1.9348,0.0806,0.2447732,0.01930529,1.5281,0.0857,-0.004140488,3904.7268,27.0386,50.0051454,41.4929084,4.47E-01,4.56E-01,-1.28E-02,0.682,0.662,-54.61,0.04399,0.04068,89.66
25,0.0542859,0.01244135,3.1633,0.2489,0.08799078,0.007964755,2.6389,0.0983,-0.003241792,3454.2612,25.2749,50.0041373,41.4991191,1.93E-01,1.99E-01,-7.18E-02,0.518,0.353,-46.27,0.06408,0.03839,-44.76
26,0.4379335,0.01242859,0.8965,0.0308,0.4661828,0.01542368,0.8286,0.0359,-0.00336337,3478.7058,32.3355,50.0040639,41.4987701,6.15E-01,8.96E-01,-2.91E-02,0.948,0.782,-84.15,0.02891,0.02521,-70.04
27,0.1515608,0.01249223,2.0485,0.0895,0.1935181,0.01712885,1.7832,0.0961,-0.002904789,2982.0017,29.9904,50.0029594,41.505619,3.46E-01,3.61E-01,1.55E-05,0.601,0.588,89.94,0.05241,0.05241,-80.48
28,0.6658883,0.01250492,0.4415,0.0204,0.718064,0.01780974,0.3596,0.0269,-0.00324104,3408.0103,36.2539,50.0038284,41.4997375,9.45E-01,1.11E+00,1.98E-01,1.115,0.902,56.45,0.02706,0.02147,51.52
29,0.7244126,0.01244135,0.35,0.0187,1.030102,0.02744665,-0.0322,0.0289,-0.00280412,3259.0889,37.3165,50.0034648,41.5017879,8.65E-01,1.01E+00,5.85E-02,1.017,0.919,70.87,0.02225,0.02011,55.79
30,0.1651701,0.01247953,1.9552,0.0821,0.163293,0.01641976,1.9676,0.1092,-0.003909466,3595.4846,31.9761,50.0043403,41.4971614,2.50E-01,4.42E-01,2.21E-01,0.766,0.324,56.75,0.08087,0.03087,58.28
F550M.csv (file 2)
2,1921.566,0.01258874,-8.2091,0,37128.06,0.2618096,-11.4243,0,0.01455503,4617.5225,554.576,49.9887896,41.5264699,6.09E+01,8.09E+02,1.78E+01,28.459,7.779,88.63,0.00054,0.00036,77.04
3,1.055918,0.01256313,-0.0591,0.0129,9.834856,0.1109255,-2.4819,0.0122,-0.002955142,3936.4946,85.3255,49.9949149,41.5370016,3.98E+01,1.23E+01,1.54E+01,6.83,2.336,24.13,0.06362,0.01965,23.98
4,151.2355,0.01260153,-5.4491,0.0001,184.0693,0.03634057,-5.6625,0.0002,-0.002626019,3409.2642,76.9891,49.9931935,41.5442109,4.02E+00,4.35E+00,-1.47E-03,2.086,2.005,-89.75,0.00227,0.00198,66.61
5,0.3506025,0.01258874,1.138,0.039,0.3466277,0.01300407,1.1503,0.0407,-0.002441164,3351.9893,8.9147,49.9942299,41.5451727,4.97E-01,5.07E-01,7.21E-03,0.715,0.702,62.75,0.02,0.01989,82.88
6,1.166133,0.01257594,-0.1669,0.0117,0.005819145,0.009692424,5.5879,1.8089,-0.003201006,3476.9932,10,49.9946543,41.5434658,5.88E-01,8.33E-02,0.00E+00,0.767,0.289,0,0.01497,0.00499,0
7,0.1372164,0.0125503,2.1565,0.0993,0.1238123,0.02608246,2.2681,0.2288,-0.003556473,3535.5281,13.4586,49.9947993,41.5426587,2.49E-01,2.48E-01,-7.69E-03,0.506,0.491,-43.27,0.05264,0.05237,-55.87
8,0.6174777,0.01260153,0.5234,0.0222,0.6206718,0.01300407,0.5178,0.0228,-0.002441164,3357.0044,20.0487,49.9940449,41.5450748,5.10E-01,5.22E-01,-6.28E-03,0.724,0.712,-66.7,0.01194,0.01192,84.82
9,1.46848,0.01260153,-0.4172,0.0093,0.001897994,0.009688255,6.8043,5.5435,-0.003612399,3584.0171,16,49.9949252,41.5419909,5.87E-01,8.33E-02,0.00E+00,0.766,0.289,0,0.01175,0.00392,0
10,1.452348,0.01258874,-0.4052,0.0094,3.124427,0.04807406,-1.2369,0.0167,-0.003148756,3805.6069,39.5791,49.9952831,41.5389075,2.25E+00,3.87E+00,-6.77E-01,2.03,1.416,-70.08,0.0302,0.01891,-67.61
11,0.1548658,0.01260153,2.0251,0.0884,0.1777253,0.01630147,1.8756,0.0996,-0.002919044,3459.7681,25.6248,49.9943085,41.5436591,4.64E-01,2.34E-01,8.40E-02,0.701,0.455,18.09,0.05739,0.03321,18.33
12,0.5046132,0.01253746,0.7426,0.027,0.7798272,0.04462456,0.27,0.0621,-0.00261193,3418.9119,65.5326,49.9934365,41.5441099,6.87E-01,2.77E+00,-2.92E-01,1.678,0.804,-82.19,0.05363,0.02182,-83.28
13,0.380733,0.01260153,1.0484,0.0359,0.4313257,0.01605258,0.913,0.0404,-0.003497544,3548.8484,34.5602,49.9944623,41.542421,8.27E-01,8.51E-01,8.92E-02,0.964,0.865,48.75,0.03776,0.03252,30.61
14,0.1643925,0.01258874,1.9603,0.0832,0.2181225,0.01839054,1.6532,0.0916,-0.003121084,3710.6785,33.3215,49.9950598,41.5402182,2.18E-01,2.18E-01,1.03E-01,0.567,0.339,45,0.0757,0.04376,45
15,0.3959635,0.01260153,1.0059,0.0346,0.9984215,0.0763398,0.0017,0.083,-0.003106286,3805.9988,48.3363,49.995125,41.5388789,1.87E+00,3.12E+00,4.86E-01,1.813,1.304,71.09,0.0559,0.04105,67.61
16,0.1625628,0.01260153,1.9724,0.0842,0.3490304,0.02234424,1.1428,0.0695,-0.002472953,3410.77,38.0388,49.9939083,41.544294,1.77E-01,4.75E-01,8.92E-03,0.689,0.421,88.29,0.0769,0.04707,89.86
17,0.1725209,0.01260153,1.9079,0.0793,0.2965718,0.02357189,1.3197,0.0863,-0.003454017,3629.0247,40.9706,49.9946304,41.541311,3.73E-01,7.91E-01,-3.73E-01,1.004,0.393,-59.65,0.09781,0.03734,-58.27
18,0.3034717,0.01260153,1.2947,0.0451,0.5031242,0.02774418,0.7458,0.0599,-0.003073985,4079.0825,42,49.9962105,41.5351731,6.68E-01,8.33E-02,0.00E+00,0.818,0.289,0,0.06348,0.02106,0
19,1.593927,0.01260153,-0.5062,0.0086,1.860803,0.0219809,-0.6743,0.0128,-0.003038161,4065.9434,58.3703,49.9958657,41.5353087,1.75E+00,1.41E+00,-7.15E-03,1.323,1.188,-1.21,0.01697,0.01464,-0.43
20,0.5464995,0.01258874,0.656,0.025,0.5661472,0.0144696,0.6177,0.0278,-0.003053429,4045.0474,54.439,49.9958631,41.535604,5.43E-01,8.46E-01,-1.22E-03,0.92,0.737,-89.77,0.02257,0.01649,-89.72
21,1.303251,0.01253746,-0.2876,0.0104,1.296672,0.01418861,-0.2821,0.0119,-0.00259741,4240.1406,55.2714,49.9965409,41.5329423,6.05E-01,6.81E-01,7.89E-03,0.826,0.777,84.15,0.00892,0.00852,69.62
22,0.5174786,0.01260153,0.7153,0.0264,0.5260691,0.01390194,0.6974,0.0287,-0.003019847,3828.95,55.19,49.9950817,41.5385478,5.18E-01,7.56E-01,-6.34E-02,0.879,0.709,-75.96,0.0236,0.01643,-75.02
23,0.1551826,0.01260153,2.0229,0.0882,0.166565,0.01726119,1.946,0.1125,-0.003271136,3504.7439,52.7386,49.9939745,41.5429739,1.91E-01,6.86E-01,1.89E-01,0.866,0.356,71.33,0.10376,0.04235,71.56
24,0.2214222,0.01260153,1.6369,0.0618,0.2389908,0.01360924,1.554,0.0618,-0.00285033,3750.3167,54.0027,49.994824,41.5396229,4.32E-01,5.51E-01,1.68E-03,0.742,0.657,89.18,0.04862,0.04505,89.94
25,0.1336059,0.01253746,2.1854,0.1019,0.1320868,0.009830156,2.1979,0.0808,-0.002921393,3459.6851,51.7091,49.9938331,41.5435908,2.16E-01,2.06E-01,-9.16E-02,0.55,0.345,-43.52,0.06231,0.03626,-45.19
26,0.1703959,0.01260153,1.9214,0.0803,0.1577456,0.0152816,2.0051,0.1052,-0.002779523,3446.95,49,49.9938372,41.5437717,7.29E-01,8.33E-02,0.00E+00,0.854,0.289,0,0.11183,0.03721,0
27,1.896325,0.01258874,-0.6948,0.0072,1.941203,0.0152816,-0.7202,0.0085,-0.00306097,3809.6836,57.8143,49.9949655,41.5388035,7.38E-01,6.80E-01,7.46E-03,0.86,0.824,7.18,0.00713,0.00678,59.71
28,0.6522877,0.01260153,0.4639,0.021,0.1713469,0.01312423,1.9153,0.0832,-0.002447558,4271.9614,52,49.9967135,41.5325172,5.92E-01,8.33E-02,0.00E+00,0.77,0.289,0,0.0274,0.00913,0
29,0.1370073,0.0125503,2.1581,0.0995,0.101415,0.02614047,2.4847,0.2799,-0.002207851,4324.667,55.3374,49.99684,41.5317898,2.22E-01,2.24E-01,1.12E-01,0.579,0.332,45.18,0.07753,0.04476,45
30,0.2240251,0.01253746,1.6243,0.0608,0.2254432,0.01360924,1.6174,0.0656,-0.003037372,3960.3042,58.9024,49.9954807,41.5367473,4.18E-01,4.81E-01,-1.07E-02,0.695,0.645,-80.65,0.03802,0.03492,-88.86
You are complicating the program by nesting all the loops and conditionals. Break it down into simple steps.
Do the following.
1. Read both the csv files and convert them into 2d lists.
2. Compare the columns/values of the lists within a loop based on the given index, add the rows from second list to a new output list.
3. Write the output list to a csv file.
def read_file(filepath):
with open(filepath,'r') as f:
x = csv.reader(f)
l = list(x)
return l
l435 = read_file('F435W.csv')
l550 = read_file('F550M.csv')
new_F550M = []
r = 0.002778
for i in l550:
for j in l435:
# I did't exactly get your if condition, so I am putting it down based on what I understood, so if it is wrong, modify it accordingly.
if isfloat(i[12]) and isfloat(j[12]) and abs(float(i[12]) float(j[12])) < r:
if isfloat(i[13]) and isfloat(j[13]) and abs(float(i[13]) float(j[13])) < r:
new_F550M.append(i)
with open('new_F550M.csv','w') as f:
out = csv.writer(f)
out.writerows(new_F550M)

How to iteratively bring in different layers from a folder and assign them each different variable names?

I have been working with some code that exports layers individually filled with important data into a folder. The next thing I want to do is bring each one of those layers into a different program so that I can combine them and do some different tests. The current way that I know how to do it is by importing them one by one (as seen below).
fn0 = 'layer0'
f0 = np.genfromtxt(fn0 + '.csv', delimiter=",")
fn1 = 'layer1'
f1 = np.genfromtxt(fn1 + '.csv', delimiter=",")
The issue with continuing this way is that I may have to deal with up to 100 layers at a time, and it would be very inconvenient to have to import each layer individually.
Is there a way I can change my code to do this iteratively so that I can have a code similar to such:
N = 100
for i in range(N)
fn(i) = 'layer(i)'
f(i) = np.genfromtxt(fn(i) + '.csv', delimiter=",")
Please let me know if you know of any ways!
you can use string formatting as follows
N = 100
f = [] #create an empty list
for i in range(N)
fn_i = 'layer(%d)'%i #parentheses!
f.append(np.genfromtxt(fn_i + '.csv', delimiter=",")) #add to f
What I mean by parentheses! is that they are 'important' characters. They indicate function calls and tuples, so you shouldn't use them in variables (ever!)
The answer of Mohammad Athar is correct. However, you should not use the % printing any longer. According to PEP 3101 (https://www.python.org/dev/peps/pep-3101/) it is supposed to be replaced by format(). Moreover, as you have more than 100 files a format like layer_007.csv is probably appreciated.
Try something like:
dataDict=dict()
for counter in range(214):
fileName = 'layer_{number:03d}.csv'.format(number=counter)
dataDict[fileName] = np.genfromtxt( fileName, delimiter="," )
When using a dictionary, like here, you can directly access your data later by using the file name; it is unsorted though, such that you might prefer the list version of Mohammad Athar.

How do I get a text output from a string created from an array to remain unshortened?

Python/Numpy Problem. Final year Physics undergrad... I have a small piece of code that creates an array (essentially an n×n matrix) from a formula. I reshape the array to a single column of values, create a string from that, format it to remove extraneous brackets etc, then output the result to a text file saved in the user's Documents directory, which is then used by another piece of software. The trouble is above a certain value for "n" the output gives me only the first and last three values, with "...," in between. I think that Python is automatically abridging the final result to save time and resources, but I need all those values in the final text file, regardless of how long it takes to process, and I can't for the life of me find how to stop it doing it. Relevant code copied beneath...
import numpy as np; import os.path ; import os
'''
Create a single column matrix in text format from Gaussian Eqn.
'''
save_path = os.path.join(os.path.expandvars("%userprofile%"),"Documents")
name_of_file = 'outputfile' #<---- change this as required.
completeName = os.path.join(save_path, name_of_file+".txt")
matsize = 32
def gaussf(x,y): #defining gaussian but can be any f(x,y)
pisig = 1/(np.sqrt(2*np.pi) * matsize) #first term
sumxy = (-(x**2 + y**2)) #sum of squares term
expden = (2 * (matsize/1.0)**2) # 2 sigma squared
expn = pisig * np.exp(sumxy/expden) # and put it all together
return expn
matrix = [[ gaussf(x,y) ]\
for x in range(-matsize/2, matsize/2)\
for y in range(-matsize/2, matsize/2)]
zmatrix = np.reshape(matrix, (matsize*matsize, 1))column
string2 = (str(zmatrix).replace('[','').replace(']','').replace(' ', ''))
zbfile = open(completeName, "w")
zbfile.write(string2)
zbfile.close()
print completeName
num_lines = sum(1 for line in open(completeName))
print num_lines
Any help would be greatly appreciated!
Generally you should iterate over the array/list if you just want to write the contents.
zmatrix = np.reshape(matrix, (matsize*matsize, 1))
with open(completeName, "w") as zbfile: # with closes your files automatically
for row in zmatrix:
zbfile.writelines(map(str, row))
zbfile.write("\n")
Output:
0.00970926751178
0.00985735189176
0.00999792646484
0.0101306077521
0.0102550302672
0.0103708481917
0.010477736974
0.010575394844
0.0106635442315
.........................
But using numpy we simply need to use tofile:
zmatrix = np.reshape(matrix, (matsize*matsize, 1))
# pass sep or you will get binary output
zmatrix.tofile(completeName,sep="\n")
Output is in the same format as above.
Calling str on the matrix will give you similarly formatted output to what you get when you try to print so that is what you are writing to the file the formatted truncated output.
Considering you are using python2, using xrange would be more efficient that using rane which creates a list, also having multiple imports separated by colons is not recommended, you can simply:
import numpy as np, os.path, os
Also variables and function names should use underscores z_matrix,zb_file,complete_name etc..
You shouldn't need to fiddle with the string representations of numpy arrays. One way is to use tofile:
zmatrix.tofile('output.txt', sep='\n')

Read all files in ascending order to an array in python

I am trying to read around 200 files in a directory into an array sequentially:
2 approaches:
results = [open(f) for f in glob.glob("*.bin")]
here, this gave me an error that lot of files are opened.
for f in glob.glob("*.bin"):
print f
This gives me an unordered list and I am not sure how to use sorted(f,key=itemgetter(0))
Also, Once I read file 0 into an array, I need to do some array ordering and then concatenate with the data from file 1 and so on till the last file in the directory (Assuming that file 0, file 1 are in ascending order). For this, I declare x0 = 0 and then concatenate it this way:
x = numpy.concatenate((x0, x), axis=1)
Thanks in advance for any suggestions !
Edit 1:
I tried your method in this way:
x0 = numpy.zeros(shape=(1026, 718))
f = sorted(glob.glob('*.bin'))
for f in sorted(glob.glob('*.bin')):
print f ## prints files ordered
x = numpy.concatenate((x0, x), axis=1)
x0 = x
I get the following error:
x = numpy.concatenate((x0, x), axis=1)
MemoryError
Probably want to only have one file open at a time:
def my_open(path):
with open(path, 'r') as f:
return numpy.loadtxt(f)
If you just need sorted order like ls would do, you can pass a list through sorted. If you need something fancier, see the sorting howto.
If you can fit all the records into half your available memory, this is probably a bit faster:
numpy.concatenate(map(my_open, sorted(glob('*.bin'))))
Otherwise you can concatenate in a loop and save memory:
arr = numpy.zeros()
for path in sorted(glob('*.bin')):
arr = numpy.concatenate((arr, my_open(path))
I'm not clear about what error you're getting. Are you trying to sort by filename? Then it would make more sense to do so on the list of filenames returned:
filenames = sorted(glob.glob('*.bin'))
Then you can go through the list of filenames and successively read the files to extract the data you need.
for filename in sorted(glob.glob('*.bin')):
f = open(filename)
# Perform your necessary array ordering
# Concatenate the data
If you want to do this backward, that is read the data from your last file into x, then read and append the data from the second to last, then just change your sort to
sorted(glob.glob('*.bin'), reverse = True)

Categories