Single Array & Data Plotting with it - python

The data I have is stored in a .txt file with line breaks between readings, but it is really just one long sequence of values. Each data point is a 16-second average of the z-component of the magnetic field of an incoming particle field. This is the code I currently use to load the file into a variable:
Bz = np.loadtxt(r'C:\Users\Schmidt\Desktop\Project\Data\ACE\MAG\ACE_MAG_Data.txt', dtype = str)
and that works fine, but when I ask to print Bz I get
[["b'-1.3695e+01'" "b'-1.3481e+01'"]
["b'-1.3804e+01'" "b'-1.3485e+01'"]
["b'-1.3704e+01'" "b'-1.3437e+01'"]
...,
["b'1.6371e+00'" "b'6.2744e-01'"]
["b'1.6171e+00'" "b'6.1338e-01'"]
["b'1.4028e+00'" "b'3.2874e-01'"]]
My problem is: how did that "b" get there in the first place, and how do I tell Python that each data point is an individual point rather than part of a pair, as it has them now?
This is the link to the file if you need to see. Just remember to remove the words and the file should act appropriately.

numpy is loading your data as bytes, and marking them as such, with the b character. I tried the following:
data = np.loadtxt("ACE_MAG_Data.txt", dtype=bytes).astype(float)
And it converts everything to floats instead:
>>> data
array([[-13.695 , -13.481 ],
[-13.804 , -13.485 ],
[-13.704 , -13.437 ],
...,
[ 1.6371 , 0.62744],
[ 1.6171 , 0.61338],
[ 1.4028 , 0.32874]])
You mention that these are individual points, not pairs. If you had saved them in the file one per line, numpy wouldn't treat them as pairs. With two values on each line, I would read them as pairs too :)
However:
singles = [point for pair in data for point in pair]
Will convert it to a list of single points.
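If the file holds nothing but whitespace-separated numbers, the pairing is only an artifact of the line breaks. Below is a minimal sketch that reads every value into one flat list, assuming the header words have already been removed from the file as the question describes. `read_flat` is a hypothetical helper name, and the in-memory sample stands in for the real file. On the numpy array itself, `data.ravel()` gives the same row-by-row order.

```python
import io

def read_flat(f):
    # str.split() with no argument splits on any whitespace, including
    # newlines, so the two-per-line layout of the file disappears.
    return [float(token) for token in f.read().split()]

# Stand-in for open('ACE_MAG_Data.txt'); values copied from the output above.
sample = io.StringIO("-1.3695e+01 -1.3481e+01\n-1.3804e+01 -1.3485e+01\n")
singles = read_flat(sample)
print(singles)
```

The resulting list can be passed to `np.array(singles)` if an array is needed for plotting.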

How to remove a double/nested for-loop? String to float transformation in python

I have the following polygon of a geographic area that I fetch via a request in CAP/XML format from an API
The raw data looks like this:
<polygon>22.3243,113.8659 22.3333,113.8691 22.4288,113.8691 22.4316,113.8742 22.4724,113.9478 22.5101,113.9951 22.5099,113.9985 22.508,114.0017 22.5046,114.0051 22.5018,114.0085 22.5007,114.0112 22.5007,114.0125 22.502,114.0166 22.5038,114.0204 22.5066,114.0245 22.5067,114.0281 22.5057,114.0371 22.5051,114.0409 22.5041,114.0453 22.5025,114.0494 22.5023,114.0511 22.5035,114.0549 22.5047,114.0564 22.5059,114.057 22.5104,114.0576 22.512,114.0584 22.5144,114.0608 22.5163,114.0637 22.517,114.0657 22.5172,114.0683 22.5181,114.0717 22.5173,114.0739</polygon>
I store the requested items in a dictionary and then work through them to transform to a GeoJSON list object that is suitable for ingestion into Elasticsearch according to the schema I'm working with. I've removed irrelevant code here for ease of reading.
import json
import requests
import xmltodict

# fetch and store data in a dictionary
r = requests.get("https://alerts.weather.gov/cap/ny.php?x=0")
xpars = xmltodict.parse(r.text)
json_entry = json.dumps(xpars['feed']['entry'])
dict_entry = json.loads(json_entry)
# transform items if necessary
for entry in dict_entry:
    if entry['cap:polygon']:
        polygon = entry['cap:polygon']
        polygon = polygon.split(" ")
        coordinates = []
        # take the split list items, swap their positions and enclose them in their own arrays
        for p in polygon:
            p = p.split(",")
            p[0], p[1] = float(p[1]), float(p[0]) # swap lon/lat
            coordinates += [p]
        # more code adding fields to new dict object, not relevant to the question
The output of the p in polygon loop looks like:
[ [113.8659, 22.3243], [113.8691, 22.3333], [113.8691, 22.4288], [113.8742, 22.4316], [113.9478, 22.4724], [113.9951, 22.5101], [113.9985, 22.5099], [114.0017, 22.508], [114.0051, 22.5046], [114.0085, 22.5018], [114.0112, 22.5007], [114.0125, 22.5007], [114.0166, 22.502], [114.0204, 22.5038], [114.0245, 22.5066], [114.0281, 22.5067], [114.0371, 22.5057], [114.0409, 22.5051], [114.0453, 22.5041], [114.0494, 22.5025], [114.0511, 22.5023], [114.0549, 22.5035], [114.0564, 22.5047], [114.057, 22.5059], [114.0576, 22.5104], [114.0584, 22.512], [114.0608, 22.5144], [114.0637, 22.5163], [114.0657, 22.517], [114.0683, 22.5172], [114.0717, 22.5181], [114.0739, 22.5173] ]
Is there a way to do this that is better than O(N^2)? Thank you for taking the time to read.
O(KxNxM)
This process involves three obvious loops. These are:
Checking each entry (K)
Splitting valid entries into points (MxN) and iterating through those points (N)
Splitting those points into respective coordinates (M)
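The number of characters in a polygon string is ~MxN because there are N points, each roughly M characters long, so the split iterates through MxN characters.
Now that we know all of this, let's pinpoint where each occurs.
ENTRIES (K):
    IF:
        SPLIT (MxN)
        POINTS (N):
            COORDS (M)
So we can conclude that this is O(K(MxN + MxN)), which is just O(KxNxM). KxNxM is roughly the total number of characters across all polygon strings, so the code is already linear in the size of the input rather than O(N^2). Every character has to be read at least once, so there is no asymptotically better approach.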
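To make the linear cost concrete, here is a minimal sketch of the same lat/lon swap as a single function. `polygon_to_lonlat` is a hypothetical name, not something from the question, and the sample string is a shortened copy of the polygon shown there. Each character of the string is visited a constant number of times: once by the outer split, once by the inner split, and once by `float`.

```python
def polygon_to_lonlat(polygon):
    # "lat,lon lat,lon ..." -> [[lon, lat], [lon, lat], ...]
    coordinates = []
    for pair in polygon.split():
        lat, lon = pair.split(",")
        coordinates.append([float(lon), float(lat)])
    return coordinates

sample = "22.3243,113.8659 22.3333,113.8691 22.4288,113.8691"
coords = polygon_to_lonlat(sample)
print(coords)
```

The loops are nested in the source code, but they walk nested pieces of the same string, so the work grows with the length of the string and not with its square.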

How to find matches between csv files based on two columns within a range

I'm currently struggling to put together some code that will find the matches of values in two different columns in two csv files within a range. I have tried using the code below, but it doesn't output what I am trying to accomplish. Basically, I want to output a new file that contains all of the lines in the second file that have matches to the same columns in the first file, not merge them together. I've added more detailed clarification below my code. I feel like what I've done so far is probably completely wrong. What do I need to change in order for my code to produce the results I am looking for?
import csv
with open('F435W.csv') as csvF435:
    readCSV1 = csv.reader(csvF435, delimiter=',')
    with open("F550Mnew.csv", "w") as new_F550M:
        pass
    with open("F550Mnew.csv", "a") as new_F550M:
        for header in readCSV1:
            new_F550M.write(','.join(header)+'\n')
            break
        for l435 in readCSV1:
            with open('F550M.csv') as csvF550:
                readCSV2 = csv.reader(csvF550, delimiter=',')
                for l550 in readCSV2:
                    if isfloat(l435[12]) and isfloat(l550[12]) and abs(float(l435[12])-float(l550[12])) < 0.002778:
                        if isfloat(l435[13]) and isfloat(l550[13]) and abs(float(l435[13])-float(l550[13])) < 0.002778:
                            new_F550M.write(','.join(l550)+'\n')
For clarification, each file has an X column and a Y column so basically each row corresponds to an (X,Y) point. In addition, there are 21 other columns of data that are not necessary for finding matches, but need to be included in the final output file. I am trying to find points in the second file that match the points in the first file within a radius. This is because I know that none of my points will be exact matches. In my data, my X is column 13 and my Y is column 14.
The way I have tried to accomplish this is by finding the differences between every X in the first file and every X in the second file (e.g. X1-X2), and the differences between every Y in the first file and every Y in the second file (e.g. Y1-Y2). Then, every row in the second file which corresponds to differences for both X and Y which are less than my radius value (0.0002778) would be considered a match to the first file.
Unfortunately, my code produces a file with over 300,000 points when my original files only have 7000 points. There should be fewer rows, not more. It also includes many repeats of data, when there should not be any repeats at all.
Thank you for your time!
Sample of what the data looks like: I apologize for the length, but I am afraid they will not contain enough matches to be useful if I don't include enough of the data.
F435W.csv (file 1)
1,2017.013,0.01242859,-8.2618,0,51434.12,0.3269918,-11.7781,0,0.01957931,1387.9406,541.916,49.9898514,41.5266996,8.81E+01,1.63E+03,1.44E+02,40.535,8.65,84.72,0.00061,0.00035,62.14
2,84.73392,0.01245409,-4.8201,0.0002,112.9723,0.04012135,-5.1324,0.0004,-0.002142646,150.306,146.7986,49.9942613,41.5444392,4.92E+00,5.60E+00,-2.02E-01,2.379,2.206,-74.69,0.00339,0.0029,88.88
3,215.1939,0.01242859,-5.8321,0.0001,262.2751,0.03840466,-6.0469,0.0002,-0.002961465,3248.686,52.8478,50.003155,41.5019044,4.77E+00,5.05E+00,-1.63E-01,2.263,2.166,-65.29,0.002,0.0019,-66.78
4,0.3796681,0.01240305,1.0515,0.0355,0.5823653,0.05487975,0.587,0.1023,-0.00425157,3760.344,11.113,50.0051049,41.4949256,1.93E+00,1.02E+00,-7.42E-02,1.393,1.007,-4.61,0.05461,0.03818,-6.68
5,0.9584663,0.01249223,0.0461,0.0142,1.043696,0.0175857,-0.0464,0.0183,-0.004156116,4013.2063,9.1225,50.0057256,41.4914444,1.12E+00,9.75E-01,1.09E-01,1.085,0.957,28.34,0.01934,0.01745,44.01
6,2.379565,0.01249223,-0.9412,0.0057,0.231205,0.02710035,1.59,0.1273,-0.004135321,3824.3706,9.0756,50.0052903,41.4940468,7.81E-01,6.99E-02,4.27E-02,0.885,0.26,3.42,0.01265,0.00622,15.52
7,0.3171223,0.01250492,1.2469,0.0428,0.5233852,0.05406558,0.7029,0.1122,-0.00399635,4097.3604,7.0301,50.0059585,41.4902884,9.61E-01,1.63E+00,-3.94E-01,1.346,0.883,-65.16,0.06171,0.04005,-65.05
8,0.289245,0.0125176,1.3468,0.047,0.2744479,0.02238134,1.4039,0.0886,-0.004173243,3904.7402,7.3912,50.0055069,41.4929422,7.90E-01,2.38E-01,7.13E-02,0.894,0.479,7.24,0.04501,0.02071,8.29
9,0.3543034,0.01247953,1.1266,0.0383,0.7666836,0.06376094,0.2885,0.0903,-0.004009248,4107.0684,3.259,50.0060503,41.4901611,3.53E+00,1.28E+00,-4.60E-01,1.903,1.09,-11.12,0.06873,0.03955,-11.22
10,1.308331,0.01250492,-0.2918,0.0104,-0.005209296,0.004877397,99,99,-0.004193406,3933.9834,6,50.0056001,41.4925416,5.78E-01,8.33E-02,0.00E+00,0.76,0.289,0,0.01272,0.00424,0
11,3.995717,0.01250492,-1.504,0.0034,0.1589517,0.007450347,1.9968,0.0509,-0.003990021,4069.0469,3.0234,50.0059668,41.4906855,8.03E-01,2.29E-02,1.02E-02,0.896,0.151,0.75,0.00888,0.00361,5.59
12,1.067634,0.01250492,-0.0711,0.0127,0.1260926,0.02787585,2.2483,0.2401,-0.004042602,4048.9148,4,50.0059023,41.4909612,7.40E-01,8.33E-02,0.00E+00,0.86,0.289,0,0.02449,0.00576,0
13,0.2808423,0.01162418,1.3788,0.0449,0.4633991,0.02235104,0.8351,0.0524,-0.004015559,4114.6655,2.0641,50.0060898,41.4900585,9.65E-01,5.88E-01,-9.47E-02,0.994,0.752,-13.34,0.05405,0.03814,-15.13
14,1.067291,0.01245409,-0.0707,0.0127,1.081617,0.01516444,-0.0852,0.0152,-0.004168633,3960.8787,18.0524,50.0054405,41.4921501,6.84E-01,8.29E-01,-6.18E-02,0.923,0.813,-69.77,0.01468,0.01229,-78.83
15,0.5216251,0.0125176,0.7066,0.0261,0.584776,0.01824955,0.5825,0.0339,-0.003026338,2661.6533,58.4563,50.0016952,41.5099844,8.51E-01,1.17E+00,-7.27E-02,1.089,0.914,-77.72,0.03244,0.02498,-81.68
16,0.6062042,0.01249223,0.5435,0.0224,0.8726375,0.05509822,0.1479,0.0686,-0.003950399,4149.8169,31.0127,50.0056384,41.489524,9.30E-01,3.48E+00,2.03E-01,1.87,0.956,85.48,0.05307,0.0241,86.01
17,0.1324067,0.01242859,2.1952,0.1019,0.1208224,0.01290438,2.2946,0.116,-0.004166729,3911.6807,12.661,50.005426,41.4928374,2.17E-01,2.24E-01,-1.08E-01,0.574,0.335,-45.89,0.0721,0.04162,-44.98
18,0.2136006,0.01247953,1.676,0.0634,0.3511444,0.02471001,1.1363,0.0764,-0.003978713,4096.9111,15.6285,50.0057993,41.4902797,1.00E+00,4.37E-01,2.85E-01,1.058,0.564,22.64,0.07548,0.03957,23.17
19,0.1470979,0.01244135,2.081,0.0919,0.1216703,0.0168958,2.287,0.1508,-0.004147241,3695.311,13.7044,50.004907,41.4958173,2.14E-01,2.08E-01,9.20E-02,0.551,0.345,44.05,0.07073,0.04115,45.12
20,0.5434682,0.01250492,0.6621,0.025,0.5819249,0.01592951,0.5878,0.0297,-0.004136056,3866.6416,24.8316,50.0050981,41.493437,8.34E-01,9.96E-01,2.74E-01,1.096,0.793,53.22,0.02966,0.02055,58.08
21,0.2259093,0.01249223,1.6152,0.0601,0.2848583,0.01867901,1.3634,0.0712,-0.00409535,3645.521,20.0162,50.0046759,41.4964926,5.71E-01,4.26E-01,-1.11E-02,0.756,0.652,-4.34,0.03735,0.0305,0.08
22,0.9499883,0.01247953,0.0557,0.0143,0.9711754,0.01891141,0.0318,0.0211,-0.003134006,3378.7927,19.5305,50.0040686,41.5001691,8.66E-01,4.09E-01,3.57E-03,0.931,0.639,0.45,0.01623,0.01142,-1.19
23,1.125635,0.01240305,-0.1285,0.012,1.050538,0.02402694,-0.0535,0.0248,-0.003295973,3132.9458,24.9024,50.0034018,41.5035477,9.65E-01,7.83E-01,-1.44E-01,1.022,0.839,-28.88,0.01702,0.01288,-21
24,0.168302,0.01249223,1.9348,0.0806,0.2447732,0.01930529,1.5281,0.0857,-0.004140488,3904.7268,27.0386,50.0051454,41.4929084,4.47E-01,4.56E-01,-1.28E-02,0.682,0.662,-54.61,0.04399,0.04068,89.66
25,0.0542859,0.01244135,3.1633,0.2489,0.08799078,0.007964755,2.6389,0.0983,-0.003241792,3454.2612,25.2749,50.0041373,41.4991191,1.93E-01,1.99E-01,-7.18E-02,0.518,0.353,-46.27,0.06408,0.03839,-44.76
26,0.4379335,0.01242859,0.8965,0.0308,0.4661828,0.01542368,0.8286,0.0359,-0.00336337,3478.7058,32.3355,50.0040639,41.4987701,6.15E-01,8.96E-01,-2.91E-02,0.948,0.782,-84.15,0.02891,0.02521,-70.04
27,0.1515608,0.01249223,2.0485,0.0895,0.1935181,0.01712885,1.7832,0.0961,-0.002904789,2982.0017,29.9904,50.0029594,41.505619,3.46E-01,3.61E-01,1.55E-05,0.601,0.588,89.94,0.05241,0.05241,-80.48
28,0.6658883,0.01250492,0.4415,0.0204,0.718064,0.01780974,0.3596,0.0269,-0.00324104,3408.0103,36.2539,50.0038284,41.4997375,9.45E-01,1.11E+00,1.98E-01,1.115,0.902,56.45,0.02706,0.02147,51.52
29,0.7244126,0.01244135,0.35,0.0187,1.030102,0.02744665,-0.0322,0.0289,-0.00280412,3259.0889,37.3165,50.0034648,41.5017879,8.65E-01,1.01E+00,5.85E-02,1.017,0.919,70.87,0.02225,0.02011,55.79
30,0.1651701,0.01247953,1.9552,0.0821,0.163293,0.01641976,1.9676,0.1092,-0.003909466,3595.4846,31.9761,50.0043403,41.4971614,2.50E-01,4.42E-01,2.21E-01,0.766,0.324,56.75,0.08087,0.03087,58.28
F550M.csv (file 2)
2,1921.566,0.01258874,-8.2091,0,37128.06,0.2618096,-11.4243,0,0.01455503,4617.5225,554.576,49.9887896,41.5264699,6.09E+01,8.09E+02,1.78E+01,28.459,7.779,88.63,0.00054,0.00036,77.04
3,1.055918,0.01256313,-0.0591,0.0129,9.834856,0.1109255,-2.4819,0.0122,-0.002955142,3936.4946,85.3255,49.9949149,41.5370016,3.98E+01,1.23E+01,1.54E+01,6.83,2.336,24.13,0.06362,0.01965,23.98
4,151.2355,0.01260153,-5.4491,0.0001,184.0693,0.03634057,-5.6625,0.0002,-0.002626019,3409.2642,76.9891,49.9931935,41.5442109,4.02E+00,4.35E+00,-1.47E-03,2.086,2.005,-89.75,0.00227,0.00198,66.61
5,0.3506025,0.01258874,1.138,0.039,0.3466277,0.01300407,1.1503,0.0407,-0.002441164,3351.9893,8.9147,49.9942299,41.5451727,4.97E-01,5.07E-01,7.21E-03,0.715,0.702,62.75,0.02,0.01989,82.88
6,1.166133,0.01257594,-0.1669,0.0117,0.005819145,0.009692424,5.5879,1.8089,-0.003201006,3476.9932,10,49.9946543,41.5434658,5.88E-01,8.33E-02,0.00E+00,0.767,0.289,0,0.01497,0.00499,0
7,0.1372164,0.0125503,2.1565,0.0993,0.1238123,0.02608246,2.2681,0.2288,-0.003556473,3535.5281,13.4586,49.9947993,41.5426587,2.49E-01,2.48E-01,-7.69E-03,0.506,0.491,-43.27,0.05264,0.05237,-55.87
8,0.6174777,0.01260153,0.5234,0.0222,0.6206718,0.01300407,0.5178,0.0228,-0.002441164,3357.0044,20.0487,49.9940449,41.5450748,5.10E-01,5.22E-01,-6.28E-03,0.724,0.712,-66.7,0.01194,0.01192,84.82
9,1.46848,0.01260153,-0.4172,0.0093,0.001897994,0.009688255,6.8043,5.5435,-0.003612399,3584.0171,16,49.9949252,41.5419909,5.87E-01,8.33E-02,0.00E+00,0.766,0.289,0,0.01175,0.00392,0
10,1.452348,0.01258874,-0.4052,0.0094,3.124427,0.04807406,-1.2369,0.0167,-0.003148756,3805.6069,39.5791,49.9952831,41.5389075,2.25E+00,3.87E+00,-6.77E-01,2.03,1.416,-70.08,0.0302,0.01891,-67.61
11,0.1548658,0.01260153,2.0251,0.0884,0.1777253,0.01630147,1.8756,0.0996,-0.002919044,3459.7681,25.6248,49.9943085,41.5436591,4.64E-01,2.34E-01,8.40E-02,0.701,0.455,18.09,0.05739,0.03321,18.33
12,0.5046132,0.01253746,0.7426,0.027,0.7798272,0.04462456,0.27,0.0621,-0.00261193,3418.9119,65.5326,49.9934365,41.5441099,6.87E-01,2.77E+00,-2.92E-01,1.678,0.804,-82.19,0.05363,0.02182,-83.28
13,0.380733,0.01260153,1.0484,0.0359,0.4313257,0.01605258,0.913,0.0404,-0.003497544,3548.8484,34.5602,49.9944623,41.542421,8.27E-01,8.51E-01,8.92E-02,0.964,0.865,48.75,0.03776,0.03252,30.61
14,0.1643925,0.01258874,1.9603,0.0832,0.2181225,0.01839054,1.6532,0.0916,-0.003121084,3710.6785,33.3215,49.9950598,41.5402182,2.18E-01,2.18E-01,1.03E-01,0.567,0.339,45,0.0757,0.04376,45
15,0.3959635,0.01260153,1.0059,0.0346,0.9984215,0.0763398,0.0017,0.083,-0.003106286,3805.9988,48.3363,49.995125,41.5388789,1.87E+00,3.12E+00,4.86E-01,1.813,1.304,71.09,0.0559,0.04105,67.61
16,0.1625628,0.01260153,1.9724,0.0842,0.3490304,0.02234424,1.1428,0.0695,-0.002472953,3410.77,38.0388,49.9939083,41.544294,1.77E-01,4.75E-01,8.92E-03,0.689,0.421,88.29,0.0769,0.04707,89.86
17,0.1725209,0.01260153,1.9079,0.0793,0.2965718,0.02357189,1.3197,0.0863,-0.003454017,3629.0247,40.9706,49.9946304,41.541311,3.73E-01,7.91E-01,-3.73E-01,1.004,0.393,-59.65,0.09781,0.03734,-58.27
18,0.3034717,0.01260153,1.2947,0.0451,0.5031242,0.02774418,0.7458,0.0599,-0.003073985,4079.0825,42,49.9962105,41.5351731,6.68E-01,8.33E-02,0.00E+00,0.818,0.289,0,0.06348,0.02106,0
19,1.593927,0.01260153,-0.5062,0.0086,1.860803,0.0219809,-0.6743,0.0128,-0.003038161,4065.9434,58.3703,49.9958657,41.5353087,1.75E+00,1.41E+00,-7.15E-03,1.323,1.188,-1.21,0.01697,0.01464,-0.43
20,0.5464995,0.01258874,0.656,0.025,0.5661472,0.0144696,0.6177,0.0278,-0.003053429,4045.0474,54.439,49.9958631,41.535604,5.43E-01,8.46E-01,-1.22E-03,0.92,0.737,-89.77,0.02257,0.01649,-89.72
21,1.303251,0.01253746,-0.2876,0.0104,1.296672,0.01418861,-0.2821,0.0119,-0.00259741,4240.1406,55.2714,49.9965409,41.5329423,6.05E-01,6.81E-01,7.89E-03,0.826,0.777,84.15,0.00892,0.00852,69.62
22,0.5174786,0.01260153,0.7153,0.0264,0.5260691,0.01390194,0.6974,0.0287,-0.003019847,3828.95,55.19,49.9950817,41.5385478,5.18E-01,7.56E-01,-6.34E-02,0.879,0.709,-75.96,0.0236,0.01643,-75.02
23,0.1551826,0.01260153,2.0229,0.0882,0.166565,0.01726119,1.946,0.1125,-0.003271136,3504.7439,52.7386,49.9939745,41.5429739,1.91E-01,6.86E-01,1.89E-01,0.866,0.356,71.33,0.10376,0.04235,71.56
24,0.2214222,0.01260153,1.6369,0.0618,0.2389908,0.01360924,1.554,0.0618,-0.00285033,3750.3167,54.0027,49.994824,41.5396229,4.32E-01,5.51E-01,1.68E-03,0.742,0.657,89.18,0.04862,0.04505,89.94
25,0.1336059,0.01253746,2.1854,0.1019,0.1320868,0.009830156,2.1979,0.0808,-0.002921393,3459.6851,51.7091,49.9938331,41.5435908,2.16E-01,2.06E-01,-9.16E-02,0.55,0.345,-43.52,0.06231,0.03626,-45.19
26,0.1703959,0.01260153,1.9214,0.0803,0.1577456,0.0152816,2.0051,0.1052,-0.002779523,3446.95,49,49.9938372,41.5437717,7.29E-01,8.33E-02,0.00E+00,0.854,0.289,0,0.11183,0.03721,0
27,1.896325,0.01258874,-0.6948,0.0072,1.941203,0.0152816,-0.7202,0.0085,-0.00306097,3809.6836,57.8143,49.9949655,41.5388035,7.38E-01,6.80E-01,7.46E-03,0.86,0.824,7.18,0.00713,0.00678,59.71
28,0.6522877,0.01260153,0.4639,0.021,0.1713469,0.01312423,1.9153,0.0832,-0.002447558,4271.9614,52,49.9967135,41.5325172,5.92E-01,8.33E-02,0.00E+00,0.77,0.289,0,0.0274,0.00913,0
29,0.1370073,0.0125503,2.1581,0.0995,0.101415,0.02614047,2.4847,0.2799,-0.002207851,4324.667,55.3374,49.99684,41.5317898,2.22E-01,2.24E-01,1.12E-01,0.579,0.332,45.18,0.07753,0.04476,45
30,0.2240251,0.01253746,1.6243,0.0608,0.2254432,0.01360924,1.6174,0.0656,-0.003037372,3960.3042,58.9024,49.9954807,41.5367473,4.18E-01,4.81E-01,-1.07E-02,0.695,0.645,-80.65,0.03802,0.03492,-88.86
You are complicating the program by nesting all the loops and conditionals. Break it down into simple steps.
Do the following.
1. Read both the csv files and convert them into 2d lists.
2. Compare the columns/values of the lists within a loop based on the given index, add the rows from second list to a new output list.
3. Write the output list to a csv file.
import csv

def isfloat(value):
    # Helper used in the question's code but not shown there:
    # True if the text parses as a float.
    try:
        float(value)
        return True
    except ValueError:
        return False

def read_file(filepath):
    with open(filepath, 'r', newline='') as f:
        return list(csv.reader(f))

l435 = read_file('F435W.csv')
l550 = read_file('F550M.csv')
new_F550M = []
r = 0.002778
for i in l550:
    for j in l435:
        # I didn't exactly get your if condition, so I am putting it down based on what I understood, so if it is wrong, modify it accordingly.
        if isfloat(i[12]) and isfloat(j[12]) and abs(float(i[12]) - float(j[12])) < r:
            if isfloat(i[13]) and isfloat(j[13]) and abs(float(i[13]) - float(j[13])) < r:
                new_F550M.append(i)
                break  # stop at the first match so each row is written only once
with open('new_F550M.csv', 'w', newline='') as f:
    out = csv.writer(f)
    out.writerows(new_F550M)
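In the question's code, a row of the second file is written once for every first-file row it matches, which is how 7000 rows can grow to over 300,000. The sketch below keeps each second-file row at most once and tests a true radius instead of a square box. It is only an illustration under assumptions: `match_within_radius` and `is_float` are hypothetical names, the distance is plain Euclidean distance in the units of the X and Y columns, and the rows are built in memory instead of being read from the CSV files.

```python
import math

def is_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def match_within_radius(rows_a, rows_b, radius, x_col=12, y_col=13):
    # Points from the first file; rows without numeric X/Y (e.g. a header) are skipped.
    points_a = [
        (float(row[x_col]), float(row[y_col]))
        for row in rows_a
        if is_float(row[x_col]) and is_float(row[y_col])
    ]
    matches = []
    for row in rows_b:
        if not (is_float(row[x_col]) and is_float(row[y_col])):
            continue
        x, y = float(row[x_col]), float(row[y_col])
        # any() stops at the first hit, so each row is kept at most once.
        if any(math.hypot(x - ax, y - ay) < radius for ax, ay in points_a):
            matches.append(row)
    return matches

def make_row(x, y):
    # Illustrative 14-column row with X in column 13 and Y in column 14.
    return ["0"] * 12 + [str(x), str(y)]

rows_a = [make_row("X", "Y"), make_row(50.0, 41.5), make_row(50.0001, 41.5001)]
rows_b = [make_row(50.0002, 41.5002), make_row(51.0, 42.0)]
out = match_within_radius(rows_a, rows_b, 0.002778)
print(len(out))
```

With real data, `rows_a` and `rows_b` would come from `csv.reader`, and `out` would be written with `csv.writer(...).writerows(out)`.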

Read structured data with start/end tags

I have a data file portions of which look like
START
vertex 266.36 234.594 14.6145
vertex 268.582 234.968 15.6956
vertex 267.689 232.646 15.7283
END
START
vertex 166.36 23.594 4.6145
vertex 8.582 23.968 5.6956
vertex 67.689 32.646 1.7283
END
# [...]
i.e., blocks of three "vertices". I would now like to read the data as quickly as possible. So far, I'm going through the lines one by one,
data = numpy.empty((n, 3))
flt = numpy.vectorize(float)
for k in range(n):
    parts = f.readline().decode('utf-8').split()
    assert len(parts) == 4
    assert parts[0] == 'vertex'
    data[k] = flt(parts[1:])
but that is pretty slow.
Any hints?
Assuming you have just consumed the START line you could try something like
>>> i = iter(file.__next__, 'END\n')
>>> np.loadtxt(i, usecols=(1,2,3))
array([[266.36 , 234.594 , 14.6145],
[268.582 , 234.968 , 15.6956],
[267.689 , 232.646 , 15.7283]])
I'm assuming that loadtxt is reasonably fast, but I don't know what the overhead of iter is.
Firstly, why the need to decode from utf-8? The data you show implies that might not be needed.
The second thought that comes to mind is the array slicing on the last line. Since you have already checked that there are exactly 4 items, of which you plan to skip the first, depending on how numpy works, might it be an option to say:
data[k] = (float(parts[1]), float(parts[2]), float(parts[3]))
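The `iter(callable, sentinel)` idea from the first answer can be applied to every block in the file. The following is a minimal sketch using only the standard library, assuming the file is opened in text mode, lines end in "\n", and every block is closed by an END line. `read_blocks` is a hypothetical name and the in-memory sample stands in for the real file. Whether this beats `np.loadtxt` on a large file would need to be measured.

```python
import io

def read_blocks(f):
    blocks = []
    for line in f:
        if line.strip() != "START":
            continue  # ignore anything outside a START/END block
        # iter(callable, sentinel) keeps calling f.__next__ until it
        # returns the sentinel line, i.e. the END that closes this block.
        block = [
            [float(value) for value in row.split()[1:4]]
            for row in iter(f.__next__, "END\n")
        ]
        blocks.append(block)
    return blocks

sample = io.StringIO(
    "START\n"
    "vertex 266.36 234.594 14.6145\n"
    "vertex 268.582 234.968 15.6956\n"
    "vertex 267.689 232.646 15.7283\n"
    "END\n"
    "START\n"
    "vertex 166.36 23.594 4.6145\n"
    "vertex 8.582 23.968 5.6956\n"
    "vertex 67.689 32.646 1.7283\n"
    "END\n"
)
blocks = read_blocks(sample)
print(len(blocks), blocks[0][0])
```

The nested list has shape (number of blocks, 3, 3) and can be converted in one step with `numpy.array(blocks)`.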

Read several lists from the text file properly python

I have a text file that has 541 lists, and each list has 280 numbers, such as below:
[301.82779832839964, 301.84247725804647, 301.85718673070272, ..., 324.4056396484375, 324.20379638671875, 324.00198364257812]
.
.
[310.6907599572782, 310.68334604280966, 310.67756809346469,..., 324.23541883368551, 324.18277040240207, 324.09177971086382]
To read this file I used numpy.genfromtxt, with a short test that reads and prints the first list:
pt1 = np.genfromtxt(filn1,dtype=np.float64,delimiter=",")
print(pt1[0].shape)
print(list(pt1[0]))
I expected to see the full first list, but the result had 'nan' in the first and last positions:
[nan, 301.84247725804647, 301.85718673070272, ..., 324.4056396484375, 324.20379638671875, nan]
I have tried other options in numpy.genfromtxt, but I couldn't find out why it produces 'nan' in the first and last positions. This happens not only for the first list but for all of them.
Any idea or help would be really appreciated.
Thank you,
Isaac
import numpy as np
from ast import literal_eval
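# Parse each non-empty line as a Python list literal. A list comprehension
# (rather than a bare map) also works on Python 3, where map is lazy.
with open("in.txt") as f:
    pt1 = np.array([literal_eval(line) for line in f if line.strip()])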
For:
[301.82779832839964, 301.84247725804647, 301.85718673070272, 324.4056396484375, 324.20379638671875, 324.00198364257812]
[310.6907599572782, 310.68334604280966, 310.67756809346469, 324.23541883368551, 324.18277040240207, 324.09177971086382]
You will get:
[[ 301.82779833 301.84247726 301.85718673 324.40563965 324.20379639
324.00198364]
[ 310.69075996 310.68334604 310.67756809 324.23541883 324.1827704
324.09177971]]
It looks like the problem is caused by the square brackets in your text file; the simplest solution would be to remove these characters from your file, either with find-and-replace in a text editor or, if your file is too large, with a command-line tool like sed.
genfromtxt splits each line on the commas, so the first and last fields of every line still carry a [ or ], fail to parse as floats, and come back as 'nan'. As a last resort you could do something like this:
from ast import literal_eval
import numpy as np

data = []
with open('filn') as f:
    for line in f:
        if line.strip():
            # literal_eval only accepts Python literals, unlike eval
            data.append(literal_eval(line.strip()))
data = np.asarray(data)
Alternatively, you can remove the [ and ] from the whole file and then use np.genfromtxt(filn1,dtype=np.float64,delimiter=",") as before, without getting any nan elements.
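A bracketed, comma-separated list of plain numbers is also valid JSON, so each line can be parsed without editing the file. This is a minimal sketch under the assumption that every non-empty line is one complete list; `read_bracketed_lists` is a hypothetical helper, and the sample holds shortened copies of the lines from the question.

```python
import io
import json

def read_bracketed_lists(f):
    rows = []
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            rows.append(json.loads(line))
    return rows

# Stand-in for open(filn1); values copied from the question.
sample = io.StringIO(
    "[301.82779832839964, 301.84247725804647, 324.00198364257812]\n"
    "[310.6907599572782, 310.68334604280966, 324.09177971086382]\n"
)
rows = read_bracketed_lists(sample)
print(len(rows), rows[0][0])
```

Passing `rows` to `np.array` then gives a 2-D float array with one row per list.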

Convert array of dicts to JSON file to graph with flot

I am taking readings from 4 sensors. I get an array like this:
[{"value":0.162512,"number":0,"channel":0},
{"value":0.027835,"number":1,"channel":1},
{"value":0.08361,"number":2,"channel":2},
{"value":0.295788,"number":3,"channel":3},
{"value":0.137746,"number":4,"channel":0},
{"value":0.009403,"number":5,"channel":1},
{"value":0.089616,"number":6,"channel":2},
{"value":0.310242,"number":7,"channel":3},
{"value":0.109047,"number":8,"channel":0},
...
{"value":0.085652,"number":28,"channel":0},
{"value":0.01359,"number":29,"channel":1},
{"value":0.105441,"number":30,"channel":2},
{"value":0.32407,"number":31,"channel":3}]
I need to format and convert it into a JSON object, I guess from reading through here. I then will use flot to draw a graph. That is the goal.
I want a line graph, showing each reading off of the four sensors. I will be using this in Python eventually if that helps the direction I am going.
I have no clue what I am doing, so any direction would be appreciated.
Having no clue is not a good starting point ... See the flot documentation and examples to get started.
What you have there is one array of objects. (Already as JSON from the looks of it. If that is still on the python side, put it as a string in your javascript and call JSON.parse() on it, it is already valid JSON.)
What you need is an array of arrays (dataseries) of arrays (datapoints). Something like
[
    [ // dataseries for channel 0
        [0, 0.162512],
        [4, 0.137746],
        ...
    ],
    [ // dataseries for channel 1
        [1, 0.027835],
        [5, 0.009403],
        ...
    ],
    ...
]
To convert you can loop over your original array and put the datapoints in the right dataseries with something like this:
var dataAsArrays = [
    [], [], [], [] // one empty array for each dataseries / channel
];
$.each(dataAsObjects, function (index, item) {
    dataAsArrays[item.channel].push([item.number, item.value]);
});
See this fiddle for a working example of the above code.
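Since the question mentions using Python eventually, the same grouping can be done on the Python side before the data reaches the page. This is a minimal sketch, assuming four channels numbered 0 to 3 as in the sample readings; `to_series` is a hypothetical name.

```python
import json

# First few readings copied from the question.
readings = [
    {"value": 0.162512, "number": 0, "channel": 0},
    {"value": 0.027835, "number": 1, "channel": 1},
    {"value": 0.08361, "number": 2, "channel": 2},
    {"value": 0.295788, "number": 3, "channel": 3},
    {"value": 0.137746, "number": 4, "channel": 0},
    {"value": 0.009403, "number": 5, "channel": 1},
]

def to_series(readings, channels=4):
    # One list of [number, value] datapoints per channel.
    series = [[] for _ in range(channels)]
    for item in readings:
        series[item["channel"]].append([item["number"], item["value"]])
    return series

series = to_series(readings)
payload = json.dumps(series)  # JSON text that the page can parse
print(payload)
```

The JSON string in `payload` has the array-of-arrays-of-arrays shape described above, so after `JSON.parse()` it can be used as the plot data.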
