############points (108 ea)##################
[[362. 437. 0.]
[418. 124. 0.]
[452. 64. 0.]
...
[256. 512. 0.]
[ 0. 256. 0.]
[512. 256. 0.]]
##########triangles (205 ea)#################
[[ 86 106 100]
[104 95 100]
[ 41 104 101]
...
[ 0 84 36]
[ 84 6 36]
[ 6 84 0]]
################triangle_colours (205 ea)##############
[[0.69140625 0.2734375 0.3203125 1. ]
[0.8046875 0.37109375 0.36328125 1. ]
[0.83203125 0.48046875 0.40234375 1. ]
...
[0.46875 0.13671875 0.26171875 1. ]
[0.49609375 0.1796875 0.28515625 1. ]
[0.91796875 0.796875 0.71484375 1. ]]
Code:
import meshio

cells = [
    ("triangle", triangles)
]
mesh = meshio.Mesh(
    points,
    cells,
    cell_data={"a": triangle_colours},
)
mesh.write(
    "foo.vtk",
)
The above code gives
ValueError: Incompatible cell data. 1 cell blocks, but 'a' has 205 blocks.
I just want to add colors to the triangles. The triangle_colours array has the same size as triangles, as per the example here: https://github.com/nschloe/meshio (both have 205 elements). How can I correct this error?
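cell_data corresponds to cells, so it needs to have the same "blocked" structure: each entry is a list with one array per cell block. With a single "triangle" block, wrap the colour array in a list: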
import meshio

cells = [("triangle", triangles)]
mesh = meshio.Mesh(
    points,
    cells,
    cell_data={"a": [triangle_colours]},
)
mesh.write("foo.vtk")
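To make the "blocked" structure concrete, here is a minimal sketch of the consistency rule behind the error message. The helper check_cell_data is hypothetical and is not part of meshio; the arrays are placeholders with the shapes from the question.

```python
import numpy as np


def check_cell_data(cells, cell_data):
    """Hypothetical helper (NOT a meshio function): verify that every
    cell_data entry has one array per cell block, and that each array has
    one row per cell in its block."""
    for name, blocks in cell_data.items():
        if len(blocks) != len(cells):
            raise ValueError(
                f"{len(cells)} cell blocks, but '{name}' has {len(blocks)} blocks"
            )
        for (cell_type, connectivity), values in zip(cells, blocks):
            if len(values) != len(connectivity):
                raise ValueError(
                    f"'{name}' has {len(values)} values "
                    f"for {len(connectivity)} {cell_type} cells"
                )
    return True


# Placeholder arrays with the shapes from the question (values are dummies).
triangles = np.zeros((205, 3), dtype=int)
triangle_colours = np.zeros((205, 4))
cells = [("triangle", triangles)]

# One block of cells, one block of data: consistent.
ok = check_cell_data(cells, {"a": [triangle_colours]})

# A bare array looks like 205 separate blocks: inconsistent.
try:
    check_cell_data(cells, {"a": triangle_colours})
    bare_array_ok = True
except ValueError:
    bare_array_ok = False
```

With several cell blocks (for example triangles and quads), the list under each cell_data key would hold one array per block, in the same order as cells.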
I have a sample DataFrame as below:
The first column contains 2 years; for each year there are 2 tracks, and each track consists of pairs of longitude and latitude coordinates. How can I extract every track for each year separately, to obtain an array of tracks with lat and long?
import pandas as pd

df = pd.DataFrame(
    {'year': [0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
     'track_number': [0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1],
     'lat': [11.7,11.8,11.9,11.9,12.0,12.1,12.2,12.2,12.3,12.3,12.4,12.5,12.6,12.6,12.7,12.8],
     'long': [-83.68,-83.69,-83.70,-83.71,-83.71,-83.73,-83.74,-83.75,-83.76,-83.77,-83.78,-83.79,-83.80,-83.81,-83.82,-83.83]})
You can groupby year and then extract a numpy.array from the created dataframes with .to_numpy().
>>> years = []
>>> for _, df2 in df.groupby(["year"]):
...     years.append(df2.to_numpy()[:, 1:])
...
>>> years[0]
array([[ 0. , 11.7 , -83.68],
[ 0. , 11.8 , -83.69],
[ 0. , 11.9 , -83.7 ],
[ 0. , 11.9 , -83.71],
[ 1. , 12. , -83.71],
[ 1. , 12.1 , -83.73],
[ 1. , 12.2 , -83.74],
[ 1. , 12.2 , -83.75]])
>>> years[1]
array([[ 0. , 12.3 , -83.76],
[ 0. , 12.3 , -83.77],
[ 0. , 12.4 , -83.78],
[ 0. , 12.5 , -83.79],
[ 1. , 12.6 , -83.8 ],
[ 1. , 12.6 , -83.81],
[ 1. , 12.7 , -83.82],
[ 1. , 12.8 , -83.83]])
Here years[0] holds the desired information for year 0, and so on. Inside each array, the column order of the original dataframe is preserved (minus the year column): the first column is the track, the second the latitude, and the third the longitude.
If you wish to do the same per track, i.e., have an array of only latitude and longitude for each track, you can groupby(["year", "track_number"]) as well.
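The per-track variant could look like the following sketch. The dictionary name tracks and the choice of keying by (year, track_number) are assumptions made for illustration, not something the question prescribes.

```python
import pandas as pd

df = pd.DataFrame(
    {'year': [0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1],
     'track_number': [0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1],
     'lat': [11.7,11.8,11.9,11.9,12.0,12.1,12.2,12.2,12.3,12.3,12.4,12.5,12.6,12.6,12.7,12.8],
     'long': [-83.68,-83.69,-83.70,-83.71,-83.71,-83.73,-83.74,-83.75,-83.76,-83.77,-83.78,-83.79,-83.80,-83.81,-83.82,-83.83]})

# Group by both keys; each group is one track of one year.
# Only the lat/long columns are kept, so every value is an (n_points, 2) array.
tracks = {
    key: group[["lat", "long"]].to_numpy()
    for key, group in df.groupby(["year", "track_number"])
}

# Look up year 0, track 1.
track = tracks[(0, 1)]
```

Keying by the (year, track_number) tuple keeps the lookup explicit; a nested list indexed as tracks[year][track] would work just as well if the keys are contiguous integers.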
I have a dataset in a numpy array in the format below. Each "column" is a separate criterion. I want to display a heatmap where each "column" is coloured according to the score range within that column:
[[ 226 600 3.33 915. 92.6 98.6 ]
[ 217 700 3.34 640. 93.7 98.5 ]
[ 213 900 3.35 662. 88.8 96. ]
...
[ 108 600 2.31 291. 64. 70.4 ]
[ 125 800 3.36 1094. 65.5 84.1 ]
[ 109 400 2.44 941. 52.3 68.7 ]]
I have written a function to generate a heatmap:
import matplotlib.pyplot as plt

def HeatMap(data):
    # generate heatmap figure
    figure = plt.figure()
    sub_figure = figure.add_subplot(111)
    heatmap = sub_figure.imshow(data, interpolation='nearest', cmap='jet', aspect=0.05)
    # generate color bar
    cbar = figure.colorbar(ax=sub_figure, mappable=heatmap, orientation='horizontal')
    cbar.set_label('Scores')
    plt.show()
This is what the function generates:
As can be seen above, the problem lies somewhere in my function: the Scores range from 0 to the maximum value in the whole dataset, 2500. How can I amend my function so that the heatmap displays the scores in each column according to that column's range rather than the range of the whole dataset? My first thought is to change the array dimensions to something like [[226],[600]] etc., but I am not sure that is the solution.
Thanks for your help.
A single imshow call applies one colormap and one colour scale to the whole array, so you cannot have a separate cmap range for each column.
If you want to see the variation in each column as per their own range, you can normalize the data by column before plotting the heatmap.
Code
import numpy as np
x = np.array([[1000, 10, 0.5],
[ 765, 5, 0.35],
[ 800, 7, 0.09]])
x_normed = x / x.max(axis=0)
print(x_normed)
# [[ 1. 1. 1. ]
# [ 0.765 0.5 0.7 ]
# [ 0.8 0.7 0.18 ]]
# Plot the heatmap for x_normed.
This will preserve the variation in each column.
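Dividing by the column maximum maps each column into (0, 1] but does not stretch it over the full colour range. If that matters, a min-max rescaling per column is one alternative. The sketch below is a swapped-in variant, not the method from the answer above, and the function name normalize_columns is made up for illustration.

```python
import numpy as np


def normalize_columns(data):
    """Rescale every column to [0, 1] using that column's own min and max."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # A constant column has zero range; use 1 so it maps to 0 instead of NaN.
    col_range[col_range == 0] = 1.0
    return (data - col_min) / col_range


x = np.array([[1000, 10, 0.5],
              [ 765,  5, 0.35],
              [ 800,  7, 0.09]])
x_scaled = normalize_columns(x)

# x_scaled can now be passed to the HeatMap function from the question;
# every column then spans the whole colormap.
```

Note that after either normalization the colour bar shows relative scores, not the original values.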
I have an extremely basic problem with the numpy.genfromtxt function. I'm using the Enthought Canopy package: where shall I save the file.txt I want to use, or how shall I tell Python where to look for it? When using IDLE I simply save the file in a preset folder such as C:\Users\Davide\Python\data.txt and what I get is
>>> import numpy as np
>>> np.genfromtxt('data.txt')
array([[ 33.1 , 32.6 , 18.2 , 17.9 ],
[ 32.95, 32.7 , 17.95, 17.9 ],
[ 32.9 , 32.6 , 18. , 17.9 ],
[ 33. , 32.65, 18. , 17.9 ],
[ 32.95, 32.65, 18.05, 17.9 ],
[ 33. , 32.6 , 18. , 17.9 ],
[ 33.05, 32.7 , 18. , 17.9 ],
[ 33.05, 32.5 , 18.1 , 17.9 ],
[ 33. , 32.6 , 18.05, 17.9 ],
[ 33. , 32.55, 18. , 17.95]])
When working with Canopy, however, the same code gives IOError: data.txt not found, and something like np.genfromtxt('C:\Users\Davide\Python\data.txt') does not work either. I'm sorry for the banality of the question, but I'm really going crazy with this. Thanks for the help.
You can pass a fully qualified path but this:
np.genfromtxt('C:\Users\Davide\Python\data.txt')
won't work because backslashes need to be escaped (in Python 3, \U starts a Unicode escape, so this literal is a syntax error):
np.genfromtxt('C:\\Users\\Davide\\Python\\data.txt')
or you could use a raw string:
np.genfromtxt(r'C:\Users\Davide\Python\data.txt')
As to where the current working directory is (the folder in which a relative name like 'data.txt' is looked up), you can query it using os.getcwd():
In [269]:
import os
os.getcwd()
Out[269]:
'C:\\WinPython-64bit-3.4.3.1\\notebooks\\docs'
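As a small sketch of the points above: the raw string and the escaped string are the same path, and pathlib can assemble the full file name. The folder is the one from the question; the call to np.genfromtxt is only shown as a comment because the file does not exist here.

```python
import os
from pathlib import PureWindowsPath

# The raw string and the escaped string spell the same path.
raw_dir = r"C:\Users\Davide\Python"
escaped_dir = "C:\\Users\\Davide\\Python"
same = raw_dir == escaped_dir

# Build the full file name without hand-written separators.
# PureWindowsPath only manipulates the text, so this runs on any OS.
data_path = PureWindowsPath(raw_dir) / "data.txt"

# Relative names such as 'data.txt' are resolved against this directory.
cwd = os.getcwd()

# On the machine that has the file, one would then call, for example:
#     np.genfromtxt(str(data_path))
# or change directory first with os.chdir(raw_dir) and keep 'data.txt'.
```

Forward slashes, as in 'C:/Users/Davide/Python/data.txt', are also accepted by Python's file functions on Windows and avoid the escaping problem entirely.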
I just implemented a hierarchical clustering by following the documentation here: http://www.mathworks.com/help/stats/hierarchical-clustering.html?s_tid=doc_12b
So, let me try to put down what I am trying to do.
Take a look at the following figure:
Now, this dendrogram is generated from the following data:
node1 node2 dist(node1,node2) num_elems
assigning index 37 to [ 16. 26. 1.14749118 2. ]
assigning index 38 to [ 4. 7. 1.20402602 2. ]
assigning index 39 to [ 13. 29. 1.44708015 2. ]
assigning index 40 to [ 12. 18. 1.45827365 2. ]
assigning index 41 to [ 10. 34. 1.49607538 2. ]
assigning index 42 to [ 17. 38. 1.52565922 3. ]
assigning index 43 to [ 8. 25. 1.58919037 2. ]
assigning index 44 to [ 3. 40. 1.60231007 3. ]
assigning index 45 to [ 6. 42. 1.65755731 4. ]
assigning index 46 to [ 15. 23. 1.77770844 2. ]
assigning index 47 to [ 24. 33. 1.77771082 2. ]
assigning index 48 to [ 20. 35. 1.81301111 2. ]
assigning index 49 to [ 19. 48. 1.9191061 3. ]
assigning index 50 to [ 0. 44. 1.94238609 4. ]
assigning index 51 to [ 2. 36. 2.0444266 2. ]
assigning index 52 to [ 39. 45. 2.11667375 6. ]
assigning index 53 to [ 32. 43. 2.17132916 3. ]
assigning index 54 to [ 21. 41. 2.2882061 3. ]
assigning index 55 to [ 9. 30. 2.34492327 2. ]
assigning index 56 to [ 5. 51. 2.38383321 3. ]
assigning index 57 to [ 46. 52. 2.42100025 8. ]
assigning index 58 to [ 28. 37. 2.48365024 3. ]
assigning index 59 to [ 50. 53. 2.57305009 7. ]
assigning index 60 to [ 49. 57. 2.69459675 11. ]
assigning index 61 to [ 11. 54. 2.75669475 4. ]
assigning index 62 to [ 22. 27. 2.77163751 2. ]
assigning index 63 to [ 47. 55. 2.79303418 4. ]
assigning index 64 to [ 14. 60. 2.88015327 12. ]
assigning index 65 to [ 56. 59. 2.95413905 10. ]
assigning index 66 to [ 61. 65. 3.12615829 14. ]
assigning index 67 to [ 64. 66. 3.28846304 26. ]
assigning index 68 to [ 31. 58. 3.3282066 4. ]
assigning index 69 to [ 63. 67. 3.47397104 30. ]
assigning index 70 to [ 62. 68. 3.63807605 6. ]
assigning index 71 to [ 1. 69. 4.09465969 31. ]
assigning index 72 to [ 70. 71. 4.74129435 37. ]
So basically, there are 37 points in my data, indexed from 0 to 36. Going through this list, I assign each row i the index i + len(thiscompletelist) + 1, so the first row gets index 37.
So for example, when the id 37 is seen again in a later row, that means it is linked into a bigger branch as well.
I used matlab to generate this image. But I want to query this information as query_node(node_id) such that it returns a list by level, so that on query_node(37) I get
{ "left": {"level":1 {"id": 28}} , "right":{"level":0 {"left" :"id":16},"right":{"id":26}}}
Actually, I don't even know what the right data structure for this is.
Basically I want to query by node and gain some insight into what the structure of this dendrogram looks like when I am standing on that node and looking below. :(
EDIT 1:
Oh, I didn't know that you won't be able to zoom the image. Basically the fourth element from the left is 28, and the green entry is the first row of the data.
So the fourth vertical line on the dendrogram represents 28,
the line next to it (the first green line) represents 16,
and the line next to that (the second green line) represents 26.
Well it's always good to build upon something already existing so take a look at dendrogram in scipy.
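For the query_node idea itself, a plain dictionary of children is probably enough. The sketch below assumes the numbering convention described in the question (row i of the linkage creates cluster id n_points + i) and uses a made-up toy linkage of 4 points rather than the real data. The names build_children and the output layout are illustrative choices. scipy.cluster.hierarchy also offers to_tree, which turns a linkage matrix into a tree of node objects and may be worth a look before rolling your own.

```python
def build_children(linkage_rows, n_points):
    """Map each merged cluster id to its (left, right) child ids.

    Row i of the linkage creates the cluster with id n_points + i.
    """
    return {
        n_points + i: (int(row[0]), int(row[1]))
        for i, row in enumerate(linkage_rows)
    }


def query_node(node_id, children):
    """Return everything below node_id as nested dicts.

    Original points (leaves) have no entry in children and carry only an id.
    """
    if node_id not in children:
        return {"id": node_id}
    left, right = children[node_id]
    return {
        "id": node_id,
        "left": query_node(left, children),
        "right": query_node(right, children),
    }


# Toy linkage for 4 points (made-up distances): node1, node2, dist, num_elems
toy_linkage = [
    [0, 1, 1.0, 2],  # creates cluster 4
    [2, 3, 1.5, 2],  # creates cluster 5
    [4, 5, 2.0, 4],  # creates cluster 6 (the root)
]
children = build_children(toy_linkage, n_points=4)
subtree = query_node(6, children)
```

With the real data, the same call would be build_children(rows, n_points=37). To look upwards instead (which later merge absorbs a given node), an inverted parent dictionary built from the same rows would do.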