I'd like to draw a series of short tick marks on a scatter plot. The start and end points of each tick are stored in two n×2 arrays, xs and ys. Instead of drawing a small tick between each pair of points, the call below draws lines between all the points. Do I need to create n lines?
xs.round(2)
Out[212]:
array([[ 555.59, 557.17],
[ 867.64, 869. ],
[ 581.95, 583.25],
[ 822.08, 823.47],
[ 198.46, 199.91],
[ 887.29, 888.84],
[ 308.68, 310.06],
[ 340.1 , 341.52],
[ 351.68, 353.21],
[ 789.45, 790.89]])
ys.round(2)
Out[213]:
array([[ 737.55, 738.78],
[ 404.7 , 406.17],
[ 7.17, 8.69],
[ 276.72, 278.16],
[ 84.71, 86.1 ],
[ 311.89, 313.14],
[ 615.63, 617.08],
[ 653.9 , 655.32],
[ 76.33, 77.62],
[ 858.54, 859.93]])
plt.plot(xs, ys)
The easiest solution is indeed to plot n lines.
import numpy as np
import matplotlib.pyplot as plt
xs =np.array([[ 555.59, 557.17],
[ 867.64, 869. ],
[ 581.95, 583.25],
[ 822.08, 823.47],
[ 198.46, 199.91],
[ 887.29, 888.84],
[ 308.68, 310.06],
[ 340.1 , 341.52],
[ 351.68, 353.21],
[ 789.45, 790.89]])
ys = np.array([[ 737.55, 738.78],
[ 404.7 , 406.17],
[ 7.17, 8.69],
[ 276.72, 278.16],
[ 84.71, 86.1 ],
[ 311.89, 313.14],
[ 615.63, 617.08],
[ 653.9 , 655.32],
[ 76.33, 77.62],
[ 858.54, 859.93]])
for (x,y) in zip(xs,ys):
    plt.plot(x,y, color="crimson")
plt.show()
If n is very large, a more efficient solution would be to use a single LineCollection to show all lines. The advantage is that this can be drawn faster, since only a single collection is used instead of n line plots.
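# data as above.
from matplotlib.collections import LineCollection
# build an (n, 2, 2) array: n segments, 2 points per segment, (x, y) per point
seq = np.concatenate((xs[:,:,np.newaxis],ys[:,:,np.newaxis]), axis=2)
c = LineCollection(seq)
plt.gca().add_collection(c)
plt.gca().autoscale()
plt.show()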
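For reference, LineCollection takes a sequence of segments, each an array of (x, y) points, so the array built above has shape (n, 2, 2). Below is a minimal sketch of the same construction using np.stack instead of np.concatenate. It uses only the first three rows of the data above for brevity, and the names fig, ax and collection are just illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# First three rows of the data above: each row is (start, end).
xs = np.array([[555.59, 557.17], [867.64, 869.00], [581.95, 583.25]])
ys = np.array([[737.55, 738.78], [404.70, 406.17], [7.17, 8.69]])

# Pair up x and y along a new last axis: seq[i, j] == (xs[i, j], ys[i, j]).
seq = np.stack((xs, ys), axis=-1)
print(seq.shape)  # (3, 2, 2)

fig, ax = plt.subplots()
collection = LineCollection(seq, colors="crimson")
ax.add_collection(collection)
ax.autoscale()  # rescale the view so the added segments are visible
```

Call plt.show() afterwards to display the figure.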
You need to iterate over the end points of the arrays xs and ys:
import matplotlib.pyplot as plt
import numpy as np
xs = np.array([[ 555.59, 557.17],
[ 867.64, 869. ],
[ 581.95, 583.25],
[ 822.08, 823.47],
[ 198.46, 199.91],
[ 887.29, 888.84],
[ 308.68, 310.06],
[ 340.1 , 341.52],
[ 351.68, 353.21],
[ 789.45, 790.89]])
ys = np.array([[ 737.55, 738.78],
[ 404.7 , 406.17],
[ 7.17, 8.69],
[ 276.72, 278.16],
[ 84.71, 86.1 ],
[ 311.89, 313.14],
[ 615.63, 617.08],
[ 653.9 , 655.32],
[ 76.33, 77.62],
[ 858.54, 859.93]])
for segment in zip(xs, ys):
    # segment is (x_pair, y_pair); unpack it so plot receives x and y separately
    plt.plot(*segment)
plt.show()
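As a possible alternative to the explicit loop: when plot is given 2D arrays it draws one line per column, so transposing the n×2 arrays should give n two-point lines in a single call. This is only a sketch on the first three rows of the data. It still creates n separate line objects, so it is a convenience rather than a speed-up.

```python
import numpy as np
import matplotlib.pyplot as plt

# First three rows of the data above: each row is (start, end).
xs = np.array([[555.59, 557.17], [867.64, 869.00], [581.95, 583.25]])
ys = np.array([[737.55, 738.78], [404.70, 406.17], [7.17, 8.69]])

fig, ax = plt.subplots()
# xs.T and ys.T have shape (2, n); each column becomes its own line.
lines = ax.plot(xs.T, ys.T, color="crimson")
print(len(lines))  # one line object per tick
```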
Related
I have the following rotation as a quaternion (x, y, z, w), where w is the cosine of half the rotation angle:
[1,0,0,-8.940696716308594e-08]
I want to rotate the coordinates in the following array using the rotation given:
[array([[ 0.27050799, -0.027344 , -0.073242 ],
[ 0.27050799, -0.027344 , -0.073242 ],
[ 0.45117199, -0.021484 , -0.203125 ],
[ 0.45117199, -0.021484 , -0.203125 ],
[ 0.65234399, -0.038086 , 0.12988301]])]
How would I go about this?
You can use scipy for this task as follows:
import numpy as np
from scipy.spatial.transform import Rotation
# from_quat expects the scalar-last order (x, y, z, w) by default
q = np.array([1,0,0,-8.940696716308594e-08])
rotation = Rotation.from_quat(q)
vectors = np.array(
    [
        [ 0.27050799, -0.027344 , -0.073242 ],
        [ 0.27050799, -0.027344 , -0.073242 ],
        [ 0.45117199, -0.021484 , -0.203125 ],
        [ 0.45117199, -0.021484 , -0.203125 ],
        [ 0.65234399, -0.038086 , 0.12988301]
    ]
)
# apply rotates every row of vectors
rotated_vectors = rotation.apply(vectors)
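A quick sanity check, as a sketch: with w almost zero and the axis part equal to (1, 0, 0), this quaternion is very nearly a half turn about the x axis, so the rotated vectors should keep their x component and flip the signs of y and z. The name expected and the tolerance are illustrative choices, and only the three distinct rows are used.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Scalar-last quaternion (x, y, z, w), as in the question.
q = np.array([1, 0, 0, -8.940696716308594e-08])
rotation = Rotation.from_quat(q)

vectors = np.array([
    [0.27050799, -0.027344, -0.073242],
    [0.45117199, -0.021484, -0.203125],
    [0.65234399, -0.038086, 0.12988301],
])
rotated_vectors = rotation.apply(vectors)

# A half turn about x keeps x and negates y and z.
expected = vectors * np.array([1.0, -1.0, -1.0])
print(np.allclose(rotated_vectors, expected, atol=1e-6))
```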
I've been using the ellipsoid fit Python module from https://github.com/aleksandrbazhin/ellipsoid_fit_python and have mostly found it to work well, but I've recently run some data through it and noticed that I'm getting lots of negative radii:
points = np.array([[ 0.09149729, 0.03684962, -0.02292631],
[ 0.09248848, 0.03587991, -0.02036695],
[ 0.09290258, 0.03932948, -0.02168421],
[ 0.11715488, 0.02191344, -0.03957262],
[ 0.09938425, 0.02479092, -0.01535327],
[ 0.09911977, 0.02794963, -0.01118133],
[ 0.12063151, 0.03880141, -0.01510232],
[ 0.11984777, 0.02508288, -0.02870339],
[ 0.10012223, 0.02373475, -0.02195443],
[ 0.09790555, 0.02624265, -0.01190708],
[ 0.10180188, 0.02583424, -0.01340349],
[ 0.12224249, 0.02299428, -0.03712141],
[ 0.12637239, 0.03043518, -0.02760782],
[ 0.12438858, 0.02703345, -0.02828939],
[ 0.0974825 , 0.02577809, -0.01916746],
[ 0.12031736, 0.02822308, -0.03366493],
[ 0.1021885 , 0.02674174, -0.03242179],
[ 0.10101997, 0.03994928, -0.01519449],
[ 0.12693756, 0.03200349, -0.02941957],
[ 0.09250743, 0.0386544 , -0.02030381],
[ 0.11748721, 0.02688126, -0.02310617],
[ 0.11888266, 0.03919276, -0.01614771],
[ 0.1175726 , 0.02390139, -0.03775631],
[ 0.09802308, 0.02690862, -0.02278864],
[ 0.0974572 , 0.02665273, -0.0109419 ],
[ 0.11867452, 0.03764389, -0.01400771],
[ 0.10302589, 0.04016999, -0.01659405],
[ 0.12613943, 0.03701292, -0.02291183],
[ 0.12622967, 0.03926508, -0.01887258]])
centre3, radii3, evecs3, v3 = ellipsoid_fit(points )
radii3 = [-0.00490022, 0.05778404, -0.01372089]
The ellipsoid_fit function for some reason applies a sign to the radii - I don't understand why the radius would have a sign, should it not just be absolute values?
Can I simply just ignore these signs and take the absolute values? If not, what does a negative radius mean?
I can manually create a chart of kmeans data, with 5 centroids (code below).
# computing K-Means with K = 5 (5 clusters)
centroids,_ = kmeans(data,5)
# assign each sample to a cluster
idx,_ = vq(data,centroids)
# some plotting using numpy's logical indexing
plot(data[idx==0,0],data[idx==0,1],'ob',
data[idx==1,0],data[idx==1,1],'oy',
data[idx==2,0],data[idx==2,1],'or',
data[idx==3,0],data[idx==3,1],'og',
data[idx==4,0],data[idx==4,1],'om')
plot(centroids[:,0],centroids[:,1],'sg',markersize=15)
show()
Now, I am trying to figure out how to dynamically create a chart in Python. I think it should be something like this (below), but it doesn't actually work.
for i in range(2, 20):
plot(data[idx==[i],0],data[idx==[i],1],'some_dynamic_color'
plot(centroids[:,0],centroids[:,1],'sg',markersize=15)
show()
Finally, here is my array of data, for reference. Not sure it's even relevant to the problem at hand.
array([[ 0.01160815, 0.28552583],
[ 0.01495681, 0.24965798],
[ 0.52218559, 0.26969486],
[ 0.16408791, 0.30713289],
[ 0.35037607, 0.28401598],
[-0.32413957, 0.53144262],
[ 0.10853278, 0.19756793],
[ 0.08275109, 0.18140047],
[-0.04350157, 0.26407197],
[-0.04789838, 0.31644537],
[-0.03852801, 0.21557165],
[ 0.02213885, 0.20033466],
[-0.80612714, 0.35888803],
[-0.27971428, 0.3195602 ],
[ 0.21359135, 0.14144335],
[ 0.09936109, 0.22313638],
[ 0.15504834, 0.17022939],
[ 0.47012351, 0.41452523],
[ 0.28616062, 0.23098198],
[ 0.25941178, 0.14843141],
[ 0.20049158, 0.23769455],
[-0.19766684, 0.39110416],
[-0.29619519, 0.53520109],
[ 0.29319037, 0.23907492],
[ 0.16644319, 0.18737667],
[ 0.37407685, 0.22463339],
[-0.34262982, 0.40264906],
[ 0.52658291, 0.3542729 ],
[ 0.5747167 , 0.50042607],
[ 0.15607962, 0.20861585],
[-0.50769188, 0.34266008],
[ 0.43373588, 0.22526141],
[ 0.1624051 , 0.29859298],
[ 0.22789948, 0.20157262],
[-0.1179015 , 0.21471169],
[ 0.26108742, 0.26604149],
[ 0.10019146, 0.25547835],
[ 0.18906467, 0.19078555],
[-0.02575308, 0.2877592 ],
[-0.45292564, 0.51866493],
[ 0.11516754, 0.21504329],
[ 0.10020043, 0.23943587],
[ 0.21402611, 0.34297039],
[ 0.24574342, 0.15734118],
[ 0.58083355, 0.22886509],
[ 0.33975699, 0.33309233],
[ 0.19002609, 0.14372212],
[ 0.35220577, 0.23879166],
[ 0.27427999, 0.1529184 ],
[ 0.06261825, 0.18908223],
[ 0.25005859, 0.21363957],
[ 0.1676683 , 0.26111871],
[ 0.14703364, 0.25532777],
[ 0.26130579, 0.14012819],
[-0.14897454, 0.23037735],
[-0.26827493, 0.23193457],
[ 0.51701526, 0.17887009],
[-0.05870745, 0.18040883],
[ 0.25651599, 0.227289 ],
[ 0.06881783, 0.28114007],
[ 0.43079653, 0.21510341]])
Any thoughts on how I can create the chart dynamically?
Thanks.
The loop index i should run from 0 to 4 (there are 5 centroids).
for i in range(0, 5):
    plot(data[idx==[i],0],data[idx==[i],1],'some_dynamic_color' ...
I reproduced it as below, using matplotlib and scikit-learn.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
data = np.array([...])  # replace [...] with your data, shape (n_samples, 2)
kmeans = KMeans(n_clusters=5)
kmeans.fit(data)
y_kmeans = kmeans.predict(data)
# discrete colormap with one colour per cluster
viridis = plt.get_cmap('viridis', 5)
for i in range(0, len(data)):
    # colour each point by the label of the cluster it was assigned to
    plt.scatter(data[i,0], data[i,1], color=viridis(y_kmeans[i]), s= 50)
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)
plt.show()
k-Means ref https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html
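To make the number of clusters a parameter while staying with the scipy functions from the question, one option is to loop over the cluster labels and pick each colour from a colormap. The sketch below assumes this is what "dynamic" means here. The function name plot_clusters is hypothetical, and the data is a random stand-in, not the array from the question.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.vq import kmeans, vq


def plot_clusters(data, k):
    """Cluster data into at most k groups and draw one scatter per cluster."""
    centroids, _ = kmeans(data, k)
    idx, _ = vq(data, centroids)
    # kmeans can return fewer than k centroids, so count what came back.
    n_found = len(centroids)
    cmap = plt.get_cmap("viridis", n_found)
    fig, ax = plt.subplots()
    for i in range(n_found):
        members = data[idx == i]
        ax.scatter(members[:, 0], members[:, 1], color=cmap(i),
                   label=f"cluster {i}")
    # Mark the centroids with hollow squares.
    ax.scatter(centroids[:, 0], centroids[:, 1], marker="s", s=150,
               facecolors="none", edgecolors="black")
    ax.legend()
    return centroids, idx


# Stand-in data (random, NOT the array from the question).
rng = np.random.default_rng(0)
data = rng.normal(size=(60, 2))
centroids, idx = plot_clusters(data, 5)
```

Calling plot_clusters(data, k) for each k in a range, followed by plt.show(), would then give one chart per cluster count.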
I have two arrays to plot, of shapes (120,) and (120,). For the second array I am trying to get a smooth curve, but I am unable to do so.
The following plots a normal plot:
import numpy as np
import matplotlib.pyplot as plt
add_z = np.array([ 22.39409055, 20.91765398, 19.80805759, 19.14836638, 23.54310977, 19.68638808, 21.25143616, 21.32550146, 18.80392599, 17.37016759, 19.21143494, 18.27464661, 21.25150385, 20.61853909 ])
dataNew = np.array([[ 26.69], [ 24.94], [ 22.37], [ 23.5 ], [ 22.69], [ 22.62], [ 18.5 ], [ 20.87], [ 19. ], [ 19.75], [ 20.72], [ 19.78], [ 20.38], [ 22.06]])
plt.figure(figsize = (10,5))
plt.plot(dataNew[:],'g')
plt.plot(add_z[:],'b');
I tried using scipy's interpolation methods, but I am really not familiar with splines. I am trying to get dataNew as a normal plot and add_z as a smooth curve alongside it in the same plot window. Both are numpy arrays.
This is just patched together from another Stack Overflow answer, which I have embarrassingly misplaced:
import matplotlib.pyplot as plt
import numpy as np
add_z = np.array([ 22.39409055, 20.91765398, 19.80805759, 19.14836638, 23.54310977, 19.68638808, 21.25143616, 21.32550146, 18.80392599, 17.37016759, 19.21143494, 18.27464661, 21.25150385, 20.61853909, 22.89028155, 22.3965408 ])
dataNew = np.array([[ 26.69], [ 24.94], [ 22.37], [ 23.5 ], [ 22.69], [ 22.62], [ 18.5 ], [ 20.87], [ 19. ], [ 19.75], [ 20.72], [ 19.78], [ 20.38], [ 22.06]])
plt.figure(figsize = (10,5))
plt.plot(dataNew[:],'g')
plt.plot(add_z[:],'b');
from scipy import interpolate
f = interpolate.interp1d(np.arange(len(add_z)), add_z, kind='cubic')
xnew = np.arange(0, len(add_z)-1, 0.1)
ynew = f(xnew)
plt.plot(xnew, ynew, 'b:')
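One thing worth knowing about this approach: a cubic interpolant passes through every original sample, so it rounds the corners between points rather than averaging out noise. The sketch below checks that on the first eight values of add_z. The dense grid built with np.linspace is an illustrative choice that also includes the last sample position.

```python
import numpy as np
from scipy import interpolate

# First eight values of add_z from above.
add_z = np.array([22.39409055, 20.91765398, 19.80805759, 19.14836638,
                  23.54310977, 19.68638808, 21.25143616, 21.32550146])
x = np.arange(len(add_z))
f = interpolate.interp1d(x, add_z, kind="cubic")

# Dense grid over the same range, end point included.
xnew = np.linspace(0, len(add_z) - 1, 200)
ynew = f(xnew)

# The interpolant reproduces the samples at the original positions.
print(np.allclose(f(x), add_z))
```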
How do I generate a data set consisting of N = 100 2-dimensional samples $x = (x_1, x_2)^T \in \mathbb{R}^2$ drawn from a 2-dimensional Gaussian distribution, with mean
$\mu = (1, 1)^T$
and covariance matrix
$\Sigma = \begin{pmatrix} 0.3 & 0.2 \\ 0.2 & 0.2 \end{pmatrix}$?
I'm told that you can use the Matlab function randn, but I don't know how to implement this in Python.
Just to elaborate on @EamonNerbonne's answer: the following uses Cholesky decomposition of the covariance matrix to generate correlated variables from uncorrelated normally distributed random variables.
import numpy as np
import matplotlib.pyplot as plt
linalg = np.linalg
N = 1000
mean = [1,1]
cov = [[0.3, 0.2],[0.2, 0.2]]
data = np.random.multivariate_normal(mean, cov, N)
L = linalg.cholesky(cov)
# print(L.shape)
# (2, 2)
uncorrelated = np.random.standard_normal((2,N))
data2 = np.dot(L,uncorrelated) + np.array(mean).reshape(2,1)
# print(data2.shape)
# (2, 1000)
plt.scatter(data2[0,:], data2[1,:], c='green')
plt.scatter(data[:,0], data[:,1], c='yellow')
plt.show()
The yellow dots were generated by np.random.multivariate_normal. The green dots were generated by multiplying normally distributed points by the Cholesky decomposition matrix L.
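As a numerical check of the Cholesky approach, the sketch below confirms that L times its transpose recovers the covariance matrix and that the sample mean and covariance of the transformed points land close to the targets. The seed and the sample size of 100000 are arbitrary choices made so that the estimates are stable.

```python
import numpy as np

rng = np.random.default_rng(0)
mean = np.array([1.0, 1.0])
cov = np.array([[0.3, 0.2], [0.2, 0.2]])

L = np.linalg.cholesky(cov)            # lower triangular, L @ L.T equals cov
z = rng.standard_normal((2, 100_000))  # uncorrelated unit-variance samples
samples = L @ z + mean[:, np.newaxis]  # shape (2, 100000)

print(np.allclose(L @ L.T, cov))
print(np.cov(samples).round(2))        # should be close to cov
print(samples.mean(axis=1).round(2))   # should be close to mean
```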
You are looking for numpy.random.multivariate_normal.
Code
>>> import numpy
>>> print(numpy.random.multivariate_normal([1,1], [[0.3, 0.2],[0.2, 0.2]], 100))
[[ 0.02999043 0.09590078]
[ 1.35743021 1.08199363]
[ 1.15721179 0.87750625]
[ 0.96879114 0.94503228]
[ 1.23989167 1.13473083]
[ 1.55917608 0.81530847]
[ 0.89985651 0.7071519 ]
[ 0.37494324 0.739433 ]
[ 1.45121732 1.17168444]
[ 0.69680785 1.2727178 ]
[ 0.35600769 0.46569276]
[ 2.14187488 1.8758589 ]
[ 1.59276393 1.54971412]
[ 1.71227009 1.63429704]
[ 1.05013136 1.1669758 ]
[ 1.34344004 1.37369725]
[ 1.82975724 1.49866636]
[ 0.80553877 1.26753018]
[ 1.74331784 1.27211784]
[ 1.23044292 1.18110192]
[ 1.07675493 1.05940509]
[ 0.15495771 0.64536509]
[ 0.77409745 1.0174171 ]
[ 1.20062726 1.3870498 ]
[ 0.39619719 0.77919884]
[ 0.87209168 1.00248145]
[ 1.32273339 1.54428262]
[ 2.11848535 1.44338789]
[ 1.45226461 1.42061198]
[ 0.33775737 0.24968543]
[ 1.06982557 0.64674411]
[ 0.92113229 1.0583153 ]
[ 0.54987592 0.73198037]
[ 1.06559727 0.77891362]
[ 0.84371805 0.72957046]
[ 1.83614557 1.40582746]
[ 0.53146009 0.72294094]
[ 0.98927818 0.73732053]
[ 1.03984002 0.89426628]
[ 0.38142362 0.32471126]
[ 1.44464929 1.15407227]
[-0.22601279 0.21045592]
[-0.01995875 0.45051782]
[ 0.58779449 0.44486237]
[ 1.31335981 0.92875936]
[ 0.42200098 0.6942829 ]
[ 0.10714426 0.11083002]
[ 1.44997839 1.19052704]
[ 0.78630506 0.45877582]
[ 1.63432202 1.95066539]
[ 0.56680926 0.92203111]
[ 0.08841491 0.62890576]
[ 1.4703602 1.4924649 ]
[ 1.01118864 1.44749407]
[ 1.19936276 1.02534702]
[ 0.67893239 0.8482461 ]
[ 0.71537211 0.53279103]
[ 1.08031573 1.00779064]
[ 0.66412568 0.57121041]
[ 0.96098528 0.72318386]
[ 0.7690299 0.76058713]
[ 0.77466896 0.77559282]
[ 0.47906664 0.58602633]
[ 0.52481326 0.78486453]
[-0.40240438 0.17374116]
[ 0.75730444 0.22365892]
[ 0.67811008 1.17730408]
[ 1.62245699 1.71775386]
[ 1.12317847 1.04252136]
[-0.06461117 0.23557416]
[ 0.46299482 0.51585414]
[ 0.88125676 1.23284201]
[ 0.57920534 0.63765861]
[ 0.88239858 1.32092112]
[ 0.63500551 0.94788141]
[ 1.76588148 1.63856465]
[ 0.65026599 0.6899672 ]
[ 0.06854287 0.29712499]
[ 0.61575737 0.87526625]
[ 0.30057552 0.54475194]
[ 0.66578769 0.21034844]
[ 0.94670438 0.7699764 ]
[ 0.39870371 0.91681577]
[ 1.37531351 1.62337899]
[ 1.92350877 1.34382017]
[ 0.56631877 0.77456137]
[ 1.18702642 0.63700271]
[ 0.74002244 1.04535471]
[ 0.3272063 0.75097037]
[ 1.57583435 1.55809705]
[ 0.44325124 0.39620769]
[ 0.59762516 0.58304621]
[ 0.72253698 0.68302097]
[ 0.93459597 1.01101948]
[ 0.50139577 0.52500942]
[ 0.84696441 0.68679341]
[ 0.63483432 0.22205385]
[ 1.43642478 1.34724612]
[ 1.58663111 1.49941374]
[ 0.73832806 0.95690866]]
>>>
Although numpy has handy utility functions, you can always "rescale" multiple independent, normally distributed variables to match your given covariance matrix. So if you can generate a column vector x (or many vectors grouped in a matrix) in which each element is an independent standard normal variable, and you multiply it by a matrix M, the result will have covariance M M^T. Conversely, if you decompose your covariance C into the form M M^T, then it's really simple to generate such a distribution even without the utility functions numpy provides (just multiply your bunch of normally distributed vectors by M).
This is perhaps not the answer you're directly looking for, but it's useful to keep in mind, e.g.:
- If you ever find yourself scaling the result of the random generation, you could instead combine the scaling with your initial covariance.
- If you ever need to port code to libraries that don't directly support such a utility method, it's very easy to implement yourself.
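The covariance claim above can be written out in one line, assuming x has zero mean and identity covariance (independent standard normal components):

```latex
\[
\operatorname{Cov}(Mx)
  = \mathbb{E}\!\left[(Mx)(Mx)^{T}\right]
  = M\,\mathbb{E}\!\left[x x^{T}\right] M^{T}
  = M I M^{T}
  = M M^{T}.
\]
```

Adding a constant mean vector afterwards shifts the mean without changing this covariance.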