Compare similarity methods of images - python
I would like to compare several image similarity measures with each other. As a starting point, consider two images of the same actor from the TV series Klon. Here are the two images:
The character is called Lucas, and his clone is called Leo.
I know that there are different comparison methods, such as MSE and SSIM (Structural Similarity Index). Here is a snippet of the code I use to compare both images:
def mse(imageA, imageB):
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])
    return err

def compare_images(imageA, imageB, title):
    m = mse(imageA, imageB)
    s = ssim(imageA, imageB)
    fig = plt.figure(title)
    plt.suptitle("MSE: %.2f, SSIM: %.2f" % (m, s))
    ax = fig.add_subplot(1, 2, 1)
    plt.imshow(imageA, cmap = plt.cm.gray)
    plt.axis("off")
    ax = fig.add_subplot(1, 2, 2)
    plt.imshow(imageB, cmap = plt.cm.gray)
    plt.axis("off")
    plt.show()
When I compare the images with these methods, I get the following result:
As you can see, SSIM is very low (0.19). Then I found the following documentation:
RootSIFT in Python
But I don't know exactly how to calculate a score from it, or how to display the keypoints and descriptors on each image. Here is the corresponding full code:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage.metrics import structural_similarity as ssim

class RootSIFT:
    #def __int__(self):
    #self.extractor = cv2.DescriptorExtractor_create("SIFT")
    def compute(self, image, eps=1e-7):
        kps = cv2.SIFT_create().detect(image)
        (kps, descs) = cv2.SIFT_create().compute(image, kps)
        if len(kps) == 0:
            return ([], None)
        descs /= (descs.sum(axis=1, keepdims=True) + eps)
        descs = np.sqrt(descs)
        return (kps, descs)

def mse(imageA, imageB):
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])
    return err

def compare_images(imageA, imageB, title):
    m = mse(imageA, imageB)
    s = ssim(imageA, imageB)
    fig = plt.figure(title)
    plt.suptitle("MSE: %.2f, SSIM: %.2f" % (m, s))
    ax = fig.add_subplot(1, 2, 1)
    plt.imshow(imageA, cmap = plt.cm.gray)
    plt.axis("off")
    ax = fig.add_subplot(1, 2, 2)
    plt.imshow(imageB, cmap = plt.cm.gray)
    plt.axis("off")
    plt.show()

def set_equal(image1, image2):
    if image1.size >= image2.size:
        image1 = cv2.resize(image1, (image2.shape[1], image2.shape[0]))
    else:
        image2 = cv2.resize(image2, (image1.shape[1], image1.shape[0]))
    return image1, image2

image1 = cv2.imread("leo.jpg", 0)
image2 = cv2.imread("lucas.jpg", 0)
#image1 =cv2.resize(image2,(image2.shape[1],image2.shape[0]))
image1, image2 = set_equal(image1, image2)
compare_images(image1, image2, "compare images")
print(RootSIFT().compute(image1))
print(RootSIFT().compute(image2))
The last two lines
print(RootSIFT().compute(image1))
print(RootSIFT().compute(image2))
compute and print the keypoints and descriptors like this:
((<KeyPoint 0000028F9DA8E310>, <KeyPoint 0000028F9DA8E340>, <KeyPoint 0000028F9DA8E370>, <KeyPoint 0000028F9DA8E3A0>, <KeyPoint 0000028F9DA8E3D0>, <KeyPoint 0000028F9DA8E400>, <KeyPoint 0000028F9DA8E430>, <KeyPoint 0000028F9DA8E460>, <KeyPoint 0000028F9DA8E490>, <KeyPoint 0000028F9DA8E4C0>, <KeyPoint 0000028F9DA8E4F0>, <KeyPoint 0000028F9DA8E520>, <KeyPoint 0000028F9DA8E550>, <KeyPoint 0000028F9DA8E580>, <KeyPoint 0000028F9DA8E5B0>, <KeyPoint 0000028F9DA8E5E0>, <KeyPoint 0000028F9DA8E610>, <KeyPoint 0000028F9DA8E640>, <KeyPoint 0000028F9DA8E670>, <KeyPoint 0000028F9DA8E6A0>, <KeyPoint 0000028F9DA8E6D0>, <KeyPoint 0000028F9DA8E700>, <KeyPoint 0000028F9DA8E730>, <KeyPoint 0000028F9DA8E760>, <KeyPoint 0000028F9DA8E790>, <KeyPoint 0000028F9DA8E7C0>, <KeyPoint 0000028F9DA8E7F0>, <KeyPoint 0000028F9DA8E820>, <KeyPoint 0000028F9DA8E850>, <KeyPoint 0000028F9DA8E880>, <KeyPoint 0000028F9DA8E8B0>, <KeyPoint 0000028F9DA8E8E0>, <KeyPoint 0000028F9DA8E910>, <KeyPoint 0000028F9DA8E940>, <KeyPoint 0000028F9DA8E970>, <KeyPoint 0000028F9DA8E9A0>, <KeyPoint 0000028F9DA8E9D0>, <KeyPoint 0000028F9DA8EA00>, <KeyPoint 0000028F9DA8EA30>, <KeyPoint 0000028F9DA8EA60>, <KeyPoint 0000028F9DA8EA90>, <KeyPoint 0000028F9DA8EAC0>, <KeyPoint 0000028F9DA8EAF0>, <KeyPoint 0000028F9DA8EB20>, <KeyPoint 0000028F9DA8EB50>, <KeyPoint 0000028F9DA8EB80>, <KeyPoint 0000028F9DA8EBB0>, <KeyPoint 0000028F9DA8EBE0>, <KeyPoint 0000028F9DA8EC10>, <KeyPoint 0000028F9DA8EC40>, <KeyPoint 0000028F9DA8EC70>, <KeyPoint 0000028F9DA8ECA0>, <KeyPoint 0000028F9DA8ECD0>, <KeyPoint 0000028F9DA8ED00>, <KeyPoint 0000028F9DA8ED30>, <KeyPoint 0000028F9DA8ED60>, <KeyPoint 0000028F9DA8ED90>, <KeyPoint 0000028F9DA8EDC0>, <KeyPoint 0000028F9DA8EDF0>, <KeyPoint 0000028F9DA8EE20>, <KeyPoint 0000028F9DA8EE50>, <KeyPoint 0000028F9DA8EE80>, <KeyPoint 0000028F9DA8EEB0>, <KeyPoint 0000028F9DA8EEE0>, <KeyPoint 0000028F9DA8EF10>, <KeyPoint 0000028F9DA8EF40>, <KeyPoint 0000028F9DA8EF70>, <KeyPoint 0000028F9DA8EFA0>, <KeyPoint 
0000028F9DA8EFD0>, <KeyPoint 0000028F9DA8F000>, <KeyPoint 0000028F9DA8F030>, <KeyPoint 0000028F9DA8F060>, <KeyPoint 0000028F9DA8F090>, <KeyPoint 0000028F9DA8F0C0>, <KeyPoint 0000028F9DA8F0F0>, <KeyPoint 0000028F9DA8F120>, <KeyPoint 0000028F9DA8F150>, <KeyPoint 0000028F9DA8F180>, <KeyPoint 0000028F9DA8F1B0>, <KeyPoint 0000028F9DA8F1E0>, <KeyPoint 0000028F9DA8F210>, <KeyPoint 0000028F9DA8F240>, <KeyPoint 0000028F9DA8F270>, <KeyPoint 0000028F9DA8F2A0>, <KeyPoint 0000028F9DA8F2D0>, <KeyPoint 0000028F9DA8F300>, <KeyPoint 0000028F9DA8F330>, <KeyPoint 0000028F9DA8F360>, <KeyPoint 0000028F9DA8F390>, <KeyPoint 0000028F9DA8F3C0>, <KeyPoint 0000028F9DA8F3F0>, <KeyPoint 0000028F9DA8F420>, <KeyPoint 0000028F9DA8F450>, <KeyPoint 0000028F9DA8F480>, <KeyPoint 0000028F9DA8F4B0>, <KeyPoint 0000028F9DA8F4E0>, <KeyPoint 0000028F9DA8F510>, <KeyPoint 0000028F9DA8F540>, <KeyPoint 0000028F9DA8F570>, <KeyPoint 0000028F9DA8F5A0>, <KeyPoint 0000028F9DA8F5D0>, <KeyPoint 0000028F9DA8F600>, <KeyPoint 0000028F9DA8F630>, <KeyPoint 0000028F9DA8F660>, <KeyPoint 0000028F9DA8F690>, <KeyPoint 0000028F9DA8F6C0>, <KeyPoint 0000028F9DA8F6F0>, <KeyPoint 0000028F9DA8F720>, <KeyPoint 0000028F9DA8F750>, <KeyPoint 0000028F9DA8F780>, <KeyPoint 0000028F9DA8F7B0>, <KeyPoint 0000028F9DA8F7E0>, <KeyPoint 0000028F9DA8F810>, <KeyPoint 0000028F9DA8F840>, <KeyPoint 0000028F9DA8F870>, <KeyPoint 0000028F9DA8F8A0>, <KeyPoint 0000028F9DA8F8D0>, <KeyPoint 0000028F9DA8F900>, <KeyPoint 0000028F9DA8F930>, <KeyPoint 0000028F9DA8F960>, <KeyPoint 0000028F9DA8F990>, <KeyPoint 0000028F9DA8F9C0>, <KeyPoint 0000028F9DA8F9F0>, <KeyPoint 0000028F9DA8FA20>, <KeyPoint 0000028F9DA8FA50>, <KeyPoint 0000028F9DA8FA80>, <KeyPoint 0000028F9DA8FAB0>, <KeyPoint 0000028F9DA8FAE0>, <KeyPoint 0000028F9DA8FB10>, <KeyPoint 0000028F9DA8FB40>, <KeyPoint 0000028F9DA8FB70>, <KeyPoint 0000028F9DA8FBA0>, <KeyPoint 0000028F9DA8FBD0>, <KeyPoint 0000028F9DA8FC00>, <KeyPoint 0000028F9DA8FC30>, <KeyPoint 0000028F9DA8FC60>, <KeyPoint 0000028F9DA8FC90>, 
<KeyPoint 0000028F9DA8FCC0>, <KeyPoint 0000028F9DA8FCF0>, <KeyPoint 0000028F9DA8FD20>, <KeyPoint 0000028F9DA8FD50>, <KeyPoint 0000028F9DA8FD80>, <KeyPoint 0000028F9DA8FDB0>, <KeyPoint 0000028F9DA8FDE0>, <KeyPoint 0000028F9DA8FE10>, <KeyPoint 0000028F9DA8FE40>, <KeyPoint 0000028F9DA8FE70>, <KeyPoint 0000028F9DA8FEA0>, <KeyPoint 0000028F9DA8FED0>, <KeyPoint 0000028F9DA8FF00>, <KeyPoint 0000028F9DA8FF30>, <KeyPoint 0000028F9DA8FF60>, <KeyPoint 0000028F9DA8FF90>, <KeyPoint 0000028F9DA8FFC0>, <KeyPoint 0000028F9DA90030>, <KeyPoint 0000028F9DA90060>, <KeyPoint 0000028F9DA90090>, <KeyPoint 0000028F9DA900C0>, <KeyPoint 0000028F9DA900F0>, <KeyPoint 0000028F9DA90120>, <KeyPoint 0000028F9DA90150>, <KeyPoint 0000028F9DA90180>, <KeyPoint 0000028F9DA901B0>, <KeyPoint 0000028F9DA901E0>, <KeyPoint 0000028F9DA90210>, <KeyPoint 0000028F9DA90240>, <KeyPoint 0000028F9DA90270>, <KeyPoint 0000028F9DA902A0>, <KeyPoint 0000028F9DA902D0>, <KeyPoint 0000028F9DA90300>, <KeyPoint 0000028F9DA90330>, <KeyPoint 0000028F9DA90360>, <KeyPoint 0000028F9DA90390>, <KeyPoint 0000028F9DA903C0>, <KeyPoint 0000028F9DA903F0>, <KeyPoint 0000028F9DA90420>, <KeyPoint 0000028F9DA90450>, <KeyPoint 0000028F9DA90480>, <KeyPoint 0000028F9DA904B0>, <KeyPoint 0000028F9DA904E0>, <KeyPoint 0000028F9DA90510>, <KeyPoint 0000028F9DA90540>, <KeyPoint 0000028F9DA90570>, <KeyPoint 0000028F9DA905A0>, <KeyPoint 0000028F9DA905D0>, <KeyPoint 0000028F9DA90600>, <KeyPoint 0000028F9DA90630>, <KeyPoint 0000028F9DA90660>, <KeyPoint 0000028F9DA90690>, <KeyPoint 0000028F9DA906C0>, <KeyPoint 0000028F9DA906F0>, <KeyPoint 0000028F9DA90720>, <KeyPoint 0000028F9DA90750>, <KeyPoint 0000028F9DA90780>, <KeyPoint 0000028F9DA907B0>, <KeyPoint 0000028F9DA907E0>, <KeyPoint 0000028F9DA90810>, <KeyPoint 0000028F9DA90840>, <KeyPoint 0000028F9DA90870>, <KeyPoint 0000028F9DA908A0>, <KeyPoint 0000028F9DA908D0>, <KeyPoint 0000028F9DA90900>, <KeyPoint 0000028F9DA90930>, <KeyPoint 0000028F9DA90960>, <KeyPoint 0000028F9DA90990>, <KeyPoint 
0000028F9DA909C0>, <KeyPoint 0000028F9DA909F0>, <KeyPoint 0000028F9DA90A20>, <KeyPoint 0000028F9DA90A50>, <KeyPoint 0000028F9DA90A80>, <KeyPoint 0000028F9DA90AB0>, <KeyPoint 0000028F9DA90AE0>, <KeyPoint 0000028F9DA90B10>, <KeyPoint 0000028F9DA90B40>, <KeyPoint 0000028F9DA90B70>, <KeyPoint 0000028F9DA90BA0>, <KeyPoint 0000028F9DA90BD0>, <KeyPoint 0000028F9DA90C00>, <KeyPoint 0000028F9DA90C30>, <KeyPoint 0000028F9DA90C60>, <KeyPoint 0000028F9DA90C90>, <KeyPoint 0000028F9DA90CC0>, <KeyPoint 0000028F9DA90CF0>, <KeyPoint 0000028F9DA90D20>, <KeyPoint 0000028F9DA90D50>, <KeyPoint 0000028F9DA90D80>, <KeyPoint 0000028F9DA90DB0>, <KeyPoint 0000028F9DA90DE0>, <KeyPoint 0000028F9DA90E10>, <KeyPoint 0000028F9DA90E40>, <KeyPoint 0000028F9DA90E70>, <KeyPoint 0000028F9DA90EA0>, <KeyPoint 0000028F9DA90ED0>, <KeyPoint 0000028F9DA90F00>, <KeyPoint 0000028F9DA90F30>, <KeyPoint 0000028F9DA90F60>, <KeyPoint 0000028F9DA90F90>, <KeyPoint 0000028F9DA90FC0>, <KeyPoint 0000028F9DA90FF0>, <KeyPoint 0000028F9DA91020>, <KeyPoint 0000028F9DA91050>, <KeyPoint 0000028F9DA91080>, <KeyPoint 0000028F9DA910B0>, <KeyPoint 0000028F9DA910E0>, <KeyPoint 0000028F9DA91110>, <KeyPoint 0000028F9DA91140>, <KeyPoint 0000028F9DA91170>, <KeyPoint 0000028F9DA911A0>, <KeyPoint 0000028F9DA911D0>, <KeyPoint 0000028F9DA91200>, <KeyPoint 0000028F9DA91230>, <KeyPoint 0000028F9DA91260>, <KeyPoint 0000028F9DA91290>, <KeyPoint 0000028F9DA912C0>, <KeyPoint 0000028F9DA912F0>, <KeyPoint 0000028F9DA91320>, <KeyPoint 0000028F9DA91350>, <KeyPoint 0000028F9DA91380>, <KeyPoint 0000028F9DA913B0>, <KeyPoint 0000028F9DA913E0>, <KeyPoint 0000028F9DA91410>, <KeyPoint 0000028F9DA91440>, <KeyPoint 0000028F9DA91470>, <KeyPoint 0000028F9DA914A0>, <KeyPoint 0000028F9DA914D0>, <KeyPoint 0000028F9DA91500>, <KeyPoint 0000028F9DA91530>, <KeyPoint 0000028F9DA91560>, <KeyPoint 0000028F9DA91590>, <KeyPoint 0000028F9DA915C0>, <KeyPoint 0000028F9DA915F0>, <KeyPoint 0000028F9DA91620>), array([[0. , 0. , 0.01875146, ..., 0. , 0. ,
0. ],
[0.03097891, 0.0593201 , 0.06195782, ..., 0. , 0. ,
0. ],
[0. , 0. , 0. , ..., 0. , 0.191556 ,
0.2333692 ],
...,
[0.17920329, 0.25262845, 0. , ..., 0. , 0. ,
0.04508348],
[0. , 0. , 0. , ..., 0.06996126, 0.01940376,
0.07761505],
[0.1062388 , 0.06993493, 0.01939646, ..., 0. , 0. ,
0. ]], dtype=float32))
((<KeyPoint 0000028F9DA8E430>, <KeyPoint 0000028F9DA8E460>, <KeyPoint 0000028F9DA8E490>, <KeyPoint 0000028F9DA8E4C0>, <KeyPoint 0000028F9DA8E4F0>, <KeyPoint 0000028F9DA8E520>, <KeyPoint 0000028F9DA8E550>, <KeyPoint 0000028F9DA8E580>, <KeyPoint 0000028F9DA8E5B0>, <KeyPoint 0000028F9DA8E5E0>, <KeyPoint 0000028F9DA8E610>, <KeyPoint 0000028F9DA8E640>, <KeyPoint 0000028F9DA8E670>, <KeyPoint 0000028F9DA8E6A0>, <KeyPoint 0000028F9DA8E6D0>, <KeyPoint 0000028F9DA8E700>, <KeyPoint 0000028F9DA8E730>, <KeyPoint 0000028F9DA8E760>, <KeyPoint 0000028F9DA8E790>, <KeyPoint 0000028F9DA8E7C0>, <KeyPoint 0000028F9DA8E7F0>, <KeyPoint 0000028F9DA8E820>, <KeyPoint 0000028F9DA8E850>, <KeyPoint 0000028F9DA8E880>, <KeyPoint 0000028F9DA8E8B0>, <KeyPoint 0000028F9DA8E8E0>, <KeyPoint 0000028F9DA8E910>, <KeyPoint 0000028F9DA8E940>, <KeyPoint 0000028F9DA8E970>, <KeyPoint 0000028F9DA8E9A0>, <KeyPoint 0000028F9DA8E9D0>, <KeyPoint 0000028F9DA8EA00>, <KeyPoint 0000028F9DA8EA30>, <KeyPoint 0000028F9DA8EA60>, <KeyPoint 0000028F9DA8EA90>, <KeyPoint 0000028F9DA8EAC0>, <KeyPoint 0000028F9DA8EAF0>, <KeyPoint 0000028F9DA8EB20>, <KeyPoint 0000028F9DA8EB50>, <KeyPoint 0000028F9DA8EB80>, <KeyPoint 0000028F9DA8EBB0>, <KeyPoint 0000028F9DA8EBE0>, <KeyPoint 0000028F9DA8EC10>, <KeyPoint 0000028F9DA8EC40>, <KeyPoint 0000028F9DA8EC70>, <KeyPoint 0000028F9DA8ECA0>, <KeyPoint 0000028F9DA8ECD0>, <KeyPoint 0000028F9DA8ED00>, <KeyPoint 0000028F9DA8ED30>, <KeyPoint 0000028F9DA8ED60>, <KeyPoint 0000028F9DA8ED90>, <KeyPoint 0000028F9DA8EDC0>, <KeyPoint 0000028F9DA8EDF0>, <KeyPoint 0000028F9DA8EE20>, <KeyPoint 0000028F9DA8EE50>, <KeyPoint 0000028F9DA8EE80>, <KeyPoint 0000028F9DA8EEB0>, <KeyPoint 0000028F9DA8EEE0>, <KeyPoint 0000028F9DA8EF10>, <KeyPoint 0000028F9DA8EF40>, <KeyPoint 0000028F9DA8EF70>, <KeyPoint 0000028F9DA8EFA0>, <KeyPoint 0000028F9DA8EFD0>, <KeyPoint 0000028F9DA8F000>, <KeyPoint 0000028F9DA8F030>, <KeyPoint 0000028F9DA8F060>, <KeyPoint 0000028F9DA8F090>, <KeyPoint 0000028F9DA8F0C0>, <KeyPoint 
0000028F9DA8F0F0>, <KeyPoint 0000028F9DA8F120>, <KeyPoint 0000028F9DA8F150>, <KeyPoint 0000028F9DA8F180>, <KeyPoint 0000028F9DA8F1B0>, <KeyPoint 0000028F9DA8F1E0>, <KeyPoint 0000028F9DA8F210>, <KeyPoint 0000028F9DA8F240>, <KeyPoint 0000028F9DA8F270>, <KeyPoint 0000028F9DA8F2A0>, <KeyPoint 0000028F9DA8F2D0>, <KeyPoint 0000028F9DA8F300>, <KeyPoint 0000028F9DA8F330>, <KeyPoint 0000028F9DA8F360>, <KeyPoint 0000028F9DA8F390>, <KeyPoint 0000028F9DA8F3C0>, <KeyPoint 0000028F9DA8F3F0>, <KeyPoint 0000028F9DA8F420>, <KeyPoint 0000028F9DA8F450>, <KeyPoint 0000028F9DA8F480>, <KeyPoint 0000028F9DA8F4B0>, <KeyPoint 0000028F9DA8F4E0>, <KeyPoint 0000028F9DA8F510>, <KeyPoint 0000028F9DA8F540>), array([[0. , 0. , 0. , ..., 0.07758909, 0.07082883,
0.12130836],
[0.06322448, 0.07743386, 0.06829025, ..., 0. , 0. ,
0. ],
[0.12649968, 0.13974288, 0.07898186, ..., 0.05207909, 0.02329048,
0.02329048],
...,
[0.07124705, 0.02794539, 0. , ..., 0. , 0. ,
0.06553776],
[0.02778314, 0. , 0.02778314, ..., 0.043929 , 0.05893693,
0.06805451],
[0.13091446, 0. , 0. , ..., 0. , 0.01930229,
0.04316122]], dtype=float32))
Process finished with exit code 0
But how can I draw a conclusion from this result? How can I show the similarity on the images? If you have better advice, I would highly appreciate it. Thanks.
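One possible way to turn two descriptor sets into a single number is to count how many descriptors in one image have a clearly best match in the other (Lowe's ratio test). The sketch below does this with plain NumPy on synthetic descriptors, so it runs without image files. The names root_sift and ratio_test_score, the 0.75 threshold, and the random stand-in data are assumptions for illustration and not part of the original code; the resulting fraction is a heuristic score, not an established metric. With real images, cv2.BFMatcher().knnMatch(descs1, descs2, k=2) would normally replace the brute-force distance computation, and cv2.drawKeypoints or cv2.drawMatches can be used to display the keypoints and the surviving matches.

```python
import numpy as np

def root_sift(descs, eps=1e-7):
    """L1-normalise each descriptor row, then take the element-wise square root."""
    descs = np.asarray(descs, dtype=np.float64)
    descs = descs / (descs.sum(axis=1, keepdims=True) + eps)
    return np.sqrt(descs)

def ratio_test_score(descs_a, descs_b, ratio=0.75):
    """Fraction of descriptors in A whose nearest neighbour in B is clearly
    closer than the second nearest one (hypothetical helper)."""
    if len(descs_a) == 0 or len(descs_b) < 2:
        return 0.0
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    diff = descs_a[:, None, :] - descs_b[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))
    # Two smallest distances per row of A.
    two_smallest = np.sort(dists, axis=1)[:, :2]
    good = two_smallest[:, 0] < ratio * two_smallest[:, 1]
    return float(good.mean())

# Synthetic stand-ins for raw SIFT output (non-negative, 128 columns).
rng = np.random.default_rng(0)
raw_a = rng.random((20, 128))
raw_b = rng.random((30, 128))

score_same = ratio_test_score(root_sift(raw_a), root_sift(raw_a))
score_diff = ratio_test_score(root_sift(raw_a), root_sift(raw_b))
print(score_same, score_diff)
```

A set of descriptors compared with itself scores 1.0, because every descriptor finds itself at distance zero. Unrelated descriptors should score much lower, so the value can be compared across image pairs alongside MSE and SSIM.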
Related
np.array of np.arrays into one n-dimensional array
I have multiple .npy files and want to combine them into one n-dimensional np.array. np.load(file) has shape (n, 3). There are m files, so the result shape should be (m, n, 3).
Now I have this:
np.array([np.load(file) for file in h_files])
Output:
array([array([[ 3.40040e+00, -1.48372e-02, -6.52934e-01],
       [ 3.37660e+00, -1.53226e-02, -5.28748e-01],
       [ 3.36828e+00, -1.58727e-02, -4.08290e-01],
       ...,
       [ 3.35563e+00, -2.34267e-03, 2.89650e-01],
       [ 3.35869e+00, -2.93101e-03, 1.74017e-01],
       [ 3.36274e+00, -3.52146e-03, 5.80292e-02]], dtype=float32),
       array([[ 3.40534 , -0.00772648, -0.653887 ],
       [ 3.37169 , -0.0082386 , -0.527966 ],
       [ 3.36334 , -0.00880522, -0.407682 ],
       ...,
The result I would like to have:
array( [[[ 3.40040e+00, -1.48372e-02, -6.52934e-01],
       [ 3.37660e+00, -1.53226e-02, -5.28748e-01],
       [ 3.36828e+00, -1.58727e-02, -4.08290e-01],
       ...,
       [ 3.35563e+00, -2.34267e-03, 2.89650e-01],
       [ 3.35869e+00, -2.93101e-03, 1.74017e-01],
       [ 3.36274e+00, -3.52146e-03, 5.80292e-02]],
       [[ 3.40534 , -0.00772648, -0.653887 ],
       [ 3.37169 , -0.0082386 , -0.527966 ],
       [ 3.36334 , -0.00880522, -0.407682 ],
       ...,
I have not checked it, but you could try the following:
np.array([np.load(file).tolist() for file in h_files])
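As an alternative to the list conversion, np.stack joins equally shaped arrays along a new leading axis. The sketch below uses in-memory arrays as stand-ins for np.load(file); the names loaded and ragged and the sizes are assumptions for illustration. It also shows that np.stack refuses arrays of different shapes, which may be worth checking, since an array of arrays like the one in the question can appear when n is not the same in every file.

```python
import numpy as np

# Stand-ins for np.load(file): m = 4 arrays, each of shape (n, 3) with n = 5.
loaded = [np.full((5, 3), i, dtype=np.float32) for i in range(4)]

stacked = np.stack(loaded)   # new leading axis, shape (m, n, 3)
print(stacked.shape)

# If n differs between files, np.stack raises ValueError instead of
# building a 1-D array of arrays.
ragged = loaded[:3] + [np.zeros((6, 3), dtype=np.float32)]
try:
    np.stack(ragged)
    stack_failed = False
except ValueError:
    stack_failed = True
print(stack_failed)
```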
Dimensions of array don't match
I have a numpy array, and when I inspect it I get the output below. I expected print(feat.shape) to give (105835, 99, 13), and I expected feat to have 3 dimensions.
print(feat.ndim)
print(feat.shape)
print(feat.size)
print(feat[1].ndim)
print(feat[1].shape)
print(feat[1].size)
Output:
1
(105835,)
105835
2
(99, 13)
1287
I don't know how to fix this. feat is an MFCC feature. If I print feat, this is what I get:
array([array([[-1.0160675e+01, -1.3804866e+01, 9.1880971e-01, ..., 1.5415058e+00, 1.1875046e-02, -5.8664594e+00],
       [-9.9697800e+00, -1.3823588e+01, -7.0778362e-02, ..., 1.5948311e+00, 4.3481258e-01, -5.1646194e+00],
       [-9.9518738e+00, -1.2771760e+01, -1.2623003e-01, ..., 3.4290311e+00, 2.7361808e+00, -6.0621500e+00],
       ...,
       [-11.605266 , -7.1909204, -33.44656 , ..., -11.974911 , 12.825395 , 10.635098 ],
       [-11.769397 , -9.340318 , -34.413307 , ..., -10.077869 , 8.821722 , 7.704534 ],
       [-12.301968 , -10.67318 , -32.46104 , ..., -6.829077 , 15.29837 , 13.100596 ]], dtype=float32)], dtype=object)
The same structure can be created in a simpler way:
ain=rand(2,2)
a=ndarray(3,dtype=object)
a[:] = [ain]*3
#array([array([[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]]),
#       array([[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]]),
#       array([[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]])], dtype=object)
The problem arises because a.dtype is object. You can reconstruct your data with:
a= array(list(a))
#array([
#       [[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]],
#       [[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]],
#       [[ 0.14, 0.56],
#       [ 0.9 , 0.9 ]]])
The result will have the float type inherited from the base dtype.
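Before converting, it can help to check that every element really has the same shape, because a single odd-sized item is enough to keep the outer array at dtype=object. Here is a minimal sketch with three small stand-in items instead of the 105835 real ones; the names shapes and dense are assumptions for illustration.

```python
import numpy as np

# 1-D object array whose elements are (99, 13) float32 arrays,
# mimicking the structure reported in the question.
feat = np.empty(3, dtype=object)
for i in range(3):
    feat[i] = np.zeros((99, 13), dtype=np.float32)

print(feat.ndim, feat.shape)

# Collect the distinct element shapes; conversion only works if there is one.
shapes = {item.shape for item in feat}
print(shapes)

if len(shapes) == 1:
    dense = np.stack(list(feat))   # same idea as array(list(a)) above
    print(dense.ndim, dense.shape, dense.dtype)
```

If shapes contains more than one entry, the items would need padding or trimming to a common length before they can form a 3-dimensional array.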
Creating a Pandas rolling-window series of arrays
Suppose I have the following code:
import numpy as np
import pandas as pd
x = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
s = pd.Series(x, index=[1, 2, 3, 4, 5])
This produces the following s:
1 1.0
2 1.1
3 1.2
4 1.3
5 1.4
Now what I want to create is a rolling window of size n, but I don't want to take the mean or standard deviation of each window, I just want the arrays. So, suppose n = 3. I want a transformation that outputs the following series given the input s:
1 array([1.0, nan, nan])
2 array([1.1, 1.0, nan])
3 array([1.2, 1.1, 1.0])
4 array([1.3, 1.2, 1.1])
5 array([1.4, 1.3, 1.2])
How do I do this?
Here's one way to do it:
In [294]: arr = [s.shift(x).values[::-1][:3] for x in range(len(s))[::-1]]
In [295]: arr
Out[295]:
[array([ 1., nan, nan]),
 array([ 1.1, 1. , nan]),
 array([ 1.2, 1.1, 1. ]),
 array([ 1.3, 1.2, 1.1]),
 array([ 1.4, 1.3, 1.2])]
In [296]: pd.Series(arr, index=s.index)
Out[296]:
1 [1.0, nan, nan]
2 [1.1, 1.0, nan]
3 [1.2, 1.1, 1.0]
4 [1.3, 1.2, 1.1]
5 [1.4, 1.3, 1.2]
dtype: object
Here's a vectorized approach using NumPy broadcasting:
n = 3 # window length
idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
out = s.get_values()[idx]
out[idx<0] = np.nan
This gets you the output as a 2D array. (Series.get_values() was removed in pandas 1.0; on current versions use s.to_numpy() instead.)
To get a series with each element holding each window as a list:
In [40]: pd.Series(out.tolist())
Out[40]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
If you wish to have a list of arrays, one per window, you can use np.split on the output, like so:
out_split = np.split(out,out.shape[0],axis=0)
Sample run:
In [100]: s
Out[100]:
1 1.0
2 1.1
3 1.2
4 1.3
5 1.4
dtype: float64
In [101]: n = 3
In [102]: idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
     ...: out = s.get_values()[idx]
     ...: out[idx<0] = np.nan
     ...:
In [103]: out
Out[103]:
array([[ 1. , nan, nan],
       [ 1.1, 1. , nan],
       [ 1.2, 1.1, 1. ],
       [ 1.3, 1.2, 1.1],
       [ 1.4, 1.3, 1.2]])
In [104]: np.split(out,out.shape[0],axis=0)
Out[104]:
[array([[ 1., nan, nan]]),
 array([[ 1.1, 1. , nan]]),
 array([[ 1.2, 1.1, 1. ]]),
 array([[ 1.3, 1.2, 1.1]]),
 array([[ 1.4, 1.3, 1.2]])]
Memory-efficiency with strides
For memory efficiency, we can use a strided version, strided_axis0, similar to @B. M.'s solution but a bit more generic. So, to get a 2D array of values with NaNs preceding the first element:
In [35]: strided_axis0(s.values, fillval=np.nan, L=3)
Out[35]:
array([[nan, nan, 1. ],
       [nan, 1. , 1.1],
       [1. , 1.1, 1.2],
       [1.1, 1.2, 1.3],
       [1.2, 1.3, 1.4]])
To get a 2D array of values with NaNs as fillers coming after the original elements in each row and the order of elements flipped, as stated in the problem:
In [36]: strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1]
Out[36]:
array([[1. , nan, nan],
       [1.1, 1. , nan],
       [1.2, 1.1, 1. ],
       [1.3, 1.2, 1.1],
       [1.4, 1.3, 1.2]])
To get a series with each element holding each window as a list, simply wrap the earlier methods with pd.Series(out.tolist()), with out being the 2D array output:
In [38]: pd.Series(strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1].tolist())
Out[38]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
Your data look like a strided array:
data=np.lib.stride_tricks.as_strided(np.concatenate(([NaN]*2,s))[2:],(5,3),(8,-8))
"""
array([[ 1. , nan, nan],
       [ 1.1, 1. , nan],
       [ 1.2, 1.1, 1. ],
       [ 1.3, 1.2, 1.1],
       [ 1.4, 1.3, 1.2]])
"""
Then transform it into a Series:
pd.Series(map(list,data))
"""
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
"""
If you attach the missing NaNs at the beginning and the end of the series, you can use a simple window:
def wndw(s,size=3):
    stretched = np.hstack([
        np.array([np.nan]*(size-1)),
        s.values.T,
        np.array([np.nan]*size)
    ])
    for begin in range(len(stretched)-size):
        end = begin+size
        yield stretched[begin:end][::-1]

for arr in wndw(s, 3):
    print(arr)
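On NumPy 1.20 or later, the same windows can also be built with numpy.lib.stride_tricks.sliding_window_view, which avoids computing strides by hand. This is a sketch rather than one of the original answers; the helper name reversed_windows is an assumption for illustration, and it works on the plain array x so that it runs without pandas.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def reversed_windows(values, n):
    """Return one row per element: the element followed by its n-1
    predecessors, with NaN where no predecessor exists."""
    values = np.asarray(values, dtype=float)
    # Pad n-1 NaNs in front so that the first window ends at values[0].
    padded = np.concatenate([np.full(n - 1, np.nan), values])
    windows = sliding_window_view(padded, n)   # oldest value first
    return windows[:, ::-1]                    # newest value first

x = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
out = reversed_windows(x, 3)
print(out)
```

The rows of out can then be wrapped as in the answers above, for example with pd.Series(out.tolist(), index=s.index).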
Working with multiple columns from a data file
I have a file in which I need to use the first column. The remaining columns need to be integrated with respect to the first. Let's say my file looks like this:
100 1.0 1.1 1.2 1.3 0.9
110 1.8 1.9 2.0 2.1 2.2
120 1.8 1.9 2.0 2.1 2.2
130 2.0 2.1 2.3 2.4 2.5
Could I write a piece of code that takes the second column and integrates it with respect to the first, then takes the third and integrates it with respect to the first, and so on? For my code I have:
import scipy as sp
first_col=dat[:,0] #first column from data file
cols=dat[:,1:] #other columns from data file
col2 = cols[:,0] # gets the first column from variable cols
I = sp.integrate.cumtrapz(col2, first_col, initial = 0) #integration step
This works only for the first column of the variable cols. However, I don't want to write this out for all the other columns; it would look disgusting (the thought of it makes me shiver). I have seen similar questions but haven't been able to relate the answers to mine, and the ones that are more or less the same have vague answers. Any ideas?
The function cumtrapz accepts an axis argument. For example, suppose you put your first column in x and the remaining columns in y, and they have these values:
In [61]: x
Out[61]: array([100, 110, 120, 130])
In [62]: y
Out[62]:
array([[ 1.1, 2.1, 2. , 1.1, 1.1],
       [ 2. , 2.1, 1. , 1.2, 2.1],
       [ 1.2, 1. , 1.1, 1. , 1.2],
       [ 2. , 1.1, 1.2, 2. , 1.2]])
You can integrate each column of y with respect to x as follows:
In [63]: cumtrapz(y, x=x, axis=0, initial=0)
Out[63]:
array([[ 0. , 0. , 0. , 0. , 0. ],
       [ 15.5, 21. , 15. , 11.5, 16. ],
       [ 31.5, 36.5, 25.5, 22.5, 32.5],
       [ 47.5, 47. , 37. , 37.5, 44.5]])
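To see what the axis=0 call computes, here is a hand-rolled NumPy equivalent of the cumulative trapezoid rule applied to every column at once. The function name cumulative_trapezoid_columns is an assumption for illustration and is not a SciPy function; it reproduces the numbers shown above for the same x and y. Note that newer SciPy releases expose cumtrapz under the name scipy.integrate.cumulative_trapezoid.

```python
import numpy as np

def cumulative_trapezoid_columns(y, x):
    """Cumulative trapezoid integral of each column of y with respect to x,
    starting at 0 (like cumtrapz(y, x=x, axis=0, initial=0))."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)[:, None]                 # step widths, shape (rows-1, 1)
    areas = 0.5 * (y[1:] + y[:-1]) * dx      # area of each trapezoid
    out = np.zeros_like(y)
    out[1:] = np.cumsum(areas, axis=0)       # running total down each column
    return out

x = np.array([100, 110, 120, 130])
y = np.array([[1.1, 2.1, 2.0, 1.1, 1.1],
              [2.0, 2.1, 1.0, 1.2, 2.1],
              [1.2, 1.0, 1.1, 1.0, 1.2],
              [2.0, 1.1, 1.2, 2.0, 1.2]])

result = cumulative_trapezoid_columns(y, x)
print(result)
```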
Python numpy broadcasting 3 dimensions (multiple weighted sums)
I've become sort of used to broadcasting with 2-dimensional arrays, but I can't get my head around this 3-dimensional thing I want to do. I have two 2-dimensional arrays:
>>> a = np.array([[0.01,.2,.3,.4],[.2,.03,.4,.5],[.9,.8,.7,.06]])
>>> b= np.array([[1,2,3],[3.,4,5]])
>>> a
array([[ 0.01, 0.2 , 0.3 , 0.4 ],
       [ 0.2 , 0.03, 0.4 , 0.5 ],
       [ 0.9 , 0.8 , 0.7 , 0.06]])
>>> b
array([[ 1., 2., 3.],
       [ 3., 4., 5.]])
Now, what I want is the sum of all rows in a, where each row is weighted by the column values in b. So, I want 1. * a[0,:] + 2. * a[1,:] + 3. * a[2,:] and the same for the second row of b. I know how to do this step by step:
>>> (np.array([b[0]]).T * a).sum(0)
array([ 3.11, 2.66, 3.2 , 1.58])
>>> (np.array([b[1]]).T * a).sum(0)
array([ 5.33, 4.72, 6. , 3.5 ])
But I have the feeling that if I knew how to broadcast the two correctly as 3-dimensional arrays, I could get the result I want in one go. The result being:
array([[ 3.11, 2.66, 3.2 , 1.58],
       [ 5.33, 4.72, 6. , 3.5 ]])
I guess this shouldn't be too hard..?!?
You want to do matrix multiplication:
>>> b.dot(a)
array([[ 3.11, 2.66, 3.2 , 1.58],
       [ 5.33, 4.72, 6. , 3.5 ]])
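For completeness, the 3-dimensional broadcast the question asks about can be written out explicitly and compared with the matrix product. This is only a sketch to show the equivalence; the variable names via_broadcast and via_dot are assumptions for illustration.

```python
import numpy as np

a = np.array([[0.01, .2, .3, .4], [.2, .03, .4, .5], [.9, .8, .7, .06]])
b = np.array([[1, 2, 3], [3., 4, 5]])

# b[:, :, None] has shape (2, 3, 1) and a[None, :, :] has shape (1, 3, 4);
# broadcasting gives (2, 3, 4), and summing over axis 1 adds the weighted rows.
via_broadcast = (b[:, :, None] * a[None, :, :]).sum(axis=1)

# The matrix product gives the same numbers without the (2, 3, 4) intermediate.
via_dot = b.dot(a)

print(via_broadcast)
print(np.allclose(via_broadcast, via_dot))
```

The broadcast version builds the full intermediate array before summing, so b.dot(a) is usually the better choice for large inputs.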