I am working with a 4-D array that is the input to a CNN. The input array has the following shape:
print('X_train shape: ', X_train.shape)
X_train shape: (47204, 1, 100, 4)
Data description:
The input data consists of 47204 instances (fixed-length segments, as the CNN requires). Each instance has shape (1, 100, 4): one segment containing 100 GPS points, with 4 kinematic features (max_speed, avg_speed, max_acc, avg_acc) stored for each point. Labels are stored in a separate y_train array of shape (47204,) with 5 classes [0..4].
print(y_train)
[3 3 0 ... 2 3 4]
To get a better sense of my X_train array, I show the first 3 elements below:
print(X_train[:3])
[
[[[ 3.82280987e+00 2.16802350e-01 7.49917451e-02 3.44416369e-04]
[ 3.38707371e+00 2.02210055e-01 1.61751110e-03 1.93745950e-03]
[ 2.49202215e+00 1.60605262e-01 8.43561351e-03 2.40057917e-03]
...
[ 2.00022316e+00 2.70020923e-01 5.40441673e-02 3.57212151e-03]
[ 3.25199744e-01 9.06990382e-02 1.46808316e-02 1.65841315e-03]
[2.96587589e-01 0.00000000e+00 6.13293351e-04 4.16518187e-03]]]
[[[ 1.07209176e+00 7.27038312e-02 6.62777026e-03 2.04611951e-04]
[ 1.06194285e+00 5.05005456e-02 4.05676569e-03 3.72293433e-04]
[ 1.02849748e+00 2.12558178e-02 2.95477005e-03 5.56584054e-04]
...
[ 4.51962909e-03 5.63125736e-04 5.98474074e-04 1.63036715e-05]
[ 2.83026181e-03 2.35855075e-03 1.25789358e-03 2.15331510e-06]
[8.49078543e-03 2.16840434e-19 9.43423077e-04 1.29198906e-05]]]
[[[ 7.51127665e+00 3.14033478e-01 6.85170617e-02 7.73415075e-04]
[ 7.42307262e+00 1.33868251e-01 4.10564823e-02 1.16131460e-03]
[ 7.35818066e+00 1.23886976e-02 3.02312582e-02 1.28312101e-03]
...
[ 7.40826167e+00 1.19388656e-01 4.00874715e-02 2.04909489e-04]
[ 7.23779176e+00 1.33269965e-01 1.20430502e-02 1.58195900e-04]
[ 7.11697001e+00 4.68002105e-02 5.42478400e-02 3.58101318e-05]]]
]
Task:
I need to build a machine learning model (e.g. a random forest) using the 4 kinematics (max_speed, avg_speed, max_acc, avg_acc) as features. This requires going through each instance and extracting these features for its 100 points.
The number of samples will then be 4720400 (i.e. 47204 x 100), and each sample must be matched to the label of the instance it came from, so y_train will then have shape (4720400,).
The expected input would then look like:
max_speed avg_speed max_acc avg_acc class
0 3.82280987e+00 2.16802350e-01 7.49917451e-02 3.44416369e-04 3
1 3.38707371e+00 2.02210055e-01 1.61751110e-03 1.93745950e-03 3
2 2.49202215e+00 1.60605262e-01 8.43561351e-03 2.40057917e-03 3
...
I have been thinking about how to do this all week, but every idea has evaporated. How can I do this, please?
You can reshape your X_train array from (47204, 1, 100, 4) to (4720400, 4) simply with:
X_train_reshaped = X_train.reshape(4720400, 4)
It preserves the data order and the total number of elements will be the same.
Similarly, you can expand the y_train array using numpy.repeat:
y_train_reshaped = numpy.repeat(y_train, 100)
Note the 100 passed to repeat. Since there is one label per 100 data points, each label is repeated 100 times. This also preserves the data order, so every point keeps the label of its original instance.
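As a quick check of the two commands above, here is a minimal sketch on a small made-up array (3 segments of 5 points instead of 47204 segments of 100; the sizes and the names X_small and y_small are assumptions for illustration only). It shows that the flattened rows stay in segment order and that each row carries its segment's label.

```python
import numpy as np

n_segments, n_points, n_features = 3, 5, 4   # small stand-ins for 47204, 100, 4

# Made-up data with the same layout as X_train: (segments, 1, points, features)
X_small = np.arange(n_segments * n_points * n_features, dtype=float).reshape(
    n_segments, 1, n_points, n_features)
y_small = np.array([3, 0, 2])                # one label per segment

# -1 lets NumPy infer the number of rows (segments * points)
X_flat = X_small.reshape(-1, n_features)
# Repeat each label once per point so labels stay aligned with the rows
y_flat = np.repeat(y_small, n_points)

print(X_flat.shape, y_flat.shape)
```

The flattened pair can then be passed to any estimator that expects a 2-D feature matrix and a 1-D label vector, such as a random forest.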
I have a 4D array with shape (4, 320, 528, 279), which is in fact a data set of four 3D image stacks.
What I am trying to achieve is to normalize each pixel of each 3D image across all the samples. So let's say the first pixel values with coordinates (0,0,0) in the four images are [140., 20., 10., 220.]. I would like to change those values so that they become: [0.61904762, 0.04761905, 0., 1.].
I wrote a script that supposedly achieves this :
def NormalizeMatrix(mat):
    mat = np.array(mat)
    sink = mat.copy()
    for i in np.arange(mat.shape[1]):
        for j in np.arange(mat.shape[2]):
            for k in np.arange(mat.shape[3]):
                PixelValues = mat[:, i, j, k]
                Min = float(PixelValues.min())
                Max = float(PixelValues.max())
                if Max - Min != 0.:
                    sink[:, i, j, k] = (PixelValues - Min) / (Max - Min)
                else:
                    sink[:, i, j, k] = np.full_like(PixelValues, 0.)
    return sink
But this is really REALLY slow !
How can I make this faster ?
Any ideas ?
Tom
I think I found a pretty fast way in the end, which is along the lines of what user3483203 suggested:
def NormalizeMatrix(mat):
    mat = np.array(mat)
    minMat = np.min(mat, axis=0, keepdims=True)
    maxMat = np.max(mat, axis=0, keepdims=True)
    # Note: pixels with maxMat == minMat divide by zero here (the loop version set them to 0)
    sink = (mat - minMat) / (maxMat - minMat)
    return sink
This takes 5-10s instead of hours on my machine :)
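One difference from the original loop is worth handling: a pixel that has the same value in every sample gives a zero range. The sketch below is one possible way to keep the vectorized speed while mapping such pixels to 0, as the loop version did. The function name normalize_across_samples and the tiny test stack are assumptions for illustration.

```python
import numpy as np

def normalize_across_samples(mat):
    """Min-max normalize each pixel across axis 0 (the sample axis)."""
    mat = np.asarray(mat, dtype=float)
    min_mat = mat.min(axis=0, keepdims=True)
    max_mat = mat.max(axis=0, keepdims=True)
    span = max_mat - min_mat
    out = np.zeros_like(mat)
    # Divide only where the pixel actually varies; constant pixels stay 0
    np.divide(mat - min_mat, span, out=out, where=span != 0)
    return out

# Tiny made-up stack: 4 samples of a 2 x 2 x 2 volume
stack = np.zeros((4, 2, 2, 2))
stack[:, 0, 0, 0] = [140., 20., 10., 220.]
result = normalize_across_samples(stack)
print(result[:, 0, 0, 0])
```

Converting to float first also avoids truncation when the input stack has an integer dtype.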
I am using Python with OpenCV 3.4.
I have a system composed of 2 cameras that I want to use to track an object and get its trajectory, and then its speed.
I am currently able to calibrate each of my cameras intrinsically and extrinsically. I can track my object through the video and get its 2D coordinates in the image plane.
My problem now is that I would like to project the points from both 2D image planes into 3D points.
I've tried functions such as triangulatePoints, but it doesn't seem to work properly.
Here is my current function to get the 3D coordinates. It returns coordinates that seem a little bit off compared to the actual coordinates.
def get_3d_coord(left_two_d_coords, right_two_d_coords):
    pt1 = left_two_d_coords.reshape((len(left_two_d_coords), 1, 2))
    pt2 = right_two_d_coords.reshape((len(right_two_d_coords), 1, 2))
    extrinsic_left_camera_matrix, left_distortion_coeffs, extrinsic_left_rotation_vector, \
        extrinsic_left_translation_vector = trajectory_utils.get_extrinsic_parameters(1)
    extrinsic_right_camera_matrix, right_distortion_coeffs, extrinsic_right_rotation_vector, \
        extrinsic_right_translation_vector = trajectory_utils.get_extrinsic_parameters(2)
    # returns arrays of the same size
    (pt1, pt2) = correspondingPoints(pt1, pt2)
    projection1 = computeProjMat(extrinsic_left_camera_matrix,
                                 extrinsic_left_rotation_vector, extrinsic_left_translation_vector)
    projection2 = computeProjMat(extrinsic_right_camera_matrix,
                                 extrinsic_right_rotation_vector, extrinsic_right_translation_vector)
    out = cv2.triangulatePoints(projection1, projection2, pt1, pt2)
    oc = []
    for idx, elem in enumerate(out[0]):
        oc.append((out[0][idx], out[1][idx], out[2][idx], out[3][idx]))
    oc = np.array(oc, dtype=np.float32)
    point3D = []
    for idx, elem in enumerate(oc):
        W = out[3][idx]
        obj = [None] * 4
        obj[0] = out[0][idx] / W
        obj[1] = out[1][idx] / W
        obj[2] = out[2][idx] / W
        obj[3] = 1
        pt3d = [obj[0], obj[1], obj[2]]
        point3D.append(pt3d)
    return point3D
Here are some screenshots of the 2D trajectories that I get for both cameras:
Here are some screenshots of the 3D trajectory that we get for the same camera.
As you can see, the 2D trajectory doesn't look like the 3D one, and I am not able to get an accurate distance between two points.
I would just like to get real-world coordinates, i.e. to know the (almost) exact distance walked by a person, even on a curved road.
EDIT to add reference data and examples
Here are some examples and input data to reproduce the problem.
First, here are some data.
2D points for camera1
546,357
646,351
767,357
879,353
986,360
1079,365
1152,364
corresponding 2D for camera2
236,305
313,302
414,308
532,308
647,314
752,320
851,323
3D points that we get from triangulatePoints
"[0.15245444, 0.30141047, 0.5444277]"
"[0.33479974, 0.6477136, 0.25396818]"
"[0.6559921, 1.0416716, -0.2717265]"
"[1.1381898, 1.5703914, -0.87318224]"
"[1.7568599, 1.9649554, -1.5008119]"
"[2.406788, 2.302272, -2.0778883]"
"[3.078426, 2.6655817, -2.6113863]"
In the following images, we can see the 2D trajectory (top line) and the 3D points reprojected into 2D (bottom line). Colors alternate to show which 3D point corresponds to which 2D point.
And finally here are some data to reproduce.
camera 1 : camera matrix
5.462001610064596662e+02 0.000000000000000000e+00 6.382260289544193483e+02
0.000000000000000000e+00 5.195528638702176067e+02 3.722480290221320161e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
camera 2 : camera matrix
4.302353276501239066e+02 0.000000000000000000e+00 6.442674231451971991e+02
0.000000000000000000e+00 4.064124751062329324e+02 3.730721752718034736e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
camera 1 : distortion vector
-1.039009381799949928e-02 -6.875769941694849507e-02 5.573643708806085006e-02 -7.298826373638074051e-04 2.195279856716004369e-02
camera 2 : distortion vector
-8.089289768586239993e-02 6.376634681503455396e-04 2.803641672679824115e-02 7.852965318823987989e-03 1.390248981867302919e-03
camera 1 : rotation vector
1.643658457134109296e+00
-9.626823326237364531e-02
1.019865700311696488e-01
camera 2 : rotation vector
1.698451227150894471e+00
-4.734769748661146055e-02
5.868343803315514279e-02
camera 1 : translation vector
-5.004031689969588026e-01
9.358682517577661120e-01
2.317689087311113116e+00
camera 2 : translation vector
-4.225788801112133619e+00
9.519952012307866251e-01
2.419197507326224184e+00
camera 1 : object points
0 0 0
0 3 0
0.5 0 0
0.5 3 0
1 0 0
1 3 0
1.5 0 0
1.5 3 0
2 0 0
2 3 0
camera 2 : object points
4 0 0
4 3 0
4.5 0 0
4.5 3 0
5 0 0
5 3 0
5.5 0 0
5.5 3 0
6 0 0
6 3 0
camera 1 : image points
5.180000000000000000e+02 5.920000000000000000e+02
5.480000000000000000e+02 4.410000000000000000e+02
6.360000000000000000e+02 5.910000000000000000e+02
6.020000000000000000e+02 4.420000000000000000e+02
7.520000000000000000e+02 5.860000000000000000e+02
6.500000000000000000e+02 4.430000000000000000e+02
8.620000000000000000e+02 5.770000000000000000e+02
7.000000000000000000e+02 4.430000000000000000e+02
9.600000000000000000e+02 5.670000000000000000e+02
7.460000000000000000e+02 4.430000000000000000e+02
camera 2 : image points
6.080000000000000000e+02 5.210000000000000000e+02
6.080000000000000000e+02 4.130000000000000000e+02
7.020000000000000000e+02 5.250000000000000000e+02
6.560000000000000000e+02 4.140000000000000000e+02
7.650000000000000000e+02 5.210000000000000000e+02
6.840000000000000000e+02 4.150000000000000000e+02
8.400000000000000000e+02 5.190000000000000000e+02
7.260000000000000000e+02 4.160000000000000000e+02
9.120000000000000000e+02 5.140000000000000000e+02
7.600000000000000000e+02 4.170000000000000000e+02
Assuming both of your cameras have a resolution of 1280x720, I calculated the left camera's rotation and translation.
import cv2
import numpy as np

left_obj = np.array([[
[0, 0, 0],
[0, 3, 0],
[0.5, 0, 0],
[0.5, 3, 0],
[1, 0, 0],
[1, 3, 0],
[1.5, 0, 0],
[1.5, 3, 0],
[2, 0, 0],
[2, 3, 0]
]], dtype=np.float32)
left_img = np.array([[
[5.180000000000000000e+02, 5.920000000000000000e+02],
[5.480000000000000000e+02, 4.410000000000000000e+02],
[6.360000000000000000e+02, 5.910000000000000000e+02],
[6.020000000000000000e+02, 4.420000000000000000e+02],
[7.520000000000000000e+02, 5.860000000000000000e+02],
[6.500000000000000000e+02, 4.430000000000000000e+02],
[8.620000000000000000e+02, 5.770000000000000000e+02],
[7.000000000000000000e+02, 4.430000000000000000e+02],
[9.600000000000000000e+02, 5.670000000000000000e+02],
[7.460000000000000000e+02, 4.430000000000000000e+02]
]], dtype=np.float32)
left_camera_matrix = np.array([
[4.777926320579549042e+02, 0.000000000000000000e+00, 5.609694925007885331e+02],
[0.000000000000000000e+00, 2.687583555325996372e+02, 5.712247987054799978e+02],
[0.000000000000000000e+00, 0.000000000000000000e+00, 1.000000000000000000e+00]
])
left_distortion_coeffs = np.array([
-8.332059138465927606e-02,
-1.402986394998156472e+00,
2.843132503678651168e-02,
7.633417606366312003e-02,
1.191317644548635979e+00
])
ret, left_camera_matrix, left_distortion_coeffs, rot, trans = cv2.calibrateCamera(
    left_obj, left_img, (1280, 720),
    left_camera_matrix, left_distortion_coeffs, None, None, cv2.CALIB_USE_INTRINSIC_GUESS)
print(rot[0])
print(trans[0])
I got different results:
[[ 2.7262137 ] [-0.19060341] [-0.30345874]]
[[-0.48068581] [ 0.75257108] [ 1.80413094]]
The same for right camera:
[[ 2.1952522 ] [ 0.20281459] [-0.46649734]]
[[-2.96484428] [-0.0906817 ] [ 3.84203022]]
You can check the rotations approximately this way: compute the relative rotation between the computed results and compare it against the relative rotation between the real camera positions. For the translations, compute the normalized relative translation vector between the computed results and compare it against the normalized relative translation between the real camera positions. The coordinate system OpenCV uses is depicted here.
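The check described above can be sketched as follows, under the assumption that both rotation/translation pairs are world-to-camera extrinsics expressed in the same world frame. The helper names rodrigues and relative_pose are made up for this illustration; the Rodrigues conversion is written out in NumPy so the snippet runs without OpenCV (cv2.Rodrigues performs the same conversion). The input values are the left and right results printed above.

```python
import numpy as np

def rodrigues(rvec):
    """Convert a rotation vector (axis * angle) to a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float).ravel()
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def relative_pose(rvec1, tvec1, rvec2, tvec2):
    """Pose of camera 2 relative to camera 1 (world-to-camera extrinsics assumed)."""
    R1, R2 = rodrigues(rvec1), rodrigues(rvec2)
    t1 = np.asarray(tvec1, dtype=float).ravel()
    t2 = np.asarray(tvec2, dtype=float).ravel()
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    # Rotation angle of R_rel in degrees, from its trace
    angle = np.degrees(np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)))
    return R_rel, t_rel / np.linalg.norm(t_rel), angle

# Results printed above for the left and right cameras
R_rel, t_dir, angle_deg = relative_pose(
    [2.7262137, -0.19060341, -0.30345874], [-0.48068581, 0.75257108, 1.80413094],
    [2.1952522, 0.20281459, -0.46649734], [-2.96484428, -0.0906817, 3.84203022])
print(angle_deg)
print(t_dir)
```

The printed angle and direction can then be compared with how the two cameras are physically mounted relative to each other.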
I am using MXNet on the Iris dataset, which has 4 features and classifies flowers as 'setosa', 'versicolor' or 'virginica'. My training data has 89 rows. My label data is a row vector of 89 columns. I encoded the flower names as the numbers 0, 1, 2, as it seems mx.io.NDArrayIter does not accept a numpy ndarray with string values. Then I tried to predict using
re = mod.predict(test_iter)
I get a result which has the shape 14 * 10.
Why am I getting 10 columns when I have only 3 labels, and how do I map these results to my labels? The result of predict is shown below:
[[ 0.11760861 0.12082944 0.1207106 0.09154381 0.09155304 0.09155869
   0.09154817 0.09155204 0.09154914 0.09154641]
 [ 0.1176083 0.12082954 0.12071151 0.09154379 0.09155323 0.09155825
   0.0915481 0.09155164 0.09154923 0.09154641]
 [ 0.11760829 0.1208293 0.12071083 0.09154385 0.09155313 0.09155875
   0.09154838 0.09155186 0.09154932 0.09154625]
 [ 0.11760861 0.12082901 0.12071037 0.09154388 0.09155303 0.09155875
   0.09154829 0.09155209 0.09154959 0.09154641]
 [ 0.11760896 0.12082863 0.12070955 0.09154405 0.09155299 0.09155875
   0.09154839 0.09155225 0.09154996 0.09154646]
 [ 0.1176089 0.1208287 0.1207095 0.09154407 0.09155297 0.09155882
   0.09154844 0.09155232 0.09154989 0.0915464 ]
 [ 0.11760896 0.12082864 0.12070941 0.09154408 0.09155297 0.09155882
   0.09154844 0.09155234 0.09154993 0.09154642]
 [ 0.1176088 0.12082874 0.12070983 0.09154399 0.09155302 0.09155872
   0.09154837 0.09155215 0.09154984 0.09154641]
 [ 0.11760852 0.12082904 0.12071032 0.09154394 0.09155304 0.09155876
   0.09154835 0.09155209 0.09154959 0.09154631]
 [ 0.11760963 0.12082832 0.12070873 0.09154428 0.09155257 0.09155893
   0.09154856 0.09155177 0.09155051 0.09154671]
 [ 0.11760966 0.12082829 0.12070868 0.09154429 0.09155258 0.09155892
   0.09154858 0.0915518 0.09155052 0.09154672]
 [ 0.11760949 0.1208282 0.12070852 0.09154446 0.09155259 0.09155893
   0.09154854 0.09155205 0.0915506 0.09154666]
 [ 0.11760952 0.12082817 0.12070853 0.0915444 0.09155261 0.09155891
   0.09154853 0.09155206 0.09155057 0.09154668]
 [ 0.1176096 0.1208283 0.12070892 0.09154423 0.09155267 0.09155882
   0.09154859 0.09155172 0.09155044 0.09154676]]
If you use "y = mod.predict(val_iter, num_batch=1)" instead of "y = mod.predict(val_iter)", you get the predictions for only one batch. For example, if your batch_size is 10, you will get only 10 predictions.
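On the second part of the question (mapping results to labels), each row of the prediction holds one score per output unit, so a common approach is to take the index of the largest score in each row. The sketch below uses made-up probabilities and assumes the network's last layer has exactly 3 outputs, in the same order as the 0, 1, 2 encoding. Getting 10 columns would then suggest that the final layer was defined with 10 outputs rather than 3, but that is an assumption, since the network definition is not shown.

```python
import numpy as np

# Order must match the 0, 1, 2 encoding used for the training labels
label_names = ['setosa', 'versicolor', 'virginica']

# Made-up scores standing in for the prediction output converted to NumPy,
# assuming an output layer with exactly 3 units
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3]])

class_ids = probs.argmax(axis=1)                 # index of the highest score per row
predicted = [label_names[i] for i in class_ids]  # map indices back to names
print(predicted)
```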
I want to translate the following group-coloring Octave function to Python and use it with pyplot.
Function input:
x - Data matrix (m x n)
a - A parameter.
index - A vector of size "m" with values in range [: a]
(For example, if a = 4, index can be [random.choice(range(4)) for i in range(m)].)
The i-th value in "index" is the number of the group that the i-th data point belongs to.
The function should plot all the data points from x and color them in different colors (Number of different colors is "a").
The function in octave:
p = hsv(a); % This is an a x 3 matrix
colors = p(index, :); % ****This is an m x 3 matrix****
scatter(X(:,1), X(:,2), 10, colors);
I couldn't find a function like hsv in python, so I wrote it myself (I think I did..):
p = colors.hsv_to_rgb(numpy.column_stack((
numpy.linspace(0, 1, a), numpy.ones((a ,2)) )) )
But I can't figure out how to do the matrix selection p(index, :) in python (numpy).
Especially because the size of "index" is bigger than "a".
Thanks in advance for your help.
So, you want to take an m x 3 of HSV values, and convert each row to RGB?
import numpy as np
import colorsys
mymatrix = np.matrix([[11,12,13],
[21,22,23],
[31,32,33]])
def to_hsv(x):
return colorsys.rgb_to_hsv(*x)
# Apply the to_hsv function to each matrix row.
print(np.apply_along_axis(to_hsv, axis=1, arr=mymatrix))
Under Python 2, where dividing integers truncates, this produces:
[[ 0.5 0. 13. ]
[ 0.5 0. 23. ]
[ 0.5 0. 33. ]]
Following up on your comment:
If I understand correctly, you have a matrix p that is a x 3, and you want to randomly select rows from it over and over again until you have a new matrix that is m x 3?
OK. Let's say you have a matrix p defined as follows:
a = 5
p = np.random.randint(5, size=(a, 3))
Now, make a list of m random integers in the range 0 to a-1 (indices start at 0 and end at a-1, so 0 to 4 here):
m = 20
index = np.random.randint(a, size=m)
Now access the right indexes and plug them into a new matrix:
p_prime = np.matrix([p[i] for i in index])
Produces a 20 x 3 matrix.
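The same selection can also be written with NumPy integer-array indexing, which is the closest equivalent of Octave's p(index, :), keeping in mind that Octave indices start at 1 and NumPy indices at 0. The sketch below is an illustration under assumptions: the palette is built with the standard-library colorsys module using hues k / a (which, as far as I know, mirrors how Octave's hsv spaces its hues), and the group indices are random.

```python
import colorsys
import numpy as np

a = 4    # number of groups
m = 10   # number of data points

# a x 3 palette of RGB rows with evenly spaced hues
p = np.array([colorsys.hsv_to_rgb(k / a, 1.0, 1.0) for k in range(a)])

# Group number (0 .. a-1) of each data point; random here for illustration
rng = np.random.default_rng(0)
index = rng.integers(0, a, size=m)

# Integer-array indexing: row i of colors is p[index[i]], so index may be longer than p
colors = p[index]
print(colors.shape)
```

The resulting m x 3 array can be passed as the c argument of pyplot.scatter to color each point by its group.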