Tensorflow vs Numpy math functions - python
Is there any real difference between the math functions provided by NumPy and TensorFlow, for example the exponential function or the max function?
The only difference I noticed is that TensorFlow takes tensors as input, not NumPy arrays.
Is this the only difference, or do the functions also differ in the values they return?
As has been mentioned, there is the performance difference. TensorFlow has the advantage that it has been designed to work on both CPUs and GPUs, so if you have a CUDA-enabled GPU, chances are TensorFlow is going to be much faster. You can find several benchmarks on the web with different comparisons, including against other packages such as Numba or Theano.
However, I think that you are talking about whether NumPy and TensorFlow operations are exactly equivalent. The answer is basically yes, that is, the meaning of the operations is the same. However, since they are completely separate libraries with different implementations for everything, you will find small differences in the results. Take this code, for example (TensorFlow 1.2.0, NumPy 1.13.1):
# Force TensorFlow to run on CPU only
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
import numpy as np
import tensorflow as tf
# float32 NumPy array
a = np.arange(100, dtype=np.float32)
# The same array with the same dtype in TensorFlow
a_tf = tf.constant(a, dtype=tf.float32)
# Square root with NumPy
sqrt = np.sqrt(a)
# Square root with TensorFlow
with tf.Session() as sess:
    sqrt_tf = sess.run(tf.sqrt(a_tf))
You would expect to get pretty much the same output from both; a square root doesn't sound like an extremely complex operation, after all. However, printing these arrays on my computer I get:
print(sqrt)
>>> array([ 0. , 1. , 1.41421354, 1.73205078, 2. ,
2.23606801, 2.44948983, 2.64575124, 2.82842708, 3. ,
3.1622777 , 3.31662488, 3.46410155, 3.60555124, 3.7416575 ,
3.87298346, 4. , 4.12310553, 4.2426405 , 4.35889912,
4.47213602, 4.5825758 , 4.69041586, 4.79583168, 4.89897966,
5. , 5.09901953, 5.19615221, 5.29150248, 5.38516474,
5.47722578, 5.56776428, 5.65685415, 5.74456263, 5.83095169,
5.91608 , 6. , 6.08276272, 6.16441393, 6.24499798,
6.3245554 , 6.40312433, 6.48074055, 6.55743837, 6.63324976,
6.70820379, 6.78233004, 6.85565472, 6.92820311, 7. ,
7.07106781, 7.14142847, 7.21110249, 7.28010988, 7.34846926,
7.41619825, 7.48331499, 7.54983425, 7.6157732 , 7.68114567,
7.74596691, 7.81024981, 7.8740077 , 7.93725395, 8. ,
8.06225777, 8.1240387 , 8.18535233, 8.24621105, 8.30662346,
8.36660004, 8.42614937, 8.48528099, 8.54400349, 8.60232544,
8.66025448, 8.71779823, 8.77496433, 8.83176041, 8.88819408,
8.94427204, 9. , 9.05538559, 9.11043358, 9.1651516 ,
9.21954441, 9.2736187 , 9.32737923, 9.38083172, 9.43398094,
9.48683262, 9.53939247, 9.59166336, 9.64365101, 9.69536018,
9.7467947 , 9.79795933, 9.84885788, 9.89949512, 9.94987392], dtype=float32)
print(sqrt_tf)
>>> array([ 0. , 0.99999994, 1.41421342, 1.73205078, 1.99999988,
2.23606801, 2.44948959, 2.64575124, 2.82842684, 2.99999976,
3.1622777 , 3.31662488, 3.46410155, 3.60555077, 3.74165726,
3.87298322, 3.99999976, 4.12310553, 4.2426405 , 4.35889864,
4.47213602, 4.58257532, 4.69041538, 4.79583073, 4.89897919,
5. , 5.09901857, 5.19615221, 5.29150248, 5.38516474,
5.47722483, 5.56776428, 5.65685368, 5.74456215, 5.83095121,
5.91607952, 5.99999952, 6.08276224, 6.16441393, 6.24499846,
6.3245554 , 6.40312433, 6.48074055, 6.5574379 , 6.63324976,
6.70820427, 6.78233004, 6.85565472, 6.92820311, 6.99999952,
7.07106733, 7.14142799, 7.21110153, 7.28010893, 7.34846973,
7.41619825, 7.48331451, 7.54983425, 7.61577368, 7.68114567,
7.74596643, 7.81025028, 7.8740077 , 7.93725395, 7.99999952,
8.06225681, 8.12403774, 8.18535233, 8.24621105, 8.30662346,
8.36660004, 8.42614937, 8.48528099, 8.54400253, 8.60232449,
8.66025352, 8.71779728, 8.77496433, 8.83176041, 8.88819408,
8.94427204, 8.99999905, 9.05538464, 9.11043262, 9.16515064,
9.21954441, 9.27361774, 9.32737923, 9.38083076, 9.43398094,
9.48683357, 9.53939152, 9.59166145, 9.64365005, 9.69535923,
9.7467947 , 9.79795837, 9.84885788, 9.89949417, 9.94987392], dtype=float32)
So, okay, the results are similar, but there are obvious differences. TensorFlow didn't even return exact square roots for 1, 4 or 9, for example. And you would probably get yet another result if you ran it on a GPU (because the GPU kernels differ from the CPU kernels and depend on CUDA routines implemented by NVIDIA, another player in the field).
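To put a number on these differences, here is a minimal NumPy-only sketch. It hard-codes three value pairs copied from the two printouts above (the variable names are illustrative, not part of the original code) and checks that the TensorFlow values are the float32 numbers immediately below the exact roots, that is, one unit in the last place away, and that a tolerance-based comparison treats the two arrays as equal.

```python
import numpy as np

# Exact roots of 1, 4 and 9, as NumPy printed them
exact = np.array([1.0, 2.0, 3.0], dtype=np.float32)
# The corresponding values from the TensorFlow printout
tf_like = np.array([0.99999994, 1.99999988, 2.99999976], dtype=np.float32)

# The closest float32 below each exact root (one step toward zero)
one_step_below = np.nextafter(exact, np.float32(0))
differs_by_one_step = np.array_equal(tf_like, one_step_below)

# A relative-tolerance comparison considers the results equal
close_enough = np.allclose(exact, tf_like, rtol=1e-6, atol=0)

print(differs_by_one_step, close_enough)
```

In practice this suggests comparing results from the two libraries with `np.allclose` or `np.testing.assert_allclose` rather than with `==`.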
My impression (although I may be wrong) is that TensorFlow is more willing to sacrifice a bit of precision in exchange for performance (which would make sense considering its typical use case). I have even seen some more complicated operations produce (very slightly) different results when run twice on the same hardware, probably because the unspecified order of aggregation and averaging operations leads to different rounding errors (I generally use float32, so that's a factor too, I guess).
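The effect of evaluation order on rounding can be reproduced without TensorFlow. The following sketch uses three float32 values chosen for illustration (they do not come from the answer) to show that the same sum gives different results depending on how it is grouped:

```python
import numpy as np

a = np.float32(1e8)
b = np.float32(-1e8)
c = np.float32(1.0)

# a and b cancel exactly first, so the 1.0 survives
left_to_right = (a + b) + c

# b + c is rounded back to -1e8 in float32 (neighbouring float32 values
# near 1e8 are 8 apart), so the 1.0 is lost before a is added
right_to_left = a + (b + c)

print(left_to_right, right_to_left)  # 1.0 0.0
```

A parallel reduction that does not fix the order in which partial sums are combined can hit either grouping, which is one plausible source of run-to-run differences.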
Of course there is a real difference. NumPy works on arrays with highly optimized vectorized computations and does pretty well on a CPU, whereas TensorFlow's math functions are optimized for the GPU, where large numbers of matrix multiplications matter much more. So the question is where you want to use what. For CPU, I would just go with NumPy, whereas for GPU it makes sense to use TF operations.
Related
SciPy Rotation only gives correct answer when I use inv()
What am I doing wrong here? I have a function that rotates Euler angle inputs using rotation matrices and NumPy. This is proven to be correct because it is currently running on drones that use it in real time in their flight control loop.
I'm trying to implement the same thing using SciPy Rotation. When I do, the result is gibberish, but if I stick an inv() in there for seemingly no reason, it gives the correct answer. Where did I stuff up?
VF, BUL and VTOL are just different reference frames. I load the rotation matrices from a YAML file. My input (vehicle_rph) is a time-series array of vectors measured in the VF frame, but I need to know the answer in the VTOL frame. Everything in the YAML is expressed relative to BUL, so in order to get to VTOL from VF, I have to go through VF -> BUL -> VTOL.
Below I do everything twice, once in NumPy and once in SciPy. As you can see, the NumPy result (proven to be correct) matches the SciPy result when I put a random inv() in there. Bonus points if you can tell me why the Z axis differs slightly, but it's close enough for my needs as is.
Thanks in advance!!
INPUT:
# using numpy arrays
R_VF_from_BUL = frames['frames']['VF']['R_vf_from_bul']
R_VTOL_from_BUL = frames['frames']['VTOL-B']['R_vtolb_from_bul']
R_BUL_from_VF = np.transpose(R_VF_from_BUL)
R_VTOL_from_VF = np.matmul(R_VTOL_from_BUL, R_BUL_from_VF)
# using scipy rotations
r_vf_from_bul = R.from_matrix(frames['frames']['VF']['R_vf_from_bul'])
r_vtol_from_bul = R.from_matrix(frames['frames']['VTOL-B']['R_vtolb_from_bul'])
r_bul_from_vf = r_vf_from_bul.inv()
r_vtol_from_vf = r_vtol_from_bul*r_bul_from_vf
print(f'Using numpy:\n{R_VTOL_from_VF}')
print('')
print(f'Using scipy:\n{r_vtol_from_vf.as_matrix()}')
vehicle_rph_vtol_sp = (r_vtol_from_vf*R.from_euler('XYZ', vehicle_rph)).as_euler('XYZ')
vehicle_rph_vtol_sp_inv = (r_vtol_from_vf.inv()*R.from_euler('XYZ', vehicle_rph)).as_euler('XYZ')
vehicle_rph_vtol_np = transform_euler_angles_signal_to_vtol_from_vf_frame(vehicle_rph)
print('')
print(f'sp reg: {vehicle_rph_vtol_sp[0]}')
print(f'sp inv: {vehicle_rph_vtol_sp_inv[0]}')
print(f'np reg: {vehicle_rph_vtol_np[0]}')
OUTPUT:
Rotation matrix using NumPy:
[[ 0.52205926 0. 0.85290921]
 [ 0. 1. 0. ]
 [-0.85290921 0. 0.52205926]]
Rotation matrix using SciPy:
[[ 0.52205926 0. 0.85290921]
 [-0. 1. 0. ]
 [-0.85290921 -0. 0.52205926]]
sp reg: [-3.11319763 1.13894171 -1.90741245]
sp inv: [-0.01189337 -0.04056495 1.25948718]
np reg: [-0.01189337 -0.04056495 1.29595719]
How do I run expit function on an array on a GPU?
I am trying to benchmark a simple neural-network-based handwritten digit recognition application. It currently uses NumPy for matrices and SciPy's expit function for activation. As good as this was (it is a pretty basic network, really), I wanted to run the whole thing on a GPU, and hence decided to use the CuPy library.
Unfortunately, I am unable to get the expit function to run on the GPU. I keep getting an error message stating "not implemented".
self.activation_function = lambda x: scipy.special.expit(x)
Error message:
TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'expit'>, '__call__', array([[ 0.96079161], [ 1.37400426], [-0.46329254]])): 'ndarray'
I ran into exactly the same problem today. You could try defining the function with CuPy's user-defined kernels.
For the sigmoid function:
import cupy as cp
import numpy as np
import scipy.special
expit = cp.ElementwiseKernel(
    'float64 x',
    'float64 y',
    'y = 1 / (1 + exp(-x))',
    'expit')
>>> x = cp.arange(10, dtype=np.float64).reshape(2, 5)
>>> expit(x)
array([[0.5 , 0.73105858, 0.88079708, 0.95257413, 0.98201379],
       [0.99330715, 0.99752738, 0.99908895, 0.99966465, 0.99987661]])
>>> scipy.special.expit(x.get())
array([[0.5 , 0.7310586 , 0.880797 , 0.95257413, 0.98201376],
       [0.9933072 , 0.9975274 , 0.999089 , 0.99966466, 0.9998766 ]], dtype=float32)
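As an alternative to a hand-written kernel, the same formula can be written with ordinary array operations. The sketch below is an assumption-laden illustration, not part of the answer above: `sigmoid` and its `xp` parameter are hypothetical names, and it relies on CuPy exposing an `exp` function that mirrors NumPy's, so that passing `xp=cupy` together with a CuPy array would keep the computation on the GPU. It is shown here with NumPy so it runs without a GPU.

```python
import numpy as np

def sigmoid(x, xp=np):
    """Logistic sigmoid 1 / (1 + exp(-x)), computed with the array module `xp`."""
    return 1 / (1 + xp.exp(-x))

# Same input as in the answer above, but as a NumPy array on the CPU
x = np.arange(10, dtype=np.float64).reshape(2, 5)
y = sigmoid(x)

# With CuPy installed, the hypothetical GPU call would be:
#   import cupy as cp
#   y_gpu = sigmoid(cp.asarray(x), xp=cp)
print(y)
```

For inputs with a large negative magnitude, `exp(-x)` can overflow and emit a warning even though the result still rounds to 0, so a production version may want a numerically safer formulation.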
Why is RNG different for TensorFlow 2 and 1?
import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
tf.compat.v1.set_random_seed(3) # graph-level seed
if tf.__version__[0] == '2':
    tf.random.set_seed(4) # global seed
else:
    tf.set_random_seed(4) # global seed
from tensorflow.keras.initializers import glorot_uniform as GlorotUniform
from tensorflow.keras import backend as K
init = GlorotUniform(seed=5)(shape=(4, 4))
print(K.eval(init))
[[-0.75889236 0.5744677 0.82025963 -0.26889956]
 [ 0.0180248 -0.24747121 -0.0666492 0.23440498]
 [ 0.61886185 0.05548459 0.39713246 0.126324 ]
 [ 0.6639387 -0.58397514 0.39671892 0.67872125]] # TF 2
[[ 0.2515846 -0.41902617 -0.7859829 0.41573995]
 [ 0.8099498 -0.6861247 -0.46198446 -0.7579694 ]
 [ 0.29976922 0.0310365 0.5031274 0.314076 ]
 [-0.62062943 -0.01889879 0.7725797 -0.65635633]] # TF 1
Why the difference? This is creating severe reproducibility problems between the two versions, and this or something else also causes differences within the same version (TF2) between Graph and Eager execution. More importantly, can TF1's RNG sequence be used in TF2?
With enough digging, yes. TL;DR:
TF2 behavior in TF1: from tensorflow.python.keras.initializers import GlorotUniformV2 as GlorotUniform
TF1 behavior in TF2: from tensorflow.python.keras.initializers import GlorotUniform
TF2 essentially executes the first bullet under the hood; GlorotUniform is actually GlorotUniformV2.
Some details: I found the docs, but the code itself terminates at some pywrapped compiled code (TF1, TF2; for some reason GitHub refuses to show gen_stateless_random_ops for TF2 and gen_random_ops for TF1, but you can find both in the local install):
tensorflow.python.ops.gen_random_ops.truncated_normal
Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with mean 0 and standard deviation 1, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked.
tensorflow.python.ops.gen_stateless_random_ops.truncated_normal
Outputs deterministic pseudorandom values from a truncated normal distribution. The generated values follow a normal distribution with mean 0 and standard deviation 1, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked. The outputs are a deterministic function of shape and seed.
The first and second are ultimately where GlorotUniform and GlorotUniformV2 route to, respectively. TF2's from tensorflow.keras.initializers imports from init_ops_v2 (i.e. V2), whereas TF1's imports from init_ops.
How to get a good cv2.stereoCalibrate after successful cv2.calibrateCamera
Hi everyone. I've been digging a bit into computer vision using Python and OpenCV, and I was trying to calibrate two cameras I've bought in order to do some 3D stereo reconstruction, but I'm having some problems with it.
I've mostly followed this tutorial to calibrate the cameras separately (I apply it to both of them), and then I intend to use cv2.stereoCalibrate to get the relative calibration.
With the single-camera calibration everything seems to be working correctly: I get a very low reprojection error and, as far as my knowledge goes, the matrices look OK. Here are the results of the single-camera calibration.
cameraMatrix1 and distCoeffs1:
[[ 951.3607329 0. 298.74117671]
 [ 0. 954.23088299 219.20548594]
 [ 0. 0. 1. ]]
[[ -1.07320015e-01 -5.56147908e-01 -1.13339913e-03 1.85969704e-03 2.24131322e+00]]
cameraMatrix2 and distCoeffs2:
[[ 963.41078117 0. 362.85971342]
 [ 0. 965.66793023 175.63216871]
 [ 0. 0. 1. ]]
[[ -3.31491728e-01 2.26020466e+00 3.86190151e-03 -2.32988011e-03 -9.82275646e+00]]
So after having those I do the following (I fix the intrinsics, as I already know them from the previous calibrations):
stereocalibration_criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 100, 1e-5)
stereocalibration_flags = cv2.CALIB_FIX_INTRINSIC
stereocalibration_retval, cameraMatrix1, distCoeffs1, cameraMatrix2, distCoeffs2, R, T, E, F = cv2.stereoCalibrate(objpoints,imgpoints_left,imgpoints_right,cameraMatrix1,distCoeffs1,cameraMatrix2,distCoeffs2,gray_left.shape[::-1],criteria = stereocalibration_criteria, flags = stereocalibration_flags)
I've tried several times to change the flags of stereoCalibrate and to switch the matrices, in case I had the order wrong and that mattered, but I'm still blocked: I get a retval of around 30 (and when I then try to rectify the images, the result is of course a disaster).
I've also tried using some calibration images from the internet and I get the same result, so I assume that the problem is not with the images I've taken. If anyone can point me in the right direction or knows what the cause could be, it would be very welcome.
It turns out that the order of the images I was using was not the same for the right and left cameras. I was using
images_left = glob.glob('Calibration/images/set1/left*' + images_format)
images_right = glob.glob('Calibration/images/set1/right*' + images_format)
when I should have been using something more like:
images_left = sorted(glob.glob('Calibration/images/set1/left*' + images_format))
images_right = sorted(glob.glob('Calibration/images/set1/right*' + images_format))
This is because glob returns the images in an apparently random order, so I was trying to match the wrong images. Now I finally get a retval of 0.4, which is not that bad.
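The fix above can be wrapped in a small helper that also guards against mismatched counts. This is only a sketch: `paired_image_paths` is a hypothetical name, the `left*`/`right*` file naming follows the answer, and it assumes the numeric suffixes are zero-padded so that lexicographic sorting matches the capture order.

```python
import glob
import os
import tempfile

def paired_image_paths(directory, extension=".png"):
    """Return (left, right) path pairs, matched by sorted filename order."""
    left = sorted(glob.glob(os.path.join(directory, "left*" + extension)))
    right = sorted(glob.glob(os.path.join(directory, "right*" + extension)))
    if len(left) != len(right):
        raise ValueError(f"{len(left)} left images but {len(right)} right images")
    return list(zip(left, right))

# Demo with empty placeholder files created in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("right02.png", "left01.png", "right01.png", "left02.png"):
        open(os.path.join(tmp, name), "w").close()
    pairs = [(os.path.basename(l), os.path.basename(r))
             for l, r in paired_image_paths(tmp)]

print(pairs)
```

Iterating over the pairs when detecting chessboard corners keeps each left image aligned with its right counterpart, which is what cv2.stereoCalibrate expects of its two image-point lists.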
Replace numbers below threshold # in numpy array with zeroes
So I have a very big NumPy array (2560x1920). It's actually from a grayscale picture where every pixel is given a number from 0 to 1 indicating its brightness.
I'm trying to replace all values below a threshold, say 0.5, with zeroes. This is probably a simple task, but I'm a beginner with NumPy and I've searched around and still can't figure it out.
This is what I've attempted, and I know it's wrong:
for x in np.nditer(Image):
    if x < .5:
        x == 0
plt.imshow(Image, cmap=plt.cm.gray)
plt.show()
It just outputs the normal image without changing anything. Also, the array looks something like this (abbreviated, obviously):
[[ 0.24565263 0.24565263 0.24902149 ..., 0.27528678 0.27265316 0.27606536]
 [ 0.24565263 0.24565263 0.24902149 ..., 0.27870309 0.27606536 0.27948296]
 [ 0.24228902 0.24228902 0.24565263 ..., 0.28212482 0.27948296 0.282906 ]
 ...,
 [ 0.29706944 0.29706944 0.29706944 ..., 0.17470162 0.17144636 0.17144636]
 [ 0.29362457 0.29362457 0.29362457 ..., 0.17144636 0.16495056 0.16170998]
 [ 0.2901852 0.2901852 0.2901852 ..., 0.16819602 0.16170998 0.15847427]]
NumPy's built-in boolean indexing can be used to replace elements. This can be done as:
Image[Image<0.5] = 0
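A small self-contained sketch of boolean-mask assignment follows, using a made-up 2x2 array in place of the real image. It also shows `np.where` as a non-mutating alternative; both variable names are illustrative.

```python
import numpy as np

# Tiny stand-in for the grayscale image (values between 0 and 1)
image = np.array([[0.2, 0.6],
                  [0.5, 0.49]])

# In-place variant: work on a copy so the original stays untouched
thresholded = image.copy()
thresholded[thresholded < 0.5] = 0

# Non-mutating variant: build a new array, keeping values >= 0.5
thresholded_where = np.where(image < 0.5, 0, image)

print(thresholded)
```

The comparison `image < 0.5` produces a boolean array of the same shape, and the assignment only touches positions where it is True, so 0.5 itself is kept.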
I have returned from the future with suggestions. The above approaches work great for a simple global threshold. I'm posting this answer to warn that non-adaptive thresholding like this may be too naïve, depending on your application. Without adapting to an image's average brightness or other qualities, your output won't be consistent if you are analyzing multiple pictures taken in different conditions.
There are much more accurate approaches for this, but they are a tiny bit more complicated. Scikit-Image makes this easy.
One of the most popular approaches is Otsu's method (I can't say which is the most accurate for each situation; I haven't researched enough). https://en.wikipedia.org/wiki/Otsu%27s_method
Scikit-Image has this and a few other algorithms built into its modules. The answer to the question above, using this approach, is as simple as this:
import matplotlib.pyplot as plt
from skimage.filters import threshold_otsu
thresh = threshold_otsu(Image)
binary = Image > thresh
plt.imshow(binary, cmap=plt.cm.gray)
plt.show()
Read an example here: http://scikit-image.org/docs/dev/auto_examples/plot_otsu.html
And about usage here: http://scikit-image.org/docs/dev/api/skimage.filters.html?highlight=local%20otsu