I was trying to interpolate data using Rbf. The output data that I need is actually a single value. So I used something like
x=numpy.array([100])
y=numpy.array([200])
d=numpy.array([300])
rbfi=scipy.interpolate.Rbf(x,y,d)
But there was an error:
ValueError: array must not contain infs or NaNs
Does anybody know how to solve this problem? Thanks a lot!
Quite outdated but in case anyone wonders:
Rbf requires at least 2 data points.
x=numpy.array([100,120])
y=numpy.array([200,220])
d=numpy.array([300,100])
rbfi=scipy.interpolate.Rbf(x,y,d)
>>> print(rbfi)
<scipy.interpolate.rbf.Rbf object at 0x7fdac142b240>
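As a rough sketch of the two-point case above, the interpolator can be called like a function once it has been built. The query point (110, 210) is only an illustration and is not taken from the question. With the default settings (no smoothing), Rbf is expected to reproduce the data values at the original points.

```python
import numpy as np
from scipy.interpolate import Rbf

# The two data points from the answer above.
x = np.array([100.0, 120.0])
y = np.array([200.0, 220.0])
d = np.array([300.0, 100.0])

rbfi = Rbf(x, y, d)

# Evaluating at the original coordinates should give back d.
at_nodes = rbfi(x, y)

# Evaluating at a single (illustrative) point returns a 0-d array.
midpoint = rbfi(110.0, 210.0)
print(at_nodes, float(midpoint))
```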
I have 3 columns: id, sentiment, review. I am creating vectors and putting them through a RandomForest in order to predict the sentiment.
On the following line:
forest = forest.fit(trainDataVecs, train["sentiment"])
I keep getting the following error:
Error is: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
I got it working on a very small sample file, but it refuses to work on my large main one. I have checked and I am 100% certain there are no NULL entries. Some of the reviews are very long, and I think the review length must be causing a problem somewhere.
Please help!
The issue seems to arise when you read one of the numerical columns. I would suggest that when you read the data from the source, you cast it to a wider type such as np.float64, and also replace any invalid values as follows:
# A is the vector you want to clean
A[~np.isfinite(A)] = 0.0  # zero out the NaN and infinite entries, leave valid values untouched
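A minimal sketch of this kind of cleaning, assuming the feature vector is a NumPy float array (the sample values are made up for illustration). The point to watch is that the boolean mask must select the invalid entries, not the valid ones.

```python
import numpy as np

# Made-up sample vector containing NaN and infinite entries.
A = np.array([1.0, np.nan, np.inf, -np.inf, 2.5], dtype=np.float64)

# True where the entry is NaN, +inf or -inf.
bad = ~np.isfinite(A)

# Work on a copy so the raw data stays available for inspection.
cleaned = A.copy()
cleaned[bad] = 0.0

print("invalid entries:", int(bad.sum()))
print(cleaned)
```

Counting the invalid entries first is a cheap way to confirm whether the data really is free of NaN and infinite values before fitting.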
I need to use CubicSpline to interpolate between points. This is my function:
cs = CubicSpline(aTime, aControl)
u = cs(t) # u is a ndarray of one element.
I cannot convert u to a float. uu = float(u) or uu = float(u[0]) doesn't work in the function.
I can convert u to a float in the shell by float(u). This shouldn't work because I have not provided an index but I get an error if I use u[0].
I have read something about np.squeeze. I tried it but it didn't help.
I added a print ("u=",u) statement after the u=cs(t). The result was
u= [ 1.88006889e+09 5.39398193e-01 5.39398193e-01]
How can this be? I expect 1 value. The second and third numbers look about right.
I found the problem. It was a programming error, of course, but the error messages I got were very misleading. I was calling the interpolation function with 3 values, so it returned three values. Why I couldn't extract just one of them afterwards is still a mystery, but now that I call it with a single value I get one float as expected. Overall this still didn't help, as the interpolate1d function is too slow. I wrote my own cubic interpolation function that is MUCH faster.
Again, programming error and poor error messages were the problem.
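For anyone hitting the same confusion, here is a small sketch of how the shape of a CubicSpline result follows the shape of its input. The sample data and the snake_case names a_time and a_control are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Made-up sample data standing in for aTime and aControl.
a_time = np.array([0.0, 1.0, 2.0, 3.0])
a_control = np.array([0.0, 1.0, 4.0, 9.0])
cs = CubicSpline(a_time, a_control)

# A scalar query gives a 0-d array, which float() converts directly.
u_scalar = cs(1.5)
uu = float(u_scalar)

# A query with three values gives three results; index before converting.
u_vector = cs([0.5, 1.5, 2.5])
first = float(u_vector[0])

print(np.ndim(u_scalar), u_vector.shape)
```

Printing the shape of the result, as done in the last line, is usually the quickest way to see which of the two cases applies.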
I need to feed an image and a vector sampled from a normal distribution simultaneously. As the image dataset I'm using is too large, I created an ImageDeserializer for that part. But I also need to add a random vector (sampled from numpy's normal distribution) to the input map before feeding it to the network. Is there any way to achieve this?
I also tried:
mb_data = reader_train.next_minibatch(mb_size, input_map=input_map)
mb_data[random_input_node] = np.random.normal((mb_size, 100))
but get the following error:
TypeError: cannot convert value of dictionary to N4CNTK13MinibatchDataE
The problem was solved with the following snippet, which feeds the data to the trainer:
mb_data = reader_train.next_minibatch(mb_size, input_map=input_map)
z = np.random.normal(size=(mb_size, 100))  # size= sets the shape; the first positional argument is the mean
my_trainer.train_minibatch({feature_image: mb_data[image].data, feature_z: z})
Also thanks to #mewahl. Defining a new reader is another suitable way to solve the problem, and I expect it would be faster than what I have done.
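One detail worth checking when generating the random input: the first positional argument of np.random.normal is the mean (loc), not the shape. The sketch below assumes a noise dimension of 100, as in the question, and an illustrative minibatch size. The float32 cast is an assumption about what the input variable expects.

```python
import numpy as np

mb_size = 4       # illustrative minibatch size
noise_dim = 100   # noise dimension used in the question

# Positional argument is the mean: this draws ONE value centred on mb_size.
scalar_draw = np.random.normal(mb_size)

# The size keyword gives one noise vector per sample in the minibatch.
z = np.random.normal(size=(mb_size, noise_dim)).astype(np.float32)

print(np.ndim(scalar_draw), z.shape, z.dtype)
```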
I am using scipy's curve_fit to fit a function to some data, and I receive the following error:
Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
which points me to this line in my code:
popt_r, pcov = curve_fit(
self.rightFunc, np.array(wavelength)[beg:end][edgeIndex+30:],
np.dstack(transmitted[:,:,c][edgeIndex+30:])[0][0],
p0=[self.m_right, self.a_right])
rightFunc is defined as follows:
def rightFunc(self, x, m, const):
return np.exp(-(m*x + const))
As I understand it, the 'O' type refers to a python object, but I can't see what is causing this error.
Complete Error:
Any ideas for what I should investigate to get to the bottom of this?
Just in case it could help someone else, I used numpy.array(wavelength,dtype='float64') to force the conversion of objects in the list to numpy's float64. Works well for me.
Typically these scipy functions require parameters like:
curvefit( function, initial_values, (aux_values,), ...)
where the tuple of aux_values is passed through to your function along with the current value of the main variable.
Is the dstack expression this aux_values? Or a concatenation of several. It may need to be wrapped in a tuple.
(np.dstack(transmitted[:,:,c][edgeIndex+30:])[0][0],)
We may need to know exactly where this error arises, not just which line of your code does it. We need to know what value is being converted. Where is there an array with dtype object?
Just to clarify, I had the same problem, did not see the right answers in the comments, before solving it on my own. So I just repeat them here:
I have resolved the issue. I was passing an array with one element to the p0 list, rather than the element itself. Thank you for your help – Jacobadtr Sep 12 at 17:51
An O dtype often results when constructing an array from a list of sublists that differ in size. If np.array(...) can't make a clean n-d array of numbers, it resorts to making an array of objects. – hpaulj Sep 12 at 17:15
That is, make sure that the sequence of parameters you pass to curve_fit can be cleanly converted to a NumPy array of floats.
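The two comments above can be illustrated with NumPy alone. This is only a sketch: m_right and a_right borrow their names from the question, but the values are hypothetical, and dtype=object is passed explicitly because recent NumPy versions refuse to build a ragged array implicitly.

```python
import numpy as np

m_right = np.array([0.5])  # hypothetical: a one-element array instead of a scalar
a_right = 2.0              # hypothetical scalar

# Mixing an array and a scalar in one list yields an object array.
bad_p0 = np.array([m_right, a_right], dtype=object)
bad_is_safe = np.can_cast(bad_p0.dtype, np.float64, casting="safe")

# Passing the element itself gives a plain float64 array.
good_p0 = np.array([m_right[0], a_right])

print(bad_p0.dtype, bad_is_safe, good_p0.dtype)
```

Checking the dtype of each array handed to curve_fit (xdata, ydata and p0) in this way should point to the one that ended up as dtype('O').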
From here, apparently numpy struggles with index type. The proposed solution is:
One thing you can do is use np.intp as dtype whenever things have to do with indexing or are logically related to indexing/array sizes. This is the natural dtype for it and it will normally also be the fastest one.
Does this help?
I am trying to slice a variable from a netcdf file and plot it but I am running into problems.
This is from my code:
import numpy as np
from netCDF4 import Dataset
Raw= "filename.nc"
data = Dataset(Raw)
u=data.variables['u'][:,:,:,:]
print(u.shape)
U=u([0,0,[200:500],[1:300]])
# The print statement yields (2, 17, 900, 2600) as u's dimensions.
# U is the slice of the dataset I am interested in: a small subset of the 4-dimensional array. The last line of code gives me a syntax error and I cannot figure out why.
Trying to pick out a single value from the array (u(0,0,0,1)) gives me a type error: TypeError: 'MaskedArray'. The program's aim is to perform simple algebra on a subset of this subset and to plot the data. Any help is appreciated.
I think the comment by Spencer Hill is correct. Without seeing the full error message, I can't be sure, but I'm pretty sure that the TypeError results from you (through the use of parentheses) trying to call the array as a function. Try:
U=u[0,0,200:500,1:300]
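A small self-contained sketch of the difference, using a made-up masked array in place of the netCDF variable (the shape is much smaller than the real (2, 17, 900, 2600) data, and the slice bounds are scaled down to match):

```python
import numpy as np

# Stand-in for data.variables['u'][:, :, :, :], which is a masked array.
u = np.ma.masked_array(np.arange(2 * 3 * 10 * 12, dtype=float).reshape(2, 3, 10, 12))

# Square brackets index and slice; slices are written without extra brackets.
U = u[0, 0, 2:5, 1:4]
single = u[0, 0, 0, 1]

# Parentheses try to call the array like a function, which raises TypeError.
try:
    u(0, 0, 0, 1)
except TypeError as exc:
    call_error = exc

print(U.shape, single, type(call_error).__name__)
```

The resulting slice is still a masked array, so ordinary arithmetic and plotting calls can be applied to it directly.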