Having Trouble with numpy.histogramdd - python

I am trying to build an N-dimensional histogram from a 2D array of complex values. For each row, I want to count the occurrences of the real and imaginary parts within the given bins and store the results in a 3D array. The code only works for the first iteration, when I hard-code i=0 and remove the for loop. I have never used histograms in Python before and I cannot understand the error. The code is given below.
xsoft is a 2D array of complex type. I compute bnd_edges from the max and min values of xsoft and use it to create the edges passed as bins.
xsoft = np.empty((M, MAX,), dtype=complex) # e.g has dims 4*100
xsoft[:] = np.nan
edges = np.linspace(-bnd_edges, bnd_edges, numbin) #numbin=10
pSOFT = np.empty((len(edges)-1, M, len(edges)-1)) # len(edges)= 10
pSOFT[:] = np.nan
for i in range(M):
    pSOFT[:, i, :], edges = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
The code results in the following error
Traceback (most recent call last):
File " ", line 194, in <module>
pSOFT[:, i, :], edges = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
File "<__array_function__ internals>", line 5, in histogramdd
File " " line 1066, in histogramdd
raise ValueError(
ValueError: `bins[0]` must be a scalar or 1d array
Process finished with exit code 1

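You are getting this error because you overwrite the original definition of edges with the second return value of histogramdd. That return value is a list of edge arrays, one per dimension, so on the second iteration bins=(edges, edges) no longer contains 1D arrays.
Replace the last line of your code with this, which stores the returned edges under a different name: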
pSOFT[:, i, :], edges_i = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
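The fix can be checked end to end with synthetic data. This is a minimal sketch, not the asker's actual script: the random xsoft, the seed, and the way bnd_edges is derived are assumptions made for illustration.

```python
import numpy as np

M, MAX, numbin = 4, 100, 10

# Synthetic complex data standing in for the asker's xsoft (assumption).
rng = np.random.default_rng(0)
xsoft = rng.normal(size=(M, MAX)) + 1j * rng.normal(size=(M, MAX))

# Symmetric bound covering both real and imaginary parts (assumption).
bnd_edges = max(np.abs(xsoft.real).max(), np.abs(xsoft.imag).max())
edges = np.linspace(-bnd_edges, bnd_edges, numbin)

pSOFT = np.empty((len(edges) - 1, M, len(edges) - 1))
for i in range(M):
    # histogramdd returns (counts, list_of_edge_arrays). Discarding the
    # second value keeps the 1D `edges` array intact for the next iteration.
    pSOFT[:, i, :], _ = np.histogramdd(
        (xsoft[i].real, xsoft[i].imag), bins=(edges, edges)
    )

print(pSOFT.shape)
```

Because the bound is taken from the data itself, every sample falls inside the outermost edges, so each slice pSOFT[:, i, :] sums to MAX.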

Related

Python equivalent of Matlab's hist3

for i=1:n
    centersX(:,i)=linspace(min(xData)+dX/2,max(xData)-dX/2,nbins)';
    centersY(:,i)=linspace(min(yData)+dY/2,max(phase)-dY/2,nbins)';
    centers = {centersX(:,i),centersY(:,i)};
    H(:,:,i) = hist3([xData yData],centers);
end
In each iteration, I construct centersX and centersY with the linspace function. I then store them in a 2x1 cell array called centers. H is an nbins X nbins X n struct. In each iteration I fill an nbins X nbins slice of H with the data from hist3.
I'm looking for the Python equivalent. I'm having trouble with passing the arguments for numpy.histogram2d:
H[:,:,i] = numpy.histogram2d(xData,yData,centers)
I get the following error:
Traceback (most recent call last):
line 714, in histogramdd
N, D = sample.shape
AttributeError: 'list' object has no attribute 'shape'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
line 36, in <module>
H[:,:,i] = numpy.histogram2d(xData, yData, centers)
line 714, in histogram2d
hist, edges = histogramdd([x, y], bins, range, normed, weights)
line 718, in histogramdd
N, D = sample.shape
ValueError: too many values to unpack (expected 2)
Since Python doesn't have cell arrays, I changed centers to be an array of arrays where centers[0] = centersX and centers[1] = centersY. What do I need to change so that, assuming the data are the same in MATLAB and Python, the outputs match?
EDIT:
I have also tried H[:,:,i] = numpy.histogram2d(xData,yData, bins=(centersX,centersY)) to cut out the step of combining them into centers, but no luck.
Have you tried combining them with square brackets?
Maybe you can also use matplotlib.pyplot.hist2d.
H[:,:,i], *_ = numpy.histogram2d(xData,yData,bins=[centers[0], centers[1]])
H[:,:,i], *_ = matplotlib.pyplot.hist2d(xData,yData,bins=[centers[0], centers[1]])
In both, the values passed as bins are interpreted as bin edges, not centers, so you have to adjust the calculation. I think it is enough to remove the dX/2, keeping in mind that nbins edges define only nbins - 1 bins:
centersX(:,i)=linspace(min(xData),max(xData),nbins)';
centersY(:,i)=linspace(min(yData),max(phase),nbins)';
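An alternative that keeps nbins bins per axis is to convert the centers into nbins + 1 edges before calling numpy.histogram2d. The sketch below assumes evenly spaced centers; centers_to_edges is a hypothetical helper, the data are synthetic, and dx/dy are taken to be the bin widths. It does not try to reproduce how hist3 treats points outside the outermost bins.

```python
import numpy as np

def centers_to_edges(centers):
    """Turn n bin centers into n + 1 edges placed midway between centers."""
    centers = np.asarray(centers, dtype=float)
    half = np.diff(centers) / 2
    return np.concatenate((
        [centers[0] - half[0]],      # outer left edge
        centers[:-1] + half,         # midpoints between neighbouring centers
        [centers[-1] + half[-1]],    # outer right edge
    ))

# Synthetic 1D data standing in for xData and yData (assumption).
rng = np.random.default_rng(0)
x_data = rng.uniform(0.0, 1.0, 500)
y_data = rng.uniform(0.0, 1.0, 500)

nbins = 5
dx = (x_data.max() - x_data.min()) / nbins  # assumed bin width
dy = (y_data.max() - y_data.min()) / nbins
centers_x = np.linspace(x_data.min() + dx / 2, x_data.max() - dx / 2, nbins)
centers_y = np.linspace(y_data.min() + dy / 2, y_data.max() - dy / 2, nbins)

H, xedges, yedges = np.histogram2d(
    x_data, y_data,
    bins=[centers_to_edges(centers_x), centers_to_edges(centers_y)],
)
print(H.shape)
```

Note that histogram2d expects 1D x and y inputs; column vectors carried over from MATLAB may need .ravel() first.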

Length-1 Arrays and Python Scalars Via plt.text

I'm trying to use plt.text to plot temperature values at their associated lat/lon points on a plot.
After reviewing the plt.text documentation, it appears that the plotted value (third arg) has to be a number and that the number has to be a whole number, NOT a number with decimals.
Below is the code that I'm trying to work with and the associated traceback error that I'm receiving:
Script Code:
data = np.loadtxt('/.../.../.../tmax_day0', delimiter=',', skiprows=1)
grid_x, grid_y = np.mgrid[-85:64:dx, 34:49:dx]
temp = data[:,2]
#print temp
grid_z = griddata((data[:,1],data[:,0]), data[:,2], (grid_x,grid_y), method='linear')
x,y = m(data[:,1], data[:,0]) # flip lat/lon
grid_x,grid_y = m(grid_x,grid_y)
#m.plot(x,y, 'ko', markersize=2)
def str_to_float(str):
    try:
        number = float(str)
    except ValueError:
        number = 0.0
    return number
fmt = str_to_float(temp)
#annotate point temperature on plot
plt.text(grid_x, grid_y, fmt, fontdict=None)
Traceback Error:
Traceback (most recent call last):
File "plotpoints.py", line 56, in <module>
fmt = str_to_float(temp)
File "plotpoints.py", line 51, in str_to_float
number = float(str)
TypeError: only length-1 arrays can be converted to Python scalars
Data sample from text file tmax_day0:
latitude,longitude,value
36.65408,-83.21783,90
41.00928,-74.73628,92.02
43.77714,-71.75598,90
44.41944,-72.01944,88.8
39.5803,-79.3394,79
38.3154,-76.5501,86
38.91444,-82.09833,94
40.64985,-75.44771,92.6
41.25389,-70.05972,81.2
39.45202,-74.56699,90.88
I was able to achieve plotting data values only by using the following code:
for i in range(len(temp)):
    plt.text(x[i], y[i], temp[i], va="top", family="monospace")
The error occurs because float() converts a single value, while temp is a NumPy array holding many values. Only a length-1 array can be converted to a Python scalar, so the values have to be converted element by element.
Going from your comment, this has been edited.
You would first need to fix the string so it's a proper array.
fmt = fmt[0].split()
I think that should work to create a new (normal) list of strings. Then this maps it to an array of floats:
list_of_floats = np.array(list(map(float, fmt)))
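The TypeError itself can be reproduced without any plotting. The sketch below uses the first four values from the data sample; the label format and the variable names are assumptions for illustration.

```python
import numpy as np

# First four temperature values from the data sample above.
temp = np.array([90.0, 92.02, 90.0, 88.8])

# float() accepts a single value, so passing the whole array fails.
try:
    float(temp)
    conversion_failed = False
except TypeError:
    conversion_failed = True

# Converting element by element works; here each value becomes a label string.
labels = [f"{t:.2f}" for t in temp]
print(conversion_failed, labels)
```

Each label can then be passed to plt.text one point at a time, as in the loop the asker found to work.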

Python: create multiple boxplots in one panel

I have been using R for a long time and have recently started learning Python.
I would like to create multiple box plots in one panel in Python.
My dataset is in vector form, and a label vector indicates which box plot each element of data corresponds to. The example looks like this:
N = 50
data = np.random.lognormal(size=N, mean=1.5, sigma=1.75)
label = np.repeat([1,2,3,4,5], N//5)
From various websites (e.g., matplotlib: Group boxplots), creating multiple boxplots requires a matrix-like input whose columns each contain the samples for one boxplot. So I created a list object based on data and label:
savelist = data[label == 1]
for i in [2,3,4,5]:
    savelist = [savelist, data[label == i]]
However, the code below gives me an error:
boxplot(savelist)
Traceback (most recent call last):
File "<ipython-input-222-1a55d04981c4>", line 1, in <module>
boxplot(savelist)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2636, in boxplot
meanprops=meanprops, manage_xticks=manage_xticks)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3045, in boxplot
labels=labels)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/cbook.py", line 1962, in boxplot_stats
stats['mean'] = np.mean(x)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2727, in mean
out=out, keepdims=keepdims)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/_methods.py", line 66, in _mean
ret = umr_sum(arr, axis, dtype, out, keepdims)
ValueError: operands could not be broadcast together with shapes (2,) (10,)
Can anyone explain what is going on?
You're ending up with a nested list instead of a flat list. Try this instead:
savelist = [data[label == 1]]
for i in [2,3,4,5]:
    savelist.append(data[label == i])
And it should work.
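The same flat list can also be built with a list comprehension. This is a small sketch using the question's variable names; the seeded generator and the name groups are assumptions added so the example is reproducible.

```python
import numpy as np

N = 50
rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.5, sigma=1.75, size=N)
label = np.repeat([1, 2, 3, 4, 5], N // 5)

# One 1D array per label value, collected in a flat (non-nested) list.
groups = [data[label == i] for i in np.unique(label)]
print([len(g) for g in groups])
```

Passing groups to boxplot then draws one box per list element.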

MemoryError during Fast Fourier Transform on an image using NumPy arrays under Windows

The code below computes the Fourier transform of a .tiff image on my Ubuntu 11.04 machine, but on Windows XP it raises a MemoryError. What should I change? Thank you.
import glob
import numpy
import numpy as np
import scipy.misc
from scipy import fftpack

def fouriertransform(result): #function for Fourier transform computation
    for filename in glob.iglob('*.tif'):
        imgfourier = scipy.misc.imread(filename) #read the image
        arrayfourier = numpy.array([imgfourier]) #make an array
        # Take the fourier transform of the image.
        F1 = fftpack.fft2(arrayfourier)
        # Now shift so that low spatial frequencies are in the center.
        F2 = fftpack.fftshift(F1)
        # the 2D power spectrum is:
        psd2D = np.abs(F2)**2
        L = psd2D
        np.set_printoptions(threshold=3)
        #np.set_printoptions(precision = 3, threshold = None, edgeitems = None, linewidth = 3, suppress = True, nanstr = None, infstr = None, formatter = None)
        # print every element of the power spectrum
        for subarray in L:
            for array in subarray:
                for elem in array:
                    print('%3.10f\n' % elem)
The error output is:
Traceback (most recent call last):
File "C:\Documents and Settings\HrenMudak\Мои документы\Моя музыка\fourier.py", line 27, in <module>
F1 = fftpack.fft2(arrayfourier)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 571, in fft2
return fftn(x,shape,axes,overwrite_x)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 521, in fftn
return _raw_fftn_dispatch(x, shape, axes, overwrite_x, 1)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 535, in _raw_fftn_dispatch
return _raw_fftnd(tmp,shape,axes,direction,overwrite_x,work_function)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 463, in _raw_fftnd
x, copy_made = _fix_shape(x, s[i], waxes[i])
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 134, in _fix_shape
z = zeros(s,x.dtype.char)
MemoryError
I've tried to run your code, except that I replaced the mahotas.imread with the scipy.misc.imread function, because I don't have that library, and I could not reproduce your error.
Some further remarks:
Can you try to use the scipy.misc.imread function instead of the mahotas function? I suppose the issue could be there.
What is the actual exception that is thrown (plus any other output)?
What are the dimensions of your image? Is it gray-scale or RGB? Printing all values for a large image could indeed take up quite some memory, so it might be better to visualize the results with e.g. matplotlib's imshow function.
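To get a feel for the memory involved, the sketch below computes the power spectrum of a small synthetic 2D gray-scale array and compares array sizes. It uses numpy.fft rather than scipy.fftpack, and the 64x64 random image is an assumption standing in for the real .tif file.

```python
import numpy as np

# Hypothetical 2D gray-scale image (float64), standing in for the .tif data.
rng = np.random.default_rng(0)
img = rng.random((64, 64))

# 2D FFT, shifted so that low spatial frequencies sit in the center.
spectrum = np.fft.fftshift(np.fft.fft2(img))
psd2D = np.abs(spectrum) ** 2

# The complex spectrum of a float64 image takes twice the input's memory.
bytes_in = img.nbytes
bytes_spectrum = spectrum.nbytes
print(bytes_in, bytes_spectrum)

# A log-scaled copy is usually easier to inspect as an image than raw values.
log_psd = np.log10(psd2D + 1e-12)
```

An array such as log_psd can be handed to imshow instead of printing every element.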

Python zero-size array to ufunc.reduce without identity

I'm trying to make a histogram of some data that is stored in an ndarray. The histogram is part of a set of analyses that I've wrapped in a class in a Python program. The part of the code that isn't working is below.
def histogram(self, iters):
    samples = T.MCMC(iters) #Returns an [iters,3,4] ndarray
    histAC = plt.figure(self.ip) #plt is matplotlib's pyplot
    self.ip+=1 #defined at the beginning of the class to start at 0
    for l in range(0,4):
        h = histAC.add_subplot(2,(iters+1)/2,l+1)
        for i in range(0,0.5*self.chan_num):
            intAvg = mean(samples[:,i,l])
            print intAvg
            for k in range(0,iters):
                samples[k,i,l]=samples[k,i,l]-intAvg
        print "Samples is ",samples
        h.hist(samples,bins=5000,range=[-6e-9,6e-9],histtype='step')
        h.legend(loc='upper right')
        h.set_title("AC Pulse Integral Histograms: "+str(l))
    figname = 'ACHistograms.png'
    figpath = 'plot'+str(self.ip)
    print "Finished!"
    #plt.savefig(figpath + figname, format = 'png')
This gives me the following error message:
File "johnmcmc.py", line 257, in histogram
h.hist(samples,bins=5000,range=[-6e-9,6e-9],histtype='step') #removed label=apdlabel
File "/x/tsfit/local/lib/python2.6/site-packages/matplotlib/axes.py", line 7238, in hist
ymin = np.amin(m[m!=0]) # filter out the 0 height bins
File "/x/tsfit/local/lib/python2.6/site-packages/numpy/core/fromnumeric.py", line 1829, in amin
return amin(axis, out)
ValueError: zero-size array to ufunc.reduce without identity
The only search results I've found are multiple copies of the same two conversations. The only thing I learned from them is that Python histograms don't like being fed empty arrays, which is why I added the print statement right above the failing line to make sure the array isn't empty.
Has anyone else come across this error before?
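The traceback line ymin = np.amin(m[m!=0]) suggests one way to hit this error even when the input is not empty: if every bin count is zero, the filtered array is empty. The sketch below reproduces that with NumPy alone. Whether this is the cause here is an assumption, since it depends on whether any sample lies inside range=[-6e-9, 6e-9]; the sample values are made up.

```python
import numpy as np

# Hypothetical samples that all lie far outside the histogram range.
samples = np.array([1.0, 2.0, 3.0])

# Values outside `range` are ignored, so every bin count ends up zero.
counts, _ = np.histogram(samples, bins=10, range=(-6e-9, 6e-9))

# Filtering out zero-height bins, as in the traceback, leaves an empty array.
nonzero = counts[counts != 0]

try:
    np.amin(nonzero)
    raised = False
except ValueError:
    raised = True

print(counts.sum(), nonzero.size, raised)
```

Checking samples.min() and samples.max() against the requested range is a quick way to test this hypothesis.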
