Having Trouble with numpy.histogramdd

I am trying to create an N-dimensional histogram from a 2D array of complex values. I want to count the occurrences of the real and imaginary parts of each row within the given bins and store the result in a 3D array. The code only runs for the first iteration, i.e. when I hard-code i=0 and remove the for loop. I have never used histograms in Python before and I cannot understand the error. The code is given below.
xsoft is a 2D array of complex type; bnd_edges is computed from the max and min values of xsoft and is used to create the edges passed as bins.
xsoft = np.empty((M, MAX,), dtype=complex) # e.g has dims 4*100
xsoft[:] = np.nan
edges = np.linspace(-bnd_edges, bnd_edges, numbin) #numbin=10
pSOFT = np.empty((len(edges)-1, M, len(edges)-1)) # len(edges)= 10
pSOFT[:] = np.nan
for i in range(M):
    pSOFT[:, i, :], edges = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
The code results in the following error
Traceback (most recent call last):
File " ", line 194, in <module>
pSOFT[:, i, :], edges = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
File "<__array_function__ internals>", line 5, in histogramdd
File " " line 1066, in histogramdd
raise ValueError(
ValueError: `bins[0]` must be a scalar or 1d array
Process finished with exit code 1

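You are getting this error because you are overwriting the original definition of edges with the second return value of histogramdd. That return value is a list of edge arrays (one per dimension), so from the second iteration on, bins=(edges, edges) is no longer a pair of 1D arrays and histogramdd rejects it.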
Replace the last line in your code with this:
pSOFT[:, i, :], edges_i = np.histogramdd((xsoft[i, :].real, xsoft[i, :].imag), bins=(edges, edges))
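The fix can be checked with a small self-contained sketch. The random data, the seed, and the way bnd_edges is computed here are assumptions made for illustration, since the question does not show them; only the shapes and the histogramdd call follow the original code.

```python
import numpy as np

M, MAX, numbin = 4, 100, 10

# Assumed stand-in data: the question does not show how xsoft is filled.
rng = np.random.default_rng(0)
xsoft = rng.normal(size=(M, MAX)) + 1j * rng.normal(size=(M, MAX))

# Assumed definition: a symmetric bound that covers every real and imaginary part.
bnd_edges = max(np.abs(xsoft.real).max(), np.abs(xsoft.imag).max())
edges = np.linspace(-bnd_edges, bnd_edges, numbin)

pSOFT = np.empty((len(edges) - 1, M, len(edges) - 1))
for i in range(M):
    # Discard the returned edge list so that `edges` stays a 1D array.
    counts, _ = np.histogramdd((xsoft[i].real, xsoft[i].imag), bins=(edges, edges))
    pSOFT[:, i, :] = counts

print(pSOFT.shape)
```

Because the returned edge list is thrown away, edges is the same 1D array in every iteration and the loop runs for all M rows.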

Related

Python equivalent of Matlab's hist3

for i=1:n
centersX(:,i)=linspace(min(xData)+dX/2,max(xData)-dX/2,nbins)';
centersY(:,i)=linspace(min(yData)+dY/2,max(phase)-dY/2,nbins)';
centers = {centersX(:,i),centersY(:,i)};
H(:,:,i) = hist3([xData yData],centers);
end
In each iteration, I construct centersX and centersY with the linspace function. I then store them in a 2x1 cell array called centers. H is an nbins x nbins x n array. In each iteration I fill an nbins x nbins slice of H with the data from hist3.
I'm looking for the Python equivalent. I'm having trouble with passing the arguments for numpy.histogram2d:
H[:,:,i] = numpy.histogram2d(xData,yData,centers)
I get the following error:
Traceback (most recent call last):
line 714, in histogramdd
N, D = sample.shape
AttributeError: 'list' object has no attribute 'shape'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
line 36, in <module>
H[:,:,i] = numpy.histogram2d(xData, yData, centers)
line 714, in histogram2d
hist, edges = histogramdd([x, y], bins, range, normed, weights)
line 718, in histogramdd
N, D = sample.shape
ValueError: too many values to unpack (expected 2)
Since Python doesn't have cell arrays, I changed centers to be an array of arrays where centers[0] = centersX and centers[1] = centersY. What do I need to change so that, assuming the data are the same in MATLAB and Python, the outputs match?
EDIT:
I have also tried H[:,:,i] = numpy.histogram2d(xData,yData, bins=(centersX,centersY)) to cut out the step of combining them into centers, but no luck.

Have you tried combining them with square brackets?
Maybe you can also use matplotlib.pyplot.hist2d.
H[:,:,i], *_ = numpy.histogram2d(xData,yData,bins=[centers[0], centers[1]])
H[:,:,i], *_ = matplotlib.pyplot.hist2d(xData,yData,bins=[centers[0], centers[1]])
In both, the values passed as bins are interpreted as bin edges, not centers, so you have to adjust the calculation. I think it is enough to remove the dX/2 offsets. Note that n edges define only n-1 bins, so use nbins+1 points if you want to keep nbins bins:
centersX(:,i)=linspace(min(xData),max(xData),nbins+1)';
centersY(:,i)=linspace(min(yData),max(phase),nbins+1)';
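On the Python side, the edge-based call might look like the sketch below. It assumes that dX and dY in the question equal (max - min) / nbins, so that the original centers describe nbins equal-width bins spanning the data range; the random data and the snake_case names are placeholders. Whether the counts match hist3 exactly for points on bin boundaries has not been verified here.

```python
import numpy as np

# Placeholder data standing in for xData and yData from the question.
rng = np.random.default_rng(0)
x_data = rng.normal(size=500)
y_data = rng.normal(size=500)
nbins = 8

# Assumption: bins are equal-width and span [min, max], so nbins bins need nbins + 1 edges.
x_edges = np.linspace(x_data.min(), x_data.max(), nbins + 1)
y_edges = np.linspace(y_data.min(), y_data.max(), nbins + 1)

# histogram2d returns the counts plus the edges along each axis.
H, _, _ = np.histogram2d(x_data, y_data, bins=[x_edges, y_edges])

# The bin centers can be recovered from the edges if they are needed for plotting.
x_centers = 0.5 * (x_edges[:-1] + x_edges[1:])
y_centers = 0.5 * (y_edges[:-1] + y_edges[1:])

print(H.shape)
```

Only the first return value is stored, which is also what the `H[:,:,i], *_ = ...` unpacking in the answer achieves.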

Length-1 Arrays and Python Scalars Via plt.text

I'm trying to use plt.text to plot temperature values at their associated lat/lon points on a plot.
After reviewing the plt.text documentation, it appears that the plotted value (third arg) has to be a number and that the number has to be a whole number, NOT a number with decimals.
Below is the code that I'm trying to work with and the associated traceback error that I'm receiving:
Script Code:
data = np.loadtxt('/.../.../.../tmax_day0', delimiter=',', skiprows=1)
grid_x, grid_y = np.mgrid[-85:64:dx, 34:49:dx]
temp = data[:,2]
#print temp
grid_z = griddata((data[:,1],data[:,0]), data[:,2], (grid_x,grid_y), method='linear')
x,y = m(data[:,1], data[:,0]) # flip lat/lon
grid_x,grid_y = m(grid_x,grid_y)
#m.plot(x,y, 'ko', markersize=2)
def str_to_float(str):
    try:
        number = float(str)
    except ValueError:
        number = 0.0
    return number
fmt = str_to_float(temp)
#annotate point temperature on plot
plt.text(grid_x, grid_y, fmt, fontdict=None)
Traceback Error:
Traceback (most recent call last):
File "plotpoints.py", line 56, in <module>
fmt = str_to_float(temp)
File "plotpoints.py", line 51, in str_to_float
number = float(str)
TypeError: only length-1 arrays can be converted to Python scalars
Data sample from text file tmax_day0:
latitude,longitude,value
36.65408,-83.21783,90
41.00928,-74.73628,92.02
43.77714,-71.75598,90
44.41944,-72.01944,88.8
39.5803,-79.3394,79
38.3154,-76.5501,86
38.91444,-82.09833,94
40.64985,-75.44771,92.6
41.25389,-70.05972,81.2
39.45202,-74.56699,90.88

I was able to achieve plotting data values only by using the following code:
for i in range(len(temp)):
    plt.text(x[i], y[i], temp[i], va="top", family="monospace")
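The per-point loop can be tried without the map projection. The sketch below is an illustration under assumptions: it plots on plain longitude/latitude axes instead of the Basemap object m used in the question, uses only the first three rows of the sample data, and formats each value to one decimal place.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First three rows of the sample file: latitude, longitude, value.
data = np.array([
    [36.65408, -83.21783, 90.0],
    [41.00928, -74.73628, 92.02],
    [43.77714, -71.75598, 90.0],
])
lat, lon, temp = data[:, 0], data[:, 1], data[:, 2]

fig, ax = plt.subplots()
ax.scatter(lon, lat, s=4, color="k")

# text() places one label per call, so loop over the points.
for x_i, y_i, t_i in zip(lon, lat, temp):
    ax.text(x_i, y_i, f"{t_i:.1f}", va="top", family="monospace")

fig.savefig("temps.png")
```

The label passed to text() may have decimals; the original TypeError came from calling float() on the whole array, not from the label values.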

The built-in float() converts a single value, but temp is a NumPy array with more than one element, which is why the call fails with "only length-1 arrays can be converted to Python scalars".
Going from your comment, this has been edited.
You would first need to fix the string so it's a proper array.
fmt = fmt[0].split()
I think this should work to create a new (normal) list of strings. Then map that to an array of floats (in Python 3, map returns an iterator, so wrap it in list first):
list_of_floats = np.array(list(map(float, fmt)))

Python: create multiple boxplots in one panel

I have been using R for long time and I am recently learning Python.
I would like to create multiple box plots in one panel in Python.
My dataset is in a vector form and a label vector indicates which box plot each element of data corresponds. The example looks like this:
N = 50
data = np.random.lognormal(size=N, mean=1.5, sigma=1.75)
label = np.repeat([1,2,3,4,5], N//5)
From various websites (e.g., matplotlib: Group boxplots), creating multiple boxplots requires a matrix-like input in which each column contains the samples for one boxplot. So I created a list object based on data and label:
savelist = data[ label == 1]
for i in [2,3,4,5]:
    savelist = [savelist, data[ label == i]]
However, the code below gives me an error:
boxplot(savelist)
Traceback (most recent call last):
File "<ipython-input-222-1a55d04981c4>", line 1, in <module>
boxplot(savelist)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2636, in boxplot
meanprops=meanprops, manage_xticks=manage_xticks)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3045, in boxplot
labels=labels)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/cbook.py", line 1962, in boxplot_stats
stats['mean'] = np.mean(x)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2727, in mean
out=out, keepdims=keepdims)
File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/_methods.py", line 66, in _mean
ret = umr_sum(arr, axis, dtype, out, keepdims)
ValueError: operands could not be broadcast together with shapes (2,) (10,)
Can anyone explain what is going on?

You're ending up with a nested list instead of a flat list. Try this instead:
savelist = [data[label == 1]]
for i in [2,3,4,5]:
    savelist.append(data[label == i])
And it should work.
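For reference, the same flat list can be built with a list comprehension. This is a minimal sketch that reuses the data and label definitions from the question; the seeded generator and the Agg backend are assumptions added so that it runs reproducibly without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

N = 50
rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.5, sigma=1.75, size=N)
label = np.repeat([1, 2, 3, 4, 5], N // 5)

# One 1D array per label value: a flat list, not a nested one.
groups = [data[label == k] for k in np.unique(label)]

fig, ax = plt.subplots()
result = ax.boxplot(groups)
ax.set_xticklabels([str(k) for k in np.unique(label)])
fig.savefig("boxplots.png")
```

Each element of groups becomes one box, so the five label values produce five boxes in a single panel.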

MemoryError during Fast Fourier Transform on an image using NumPy arrays under Windows

The code computes the Fourier transform of a .tiff image on my Ubuntu 11.04 machine. On Windows XP it raises a MemoryError. What should I change? Thank you.
def fouriertransform(result): #function for Fourier transform computation
    for filename in glob.iglob('*.tif'):
        imgfourier = scipy.misc.imread(filename) #read the image
        arrayfourier = numpy.array([imgfourier]) #make an array
        # Take the fourier transform of the image.
        F1 = fftpack.fft2(arrayfourier)
        # Now shift so that low spatial frequencies are in the center.
        F2 = fftpack.fftshift(F1)
        # the 2D power spectrum is:
        psd2D = np.abs(F2)**2
        L = psd2D
        np.set_printoptions(threshold=3)
        #np.set_printoptions(precision = 3, threshold = None, edgeitems = None, linewidth = 3, suppress = True, nanstr = None, infstr = None, formatter = None)
        for subarray in L:
            for array in subarray:
                for array in subarray:
                    for elem in array:
                        print '%3.10f\n' % elem
The error output is:
Traceback (most recent call last):
File "C:\Documents and Settings\HrenMudak\Мои документы\Моя музыка\fourier.py", line 27, in <module>
F1 = fftpack.fft2(arrayfourier)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 571, in fft2
return fftn(x,shape,axes,overwrite_x)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 521, in fftn
return _raw_fftn_dispatch(x, shape, axes, overwrite_x, 1)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 535, in _raw_fftn_dispatch
return _raw_fftnd(tmp,shape,axes,direction,overwrite_x,work_function)
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 463, in _raw_fftnd
x, copy_made = _fix_shape(x, s[i], waxes[i])
File "C:\Python27\lib\site-packages\scipy\fftpack\basic.py", line 134, in _fix_shape
z = zeros(s,x.dtype.char)
MemoryError

I've tried to run your code, except that I replaced the mahotas.imread with the scipy.misc.imread function, because I don't have that library, and I could not reproduce your error.
Some further remarks:
can you try to use the scipy.misc.imread function instead of the mahotas function? I suppose the issue could be there
what is the actual exception that is thrown? (+other output?)
what are the dimensions of your image? Gray-scale / RGB? Printing all values for a large image could indeed take up quite some memory, so it might be better to visualize the results with e.g. matplotlib's imshow function.
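As a point of comparison, here is a minimal sketch of the power-spectrum computation on a plain 2D gray-scale array. It swaps in numpy.fft for scipy.fftpack, uses a random array as a stand-in for the image, and passes the 2D array directly rather than wrapping it in an extra list as numpy.array([imgfourier]) does. Whether this avoids the MemoryError on the original machine is not established in the thread.

```python
import numpy as np

# Stand-in for a 2D gray-scale image; a real image would be loaded from disk instead.
rng = np.random.default_rng(0)
img = rng.random((64, 64))

# 2D FFT, then shift so that low spatial frequencies sit in the center.
F = np.fft.fftshift(np.fft.fft2(img))

# The 2D power spectrum.
psd2d = np.abs(F) ** 2

# A log scale is usually easier to inspect, e.g. with matplotlib's imshow;
# the small offset (an arbitrary choice) avoids log10(0).
log_psd = np.log10(psd2d + 1e-12)

print(psd2d.shape)
```

Passing log_psd to imshow shows the whole spectrum at once instead of printing every element.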

Python zero-size array to ufunc.reduce without identity

I'm trying to make a histogram of some data that is being stored in an ndarray. The histogram is part of a set of analysis which I've made into a class in a python program. The part of the code that isn't working is below.
def histogram(self, iters):
    samples = T.MCMC(iters) #Returns an [iters,3,4] ndarray
    histAC = plt.figure(self.ip) #plt is matplotlib's pyplot
    self.ip+=1 #defined at the beginning of the class to start at 0
    for l in range(0,4):
        h = histAC.add_subplot(2,(iters+1)/2,l+1)
        for i in range(0,0.5*self.chan_num):
            intAvg = mean(samples[:,i,l])
            print intAvg
            for k in range(0,iters):
                samples[k,i,l]=samples[k,i,l]-intAvg
        print "Samples is ",samples
        h.hist(samples,bins=5000,range=[-6e-9,6e-9],histtype='step')
        h.legend(loc='upper right')
        h.set_title("AC Pulse Integral Histograms: "+str(l))
    figname = 'ACHistograms.png'
    figpath = 'plot'+str(self.ip)
    print "Finished!"
    #plt.savefig(figpath + figname, format = 'png')
This gives me the following error message:
File "johnmcmc.py", line 257, in histogram
h.hist(samples,bins=5000,range=[-6e-9,6e-9],histtype='step') #removed label=apdlabel
File "/x/tsfit/local/lib/python2.6/site-packages/matplotlib/axes.py", line 7238, in hist
ymin = np.amin(m[m!=0]) # filter out the 0 height bins
File "/x/tsfit/local/lib/python2.6/site-packages/numpy/core/fromnumeric.py", line 1829, in amin
return amin(axis, out)
ValueError: zero-size array to ufunc.reduce without identity
The only search results I've found have been multiple copies of the same two conversations, from which the only thing I learned was that python histograms don't like getting fed empty arrays, which is why I added the print statement right above the line that's giving me trouble to make sure the array isn't empty.
Has anyone else come across this error before?
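One thing that may be worth checking: the traceback fails at np.amin(m[m!=0]), which filters out the zero-height bins. That suggests, though the thread does not confirm it, that every bin was empty, which can happen when the array itself is non-empty but none of its values fall inside the requested range. The sketch below uses made-up sample values placed far outside the range to show how to test for that before plotting.

```python
import numpy as np

# Hypothetical values: a non-empty array whose entries all lie far outside the range.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1e-6, scale=1e-8, size=1000)
hist_range = (-6e-9, 6e-9)

# Count how many values the histogram would actually see.
in_range = np.count_nonzero((samples >= hist_range[0]) & (samples <= hist_range[1]))

# np.histogram ignores values outside `range`, so every bin can end up empty.
counts, _ = np.histogram(samples, bins=50, range=hist_range)

print(samples.size, in_range, counts.sum())
```

If in_range is zero for the real data, widening the range or checking the units of the samples would be a reasonable next step.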
