How would I translate the following into Python from Matlab? I'm still trying to wrap my head around lists/matrices and arrays in numpy, etc.
outframe(:,[4:4:nout-1]) = 0.25*inframe(:,[1:n-1]) + 0.75*inframe(:,[2:n])
pos=(beamnum>0)*(beamnum<=nbeams)*(binnum>0)*(binnum<=nbins)*((beamnum-1)*nbins+binnum)
for index = 1:512
    outarray(index,:) = uint8(interp1([1:n], inarray64(index,:), [1:.25:n], method))
end
(There's other stuff; these are just the particular statements I'm not sure how to make sense of. I have numpy imported.)
The main workhorse in numpy is the ndarray (or array). It will for the most part replace MATLAB matrices when you translate code. Like a MATLAB matrix, the ndarray stores homogeneous data (e.g. float64) and is optimized for numerical operations.
The numpy matrix is a subclass of the ndarray which can be convenient for some linear algebra intensive applications. Here is more info about the differences between the two.
The python list is more like a matlab cell array (though not exactly the same). It's one of the basic python data structures, but in scientific applications I find that it comes up most often when you need to hold heterogeneous data. (Or when you're doing something very simple and don't want to go to the trouble of creating a numpy array).
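A minimal illustration of the three (the variable names are just for this sketch):
import numpy as np

a = np.array([1.0, 2.0, 3.0])    # ndarray: homogeneous float64 data
m = np.matrix([[1, 2], [3, 4]])  # matrix subclass: always 2-D; discouraged in modern numpy
cell_like = [1, 'two', 3.0]      # python list: heterogeneous, similar to a cell array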
Your code above can be converted almost verbatim to Python using the ndarray: replace () with [] for indexing, and take into account that indexing starts at 1 in MATLAB and at 0 in Python, i.e. the first element in MATLAB is element 1, and in Python it is element 0.
Let's try this line by line:
outframe(:,[4:4:nout-1]) = 0.25*inframe(:,[1:n-1]) + 0.75*inframe(:,[2:n])
would translate in "English" to: all rows of outframe, but only every 4th column, starting from 4 up to nout-1 (i.e. 4, 8, ...). I assume you understand what the inframe references mean.
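In numpy that line becomes the following (a sketch assuming outframe and inframe are existing 2-D ndarrays and nout and n are set as in your code); note the shift to 0-based indices and half-open slices:
# MATLAB columns 4:4:nout-1 are 0-based indices 3, 7, ... below nout-1;
# MATLAB 1:n-1 and 2:n become the half-open slices 0:n-1 and 1:n.
outframe[:, 3:nout-1:4] = 0.25 * inframe[:, 0:n-1] + 0.75 * inframe[:, 1:n]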
pos=(beamnum>0)*(beamnum<=nbeams)*(binnum>0)*(binnum<=nbins)*((beamnum-1)*nbins+binnum)
Presumably beamnum is a vector, and (beamnum > 0) returns a vector of {0, 1} such that the elements are 1 where the respective beamnum element is > 0, and 0 otherwise. The rest of it is clear, I hope.
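In numpy the comparisons return boolean arrays rather than 0/1 vectors, but multiplying them works the same way, so the line carries over almost unchanged (sketch, assuming beamnum and binnum are ndarrays):
# Boolean masks multiply like 0/1 vectors, zeroing out-of-range entries.
pos = ((beamnum > 0) * (beamnum <= nbeams)
       * (binnum > 0) * (binnum <= nbins)
       * ((beamnum - 1) * nbins + binnum))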
The second-to-last line is a for-loop, and the last line should hopefully be clear.
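For completeness, a hedged sketch of the loop: numpy.interp only covers linear interpolation, so for an arbitrary method string you would typically use scipy.interpolate.interp1d (whose kind names overlap with, but don't exactly match, MATLAB's interp1 methods; inarray64, outarray and method come from your code):
import numpy as np
from scipy.interpolate import interp1d

n = inarray64.shape[1]
x = np.arange(n)                             # MATLAB 1:n, shifted to 0-based
xi = np.linspace(0, n - 1, 4 * (n - 1) + 1)  # MATLAB 1:.25:n, shifted
outarray = np.empty((512, xi.size), dtype=np.uint8)
for index in range(512):                     # MATLAB for index = 1:512
    f = interp1d(x, inarray64[index, :], kind=method)
    outarray[index, :] = np.round(f(xi)).astype(np.uint8)  # MATLAB's uint8() also rounds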
Following the answer in How to efficiently convert Matlab engine arrays to numpy ndarray?, it seems much more efficient to access the matlab engine array through the _data property.
However, it appears that there is no _data property when the array returned by MATLAB is a 'complex single' one. Is there equivalent fast access for an array of complex numbers?
A possible workaround is to return two real arrays from MATLAB (one containing the real part, the other the imaginary part) and build the complex values back in Python:
M_real, M_imag = myMatlabFunction()
M_real_np = np.array(M_real._data)
M_imag_np = np.array(M_imag._data)
M_np = M_real_np + 1j * M_imag_np  # np.complex was removed from numpy; 1j does the same job
Then we can profit from the fast access to the _data member of each array.
I am still interested in a more straightforward solution.
Many functions, like in1d and setdiff1d, are designed for 1-D arrays. One workaround to apply these methods to N-dimensional arrays is to make numpy treat each row (or higher-dimensional subarray) as a single value.
One approach I found to do so is in Joe Kington's answer to Get intersecting rows across two 2D numpy arrays.
The following code is taken from that answer. The task Joe Kington faced was to detect the rows common to two arrays A and B while trying to use in1d.
import numpy as np
A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])
nrows, ncols = A.shape
dtype = {'names': ['f{}'.format(i) for i in range(ncols)],
         'formats': ncols * [A.dtype]}
C = np.intersect1d(A.view(dtype), B.view(dtype))
# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)
I am hoping you can help me with any of the following three questions. First, I do not understand the mechanism behind this method. Can you explain it to me?
Second, are there other ways to let numpy treat a subarray as one object?
One more open question: does Joe's approach have any drawbacks? I mean, might treating rows as single values cause problems? Sorry, this question is pretty broad.
I'll try to post what I have learned. The method Joe used is called structured arrays; it allows users to define what is contained in a single cell/element.
Let's take a look at the first example the documentation provides, and its description.
x = np.array([(1, 2., 'Hello'), (2, 3., "World")],
             dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])
Here we have created a one-dimensional array of length 2. Each element of this array is a structure that contains three items, a 32-bit integer, a 32-bit float, and a string of length 10 or less.
Without passing in dtype, however, we would get a plain 2-by-3 array instead.
With this method, we can let numpy treat a higher-dimensional subarray as a single element, as long as the dtype is set up properly.
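As a small sketch of that collapsing effect, reusing A from the snippet above (the view changes the shape without copying any data):
import numpy as np

A = np.array([[1, 4], [2, 5], [3, 6]])
row_dtype = {'names': ['f0', 'f1'], 'formats': [A.dtype, A.dtype]}

Av = A.view(row_dtype)
print(A.shape, Av.shape)   # (3, 2) (3, 1): each row is now one structured element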
Another trick Joe showed is that we don't really need to form a new numpy array to achieve this. We can use the view method (see ndarray.view) to change the way numpy views the data. There is a Notes section in the ndarray.view documentation that I think you should read before using this method; I can't guarantee there are no side effects. The paragraph below is from that Notes section and seems to call for caution.
For a.view(some_dtype), if some_dtype has a different number of bytes per entry than the previous dtype (for example, converting a regular array to a structured array), then the behavior of the view cannot be predicted just from the superficial appearance of a (shown by print(a)). It also depends on exactly how a is stored in memory. Therefore if a is C-ordered versus fortran-ordered, versus defined as a slice or transpose, etc., the view may give different results.
Other references:
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html
I am building a Python application where I retrieve a list of objects and I want to plot them (for plotting I use matplotlib). Each object in the list contains two properties.
For example, let's say I have the list rawdata, and the objects stored in it have the properties timestamp and power:
rawdata[0].timestamp == 1
rawdata[1].timestamp == 2
rawdata[2].timestamp == 3
etc
rawdata[0].power == 1232.547
rawdata[1].power == 2525.423
rawdata[2].power == 1125.253
etc
I want to be able to plot the two dimensions that these properties represent, and I want to do it in a time- and space-efficient way. That means I want to avoid iterating over the list and sequentially constructing something like a numpy array out of it.
Is there a way to apply an on-the-fly transformation to the list? Or to somehow plot it as it is? Since all the information is already included in the list, I believe there should be a way.
The closest answer I found was this, but it includes sequential iteration over the list.
Update
As pointed out by Antonio Ragagnin, I can use the map builtin function to construct a numpy array efficiently. But that also means I will have to create a second data structure. Can I use map to transform the list on the fly into a two-dimensional numpy array?
From the matplotlib tutorial (emphasis mine):
If matplotlib were limited to working with lists, it would be fairly useless for numeric processing. Generally, you will use numpy arrays. In fact, all sequences are converted to numpy arrays internally.
So you lose nothing by converting your data to numpy arrays yourself; if you don't, matplotlib will do it for you.
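For example, a minimal sketch using the names from the question; np.fromiter builds each array straight from a generator, so no intermediate Python list is created:
import numpy as np
import matplotlib.pyplot as plt

n = len(rawdata)
t = np.fromiter((obj.timestamp for obj in rawdata), dtype=float, count=n)
p = np.fromiter((obj.power for obj in rawdata), dtype=float, count=n)

plt.plot(t, p)
plt.xlabel('timestamp')
plt.ylabel('power')
plt.show()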
I have what I thought would be a simple task in numpy, but I'm having trouble.
I have a function which takes an index in the array and returns the value that belongs at that index. I would like to write those values into a numpy array efficiently.
I have found numpy.fromfunction, but it doesn't behave remotely like the documentation suggests. It seems to "vectorise" the function, which means that instead of passing the actual indices it passes a numpy array of indices:
def vsin(i):
    return float(round(A * math.sin((2 * pi * wf) * i)))

numpy.fromfunction(vsin, (len,), dtype=numpy.int16)
# TypeError: only length-1 arrays can be converted to Python scalars
(if we use a debugger to inspect i, it is a numpy.array instance.)
So, if we try to use numpy's vectorised sin function:
def vsin(i):
    return (A * numpy.sin((2 * pi * wf) * i)).astype(numpy.int16)

numpy.fromfunction(vsin, (len,), dtype=numpy.int16)
We don't get a type error, but if len > 2**15 we get discontinuities chopping across our oscillator, because numpy is using int16_t to represent the index!
The point here isn't about sin in particular: I want to be able to write arbitrary Python functions like this (whether a numpy-vectorised version exists or not) and run them inside a tight C loop (rather than a roundabout Python one), without having to worry about integer wraparound.
Do I really have to write my own cython extension in order to be able to do this? Doesn't numpy have support for running python functions once per item in an array, with access to the index?
It doesn't have to be a creation function: I can use numpy.empty (or indeed reuse an existing array from somewhere else), so a vectorised transformation function would also do.
I think the issue of integer wraparound is unrelated to numpy's vectorized sin implementation, and even to the use of Python or C.
If you use a 2-byte signed integer and try to generate an array of integer values ranging from 0 to above 32767, you will get a wrap-around error. The array will look like:
[0, 1, 2, ... , 32767, -32768, -32767, ...]
The simplest solution, assuming memory is not too tight, is to use more bytes for the integer dtype passed to fromfunction, so you don't have a wrap-around problem in the first place (int32 is good up to a couple of billion):
numpy.fromfunction(vsin, (len,), dtype=numpy.int32)
numpy is optimized to work fast on arrays by passing the whole array around between vectorized functions. I think in general the numpy tools are inconvenient for trying to run scalar functions once per array element.
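That said, if you do want an arbitrary scalar Python function evaluated once per index, one option is np.vectorize; a sketch (scalar_f is a made-up stand-in, and np.vectorize is a convenience wrapper around a Python loop, not a fast C loop):
import numpy as np

def scalar_f(i):
    return (i * i) % 7   # any scalar python function of the index

vf = np.vectorize(scalar_f, otypes=[np.int16])
out = vf(np.arange(100000, dtype=np.int64))   # 64-bit indices: no wraparound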
I am currently writing some code which is supposed to perform FFT on a set of data. I have a Python list of points and I can easily create a time list. When I run fft(datalist), I get the error 'TypeError: 'numpy.ndarray' object is not callable'. I think (but please correct me) the issue is that the list is one-dimensional, with no attachment to time at all through that one line of code. My question is: do I have to input a two-dimensional array with time and data points? Or am I completely wrong and need to rethink?
Thanks, Mike
Edit - forgot to add some code. The t = time. Could it be because the number of entries in the array isn't equal to 2^N, where N is an integer?
sample_rate = 10.00
t = r_[0:191.6:1/sample_rate]
S = fft([mylist])
print(S)
The numpy and scipy fft functions are looking for numpy arrays as input, not native Python lists. Also, they work just fine with lengths that are not powers of two. You probably just need to cast your list as an array before passing it to the fft.
From your example code above try:
from numpy.fft import fft   # numpy's fft lives in numpy.fft; fftpack is scipy's
from numpy import array

""" However you generate your list goes here """
S = fft(array(mylist))      # array(mylist), not array([mylist]): the extra brackets would make a 1-by-N 2-D array
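A self-contained usage sketch (the 0.5 Hz test signal is made up, with the sample rate and duration taken from the question):
import numpy as np
from numpy.fft import fft

sample_rate = 10.0
t = np.r_[0:191.6:1/sample_rate]
mylist = list(np.sin(2 * np.pi * 0.5 * t))    # 0.5 Hz test tone

S = fft(np.array(mylist))
freqs = np.fft.fftfreq(len(mylist), d=1/sample_rate)
peak = np.argmax(np.abs(S[1:len(S)//2])) + 1  # skip the DC bin
print(freqs[peak])                            # ~0.5, the test frequency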