Turning a function's output of a tuple of 2 arrays into a 2d array - python

New to Python (MATLAB background).
I have a function (np.unique) that can output either 1 or 2 arrays:
the array of unique values.
the count of each value (enabled by setting the argument return_counts=True).
When the function is set to return a single array only, assigning the result to the undefined variable "uni" makes it an ndarray:
uni = np.unique(iris_2d['species'], return_counts=False)
But when the function is set to return 2 arrays, the variable "uni" is created as a tuple containing 2 ndarrays.
Is there a way to force the output directly into a 2d array (and a multidimensional array in general), without predefining the variable "uni" or using a second function like numpy.stack/numpy.asarray?
import numpy as np
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')
dtype = np.dtype({'names': names, 'formats': np.append(np.repeat('float', 4), '<U16')})
iris_2d = np.genfromtxt(url, delimiter=',', dtype=dtype, usecols=[0, 1, 2, 3, 4])
uni_isTuple = np.unique(iris_2d['species'], return_counts=True)
uni_isNdArray = np.unique(iris_2d['species'], return_counts=False)

I'm unaware of a way to force np.unique() to return an ndarray instead of a tuple. I realize you asked for a solution that doesn't call another function, but if you'll tolerate passing the tuple to np.array() to build an ndarray from it, that might give you what you want.
uni_isTuple = np.array(np.unique(iris_2d['species'], return_counts=True))
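For a concrete illustration with made-up numeric data, wrapping the returned tuple in np.array() stacks values and counts into a 2-row array. (One caveat: for a string-valued column like species, the integer counts would get promoted to the common string dtype, so this works cleanest with numeric data.)

```python
import numpy as np

# np.unique returns (values, counts) as a tuple of two equal-length arrays;
# np.array stacks them into a 2-D array: row 0 = values, row 1 = counts.
vals_counts = np.array(np.unique([1, 1, 2, 3, 3, 3], return_counts=True))
print(vals_counts.shape)  # (2, 3)
print(vals_counts)
# [[1 2 3]
#  [2 1 3]]
```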

numpy: get ndarray's value from an index array

I have a high-dimensional numpy array; the number of dimensions is not fixed. I need to retrieve a value with an index list whose length is the same as the number of dimensions of the array.
In other words, I need a function:
def get_value_by_list_index(target_array, index_list):
    # len(index_list) == target_array.ndim
    # target_array can have any number of dimensions
    # return the element at the index specified by the index list
For example, for a 3-dimensional array data and a list [i1, i2, i3], the function should return data[i1][i2][i3].
Is there a good way to achieve this task?
If you know the ndarray actually contains a type that is well representable by built-in Python types:
source_array.item(*index_iterable)
will do the job.
If you need to work with ndarrays of more complex types that might not have a python built-in type representation, things are harder.
You could implement exactly what you sketch in your comment:
data[i1][i2][i3]
# note that I didn't like the name of your function
def get_value_by_index_iterable(source_array, index_iterable):
    subarray = source_array
    for index in index_iterable:
        subarray = subarray[index]
    return subarray
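A quick usage sketch of the loop-based helper; the 3-D array and the indices are made up for illustration:

```python
import numpy as np

def get_value_by_index_iterable(source_array, index_iterable):
    # descend one dimension per index
    subarray = source_array
    for index in index_iterable:
        subarray = subarray[index]
    return subarray

data = np.arange(24).reshape(2, 3, 4)  # an example 3-D array
print(get_value_by_index_iterable(data, [1, 2, 3]))  # 23, same as data[1][2][3]
```

For what it's worth, NumPy also accepts a tuple as an index, so data[tuple([1, 2, 3])] selects the same element in one step.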

python element wise operation in function

In Python, I am trying to change the values of an np array inside a function:
def function(array):
    array = array + 1

array = np.zeros((10, 1))
function(array)
Since array is a function parameter, it is supposed to be a reference, and I should be able to modify its contents inside the function.
array = array + 1 performs an element-wise operation that adds one to every element of the array, so it should change the values inside.
But the array does not actually change after the function call. I am guessing the program thinks I am trying to change the reference itself, not the contents of the array, because of the syntax of the element-wise operation. Is there any way to get the intended behavior? I don't want to loop through individual elements or have the function return the new array.
This line:
array = array + 1
… does perform an elementwise operation, but the operation it performs is creating a new array with each element incremented. Assigning that array back to the local variable array doesn't do anything useful, because that local variable is about to go away, and you haven't done anything to change the global variable of the same name.
On the other hand, this line:
array += 1
… performs the elementwise operation of incrementing all of the elements in-place, which is probably what you want here.
In Python, mutable collections are allowed, but not required, to handle the += statement this way; they could handle it the same way as array = array + 1 (as immutable types like str do). But built-in types like list, and most popular third-party types like np.ndarray, do what you want.
Another solution if you want to change the content of your array is to use this:
array[:] = array + 1
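A small sketch contrasting the three forms (the function names here are made up for the demo):

```python
import numpy as np

def rebind(array):
    array = array + 1       # builds a new array; only rebinds the local name

def add_inplace(array):
    array += 1              # increments the caller's array in-place

def slice_assign(array):
    array[:] = array + 1    # writes the new values into the existing buffer

a = np.zeros(3)
rebind(a)
print(a)  # [0. 0. 0.]  -- unchanged
add_inplace(a)
print(a)  # [1. 1. 1.]
slice_assign(a)
print(a)  # [2. 2. 2.]
```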

Random array from list of arrays by numpy.random.choice()

I have a list of arrays similar to lstB and want to pick a random collection of 2D arrays. The problem is that numpy somehow does not treat the objects in the lists equally:
lstA = [numpy.array(0), numpy.array(1)]
lstB = [numpy.array([0,1]), numpy.array([1,0])]
print(numpy.random.choice(lstA)) # returns 0 or 1
print(numpy.random.choice(lstB)) # returns ValueError: must be 1-dimensional
Is there an elegant fix to this?
Let's call it semi-elegant...
# force 1d object array
swap = lstB[0]
lstB[0] = None
arrB = np.array(lstB)
# reinsert value
arrB[0] = swap
# and clean up
lstB[0] = swap
# draw
numpy.random.choice(arrB)
# array([1, 0])
Explanation: the problem you encountered appears to be that numpy, when converting the input list to an array, will make as deep an array as it can. Since all your list elements are sequences of the same length, this will be 2d. The hack shown here forces it to make a 1d array of object dtype instead, by temporarily inserting an incompatible element.
However, I personally would not use this, because if you draw multiple subarrays with this method you'll get a 1d array of arrays, which is probably not what you want and is tedious to convert.
So I'd actually second what one of the comments recommends, i.e. draw ints and then use advanced indexing into np.array(lstB).
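That recommendation can be sketched like so (the sample size of 3 is arbitrary):

```python
import numpy as np

lstB = [np.array([0, 1]), np.array([1, 0])]
arrB = np.array(lstB)                       # shape (2, 2): one row per array
idx = np.random.choice(len(lstB), size=3)   # draw row indices, with replacement
picks = arrB[idx]                           # advanced indexing: shape (3, 2)
print(picks.shape)  # (3, 2)
```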

Is there any way to use the "out" argument of a Numpy function when modifying an array in place?

If I want the dot product of two arrays, I can get a performance boost by specifying an array to store the output in, instead of creating a new array (if I am performing this operation many times):
import numpy as np
a = np.array([[1.0,2.0],[3.0,4.0]])
b = np.array([[2.0,2.0],[2.0,2.0]])
out = np.empty([2,2])
np.dot(a,b, out = out)
Is there any way I can take advantage of this feature if I need to modify an array in place? For instance, if I want:
out = np.array([[3.0,3.0],[3.0,3.0]])
out *= np.dot(a,b)
Yes, you can use the out argument to modify an array (e.g. array=np.ones(10)) in-place, e.g. np.multiply(array, 3, out=array).
You can even use in-place operator syntax, e.g. array *= 2.
To confirm if the array was updated in-place, you can check the memory address array.ctypes.data before and after the modification.
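A minimal sketch of that check:

```python
import numpy as np

arr = np.ones(4)
addr_before = arr.ctypes.data          # address of the underlying data buffer
np.multiply(arr, 3, out=arr)           # elementwise multiply, written in-place
arr *= 2                               # in-place operator syntax, same buffer
assert arr.ctypes.data == addr_before  # no new array was allocated
print(arr)  # [6. 6. 6. 6.]
```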

numpy loadtxt single line/row as list

I have a data file with only one line like:
1.2 2.1 3.2
I used numpy version 1.3.0's loadtxt to load it:
a, b, c = loadtxt("data.dat", usecols=(0, 1, 2), unpack=True)
The output was a float instead of an array, like
a = 1.2
I expected it would be:
a = array([1.2])
If I read a file with multiple lines, it works.
Simply use numpy's built-in loadtxt parameter ndmin:
a, b, c = np.loadtxt('data.dat', ndmin=2, unpack=True)
Output:
a = array([1.2])
What is happening is that when you load the file you obtain a one-dimensional array. When you unpack it, you obtain a set of numbers, i.e. arrays without dimensions. This is because unpacking an array decreases its number of dimensions by one; starting from a one-dimensional array, it boils down to a simple number.
If you test the type of a, it is not a float but a numpy.float64, which has all the properties of an array but an empty tuple as its shape. So it is an array, it just is not represented as one.
If what you need is a one-dimensional array with just one element, the simplest way is to reshape your array before unpacking it:
# note the reshape to transform the shape
a, b, c = loadtxt("text.txt").reshape((-1, 1))
This gives the expected result. What is happening is that we reshaped it into a two-dimensional array, so that when you unpack it, the number of dimensions goes down to one.
EDIT:
If you need it to work normally for multidimensional arrays and to keep one-dimensional arrays when you read a one-dimensional file, I think the best way is to read normally with loadtxt and reshape your arrays in a second step, converting them to one-dimensional if they are plain numbers (note that ndarray.reshape returns a new view rather than modifying in place, so the results must be assigned back):
a, b, c = loadtxt("text.txt", unpack=True)
a, b, c = [e.reshape(e.shape if e.shape else (-1,)) for e in (a, b, c)]
The simple way, without using reshape, is to explicitly wrap the scalars in lists:
a, b, c = loadtxt("data.dat", usecols=(0, 1, 2), unpack=True)
a, b, c = (a, b, c) if a.shape else ([a], [b], [c])
This works faster than the reshape!
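To tie the answers together, here is a self-contained sketch of the ndmin=2 approach, using io.StringIO to stand in for the one-line data file:

```python
import io
import numpy as np

one_line = io.StringIO("1.2 2.1 3.2")  # stands in for the one-line data file
# ndmin=2 keeps the loaded data 2-D, so unpacking yields 1-D arrays,
# not dimensionless scalars.
a, b, c = np.loadtxt(one_line, ndmin=2, unpack=True)
print(a)  # [1.2]  -- a one-element 1-D array, not a bare float
```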
