Splitting an array in Python

How do you split an array in Python by the number of elements in the array? I'm doing kNN classification and I need to take the first k elements of the 2D array into account.

import numpy as np
x = np.array([1, 2, 4, 4, 6, 7])
print(x[range(0, 4)])
You can split it up by taking the range of elements that you want to work with. You could also store x[range(start, stop)] in a variable and work with those particular elements of the array. As the output shows, this selects the first four elements:
[1 2 4 4]

NumPy also provides the function numpy.split, which cuts an array into equal sections:
x = np.arange(9.0)
np.split(x, 3)
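For the kNN case in the question, plain slicing along the first axis is probably enough. The following is a minimal sketch; the data array and the value of k are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical 2D data: 6 samples (rows) with 2 features each.
data = np.array([[1, 2], [2, 3], [4, 5], [4, 6], [6, 7], [7, 8]])
k = 4  # assumed number of neighbours

first_k = data[:k]   # the first k rows
rest = data[k:]      # everything after them

# np.split with a list of indices cuts at those positions along axis 0.
head, tail = np.split(data, [k])

# np.array_split also accepts lengths that do not divide evenly,
# where np.split with an integer would raise a ValueError.
chunks = np.array_split(np.arange(10), 3)
print(first_k)
print([len(c) for c in chunks])
```

Slicing returns a view of the original array, so no data is copied until you modify or copy it explicitly.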

Related

How to expand or "scale up" a 1d array?

I have a piece of C code that can only handle an array of size 20. The array that my instrument outputs is much smaller than what the function requires. Is there a numpy or math function that can "scale up" an array to any specific size while maintaining its structural integrity? For example:
I have an 8-element array that is basically a two-ramp "sawtooth", meaning its values are:
[1 2 3 4 1 2 3 4]
What I need for the C code is a 20-element array. So I can scale it by padding linear intervals of the original array with "0"s, like:
[1,0,0,2,0,0,3,0,0,4,0,0,1,0,0,2,0,0,3,0]
so it adds up to 20 elements. I would think this process is the opposite of "decimation". (I apologize, I'm simplifying this process so it will be a bit more understandable.)
Based on your example, I guess the following approach could be tweaked to do what you want:
Upsample with 0s: upsampled_l = [[i, 0, 0] for i in l], with l being your initial list.
Flatten the array: flat_l = flatten(upsampled_l), using a method from "How to make a flat list out of a list of lists?", for instance.
Get the expected length: final_l = flat_l[:20].
For instance, the following code gives the output you gave in your example:
l = [1, 2, 3, 4, 1, 2, 3, 4]
upsampled_l = [[i, 0, 0] for i in l]
flat_l = [item for sublist in upsampled_l for item in sublist]
final_l = flat_l[:20]
However, the final element of the initial list (the second 4) is missing from the final list. Perhaps it's worth upsampling with only one 0 in between ([i, 0] instead of [i, 0, 0]) and then padding the result to 20 elements with final_l.extend([0 for _ in range(20 - len(final_l))]).
Hope this helps!
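The one-zero variant suggested above could look like the sketch below. The target length of 20 comes from the question; the variable names simply follow the answer.

```python
l = [1, 2, 3, 4, 1, 2, 3, 4]
target_len = 20

# One zero after each value gives 16 elements, so nothing is cut off.
upsampled_l = [[i, 0] for i in l]
flat_l = [item for sublist in upsampled_l for item in sublist]

# Truncate if too long, then pad with zeros up to the target length.
final_l = flat_l[:target_len]
final_l.extend([0] * (target_len - len(final_l)))
print(final_l)
```

All eight original values survive here, at the cost of an uneven spacing at the end of the array.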
You can manage it in a one-liner by adding zeros as another axis, then flattening:
import numpy as np
sm = np.array([1, 2, 3, 4, 1, 2, 3, 4])
# Two zeros per value gives 24 elements; slice to the 20 the C code needs.
np.concatenate([np.reshape(sm, (8, 1)), np.zeros((8, 2))], axis=1).flatten()[:20]
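Another way to get the same kind of result is strided assignment into a zero array. This is only a sketch: zero_stuff, factor, and target_len are names invented here, not part of NumPy or of the answers above.

```python
import numpy as np

def zero_stuff(values, factor, target_len):
    """Place each value every `factor` slots, zeros in between,
    then truncate or zero-pad to exactly `target_len` elements."""
    values = np.asarray(values)
    out = np.zeros(len(values) * factor, dtype=values.dtype)
    out[::factor] = values  # values land on indices 0, factor, 2*factor, ...
    if len(out) >= target_len:
        return out[:target_len]
    return np.pad(out, (0, target_len - len(out)), mode="constant")

sm = [1, 2, 3, 4, 1, 2, 3, 4]
print(zero_stuff(sm, 3, 20))
print(zero_stuff(sm, 2, 20))
```

With factor 3 the last sample is truncated away, as in the question's example; with factor 2 every sample is kept and the tail is padded.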

How can I build a complementary array in numpy

I have an array of numbers corresponding to indices of another array.
index_array = np.array([2, 3, 5])
What I want to do is to create another array with the remaining numbers: 0, 1, 4, 6, 7, 8, 9. What I have come up with is:
index_list = []
for i in range(10):
    if i not in index_array:
        index_list.append(i)
This works but I don't know if there is a more efficient way to do it or even a built-in function for it.
Probably the simplest solution is just to remove unwanted indices from the set:
n = 10
index_array = [2, 3, 5]
complement = np.delete(np.arange(n), index_array)
You can use numpy.setdiff1d to efficiently collect the unique values from a "universal array" that aren't in your index array. Passing assume_unique=True provides a small speed-up.
When assume_unique is True, the result will be sorted so long as the input is sorted.
import numpy as np
# "Universal set" to take complement with respect to.
universe = np.arange(10)
a = np.array([2,3,5])
complement = np.setdiff1d(universe, a, assume_unique=True)
print(complement)
Results in
[0 1 4 6 7 8 9]
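A boolean mask is a third option that is sometimes convenient when the complement is later used for indexing. This is a sketch under the same assumptions as the question (indices drawn from range(10)).

```python
import numpy as np

n = 10
index_array = np.array([2, 3, 5])

# Start with every position selected, then switch off the listed indices.
mask = np.ones(n, dtype=bool)
mask[index_array] = False

# flatnonzero returns the positions that are still True.
complement = np.flatnonzero(mask)
print(complement)
```

The mask itself can also be applied directly to the other array (other[mask]) without ever materialising the complementary indices.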

how to slice 2D numpy array based on 2 1D arrays containing initial and final indexes

I have a 2D numpy array, let's say it has shape 4x10 (4 rows and 10 columns). I have two 1D arrays that hold the initial and final indexes, one entry per row, so they are both 4x1. For example, let's say
initial = [1, 2, 4, 5]
final = [3, 6, 8, 6]
then I'd like to get
data[0,1:3]
data[1,2:6]
data[2,4:8]
data[3,5:6]
Of course, each of those arrays would have different size, so I'd like to store them in a list.
If I were to do it with a for loop, it'd look like this:
arrays = []
for i in range(4):
    row_slice = data[i, initial[i]:final[i]]
    arrays.append(row_slice)
Is there a more efficient way to do this? I'd rather avoid using a for loop, because my actual data is huge.
You can use numpy.split on the flattened data (using numpy.ndarray.flatten), after shifting the slice boundaries by each row's offset in the flattened array:
sections = np.column_stack([initial, final]).flatten()
sections[::2] += np.arange(len(initial)) * data.shape[1]
sections[1::2] += sections[::2] - np.array(initial)
np.split(data.flatten(), sections)[1::2]
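The offset arithmetic can be easier to follow when wrapped in a small function and checked against the loop. The sketch below assumes initial[i] <= final[i] and that both stay within the row; ragged_row_slices is a name made up here.

```python
import numpy as np

def ragged_row_slices(data, initial, final):
    """Return [data[i, initial[i]:final[i]] for every row i] with one np.split call."""
    initial = np.asarray(initial)
    final = np.asarray(final)
    # Position of each row's first element in the flattened array.
    offsets = np.arange(data.shape[0]) * data.shape[1]
    # Interleave start/stop boundaries: s0, e0, s1, e1, ...
    sections = np.column_stack([initial + offsets, final + offsets]).ravel()
    # Odd-numbered pieces lie between a start and its stop.
    return np.split(data.ravel(), sections)[1::2]

data = np.arange(40).reshape(4, 10)
initial = [1, 2, 4, 5]
final = [3, 6, 8, 6]

pieces = ragged_row_slices(data, initial, final)
expected = [data[i, initial[i]:final[i]] for i in range(4)]
print(pieces)
```

np.split still iterates over the sections internally, so it is worth timing both versions on real data before assuming this is faster than the explicit loop.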

Numpy: How to stack arrays in columns?

Let's say that I have n numpy arrays of the same length. I would now like to create a numpy matrix such that each column of the matrix is one of the numpy arrays. How can I achieve this? Right now I'm doing this in a loop and it produces the wrong results.
Note: I have to be able to stack them next to each other one by one iteratively.
My code looks like this. Assume that get_array is a function that returns a certain array based on its argument. I don't know until after the loop how many columns I'm going to have.
matrix = np.empty((n_rows,))
for item in sorted_arrays:
    array = get_array(item)
    matrix = np.vstack((matrix,array))
Any help would be appreciated.
You could try putting all your arrays (or lists) into a matrix and then transposing it. This will work if all arrays are the same length.
mymatrix = np.asmatrix((array1, array2, array3)) #... putting arrays into matrix.
mymatrix = mymatrix.transpose()
This should output a matrix with each array as a column. Hope this helps.
Time and again, we recommend collecting the arrays in a list, and making the final array with one call. That's more efficient, and usually easier to get right.
alist = []
for item in sorted_arrays:
    alist.append(get_array(item))
or
alist = [get_array(item) for item in sorted_arrays]
There are various ways of assembling the list. Since you want columns, and assuming get_array produces equal sized 1d arrays:
arr = np.column_stack(alist)
Collecting them in rows and transposing that works too:
arr = np.array(alist).T
arr = np.vstack(alist).T
arr = np.stack(alist).T
arr = np.stack(alist, axis=1)
If the arrays are already 2d:
arr = np.concatenate(alist, axis=1)
All the stack variations use concatenate, just varying in how they tweak the shape(s) of the input arrays. The key to using concatenate is to understand the dimensions and shapes, and how to add dimensions as needed. Sooner or later you should become fluent in that kind of coding.
If they vary in shape or dimensions, things get messier.
Equally good is to put the arrays in a pre-allocated array. But you need to know the desired final shape:
arr = np.zeros((m,n), dtype)
for i, item in enumerate(sorted_arrays):
    arr[:,i] = get_array(item)
n is len(sorted_arrays), and m is the length of one of get_array(item). You also need to know the expected dtype (int, float etc).
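Put together, the list-then-stack approach and the pre-allocation approach can be compared as in the sketch below. The get_array stand-in and the contents of sorted_arrays are hypothetical; the question does not show them.

```python
import numpy as np

def get_array(item):
    # Hypothetical stand-in for the asker's function: returns a length-3 1d array.
    return np.array([item, item * 10, item * 100])

sorted_arrays = [1, 2, 3, 4]  # hypothetical items

# Collect first, then build the matrix with a single call.
alist = [get_array(item) for item in sorted_arrays]
arr = np.column_stack(alist)  # each 1d array becomes one column

# Pre-allocation gives the same result when the final shape is known.
m, n = len(alist[0]), len(sorted_arrays)
pre = np.zeros((m, n), dtype=alist[0].dtype)
for i, item in enumerate(sorted_arrays):
    pre[:, i] = get_array(item)

print(arr)
```

Growing a matrix with vstack inside the loop copies the whole array on every iteration, which is the main reason the single-call version tends to be preferred.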
If you have NumPy arrays a, b, c, d of the same length, the following code will accomplish what you want:
out_matrix = np.vstack([a, b, c, d]).transpose()
An example:
In [3]: a = np.array([1, 2, 3, 4])
In [4]: b = np.array([5, 6, 7, 8])
In [5]: c = np.array([2, 3, 4, 5])
In [6]: d = np.array([6, 8, 2, 4])
In [10]: np.vstack([a, b, c, d]).transpose()
Out[10]:
array([[1, 5, 2, 6],
       [2, 6, 3, 8],
       [3, 7, 4, 2],
       [4, 8, 5, 4]])

Acquiring the Minimum array out of Multiple Arrays by order in Python

Say that I have 4 numpy arrays
[1,2,3]
[2,3,1]
[3,2,1]
[1,3,2]
In this case, I've determined [1,2,3] is the "minimum array" for my purposes, as it is one of two arrays with the lowest value at index 0, and of those two arrays it has the lowest value at index 1. If there were more arrays with similar values, I would need to compare the next index values, and so on.
How can I extract the array [1,2,3] in that same order from the pile?
How can I extend that to x arrays of size n?
Thanks
Using Python's built-in (non-numpy) .sort() or sorted() on a list of lists (not numpy arrays) does this automatically, e.g.
a = [[1,2,3],[2,3,1],[3,2,1],[1,3,2]]
a.sort()
gives
[[1,2,3],[1,3,2],[2,3,1],[3,2,1]]
The numpy sort only sorts the elements within each subarray (along the last axis by default), so the simplest way is to convert to a Python list first. Assuming a is an array of arrays that you want to pick the minimum of, you could get the minimum as
sorted(a.tolist())[0]
As someone pointed out you could also do min(a.tolist()) which uses the same type of comparisons as sort, and would be faster for large arrays (linear vs n log n asymptotic run time).
Here's an idea using numpy:
import numpy
a = numpy.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
col = 0
# Keep only the rows holding the smallest value in the current column.
while a.shape[0] > 1 and col < a.shape[1]:
    a = a[a[:, col] == a[:, col].min()]
    col += 1
print(a)
This checks column by column until only one row is left.
numpy's lexsort is close to what you want. It sorts on the last key first, but that's easy to get around:
>>> a = np.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
>>> order = np.lexsort(a[:, ::-1].T)
>>> order
array([0, 3, 1, 2])
>>> a[order]
array([[1, 2, 3],
       [1, 3, 2],
       [2, 3, 1],
       [3, 2, 1]])
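If only the smallest row is needed, the first entry of the lexsort order is enough. The sketch below wraps that in a helper and compares it with the pure-Python approach; lexicographic_min_row is a name invented here, not a NumPy function.

```python
import numpy as np

def lexicographic_min_row(a):
    """Return the row of a 2D array that sorts first lexicographically."""
    a = np.asarray(a)
    # lexsort treats the LAST key as primary, so reverse the columns
    # to make column 0 the most significant key.
    order = np.lexsort(a[:, ::-1].T)
    return a[order[0]]

a = np.array([[1, 2, 3], [2, 3, 1], [3, 2, 1], [1, 3, 2]])
smallest = lexicographic_min_row(a)

# Python lists compare element by element, so min() gives the same row.
same = min(a.tolist())
print(smallest, same)
```

Both work for any number of rows and columns; the list version is linear in the number of rows, while lexsort does a full sort.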
