Python dictionary hold array ranges? - python

I am trying to make a dict that could hold some array snippets
like [127:130, 122:124], but dict = {1:[127:130, 122:124], 2:[127:129, 122:123]} doesn't work.
Is there a way to do this? It doesn't need to be a dict, but I want a bunch of these areas to be callable.
So I have 256x256 arrays and I want to select small areas in them for some calculations:
fft[127:130, 122:124]
It would be great if the whole part between the brackets could be stored in a dict.

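You could use the built-in slice function. It returns a slice object that can be stored in a dictionary. Store the slices for each area as a tuple, because NumPy treats a tuple of slices as one slice per axis (newer NumPy versions no longer accept a list of slices as a multidimensional index), e.g.:
slice_1 = slice(127, 130)
slice_2 = slice(122, 124)
slice_a = slice(127, 129)
slice_b = slice(122, 123)
d = {1: (slice_1, slice_2),
     2: (slice_a, slice_b)}
x = fft[d[1]] # Same as fft[127:130, 122:124]
y = fft[d[2]] # Same as fft[127:129, 122:123]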
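As a shorter variant of the same idea, np.s_ builds the slice objects directly from index syntax. This is a minimal sketch: the fft array below is a hypothetical stand-in for the 256x256 data in the question, and the dictionary name regions is an assumption made for illustration.

```python
import numpy as np

# Hypothetical stand-in for the 256x256 array from the question.
fft = np.arange(256 * 256).reshape(256, 256)

# np.s_ turns index syntax into a tuple of slice objects,
# which can be stored like any other value.
regions = {
    1: np.s_[127:130, 122:124],
    2: np.s_[127:129, 122:123],
}

x = fft[regions[1]]  # same as fft[127:130, 122:124]
y = fft[regions[2]]  # same as fft[127:129, 122:123]
print(x.shape, y.shape)
```

Each stored value is an ordinary tuple of slices, so it can be reused on any array with at least two dimensions.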

Slicing NumPy arrays returns a view, not a copy. Maybe this is what you are looking for?
import numpy
a = numpy.arange(10)
b = a[3:6] # array([3, 4, 5])
a[4] = 0
#b is now array([ 3, 0, 5])
b[1] = 1
#a is now array([0, 1, 2, 3, 1, 5, 6, 7, 8, 9])

Related

Applying an iterable mask, checking it against a value - if value doesn't satisfy the mask condition, move to the next value which does

I currently have some code where I've created a mask which checks to see if a variable matches the first position in a sequence, called index_pos_overload. If it matches, the variable is chosen, and the check ends. However, I want to use this mask not only to check whether the number satisfies the mask condition but also, if it doesn't, to move along to the next value in the sequence that does. The goal is essentially to pick out a row in my pandas data column, hyst. My code currently looks like this:
import pandas as pd
from itertools import chain
hyst = pd.DataFrame({"test":[12, 4, 5, 4, 1, 3, 2, 5, 10, 9, 7, 5, 3, 6, 3, 2 ,1, 5, 2]})
possible_overload_cycle = 1
index_pos_overload = chain.from_iterable((hyst.index[i])
    for i in range(0, len(hyst)-1, 5))
if (possible_overload_cycle == index_pos_overload):
    hyst_overload_cycle = possible_overload_cycle
else:
    hyst_overload_cycle = 5 #next value in iterable where index_pos_overload is true
The expected output of hyst_overload_cycle should be this:
print(hyst_overload_cycle)
5
I've included my logic as to how I think this should work: possible_overload_cycle = 1 does not point to the first position in the dataframe, so hyst_overload_cycle should return as 5, the first position in the mask. I hope I've made sense, as I can't quite seem to work out how I would go about this programmatically.
If I understood you correctly, it may be simpler than you think:
index_pos_overload can be a plain array or list; there is no need for a complex construct to store a sequence of values.
To find the first non-zero value in index_pos_overload, use np.nonzero(index_pos_overload)[0][0] (the first [0] selects the axis, the second selects the first index along that axis) and index the original index_pos_overload array with the result.
The code would look like:
import numpy as np
import pandas as pd
hyst = pd.DataFrame({"test":[12, 4, 5, 4, 1, 3, 2, 5, 10, 9, 7, 5, 3, 6, 3, 2 ,1, 5, 2]})
possible_overload_cycle = 1
index_pos_overload = np.array([hyst.index[i] for i in range(0, len(hyst)-1, 5)])
if possible_overload_cycle in index_pos_overload:
    hyst_overload_cycle = possible_overload_cycle
else:
    hyst_overload_cycle = index_pos_overload[np.nonzero(index_pos_overload)[0][0]]
print(hyst_overload_cycle)
# 5
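If "move along to the next value" is read as "take the smallest mask position that is not below the candidate", a sorted lookup may be a closer fit than taking the first non-zero entry. The following is only a sketch under that assumption: the helper name next_mask_position is hypothetical, and it relies on the positions being sorted in ascending order.

```python
import numpy as np

def next_mask_position(candidate, positions):
    """Return candidate if it is a mask position, otherwise the next
    larger mask position, or None if there is no such position.

    Assumes positions is sorted in ascending order.
    """
    positions = np.asarray(positions)
    # Index of the first position that is >= candidate.
    i = np.searchsorted(positions, candidate, side="left")
    if i == len(positions):
        return None
    return int(positions[i])

# Same positions as range(0, len(hyst)-1, 5) for the 19-row frame above.
index_pos_overload = np.array([0, 5, 10, 15])

print(next_mask_position(1, index_pos_overload))  # 5
print(next_mask_position(7, index_pos_overload))  # 10
```

Unlike the first-non-zero lookup, this depends on the candidate, so a candidate of 7 moves on to 10 rather than 5.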

Numpy: How to subtract every other element in array

I have the following numpy array
u = np.array([a1,b1,a2,b2...,an,bn])
where I would like to subtract the a and b elements from each other and end up with a numpy array:
u_result = np.array([(a2-a1),(b2-b1),(a3-a2),(b3-b2),....,(an-a_(n-1)),(bn-b_(n-1))])
How can I do this without too much array splitting and for loops? I'm using this in a larger loop so ideally, I would like to do this efficiently (and learn something new)
(I hope the indexing of the resulting array is clear)
Or simply perform a subtraction:
import numpy as np
u = np.array([3, 2, 5, 3, 7, 8, 12, 28])
u[2:] - u[:-2]
Output:
array([ 2, 1, 2, 5, 5, 20])
You can use ravel to rearrange the result in the order of your original vector.
Short answer:
u_r = np.ravel([np.diff(u[::2]),
                np.diff(u[1::2])], 'F')
Here is a longer and more detailed explanation:
1) Separate a from b in u; this can be achieved by indexing.
2) Take the differences of a and of b; np.diff keeps the code simple.
3) Ravel the differenced values back into one interleaved array.
#------- Create u---------------
import numpy as np
a_aux = np.array([50,49,47,43,39,34,28])
b_aux = np.array([1,2,3,4,5,6,7])
u = np.ravel([a_aux,b_aux],'F')
print(u)
#-------------------------------
# 1) get a as the elements with index 0, 2, 4, ...
a = u[::2]
b = u[1::2]  # get b as the elements with index 1, 3, 5, ...
# 2) take the differences
ad = np.diff(a)
bd = np.diff(b)
# 3) ravel in column-major order, interleaving ad and bd
u_result = np.ravel([ad,bd],'F')
print(u_result)
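You can try it this way. First, split the a and b elements using array[::2] and array[1::2]. Then subtract a from b (np.array(array[1::2] - array[::2])). Note that this computes b_i - a_i for each pair, not the differences between consecutive a values and consecutive b values.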
import numpy as np
array = np.array([7,8,9,6,5,2])
u_result = np.array(array[1::2] - array[::2] )
print(u_result)
Looks like you need to use np.roll:
shift = 2
u = np.array([1, 11, 2, 12, 3, 13, 4, 14])
shifted_u = np.roll(u, -shift)
(shifted_u - u)[:-shift]
Returns:
array([1, 1, 1, 1, 1, 1])
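Another way to look at the same computation is to treat u as rows of (a, b) pairs and difference along the rows. This is a small sketch, assuming u has an even number of elements; the intermediate name pairs is introduced here for illustration.

```python
import numpy as np

u = np.array([3, 2, 5, 3, 7, 8, 12, 28])  # a1, b1, a2, b2, ...

# One row per (a_i, b_i) pair; reshape does not copy the data here.
pairs = u.reshape(-1, 2)

# Difference consecutive rows, then flatten back to the interleaved
# order (a2-a1, b2-b1, a3-a2, b3-b2, ...).
u_result = np.diff(pairs, axis=0).ravel()
print(u_result)
```

For this input the values match those of u[2:] - u[:-2] shown earlier.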

Recognize new circular patterns, Python

I have a function createPattern that, given an array, returns a list of size 8 built from 7 symbols, say (1,2,3,4,5,6,7). Each symbol can either not appear at all in the list, or appear one or more times.
What I want to do is create random arrays and, whenever a new pattern is found, append it to circ_pattern_Collection. The difficulty I'm having is the following:
I want the code to recognize a pattern independent of the starting "symbol", i.e. to recognize only the new circular patterns, for example:
(1,2,3,4,5,6,7,7) = (7,1,2,3,4,5,6,7) = (3,4,5,6,7,7,1,2)… and so on.
(1,1,1,2,3,3,3,3) = (1,1,2,3,3,3,3,1) = (3,3,3,3,1,1,1,2).. and so on.
Something like this :
circ_pattern_Collection=[]
for j in range(10000):
    array = np.random.randint(-1000, 1000, (3, 3))
    patternList = createPattern(array)
    …
    "if new circular pattern found, append to circ_pattern_Collection"
    …
return circ_pattern_Collection
I could of course do it with lots of if statements, but there must be a more elegant/efficient way of doing this. Any tips?
You can use np.roll to roll the array through all possible rotations.
Try this:
import numpy as np
def is_same_circ(a1, a2):
    if len(a1) != len(a2):
        return False
    return any(np.array_equal(a1,np.roll(a2, offset)) for offset in range(len(a1)))
a1 = np.array((1, 2, 3, 4, 5, 6, 7, 7))
print(is_same_circ(a1, np.array((7, 1, 2, 3, 4, 5, 6, 7))))
print(is_same_circ(a1, np.array((7, 7, 7, 3, 4, 5, 6, 7))))
Output:
True
False
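For collecting only new circular patterns, comparing every candidate against every stored pattern can get slow. One possible alternative, sketched here under the assumption that patterns are short sequences of comparable, hashable symbols, is to reduce each pattern to a canonical rotation and keep those in a set. The helper names canonical_rotation and collect_circular_patterns are hypothetical and not part of the answer above.

```python
def canonical_rotation(pattern):
    """Return the lexicographically smallest rotation of pattern as a tuple.

    All rotations of the same circular pattern share this value.
    """
    items = tuple(pattern)
    if not items:
        return items
    return min(items[i:] + items[:i] for i in range(len(items)))

def collect_circular_patterns(patterns):
    """Keep the first occurrence of each distinct circular pattern."""
    seen = set()
    unique = []
    for p in patterns:
        key = canonical_rotation(p)
        if key not in seen:
            seen.add(key)
            unique.append(tuple(p))
    return unique

patterns = [
    (1, 2, 3, 4, 5, 6, 7, 7),
    (7, 1, 2, 3, 4, 5, 6, 7),  # rotation of the first
    (1, 1, 1, 2, 3, 3, 3, 3),
    (3, 3, 3, 3, 1, 1, 1, 2),  # rotation of the third
]
print(collect_circular_patterns(patterns))
```

With a set of canonical keys, each new pattern costs one pass over its rotations instead of a comparison against everything collected so far.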

Extract a larger slice than the numpy array's size

I want to extract a slice of length 10, beginning at index 2, of a numpy array A:
import numpy
A = numpy.array([1,3,5,3,9])
def bigslice(A, begin_at, length):
    a = A[begin_at:begin_at + length]
    while len(a) + len(A) < length:
        a = numpy.concatenate((a,A))
    return numpy.concatenate((a, A[:length-len(a)]))
print(bigslice(A, begin_at = 2, length = 10))
#[5,3,9,1,3,5,3,9,1,3]
This is correct, but I'm looking for a more efficient way to do this (especially since I'll end up with arrays of thousands of elements): I suspect that the concatenate calls used here create lots of new temporary arrays, which would be inefficient.
How can I do the same thing more efficiently?
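Since the middle part of the result is already known (n repetitions of the full array), you can construct it with np.tile and concatenate it once with the head and tail pieces:
import numpy as np
def cyclical_slice(A, start, length):
    head = A[start:start + length]            # from start up to the end of A
    remaining = length - len(head)
    middle = np.tile(A, remaining // len(A))  # whole repetitions of A
    tail = A[:remaining % len(A)]             # leftover partial repetition
    return np.concatenate((head, middle, tail))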
Your code doesn't seem to guarantee that you get a slice of length length, e.g.
>>> A = numpy.array([1,3,5,3,9])
>>> bigslice(A, 0, 3)
array([1, 3, 5, 3, 9, 1, 3, 5])
Assuming that this is an oversight, maybe you could use np.pad, e.g.
def wpad(A, begin_at, length):
    to_pad = max(length + begin_at - len(A), 0)
    return np.pad(A, (0, to_pad), mode='wrap')[begin_at:begin_at+length]
which gives
>>> wpad(A, 0, 3)
array([1, 3, 5])
>>> wpad(A, 0, 10)
array([1, 3, 5, 3, 9, 1, 3, 5, 3, 9])
>>> wpad(A, 2, 10)
array([5, 3, 9, 1, 3, 5, 3, 9, 1, 3])
and so on.
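A further option is to compute the wrapped indices directly with modular arithmetic and use them as an integer index. This is a minimal sketch; the function name wrap_slice is introduced here for illustration and the array A is assumed to be non-empty.

```python
import numpy as np

def wrap_slice(A, begin_at, length):
    """Return `length` elements of A starting at begin_at, wrapping around."""
    # Indices begin_at, begin_at + 1, ... reduced modulo len(A).
    idx = (begin_at + np.arange(length)) % len(A)
    return A[idx]

A = np.array([1, 3, 5, 3, 9])
print(wrap_slice(A, 2, 10))
```

This builds a single index array and a single result array, so no intermediate concatenations are needed.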

Assigning multiple array indices at once in Python/Numpy

I'm looking to quickly (hopefully without a for loop) generate a Numpy array of the form:
array([a,a,a,a,0,0,0,0,0,b,b,b,0,0,0, c,c,0,0....])
Where a, b, c and other values are repeated at different points for different ranges. I'm really thinking of something like this:
import numpy as np
a = np.zeros(100)
a[0:3,9:11,15:16] = np.array([a,b,c])
Which obviously doesn't work. Any suggestions?
Edit (jterrace answered the original question):
The data is coming in the form of an N*M Numpy array. Each row is mostly zeros, occasionally interspersed by sequences of non-zero numbers. I want to replace all elements of each such sequence with the last value of the sequence. I'll take any fast method to do this! Using where and diff a few times, we can get the start and stop indices of each run.
raw_data = array([.....][....])
starts = array([0,0,0,1,1,1,1...][3, 9, 32, 7, 22, 45, 57,....])
stops = array([0,0,0,1,1,1,1...][5, 12, 50, 10, 30, 51, 65,....])
last_values = raw_data[stops]
length_to_repeat = stops[1]-starts[1]
Note that starts[0] and stops[0] are the same information (which row the run is occurring on). At this point, since the only route I know of is what jterrace suggested, we'll need to go through some contortions to get similar start/stop positions for the zeros, then interleave the zero start/stops with the value start/stops, and interleave the number 0 with the last_values array. Then we loop over each row, doing something like:
for i in range(N):
    values_in_this_row = where(starts[0]==i)[0]
    output[i] = numpy.repeat(last_values[values_in_this_row], length_to_repeat[values_in_this_row])
Does that make sense, or should I explain some more?
If you have the values and repeat counts fully specified, you can do it this way:
>>> import numpy
>>> values = numpy.array([1,0,2,0,3,0])
>>> counts = numpy.array([4,5,3,3,2,2])
>>> numpy.repeat(values, counts)
array([1, 1, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 3, 3, 0, 0])
You can also use numpy.r_ (here with a, b, c = 1, 2, 3):
>>> np.r_[[a]*4,[b]*3,[c]*2]
array([1, 1, 1, 1, 2, 2, 2, 3, 3])
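When the runs are given as start and stop positions rather than as interleaved values and counts, the zero runs do not have to be constructed at all: np.repeat can expand the starts into a flat index array. The following is a sketch for a single row, assuming half-open, non-overlapping runs [start, stop); the function name fill_runs and the sample numbers are hypothetical.

```python
import numpy as np

def fill_runs(length, starts, stops, values):
    """Return a zero array of the given length with values[k] written
    into positions starts[k] .. stops[k] - 1 for every run k."""
    starts = np.asarray(starts)
    stops = np.asarray(stops)
    values = np.asarray(values)
    counts = stops - starts

    # Offset of each run's first element in the flattened index list.
    first = np.cumsum(counts) - counts
    # Position within its own run for every flattened element: 0, 1, 2, ...
    within = np.arange(counts.sum()) - np.repeat(first, counts)
    idx = np.repeat(starts, counts) + within

    out = np.zeros(length, dtype=values.dtype)
    out[idx] = np.repeat(values, counts)
    return out

print(fill_runs(19, starts=[0, 9, 15], stops=[4, 12, 17], values=[1, 2, 3]))
```

For a 2-D array the same index construction could be applied per row, or combined with a repeated row index, but that extension is not shown here.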
