Find start and end of abrupt changes in an array in Python - python

I am very new to Python, and with what I have learned so far I cannot work out how to solve this task.
Please help:
I have a Python program that produces a 1D array like this:
[0,0,.01,.1,1,1,.1,.01,0,0,0,0.01,.1,1,.1,.01,0,0,0,.01,.1,1,1,.1,.01,0,0,0,.01,.1,1,1]
You can see the values go from zero up to a maximum and then back down to zero many times.
I need to find the index where each rise starts and where each fall ends. So here it would be [3,9,12,17,20,26,29]
This is what I have tried so far, but in vain:
My_array = [0,0,.01,.1,1,1,.1,.01,0,0,0,0.01,.1,1,.1,.01,0,0,0,.01,.1,1,1,.1,.01,0,0,0,.01,.1,1,1]
def _edge(ii):
    for i in range(ii, len(My_array)):
        if np.abs(My_array[i]-My_array[i-1]) > .01:
            index = i  # save the index where the condition is met
            break
for ii in range(1, len(My_array)):
    if ii < len(My_array):  # make sure the loop continues till the end
        F1_Index = _edge(ii)
        F1_Index1.append(F1_Index)

If you use numpy you can do something like this:
import numpy as np
a = np.array([0,0,.01,.1,1,1,.1,.01,0,0,0,0.01,.1,1,.1,.01,0,0,0,.01,.1,1,1,.1,.01,0,0,0,.01,.1,1,1])
b = a[1:] - a[:-1] # find differences between sequential elements
v = abs(b) == 0.01 # find differences whose magnitude is 0.01
# (this returns an array of True/False values)
edges = v.nonzero()[0] # find indexes of True values
edges += 2 # add 1 because of the differencing scheme, and
# add 1 because the results you give in the question
# are 1-based while Python uses zero-based indexing
edges
> array([ 3, 9, 12, 17, 20, 26, 29], dtype=int64)
This is the fastest way I've found to do this sort of thing.
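The exact comparison abs(b) == 0.01 works for this data because every matching difference is taken against an exact 0. For data where that is not guaranteed, a tolerance-based variant may be safer. The following is only a sketch, assuming the default tolerances of np.isclose suit your data; the names step and edges_tol are illustrative and not part of the answer above.

```python
import numpy as np

a = np.array([0, 0, .01, .1, 1, 1, .1, .01, 0, 0, 0, 0.01, .1, 1, .1, .01,
              0, 0, 0, .01, .1, 1, 1, .1, .01, 0, 0, 0, .01, .1, 1, 1])

step = np.diff(a)                         # same as a[1:] - a[:-1]
hits = np.isclose(np.abs(step), 0.01)     # tolerant test instead of ==
edges_tol = np.flatnonzero(hits) + 2      # same +2 offset as in the answer

print(edges_tol)
```

np.flatnonzero(hits) is equivalent to hits.nonzero()[0] for a 1D array.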

The following I think does what you need. It first builds a list holding -1, 0 or 1 giving the difference between adjacent values (unfortunately cmp has been removed from Python 3 as this was the perfect function for doing this). It then uses the groupby function and a non-zero filter to generate a list of indexes for when the direction changes:
import itertools
My_array = [0, 0, .01, .1, 1, 1, .1, .01, 0, 0, 0, 0.01, .1, 1, .1, .01, 0, 0, 0, .01, .1, 1, 1, .1, .01, 0, 0, 0, .01, .1, 1, 1]
def my_cmp(x,y):
    if x == y: # Or for non-exact changes use: if abs(x-y) <= 0.01:
        return 0
    else:
        return 1 if y > x else -1
def pairwise(iterable):
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)
slope = [(my_cmp(pair[1], pair[0]), index) for index, pair in enumerate(pairwise(My_array))]
indexes_of_changes = [next(g)[1]+1 for k,g in itertools.groupby(slope, lambda x: x[0]) if k != 0]
print(indexes_of_changes)
Giving you the following result for your data:
[2, 6, 11, 14, 19, 23, 28]
Note, this gives you ANY change in direction, not just > 0.01.
Tested using Python 3.

Here is the way I did it, which is working for me (mostly learned from Brad's code). I still do not quite understand how b.nonzero()[0] works. Brad, please explain if possible.
import numpy as np
a = np.array([0,0,.01,.1,1,1,.1,.01,0,0,0,0.01,.1,1,.1,.01,0,0,0,.01,.1,1,1,.1,.01,0,0,0,.01,.1,1,1])
b0=[x>.1 for x in a] # making array of true and false
b0=np.array(b0)
b0=b0*1# converting true false to 1 and 0
b=abs(b0[1:]-b0[:-1])# now I only have 1 where there is a change
edges = b.nonzero()[0] # find indexes of 1 values
edges
array([ 3, 5, 12, 13, 20, 22, 29], dtype=int64)
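On the question of how b.nonzero()[0] works: nonzero() returns a tuple with one index array per dimension, so for a 1D array the [0] simply picks the only entry in that tuple. A small sketch on toy data (the array below is made up for illustration and is not the one from the question):

```python
import numpy as np

b = np.array([0, 1, 0, 0, 1, 1])   # toy data for illustration only

idx_tuple = b.nonzero()   # tuple holding one index array per dimension
idx = idx_tuple[0]        # b is 1D, so the first (and only) entry is the index array

print(idx_tuple)
print(idx)
```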

Related

Google foo.bar failing all test cases but working in python IDE

So I'm doing the foo.bar challenge, and I've got code in Python that outputs the required answers. I know for a fact that for at least the first two test cases my output matches their output, but it still fails all of them. I assumed it could be because it's running in Python 2.7.13, so I found an online sandbox that runs that version of Python, but my code still produces the required output there too. I've tried using the print function to output the results, and I've tried formatting the results as lists and arrays, but none of this worked. The question is below:
Doomsday Fuel
Making fuel for the LAMBCHOP's reactor core is a tricky process
because of the exotic matter involved. It starts as raw ore, then
during processing, begins randomly changing between forms, eventually
reaching a stable form. There may be multiple stable forms that a
sample could ultimately reach, not all of which are useful as fuel.
Commander Lambda has tasked you to help the scientists increase fuel
creation efficiency by predicting the end state of a given ore sample.
You have carefully studied the different structures that the ore can
take and which transitions it undergoes. It appears that, while
random, the probability of each structure transforming is fixed. That
is, each time the ore is in 1 state, it has the same probabilities of
entering the next state (which might be the same state). You have
recorded the observed transitions in a matrix. The others in the lab
have hypothesized more exotic forms that the ore can become, but you
haven't seen all of them.
Write a function solution(m) that takes an array of array of
nonnegative ints representing how many times that state has gone to
the next state and return an array of ints for each terminal state
giving the exact probabilities of each terminal state, represented as
the numerator for each state, then the denominator for all of them at
the end and in simplest form. The matrix is at most 10 by 10. It is
guaranteed that no matter which state the ore is in, there is a path
from that state to a terminal state. That is, the processing will
always eventually end in a stable state. The ore starts in state 0.
The denominator will fit within a signed 32-bit integer during the
calculation, as long as the fraction is simplified regularly.
For example, consider the matrix m:
[
  [0,1,0,0,0,1],  # s0, the initial state, goes to s1 and s5 with equal probability
  [4,0,0,3,2,0],  # s1 can become s0, s3, or s4, but with different probabilities
  [0,0,0,0,0,0],  # s2 is terminal, and unreachable (never observed in practice)
  [0,0,0,0,0,0],  # s3 is terminal
  [0,0,0,0,0,0],  # s4 is terminal
  [0,0,0,0,0,0],  # s5 is terminal
]
So, we can consider different paths to terminal states, such as:
s0 -> s1 -> s3
s0 -> s1 -> s0 -> s1 -> s0 -> s1 -> s4
s0 -> s1 -> s0 -> s5
Tracing the probabilities of each, we find that
s2 has probability 0
s3 has probability 3/14
s4 has probability 1/7
s5 has probability 9/14
So, putting that together, and making a common denominator, gives an
answer in the form of
[s2.numerator, s3.numerator, s4.numerator, s5.numerator, denominator]
which is [0, 3, 2, 9, 14].
Languages
To provide a Java solution, edit Solution.java
To provide a Python solution, edit solution.py
Test cases
==========
Your code should pass the following test cases. Note that it may also be run against hidden test cases not shown here.
-- Java cases --
Input: Solution.solution({{0, 2, 1, 0, 0}, {0, 0, 0, 3, 4}, {0, 0, 0, 0, 0}, {0, 0, 0, 0,0}, {0, 0, 0, 0, 0}})
Output: [7, 6, 8, 21]
Input: Solution.solution({{0, 1, 0, 0, 0, 1}, {4, 0, 0, 3, 2, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0}})
Output: [0, 3, 2, 9, 14]
-- Python cases --
Input: solution.solution([[0, 2, 1, 0, 0], [0, 0, 0, 3, 4], [0, 0, 0, 0, 0], [0, 0, 0, 0,0], [0, 0, 0, 0, 0]])
Output: [7, 6, 8, 21]
Input: solution.solution([[0, 1, 0, 0, 0, 1], [4, 0, 0, 3, 2, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]])
Output: [0, 3, 2, 9, 14]
My code is below:
import numpy as np
from fractions import Fraction
from math import gcd
def solution(M):
    height = (len(M))
    length = (len(M[0]))
    M = np.array(M)
    AB = []
    #Find B
    for i in range(0, height):
        #if B = 1
        if (sum(M[:,0])) == 0:
            sumB = 1
        if(M[i,0]) != 0:
            B1 = Fraction((M[i,0]), (sum(M[i])))
            B2 = Fraction((M[0,i]), (sum(M[0])))
            B = B1 * B2
            #Find sum(B) to infinity
            sumB = (1/(1-B))
    #Find A
    boolean2 = 0
    count = 0
    index = []
    for i in range (0, height):
        if sum(M[i]) == 0:
            if boolean2 == 0:
                terminalstart = i
            boolean = 0
            boolean2 = 1
            for j in range(0, height):
                #if there is no A
                if j==height-1 and boolean == 0:
                    index.append(i-terminalstart)
                    count +=1
                if (M[j,i]) != 0:
                    boolean = 1
                    A1 = Fraction((M[j,i]), (sum(M[j])))
                    A = A1
                    if j!=0:
                        A2 = Fraction((M[0,j]), (sum(M[0])))
                        A = A1 * A2
                    #Find AB
                    AB.append(A*sumB)
    #Find common denominators
    x = []
    y = []
    for i in range (0,len(AB)):
        x.append(AB[i].denominator)
    lcm = 1
    #change numerators to fit
    for i in x:
        lcm = lcm*i//gcd(lcm, i)
    for i in range (0, len(AB)):
        z = (lcm) / x[i]
        #
        z = float(z)
        #
        y.append(int((AB[i].numerator)*z))
    #insert 0s
    for i in range (0, count):
        y.insert(index[i], 0)
    #insert denominator
    y.append(lcm)
    return y
So the code and the question are basically irrelevant. The main point is that my output (y) is exactly the same as the output in the examples, but when it runs in foo.bar it fails. To test this, I submitted code that simply returned the desired output, and it passed the test case that had this output:
def solution(M):
    y = [0, 3, 2, 9, 14]
    return y
So I know that since my code arrives at exactly the same values and data type for y in the Python IDE, it should work in Google foo.bar, but for some reason it does not. Any help would be greatly appreciated.
Edit:
I found code online that works:
import numpy as np
# Returns indexes of active & terminal states
def detect_states(matrix):
    active, terminal = [], []
    for rowN, row in enumerate(matrix):
        (active if sum(row) else terminal).append(rowN)
    return(active,terminal)
# Convert elements of array in simplest form
def simplest_form(B):
    B = B.round().astype(int).A1 # np.matrix --> np.array
    gcd = np.gcd.reduce(B)
    B = np.append(B, B.sum()) # append the common denom
    return (B / gcd).astype(int)
# Finds solution by calculating Absorbing probabilities
def solution(m):
    active, terminal = detect_states(m)
    if 0 in terminal: # special case when s0 is terminal
        return [1] + [0]*len(terminal[1:]) + [1]
    m = np.matrix(m, dtype=float)[active, :] # list --> np.matrix (active states only)
    comm_denom = np.prod(m.sum(1)) # product of sum of all active rows (used later)
    P = m / m.sum(1) # divide by sum of row to convert to probability matrix
    Q, R = P[:, active], P[:, terminal] # separate Q & R
    I = np.identity(len(Q))
    N = (I - Q) ** (-1) # calc fundamental matrix
    B = N[0] * R * comm_denom / np.linalg.det(N) # get absorbing probs & get them close to some integer
    return simplest_form(B)
When I compared the final answer from this working code to mine by adding the lines:
print(simplest_form(B))
print(type(simplest_form(B)))
this is what I got:
[ 0 3 2 9 14]
<class 'numpy.ndarray'>
array([ 0, 3, 2, 9, 14])
When I added the lines
y = np.asarray(y)
print(y)
print(type(y))
to my code this is what I got:
[ 0 3 2 9 14]
<class 'numpy.ndarray'>
array([ 0, 3, 2, 9, 14])
when they were both running on the same test input. The two results are exactly the same, but for some reason mine doesn't work on foo.bar while the other does. Am I missing something?
It turns out that the
math.gcd(x, y)
function is not available in Python 2 (it was added in Python 3.5). I just rewrote it as this:
def grcd(x, y):
    # Euclid's algorithm: replace (x, y) with (y, x % y) until y is 0.
    # A countdown search for a divisor of the larger number alone would
    # not be a true GCD (e.g. it gives 3 for 6 and 4, where the GCD is 2).
    while y:
        x, y = y, x % y
    return x
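For reference, the worked example in the problem statement can also be checked with exact rational arithmetic and no numpy or math.gcd dependency. The following is only a sketch under stated assumptions, not the code from either post: the names absorption_probabilities and _gcd are made up for illustration, and it solves (I - Q) X = R by Gauss-Jordan elimination over Fraction values, where Q holds the transitions between non-terminal states and R the transitions into terminal states.

```python
from fractions import Fraction
from functools import reduce


def _gcd(a, b):
    # Euclid's algorithm, written out so math.gcd is not needed
    while b:
        a, b = b, a % b
    return a


def absorption_probabilities(m):
    n = len(m)
    sums = [sum(row) for row in m]
    terminal = [i for i in range(n) if sums[i] == 0]
    active = [i for i in range(n) if sums[i] != 0]
    if 0 in terminal:  # the start state is already stable
        return [1] + [0] * (len(terminal) - 1) + [1]

    # Augmented rows [I - Q | R], built with exact fractions
    k = len(active)
    rows = []
    for r, i in enumerate(active):
        left = [Fraction(int(r == c)) - Fraction(m[i][j], sums[i])
                for c, j in enumerate(active)]
        right = [Fraction(m[i][t], sums[i]) for t in terminal]
        rows.append(left + right)

    # Gauss-Jordan elimination: reduce the left block to the identity
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [v / pv for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]

    probs = rows[0][k:]  # row 0 corresponds to state 0, the start state
    denom = reduce(lambda a, b: a * b // _gcd(a, b),
                   [p.denominator for p in probs], 1)
    return [int(p * denom) for p in probs] + [denom]


print(absorption_probabilities([[0, 1, 0, 0, 0, 1], [4, 0, 0, 3, 2, 0],
                                [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0],
                                [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]))
```

Working in Fraction throughout avoids the float rounding step, at the cost of doing the elimination by hand.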

Remove values from numpy array closer to each other

I want to remove the elements of a numpy array that are close to each other. For example, given the array [1,2,10,11,18,19], I need code that outputs [1,10,18], because 2 is close to 1, and so on.
Here is an additional solution using numpy functionality, more precisely np.ediff1d, which computes the differences between consecutive elements of a given array. The variable th holds the threshold: an element is dropped when it is within th of its predecessor.
import numpy as np
a = np.array([1,2,10,11,18,19])
th = 1
b = np.delete(a, np.argwhere(np.ediff1d(a) <= th) + 1) # [1, 10, 18]
Here is a simple function to find the first value of each series of consecutive values in a 1D numpy array.
import numpy as np
def find_consec(a, step=1):
    vals = []
    for i, x in enumerate(a):
        if i == 0:
            diff = a[i + 1] - x
            if diff == step:
                vals.append(x)
        elif i < a.size-1:
            diff = a[i + 1] - x
            if diff > step:
                vals.append(a[i + 1])
    return np.array(vals)
a = np.array([1,2,10,11,18,19])
find_consec(a) # [1, 10, 18]
Welcome to Stack Overflow. Below is code that answers your question:
def closer(arr, cozy):
    result = []
    result.append(arr[0])
    for i in range(1, len(arr)):  # go up to and including the last element
        if arr[i] - result[-1] > cozy:
            result.append(arr[i])
    print(result)
Example:
a = [6,10,7,20,21,16,14,3,2]
a.sort()
closer(a,1)
Output: [2, 6, 10, 14, 16, 20]
closer(a,3)
Output: [2, 6, 10, 14, 20]
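The np.ediff1d approach and the loop-based approach agree on the sample data but can differ on longer chains of close values: the first compares each element with its immediate predecessor, the second with the last element that was kept. A small sketch to make the difference visible, assuming a sorted, non-empty input; the function names drop_close_pairwise and drop_close_greedy are illustrative only.

```python
import numpy as np


def drop_close_pairwise(a, th=1):
    # Drop every element that lies within th of its immediate predecessor
    a = np.asarray(a)
    return np.delete(a, np.flatnonzero(np.ediff1d(a) <= th) + 1)


def drop_close_greedy(a, th=1):
    # Keep an element only if it is more than th away from the last kept one
    kept = [a[0]]
    for x in a[1:]:
        if x - kept[-1] > th:
            kept.append(x)
    return kept


chain = [1, 2, 3, 10]
print(drop_close_pairwise(chain))  # 2 and 3 are both within 1 of their predecessor
print(drop_close_greedy(chain))    # 3 is more than 1 away from the kept value 1
```

Which behaviour is wanted depends on whether "close" should be judged against the previous element or against the previous survivor.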

Median of the medians of a list

I need a vector that stores the median values of the medians of the main list "v". I have tried something with the following code but I am only able to write some values in the correct way.
v=[1,2,3,4,5,6,7,8,9,10]
final=[]
nfac=0
for j in range (0,4):
    nfac=j+1
    for k in range (0,nfac):
        if k%2==0:
            final.append(v[10//2**(nfac)-1])
        else:
            final.append(v[9-10//2**(nfac)])
The first median in v=[1,2,3,4,5,6,7,8,9,10] is 5
Then I want the medians of the remaining sublists [1,2,3,4] and [6,7,8,9,10]. I.e. 2 and 8 respectively. And so on.
The list "final" must be in the following form:
final=[5,2,8,1,3,6,9,4,7,10]
Note that the task as you defined it is basically equivalent to laying out a balanced binary search tree in breadth-first (level) order, much like the array layout used for a binary heap.
Definitely start by defining a helper function for finding the median:
def split_by_median(l):
    median_ind = (len(l)-1) // 2
    median = l[median_ind]
    left = l[:median_ind]
    right = l[median_ind+1:] if len(l) > 1 else []
    return median, left, right
Following the example you give, you want to process the resulting sublists in a breadth-first manner, so we need a queue to remember the following tasks:
from collections import deque
def construct_heap(v):
    lists_to_process = deque([sorted(v)])
    nodes = []
    while lists_to_process:
        head = lists_to_process.popleft()
        if len(head) == 0:
            continue
        median, left, right = split_by_median(head)
        nodes.append(median)
        lists_to_process.append(left)
        lists_to_process.append(right)
    return nodes
So calling the function finally:
print(construct_heap([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])) # [5, 2, 8, 1, 3, 6, 9, 4, 7, 10]
print(construct_heap([5, 1, 2])) # [2, 1, 5]
print(construct_heap([1, 0, 0.5, -1])) # [0, -1, 0.5, 1]
print(construct_heap([])) # []

Assigning multiple array indices at once in Python/Numpy

I'm looking to quickly (hopefully without a for loop) generate a Numpy array of the form:
array([a,a,a,a,0,0,0,0,0,b,b,b,0,0,0, c,c,0,0....])
Where a, b, c and other values are repeated at different points for different ranges. I'm really thinking of something like this:
import numpy as np
a = np.zeros(100)
a[0:3,9:11,15:16] = np.array([a,b,c])
Which obviously doesn't work. Any suggestions?
Edit (jterrace answered the original question):
The data is coming in the form of an N*M Numpy array. Each row is mostly zeros, occasionally interspersed by sequences of non-zero numbers. I want to replace all elements of each such sequence with the last value of the sequence. I'll take any fast method to do this! Using where and diff a few times, we can get the start and stop indices of each run.
raw_data = array([.....][....])
starts = array([0,0,0,1,1,1,1...][3, 9, 32, 7, 22, 45, 57,....])
stops = array([0,0,0,1,1,1,1...][5, 12, 50, 10, 30, 51, 65,....])
last_values = raw_data[stops]
length_to_repeat = stops[1]-starts[1]
Note that starts[0] and stops[0] carry the same information (which row the run occurs on). At this point, since the only route I know of is what jterrace suggested, we'll need to go through some contortions to get similar start/stop positions for the zeros, then interleave the zero start/stops with the value start/stops, and interleave the number 0 with the last_values array. Then we loop over each row, doing something like:
for i in range(N):
    values_in_this_row = where(starts[0]==i)[0]
    output[i] = numpy.repeat(last_values[values_in_this_row], length_to_repeat[values_in_this_row])
Does that make sense, or should I explain some more?
If you have the values and repeat counts fully specified, you can do it this way:
>>> import numpy
>>> values = numpy.array([1,0,2,0,3,0])
>>> counts = numpy.array([4,5,3,3,2,2])
>>> numpy.repeat(values, counts)
array([1, 1, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 3, 3, 0, 0])
You can use numpy.r_ (here with numpy imported as np and a, b, c = 1, 2, 3):
>>> np.r_[[a]*4,[b]*3,[c]*2]
array([1, 1, 1, 1, 2, 2, 2, 3, 3])
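The approach described in the edit (find the start and stop of each non-zero run with diff, then repeat the last value of each run) can be sketched for a single row without interleaving zero runs, by writing the repeated values back through a boolean mask. This is a minimal sketch, assuming a 1D row; the function name fill_runs_with_last is made up for illustration, and for an N*M array it would be applied row by row.

```python
import numpy as np


def fill_runs_with_last(row):
    row = np.asarray(row)
    nonzero = row != 0
    # Pad with False so runs touching either end still get a start and a stop
    padded = np.concatenate(([False], nonzero, [False])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)    # first index of each run
    stops = np.flatnonzero(change == -1)    # one past the last index of each run
    out = np.zeros_like(row)
    # Repeat each run's last value over the run's length, in run order
    out[nonzero] = np.repeat(row[stops - 1], stops - starts)
    return out


print(fill_runs_with_last([0, 1, 2, 3, 0, 0, 4, 5, 0]))
```

Because the non-zero positions appear in the same order as the runs, the mask assignment places each repeated block exactly over its run.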

How to rewrite the code more elegantly

I wrote this function. The input and expected results are indicated in the docstring.
import numpy
from collections import defaultdict
def summarize_significance(sign_list):
    """Summarizes a series of individual significance data in a list of occurrences.
    For a group of e.g. 5 measurements and two different states, the input data
    has the form:
    sign_list = [[-1, 1],
                 [0, 1],
                 [0, 0],
                 [0,-1],
                 [0,-1]]
    where -1, 0, 1 indicates decrease, no change or increase respectively.
    The result is a list of 3-item lists indicating how many measurements
    decrease, do not change or increase (as list items 0,1,2 respectively) for each state:
    returns: [[1, 4, 0], [2, 1, 2]]
    """
    swaped = numpy.swapaxes(sign_list, 0, 1)
    summary = []
    for row in swaped:
        mydd = defaultdict(int)
        for item in row:
            mydd[item] += 1
        summary.append([mydd.get(-1, 0), mydd.get(0, 0), mydd.get(1, 0)])
    return summary
I am wondering if there is a more elegant, efficient way of doing the same thing. Some ideas?
Here's one that uses less code and is probably more efficient because it just iterates through sign_list once without calling swapaxes, and doesn't build a bunch of dictionaries.
summary = [[0,0,0] for _ in sign_list[0]]
for row in sign_list:
    for index,sign in enumerate(row):
        summary[index][sign+1] += 1
return summary
No, just more complex ways of doing so.
import itertools
def summarize_significance(sign_list):
    res = []
    for s in zip(*sign_list):
        d = dict((x[0], len(list(x[1]))) for x in itertools.groupby(sorted(s)))
        res.append([d.get(x, 0) for x in (-1, 0, 1)])
    return res
For starters, you could do:
swapped = numpy.swapaxes(sign_list, 0, 1)
summary = []
for row in swapped:
    mydd = {-1:0, 0:0, 1:0}
    for item in row:
        mydd[item] += 1
    summary.append([mydd[-1], mydd[0], mydd[1]])
return summary
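Since the original function already depends on numpy, the counting can also be done with array comparisons instead of per-row dictionaries. A minimal sketch, assuming the input is rectangular and contains only -1, 0 and 1; the name summarize_significance_np is illustrative and not from any of the answers.

```python
import numpy as np


def summarize_significance_np(sign_list):
    arr = np.asarray(sign_list)
    # For each sign, count matches down every column (one column per state)
    counts = [(arr == sign).sum(axis=0) for sign in (-1, 0, 1)]
    # Stack so that each state gets its own [decrease, no change, increase] row
    return np.stack(counts, axis=1).tolist()


sign_list = [[-1, 1],
             [0, 1],
             [0, 0],
             [0, -1],
             [0, -1]]
print(summarize_significance_np(sign_list))
```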
