I need to count the largest cycle of 'TRUE' in a boolean
I have a boolean Series with several TRUE sequences. I would like to be able to identify the largest cycle of TRUE values.
E.G: [0,0,1,1,0,0,0,0,0,0,1,1,1,1,1]
I would like to have the cycle: [10,14]
My first approach would be to compare element by element and take the index of each true value. The problem I see with this is that I'm working with a considerably large dataset, so I'm afraid it will take a long time.
Do you guys have any other idea that might work?
Thanks :)
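One possible solution without loops is to count consecutive 1s (or Trues) with a cumulative sum that resets at every 0, find the positions where that count reaches its maximum (these are the ends of the longest runs), and subtract the maximum from them to get the run starts: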
import pandas as pd

s = pd.Series([0,0,1,1,0,0,0,0,0,0,1,1,1,1,1])
print (s)
a = s == 1
b = a.cumsum()
c = b.sub(b.mask(a).ffill().fillna(0)).astype(int)
print (c)
0 0
1 0
2 1
3 2
4 0
5 0
6 0
7 0
8 0
9 0
10 1
11 2
12 3
13 4
14 5
dtype: int32
m = c.max()
idx = c.index[c == m]
print (idx)
Int64Index([14], dtype='int64')
out = list(zip(idx - m + 1, idx))
print (out)
[(10, 14)]
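The same run-boundary idea can also be sketched directly in NumPy. This is a minimal sketch, not part of the answer above: the function name longest_true_run is made up for illustration, and it returns positions (0-based offsets), not index labels, so it only matches the pandas result when the Series has a default index.

```python
import numpy as np

def longest_true_run(values):
    """Return (start, end) positions of the longest run of truthy values, or None."""
    mask = np.asarray(values, dtype=bool)
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], mask, [False]))
    changes = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(changes == 1)   # first position of each run
    ends = np.flatnonzero(changes == -1)    # one past the last position of each run
    if starts.size == 0:
        return None                         # no truthy values at all
    longest = np.argmax(ends - starts)      # argmax keeps the first run on ties
    return int(starts[longest]), int(ends[longest] - 1)

print(longest_true_run([0,0,1,1,0,0,0,0,0,0,1,1,1,1,1]))  # (10, 14)
```

Like the cumsum version, this makes a constant number of vectorized passes over the data and has no Python-level loop.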
Another idea uses itertools.groupby: enumerate the values so each one carries its position, keep only the groups of 1s as lists, then take the list with maximum length and read off its minimal and maximal indices:
s = pd.Series([0,0,1,1,0,0,0,0,0,0,1,1,1,1,1])
print (s)
from itertools import groupby
a = [ list(group) for key, group in groupby(enumerate(s), key= lambda x:x[1]) if key]
print (a)
[[(2, 1), (3, 1)], [(10, 1), (11, 1), (12, 1), (13, 1), (14, 1)]]
L=[x[0] for x in max(a, key=len)]
out = [min(L), max(L)]
print (out)
[10, 14]
It looks like you'll have to go through the whole dataset somehow. But you don't need the index of each True value. You just need the index of the final one in the longest streak.
Note that if there's a tie, this will only print the first one, because a later streak has to be strictly longer to replace it.
my_bools = [0,0,1,1,0,0,0,0,0,0,1,1,1,1,1]
max_streak = 0
cur_streak = 0
max_streak_idx = -1
listlen = len(my_bools)
for x in range(0, listlen):
    if my_bools[x]:
        cur_streak += 1
        if cur_streak > max_streak:
            max_streak_idx = x
            max_streak += 1
    else:
        cur_streak = 0
    print(x, cur_streak, max_streak)
if max_streak_idx == -1:
    print("No trues found")
else:
    print("Start of max = ", max_streak_idx - max_streak + 1, "End of max = ", max_streak_idx)
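If ties matter, the same single pass can collect every longest streak instead of only one. The following is a sketch under that assumption; the function name longest_streaks and its return shape are illustrative choices, not something from the answer above.

```python
def longest_streaks(values):
    """Return (length, [(start, end), ...]) covering every longest run of truthy values."""
    best_len = 0
    best_runs = []
    cur = 0
    for i, v in enumerate(values):
        if v:
            cur += 1
            if cur > best_len:
                # A strictly longer streak replaces everything found so far.
                best_len = cur
                best_runs = [(i - cur + 1, i)]
            elif cur == best_len:
                # A later streak that ties the record is kept as well.
                best_runs.append((i - cur + 1, i))
        else:
            cur = 0
    return best_len, best_runs

print(longest_streaks([0,0,1,1,0,0,0,0,0,0,1,1,1,1,1]))  # (5, [(10, 14)])
print(longest_streaks([1, 1, 0, 1, 1]))                  # (2, [(0, 1), (3, 4)])
```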
This is my array:
import numpy as np
Arr = np.array( [
["","A","B","C","D","E","F"],
["1","0","0","0","0","0","0"],
["2","0","0","0","0","0","0"],
["3","0","X","0","0","0","0"],
["4","0","0","0","0","0","0"],
["5","0","0","0","X","0","0"],
["6","X","0","0","0","0","0"],
["7","0","0","0","0","0","0"],
["8","0","0","0","0","0","0"]
])
I want to do a binary search but I don't know how to do it with an array of strings. Basically I want to look at the position in where all my "X" are.
def findRow(a, n, m, k):
    # a is the 2d array
    # n is the number of rows
    # m is the number of columns
    # k is the "X"
    l = 0
    r = n - 1
    mid = 0
    while (l <= r):
        mid = int((l + r) / 2)
        # we'll check the left and
        # right most elements
        # of the row here itself
        # for efficiency
        if (k == a[mid][0]):  # checking leftmost element
            print("Found at (", mid, ",", "0)", sep="")
            return
        if (k == a[mid][m - 1]):  # checking rightmost element
            t = m - 1
            print("Found at (", mid, ",", t, ")", sep="")
            return
        # this means the element must be within this row
        if (k > a[mid][0] and k < a[mid][m - 1]):
            # we'll apply binary search on this row
            binarySearch(a, n, m, k, mid)
            return
        if (k < a[mid][0]):
            r = mid - 1
        if (k > a[mid][m - 1]):
            l = mid + 1

def binarySearch(a, n, m, k, x):  # x is the row number
    # now we simply have to apply binary search as we
    # did in a 1-D array, for the elements in row
    # number x
    l = 0
    r = m - 1
    mid = 0
    while (l <= r):
        mid = int((l + r) / 2)
        if (a[x][mid] == k):
            print("Found at (", x, ",", mid, ")", sep="")
            return
        if (a[x][mid] > k):
            r = mid - 1
        if (a[x][mid] < k):
            l = mid + 1
    print("Element not found")
This is what I have tried, but it is for 2D arrays of ints. Now I have a 2D array of strings and I'm trying to find the locations of all my "X"s.
I want the output to be: found in (A,6), (B,3), (D,5)
You can use np.where to get the indices for each axis, then zip them to get index tuples for all the locations:
>>> list(zip(*np.where(Arr == "X")))
[(3, 2), (5, 4), (6, 1)]
If you want the (row, column) "locations", you could do this:
>>> [(Arr[row, 0], Arr[0, col]) for row, col in zip(*np.where(Arr == "X"))]
[('3', 'B'), ('5', 'D'), ('6', 'A')]
However, you seem to be treating an array as a table. You should consider using Pandas:
>>> import pandas as pd
>>> df = pd.DataFrame(Arr[1:, 1:], columns=Arr[0, 1:], index=range(1, len(Arr[1:]) + 1))
>>> df
   A  B  C  D  E  F
1  0  0  0  0  0  0
2  0  0  0  0  0  0
3  0  X  0  0  0  0
4  0  0  0  0  0  0
5  0  0  0  X  0  0
6  X  0  0  0  0  0
7  0  0  0  0  0  0
8  0  0  0  0  0  0
>>> rows, cols = np.where(df == "X")
>>> [*zip(df.index[rows], df.columns[cols])]
[(3, 'B'), (5, 'D'), (6, 'A')]
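To get the exact "found in (A,6), (B,3), (D,5)" text from the question, the np.where result can be mapped back to the header labels and sorted by column. This is only a sketch: it assumes the first row and first column of Arr always hold the labels, and the variable name hits is made up for the example.

```python
import numpy as np

Arr = np.array([
    ["",  "A", "B", "C", "D", "E", "F"],
    ["1", "0", "0", "0", "0", "0", "0"],
    ["2", "0", "0", "0", "0", "0", "0"],
    ["3", "0", "X", "0", "0", "0", "0"],
    ["4", "0", "0", "0", "0", "0", "0"],
    ["5", "0", "0", "0", "X", "0", "0"],
    ["6", "X", "0", "0", "0", "0", "0"],
    ["7", "0", "0", "0", "0", "0", "0"],
    ["8", "0", "0", "0", "0", "0", "0"],
])

# Search only the data cells, leaving out the header row and label column.
rows, cols = np.where(Arr[1:, 1:] == "X")

# Map positions back to labels; +1 restores the offset lost by slicing.
hits = sorted((str(Arr[0, c + 1]), str(Arr[r + 1, 0])) for r, c in zip(rows, cols))

print("found in " + ", ".join(f"({col},{row})" for col, row in hits))
# found in (A,6), (B,3), (D,5)
```

A binary search is not applicable here because the cells are not sorted; a full scan such as np.where is what finds every "X".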
#Given an array of integers, find the maximal possible sum of some of its k consecutive elements.
#Sample Input:
#inputArray = [2, 3, 5, 1, 6] and k = 2
#Sample Output:
#arrayMaxConsecutiveSum(inputArray, k) = 8
#Explanation:
#All possible sums of 2 consecutive elements are:
#2 + 3 = 5;
#3 + 5 = 8;
#5 + 1 = 6;
#1 + 6 = 7.
#Thus, the answer is 8
Your question is not clear, but assuming you need a function that returns the highest sum of k consecutive numbers in your list:
def arrayMaxConsecutiveSum(inputArray, k):
    groups = (inputArray[pos:pos + k] for pos in range(0, len(inputArray), 1))  # iterate through array creating groups of up to k elements
    highest_value = 0  # start highest value at 0 (assumes the sums are not all negative)
    for group in groups:  # iterate through groups
        if len(group) == k and sum(group) > highest_value:  # if group has exactly k numbers and its sum is higher than the previous best
            highest_value = sum(group)  # make new value highest
    return highest_value
inputArray = [2, 3, 5, 1, 6]
print(arrayMaxConsecutiveSum(inputArray, 2))
#8
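For long lists, re-summing every group costs O(n*k). A sliding window keeps a running total instead, and seeding the best value from the first window also handles all-negative input. This is a sketch of that alternative technique, not the answer's method; the name max_consecutive_sum and the choice to raise ValueError for an invalid k are assumptions made for the example.

```python
def max_consecutive_sum(values, k):
    """Largest sum of k consecutive elements, computed in O(n)."""
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    window = sum(values[:k])      # sum of the first window
    best = window
    for i in range(k, len(values)):
        # Slide the window one step: add the new element, drop the oldest one.
        window += values[i] - values[i - k]
        best = max(best, window)
    return best

print(max_consecutive_sum([2, 3, 5, 1, 6], 2))  # 8
print(max_consecutive_sum([-5, -2, -3], 2))     # -5
```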
How to code a program that shows me the item that appears most side-by-side?
Example:
6 1 6 4 4 4 6 6
I want four, not six, because there are only two sixes together.
This is what I tried (from comments):
c = int(input())
h = []
for c in range(c):
    h.append(int(input()))
final = []
n = 0
for x in range(c-1):
    c = x
    if h[x] == h[x+1]:
        n+=1
        while h[x] != h[c]:
            n+=1
        final.append([h[c],n])
print(final)
Depends on what exactly you want for an input like
lst = [1, 1, 1, 2, 2, 2, 2, 1, 1, 1]
If you consider the four 2s the most common, because they form the longest unbroken stretch of the same item, then you can groupby same values and pick the group with max len:
max((len(list(g)), k) for k, g in itertools.groupby(lst))
# (4, 2) # meaning 2 appeared 4 times
If you are interested in the element that appears the most often next to itself, you can zip the list to get pairs of adjacent items, filter those that are same, pass them through a Counter, and get the most_common:
collections.Counter((x,y) for (x,y) in zip(lst, lst[1:]) if x == y).most_common(1)
# [((1, 1), 4)] # meaning (1,1) appeared 4 times
For your example of 6 1 6 4 4 4 6 6, both approaches pick 4.
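Both one-liners need their imports to run. Here they are wrapped as a self-contained sketch; the function names longest_run_value and most_common_adjacent are made up for illustration, and the Counter here counts the repeated value itself instead of the (x, y) pair. Note that max() breaks ties on run length by picking the larger value, and that it raises ValueError on an empty list.

```python
import collections
import itertools

def longest_run_value(items):
    """Return (value, run_length) for the longest unbroken run of equal items."""
    length, value = max((len(list(g)), k) for k, g in itertools.groupby(items))
    return value, length

def most_common_adjacent(items):
    """Return [(value, count)] for the value most often equal to its right neighbour."""
    counts = collections.Counter(x for x, y in zip(items, items[1:]) if x == y)
    return counts.most_common(1)  # empty list when no two neighbours are equal

example = [6, 1, 6, 4, 4, 4, 6, 6]
print(longest_run_value(example))     # (4, 3)
print(most_common_adjacent(example))  # [(4, 2)]

lst = [1, 1, 1, 2, 2, 2, 2, 1, 1, 1]
print(longest_run_value(lst))         # (2, 4)
print(most_common_adjacent(lst))      # [(1, 4)]
```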
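def longest_run(arr):
    max_count = 0  # length of the longest side-by-side run seen so far
    num = None     # element that forms that run
    i = 0
    while i < len(arr):  # loop through the array one run at a time
        j = i
        # advance j past every element equal to arr[i]
        while j < len(arr) and arr[j] == arr[i]:
            j += 1
        if j - i > max_count:  # a strictly longer run was found
            max_count = j - i
            num = arr[i]
        i = j  # continue with the first element of the next run
    return num, max_count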
Using non-library Python code, how can I return the index and count of the longest sequence of even numbers?
a = [1, 3, 2, 6, 4, 1, 2, 2, 2, 8, 1]
should return 6 and 4, 6 being the index and 4 being the count.
I tried without luck..
def evenSeq(list):
    count=0
    for i in list:
        if list[i]%2 and list[i+1]%2==0:
            count+=1
    return count
Here's a possible solution:
def even_seq(l):
    best = (-1, -1)
    start_i = 0
    count = 0
    for i, n in enumerate(l):
        if n % 2 == 0:
            count += 1
            if count > best[1]:
                best = (start_i, count)
        else:
            start_i = i + 1
            count = 0
    return best
a=[1,3,2,6,4,2,2,2,2,2,1,2,2,2,8,1]
def evenSeq(a):
    largest = 0
    temp_largest = 0
    location = 0
    for count, value in enumerate(a):
        if value % 2 == 0:
            temp_largest += 1
        else:
            temp_largest = 0
        if temp_largest >= largest:
            largest = temp_largest
            # +1 because the current index is itself part of the streak
            location = count + 1 - temp_largest
    return location, largest
print(evenSeq(a)) #returns 2 8
Not the prettiest solution, but it helps teach you what's going on and the basic logic of the solution. It checks whether each number is even and keeps a count of the current streak in temp_largest. It then checks whether temp_largest is the biggest known streak at that moment and, if so, updates the start index using the counter from enumerate.
Edited based on comment:
for count, value in enumerate(a):
This line goes through the list, putting each item in value and its index in count. enumerate() goes through whatever you pass it and returns a count starting from 0 along with each item. See below for an example.
a=[1,3,2,6,4,2,2,2,2,2,1,2,2,2,8,1]
for index, value in enumerate(a):
    print('{} index and value is {}'.format(index,value))
Prints out:
0 index and value is 1
1 index and value is 3
2 index and value is 2
3 index and value is 6
4 index and value is 4
5 index and value is 2
6 index and value is 2
7 index and value is 2
8 index and value is 2
9 index and value is 2
10 index and value is 1
11 index and value is 2
12 index and value is 2
13 index and value is 2
14 index and value is 8
15 index and value is 1
I would try it this way:
def evenSeq(seq):
    i = startindex = maxindex = maxcount = 0
    while i < len(seq):
        if seq[i]%2==0:
            startindex = i
            while i < len(seq) and seq[i]%2==0:
                i+=1
            if maxcount < i - startindex:
                maxcount = i - startindex
                maxindex = startindex
        i+=1
    return (maxindex, maxcount)
I have a file with two columns, let's say A and B
A B
1 10
0 11
0 12
0 15
1 90
0 41
I want to create a new column (a list); let's call the empty list C = []
I would like to loop through A, find if A == 1, and if it is I want to append the value of B[A==1] (10 in the first case) to C until the next A == 1 arrives.
So my final result would be:
A B C
1 10 10
0 11 10
0 12 10
0 15 10
1 90 90
0 41 90
I have tried using the for loop, but only to my dismay:
for a in A:
    if a == 1:
        C.append(B[a==1])
    elif a == 0:
        C.append(B[a==1])
You could use another variable to keep the value of the last index in A that had a value of 1, and update it when the condition is met:
temp = 0
for index, value in enumerate(A):
    if value == 1:
        C.append(B[index])
        temp = index
    else:
        C.append(B[temp])
enumerate() yields (index, value) tuples from an iterable.
For A, list(enumerate(A)) will be [(0, 1), (1, 0), (2, 0), (3, 0), (4, 1), (5, 0)].
P.S: When you try to address a list using a boolean (B[a == 1]) it will return the item in the first place if the condition is false (B[a != 1] => B[False] => B[0]) or the item in the second place if it's true (B[a == 1] => B[True] => B[1]).
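A short demonstration of that boolean-indexing behaviour, using the A and B values from the question. It reproduces what the original loop actually builds, so the wrong result is easy to see.

```python
A = [1, 0, 0, 0, 1, 0]
B = [10, 11, 12, 15, 90, 41]

# bool is a subclass of int, so True and False index a list like 1 and 0.
assert B[True] == B[1] == 11
assert B[False] == B[0] == 10

# The loop from the question therefore only ever reads B[0] or B[1].
C = []
for a in A:
    C.append(B[a == 1])

print(C)  # [11, 10, 10, 10, 11, 10]
```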
You may also try using groupby.
Though the solution I have come up with looks a bit convoluted to me:
>>> from itertools import groupby, repeat
>>> from operator import itemgetter
>>> def gen_group(L):
...     acc = 0
...     for item in L:
...         acc += item
...         yield acc
...
>>> [number_out for number,length in ((next(items)[1], 1 + sum(1 for _ in items)) for group,items in groupby(zip(gen_group(A), B), itemgetter(0))) for number_out in repeat(number, length)]
[10, 10, 10, 10, 90, 90]
The idea is to prepare groups and then use them to group your input:
>>> list(gen_group(A))
[1, 1, 1, 1, 2, 2]
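The same grouping idea may be easier to follow when unrolled into a loop. In this sketch itertools.accumulate plays the role of gen_group (it produces the same running sums), and the function name fill_from_markers is made up for illustration. It assumes A contains only 0s and 1s and starts with a 1; if A starts with 0, the leading rows are filled with B[0].

```python
from itertools import accumulate, groupby

def fill_from_markers(A, B):
    """Repeat the B value found at each A == 1 row until the next A == 1 row."""
    C = []
    # The running sum of A is constant within a block and increases at every 1.
    labelled = zip(accumulate(A), B)
    for _, items in groupby(labelled, key=lambda pair: pair[0]):
        block = [value for _, value in items]
        # The first value of the block is the one that sat next to the 1.
        C.extend([block[0]] * len(block))
    return C

A = [1, 0, 0, 0, 1, 0]
B = [10, 11, 12, 15, 90, 41]
print(list(accumulate(A)))      # [1, 1, 1, 1, 2, 2]
print(fill_from_markers(A, B))  # [10, 10, 10, 10, 90, 90]
```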