More elegant way to find ranges of repeating elements - Python

I have this problem.
Let l be a list containing only 0s and 1s. Find all tuples that represent the start and end of each run of consecutive 1s.
Example:
l=[1,1,0,0,0,1,1,1,0,1]
Answer:
[(0,2),(5,8),(9,10)]
I solved the problem with the following code, but I think it is pretty messy. I would like to know if there is a cleaner way to solve this problem (maybe using map/reduce?).
from collections import deque
def find_range(l):
    pairs=deque((i,i+1) for i,e in enumerate(l) if e==1)
    ans=[]
    p=[0,0]
    while(len(pairs)>1):
        act=pairs.popleft()
        nex=pairs[0]
        if p==[0,0]:
            p=list(act)
        if act[1]==nex[0]:
            p[1]=nex[1]
        else:
            ans.append(tuple(p))
            p=[0,0]
    if(len(pairs)==1):
        if p==[0,0]:
            ans.append(pairs.pop())
        else:
            ans.append((p[0],pairs.pop()[1]))
    return ans

With itertools.groupby magic:
from itertools import groupby
lst = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
indices, res = range(len(lst)), []
for k, group in groupby(indices, key=lambda i: lst[i]):
    if k == 1:
        group = list(group)
        sl = group[0], group[-1] + 1
        res.append(sl)
print(res)
The output:
[(0, 2), (5, 8), (9, 10)]
Or with a more efficient generator function:
def get_ones_coords(lst):
    indices = range(len(lst))
    for k, group in groupby(indices, key=lambda i: lst[i]):
        if k == 1:
            group = list(group)
            yield group[0], group[-1] + 1
lst = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
print(list(get_ones_coords(lst))) # [(0, 2), (5, 8), (9, 10)]
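The generator above still builds a list for every group. As a minimal sketch of the same idea without groupby, a single pass can remember where the current run started. The name iter_one_runs is hypothetical and not part of the answer above.

```python
def iter_one_runs(bits):
    """Yield half-open (start, end) index pairs for each run of 1s.

    Hypothetical helper for illustration; not from the original answer.
    """
    start = None  # index where the current run began, or None outside a run
    for i, bit in enumerate(bits):
        if bit == 1 and start is None:
            start = i                 # a new run begins here
        elif bit != 1 and start is not None:
            yield start, i            # the run ended just before index i
            start = None
    if start is not None:
        yield start, i + 1            # the input ended while inside a run


print(list(iter_one_runs([1, 1, 0, 0, 0, 1, 1, 1, 0, 1])))
# [(0, 2), (5, 8), (9, 10)]
```

Because it only iterates once and keeps two variables, this version should also work on any iterable, not just lists.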
As a short bonus, here's an alternative NumPy approach. It is more sophisticated: it takes the discrete difference between consecutive numbers (numpy.diff) and extracts the indices of the non-zero items (numpy.flatnonzero):
In [136]: import numpy as np
In [137]: lst = [1,1,0,0,0,1,1,1,0,1]
In [138]: arr = np.array(lst)
In [139]: np.flatnonzero(np.diff(np.r_[0, arr, 0]) != 0).reshape(-1, 2)
Out[139]:
array([[ 0,  2],
       [ 5,  8],
       [ 9, 10]])
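To see why the NumPy one-liner works, here is a rough pure-Python restatement of the same edge-detection steps. It is only an illustrative sketch: one_runs_by_diff is a hypothetical name, and the list operations stand in for np.r_, np.diff and np.flatnonzero rather than calling them.

```python
def one_runs_by_diff(bits):
    """Half-open (start, end) pairs for runs of 1s, found via edge detection.

    Hypothetical pure-Python counterpart of the NumPy expression above.
    """
    padded = [0] + list(bits) + [0]                       # like np.r_[0, arr, 0]
    diffs = [b - a for a, b in zip(padded, padded[1:])]   # like np.diff
    edges = [i for i, d in enumerate(diffs) if d != 0]    # like np.flatnonzero
    # Edges alternate: a rising edge (run start), then a falling edge (run end).
    return list(zip(edges[0::2], edges[1::2]))


print(one_runs_by_diff([1, 1, 0, 0, 0, 1, 1, 1, 0, 1]))
# [(0, 2), (5, 8), (9, 10)]
```

Padding with a zero on both sides guarantees that every run has both a rising and a falling edge, which is why the edges can be paired up two at a time.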

Code:
a = [[l.index(1)]]
[l[i] and len(a[-1])==2 and a.append([i]) or l[i] or len(a[-1])==1 and a[-1].append(i) for i in range(len(l))]
Output:
[[0, 2], [5, 8], [9]]

Code:
l=[1,1,0,0,0,1,1,1,0,1]
indices = [ind for ind, elem in enumerate(l) if elem == 1]
diff = [0]+[x - indices[i - 1] for i, x in enumerate(indices)][1:]
change_ind = [0]+[i for i, change in enumerate(diff) if change > 1]+[len(indices)]
split_indices = [tuple(indices[i:j]) for i,j in zip(change_ind,change_ind[1:])]
proper_tuples = [(tup[0], tup[-1]) if len(tup)>2 else tup for tup in split_indices]
print(proper_tuples)
Logic:
indices is the list of indices where the elements of l equal 1 => [0, 1, 5, 6, 7, 9]
diff calculates the difference between consecutive indices found above and prepends a 0 to keep the lengths the same => [0, 1, 4, 1, 1, 2]
change_ind indicates the locations where a split needs to happen, which is wherever diff is greater than 1. The first index and the last index are also added for later use, or else you would only get the middle tuple => [0, 2, 5, 6]
split_indices creates tuples based on the ranges given by consecutive elements of change_ind (using zip, which pairs up the range boundaries) => [(0, 1), (5, 6, 7), (9,)]
Lastly, proper_tuples loops through the tuples created in split_indices and ensures that if their length is greater than 2, only the first and last elements are kept; otherwise the tuple is kept as is => [(0, 1), (5, 7), (9,)]
Output:
[(0, 1), (5, 7), (9,)]
Final Comments:
Although this does not match what OP suggested in the original question:
[(0,2),(5,8),(9,10)]
It does make more logical sense and seems to follow what OP indicated in the comments.
For example, at the start of l there are two ones - so the tuple should be (0, 1) not (0, 2) to match the proposed (start, end) notation.
Likewise at the end there is only a single one - so the tuple corresponding to this is (9,) not (9, 10)

Related

remove some elements from a matrix according to the indices

I have a matrix:
a = ([[1, 0, 0, 0],
      [0, 0, 1, 1],
      [0, 1, 0, 1],
      [1, 0, 0, 1]])
and I want to print the 0s in the matrix but not all of the 0s. I only want to keep the 0s in every row with the smallest index and remove all subsequent zeros in the row.
For instance, in the first row of this matrix, the second element (a[0][1]) should be kept and the rest of elements in the first row should be deleted since they are all zeros.
I used pop() for 2D array but I got attribute error. And the output is not correct too. I don't know how to compare indices and select the smallest column index in every row.
This is my code:
for ix, row in enumerate(a):
    for iy, i in enumerate(row):
        if i==0 and (iy+ix<(iy+1)+ix) :
            a[ix].pop((iy+1))
            print(ix,iy)
        elif i==0 and (iy+ix>(iy+1)+ix):
            a[ix].pop(iy)
            print(ix,iy+1)
print(a)
my expected result is the set of indices and the modified matrix a.
0 1
1 0
2 0
3 1
a=[[1,0],[0,1,1],[0,1,1],[1,0]]
Could anyone help me?
This solution only works if there is at least one zero in every row.
indices = []
for x,row in enumerate(a):
    i = row.index(0)
    indices.append((x,i))
    a[x] = row[:i+1] + [e for e in row[i:] if e]
print(indices)
print(a)
Output
[(0, 1), (1, 0), (2, 0), (3, 1)]
[[1, 0], [0, 1, 1], [0, 1, 1], [1, 0, 1]]
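If some rows might not contain a zero, one possible variation is to check for it before calling index. The following is a minimal sketch under that assumption; keep_first_zero is a hypothetical name, and leaving zero-free rows unchanged is an assumed behaviour, since the question does not say what should happen to them.

```python
def keep_first_zero(matrix):
    """Return (indices, trimmed) without mutating the input matrix.

    Hypothetical helper: rows without any zero are copied unchanged,
    which is an assumption and not something the question specifies.
    """
    indices, trimmed = [], []
    for r, row in enumerate(matrix):
        if 0 in row:
            c = row.index(0)          # column of the first zero in this row
            indices.append((r, c))
            # Keep everything up to the first zero, then only non-zero values.
            trimmed.append(row[:c + 1] + [v for v in row[c + 1:] if v != 0])
        else:
            trimmed.append(list(row))
    return indices, trimmed


a = [[1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 0, 1], [2, 3]]
indices, trimmed = keep_first_zero(a)
print(indices)  # [(0, 1), (1, 0), (2, 0), (3, 1)]
print(trimmed)  # [[1, 0], [0, 1, 1], [0, 1, 1], [1, 0, 1], [2, 3]]
```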
Assuming there's a zero in every row, you can get its column index with
c = np.argmin(a, axis=1)
Alternatively, if the matrix can contain negative numbers, you can do
c = np.argmax(np.equal(a, 0), axis=1)
The rows are just
r = np.arange(len(a))
The result you want is then
result = np.stack((r, c), axis=-1)
If there are rows without a zero in them, you can filter the result with a mask:
mask = np.array(a)[r, c] == 0
result = result[mask, :]
Looking at your example input
a = [[1,0,0,0],[0,0,1,1],[0,1,0,1],[1,0,0,1]]
and the expected output
>>[(0, 1), (1, 0), (2, 0), (3, 1)]
you can reframe the problem as finding the index of the element in each row which has the value zero (and where more than one element exists, return the first).
By framing it this way, the solution is as simple as iterating through each row of a and retrieving the index of the value 0 (whereby only the first element will be returned by default).
Using list comprehension that would look like this:
value_to_find = 0
desired_indexes = [
    row.index(value_to_find) for row in a
]
or using map that would be:
value_to_find = 0
desired_indexes = map(lambda row:row.index(value_to_find), a)
Then you can enumerate them to pair each result with its row number (enumerate returns a lazy iterator, so wrap it in list() to see the pairs):
list(enumerate(desired_indexes))
Et voila!
>>[(0, 1), (1, 0), (2, 0), (3, 1)]
The entire solution can be written in a single line like so:
answer = list(enumerate(map(lambda row:row.index(0), a)))
Try this:
a = [[1, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 1, 0, 1],
     [1, 0, 0, 1]]
b = []
for i in a:
    f = False  # becomes True once the first zero of the row has been kept
    c = []
    for j in i:
        if (j==0 and f==False) or j != 0:
            c.append(j)
            if j == 0: f = True
        else:
            continue
    b.append(c)
print(b)
output:
[[1, 0], [0, 1, 1], [0, 1, 1], [1, 0, 1]]
To get the index of the zero in each row, you can try this:
list({i : j.index(0) for i,j in enumerate(b)}.items())
# [(0, 1), (1, 0), (2, 0), (3, 1)]

Python - Finds the index of the smallest element in the list A from index k onwards

I am stuck in finding how I can take the "k" in consideration to solve the following problem. Basically, it should start at index k and look for the lowest value in the range from k until the end of the list.
def find_min_index(A, k):
    """
    Finds the index of the smallest element in the list A from index k onwards
    Parameters:
    A (list)
    k: index from which start search
    Example use:
    >>> find_min_index([1, 2, 5, -1], 0)
    3
    >>> find_min_index([1, 1, 1, 5, 9], 2)
    2
    """
    minpos = A.index(min(A))
    return minpos
A one-liner solution is this:
return A[k:].index(min(A[k:])) + k
You select the minimal element from A[k:], find its index in A[k:], and add k to it to compensate for the offset of the search area.
A slightly neater solution is this:
sliced = A[k:]
return sliced.index(min(sliced)) + k
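Both variants above copy the tail of the list. As a hedged alternative sketch, you can instead take the minimum over the candidate indices themselves and compare them by the value stored at each index, which avoids the slice and the offset arithmetic. This is just one possible formulation, not the approach from the answer above.

```python
def find_min_index(A, k):
    """Index of the smallest element of A at position k or later."""
    # min() runs over the indices k..len(A)-1 and compares them by A[index];
    # on ties it returns the first (lowest) index it encountered.
    return min(range(k, len(A)), key=A.__getitem__)


print(find_min_index([1, 2, 5, -1], 0))    # 3
print(find_min_index([1, 1, 1, 5, 9], 2))  # 2
```

Note that min raises ValueError when k is at or past the end of the list, because the range of candidate indices is then empty.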
You can use enumerate to keep track of the original index before you slice the list with k as the starting index:
from operator import itemgetter
def find_min_index(A, k):
    return min(list(enumerate(A))[k:], key=itemgetter(1))[0]
so that:
print(find_min_index([1, 2, 5, -1], 0))
print(find_min_index([1, 1, 1, 5, 9], 2))
would output:
3
2
You could use enumerate to find the index of min:
def find_min_index(A, k):
    """
    Finds the index of the smallest element in the list A from index k onwards
    Parameters:
    A (list)
    k: index from which start search
    Example use:
    >>> find_min_index([1, 2, 5, -1], 0)
    3
    >>> find_min_index([1, 1, 1, 5, 9], 2)
    2
    """
    o, _ = min(enumerate(A[k:]), key=lambda i: i[1])
    minpos = k + o
    return minpos
print(find_min_index([1, 2, 3, 4], 1))
print(find_min_index([4, 3, 2, 1], 1))
Output
1
3
You can add k to the index calculated from a sliced input list:
def find_min_index(A, k):
    sliced = A[k:]
    return k + sliced.index(min(sliced))
find_min_index([1, 2, 5, -1], 2) # 3
find_min_index([1, 1, 1, 5, 9], 2) # 2

Find items and repetitions in list

I am working in Python and considering the following problem: given a list, such as [1, 0, -2, 0, 0, 4, 5, 0, 3], which contains the integer 0 multiple times, I would like to have the indices of these 0s and, for each one, the number of times it appears in the list until a different element appears or the list ends.
Given l = [1, 0, -2, 0, 0, 4, 5, 0], the function would return ((1, 1), (3, 2), (7, 1)). The result is a list of tuples. The first element of the tuple is the index (in the list) of the given element and the second is the number of times it is repeated until a different element appears or the list ends.
Naively, I would write something like this:
def myfun(l, x):
    if x not in l:
        print("The given element is not in list.")
    else:
        j = 0
        n = len(l)
        r = list()
        while j <= (n-2):
            count = 0
            if l[j] == x:
                while l[j + count] == x and j <= (n-1):
                    count +=1
                r.append((j, count))
                j += count
            else:
                j += 1
        if l[-1] == x:
            r.append((n-1, 1))
        return r
But I was wondering whether there would be a nicer (shorter?) way of doing the same thing.
Not the prettiest, but a one-liner:
>>> import itertools
>>> l=[1, 0, -2, 0, 0, 4, 5, 0]
>>> [(k[0][0],len(k)) for k in [list(j) for i,j in itertools.groupby(enumerate(l), lambda x: x[1]) if i==0]]
[(1, 1), (3, 2), (7, 1)]
First, itertools.groupby(enumerate(l), lambda x: x[1]) will group by the second item of enumerate(l), but keep the index of the item.
Then [list(j) for i,j in itertools.groupby(enumerate(l), lambda x: x[1]) if i==0] will keep only the 0 values.
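Finally, the outer list comprehension is needed because each group j is an iterator that can only be consumed once; list(j) materializes it so that both its first index and its length can be read.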
Another oneliner with groupby, without using intermediate lists:
>>> from itertools import groupby
>>> l = [1, 0, -2, 0, 0, 4, 5, 0, 3]
>>> [(next(g)[0], 1 + sum(1 for _ in g)) for k, g in groupby(enumerate(l), key=lambda x: x[1]) if k == 0]
[(1, 1), (3, 2), (7, 1)]
In the above, enumerate returns (index, value) tuples, which are then grouped by value. groupby returns (key, iterable) tuples, and if the key is nonzero the group is discarded. For the groups that are kept, next pulls out the first item in the group to get its index, while the rest of the items are consumed by the generator expression passed to sum in order to get the count.
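The same groupby idea can also be written as a small function that tracks the running position itself instead of enumerating. This is a minimal sketch, assuming you want the target value to be a parameter; runs_of and its target argument are hypothetical names.

```python
from itertools import groupby


def runs_of(values, target=0):
    """Return (start_index, run_length) for each run of `target`.

    Hypothetical helper for illustration; not from the original answers.
    """
    result = []
    position = 0  # index of the first element of the current group
    for key, group in groupby(values):
        length = sum(1 for _ in group)   # consume the group to count it
        if key == target:
            result.append((position, length))
        position += length
    return result


print(runs_of([1, 0, -2, 0, 0, 4, 5, 0, 3]))
# [(1, 1), (3, 2), (7, 1)]
```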
This is how I would do this:
l=[1, 0, -2, 0, 0, 4, 5, 0]
lis=[]
t=0
for m in range(len(l)):
    if l[m]==0:
        if t==0:
            k=m
            j=1
            t=1
        else:
            j=j+1
            t=1
        if m==len(l)-1:
            lis.append((k,j))
    else:
        if t==1:
            t=0
            lis.append((k,j))
Another solution, using itertools.takewhile:
from itertools import takewhile
L = [1, 0, -2, 0, 0, 4, 5, 0]
res = []
i = 0
while i < len(L):
    if L[i] == 0:
        t = len(list(takewhile(lambda k: k == 0, L[i:])))
        res.append((i, t))
        i += t
    else:
        i += 1
print(res)
The line
t = len(list(takewhile(lambda k: k == 0, L[i:])))
counts the number of zeroes there are from the current position to the right.
While clear enough, the disadvantage of this solution is that it needs the whole list before processing it.
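If the data arrives as a stream rather than a complete list, a generator that keeps only the current run in memory is one way around that limitation. The following is a minimal sketch; iter_zero_runs is a hypothetical name and is not part of the takewhile answer.

```python
def iter_zero_runs(stream):
    """Lazily yield (start_index, run_length) for each run of zeros.

    Hypothetical helper: works on any iterable, one element at a time.
    """
    start, length = None, 0
    for i, value in enumerate(stream):
        if value == 0:
            if length == 0:
                start = i             # first zero of a new run
            length += 1
        elif length:
            yield start, length       # a non-zero value closes the run
            length = 0
    if length:
        yield start, length           # the stream ended inside a run


print(list(iter_zero_runs(iter([1, 0, -2, 0, 0, 4, 5, 0]))))
# [(1, 1), (3, 2), (7, 1)]
```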

How do I find a value relative to where a list occurs within my list in Python?

I have a list of numbers:
Data = [0,2,0,1,2,1,0,2,0,2,0,1,2,0,2,1,1,...]
And I have a list of tuples of two, which is all possible combinations of the individual numbers above:
Combinations = [(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)]
I want to try to find where each item in Combinations appears in Data and add the value after each occurrence to another list.
For example, for (0,2) I want to make a list [0,0,0,1] because those are the values that fall immediately after (0,2) occurs in Data.
So far, I have:
any(Data[i:i+len(CurrentTuple)] == CurrentTuple for i in xrange(len(Data)-len(CurrentTuple)+1))
Where CurrentTuple is Combinations.pop().
The problem is that this only gives me a Boolean of whether the CurrentTuple occurs in Data. What I really need is the value after each occurrence in Data.
Does anyone have any ideas as to how this can be solved? Thanks!
You can use a dict to group the data to see where/if any comb lands in the original list zipping up pairs:
it1, it2 = iter(Data), iter(Data)
next(it2)
Combinations = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
d = {c: [] for c in Combinations}
ind = 2
for i, j in zip(it1, it2):
    if (i, j) in d and ind < len(Data):
        d[(i, j)].append(Data[ind])
    ind += 1
print(d)
Which would give you:
{(0, 1): [2, 2], (1, 2): [1, 0], (0, 0): [], (2, 1): [0, 1], (1, 1): [2], (2, 0): [1, 2, 1, 2], (2, 2): [], (1, 0): [2], (0, 2): [0, 0, 0, 1]}
You could also do it in reverse:
from collections import defaultdict
it1, it2 = iter(Data), iter(Data)
next(it2)
next_ele_dict = defaultdict(list)
data_iter = iter(Data[2:])
for ind, (i, j) in enumerate(zip(it1, it2)):
    if ind < len(Data) -2:
        next_ele_dict[(i, j)].append(next(data_iter))
def next_ele():
    for comb in set(Combinations):
        if comb in next_ele_dict:
            yield comb, next_ele_dict[comb]
print(list(next_ele()))
Which would give you:
[((0, 1), [2, 2]), ((1, 2), [1, 0]), ((2, 1), [0, 1]), ((1, 1), [2]), ((2, 0), [1, 2, 1, 2]), ((1, 0), [2]), ((0, 2), [0, 0, 0, 1])]
Either approach is better than making a pass over the Data list for every element in Combinations.
To work for arbitrary length tuples we just need to create the tuples based on the length:
from collections import defaultdict
n = 2
next_ele_dict = defaultdict(list)
def chunks(iterable, n):
    for i in range(len(iterable)-n):
        yield tuple(iterable[i:i+n])
data_iter = iter(Data[n:])
for tup in chunks(Data, n):
    next_ele_dict[tup].append(next(data_iter))
def next_ele():
    for comb in set(Combinations):
        if comb in next_ele_dict:
            yield comb, next_ele_dict[comb]
print(list(next_ele()))
You can apply it to whatever implementation you prefer, the logic will be the same as far as making the tuples goes.
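The tuple-building logic can be folded into one self-contained function. This is a minimal sketch, not the answer's exact code: followers is a hypothetical name, and the sample list is only the visible prefix of Data from the question, since the elided tail is unknown.

```python
from collections import defaultdict


def followers(data, n=2):
    """Map each length-n window of data to the values that come right after it.

    Hypothetical helper for illustration.
    """
    result = defaultdict(list)
    # The last window that still has a following element starts at len(data)-n-1.
    for i in range(len(data) - n):
        result[tuple(data[i:i + n])].append(data[i + n])
    return result


# Visible prefix of Data only; the "..." in the question is not filled in.
data = [0, 2, 0, 1, 2, 1, 0, 2, 0, 2, 0, 1, 2, 0, 2, 1, 1]
print(followers(data)[(0, 2)])  # [0, 0, 0, 1]
```

A single pass over the data fills the results for every combination at once, which is the point made above about avoiding one pass per element of Combinations.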
sum([all(x) for x in (Data[i:i+len(CurrentTuple)] == CurrentTuple for i in xrange(len(Data)-len(CurrentTuple)+1))])
What you did returns a generator that produces the following sequence:
[array([False, True], dtype=bool),
array([ True, False], dtype=bool),
array([False, True], dtype=bool),
...
array([False, False], dtype=bool),
array([False, False], dtype=bool),
array([False, False], dtype=bool),
array([False, True], dtype=bool)]
An array in this sequence matches CurrentTuple only if both booleans in it are True. all returns True only if every element is True, so the list built by [all(x) for x in ...] contains True only where the pair of numbers matches CurrentTuple. Each True is counted as 1 when you use sum. I hope it is clear.
If you want to compare only non-overlapping pairs:
[2,2,
0,2,
...]
and keep the algorithm as general as possible, you can use the following:
sum([all(x) for x in (Data[i:i+len(CurrentTuple)] == CurrentTuple for i in xrange(0,len(Data)-len(CurrentTuple)+1,len(CurrentTuple)))])
Although it is much more cryptic, this code is much faster than any alternative that uses append (see "Comparing list comprehensions and explicit loops (3 array generators faster than 1 for loop)" to understand why).
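The snippets in this answer rely on Data being a NumPy array, so that == compares element by element (the array(..., dtype=bool) output above shows this). With plain Python lists and tuples, a comparable count can be sketched as follows; count_windows and its overlapping flag are hypothetical names, and the sample list is only the visible prefix of Data from the question.

```python
def count_windows(data, pattern, overlapping=True):
    """Count positions where a window of len(pattern) items equals pattern.

    Hypothetical helper: overlapping=False steps by len(pattern), mirroring
    the non-overlapping variant shown above.
    """
    n = len(pattern)
    step = 1 if overlapping else n
    # Each comparison yields True or False; sum() counts True as 1.
    return sum(
        tuple(data[i:i + n]) == tuple(pattern)
        for i in range(0, len(data) - n + 1, step)
    )


data = [0, 2, 0, 1, 2, 1, 0, 2, 0, 2, 0, 1, 2, 0, 2, 1, 1]
print(count_windows(data, (0, 2)))                     # 4
print(count_windows(data, (0, 2), overlapping=False))  # 3
```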

Add a 0 or 1 based on a value in a certain column

I hope someone can help me with the following. I have a list called 'List', and I have a list called X.
Now I would like to check whether the value in the third column of each row in List is smaller than (<) X or equal to/bigger than X. If the value is smaller I would like to add a 0 to the 6th column, and a 1 if it is equal or bigger. For each X I would like the answers to be added to the following columns of List. In this case there are 4 X values, so 4 columns should be added to List. My code below probably shows I'm quite an amateur, and I hope you can help me out. Thank you in advance.
List = [(3,5,6,7,6),(3,5,3,2,6),(3,6,1,0,5)]
X= [1,4,5,6]
for item in X:
for number in row[3] for row in List:
count = 0
if number < item:
List[5+count].append(0)
count += 1
return List
else:
List[5+count].append(1)
count += 1
return List
return List
First, you should know that tuples (parenthesis enclosed lists) are immutable, so you can not change anything about them once they're defined. It's better to use a list in your case (enclosed by []).
List = [[3,5,6,7,6],[3,5,3,2,6],[3,6,1,0,5]]
X= [1,4,5,6]
for item in X: # loop on elements of X
    for subList in List: # loop on 'rows' in List
        if subList[2] < item: # test if 3rd element is smaller than item in X
            subList.append(0) # push 0 to the end of the row
        else:
            subList.append(1) # push 1 to the end of the row
List = [(3,5,6,7,6),(3,5,3,2,6),(3,6,1,0,5)]
X= [1,4,5,6]
scores = []
for item in List:
    scores.append(tuple(map(lambda x: 0 if item[2] < x else 1, X)))
result = []
for item, score in zip(List, scores):
    result.append(item + score)
print(result)
# [(3, 5, 6, 7, 6, 1, 1, 1, 1), (3, 5, 3, 2, 6, 1, 0, 0, 0), (3, 6, 1, 0, 5, 1, 0, 0, 0)]
Your indentation is off (you should unindent everything starting with your for statement).
You can't append to tuples (your rows inside the List variable are actually tuples).
Since you are not in a function, return is not allowed here (Python raises a SyntaxError for it).
Since indices start with 0, you should use row[2] for the 3rd column.
There are more elements in your X than the number of rows in List.
That being said, you can also use list comprehensions to implement this. Here is a one-liner that does the same thing:
>>> List = [(3,5,6,7,6),(3,5,3,2,6),(3,6,1,0,5)]
>>> X = [1,4,5,6]
>>> print([tuple(list(t[0])+[0]) if t[0][2] < t[1] else tuple(list(t[0]) + [1]) for t in zip(List, X)])
will print
[(3, 5, 6, 7, 6, 1), (3, 5, 3, 2, 6, 0), (3, 6, 1, 0, 5, 0)]
List = [[3,5,6,7,6],[3,5,3,2,6],[3,6,1,0,5]]
X= [1,4,5,6]
elems = [row[3] for row in List]
for i in range(len(elems)):
    for x in X:
        if elems[i] < x:
            List[i].append(0)
        else:
            List[i].append(1)
print(List)
And you cannot use return if you are not using functions.
return needs to be called from inside a function. It exits the function and hands the specified value back to the caller.
So you can't use it in your program.
In the list, each row is actually a tuple. Tuples don't have an append method, so you can't use that to add to the end of a row.
Also, you can't have two for loops in a single line like that. (Which is not a problem, since we only need one to achieve your output.)
I've modified your code so that it looks similar so it's easier for you to understand.
List = [(3,5,6,7,6),(3,5,3,2,6),(3,6,1,0,5)]
X= [1,4,5,6]
for item in X:
    n = 0
    for row in List:
        if row[3] < item:
            List[n] = List[n] + (0,)
        else:
            List[n] = List[n] + (1,)
        n = n+1
print(List)
You need to add with (0,) or (1,) to show that it's a tuple addition. (Or else python will think that you're adding a tuple with an integer)
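A small illustration of that point, using one row from the question as sample data:

```python
row = (3, 5, 6, 7, 6)

# (0,) is a one-element tuple; the trailing comma is what makes it a tuple.
extended = row + (0,)
print(extended)  # (3, 5, 6, 7, 6, 0)

try:
    row + 0  # without the comma this is just the integer 0
except TypeError as exc:
    print("TypeError:", exc)

# The original tuple is untouched; concatenation always builds a new one.
print(row)  # (3, 5, 6, 7, 6)
```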
agree with Selcuk
[edited #1: Thanks #Rawing, I mistyped > as <]
Here is AlmostGr's version, simplified:
List = [[3, 5, 6, 7, 6], [3, 5, 3, 2, 6], [3, 6, 1, 0, 5]]
X = [1, 4, 5, 6]
for num in X:
    for item in List:
        if num > item[2]:
            item.append(0)
        else:
            item.append(1)
it runs for all elements in X and produces the output:
[[3, 5, 6, 7, 6, 1, 1, 1, 1], [3, 5, 3, 2, 6, 1, 0, 0, 0], [3, 6, 1, 0, 5, 1, 0, 0, 0]]
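If you would rather not modify List in place, the same result can be built with a nested comprehension that returns new rows. This is a minimal sketch; add_flags and its column parameter are hypothetical names chosen for illustration.

```python
def add_flags(rows, thresholds, column=2):
    """Return new rows with one 0/1 flag appended per threshold.

    Hypothetical helper: a flag is 0 when row[column] < threshold, else 1.
    Works for rows given as lists or tuples and leaves the input unchanged.
    """
    return [
        list(row) + [0 if row[column] < t else 1 for t in thresholds]
        for row in rows
    ]


List = [(3, 5, 6, 7, 6), (3, 5, 3, 2, 6), (3, 6, 1, 0, 5)]
X = [1, 4, 5, 6]
print(add_flags(List, X))
# [[3, 5, 6, 7, 6, 1, 1, 1, 1], [3, 5, 3, 2, 6, 1, 0, 0, 0], [3, 6, 1, 0, 5, 1, 0, 0, 0]]
```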
