I have a collection of lists of integer values in python like the following:
[0, 0, 1, 0, 1, 0, 0, 2, 1, 1, 1, 2, 1]
Now I would like to have a somewhat "smoothed" sequence in which each value whose preceding and following values are equal to each other (and differ from the value in question) is replaced with that neighbouring value. So my list above becomes:
[0, 0, 0, 0, 0, 0, 0, 2, 1, 1, 1, 1, 1]
(Processing goes from left to right, just to resolve possibly conflicting groupings.)
How could I achieve this?
Bonus: the same as above, with a parameter for how many preceding and following values must match to change the central value (2-2 or 3-3 instead of just 1-1).
A straightforward loop should do the trick:
_list = [0, 0, 1, 0, 1, 0, 0, 2, 1, 1, 1, 2, 1]
for i in range(1, len(_list)-1):
    if _list[i-1] == _list[i+1]:
        _list[i] = _list[i-1]
print(_list)
Output:
[0, 0, 0, 0, 0, 0, 0, 2, 1, 1, 1, 1, 1]
arr = [0, 0, 1, 0, 1, 0, 0, 2, 1, 1, 1, 2, 1]
res = [arr[0]]
for i in range(1, len(arr)):
    if res[i-1] not in arr[i:i+2]:
        res.append(arr[i])
    else:
        res.append(res[i-1])
print(res)
To allow the number of preceding/following values to be changed, you can pad the list and iterate over the padded list with a moving window, checking whether all surrounding values are the same.
def smooth(lst, values=1, padding=None):
    padded = [padding] * values + lst + [padding] * values
    for i, n in enumerate(lst):
        surrounding = set(padded[i:i+values] + padded[i+values+1:i+values*2+1])
        if len(surrounding) == 1:
            yield surrounding.pop()
        else:
            yield n
print(list(smooth([0, 0, 1, 0, 1, 0, 0, 2, 1, 1, 1, 2, 1]))) # [0, 0, 0, 1, 0, 0, 0, 2, 1, 1, 1, 1, 1]
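If your input list may contain None, choose a different padding value when calling the generator. Note that the window reads the original values, not the already smoothed ones, so the result differs from the left-to-right output requested in the question (index 3 stays 1).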
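A left-to-right variant with a configurable window could be sketched as follows. This is only a sketch, not taken from the answers above: the function name `smooth_left_to_right` and the parameter `k` are hypothetical, and it assumes that "k preceding-following values" means the k values on each side must all be equal to one another. Because it reads already smoothed values on the left, `k=1` behaves like the simple loop shown earlier.

```python
def smooth_left_to_right(values, k=1):
    """Return a smoothed copy of values, processing from left to right.

    An element is replaced when the k values before it (already smoothed)
    and the k values after it (still original) are all the same value.
    """
    result = list(values)
    for i in range(k, len(result) - k):
        before = result[i - k:i]
        after = result[i + 1:i + k + 1]
        neighbours = set(before) | set(after)
        if len(neighbours) == 1:
            # All 2*k neighbours agree, so adopt their value.
            result[i] = before[0]
    return result


data = [0, 0, 1, 0, 1, 0, 0, 2, 1, 1, 1, 2, 1]
print(smooth_left_to_right(data))  # [0, 0, 0, 0, 0, 0, 0, 2, 1, 1, 1, 1, 1]
```

The first and last k elements are never changed, since they do not have k neighbours on both sides.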
For instance, given the following list, I want to randomly select 4 elements equal to 1 and set them to 0.
lst = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0]
-> Function <-
lst = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0]
or
lst = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
or
lst = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0]
or ....
Using numpy:
import numpy as np

ones = np.where(lst)[0]  # indices of the nonzero (1) elements
to_zero = np.random.choice(ones, 4, replace=False)
for i in to_zero:  # Alternatively, if lst is an array: lst[to_zero] = 0
    lst[i] = 0
To achieve what you want, you first need to get the indexes of the elements equal to 1. Then you need to pick 4 of those indexes at random and update the values from the starting list.
Note that you cannot use random.choices here because it samples with replacement and may return the same index more than once; use random.sample instead.
import random

def update_randomly(tab):
    n = range(len(tab))
    one_indexes = [i for i in n if tab[i] == 1]
    rdm_indexes = random.sample(one_indexes, 4)
    return [0 if i in rdm_indexes else tab[i] for i in n]
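As a hedged generalization of the function above, the number of elements to reset can be made a parameter. The name `zero_random_ones` and the `count` and `rng` parameters are hypothetical; the sketch assumes the list contains only 0s and 1s.

```python
import random


def zero_random_ones(values, count, rng=random):
    """Return a copy of values with `count` randomly chosen 1s set to 0."""
    one_indexes = [i for i, v in enumerate(values) if v == 1]
    if count > len(one_indexes):
        raise ValueError("not enough elements equal to 1")
    # random.sample picks without replacement, so the indexes are unique.
    chosen = set(rng.sample(one_indexes, count))
    return [0 if i in chosen else v for i, v in enumerate(values)]


lst = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0]
result = zero_random_ones(lst, 4, rng=random.Random(0))
print(sum(lst) - sum(result))  # 4
```

Passing a seeded `random.Random` instance makes the selection reproducible, which is convenient for testing; the default uses the module-level generator.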
Consider the following list:
l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
The 1s subdivide the list into 5 parts:
l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
I would like each part to have no more than n consecutive zeros (if possible) before a 1, but the existing 1s must not be erased. Also, there should be no two 1s next to each other.
Quick example: let's say n = 3, l should be:
l = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
For n = 2 it would be:
l = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
For the first part, I did not include a 1 after two zeros because then you would have two 1s following each other.
Any idea how I can do this?
Here is what I tried:
import numpy as np
max_number_of_cells_per_list = 3
l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(l)
# Find position of 1
pos_1 = [i for i, x in enumerate(l) if x == 1]
# Get number of cells
pos_1.insert(0, 0)
numb_cells = np.diff(pos_1)
n = np.round(np.divide(numb_cells, max_number_of_cells_per_list))
k = 0
j = 0
for i, li in enumerate(l):
    if l[i] == 1:
        if n[k] > 1:
            add = int((i-j)/n[k])
            for jj in range(int(n[k])):
                if jj == n[k]-1:
                    jj = i
                else:
                    jj += add
                l[jj] = 1
        k += 1
        j = i
print(l)
If you try to run the code, you will see that it makes no difference to l. I don't understand why... but I am not too interested to find my mistake if you have better/different ideas. :)
Since you are using NumPy, here is a solution using it. Note that it is not vectorized, and I'm not really sure if you can vectorize it as we have to perform grouping operations on the array, and NumPy doesn't have much functionality for that (though it's possible that I just don't see it yet).
I will be using np.split to get the [0, ..., 1] groups, and then handle two cases: first, groups that don't actually end with 1 (a possible group at the end of the array), and second, groups that have more than n + 2 zeros. Then I just insert a 1 at every (n + 1)-th position, making sure that no two 1s end up next to each other.
import numpy as np
a = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0])
n = 3 # or n = 2, or any other n >= 0 value
result = []
for array in np.split(a.copy(), np.where(a == 1)[0] + 1):
    last_index = -2 if array[-1] == 1 else None
    array[n:last_index:n + 1] = 1
    result.append(array)
np.concatenate(result)
# for n = 3: array([0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0])
# for n = 2: array([0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1])
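One edge case worth noting: when the input ends with 1, np.split returns an empty trailing group, and `array[-1]` then raises an IndexError. A minimal sketch that wraps the same slicing idea in a function and skips empty groups is shown below; the name `fill_gaps` is hypothetical.

```python
import numpy as np


def fill_gaps(values, n):
    """Insert 1s so that runs of zeros are at most n long where possible."""
    a = np.asarray(values)
    if a.size == 0:
        return []
    result = []
    # Split right after every 1, so each group looks like [0, ..., 0, 1].
    for group in np.split(a.copy(), np.where(a == 1)[0] + 1):
        if group.size == 0:
            continue  # trailing empty group when the input ends with 1
        # Stop two positions early so a new 1 never touches the final 1.
        last_index = -2 if group[-1] == 1 else None
        group[n:last_index:n + 1] = 1
        result.append(group)
    return np.concatenate(result).tolist()


l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(fill_gaps(l, 3))
# [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
```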
Alternatively, instead of splitting the array in multiple parts and operating on them, we could operate only on indices of 1. For example, here I get initial indices of 1, and add more of them in between using range:
from itertools import tee
l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
n = 3
def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def one_indices(seq, n):
    """Returns new indices where we will put 1s"""
    indices = [index + 1 for index, value in enumerate(seq) if value == 1]
    complete_groups_count = len(indices)  # those that end with 1
    indices = [0, *indices, len(seq)]
    for group_index, (start, end) in enumerate(pairwise(indices), start=1):
        if group_index <= complete_groups_count:
            yield from range(start + n, end - 2, n + 1)
            yield end - 1
        else:  # last group that doesn't end with 1
            yield from range(start + n, end, n + 1)

result = [0] * len(l)
for index in one_indices(l, n):  # pass n rather than a hard-coded 3
    result[index] = 1
result
# for n = 3: [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
# for n = 2: [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
This is probably more efficient than splitting and concatenating arrays as in the first example, but it's also more difficult to read.
Finally, as a bonus, here is a solution using pandas. I saw in your previous related questions that you were using it, so you may find it useful:
from functools import partial
import pandas as pd
def fill_ones(series, n):
    last_index = -2 if series.iloc[-1] == 1 else None
    series.iloc[n:last_index:n + 1] = 1
    return series
l = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
s = pd.Series(l)
groups = s.shift().eq(1).cumsum()
fill_w_distance_3 = partial(fill_ones, n=3)
s.groupby(groups).transform(fill_w_distance_3).tolist()
# [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
fill_w_distance_2 = partial(fill_ones, n=2)
s.groupby(groups).transform(fill_w_distance_2).tolist()
# [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
I am looking to generate a Python list using an if statement that checks whether each number is even or odd. If the number is even, the list value should be 1; if it is odd, the list value should be 0.
Progress to date:
List1 = [x for x in range(0,99) if x % 2 == 0]
However, this only generates a list of even numbers. When I change the expression to add an else check I get a syntax error. Any help appreciated.
List1 = [1 for x in range(0,99) if x % 2 == 0 else 0]
You are using a filter, where you want to alter the left-hand-side expression instead, using a conditional expression:
[1 if x % 2 == 0 else 0 for x in range(99)]
This can be simplified to:
[1 - (x % 2) for x in range(99)]
Move the if/else into the expression part of the list comprehension, as below.
>>> [1 if x % 2 == 0 else 0 for x in range(0,99)]
[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
This would print 1 for even numbers and 0 for odd numbers.
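To make the difference between the two forms concrete, here is a small illustration (the variable names are only for this example). A trailing `if` filters which items are kept, while a conditional expression before `for` chooses the value emitted for every item; the two can also be combined.

```python
numbers = range(6)

# Trailing `if` is a filter: odd numbers are dropped entirely.
evens_only = [x for x in numbers if x % 2 == 0]

# Conditional expression: every number is kept and mapped to 1 or 0.
flags = [1 if x % 2 == 0 else 0 for x in numbers]

# Both together: keep only even numbers, then flag multiples of 4.
multiples_of_four = [1 if x % 4 == 0 else 0 for x in numbers if x % 2 == 0]

print(evens_only)          # [0, 2, 4]
print(flags)               # [1, 0, 1, 0, 1, 0]
print(multiples_of_four)   # [1, 0, 1]
```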
First, I should say that this question is not homework; I finished my software engineering studies a few months ago. Today at work a friend asked me about this strange sorting problem:
"I have a list with 1000 rows, each row representing a number, and I want to create 10 sub-lists that each have a similar sum of numbers from the main list. How can I do that?"
For example, if the main list is 5, 4, 3, 2 and 1, it's simple: I create two sub-lists,
one with 5 and 3 and the other with 4, 2 and 1. The sums are similar: 8 for the first and 7 for the second.
I can't figure out the algorithm, even though I know it's simple; I'm missing something.
Let A be the input array. I'll assume it is sorted ascending.
A = [2,3,6,8,11]
Let M[i] be the number of sublists found so far whose sum equals i.
Start with only M[0] = 1, because there is one list whose sum equals zero: the empty list.
M = [1,0,0,...]
Then take each item from the list A one by one.
Update the number of ways you have to compose a list of each sum, given
that the item you just took can now be used.
Suppose a is the new item
for each j:
    if M[j] != 0:
        M_next[j+a] = M[j+a] + M[j]
When you find any M[j] that reaches 10 during this process, you should stop the algorithm.
Also, modify it to remember the items in each list, so that you can get the actual lists at the end!
Notes:
You can use a sparse representation for M.
This is similar to the knapsack and subset sum problems.
You may find better algorithms by reading up on those.
Here is working code in Python:
A = [2,3,6,8,11]
t = sum(A)
M = [0]*(t+1)
M[0] = 1
print('init M :', M)
for a in A:
    # Go downwards so that each item is used at most once per sublist.
    for j in range(len(M)-1, -1, -1):
        if M[j] != 0:
            M[j+a] += M[j]
    print('use', a, ':', M)
And its output:
init M : [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
use 2 : [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
use 3 : [1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
use 6 : [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
use 8 : [1, 0, 1, 1, 0, 1, 1, 0, 2, 1, 1, 2, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
use 11 : [1, 0, 1, 1, 0, 1, 1, 0, 2, 1, 1, 3, 0, 2, 2, 0, 2, 2, 0, 3, 1, 1, 2, 0, 1, 1, 0, 1, 1, 0, 1]
Take M[11] = 3 at the end, for example:
it means there are 3 sublists whose sum equals 11.
If you trace the progress, you can see the sublists are {2,3,6}, {3,8} and {11}.
To account for the fact that the 10 sublists only need similar sums, not exactly the same sum, you might want to change the termination condition from "terminate if any M[j] >= 10" to "terminate if sum(M[j:j+3]) >= 10" or something like that.
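The suggestion to remember the items can be sketched by storing the sublists themselves instead of only their count. This is a minimal illustration, assuming positive integers; the function name `subsets_by_sum` is hypothetical, and because it keeps every sublist, memory grows exponentially with the input size, so it is only practical for small inputs.

```python
def subsets_by_sum(items):
    """Map each reachable sum to the list of sublists that produce it."""
    total = sum(items)
    # subsets[s] holds every sublist found so far whose sum is s.
    subsets = [[] for _ in range(total + 1)]
    subsets[0].append([])  # the empty list sums to zero
    for a in items:
        # Go downwards so that `a` is used at most once per sublist.
        for j in range(total - a, -1, -1):
            for subset in subsets[j]:
                subsets[j + a].append(subset + [a])
    return subsets


found = subsets_by_sum([2, 3, 6, 8, 11])
print(found[11])  # [[2, 3, 6], [3, 8], [11]]
```

Here `len(found[s])` plays the role of M[s] in the counting version above.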