Duplicating zeroes in an array making room by shifting letters - python

I'm working on a problem which would take this input:
[1, 0, 2, 3, 0, 4, 5, 0]
And output this:
[1, 0, 0, 2, 3, 0, 0, 4]
My function detects zeroes and when a zero is the i-th element of the list, i+1 also becomes a zero and the rest of the list is shifted to make room for it. The elements at the end of the list get pushed out to make room.
I was able to do it with two for loops, but that is O(n^2), and I want to do it in O(n). I came up with this:
new = [0] * len(arr)
zeroes = 0
d = 0
I create a second list of zeroes; zeroes counts the number of zeroes in the list, and d is the index to write to in the second list. The array I use is the input to the function and is named arr.
First I count zeroes:
for i in range(len(arr)):
    if arr[i] == 0:
        zeroes+=1
Then I copy. I check by index whether the value is zero, and if it is, I skip the d-th and (d+1)-th elements.
for i in range(len(arr)-zeroes):
    if arr[i] == 0:
        d+=1
    else:
        new[d] = arr[i]
    d+=1
However for:
[1, 0, 2, 3, 0, 4, 5, 0]
The output is:
[1, 0, 0, 2, 3, 0, 0, 0]
I'm not sure why the last element doesn't change.

Here is a simpler solution which is still O(n):
a = [1,0,2,3,0,4,5,0]
b = []
for i in a:
    b.append(i)
    if i == 0:
        b.append(0)
b = b[:len(a)]
The value of b would be
[1, 0, 0, 2, 3, 0, 0, 4]
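As for why the attempt in the question leaves the last element unchanged: the copy loop runs only len(arr) - zeroes times, but zeroes also counts zeroes that end up pushed off the end. For this input the loop stops after 5 source elements and never reaches the 4. Below is a minimal sketch of the same two-index idea that stops when the output is full instead. The function name duplicate_zeros is illustrative, not from the question.

```python
def duplicate_zeros(arr):
    """Return a new list of the same length with each zero doubled."""
    n = len(arr)
    new = [0] * n  # pre-filled with zeroes, so duplicates need no write
    d = 0          # next write position in `new`
    for value in arr:
        if d >= n:
            break  # output is full; remaining elements are pushed out
        new[d] = value
        d += 1
        if value == 0:
            d += 1  # skip one slot, leaving the duplicate zero in place
    return new


print(duplicate_zeros([1, 0, 2, 3, 0, 4, 5, 0]))  # [1, 0, 0, 2, 3, 0, 0, 4]
```

This makes a single pass over the input, so it stays O(n).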

Related

Counting and summing list elements, in sections separated by three or more zeros

I have a list of integers which I want to separate according to a certain condition. I want to get the sum and the count of the list elements, stopping when three or more consecutive elements are equal to 0; then the sum and count orders restart again from where they stopped.
For example, part of the list is:
[8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
The process would be:
8, 2, 1, 1, 2 -> sum: 14, length: 5
0, 0, 0, 0, 0
6, 0, 2 -> sum: 8, length: 3
0, 0, 0
8, 0, 0, 2 -> sum: 10, length: 4
0, 0, 0
6, 0, 0 -> sum: 6, length: 3
So the output I want is:
[[14, 5], [8, 3], [10, 4], [6, 3]]
What I've written so far computes the sum okay, but my problem is that zeros within sections aren't counted in the lengths.
Current (incorrect) output:
[[14, 5], [8, 2], [10, 2], [6, 2]]
Code:
arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
result = []
summed, count = 0, 0
for i in range(0, len(arr) - 2):
    el, el1, el2 = arr[i], arr[i + 1], arr[i + 2]
    if el != 0:
        summed = summed + el
        count = count + 1
    if el == 0 and el1 == 0 and el2 == 0:
        if summed != 0:
            result.append([summed, count])
            summed = 0
            count = 0
    elif i == len(arr) - 3:
        summed = el + el1 + el2
        count = count + 1
        result.append([summed, count])
        break
print(result)
It is quite hard to understand what your code does. Working with strings seems more straightforward and readable; your output can be achieved in just two lines (thanks to @CrazyChucky for the improvement):
import re
arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
# Convert to String by joining integers, and split into substrings, the separator being three zeros or more
strings = re.split(r'0{3,}', ''.join(str(i) for i in arr))
# Sums and counts using list comprehensions
output = [[sum(int(x) for x in substring), len(substring)] for substring in strings]
Output:
>>> output
[[14, 5], [8, 3], [10, 4], [6, 3]]
Remember that readability is always the most important factor in any code. Someone reading your code for the first time should be able to understand how it works.
If the full list contains numbers with more than one digit, you can do the following:
# Convert to String by joining integers, separating them with a comma, and split into substrings, the separator being three zeros or more
strings = re.split(r',?(?:0,){3,}', ','.join(str(i) for i in arr))
# Make a list of numbers from those strings
num_lists = [string.split(',') for string in strings]
# Sums and counts using list comprehensions
output = [[sum(int(x) for x in num_list), len(num_list)] for num_list in num_lists]
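For multi-digit values, another option is to skip the string conversion and group runs of zeros directly with itertools.groupby. This is only a sketch of that alternative, not part of the answer above. The names section_stats and min_zeros are illustrative.

```python
from itertools import groupby


def section_stats(values, min_zeros=3):
    """Return (sum, length) for each section separated by >= min_zeros zeros."""
    sections = []
    current = []
    # groupby yields consecutive runs of zeros and consecutive runs of nonzeros.
    for is_zero, group in groupby(values, key=lambda v: v == 0):
        run = list(group)
        if is_zero and len(run) >= min_zeros:
            # A long run of zeros is a separator: close the current section.
            if current:
                sections.append((sum(current), len(current)))
            current = []
        else:
            # Nonzeros and short runs of zeros belong to the current section.
            current.extend(run)
    if current:
        sections.append((sum(current), len(current)))
    return sections


arr = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0, 0]
print(section_stats(arr))  # [(14, 5), (8, 3), (10, 4), (6, 3)]
```

Because the elements are compared as numbers, a value such as 10 is never mistaken for a zero.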
This answer is not so much to suggest a way I'd recommend doing it, as to highlight how clever Paul Lemarchand's idea of using a regular expression is. Without Python's re module doing the heavy lifting for you, you have to either look ahead to see how many zeros are coming (as in Prakash Dahal's answer), or keep track of how many zeros you've seen as you go. I think this implementation of the latter is about the simplest and shortest way you could solve this problem "from scratch":
input_list = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0,
              0, 6, 0, 0]
output_list = []
current_run = []
pending_zeros = 0
for num in input_list:
    # If the number is 0, increment the number of "pending" zeros. (We
    # don't know yet if they're part of a separating chunk or not.)
    if num == 0:
        pending_zeros += 1
    # If this is the first nonzero after three or more zeros, process
    # the existing run and start over from the current number.
    elif pending_zeros >= 3:
        output_list.append((sum(current_run), len(current_run)))
        current_run = [num]
        pending_zeros = 0
    # Otherwise, the pending zeros (if any) should be included in the
    # current run. Add them, and then the current number.
    else:
        current_run += [0] * pending_zeros
        current_run.append(num)
        pending_zeros = 0
# Once we're done looping, there will still be a run of numbers in the
# buffer (assuming the list had any nonzeros at all). It may have
# pending zeros at the end, too. Include the zeros if there are 2 or
# fewer, then process.
if current_run:
    if pending_zeros <= 2:
        current_run += [0] * pending_zeros
    output_list.append((sum(current_run), len(current_run)))
print(output_list)
[(14, 5), (8, 3), (10, 4), (6, 3)]
One note: I made each entry in the list a tuple rather than a list. Tuples and lists have a lot of overlap, and in this case either would probably work perfectly well... but a tuple is a more idiomatic choice for an immutable data structure that will always be the same length, in which each position refers to something different. (In other words, it's not a list of equivalent items, but rather a well-defined combination of (sum, length).)
Use this:
a = [8, 2, 1, 1, 2, 0, 0, 0, 0, 0, 6, 0, 2, 0, 0, 0, 8, 0, 0, 2, 0, 0, 0, 6, 0,0]
total_list = []
i = 0
sum_int = 0
count_int = 0
for ind, ele in enumerate(a):
    if ind < (len(a) - 2):
        sum_int += ele
        if sum_int != 0:
            count_int += 1
        if (a[ind] == 0) and (a[ind+1] == 0) and (a[ind+2] == 0):
            if sum_int != 0:
                count_int -= 1
                total_list.append([sum_int, count_int])
            sum_int = 0
            count_int = 0
    else:
        sum_int += ele
        count_int += 1
        if sum_int != 0:
            total_list.append([sum_int, count_int+1])
            sum_int = 0
            count_int = 0
print(total_list)

Repeating each element of a vector by a number of times provided by another counts vector [duplicate]

Say I have an array of longitudes, lonPort:
lonPort = np.loadtxt('LongPorts.txt', delimiter=',')
for example:
lonPort=[0,1,2,3,...]
And I want to repeat each element a different number of times. How do I do this? This is what I tried:
Repeat =[5, 3, 2, 3,...]
lonPort1=[]
for i in range (0,len(lenDates)):
    lonPort1[sum(Repeat[0:i])]=np.tile(lonPort[i],Repeat[i])
So the result would be:
lonPort1=[0,0,0,0,0,1,1,1,2,2,3,3,3,...]
The error I get is:
list assignment index out of range
How do I get rid of the error and make my array?
Thank you!
You can use np.repeat():
np.repeat(a, [5,3,2,3])
Example:
In [3]: a = np.array([0,1,2,3])
In [4]: np.repeat(a, [5,3,2,3])
Out[4]: array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3])
Without relying on numpy, you can create a generator that will consume your items one by one and repeat each the desired number of times.
x = [0, 1, 2, 3]
repeat = [4, 3, 2, 1]
def repeat_items(x, repeat):
    for item, r in zip(x, repeat):
        while r > 0:
            yield item
            r -= 1
for value in repeat_items(x, repeat):
    print(value, end=' ')
displays 0 0 0 0 1 1 1 2 2 3.
Providing a numpy-free solution for future readers that might want to use lists.
>>> lst = [0,1,2,3]
>>> repeat = [5, 3, 2, 3]
>>> [x for sub in ([x]*y for x,y in zip(lst, repeat)) for x in sub]
[0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3]
If lst contains mutable objects, be aware of the pitfalls of sequence multiplication for sequences holding mutable elements.
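A further numpy-free variant, sketched here with the standard library's itertools.repeat and itertools.chain.from_iterable, avoids building the intermediate sub-lists. The names values, counts and expanded are illustrative.

```python
from itertools import chain, repeat

values = [0, 1, 2, 3]
counts = [5, 3, 2, 3]

# repeat(v, n) lazily yields v exactly n times; chain.from_iterable
# concatenates those per-element streams into one flat sequence.
expanded = list(chain.from_iterable(repeat(v, n) for v, n in zip(values, counts)))
print(expanded)  # [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3]
```

As with sequence multiplication, repeat yields the same object each time, so the caveat about mutable elements applies here too.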

Create a new list from a given list such that the new list can flag consecutive repetitions in the given list

I have a long list (several hundred thousand items) of numbers and I want to create a new list of equal size to find out the places where there are consecutive repetitions of numbers. The new list will have 0 and 1 values, such that for consecutive repeated indexes the new list will have 1 and for remaining indexes it will have 0 value.
If there is something like a pandas column operation that can help, that would be useful as well.
Sample given list and resultant array. List can have float values also.
given_array = [1, 2, 3, 5, 5, 5, 5, 0, -2, -4, -6, -8, 9, 9, 9]
result_array = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
I have given a small working example of my code below.
import itertools
def list_from_count(list_item):
    """
    Function takes an iterator and based on the length of the item
    returns 1 if length is 1 or list of 0 for length greater than 1
    """
    if len(list(list_item[1])) == 1:
        return 1
    else:
        return [0] * len(list(list_item[1]))
r0 = list(range(1,4))
r1 = [5]*4
r2 = list(range(0,-10,-2))
r3 = [9]*3
r = r0 + r1 + r2 + r3
gri = itertools.groupby(r)
res = list(map(list_from_count,gri))
print ("Result",'\n',res)
Result
[1, 1, 1, [], 1, 1, 1, 1, 1, []]
Thanks in advance!
You can use itertools.groupby and output repeated 1s if the length of a group is greater than 1:
from itertools import groupby
result_array = []
for _, g in groupby(given_array):
    size = sum(1 for i in g)
    if size == 1:
        result_array.append(0)
    else:
        result_array.extend([1] * size)
or with a list comprehension:
result_array = [i for _, g in groupby(given_array) for s in (sum(1 for i in g),) for i in ([0] if s == 1 else [1] * s)]
result_array becomes:
[0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
You're calling len(list(list_item[1])) twice. The first call consumes all the items in the iterator. By the second call the iterator is exhausted, so the length is 0 and [0] * 0 yields an empty list.
You need to save the length in a variable the first time:
def list_from_count(list_item):
    l = len(list(list_item[1]))
    if l == 1:
        return [0]
    else:
        return [1] * l
You also need to return a list consistently from this function, then you can concatenate all the results, so you don't get a mix of numbers and sublists.
res = []
for el in gri:
    res += list_from_count(el)
print(res)
This situation is more akin to a run length encoding problem. Consider more_itertools.run_length:
Given
import more_itertools as mit
iterable = [1, 2, 3, 5, 5, 5, 5, 0, -2, -3, -6, -8, 9, 9, 9]
Code
result = [[0] if n == 1 else [1] * n for _, n in mit.run_length.encode(iterable)]
result
# [[0], [0], [0], [1, 1, 1, 1], [0], [0], [0], [0], [0], [1, 1, 1]]
Now simply flatten the sub-lists (however you wish) into one list:
list(mit.flatten(result))
# [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
Details
mit.run_length.encode compresses an iterable by yielding tuples of (value, # of repetitions), e.g.:
list(mit.run_length.encode("abaabbba"))
# [('a', 1), ('b', 1), ('a', 2), ('b', 3), ('a', 1)]
Our comprehension ignores the value, uses repetitions n and creates sub-lists of [0] and [1] * n.
Note: more_itertools is a third-party package. Install via > pip install more_itertools.
Use the pandas shift method to create a vector shifted by one element, and compare it to the original. This gives a vector of True/False values showing where an element matches the previous one. Then run a linear pass over that vector to extend each run by one element at the front: change [False, True] to [True, True]. Convert to int, and you have the list you specified.
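The shift-and-compare idea can be sketched in plain Python, without pandas, by comparing each element with its predecessor and then extending each run one position to the front. This is an illustration of the idea rather than the pandas code the answer has in mind. The function name flag_repeats is hypothetical.

```python
def flag_repeats(values):
    """Return 1 where an element is part of a consecutive repetition, else 0."""
    n = len(values)
    # Equivalent of comparing the list with a copy shifted by one element.
    same_as_prev = [i > 0 and values[i] == values[i - 1] for i in range(n)]
    flags = []
    for i in range(n):
        # Extend each run one element to the front: [False, True] -> [True, True].
        same_as_next = i + 1 < n and same_as_prev[i + 1]
        flags.append(int(same_as_prev[i] or same_as_next))
    return flags


given_array = [1, 2, 3, 5, 5, 5, 5, 0, -2, -4, -6, -8, 9, 9, 9]
print(flag_repeats(given_array))
# [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
```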

How to sum up each element in array list?

Having a bit of trouble writing out the code.
For example, if I have an array of:
a = ([0, 0, 1, 2], [0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 3], [0, 1, 1, 3])
If I want to add the first element of each item,
as in to return a list of 0 + 0 + 0 + 1 + 0, 0 + 1 + 0, 0 + 0 ...
I wrote the code:
def test(lst):
    sum = 0
    test_lst = []
    i = 0
    while i in range(0, 4):
        for j in range(0, len(lst)):
            sum += lst[j][i]
            test_lst.append(sum)
            i += 1
    return test_lst
I get index size error.
How can I go about this?
sum(zip(*a)[0])
zip is a function that takes any number of n-length sequences and returns n tuples (among other things). The first of these tuples has the elements that came first in the tuples passed to zip. sum adds them together.
EDIT:
In Python 3, the above doesn't work because zip returns an iterator, which can't be indexed. Use:
sum(next(zip(*a)))
instead. For all such sums,
map(sum, zip(*a))
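In Python 3, map also returns a lazy iterator, so the column sums have to be materialized before they can be printed or indexed. A minimal sketch using the data from the question (the name column_sums is only illustrative):

```python
a = ([0, 0, 1, 2], [0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 3], [0, 1, 1, 3])

# zip(*a) transposes the rows into columns; sum each column.
column_sums = [sum(column) for column in zip(*a)]
print(column_sums)  # [1, 2, 5, 8]
```

Wrapping the map call in list(), as in list(map(sum, zip(*a))), gives the same result.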
a = ([0, 0, 1, 2], [0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 3], [0, 1, 1, 3])
Try using list comprehensions:
sum([item[0] for item in a])
The line above takes the first element of each list in the tuple, then puts it into a temporary list. We then call sum on that temporary list, which yields the answer.

initialization with python list

I have a problem with correct initialization within a list
import random
a = [random.randint(0,1) for x in range(10)] # getting random 0 and 1
b = a[:] # copying 'a' list for purpose of analysis
for x,y in enumerate(b): # adding + 1 where value is 1
    if y != 0:
        b[x] += b[x-1]
print(a) # > [1, 0, 0, 1, 1, 1, 0, 0, 1, 1]
print(b) # > [2, 0, 0, 1, 2, 3, 0, 0, 1, 2]
# wanted # > [1, 0, 0, 1, 2, 3, 0, 0, 1, 2]
From a[1:] onward everything is OK. However, if a[0] == 1 and a[9] == 1, Python of course takes a[9] as the starting value in my case.
I am just asking if there is a Pythonic way to solve this: telling Python to start from 0 at a[0] instead of using a[9] as the first value.
Thanks
You can skip the first value rather easily:
for x,y in enumerate(b[1:]): # adding + 1 where value is 1
    if y != 0:
        b[x + 1] += b[x]
b[1:] just skips the first value from the list for the enumeration. This way the first number is untouched. But because the indexes in x are now all one too low, we have to add one in both cases, turning x into x + 1 and x - 1 into x. This way we access the right index.
Using your test list, it produced the following output:
[1, 0, 0, 1, 1, 1, 0, 0, 1, 1] # original list
[1, 0, 0, 1, 2, 3, 0, 0, 1, 2] # processed list
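The underlying cause is that at x == 0 the expression b[x-1] is b[-1], which in Python refers to the last element of the list. An alternative that avoids indexing the previous element altogether is to carry a running streak counter. This is only a sketch, and the function name run_lengths is illustrative. It assumes the list holds only 0s and 1s, as in the question.

```python
def run_lengths(bits):
    """Replace each 1 with its position inside the current run of 1s."""
    out = []
    streak = 0  # length of the current run of nonzero values
    for bit in bits:
        streak = streak + 1 if bit else 0  # a zero resets the run
        out.append(streak)
    return out


a = [1, 0, 0, 1, 1, 1, 0, 0, 1, 1]
print(run_lengths(a))  # [1, 0, 0, 1, 2, 3, 0, 0, 1, 2]
```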
