I'm working with lists that look as follows:
[2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
In the end I want to extract only the first and last integer in a consecutive series, as such:
[(2,8),(13,20),(30,35)]
I am new to Python. Below is my attempt at solving this problem:
helix = []
single_prot_helices = []
for ind,pos in enumerate(prot[:-1]):
    if pos == prot[ind+1]-1: #if 2 == 3-1 essentially
        helix.append(pos)
    elif pos < prot[ind+1]-1: #if 8 < 13-1 for example
        helix.append(pos)
        single_prot_helices.append(helix) #save in a permanent list, clear temp list
        helix.clear()
In this case prot is a list just like the example above. I expected single_prot_helices to look something like this:
[[2,3,4,5,6,7,8],[13,14,15,16,17,18,19,20],[30,31,32,33,34,35]]
and at this point it would have been easy to get the first and last integer from these lists and put them in a tuple, but instead of the expected list I got:
[[20,30,31,32,33,34,35],[20,30,31,32,33,34,35]]
Only the last series of numbers was returned, and I got one list fewer than expected (expected 3, received 2). I don't understand where I made a mistake, since I believe my code follows my logic: look at the number (pos) and at the next number. If the next number is larger by 1, add pos to a list (helix). If the next number is larger by more than 1, add pos to helix, append helix to a permanent list (single_prot_helices), and then clear helix to prepare it for the next series of numbers.
Any help will be highly appreciated.
You could do something like this:
foo = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
series = []
result = []
for i in foo:
    # if the series is empty or the element is consecutive
    if (not series) or (series[-1] == i - 1):
        series.append(i)
    else:
        # append a tuple of the first and last item of the series
        result.append((series[0], series[-1]))
        series = [i]
# append the final series; the check is needed in case foo is empty
if series:
    result.append((series[0], series[-1]))
print(result) # [(2, 8), (13, 20), (30, 35)]
Or, as a generator:
def generate_series(list_of_int):
    series = []
    for i in list_of_int:
        if not series or series[-1] == i - 1:
            series.append(i)
        else:
            yield (series[0], series[-1])
            series = [i]
    if series:
        yield (series[0], series[-1])

foo = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
print(list(generate_series(foo))) # [(2, 8), (13, 20), (30, 35)]
Yours has a few problems.
The main one is that helix is a mutable list and you only ever clear it. As a result you append the same list object every time, so every entry in single_prot_helices refers to that one list, which is why they're all identical.
The first fix is to assign a new list to helix rather than clearing.
prot = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
helix = []
single_prot_helices = []
for ind,pos in enumerate(prot[:-1]):
    if pos == prot[ind+1]-1: #if 2 == 3-1 essentially
        helix.append(pos)
    elif pos < prot[ind+1]-1: #if 8 < 13-1 for example
        helix.append(pos)
        single_prot_helices.append(helix) #save in a permanent list, start a new temp list
        helix = []
print(single_prot_helices) # [[2, 3, 4, 5, 6, 7, 8], [13, 14, 15, 16, 17, 18, 19, 20]]
As you can see the last list is missed. That is because the last helix is never appended.
You could add:
if helix:
    single_prot_helices.append(helix)
But that still only gives you:
[[2, 3, 4, 5, 6, 7, 8], [13, 14, 15, 16, 17, 18, 19, 20], [30, 31, 32, 33, 34]]
leaving out the last element since you only ever iterate to the second from last one.
Which means you would need to do something complicated and confusing like this outside of your loop:
if helix:
    if helix[-1] == prot[-1] - 1:
        # the last element continues the current series
        helix.append(prot[-1])
        single_prot_helices.append(helix)
    else:
        # the last element forms a series of its own
        single_prot_helices.append(helix)
        single_prot_helices.append([prot[-1]])
else:
    single_prot_helices.append([prot[-1]])
Giving you:
[[2, 3, 4, 5, 6, 7, 8], [13, 14, 15, 16, 17, 18, 19, 20], [30, 31, 32, 33, 34, 35]]
If you're still confused by names and mutability, Ned Batchelder does a wonderful job of explaining the concepts with visual aids.
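The difference between clearing a list and rebinding its name can be seen in a minimal sketch. The names `saved` and `temp` are illustrative only and do not come from the question's code:

```python
# Illustrative names only; not taken from the question's code.
saved = []
temp = [1, 2, 3]
saved.append(temp)   # saved now holds a reference to the same list object
temp.clear()         # mutates that object, so the entry in saved empties too
after_clear = [list(entry) for entry in saved]

saved = []
temp = [1, 2, 3]
saved.append(temp)
temp = []            # rebinds the name; the appended list is left untouched
after_rebind = [list(entry) for entry in saved]

print(after_clear)   # [[]]
print(after_rebind)  # [[1, 2, 3]]
```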
You can try this one that uses zip().
res = []
l = prot[0]  # start of the current series
for x, y in zip(prot, prot[1:]):
    if y-x > 1:
        # gap found: x closes the current series, y opens the next one
        res.append((l, x))
        l = y
res.append((l, prot[-1])) # [(2, 8), (13, 20), (30, 35)]
The solution of @Axe319 works, but it doesn't explain what is happening with your code.
The safest way to store a list's current contents is to append a copy made with copy(); otherwise you store a reference (a pointer) to the same list.
When you add helix to single_prot_helices, you add a reference to that list, so you will have:
# helix = [2, 3, 4, 5, 6, 7, 8]
# single_prot_helices = [[2, 3, 4, 5, 6, 7, 8]]
And when you do helix.clear() you will have:
# helix = []
# single_prot_helices = [[]]
Why? Because in single_prot_helices you added a reference to the list, not a copy of its elements.
After the second iteration you will have
# helix = [13, 14, 15, 16, 17, 18, 19, 20]
# single_prot_helices = [[13, 14, 15, 16, 17, 18, 19, 20], [13, 14, 15, 16, 17, 18, 19, 20]]
Why two lists in single_prot_helices? Because you added a second reference to the same list, and the first was still there.
Add some prints to your code to understand it well. Note that I added 40, 41 in the list so that you can have better visualisation:
prot = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35,40,41]
helix = []
single_prot_helices = []
for ind,pos in enumerate(prot[:-1]):
    if pos == prot[ind+1]-1: #if 2 == 3-1 essentially
        helix.append(pos)
    elif pos < prot[ind+1]-1: #if 8 < 13-1 for example
        helix.append(pos)
        print("helix")
        print(helix)
        single_prot_helices.append(helix) #save in a permanent list, clear temp list
        print(single_prot_helices)
        helix.clear()
        print("clear")
        print(helix)
        print(single_prot_helices)
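The copy() idea mentioned in this answer could look roughly like the sketch below. It is one possible fix, not the asker's original loop: the loop here walks every element (instead of prot[:-1]) so the final series is not lost, and the `is_last` flag is a name introduced for illustration.

```python
prot = [2, 3, 4, 5, 6, 7, 8, 13, 14, 15, 16, 17, 18, 19, 20, 30, 31, 32, 33, 34, 35]

helix = []
single_prot_helices = []
for ind, pos in enumerate(prot):
    helix.append(pos)
    # A series ends at the last element or when the next value is not pos + 1.
    is_last = ind == len(prot) - 1
    if is_last or prot[ind + 1] != pos + 1:
        single_prot_helices.append(helix.copy())  # store a snapshot, not an alias
        helix.clear()                             # safe: the snapshot is independent

bounds = [(h[0], h[-1]) for h in single_prot_helices]
print(single_prot_helices)
print(bounds)  # [(2, 8), (13, 20), (30, 35)]
```

Because each appended list is a separate copy, clearing helix afterwards no longer affects what was stored.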
In a single line
data = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
mask = iter(data[:1] + sum(([i1, i2] for i1, i2 in zip(data, data[1:] + data[:1]) if i2 != i1+1), []))
print(list(zip(mask, mask)))
This gives:
[(2, 8), (13, 20), (30, 35)]
Functional approach
Using groupby and count from itertools: within a run of consecutive numbers, the difference between each item and its position is constant, so groupby can group on that difference. The positions are supplied by an itertools.count counter.
from itertools import groupby, count
data = [2,3,4,5,6,7,8,13,14,15,16,17,18,19,20,30,31,32,33,34,35]
final = []
c = count()
for key, group in groupby(data, key = lambda x: x-next(c)):
    block = list(group)
    final.append((block[0], block[-1]))
print(final)
This also gives:
[(2, 8), (13, 20), (30, 35)]
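To see why grouping on "value minus position" works, here is a small sketch that prints the keys. It swaps the shared count() counter for enumerate, which is an assumption of this sketch rather than part of the answer above; the grouping idea is the same.

```python
from itertools import groupby

data = [2, 3, 4, 5, 6, 7, 8, 13, 14, 15, 16, 17, 18, 19, 20, 30, 31, 32, 33, 34, 35]

# value - index stays constant inside a run of consecutive integers.
keys = [value - index for index, value in enumerate(data)]

runs = []
for _, group in groupby(enumerate(data), key=lambda pair: pair[1] - pair[0]):
    block = [value for _, value in group]
    runs.append((block[0], block[-1]))

print(sorted(set(keys)))  # [2, 6, 15]
print(runs)               # [(2, 8), (13, 20), (30, 35)]
```

Using enumerate keeps the key function free of side effects, so it does not depend on how many times groupby calls it.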
I'm trying to create a simple script that will help me for network planning.
I have a list of numbers--
subnets = [21, 21, 21, 19, 20, 23, 23]
For every duplicate, I'm looking to replace 2 duplicated numbers with a new number that's minus 1. For example, replace (2) of the 21s with a 20. Replace (2) 23s with a 22, and so on.
I've got--
x = sorted(subnets)
z = []
for a in x:
    if a not in z:
        z.append(a)
    else:
        z.remove(a)
        z.append(a-1)
print(z)
That kinda gets me what I want--
[19, 20, 20, 21, 22]
but there's (2) 20s in my final output, and that needs to be replaced with a 19, and then the (2) remaining 19s need to be replaced with an 18.
How do I continually loop my logic here?
My final output should be [18, 21, 22]
The code below gets me what I want, but it's nasty, and I won't know how many for loops to use as my subnets list changes. If I add another 10 numbers to the subnets list, I don't want to have to change the script every time to keep adding for loops.
subnets = [21, 21, 21, 19, 20, 23, 23]
x = sorted(subnets)
z = []
y = []
v = []
for a in x:
    if a not in z:
        z.append(a)
    else:
        z.remove(a)
        z.append(a-1)
for a in z:
    if a not in y:
        y.append(a)
    else:
        y.remove(a)
        y.append(a-1)
for a in y:
    if a not in v:
        v.append(a)
    else:
        v.remove(a)
        v.append(a-1)
print(v)
The following solution may be suboptimal in terms of performance, but it seems easy to understand. You need a Counter object from the standard Python library that counts duplicate objects in a list.
subnets = [21, 21, 21, 19, 20, 23, 23]
from collections import Counter
The loop goes for as long as there are duplicates - that is, the length of the set of items is smaller than the length of the list of items (sets eliminate duplicates).
while len(subnets) > len(set(subnets)):
    # [v-1] * (cnt // 2): one v-1 for every pair of v
    # [v] * (cnt % 2): the leftover v when the count is odd
    # sum(..., []) concatenates the sublists back into one list
    subnets = sum([[v-1] * (cnt // 2) + [v] * (cnt % 2)
                   for v, cnt in Counter(subnets).items()], [])
# [18, 21, 22]
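To follow what each pass of the loop does, the same idea can be wrapped in a function that records the list after every pass. This is only a sketch: `merge_pairs` and `steps` are hypothetical names, and list.extend replaces the sum(..., []) concatenation.

```python
from collections import Counter

def merge_pairs(subnets):
    """Hypothetical helper: repeat the Counter pass until no duplicates remain."""
    steps = [sorted(subnets)]
    while len(subnets) > len(set(subnets)):
        merged = []
        for value, cnt in Counter(subnets).items():
            merged.extend([value - 1] * (cnt // 2))  # one value-1 per pair
            merged.extend([value] * (cnt % 2))       # unpaired leftover, if any
        subnets = merged
        steps.append(sorted(subnets))
    return sorted(subnets), steps

result, steps = merge_pairs([21, 21, 21, 19, 20, 23, 23])
for step in steps:
    print(step)
print(result)  # [18, 21, 22]
```

For the sample input the recorded steps are [19, 20, 21, 21, 21, 23, 23], [19, 20, 20, 21, 22], [19, 19, 21, 22] and finally [18, 21, 22].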
This is similar to your idea; we just run the replacement operation in a while loop, modifying a copy of the list in place:
def replace_pairs_minus_one(inp):
    l = inp[:]
    while True:
        for i, e in enumerate(l):
            if e in l[i+1:]:
                # pair found
                # here we modify the list, since we are breaking anyway
                l[i] = e - 1
                l.remove(e)
                break
        else:
            # break out of while if no pairs found
            break
    return sorted(l)

subnets = [21, 21, 21, 19, 20, 23, 23]
pairs_removed = replace_pairs_minus_one(subnets) # [18, 21, 22]
I want to go through each element of an array I've created. However, while debugging I found that my loop isn't doing that. Here's what I have so far and what it prints out.
def prob_thirteen(self):
    #create array of numbers 2-30
    xcoords = [range(2,31)]
    ycoords = []
    for i in range(len(xcoords)):
        print 'i:', xcoords[i]
output:
i: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
Why does 'i' return my whole array and not just the first element: 2? I'm not sure why this is returning my whole array.
xcoords = [range(2,31)]
This line will create an array of length 1. The only element in that array is an array of the numbers 2 -> 30. Your loop is printing the elements of the outer array. Change that line to:
xcoords = range(2,31)
This answer is correct for Python 2 because the range function returns a list. Python 3 will return a range object (which can be iterated on producing the required values). The following line should work in Python 2 and 3:
xcoords = list(range(2,31))
First of all, change xcoords so that it isn't a list inside a list:
xcoords = range(2, 31)
We don't need to iterate over a list by indexing into it with range(len(xcoords)). In Python we can simply iterate over the list itself:
for coord in xcoords:
    print "i: ", coord
If we did need to keep track of the index we could use enumerate:
for i, coord in enumerate(xcoords):
    print str(i) + ":", coord
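For readers on Python 3, where print is a function and range returns a range object, the same point can be sketched as follows. The names `nested` and `flat` are illustrative only:

```python
nested = [range(2, 31)]     # a list with a single element: the range object
flat = list(range(2, 31))   # a list of the 29 integers from 2 to 30

print(len(nested), len(flat))  # 1 29

# Iterate directly; enumerate supplies the index when it is needed.
for i, coord in enumerate(flat[:3]):
    print(f"{i}: {coord}")
```

The loop in the question ran once because the outer list has exactly one element.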
Can you help me with code that returns the partial sums of the numbers in a text file?
I must read the text file, then write code for the partial sums without tools, etc.
My input:
4
13
23
21
11
The output should be (without brackets or commas):
4
17
40
61
72
I tried to write the code in Python, but could only get the total sum, not the partial ones.
If I use the += operator on a generator, it gives me an error!
Well, since everyone seems to be giving their favourite idiom for solving the problem, how about itertools.accumulate in Python 3:
>>> import itertools
>>> nums = [4, 13, 23, 21, 11]
>>> list(itertools.accumulate(nums))
[4, 17, 40, 61, 72]
There are a number of ways to create your sequence of partial sums. I think the most elegant is to use a generator.
def partial_sums(iterable):
    total = 0
    for i in iterable:
        total += i
        yield total
You can run it like this:
nums = [4, 13, 23, 21, 11]
sums = list(partial_sums(nums)) # [ 4, 17, 40, 61, 72]
Edit To read the data values from your file, you can use another generator, and chain them together. Here's how I'd do it:
with open("filename.in") as f_in:
    # Sums generator that "feeds" from a generator expression that reads the file
    sums = partial_sums(int(line) for line in f_in)
    # Do output:
    for value in sums:
        print(value)
    # If you need to write to a file, comment the loop above and uncomment this:
    # with open("filename.out", "w") as f_out:
    #     f_out.writelines("%d\n" % value for value in sums)
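The chained-generator pipeline can be tried without a real file. In the sketch below, io.StringIO stands in for the opened input file, and the blank-line filter is an addition of this sketch (not part of the answer) so that a trailing empty line does not break int():

```python
import io

def partial_sums(iterable):
    total = 0
    for value in iterable:
        total += value
        yield total

# io.StringIO stands in for the opened input file (an assumption for this sketch).
f_in = io.StringIO("4\n13\n23\n21\n11\n\n")

# Skip blank lines so a trailing newline does not break int().
numbers = (int(line) for line in f_in if line.strip())
lines_out = [str(total) for total in partial_sums(numbers)]
print("\n".join(lines_out))
```

This prints one running total per line (4, 17, 40, 61, 72), matching the output format requested in the question.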
numpy.cumsum will do what you want.
If you're not using numpy, you can write your own.
def cumsum(i):
    s = 0
    for elt in i:
        s += elt
        yield s
Try this:
nums = [4, 13, 23, 21, 11]
sums = []
sums.append(nums[0])
for i in range(1, len(nums)):
    # add the current number to the previous partial sum
    sums.append(sums[i-1] + nums[i])
print(sums)
Use the cumulative sum in numpy:
import numpy as np
nums = np.array([4, 13, 23, 21, 11])
sums = nums.cumsum()
Result:
>>> sums
array([ 4, 17, 40, 61, 72])
Or, if you need a list, convert the result with tolist():
>>> sums.tolist()
[4, 17, 40, 61, 72]
This is an alternative solution using reduce:
from functools import reduce  # built in on Python 2; this import is required on Python 3
nums = [4, 13, 23, 21, 11]
partial_sum = lambda a, b: a + [a[-1] + b]
sums = reduce(partial_sum, nums[1:], nums[0:1])
The two pluses in the lambda are not the same operator: the first is list concatenation and the second is integer addition. Although Blckknght's answer may be clearer, this one is shorter and works in Python 2.7.
Something like this:
>>> lst = [4, 13, 23, 21 ,11]
>>> [sum(lst[:i+1]) for i, x in enumerate(lst)]
[4, 17, 40, 61, 72]
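One thing to keep in mind with the slicing version is that it re-sums every prefix, so the work grows quadratically with the list length. A short sketch comparing it with a single-pass running total (using itertools.accumulate, available in Python 3) shows both give the same result:

```python
from itertools import accumulate

lst = [4, 13, 23, 21, 11]

by_slices = [sum(lst[:i + 1]) for i in range(len(lst))]  # re-sums each prefix
by_running_total = list(accumulate(lst))                 # single pass

print(by_slices)
print(by_slices == by_running_total)  # True
```

For a handful of numbers the difference does not matter; for long inputs the single-pass form is usually the better choice.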