Fully enumerate range from list of breakpoints - python

This is a bit of a python 101 question, but I can't think of a pythonic way to enumerate a list of breakpoints to all integers between those breakpoints.
Say I have:
breaks = [4, 7, 13, 15, 18]
and I want
enumerated = [[4,5,6],[7,8,9,10,11,12],[13,14],[15,16,17],[18]]
(My actual use case involves breakpoints that are years; I want all years in each range).
I could loop through breaks with a counter, create a range for each interval and store it in a list, but I suspect there is a simple one-liner for this sort of enumeration. Efficiency is a concern since I am working with millions of records.

You can use zip
>>> enumerated = [list(range(start, end)) for start, end in zip(breaks, breaks[1:])] + [[breaks[-1]]]
>>> enumerated
[[4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14], [15, 16, 17], [18]]
Zipping a list with itself at an offset of one (zip(breaks, breaks[1:])) is a known "trick" to get all adjacent pairs. This drops the last breakpoint, so I added it manually.
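As a rough sketch of the same pairing idea wrapped in a function (the name enumerate_breaks is made up for illustration, and the breakpoints are assumed to be sorted integers), with the empty-list case handled explicitly:

```python
def enumerate_breaks(breaks):
    """Expand sorted breakpoints into runs of integers.

    Each run covers start..end-1 for adjacent breakpoints; the last
    breakpoint is appended as a single-element run.
    """
    if not breaks:
        return []
    # zip(breaks, breaks[1:]) pairs every breakpoint with its successor.
    runs = [list(range(start, end)) for start, end in zip(breaks, breaks[1:])]
    runs.append([breaks[-1]])
    return runs


print(enumerate_breaks([4, 7, 13, 15, 18]))
# [[4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14], [15, 16, 17], [18]]
```

Wrapping range in list() keeps the output the same on Python 2 and Python 3; dropping it on Python 3 leaves lazy range objects, which may be preferable for millions of records.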

You can create a generator, which will be memory efficient:
def f(b):
    if not b:
        return  # an empty input yields nothing
    x = b[0]
    for y in b[1:]:
        yield xrange(x, y)
        x = y
    # x now holds the last breakpoint, even if b has a single element
    yield [x]
print list(f(breaks))

You can use a simple list comprehension and range()
breaks = [4, 7, 13, 15, 18]
new = [range(breaks[i],breaks[i+1]) for i in xrange(len(breaks)-1)]+[[breaks[-1]]]
print new
[[4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14], [15, 16, 17], [18]]

Related

Loop selecting one item per list in nested list

list = [[1,2,3,4],[5,6,7,8],[9,10,11,12]]
I would like to have a loop that randomly selects only ONE item from the indexes of the list for all 3 of them. So the loop would start and pick 3, then picks 7 and then picks 9, for example. And then the loop stops, doesn't continue on picking items again. I only want 3 repetitions
I have managed to do this with:
for i in list:
    item = list[0].pop((random.choice(list)[0]))
but it doesn't do it only once; it goes through all of the items of the first inner list (choosing the first one), then moves on to the second one and so on.
Any help is appreciated!
You seem to be indexing the list on 0 in each iteration, which will only give you random values from the first inner list. Use random.choice iterating over the list, or use map:
list(map(random.choice, my_list))
# [3, 8, 11]
Equivalently:
[random.choice(i) for i in my_list]
Based on the comments, if you want to remove the item you've randomly selected from the list, use instead:
[i.pop(random.randrange(len(i))) for i in my_list]
# [4, 6, 9]
print(my_list)
# [[1, 2, 3], [5, 7, 8], [10, 11, 12]]
This is my code:
print("before random sample",pos)
pos = random.sample(pos, len(pos))
print("after random sample", pos)
test = [i.pop(random.randint(0,len(i))) for i in pos]
This is the output:
#before random sample [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
#after random sample [[6, 7, 8, 9, 10], [1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
#.......
#test = [i.pop(random.randint(0,len(i))) for i in pos]
#IndexError: pop index out of range
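The IndexError above comes from random.randint(0, len(i)), which includes both endpoints and so can return len(i), one past the last valid index. A minimal sketch of a safer variant follows; the helper name pop_one_from_each is hypothetical, and skipping empty inner lists is an assumption about the desired behaviour:

```python
import random


def pop_one_from_each(nested):
    """Remove and return one randomly chosen item from every non-empty inner list."""
    picked = []
    for inner in nested:
        if inner:  # assumption: silently skip empty inner lists
            # randrange(n) returns 0..n-1, which is always a valid index;
            # randint(0, n) can also return n, which pop() rejects.
            picked.append(inner.pop(random.randrange(len(inner))))
    return picked


pos = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
test = pop_one_from_each(pos)
print(test)  # one item per inner list; the values vary from run to run
print(pos)   # each inner list is now one item shorter
```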

Recursively transposing columns from a given 2D-array to the rows of a new 2D-array

Since I'm here, might as well get all the help I can. NOTE: I believe this is different from other problems involving transposing with arrays.
Like in my first post, I have been looking into recursion in Python, and have been attempting to solve problems with normally simple solutions via loops, etc., but instead solving them with recursive algorithms.
For this particular problem, given a 2D array formatted in an NxN grid, I want to take each column from this grid and turn them into the rows of a new grid.
As an example, let's say I pass in a grid: [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
My program should take the column [1, 5, 9, 13] and make it the row of a new 2D array; below is a visualization:
[[1, 2, 3, 4],               [[1, 5, 9, 13],
 [5, 6, 7, 8],       ---->    [2, 6, 10, 14],
 [9, 10, 11, 12],    ---->    [3, 7, 11, 15],
 [13, 14, 15, 16]]            [4, 8, 12, 16]]
What now follows is my attempt using a helper function to extract each column, and a helper to create an n value that needs to be passed into it:
def get_column(array, n):
    base_val = array[0][n]
    if len(array) == 1:
        return [base_val]
    else:
        return [base_val] + get_column(array[1:], n)

def integer(grid):
    return len(grid)

def transpose(grid, n = integer(grid), new_grid = []):
    if n < len(grid):
        new_grid.append(get_column(grid, n))
        n += 1
        return transpose(grid, n, new_grid)
    else:
        return new_grid
The only call from the outside in my main() function is to the transpose() function passing through only the grid as an argument.
In Python, a list is a dynamic array of object references, so indexing a single element is O(1). Extracting a column, however, still means visiting every row and copying one reference at a time, so transposing by list selection copies all N² entries and cannot achieve the vectorised effect.
The question "How does NumPy's transpose() method permute the axes of an array?" explains what numpy.transpose does to optimize the transpose operation.
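For the recursive exercise itself, note that default argument values are evaluated once, when the def statement runs, so n = integer(grid) cannot see the grid passed in later, and a new_grid = [] default is shared between calls. A minimal sketch that avoids both, assuming a rectangular grid and starting the column index at 0 (this is one possible restructuring, not the asker's final code):

```python
def get_column(grid, n):
    """Recursively collect element n of every row."""
    if not grid:
        return []
    return [grid[0][n]] + get_column(grid[1:], n)


def transpose(grid, n=0):
    """Recursively build the transposed grid, one column at a time."""
    if not grid or n >= len(grid[0]):
        return []  # no columns left to extract
    return [get_column(grid, n)] + transpose(grid, n + 1)


grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(transpose(grid))
# [[1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15], [4, 8, 12, 16]]
```

Outside of a recursion exercise, [list(col) for col in zip(*grid)] gives the same result.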

How to match a nested list with list

The problem is that I'm trying to compare a nested list with a flat list and get the values they do not have in common.
lst3 = [1, 6, 7, 10, 13, 28]
lst4 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
lst5 = [list(filter(lambda x: x not in lst3, sublist)) for sublist in lst4]
which returns:
[[17, 18, 21, 32], [11, 14], [5, 8, 15, 16]]
but I would like to get the numbers from lst3 that don't appear in each sublist. I would like the result to be:
[[1,6,7,10,28],[1,6,10],[1,7,13,28]]
In your example you are comparing each element in each sublist with lst3.
lst5 = [list(filter(lambda x: x not in lst3, sublist)) for sublist in lst4]
Problem is that you are asking whether each x from sublist is not in lst3 which is going to give you the remaining results from the sublist. You may want to do it the other way around.
lst5 = [list(filter(lambda x: x not in sublist, lst3)) for sublist in lst4]
Not only does it give you the answers you want but I even noticed you made a mistake in your expected results:
[[1, 6, 7, 10, 28], [1, 6, 10], [7, 10, 13, 28]]
Compared to yours:
[[1, 6, 7, 10, 28], [1, 6, 10], [1, 7, 13, 28]]
(See the last nested array)
Online example:
https://onlinegdb.com/Hy8K8GPSB
Rather than using things like filter and lambda, you could more readably just use a list comprehension:
lst5 = [[x for x in lst3 if not x in sublist] for sublist in lst4]
Which is
[[1, 6, 7, 10, 28], [1, 6, 10], [7, 10, 13, 28]]
This differs slightly from what you gave as your expected output, but I think that you made a typographical error in the third sublist of that expected output.
I would take John Coleman's answer but tweak the word order for readability.
lst5 = [[x for x in lst3 if x not in sublist] for sublist in lst4]
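If the sublists are long, each "x not in sublist" test scans the whole sublist. A possible variation, assuming the values are hashable, converts each sublist to a set once; the function name missing_from_each is only illustrative:

```python
lst3 = [1, 6, 7, 10, 13, 28]
lst4 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]


def missing_from_each(flat, nested):
    """For every sublist, list the values of flat that the sublist lacks."""
    result = []
    for sublist in nested:
        seen = set(sublist)  # built once per sublist for O(1) membership tests
        result.append([x for x in flat if x not in seen])
    return result


lst5 = missing_from_each(lst3, lst4)
print(lst5)
# [[1, 6, 7, 10, 28], [1, 6, 10], [7, 10, 13, 28]]
```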
I have two two-dimensional lists with at least 100 rows each. I would like to match c1 to c2 or vice versa. The real problem is that I don't want to type out the comparison row by row. Is there a faster way to loop through all the rows of c1 and match them against all the rows of c2?
I tried c1[0], c1[1] and c1[2]. This method works, but I would have to do a lot of typing row by row, which is too much if there are a lot of rows.
Here I have two two-dimensional lists.
c1 = [[2, 6, 7],[2,4,6],[3,6,8]].....
c2 = [[13, 17, 18], [7, 11, 13], [5, 6, 8]].......
[list(filter(lambda x: x in c3, sublist)) for sublist in c2].
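One possible reading of this follow-up is: for every pair of rows, one from c1 and one from c2, keep the values that appear in both. The sketch below follows that reading; c3 is not defined in the post, so it assumes c1 was meant, and the function name common_by_row_pair is hypothetical:

```python
c1 = [[2, 6, 7], [2, 4, 6], [3, 6, 8]]
c2 = [[13, 17, 18], [7, 11, 13], [5, 6, 8]]


def common_by_row_pair(rows_a, rows_b):
    """Map (row index in rows_a, row index in rows_b) to their shared values."""
    matches = {}
    for i, row_a in enumerate(rows_a):
        in_a = set(row_a)
        for j, row_b in enumerate(rows_b):
            shared = [x for x in row_b if x in in_a]
            if shared:  # only record pairs that share at least one value
                matches[(i, j)] = shared
    return matches


print(common_by_row_pair(c1, c2))
# {(0, 1): [7], (0, 2): [6], (1, 2): [6], (2, 2): [6, 8]}
```

The two nested loops replace the manual c1[0], c1[1], c1[2] indexing, so the same code covers any number of rows.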

Python - Comparing each item of a list to every other item in that list

I need to compare every item in a very long list (12471 items) to every other item in the same list. Below is my list:
[array([3, 4, 5])
array([ 6, 8, 10])
array([ 9, 12, 15])
array([12, 16, 20])
array([15, 20, 25])
...] #12471 items long
I need to compare the second item of each array to the first item of every other array to see if they're equal. And preferably, in a very efficient way. Is there a simple and efficient way to do this in Python 2.x?
I worked up a very crude method here, but it is terribly slow:
ls=len(myList) #12471
l=ls
k=0
for i in myList:
    k+=1
    while l>=0:
        l-=1
        if i[1]==myList[l][0]:
            #Do stuff
    l=ls
While this is still theoretically N^2 time (worst case), it should make things a bit better:
import collections

inval = [[3, 4, 5],
         [6, 8, 10],
         [9, 12, 15],
         [12, 14, 15],
         [12, 16, 20],
         [6, 6, 10],
         [8, 8, 10],
         [15, 20, 25]]

by_first = collections.defaultdict(list)
by_second = collections.defaultdict(list)
for item in inval:
    by_first[item[0]].append(item)
    by_second[item[1]].append(item)

for k, vals in by_first.items():
    if k in by_second:
        print "by first:", vals, "by second:", by_second[k]
Output of my simple, short case:
by first: [[6, 8, 10], [6, 6, 10]] by second: [[6, 6, 10]]
by first: [[8, 8, 10]] by second: [[6, 8, 10], [8, 8, 10]]
by first: [[12, 14, 15], [12, 16, 20]] by second: [[9, 12, 15]]
Though this DOES NOT handle duplicates.
We can do this in O(N), assuming that a Python dict takes O(1) time for insert and lookup.
In the first scan, we build a map from each first number to the indices of the rows that start with it.
In the second scan, we check whether that map contains the second element of each row. If it does, the map's value is the list of row indices that match the required criterion.
myList = [[3, 4, 5], [6, 8, 10], [9, 12, 15], [12, 16, 20], [15, 20, 25]]
first_column = dict()
# First scan: map each first element to the indices of the rows that start with it.
for idx, row in enumerate(myList):
    if row[0] in first_column:
        first_column[row[0]].append(idx)
    else:
        first_column[row[0]] = [idx]
# Second scan: look up each row's second element in that map.
for idx, row in enumerate(myList):
    if row[1] in first_column:
        print ('rows matching for element {} from row {} are {}'.format(row[1], idx, first_column[row[1]]))
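The two scans can also be packaged as a function that returns index pairs instead of printing. This is only a sketch: the name find_matches is invented here, and skipping a row's match with itself is an assumption based on the "every other item" wording of the question.

```python
from collections import defaultdict


def find_matches(rows):
    """Return (i, j) pairs where rows[i][1] == rows[j][0] and i != j."""
    by_first = defaultdict(list)
    for j, row in enumerate(rows):
        by_first[row[0]].append(j)  # first scan: index rows by first element

    pairs = []
    for i, row in enumerate(rows):
        # second scan: .get() avoids inserting new keys into the defaultdict
        for j in by_first.get(row[1], ()):
            if i != j:  # assumption: a row is not compared with itself
                pairs.append((i, j))
    return pairs


my_list = [[3, 4, 5], [6, 8, 10], [9, 12, 15], [12, 16, 20], [15, 20, 25]]
print(find_matches(my_list))  # [(2, 3)]
```

The running time is proportional to the number of rows plus the number of pairs reported, so it only approaches N^2 when most rows actually match each other.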

Sum of elements in a list of lists with varying lengths in Python

I am trying to calculate the sum of elements in a list of lists. I have no trouble in computing the sum if the lists within the main list all have the same size, like below:
a = [[4], [8], [15]]
total = [sum(i) for i in zip(*a)]
result:
total = [27] #(4 + 8 + 15) = 27, GOOD!!!
however there is a chance that I may have a list with a different size within the main list, such as:
a = [[3], [4, 6], [10]]
expected result:
total = [17, 19] #(3 + 4 + 10) = 17, (3 + 6 + 10) = 19
I got stuck here; obviously my solution for lists of equal sizes does not work. What would be the best way to get the result the way I defined it? My intuition was to figure out which list has the maximum length, expand the other lists to that length by adding zeros, and finally compute the sums separately. This sounds like an ugly solution though, and I wonder if there is a quicker and more elegant way to do this.
Thanks!
EDIT: Should have explained it better. I got confused a bit too... Here below are better examples:
The number of elements in the lists within the list a never exceeds 2. Examples:
a = [[1], [10], [5]] #Expected result: [16] (1+10+5)
a = [[1, 10], [3], [4]] #Expected result: [8, 17] (1+3+4, 10+3+4)
a = [[1, 10], [3], [2, 8]] #Expected result: [6, 12, 15, 21] (1+3+2, 1+3+8, 10+3+2, 10+3+8)
EDIT2: Accepted answer computes the correct results independent of the list sizes.
Wild guess: you want every possible sum, i.e. the sums you get from taking every possible selection of elements from the sublists?
>>> from itertools import product
>>> a = [[4], [8], [15]]
>>> [sum(p) for p in product(*a)]
[27]
>>> a = [[3], [4, 6], [10]]
>>> [sum(p) for p in product(*a)]
[17, 19]
One way to check this interpretation is to see whether you like the answer it gives for the test in the comments:
>>> a = [[1,2], [3,4,5], [6,7,8,9]] # Tim Pietzcker's example
>>> [sum(p) for p in product(*a)]
[10, 11, 12, 13, 11, 12, 13, 14, 12, 13, 14, 15, 11, 12, 13, 14, 12, 13, 14, 15, 13, 14, 15, 16]
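The same interpretation can be checked against the three examples from the question's EDIT. The wrapper name all_selection_sums below is only for illustration:

```python
from itertools import product


def all_selection_sums(nested):
    """Sum every way of picking exactly one element from each sublist."""
    # product(*nested) yields one tuple per combination, varying the
    # last sublist fastest.
    return [sum(choice) for choice in product(*nested)]


print(all_selection_sums([[1], [10], [5]]))        # [16]
print(all_selection_sums([[1, 10], [3], [4]]))     # [8, 17]
print(all_selection_sums([[1, 10], [3], [2, 8]]))  # [6, 12, 15, 21]
```

The number of sums is the product of the sublist lengths, so the result grows quickly when many sublists have more than one element.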
