Iterate over all partial "head" lists of a list - python

Is there a "more Pythonic" way than:
l=list(range(5)) # ANY list, this is just an example
list(l[:i] for i in range(1, len(l)))
Out[14]: [[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
E.g. not using the index.
In C++ one can construct a sequence using a pair of (start, end) iterators. Is there an equivalent in Python?

To be clear, if you're aiming for "Pythonic", I think your current example of using indices is what most Python programmers would do.
That being said, you also mentioned creating an object using a pair of (start, end) values. Python has slice objects, which is what the square bracket indices (the object's __getitem__ call) internally uses. It can be created using the builtin slice function:
>>> my_list = [0, 1, 2, 3, 4]
>>> slice1 = slice(0, 3)
>>> slice1.start
0
>>> slice1.stop
3
>>> my_list[slice1]
[0, 1, 2]
>>> slice2 = slice(1, 2)
>>> my_list[slice2]
[1]
Of course this works only if my_list can be indexed. If you want this to work for iterables in general, @Hackaholic's answer using itertools.islice is what I would use.
And yes, this still means you will need to use the square bracket index eventually. The difference here is you're storing the (start, stop) value of the partial heads in objects you can use to actually create the partial heads.
Now to come back to your example:
>>> slices = [slice(0, x) for x in range(1, len(any_list))]
>>> partial_heads = [any_list[slc] for slc in slices]
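If the goal is to avoid indices entirely and to support any iterable, one possible sketch is a small generator that grows a prefix as it goes. The name heads is made up for illustration. Note that it also yields the full list as its last item, which the example in the question leaves out.

```python
def heads(iterable):
    """Yield successively longer prefixes of any iterable, without indexing."""
    prefix = []
    for item in iterable:
        prefix.append(item)
        # Copy the prefix so each yielded head is an independent list.
        yield list(prefix)


all_heads = list(heads(range(5)))
# Drop the last head (the complete list) to match the question's output.
print(all_heads[:-1])  # [[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
```

Because each head is copied, this costs about the same as slicing; the gain is only that it works on generators and other non-indexable iterables.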

You can use itertools.islice:
>>> import itertools
>>> l=list(range(5))
>>> [list(itertools.islice(l, x)) for x in range(1,len(l))]
[[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
You can also check the execution time. There are two factors to weigh: memory use and speed.
>>> import timeit
>>> timeit.timeit('[list(itertools.islice(l, x)) for x in range(1,len(l))]', setup='import itertools; l=list(range(5))')
3.2744126430006872
>>> timeit.timeit('list(l[:i] for i in range(1, len(l)))', setup='l=list(range(5))')
1.9414149740005087
Here you go:
>>> [list(range(x)) for x in range(1, 5)]
[[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]

Related

How to convert [2,3,4] to [0,0,1,1,1,2,2,2,2] to utilize tf.math.segment_sum?

Assume I have an array like [2,3,4], I am looking for a way in NumPy (or Tensorflow) to convert it to [0,0,1,1,1,2,2,2,2] to apply tf.math.segment_sum() on a tensor that has a size of 2+3+4.
No elegant idea comes to my mind, only loops and list comprehension.
Would something like this work for you?
import numpy
arr = numpy.array([2, 3, 4])
numpy.repeat(numpy.arange(arr.size), arr)
# array([0, 0, 1, 1, 1, 2, 2, 2, 2])
You don't need to use numpy. You can use nothing but list comprehensions:
>>> foo = [2,3,4]
>>> sum([[i]*foo[i] for i in range(len(foo))], [])
[0, 0, 1, 1, 1, 2, 2, 2, 2]
It works like this:
You can create expanded arrays by multiplying a simple one with a constant, so [0] * 2 == [0,0]. So for each index in the array, we expand with [i]*foo[i]. In other words:
>>> [[i]*foo[i] for i in range(len(foo))]
[[0, 0], [1, 1, 1], [2, 2, 2, 2]]
Then we use sum to reduce the lists into a single list:
>>> sum([[i]*foo[i] for i in range(len(foo))], [])
[0, 0, 1, 1, 1, 2, 2, 2, 2]
Because we are "summing" lists, not integers, we pass [] to sum to make an empty list the starting value of the sum.
(Note that this will likely be slower than numpy, though I have not personally compared it to something like @Patol75's answer.)
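One caveat with sum(..., []) is that it builds a new list at every step, so it gets slow for long inputs. A possible alternative, still without numpy, is itertools.chain.from_iterable, which flattens the sublists in a single pass. This is only a sketch of the same idea, not a benchmarked replacement:

```python
from itertools import chain

foo = [2, 3, 4]
# Build [i] * n for each (index, count) pair and flatten the pieces lazily.
segment_ids = list(chain.from_iterable([i] * n for i, n in enumerate(foo)))
print(segment_ids)  # [0, 0, 1, 1, 1, 2, 2, 2, 2]
```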
I really like the answer from @Patol75 since it's neat. However, there is no pure TensorFlow solution yet, so I'm providing one, which may be somewhat complex. Just for reference and fun!
BTW, I didn't see a tf.repeat API in tf master. Please check this PR which adds tf.repeat support equivalent to numpy.repeat.
import tensorflow as tf
repeats = tf.constant([2,3,4])
values = tf.range(tf.size(repeats)) # [0,1,2]
max_repeats = tf.reduce_max(repeats) # max repeat is 4
tiled = tf.tile(tf.reshape(values, [-1,1]), [1,max_repeats]) # [[0,0,0,0],[1,1,1,1],[2,2,2,2]]
mask = tf.sequence_mask(repeats, max_repeats) # [[1,1,0,0],[1,1,1,0],[1,1,1,1]]
res = tf.boolean_mask(tiled, mask) # [0,0,1,1,1,2,2,2,2]
Patol75's answer uses Numpy but Gort the Robot's answer is actually faster (on your example list at least).
I'll keep this answer up as another solution, but it's slower than both.
Given that a = [2,3,4] this could be done using a loop like so:
b = []
for i in range(len(a)):
    for j in range(a[i]):
        b.append(range(len(a))[i])
Which, as a list comprehension one-liner, is this diabolical thing:
b = [range(len(a))[i] for i in range(len(a)) for j in range(a[i])]
Both end up with b = [0,0,1,1,1,2,2,2,2].
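Since range(len(a))[i] is just i, the same nested loop can probably be written more plainly with enumerate. A minimal sketch, using the same a as above:

```python
a = [2, 3, 4]
# For each position i with repeat count `count`, emit i exactly `count` times.
b = [i for i, count in enumerate(a) for _ in range(count)]
print(b)  # [0, 0, 1, 1, 1, 2, 2, 2, 2]
```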

Is there a way to set a list value in a range of integer in python

I want to perform the following:
>>> [0-2, 4] #case 1
[-2, 4] #I want the output to be [0, 1, 2, 4]
I know I can perform the same in this way:
>>> list(range(3)) + [4] #case 2
[0, 1, 2, 4]
But I am curious is there any way to achieve the same result using the case 1 method (or something similar)? Do I need to override the integer '-' operator or do I need to do anything with the list?
>>> [*range(0,3), 4]
[0, 1, 2, 4]
This should come closest. It requires Python 3.5 or later.
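As for the case 1 syntax itself: inside a list literal, 0-2 is evaluated to -2 before the list is built, so the list never sees a range. If a notation like that is really wanted, one option is to keep it in a string and expand it. The helper below is hypothetical (the name expand and the "0-2, 4" format are assumptions), and it does not handle negative numbers:

```python
def expand(spec):
    """Expand a spec such as "0-2, 4" into [0, 1, 2, 4] (illustrative helper)."""
    result = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, stop = part.split("-")
            # Treat "a-b" as an inclusive range from a to b.
            result.extend(range(int(start), int(stop) + 1))
        else:
            result.append(int(part))
    return result


print(expand("0-2, 4"))  # [0, 1, 2, 4]
```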
The answer by @timgeb is great, but it returns a list, whereas range by default returns "an immutable sequence type". Using itertools.chain gives a lazy iterable instead, which is closer in spirit to range().
import itertools
itertools.chain(range(3), range(4,5))
which (once converted to a list with list() so we can see the contents) would give:
[0, 1, 2, 4]
Or you could make your own generator:
def joinRanges(r1, r2):
    for i in r1:
        yield i
    for i in r2:
        yield i
which achieves the same effect as before when calling with:
joinRanges(range(3), range(4,5))
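On Python 3.3 and later the same generator can be shortened with yield from, and it can take any number of iterables. A small sketch (the name join_iterables is made up here):

```python
def join_iterables(*iterables):
    """Yield the items of each iterable in turn, like itertools.chain."""
    for iterable in iterables:
        yield from iterable


print(list(join_iterables(range(3), range(4, 5))))  # [0, 1, 2, 4]
```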

Check if two nested lists are equivalent upon substitution

For some context, I'm trying to enumerate the number of unique situations that can occur when calculating the Banzhaf power indices for four players, when there is no dictator and there are either four or five winning coalitions.
I am using the following code to generate a set of lists that I want to iterate over.
from itertools import chain, combinations
def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(map(list, combinations(s, r)) for r in range(2, len(s)+1))
def superpowerset(iterable):
    s = powerset(iterable)
    return chain.from_iterable(map(list, combinations(s, r)) for r in range(4, 6))
set_of_lists = superpowerset([1,2,3,4])
However, two lists in this set shouldn't be considered unique if they are equivalent under remapping.
Using the following list as an example:
[[1, 2], [1, 3], [2, 3], [1, 2, 4]]
If each element 2 is renamed to 3 and vice-versa, we would get:
[[1, 3], [1, 2], [3, 2], [1, 3, 4]]
The order in each sub-list is unimportant, and the order of the sub-lists is also unimportant. Thus, the swapped list can be rewritten as:
[[1, 2], [1, 3], [2, 3], [1, 3, 4]]
There are 4 values, so there are P(4,4)=24 possible remappings that could occur (including the trivial mapping).
Is there any way to check this easily? Or, even better, is there a way to avoid generating these lists to begin with?
I'm not even sure how I would go about transforming the first list into the second list (but could brute force it from there). Also, I'm not restricted to data type (to a certain extent) and using frozenset would be fine.
Edit: The solution offered by tobias_k answers the "checking" question but, as noted in the comments, I think I have the wrong approach to this problem.
This is probably not a complete solution yet, but it might show you a direction to investigate further.
You could map each element to some characteristics of the "topology", i.e. how it is "connected" with other elements. You have to be careful not to take the ordering in the sets into account, or -- obviously -- the element itself. You could, for example, consider how often the element appears, in what sized groups it appears, and so on. Combine those metrics into a key function, sort the elements by that key, and assign them new names in that order.
import itertools
def normalize(lists):
    items = set(x for y in lists for x in y)
    counter = itertools.count()
    # key: sorted sizes of the sub-lists that contain x
    sorter = lambda x: sorted(len(y) for y in lists if x in y)
    mapping = {k: next(counter) for k in sorted(items, key=sorter)}
    return tuple(sorted(tuple(sorted(mapping[x] for x in y)) for y in lists))
This maps your two example lists to the same "normalized" list:
>>> normalize([[1, 2], [1, 3], [2, 3], [1, 2, 4]])
((0, 1), (0, 2), (1, 2), (1, 2, 3))
>>> normalize([[1, 3], [1, 2], [3, 2], [1, 3, 4]])
((0, 1), (0, 2), (1, 2), (1, 2, 3))
When applied to all the lists, it gets the count down from 330 to 36. I don't know if this is minimal, but it looks like a good start.
>>> normalized = set(map(normalize, set_of_lists))
>>> len(normalized)
36
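To check whether a result like this is minimal, one option is the brute-force route mentioned in the question: try all 24 relabelings and keep the smallest resulting form as a canonical representative. The sketch below assumes the labels are exactly 1 to 4; the function name canonical is made up for illustration.

```python
from itertools import permutations


def canonical(lists, labels=(1, 2, 3, 4)):
    """Return the smallest sorted form of `lists` over all relabelings of `labels`."""
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        # Relabel, then sort inside each sub-list and across sub-lists,
        # since neither ordering matters.
        candidate = tuple(sorted(tuple(sorted(mapping[x] for x in sub)) for sub in lists))
        if best is None or candidate < best:
            best = candidate
    return best


a = [[1, 2], [1, 3], [2, 3], [1, 2, 4]]
b = [[1, 3], [1, 2], [3, 2], [1, 3, 4]]
print(canonical(a) == canonical(b))  # True
```

Two lists are equivalent under remapping exactly when their canonical forms match, so collecting the canonical forms in a set should give the true number of distinct cases to compare against.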

Returning a list of list elements

I need help writing a function that will take a single list and return a new list in which equal elements are grouped together, each group in its own sublist.
I know that I'll have to iterate through the original list that I pass through and then append the value depending on whether or not the value is already in my list or create a sublist and add that sublist to the final list.
An example would be:
Input: [1, 2, 2, 2, 3, 1, 1, 3]
Output: [[1,1,1], [2,2,2], [3,3]]
I'd do this in two steps:
>>> import collections
>>> inputs = [1, 2, 2, 2, 3, 1, 1, 3]
>>> counts = collections.Counter(inputs)
>>> counts
Counter({1: 3, 2: 3, 3: 2})
>>> outputs = [[key] * count for key, count in counts.items()]
>>> outputs
[[1, 1, 1], [2, 2, 2], [3, 3]]
(The groups come out in order of first appearance because, since Python 3.7, dictionaries, and therefore Counters, preserve insertion order. That this is also sorted numerical order is just a coincidence of the input. On older Pythons the keys are stored in arbitrary order, and you should assume that [[3, 3], [1, 1, 1], [2, 2, 2]] would be just as possible a result. If that's not acceptable, you need a bit more work.)
So, how does it work?
The first step creates a Counter, which is just a special subclass of dict made for counting occurrences of each key. One of the many nifty things about it is that you can just pass it any iterable (like a list) and it will count up how many times each element appears. It's a trivial one-liner, it's obvious and readable once you know how Counter works, and it's even about as efficient as anything could possibly be.*
But that isn't the output format you wanted. How do we get that? Well, we have to get back from 1: 3 (meaning "3 copies of 1") to [1, 1, 1]. You can write that as [key] * count.** And the rest is just a bog-standard list comprehension.
If you look at the docs for the collections module, they start with a link to the source. Many modules in the stdlib are like this, because they're meant to serve as source code for learning from as well as usable code. So, you should be able to figure out how the Counter constructor works. (It's basically just calling that _count_elements function.) Since that's the only part of Counter you're actually using beyond a basic dict, you could just write that part yourself. (But really, once you've understood how it works, there's no good reason not to use it, right?)
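For reference, a rough pure-Python sketch of that counting step might look like the following. It is not the actual stdlib source, just the same idea written with a plain dict; the name count_elements is chosen here for illustration.

```python
def count_elements(iterable):
    """Count how often each item occurs, using a plain dict."""
    counts = {}
    for item in iterable:
        # Look up the current count (0 if unseen) and add one.
        counts[item] = counts.get(item, 0) + 1
    return counts


counts = count_elements([1, 2, 2, 2, 3, 1, 1, 3])
outputs = [[key] * count for key, count in counts.items()]
print(counts)
print(outputs)
```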
* For each element, it's just doing a hash table lookup (and insert if needed) and a += 1. And in CPython, it all happens in reasonably-optimized C.
** Note that we don't have to worry about whether to use [key] * count vs. [key for _ in range(count)] here, because the values have to be immutable, or at least of an "equality is as good as identity" type, or they wouldn't be usable as keys.
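To illustrate why that distinction matters in other situations: with a mutable element, [x] * n repeats the same object n times, while a comprehension creates a fresh object each time. A small sketch:

```python
shared = [[]] * 3            # three references to one and the same list
shared[0].append("x")
print(shared)                # [['x'], ['x'], ['x']]

independent = [[] for _ in range(3)]   # three separate lists
independent[0].append("x")
print(independent)           # [['x'], [], []]
```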
The most time efficient would be to use a dictionary:
collector = {}
for elem in inputlist:
    collector.setdefault(elem, []).append(elem)
output = list(collector.values())
The other, more costly option is to sort, then group using itertools.groupby():
from itertools import groupby
output = [list(g) for k, g in groupby(sorted(inputlist))]
Demo:
>>> inputlist = [1, 2, 2, 2, 3, 1, 1, 3]
>>> collector = {}
>>> for elem in inputlist:
...     collector.setdefault(elem, []).append(elem)
...
>>> list(collector.values())
[[1, 1, 1], [2, 2, 2], [3, 3]]
>>> from itertools import groupby
>>> [list(g) for k, g in groupby(sorted(inputlist))]
[[1, 1, 1], [2, 2, 2], [3, 3]]
What about this, as you said you wanted a function:
def makeList(user_list):
    user_list.sort()  # note: sorts the caller's list in place
    x = user_list[0]
    output = [[]]
    for i in user_list:
        if i == x:
            output[-1].append(i)
        else:
            output.append([i])
            x = i
    return output
>>> print(makeList([1, 2, 2, 2, 3, 1, 1, 3]))
[[1, 1, 1], [2, 2, 2], [3, 3]]

Indexing a nested list in python

Given data as
data = [ [0, 1], [2,3] ]
I want to index all first elements in the lists inside the list of lists. i.e. I need to index 0 and 2.
I have tried
print(data[:][0])
but it outputs the complete first list, i.e.
[0,1]
Even
print(data[0][:])
produces the same result.
My question is specifically how to accomplish what I have mentioned. And more generally, how is python handling double/nested lists?
Using list comprehension:
>>> data = [[0, 1], [2,3]]
>>> [lst[0] for lst in data]
[0, 2]
>>> [first for first, second in data]
[0, 2]
Using map:
>>> list(map(lambda lst: lst[0], data))
[0, 2]
Using map with operator.itemgetter:
>>> import operator
>>> list(map(operator.itemgetter(0), data))
[0, 2]
Using zip:
>>> list(zip(*data))[0]
(0, 2)
With this sort of thing, I generally recommend numpy:
>>> import numpy as np
>>> data = np.array([ [0, 1], [2,3] ])
>>> data[:,0]
array([0, 2])
As far as how python is handling it in your case:
data[:][0]
Makes a copy of the entire list and then takes the first element (which is the first sublist).
data[0][:]
takes the first sublist and then copies it.
Indexing into nested containers (be it a dict, a list or any other subscriptable object) is evaluated left to right. Thus,
data[:][0]
would work out as
(data[:]) [0] == ([[0, 1], [2,3]]) [0]
which ultimately gives you
[0, 1]
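A quick way to convince yourself of this is to check what data[:] actually is. The sketch below shows that it is a shallow copy of the outer list, so indexing it with [0] still returns the first sublist, and that picking the first element of each sublist needs an explicit loop or comprehension:

```python
data = [[0, 1], [2, 3]]

outer_copy = data[:]                 # shallow copy of the outer list
print(outer_copy == data)            # True: same contents
print(outer_copy is data)            # False: a new outer list
print(outer_copy[0] is data[0])      # True: the sublists are shared, not copied

firsts = [row[0] for row in data]    # first element of every sublist
print(firsts)                        # [0, 2]
```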
As for possible workarounds or proper methods, falsetru & mgilson have done a good job in that regard.
Try this:
print([x for x, y in data[:]])
