I am pretty new to Python (3.6) and struggling to understand itertools groupby.
I've got the following list containing integers:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
But the list could also be much longer and the '0' doesn't have to appear after every pair of numbers. It can also be after 3, 4 or more numbers. My goal is to split this list into sublists where the '0' is used as a delimiter and doesn't appear in any of these sublists.
list2 = [[1, 2], [2, 3], [4, 5]]
A similar problem has been solved here already:
Python spliting a list based on a delimiter word
Answer 2 seemed to help me a lot but unfortunately it only gave me a TypeError.
import itertools as it
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in it.groupby(list1, lambda x: x == 0) if not key]
print(list2)
File "H:/Python/work/ps0001/example.py", line 13, in
list2 = [list(group) for key, group in it.groupby(list, lambda x: x == '0') if not key]
TypeError: 'list' object is not callable
I would appreciate any help and be very happy to finally understand groupby.
You were checking for "0" (str) but you only have 0 (int) in your list. Also, you were using list as a variable name for your first list, which shadows Python's built-in list type. That is why calling list(group) raised the TypeError.
from itertools import groupby
list1 = [1, 2, 0, 2, 7, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in groupby(list1, lambda x: x == 0) if not key]
print(list2)
This should give you:
[[1, 2], [2, 7, 3], [4, 5]]
In your code, you need to change lambda x: x == '0' to lambda x: x == 0, since you're working with a list of int, not a list of str.
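To see why the comprehension works, it can help to look at what groupby yields before any filtering happens. The sketch below is only illustrative; the helper name split_on_delimiter is made up for this example and is not part of itertools.

```python
from itertools import groupby

values = [1, 2, 0, 2, 3, 0, 4, 5, 0]

# groupby yields one (key, group) pair per run of consecutive items that
# share the same key. Here the key is True for a 0 and False otherwise.
runs = [(key, list(group)) for key, group in groupby(values, lambda x: x == 0)]
print(runs)
# [(False, [1, 2]), (True, [0]), (False, [2, 3]), (True, [0]), (False, [4, 5]), (True, [0])]


def split_on_delimiter(items, delimiter=0):
    """Hypothetical helper: keep only the runs that are not delimiters."""
    return [list(group)
            for is_delimiter, group in groupby(items, lambda x: x == delimiter)
            if not is_delimiter]


print(split_on_delimiter(values))  # [[1, 2], [2, 3], [4, 5]]
```

Because consecutive delimiters fall into a single run, two 0s in a row do not produce an empty sublist with this approach.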
Since others have shown how to improve your solution with itertools.groupby, you can also do this task with no libraries:
>>> list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> zeroes = [-1] + [i for i, e in enumerate(list1) if e == 0]
>>> result = [list1[zeroes[i] + 1: zeroes[i + 1]] for i in range(len(zeroes) - 1)]
>>> print(result)
[[1, 2], [2, 3], [4, 5]]
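One thing to watch with the index-based approach above: anything after the last 0 is dropped when the list does not end with the delimiter. A possible variant is sketched below, assuming that empty sublists are unwanted; split_by_index is a made-up name.

```python
def split_by_index(values, delimiter=0):
    """Split values on delimiter using slice boundaries."""
    # Treat the position before the start and the position after the end
    # as virtual delimiters, so a trailing group is not lost.
    cuts = [-1] + [i for i, e in enumerate(values) if e == delimiter] + [len(values)]
    chunks = [values[cuts[i] + 1:cuts[i + 1]] for i in range(len(cuts) - 1)]
    # Drop the empty chunks produced by adjacent or trailing delimiters.
    return [chunk for chunk in chunks if chunk]


print(split_by_index([1, 2, 0, 2, 3, 0, 4, 5, 0]))  # [[1, 2], [2, 3], [4, 5]]
print(split_by_index([1, 2, 0, 3]))                 # [[1, 2], [3]]
```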
You can use string replacement together with ast.literal_eval for this:
>>> import ast
>>> your_list = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a_list = str(your_list).replace(', 0,', '], [').replace(', 0]', ']')
>>> your_result = ast.literal_eval(a_list)
>>> your_result
([1, 2], [2, 3], [4, 5])
>>> your_result[0]
[1, 2]
>>>
Or a single line solution:
ast.literal_eval(str(your_list).replace(', 0,', '], [').replace(', 0]', ']'))
You could do that within a loop, as shown in the commented snippet below:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # tmp HOLDS A TEMPORARY LIST :: result => RESULT
for i in list1:
    if not i:
        # CURRENT VALUE IS 0 SO WE BUILD THE SUB-LIST
        result.append(tmp)
        # RE-INITIALIZE THE tmp VARIABLE
        tmp = []
    else:
        # SINCE CURRENT VALUE IS NOT 0, WE POPULATE THE tmp LIST
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
Effectively:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # HOLDS A TEMPORARY LIST
for i in list1:
    if not i:
        result.append(tmp); tmp = []
    else:
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
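The loop above appends an empty list when two 0s appear in a row, and it discards any items that follow the last 0. If that matters for your data, a generator along these lines might be an option. This is a sketch, and iter_chunks is a hypothetical name.

```python
def iter_chunks(values, delimiter=0):
    """Yield the non-empty runs of values found between delimiters."""
    chunk = []
    for value in values:
        if value == delimiter:
            if chunk:          # skip empty chunks from adjacent delimiters
                yield chunk
            chunk = []
        else:
            chunk.append(value)
    if chunk:                  # flush whatever follows the last delimiter
        yield chunk


print(list(iter_chunks([1, 2, 0, 2, 3, 0, 4, 5, 0])))  # [[1, 2], [2, 3], [4, 5]]
print(list(iter_chunks([0, 1, 0, 0, 2])))              # [[1], [2]]
```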
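Use zip to pair the elements up as tuples and convert each tuple to a list afterwards. Note that this only works when the 0 appears after every pair of numbers, as in the example.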
>>> a
[1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a[0::3]
[1, 2, 4]
>>> a[1::3]
[2, 3, 5]
>>> list(zip(a[0::3],a[1::3]))
[(1, 2), (2, 3), (4, 5)]
>>> [list(i) for i in zip(a[0::3],a[1::3])]
[[1, 2], [2, 3], [4, 5]]
Try using join and then splitting on '0' (this only works when every number in the list is a single digit):
lst = [1, 2, 0, 2, 3, 0, 4, 5, 0]
lst_string = "".join([str(x) for x in lst])
lst2 = lst_string.split('0')
lst3 = [list(y) for y in lst2]
lst4 = [list(map(int, z)) for z in lst3]
print(lst4)
I am wondering how I can shorten this:
test = [1, 2, 3]
test[0] = [1, 2, 3]
test[1] = [1, 2, 3]
test[2] = [1, 2, 3]
I tried something like this:
test = [1[1, 2, 3], 2 [1, 2, 3], 3[1, 2, 3]]
#or
test = [1 = [1, 2, 3], 2 = [1, 2, 3], 3 = [1, 2, 3]] #I know this is dumb, but at least I tried...
But it's not functioning :|
Is this just me being stupid and trying something that cannot work, or is there a proper syntax for this that I don't know about?
The simplest way is
test = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
But if you want to create a larger number of lists, you might want to use a list comprehension, like this
test = [[1, 2, 3] for i in range(100)]
This will create a list of 100 sublists. A list comprehension builds a new list, and it can be understood like this
test = []
for i in range(100):
    test.append([1, 2, 3])
Note: Instead of doing test[0] = ..., you can simply make use of list.append like this
test = []
test.append([1, 2, 3])
...
If you look at the language definition of list,
list_display ::= "[" [expression_list | list_comprehension] "]"
So, a list can be constructed with list comprehension or expression list. If we see the expression list,
expression_list ::= expression ( "," expression )* [","]
It is just one or more comma-separated expressions.
In your case,
1[1, 2, 3], 2[1, 2, 3] ...
are not meaningful expressions: 1[1, 2, 3] tries to subscript the integer 1, which Python does not support. Also,
1 = [1, 2, 3]
means you are assigning [1, 2, 3] to 1. An assignment is a statement rather than an expression, and an integer literal cannot be an assignment target anyway. That is why your attempts didn't work.
Your code: test = [1 = [1, 2, 3], 2 = [1, 2, 3], 3 = [1, 2, 3]] is pretty close. You can use a dictionary to do exactly that:
test = {1: [1, 2, 3], 2: [1, 2, 3], 3: [1, 2, 3]}
Now, to access the list stored under key 1, simply use:
test[1]
Alternatively, you can use a dict comprehension:
test = {i: [1, 2, 3] for i in range(1, 4)}
It's a list comprehension:
test = [[1, 2, 3] for i in range(3)]
If you want, you can do this:
test = [[1,2,3]]*3
#=> [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
However, note that all elements refer to the same object:
# 1. -----------------
# If one element is changed, the others will be changed as well.
test = [[1,2,3]]*3
print(test)
#=>[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
test[0][1] = 4
print(test)
#=>[[1, 4, 3], [1, 4, 3], [1, 4, 3]] # because they all refer to the same object
# 2. -----------------
# Of course, those elements have the same value.
print("same value") if test[0] == test[1] else print("not same value")
#=> same value
# 3. -----------------
# Check that all elements refer to the same object
print("refer to the same object") if test[0] is test[1] else print("not refer to the same object")
#=> refer to the same object
# 4. -----------------
# So all elements have the same id
hex(id(test[0]))
#=>e.g. 0x7f116d337820
hex(id(test[1]))
#=>e.g. 0x7f116d337820
hex(id(test[2]))
#=>e.g. 0x7f116d337820
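For comparison, here is a small sketch that puts the multiplication form next to a list comprehension. The variable names shared and independent are chosen just for this illustration.

```python
# Multiplying the outer list repeats a reference to one inner list.
shared = [[1, 2, 3]] * 3
# The comprehension evaluates [1, 2, 3] once per iteration, so each
# inner list is a separate object.
independent = [[1, 2, 3] for _ in range(3)]

shared[0][1] = 4
independent[0][1] = 4

print(shared)       # [[1, 4, 3], [1, 4, 3], [1, 4, 3]]
print(independent)  # [[1, 4, 3], [1, 2, 3], [1, 2, 3]]
print(independent[0] is independent[1])  # False
```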
I'm trying to create a pair of functions that, given a list of "starting" numbers, will recursively add to each index position up to a defined maximum value (much in the same way that an odometer works in a car--each counter wheel increasing to 9 before resetting to 1 and carrying over onto the next wheel).
The code looks like this:
number_list = []
def counter(start, i, max_count):
    if start[len(start)-1-i] < max_count:
        start[len(start)-1-i] += 1
        return(start, i, max_count)
    else:
        for j in range (len(start)):
            if start[len(start)-1-i-j] == max_count:
                start[len(start)-1-i-j] = 1
            else:
                start[len(start)-1-i-j] += 1
                return(start, i, max_count)
def all_values(fresh_start, i, max_count):
    number_list.append(fresh_start)
    new_values = counter(fresh_start,i,max_count)
    if new_values != None:
        all_values(*new_values)
When I run all_values([1,1,1],0,3) and print number_list, though, I get:
[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1],
[1, 1, 1], [1, 1, 1], [1, 1, 1]]
Which is unfortunate. Doubly so knowing that if I replace the first line of all_values with
print(fresh_start)
I get exactly what I'm after:
[1, 1, 1]
[1, 1, 2]
[1, 1, 3]
[1, 2, 1]
[1, 2, 2]
[1, 2, 3]
[1, 3, 1]
[1, 3, 2]
[1, 3, 3]
[2, 1, 1]
[2, 1, 2]
[2, 1, 3]
[2, 2, 1]
[2, 2, 2]
[2, 2, 3]
[2, 3, 1]
[2, 3, 2]
[2, 3, 3]
[3, 1, 1]
[3, 1, 2]
[3, 1, 3]
[3, 2, 1]
[3, 2, 2]
[3, 2, 3]
[3, 3, 1]
[3, 3, 2]
[3, 3, 3]
I have already tried making a copy of fresh_start (by way of temp = fresh_start) and appending that instead, but with no change in the output.
Can anyone offer any insight as to what I might do to fix my code? Feedback on how the problem could be simplified would be welcome as well.
Thanks a lot!
temp = fresh_start
does not make a copy. Appending doesn't make copies, assignment doesn't make copies, and pretty much anything that doesn't say it makes a copy doesn't make a copy. If you want a copy, slice it:
fresh_start[:]
is a copy.
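Applied to the code in the question, the fix could look roughly like the sketch below. The counter logic is kept as in the question; passing number_list in as a parameter instead of using a module-level list is an assumption made here only to keep the example self-contained.

```python
def counter(start, i, max_count):
    # Same logic as counter() in the question: it mutates `start` in place
    # and returns None once every position has rolled over.
    pos = len(start) - 1 - i
    if start[pos] < max_count:
        start[pos] += 1
        return (start, i, max_count)
    for j in range(len(start)):
        if start[pos - j] == max_count:
            start[pos - j] = 1
        else:
            start[pos - j] += 1
            return (start, i, max_count)
    return None


def all_values(fresh_start, i, max_count, number_list):
    # Append a snapshot (a copy), not the list object that counter()
    # keeps mutating on every call.
    number_list.append(fresh_start[:])
    new_values = counter(fresh_start, i, max_count)
    if new_values is not None:
        start, i, max_count = new_values
        all_values(start, i, max_count, number_list)


result = []
all_values([1, 1, 1], 0, 3, result)
print(result[:4])   # [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1]]
print(len(result))  # 27
```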
Try the following in the Python interpreter:
>>> a = [1,1,1]
>>> b = []
>>> b.append(a)
>>> b.append(a)
>>> b.append(a)
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> b[2][2] = 2
>>> b
[[1, 1, 2], [1, 1, 2], [1, 1, 2]]
This is a simplified version of what's happening in your code. But why is it happening?
b.append(a) isn't actually making a copy of a and stuffing it into the array at b. It's making a reference to a. It's like a bookmark in a web browser: when you open a webpage using a bookmark, you expect to see the webpage as it is now, not as it was when you bookmarked it. But that also means that if you have multiple bookmarks to the same page, and that page changes, you'll see the changed version no matter which bookmark you follow.
It's the same story with temp = a, and for that matter, a = [1,1,1]. temp and a are "bookmarks" to a particular array which happens to contain three ones. And b in the example above, is a bookmark to an array... which contains three bookmarks to that same array that contains three ones.
So what you do is create a new array and copy in the elements of the old array. The quickest way to do that is to take an array slice containing the whole array, as user2357112 demonstrated:
>>> a = [1,1,1]
>>> b = []
>>> b.append(a[:])
>>> b.append(a[:])
>>> b.append(a[:])
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> b[2][2] = 2
>>> b
[[1, 1, 1], [1, 1, 1], [1, 1, 2]]
Much better.
When I look at the desired output I can't help but think about using one of the numpy grid data production functions.
import numpy
first_column, second_column, third_column = numpy.mgrid[1:4,1:4,1:4]
numpy.dstack((first_column.flatten(),second_column.flatten(),third_column.flatten()))
Out[23]:
array([[[1, 1, 1],
[1, 1, 2],
[1, 1, 3],
[1, 2, 1],
[1, 2, 2],
[1, 2, 3],
[1, 3, 1],
[1, 3, 2],
[1, 3, 3],
[2, 1, 1],
[2, 1, 2],
[2, 1, 3],
[2, 2, 1],
[2, 2, 2],
[2, 2, 3],
[2, 3, 1],
[2, 3, 2],
[2, 3, 3],
[3, 1, 1],
[3, 1, 2],
[3, 1, 3],
[3, 2, 1],
[3, 2, 2],
[3, 2, 3],
[3, 3, 1],
[3, 3, 2],
[3, 3, 3]]])
Of course, the utility of this particular approach might depend on the variety of input you need to deal with, but I suspect this could be an interesting way to build the data and numpy is pretty fast for this kind of thing. Presumably if your input list has more elements you could have more min:max arguments fed into mgrid[] and then unpack / stack in a similar fashion.
Here is a simplified version of your program, which works. Comments will follow.
number_list = []
def _adjust_counter_value(counter, n, max_count):
    """
    We want the counter to go from 1 to max_count, then start over at 1.
    This function adds n to the counter and then returns a tuple:
    (new_counter_value, carry_to_next_counter)
    """
    assert max_count >= 1
    assert 1 <= counter <= max_count
    # Counter is in closed range: [1, max_count]
    # Subtract 1 so expected value is in closed range [0, max_count - 1]
    x = counter - 1 + n
    carry, x = divmod(x, max_count)
    # Add 1 so expected value is in closed range [1, max_count]
    counter = x + 1
    return (counter, carry)
def increment_counter(start, i, max_count):
    last = len(start) - 1 - i
    copy = start[:] # make a copy of the start
    add = 1 # start by adding 1 to index
    for i_cur in range(last, -1, -1):
        copy[i_cur], add = _adjust_counter_value(copy[i_cur], add, max_count)
        if 0 == add:
            return (copy, i, max_count)
    else:
        # if we have a carry out of the 0th position, we are done with the sequence
        return None
def all_values(fresh_start, i, max_count):
    number_list.append(fresh_start)
    new_values = increment_counter(fresh_start,i,max_count)
    if new_values != None:
        all_values(*new_values)
all_values([1,1,1],0,3)
import itertools as it
correct = [list(tup) for tup in it.product(range(1,4), range(1,4), range(1,4))]
assert number_list == correct
Since you want the counters to go from 1 through max_count inclusive, it's a little bit tricky to update each counter. Your original solution was to use several if statements, but here I have made a helper function that uses divmod() to compute each new digit. This lets us add any increment to any digit and will find the correct carry out of the digit.
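A few worked values may make the divmod() step easier to follow. The function below is a condensed restatement of the helper for illustration, and the name step is made up.

```python
def step(counter, n, max_count):
    """Add n to a counter that runs 1..max_count; return (new_value, carry)."""
    # Shift to a zero-based value, add, then split into carry and remainder.
    carry, zero_based = divmod(counter - 1 + n, max_count)
    # Shift the remainder back into the 1..max_count range.
    return zero_based + 1, carry


print(step(2, 1, 3))  # (3, 0)  no carry, the digit just moves from 2 to 3
print(step(3, 1, 3))  # (1, 1)  the digit wraps to 1 and carries 1
print(step(1, 5, 3))  # (3, 1)  larger increments work the same way
```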
Your original program never changed the value of i so my revised one doesn't either. You could simplify the program further by getting rid of i and just having increment_counter() always go to the last position.
If you run a for loop to the end without calling break or return, the else: case will then run if there is one present. Here I added an else: case to handle a carry out of the 0th place in the list. If there is a carry out of the 0th place, that means we have reached the end of the counter sequence. In this case we return None.
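Since for/else is easy to misread, here is a tiny standalone illustration of the same mechanism, this time with break. The function smallest_divisor is a made-up example and is unrelated to the counter code.

```python
def smallest_divisor(n):
    """Return the smallest divisor of n in 2..n-1, or None if there is none."""
    for d in range(2, n):
        if n % d == 0:
            found = d
            break          # leaving via break skips the else clause
    else:
        found = None       # the loop ran to completion: no divisor was found
    return found


print(smallest_divisor(9))  # 3
print(smallest_divisor(7))  # None
```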
Your original program is kind of tricky. It has two explicit return statements in counter() and an implicit return at the end of the sequence. It does return None to signal that the recursion can stop, but the way it does it is too tricky for my taste. I recommend using an explicit return None as I showed.
Note that Python has a module itertools that includes a way to generate a counter series like this. I used it to check that the result is correct.
I'm sure you are writing this to learn about recursion, but be advised that Python isn't the best language for recursive solutions like this one. Python has a relatively shallow recursion stack, and does not automatically turn tail recursion into an iterative loop, so this could cause a stack overflow inside Python if your recursive calls nest enough times. The best solution in Python would be to use itertools.product() as I did to just directly generate the desired counter sequence.
Since your generated sequence is a list of lists, and itertools.product() produces tuples, I used a list comprehension to convert each tuple into a list, so the end result is a list of lists, and we can simply use the Python == operator to compare them.
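If the number of positions is not fixed at three, the repeat argument of itertools.product() avoids writing out one range per position. A minimal sketch, with odometer as a hypothetical function name:

```python
from itertools import product


def odometer(length, max_count):
    """All counter states from [1, ..., 1] up to [max_count, ..., max_count]."""
    digits = range(1, max_count + 1)
    # repeat=length is equivalent to passing `digits` length times.
    return [list(state) for state in product(digits, repeat=length)]


states = odometer(3, 3)
print(states[:4])   # [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1]]
print(len(states))  # 27
```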
In order to make the set of all combinations of numbers 0 to x, with length y, we do:
list_of_combinations=list(combinations(range(0,x+1),y))
list_of_combinations=map(list,list_of_combinations)
print list_of_combinations
This will output the result as a list of lists.
For example, x=4, y=3:
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4],
[1, 3, 4], [2, 3, 4]]
I am trying to do the above, but only outputting lists that have 2 members chosen beforehand.
For instance, I would like to only output the set of the combos that has 1 and 4 inside it. The output would then be (for x=4, y=3):
[[0, 1, 4], [1, 2, 4], [1, 3, 4]]
The best approach I have now is to make a list that is y-2 length with all numbers of the set without the chosen numbers, and then append the chosen numbers, but this seems very inefficient. Any help appreciated.
*Edit: I am doing this for large x and y, so I can't just write out all the combos and then search for the selected elements, I need to find a better method.
combinations() returns an iterable, so loop over that while producing the list:
[list(combo) for combo in combinations(range(x + 1), y) if 1 in combo]
This produces one list, the list of all combinations that match the criteria.
Demo:
>>> from itertools import combinations
>>> x, y = 4, 3
>>> [list(combo) for combo in combinations(range(x + 1), y) if 1 in combo]
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4]]
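To require several values at once, such as both 1 and 4 from the question, the membership test can be replaced by a subset check. This is a sketch; combos_containing is a made-up name.

```python
from itertools import combinations


def combos_containing(x, y, required):
    """Combinations of range(x + 1) of length y that include every required value."""
    required = set(required)
    return [list(combo)
            for combo in combinations(range(x + 1), y)
            if required.issubset(combo)]


print(combos_containing(4, 3, [1, 4]))  # [[0, 1, 4], [1, 2, 4], [1, 3, 4]]
```

This still generates every combination and throws most of them away, so it does not address the concern about large x and y.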
The alternative would be to produce y - 1 combinations of range(x + 1) with 1 removed, then adding 1 back in (using bisect.insort() to avoid having to sort afterwards):
import bisect
from itertools import combinations
def combinations_with_guaranteed(x, y, *guaranteed):
    values = set(range(x + 1))
    values.difference_update(guaranteed)
    for combo in combinations(sorted(values), y - len(guaranteed)):
        combo = list(combo)
        for value in guaranteed:
            bisect.insort(combo, value)
        yield combo
then loop over that generator:
>>> list(combinations_with_guaranteed(4, 3, 1))
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4]]
>>> list(combinations_with_guaranteed(4, 3, 1, 2))
[[0, 1, 2], [1, 2, 3], [1, 2, 4]]
This won't produce as many combinations for filtering to discard again.
It may well be that for larger values of y and more guaranteed numbers, just using yield sorted(combo + guaranteed) (with combo left as a tuple) is going to beat repeated bisect.insort() calls.
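The sorted() alternative could look roughly like the sketch below: each combination tuple is concatenated with the guaranteed values and sorted once. The function name combinations_with_required is made up, and which variant is faster has not been measured here.

```python
from itertools import combinations


def combinations_with_required(x, y, *required):
    """Yield sorted lists of length y from range(x + 1) that contain all required values."""
    remaining = [v for v in range(x + 1) if v not in required]
    for combo in combinations(remaining, y - len(required)):
        # combo and required are both tuples, so + concatenates them.
        yield sorted(combo + required)


print(list(combinations_with_required(4, 3, 1, 4)))
# [[0, 1, 4], [1, 2, 4], [1, 3, 4]]
```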
This should do the trick:
filtered_list = filter(lambda x: 1 in x and 4 in x, list_of_combinations)
To make your code nicer (use more generators), I'd use this
combs = combinations(xrange(0, x+1), y)
filtered_list = map(list, filter(lambda x: 1 in x and 4 in x, combs))
If you don't need the filtered_list to be a list and it can be an iterable, you could even do
from itertools import ifilter, imap, combinations
combs = combinations(xrange(0, x+1), y)
filtered_list = imap(list, ifilter(lambda x: 1 in x and 4 in x, combs))
filtered_list.next()
> [0, 1, 4]
filtered_list.next()
> [1, 2, 4]
filtered_list.next()
> [1, 3, 4]
filtered_list.next()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> StopIteration
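The snippets above are written for Python 2. On Python 3, filter() and map() are already lazy, xrange is spelled range, and the built-in next() replaces the .next() method, so a rough equivalent would be:

```python
from itertools import combinations

x, y = 4, 3
combs = combinations(range(x + 1), y)
# Both filter() and map() return iterators on Python 3, so nothing is
# computed until a value is requested.
filtered = map(list, filter(lambda c: 1 in c and 4 in c, combs))

print(next(filtered))        # [0, 1, 4]
print(next(filtered))        # [1, 2, 4]
print(next(filtered))        # [1, 3, 4]
print(next(filtered, None))  # None (a default avoids the StopIteration)
```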
Was just wondering what's the most efficient way of generating all the circular shifts of a list in Python. In either direction. For example, given a list [1, 2, 3, 4], I want to generate either:
[[1, 2, 3, 4],
[4, 1, 2, 3],
[3, 4, 1, 2],
[2, 3, 4, 1]]
where the next permutation is generated by moving the last element to the front, or:
[[1, 2, 3, 4],
[2, 3, 4, 1],
[3, 4, 1, 2],
[4, 1, 2, 3]]
where the next permutation is generated by moving the first element to the back.
The second case is slightly more interesting to me because it results in a reduced Latin square (the first case also gives a Latin square, just not reduced), which is what I'm trying to use to do experimental block design. It actually isn't that different from the first case since they're just re-orderings of each other, but order does still matter.
The current implementation I have for the first case is:
def gen_latin_square(mylist):
    tmplist = mylist[:]
    latin_square = []
    for i in range(len(mylist)):
        latin_square.append(tmplist[:])
        tmplist = [tmplist.pop()] + tmplist
    return latin_square
For the second case it's:
def gen_latin_square(mylist):
    tmplist = mylist[:]
    latin_square = []
    for i in range(len(mylist)):
        latin_square.append(tmplist[:])
        tmplist = tmplist[1:] + [tmplist[0]]
    return latin_square
The first case seems like it should be reasonably efficient to me, since it uses pop(), but you can't do that in the second case, so I'd like to hear ideas about how to do this more efficiently. Maybe there's something in itertools that will help? Or maybe a double-ended queue for the second case?
You can use collections.deque:
from collections import deque
g = deque([1, 2, 3, 4])
for i in range(len(g)):
    print(list(g)) #or do anything with permutation
    g.rotate(1) #for right rotation
    #or g.rotate(-1) for left rotation
It prints:
[1, 2, 3, 4]
[4, 1, 2, 3]
[3, 4, 1, 2]
[2, 3, 4, 1]
To change it for left rotation just replace g.rotate(1) with g.rotate(-1).
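If you want the rows collected into a list rather than printed, the deque approach can be wrapped in a small function. A sketch, with rotations as a hypothetical name:

```python
from collections import deque


def rotations(items, step=1):
    """Return every circular shift of items; step=1 rotates right, step=-1 rotates left."""
    d = deque(items)
    rows = []
    for _ in range(len(d)):
        rows.append(list(d))  # list(d) takes a snapshot of the current order
        d.rotate(step)
    return rows


print(rotations([1, 2, 3, 4]))
# [[1, 2, 3, 4], [4, 1, 2, 3], [3, 4, 1, 2], [2, 3, 4, 1]]
print(rotations([1, 2, 3, 4], step=-1))
# [[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
```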
For the first part, the most concise way probably is
a = [1, 2, 3, 4]
n = len(a)
[[a[i - j] for i in range(n)] for j in range(n)]
# [[1, 2, 3, 4], [4, 1, 2, 3], [3, 4, 1, 2], [2, 3, 4, 1]]
and for the second part
[[a[i - j] for i in range(n)] for j in range(n, 0, -1)]
# [[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
These should also be much more efficient than your code, though I did not do any timings.
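The comprehensions above rely on negative indices wrapping around (a[-1] is the last element). Writing the wrap-around explicitly with the % operator may be easier to read; the sketch below is an alternative spelling of the same idea, not a faster one.

```python
a = [1, 2, 3, 4]
n = len(a)

# Row j starts j positions earlier in the list (moves the last element to the front).
right = [[a[(i - j) % n] for i in range(n)] for j in range(n)]
# Row j starts j positions later in the list (moves the first element to the back).
left = [[a[(i + j) % n] for i in range(n)] for j in range(n)]

print(right)  # [[1, 2, 3, 4], [4, 1, 2, 3], [3, 4, 1, 2], [2, 3, 4, 1]]
print(left)   # [[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
```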
A variation on the slicing "conservation law" a == a[:i] + a[i:]:
ns = list(range(5))
ns
Out[34]: [0, 1, 2, 3, 4]
[ns[i:] + ns[:i] for i in range(len(ns))]
Out[36]:
[[0, 1, 2, 3, 4],
[1, 2, 3, 4, 0],
[2, 3, 4, 0, 1],
[3, 4, 0, 1, 2],
[4, 0, 1, 2, 3]]
[ns[-i:] + ns[:-i] for i in range(len(ns))]
Out[38]:
[[0, 1, 2, 3, 4],
[4, 0, 1, 2, 3],
[3, 4, 0, 1, 2],
[2, 3, 4, 0, 1],
[1, 2, 3, 4, 0]]
more_itertools is a third-party library that offers a tool for cyclic permutations:
import more_itertools as mit
mit.circular_shifts(range(1, 5))
# [(1, 2, 3, 4), (2, 3, 4, 1), (3, 4, 1, 2), (4, 1, 2, 3)]
See also Wikipedia:
A circular shift is a special kind of cyclic permutation, which in turn is a special kind of permutation.
The answer by @Bruno Lenzi does not seem to work:
In [10]: from itertools import cycle
In [11]: x = cycle('ABCD')
In [12]: print [[x.next() for _ in range(4)] for _ in range(4)]
[['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
I give a correct version below; however, the solution by @f5r5e5d is faster.
In [45]: def use_cycle(a):
   ....:     x=cycle(a)
   ....:     for _ in a:
   ....:         x.next()
   ....:         print [x.next() for _ in a]
   ....:
In [46]: use_cycle([1,2,3,4])
[2, 3, 4, 1]
[3, 4, 1, 2]
[4, 1, 2, 3]
[1, 2, 3, 4]
In [50]: def use_slice(a):
   ....:     print [ a[n:] + a[:n] for n in range(len(a)) ]
   ....:
In [51]: use_slice([1,2,3,4])
[[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
In [54]: timeit.timeit('use_cycle([1,2,3,4])','from __main__ import use_cycle',number=100000)
Out[54]: 0.4884989261627197
In [55]: timeit.timeit('use_slice([1,2,3,4])','from __main__ import use_slice',number=100000)
Out[55]: 0.3103291988372803
In [58]: timeit.timeit('use_cycle([1,2,3,4]*100)','from __main__ import use_cycle',number=100)
Out[58]: 2.4427831172943115
In [59]: timeit.timeit('use_slice([1,2,3,4]*100)','from __main__ import use_slice',number=100)
Out[59]: 0.12029695510864258
I removed the print statement in use_cycle and use_slice for timing purposes.
Using itertools to avoid indexing:
x = itertools.cycle(a)
[[x.next() for i in a] for j in a]
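As written, the snippet above consumes exactly len(a) items per row, so every row starts at the same element and all rows come out identical. A Python 3 sketch that skips one extra element between rows is shown below; shifts_with_cycle is a made-up name, and islice is used here in place of the inner comprehension.

```python
from itertools import cycle, islice


def shifts_with_cycle(items):
    """Left circular shifts of items, built from one endless cycle iterator."""
    n = len(items)
    endless = cycle(items)
    rows = []
    for _ in range(n):
        rows.append(list(islice(endless, n)))  # take one full row
        next(endless)                          # skip one item so the next row starts later
    return rows


print(shifts_with_cycle([1, 2, 3, 4]))
# [[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]
```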
This will be my solution.
#given list
a = [1,2,3,4]
#looping through list
for i in xrange(len(a)):
    #inserting last element at the starting
    a.insert(0,a[len(a)-1])
    #removing the last element
    a = a[:len(a)-1]
    #printing if you want to
    print a
This will output the following:
[4, 1, 2, 3]
[3, 4, 1, 2]
[2, 3, 4, 1]
[1, 2, 3, 4]
You can also use pop instead of list slicing, but keep in mind that pop returns the removed element.
The above code works for a list of any length. I have not measured its performance, so any claim that it is faster is only an assumption.
You should have a look at the Python docs to get a good understanding of list slicing.
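For reference, a pop-based version might look like the sketch below. Since pop() returns the element it removes, that value can be passed straight to insert(). The function name right_shifts is hypothetical, and the code targets Python 3.

```python
def right_shifts(items):
    """Return the successive right rotations of items, ending with the original order."""
    work = list(items)   # work on a copy so the caller's list is left untouched
    rows = []
    for _ in range(len(work)):
        # pop() removes and returns the last element; insert puts it at the front.
        work.insert(0, work.pop())
        rows.append(work[:])
    return rows


print(right_shifts([1, 2, 3, 4]))
# [[4, 1, 2, 3], [3, 4, 1, 2], [2, 3, 4, 1], [1, 2, 3, 4]]
```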