I have an iterator that consists of several lists of the same size. For my purpose I need to know the length of at least one of these lists. But as it is with iterators they can't be accessed the same way as ordinary arrays. So my idea was to get this length by saying:
for i in iter:
    list_len = len(i)
    break
This works. However, when I loop over the iterator again later, it skips the first element and continues from where the previous loop (the one above) stopped.
Is there some way to fix this? Or, what is the Pythonic way of doing it?
I was thinking/reading about doing it like:
from itertools import tee
iter_tmp, iter = tee(iter)
for i in iter_tmp:
    list_len = len(i)
    break
And yeah, that works too, since I can still use iter afterwards, but it hurts my eyes that I need a loop, an itertools import and so on just to get the length of a list in an iterator. But maybe that is just the way to go about it?
UPDATE
Just trying to further explain what I'm doing.
As such, an iterator is not a list or an array, but in my case, if I were to loop through my iterator I would get something like this (in the case of my iterator having four "lists" in it):
>>> for i in iter_list:
...     print(i)
[1, 2, 5, 3]
[3, 2, 5, 8]
[6, 8, 3, 7]
[1, 4, 6, 1]
Now, all "lists" in the iterator have the same length, but since the lists themselves are calculated through many steps, I have no way of knowing the length before they enter the iterator. If I don't use an iterator I run out of memory, so it is a trade-off. But yeah, it is the length of just one of the lists I need as a constant I can use throughout the rest of my code.
That is how iterators work. But you have a few options apart from tee.
You can extract the first element and reuse it when iterating the second time:
import itertools
first_elem = next(my_iter)
list_len = len(first_elem)
for l in itertools.chain([first_elem], my_iter):
    pass
Or if you are going to iterate over the iterator more times, you could perhaps listify it (if it's feasible to fit in memory).
my_list = list(my_iter)
first_len = len(my_list[0])
for l in my_list:
    pass
And last but not least, as Palivek said, keep (or get) the information about the length of the lists somewhere else.
In general iterators cannot be iterated over more than once, so you'll probably need to store something additional anyway.
class peek_iterator(object):
    def __init__(self, source):
        self._source = iter(source)
        self._first = None
        self._sent = False
    def __iter__(self):
        return self
    def __next__(self):  # Python 3 iterator protocol (this method is called next in Python 2)
        if self._first is None:
            self._first = next(self._source)
        if self._sent:
            return next(self._source)
        self._sent = True
        return self._first
    def get_isotropic(self, getter):
        if self._first is None:
            self._first = next(self._source)
        return getter(self._first)
lists = [[1, 2, 3], [4, 5, 6]]
i = peek_iterator(lists)
print(i.get_isotropic(len)) # 3
for j in i: print(j) # [1, 2, 3]; [4, 5, 6]
You can do a little trick and wrap the original iterator in a generator. This way, you can obtain the first element and "re-yield" it with the generator without consuming the entire iterator. The head() function below returns the first element and a generator that iterates over the original sequence.
def head(seq):
    seq_iter = iter(seq)
    first = next(seq_iter)
    def gen():
        yield first
        yield from seq_iter
    return first, gen()
seq = range(100, 300, 50)
first, seq2 = head(seq)
print('first item: {}'.format(first))
for item in seq2:
    print(item)
Output:
first item: 100
100
150
200
250
This is conceptually equivalent to Moberg's answer, but uses a generator to "re-assemble" the original sequence instead of itertools.chain().
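Tying this back to the original question, a minimal sketch of the chain-based variant might look like the following. The helper name peek and the sample generator rows are made up for illustration, and the sketch assumes the iterator yields at least one item (otherwise next() raises StopIteration).

```python
from itertools import chain

def peek(iterable):
    """Return (first_item, iterator), where the iterator still yields first_item."""
    it = iter(iterable)
    first = next(it)  # raises StopIteration if the iterable is empty
    return first, chain([first], it)

# Hypothetical stand-in for the question's iterator of equal-length lists
rows = ([n, n + 1, n + 2, n + 3] for n in range(3))
first, rows = peek(rows)
list_len = len(first)
print(list_len)    # 4
print(list(rows))  # [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
```

Only the first list is held in memory in addition to whatever the underlying iterator produces, so the memory benefit of using an iterator is kept.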
Related
I want to write code that receives an arbitrary list and keeps only the non-negative numbers.
However, when I run the code I wrote, I do get only the non-negative numbers, but in reverse order. What should I do?
With the code below, [3, 2, 1, 0] is displayed.
I want it to print [0, 1, 2, 3].
def filter(list):
    flist = []
    for i in list:
        if list[i]>=0:
            flist.append(list[i])
        else:
            continue
    return flist
list = [-1,-2,-3,-4,0,1,2,3]
print(filter(list))
for i in list iterates over the items in the list, not the indices. Since the first four items in the list are negative numbers, when you use them as indices, you end up iterating through the last half of the list in reverse order before reaching zero and then iterating through the first half of the list in forward order. (If all the items in the list didn't happen to be valid indices for that same list, you'd just get an IndexError instead.)
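To make the lookup order concrete, here is a small sketch using the list from the question. The variable names looked_up and kept are introduced only for this illustration.

```python
nums = [-1, -2, -3, -4, 0, 1, 2, 3]

# Each value is (mis)used as an index, so negative values count from the end
looked_up = [nums[i] for i in nums]
print(looked_up)  # [3, 2, 1, 0, -1, -2, -3, -4]

# Keeping only the non-negative lookups reproduces the reversed result
kept = [value for value in looked_up if value >= 0]
print(kept)  # [3, 2, 1, 0]
```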
To iterate over all the items in the list in order by index use range and len:
# 'filter' and 'list' are both builtin names, don't redefine them
def filter_nat(nums):
    flist = []
    for i in range(len(nums)):
        if nums[i]>=0:
            flist.append(nums[i])
        else:
            continue
    return flist
nums = [-1,-2,-3,-4,0,1,2,3]
print(filter_nat(nums)) # [0, 1, 2, 3]
It's simpler to iterate by value, though; you just need to use the value itself rather than trying to use it as an index:
def filter_nat(nums):
    flist = []
    for i in nums:
        if i >= 0:
            flist.append(i)
    return flist
nums = [-1,-2,-3,-4,0,1,2,3]
print(filter_nat(nums)) # [0, 1, 2, 3]
and it's simpler yet to use a comprehension instead of individually appending each item:
def filter_nat(nums):
    return [i for i in nums if i >= 0]
nums = [-1,-2,-3,-4,0,1,2,3]
print(filter_nat(nums)) # [0, 1, 2, 3]
Sort before returning the list:
def filter(list):
    flist = []
    for i in list:
        if list[i]>=0:
            flist.append(list[i])
        else:
            continue
    flist.sort()
    return flist
list = [-1,-2,-3,-4,0,1,2,3]
print(filter(list))
Output:
[0, 1, 2, 3]
What do you want? Something in the same order as encountered reading from left to right, or in increasing order? If it is the order of the initial list, here is the answer:
append() adds the element at the end of the list. Hence, you could either go over the list from its end to its beginning using append(), or go in the same order as you do and add each element at the beginning instead. Sticking to your code, we would have
def filter(list):
    flist = []
    for i in list:
        if list[i]>=0:
            flist = [list[i]] + flist  # prepend the element that was actually tested
        else:
            continue
    return flist
However, this could be written in a much more "pythonic" way:
def filter(list):
    return [i for i in list if i>=0]
Try for i in range (len(list)):
You would be looping through the number of items in the list. Personally it is easier and cleaner code for me but everyone has his preferences.
Try:
print(filter(list[::-1]))
I would like to perform an operation to each element in my list, and when a certain condition is met, to skip to the last element of the list.
Here is a MWE where I print all the items in a list until I reach my condition (item ==4), after which I manually repeat the print statement on the final element. The desired output is to print 0, 1, 2, 3, 4, 7:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
breaked_out = False
for item in my_list:
    print(item)
    if item == 4:
        breaked_out = True
        break
if breaked_out:
    print(my_list[-1])
I have this ugly use of a flag (breaked_out) and also need to repeat my print() command. This isn't particularly legible either.
I have a slightly better implementation in mind that uses a while loop instead:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
i = 0
while i < len(my_list):
    item = my_list[i]
    print(item)
    if item == 4:
        i = len(my_list)-1
    else:
        i += 1
Here I'm not repeating my operation (in this case, print()) but I have to do this unpythonic index accounting.
Is there a more readable way to get this sort of iteration? Other things to add about my situation:
This is iterating on a list, not a generator, so I have access to len().
I need to loop through in this order, so I can't use reversed(my_list) and treat the special case first.
There is no Pythonic way to do exactly what you want
Not what you asked for, but less ugly
The minimal change, not so ugly solution that doesn't actually advance the iterator is to just put the code in the loop instead of having a flag variable:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
for item in my_list:
    print(item)
    if item == 4:
        print(my_list[-1]) # Handling of last element inlined
        break
# Optionally, an else: block can run to do something special when you didn't break
# which might be important if the item that's equal to 4 is the last or second
# to last item, where the former does the work for the final element twice,
# while the latter does it only once, but looks like it never found the element
# (processing all items without breaking looking the same as processing elements
# 0 through n - 1, then processing n separately, then breaking)
else:
    print("Processed them all!")
or to avoid processing the final element twice when it's the first element meeting the test criteria, use enumerate to track your position:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
for i, item in enumerate(my_list, 1): # Lets us test against len(my_list) rather than len(my_list) - 1
    print(item)
    if item == 4 and i < len(my_list): # Don't process last item if we just processed it!
        print(my_list[-1]) # Handling of last element inlined
        break
What you asked for:
There are only two ways I know of to do this, both of which involve converting to an iterator first so you can manipulate the iterator within the loop and make it skip to the last element on the next iteration. In both cases, your original code changes to:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
lstiter = iter(my_list)
for item in lstiter:
    print(item)
    if item == 4:
        # iterator advance goes here
where that placeholder line at the bottom is what changes.
The documented, but slow approach
Using the consume recipe from itertools, advance it to near the end. You need to know where you are, so the for loop changes to:
for i, item in enumerate(lstiter, 1): # Starting from 1 makes i the number of items
                                      # taken from lstiter so far
and the placeholder is filled with:
if i < len(my_list):
    consume(lstiter, len(my_list) - i - 1) # Discard everything except the final element
Downside: Advancing an arbitrary iterator manually like this is O(n) (it has to produce and discard all the values, which can take time for a large list).
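Put together, the consume-based approach could be sketched as follows. The consume function is the recipe from the itertools documentation; process_then_jump and its trigger parameter are hypothetical names used only to make the example testable, and the results are collected in a list instead of being printed one by one. Skipping to the last element means discarding every remaining item except one, which is len(items) - i - 1 items when i counts from 1.

```python
from collections import deque
from itertools import islice

def consume(iterator, n=None):
    """Advance the iterator n steps ahead; if n is None, consume it entirely."""
    if n is None:
        deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)

def process_then_jump(items, trigger):
    """Visit items in order; after seeing `trigger`, jump straight to the last item."""
    seen = []
    it = iter(items)
    for i, item in enumerate(it, 1):  # i = number of items taken from `it` so far
        seen.append(item)
        if item == trigger and i < len(items):
            consume(it, len(items) - i - 1)  # leave exactly one item unconsumed
    return seen

print(process_then_jump([0, 1, 2, 3, 4, 5, 6, 7], 4))  # [0, 1, 2, 3, 4, 7]
```

Note that after the jump the counter i no longer matches the real position in the list, so this sketch should not be reused as-is if the position is needed afterwards.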
The efficient, but undocumented approach
list iterators provide a __setstate__ method (it's used for pickling them so they can be unpickled at the same offset in the underlying list). You can abuse this to change the position of the iterator to any place you like. For this, you keep the for loop without enumerate, and just fill the placeholder with:
lstiter.__setstate__(len(my_list) - 1)
which directly repositions the iterator such that the next element it produces will be the final element of the list. It's efficient, it's simple, but it's non-obvious, and I doubt any part of the spec requires that __setstate__ be provided at all, let alone implemented in this useful way (there are a bazillion methods you can choose from to implement pickling, and they could have selected another option). That said, the implementation is effectively required for all pickle protocol versions to date, for compatibility reasons (if they got rid of __setstate__, pickles produced on older Python would not be readable on modern Python), so it should be fairly reliable.
A warning:
If the final element of your list matches the condition, this will turn into an infinite loop, unlike the other solutions (break double processes the final element in that case, consume only processes each element at most once). breaking explicitly doesn't reenter the loop, so that's safe, and the consume recipe can't back up an iterator, so again, safe. But since this sets the position, it can set it back to the same position forever. If this is a possibility, I'd recommend the explicit break (using enumerate to check indices to avoid double-processing the final element), or failing that, you can add even more hackery by checking the length hint of the iterator to see if you were already at the end (and therefore should not adjust the position):
from operator import length_hint # At top of file
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
lstiter = iter(my_list)
for item in lstiter:
    print(item)
    if item == 4 and length_hint(lstiter) > 1: # Don't set when finished or reaching last item anyway
        lstiter.__setstate__(len(my_list) - 1)
As an alternative to checking length_hint and similar hackery, you could use a flag variable that gets set to True when the condition passes and prevents reentering the if a second time, e.g.:
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
lstiter = iter(my_list)
skipped = False
for i, item in enumerate(lstiter, 1):
    print(item)
    if item == 4 and not skipped and i < len(my_list): # Don't set when finished or reaching last item anyway
        lstiter.__setstate__(len(my_list) - 1)
        skipped = True
but this is straying further and further from Pythonic with every change. :-)
Don't do too much for a simple task
my_list = [0, 1, 2, 3, 4, 5, 6, 7]
for item in my_list:
    print(item)
    if item == 4:
        print(my_list[-1])
        break
    # task here
Is it possible to pass a (moving) pointer to a list start into a function in Python?
I have a recursive function working on a section of a list. The list itself is not changed, only the pointer to a 'starting-point' into it. The problem I ran into was that long lists killed the code with memory overrun.
Here is the code:
def trim(l):
    print("list len= ", len(l))
    if len(l)!= 1:
        trim(l[1:])
    else:
        print("done")
The above example is contrived, my actual code does different stuff than just trimming the list, but it also has a moving start-pointer. A list of 1 million integers blew out of memory on a 10G RAM machine.
Any ideas are welcome.
Couldn't you just pass the index instead of passing the whole new list?
So you would call trim(l, 0) and then check the index against the length of the list, and then call trim(l, 1) if needed.
def trim(l, idx):
    print("list len = ", (len(l) - idx))
    if idx < (len(l) - 1):
        trim(l, idx + 1)
    else:
        print("done")
If you're writing a non-tail-call recursive function to iterate over a list, your problem is more likely to be a stack overflow, or out-of-memory error related to the stack size.
I recommend re-writing this with an integer pointer and a for-loop, as it seems that Python doesn't have tail-call optimisation.
Here's a guess at what you might be wanting to do:
x = [0,0,0,0,0,1,2,3,4]
def trim_leading_zero(l):
    for i in range(len(l)):  # use xrange in Python 2
        if l[i] != 0:
            return l[i:]
    return []  # every element was zero (or the list was empty)
>>> trim_leading_zero(x)
[1, 2, 3, 4]
It's not clear from your code what it's meant to actually do. If you're trying to actually return a sequence, then you may want to look at Generators, which don't require holding an entire sequence in memory.
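As a rough sketch of the index-based rewrite suggested above: the function names are made up, and since the real per-step work is unknown, a placeholder counter stands in for it. Moving an integer start index inside a loop avoids both the deep recursion and the l[1:] copies, each of which duplicates almost the whole list.

```python
def trim_iterative(items):
    """Walk the list with a moving start index instead of recursing on slices."""
    start = 0
    steps = 0
    while len(items) - start > 1:
        steps += 1   # placeholder for the real work on the part starting at `start`
        start += 1   # advance the "pointer"; no copy of the list is made
    return steps

def suffix_starts(items):
    """Generator variant: lazily yield each start index, one at a time."""
    for start in range(len(items)):
        yield start

big = list(range(1_000_000))
print(trim_iterative(big))                  # 999999
print(sum(1 for _ in suffix_starts(big)))   # 1000000
```

Both versions keep only the original list plus a couple of integers alive, regardless of how long the list is.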
When dealing with large data, use generators instead of building whole lists up front.
def trim(l):
    pointer = 0
    while len(l) - pointer > 1:
        print("list len= ", len(l) - pointer)
        yield l[pointer:]
        pointer += 1
    print("done")
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in trim(x):
    print(i)
# the first value printed is [1, 2, 3, 4, 5, 6, 7, 8, 9]; each later one drops one more leading item
Generators yield one item at a time and let you do whatever you need with it, avoiding creating the whole list before processing. If you want to get a list out of it, you can simply do list(trim(x)).
There are great explanations of yield and generators here - What does the yield keyword do
I'd like to write a function in Python that, given a list, returns a list of lists in which each element is the previous list with its last item removed.
Input: list_decreaser([0,3,4,5,6,7,8])
Output: [[0,3,4,5,6,7],[0,3,4,5,6],[0,3,4,5],[0,3,4],[0,3],[0]]
My attempt:
def list_decreaser(list):
    listresult = []
    for x in range(len(list)-1):
        list.remove(list[x])
        listresult.append(list)
    return listresult
The code appends the same list multiple times. It should append copy of the list.
And use del list[..] instead of list.remove(list[..]) to delete an item at specific index.
def list_decreaser(xs):
    listresult = []
    for i in range(len(xs)-1, 0, -1): # <--- iterate backward
        del xs[i]
        listresult.append(xs[:]) # <----
    return listresult
print(list_decreaser([0,3,4,5,6,7,8]))
Or using list comprehension:
>>> xs = [0,3,4,5,6,7,8]
>>> [xs[:i] for i in range(len(xs)-1, 0, -1)]
[[0, 3, 4, 5, 6, 7], [0, 3, 4, 5, 6], [0, 3, 4, 5], [0, 3, 4], [0, 3], [0]]
BTW, don't use list as a variable name. It shadows builtin list function.
The problem is that you're appending the same list over and over again. You keep mutating the list in-place, but you're never creating a new list. So you end up with a list of N references to the same list, whatever its final contents happen to be.
This is the same problem discussed in two FAQ questions. I think How do I create a multidimensional list explains it best.
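A small sketch of the aliasing effect, using made-up names (row, aliased, copied) and a shorter list than the one in the question:

```python
row = [1, 2, 3]
aliased = []
copied = []
while len(row) > 1:
    row.pop()
    aliased.append(row)     # stores another reference to the same list object
    copied.append(row[:])   # stores a snapshot of the current contents

print(aliased)                   # [[1], [1]]
print(copied)                    # [[1, 2], [1]]
print(aliased[0] is aliased[1])  # True
```

Every entry of aliased shows the final state of row, because all of them are the same object; only the copies preserve the intermediate states.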
Anyway, what you need to do is append a new list each time through the loop. There are two ways to do that.
First, you can append a copy of the current list, instead of the list itself:
def list_decreaser(list):
    listresult = []
    for x in range(len(list)-1):
        list.remove(list[x])
        listresult.append(list[:]) # this is the only change
    return listresult
This solves your problem, but it leaves a few new problems:
First, list.remove(list[x]) is a very bad idea. If you give it, say, [0, 1, 2, 0], what happens when you try to remove that second 0? You're calling list.remove(0), and there's no way the list can know you wanted the second 0 rather than the first! The right thing to do is call del list[x] or list.pop(x).
But once you fix that, you're removing the elements from the wrong side. x is 0, then 1, then 2, and so on. You remove element 0, then element 1 (which is the original element 2), then element 2 (which is the original element 4), and eventually get an IndexError. Even if you fixed the "skipping an index" issue (which is also explained in the FAQ somewhere), you'd still be removing the first elements rather than the last ones. You can fix that by turning the range around. However, there's an even easier way: Just remove the last element each time, instead of trying to figure out which x is the right thing, which you can do by specifying -1, or just calling pop with no argument. And then you can use a much simpler loop, too:
def list_decreaser(list):
    listresult = []
    while list:
        list.pop()
        listresult.append(list[:])
    return listresult
Of course this appends the last, empty list, which you apparently didn't want. You can fix that by doing while len(list) > 1, or putting an if list: listresult.append(list[:]), or in various other ways.
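To illustrate the earlier point about remove versus del and pop, here is a short sketch on the [0, 1, 2, 0] example. The variable names are chosen only for this illustration.

```python
values = [0, 1, 2, 0]

by_value = values[:]
by_value.remove(by_value[3])  # remove(0) deletes the FIRST 0, not the one at index 3
print(by_value)  # [1, 2, 0]

by_index = values[:]
del by_index[3]               # deletes exactly the element at index 3
print(by_index)  # [0, 1, 2]

popped = values[:]
popped.pop()                  # pop() with no argument removes the last element
print(popped)    # [0, 1, 2]
```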
Alternatively, you can make new truncated lists instead of truncating and copying the same list over and over:
def list_decreaser(list):
    listresult = []
    while len(list) > 1:  # stop before the empty list would be appended
        list = list[:-1]
        listresult.append(list)
    return listresult
Note that in this second version, rather than changing the value stored in list, we're creating a new list and storing that new list in list.
Use this:
def list_decreaser(list1):
    listresult = []
    for _ in range(len(list1) - 1):
        list1 = list1[:-1]
        listresult.append(list1)
    return listresult
I want to write a function that takes items from a list and groups them into groups of size n.
Ie, for n = 5, [1, 2, 3, 4, 5, 6, 7] would become [[1, 2, 3, 4, 5], [6, 7]].
What's the best python idiomatic way to do this?
You could do this:
[a[x:x+n] for x in range(0, len(a), n)]
(In Python 2, use xrange for efficiency; in Python 3 use range as above.)
I don't know of a good command to do this, but here's a way to do it with a list comprehension:
l = [1,2,3,4,5,6,7]
n = 5
newlist = [l[i:i+n] for i in range(0,len(l),n)]
Edit: as a commenter pointed out, I had accidentally put l[i:i+n] in a list.
Solutions using ranges with steps only work on sequences such as lists and tuples (not iterators). They also aren't as efficient as they can be, since they access the sequence many times instead of iterating over it once.
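A quick sketch of that limitation, wrapping the slicing one-liner in a function with the made-up name chunk_by_slicing: it works on a list, but an iterator supports neither len() nor slicing.

```python
def chunk_by_slicing(a, n):
    return [a[x:x + n] for x in range(0, len(a), n)]

chunked = chunk_by_slicing([1, 2, 3, 4, 5, 6, 7], 5)
print(chunked)  # [[1, 2, 3, 4, 5], [6, 7]]

error = None
try:
    chunk_by_slicing(iter([1, 2, 3, 4, 5, 6, 7]), 5)
except TypeError as exc:
    error = exc  # len() is not defined for iterators
    print("iterators are not supported:", exc)
```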
Here's a version which supports iterators and only iterates over the input once, creating a list of lists:
def blockify(iterator, blocksize):
    """Split the items in the given iterator into blocksize-sized lists.
    If the number of items in the iterator doesn't divide by blocksize,
    a smaller block containing the remaining items is added to the result.
    """
    blocks = []
    for index, item in enumerate(iterator):
        if index % blocksize == 0:
            block = []
            blocks.append(block)
        block.append(item)
    return blocks
And now an iterator version which returns an iterator of tuples, doesn't have a memory overhead, and allows choosing whether to include the remainder. Note that the output can be converted into a list via list(blockify(...)).
from itertools import islice
def blockify(iterator, blocksize, include_remainder=True):
    """Split the items in the given iterator into blocksize-sized tuples.
    If the number of items in the iterator doesn't divide by blocksize and
    include_remainder is True, a smaller block containing the remaining items
    is added to the result; if include_remainder is False the remaining items
    are discarded.
    """
    iterator = iter(iterator) # we need an actual iterator
    while True:
        block = tuple(islice(iterator, blocksize))
        if len(block) < blocksize:
            if len(block) > 0 and include_remainder:
                yield block
            break
        yield block
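For comparison, a more compact sketch of the same single-pass idea uses the two-argument form of iter(). The name chunks is made up here, and unlike blockify above this version always keeps the remainder.

```python
from itertools import islice

def chunks(iterable, n):
    """Return an iterator of tuples of up to n items; the last tuple may be shorter."""
    it = iter(iterable)
    # iter(callable, sentinel) keeps calling the lambda until it returns ()
    return iter(lambda: tuple(islice(it, n)), ())

print(list(chunks(range(1, 8), 5)))  # [(1, 2, 3, 4, 5), (6, 7)]
print(list(chunks([], 5)))           # []
```

The empty tuple works as the sentinel because islice only produces an empty block once the underlying iterator is exhausted (assuming n is at least 1).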
[a[n*k:n*(k+1)] for k in range((len(a) + n - 1) // n)]