I would like to loop through a list checking each item against the one following it.
Is there a way I can loop through all but the last item using for x in y? I would prefer to do it without using indexes if I can.
Note
freespace answered my actual question, which is why I accepted the answer, but SilentGhost answered the question I should have asked.
Apologies for the confusion.
for x in y[:-1]
If y is a generator, then the above will not work.
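One workaround if y might be a generator (a minimal sketch, assuming the whole thing fits in memory) is to materialize it into a list first:
y = (n * n for n in range(5))   # hypothetical generator
items = list(y)                 # materialize so slicing works
for x in items[:-1]:            # every item except the last
    print(x)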
The easiest way to compare each item of the sequence with the one following it:
for i, j in zip(a, a[1:]):
# compare i (the current) to j (the following)
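For example, a quick illustration with a made-up list:
a = [3, 1, 4, 1, 5]
for i, j in zip(a, a[1:]):
    print(i, j, i < j)   # each item next to the one following it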
If you want to get all the elements in the sequence pairwise, use this approach (the pairwise function is adapted from the recipes in the itertools documentation).
from itertools import tee, chain

def pairwise(seq):
    a, b = tee(seq)
    next(b, None)
    return zip(a, b)
for current_item, next_item in pairwise(y):
    if compare(current_item, next_item):
        pass  # do what you have to do
If you need to compare the last value to some special value, chain that value to the end
for current, next_item in pairwise(chain(y, [None])):
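For instance, a small sketch reusing the pairwise helper and chain import above (y here is just an example list; None acts as the end sentinel):
y = [3, 1, 4, 1, 5]
for current, next_item in pairwise(chain(y, [None])):
    if next_item is None:
        print(current, "is the last item")
    elif current > next_item:
        print(current, "is greater than", next_item)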
If you meant comparing the nth item with the (n+1)th item in the list, you could also do it with:
>>> for i in range(len(lst) - 1):
...     print(lst[i] > lst[i + 1])
Note that there is no hard-coding going on there. This should be fine unless you feel otherwise.
To compare each item with the next one in an iterator without instantiating a list:
import itertools
it = (x for x in range(10))
data1, data2 = itertools.tee(it)
next(data2, None)
for a, b in zip(data1, data2):
    print(a, b)
This answers what the OP should have asked, i.e. traversing a list while comparing consecutive elements (SilentGhost's excellent answer), but generalized to groups of any size (n-grams): 2, 3, ..., n:
zip(*(l[start:] for start in range(0, n)))
Examples:
l = range(0, 4) # [0, 1, 2, 3]
list(zip(*(l[start:] for start in range(0, 2)))) # == [(0, 1), (1, 2), (2, 3)]
list(zip(*(l[start:] for start in range(0, 3)))) # == [(0, 1, 2), (1, 2, 3)]
list(zip(*(l[start:] for start in range(0, 4)))) # == [(0, 1, 2, 3)]
list(zip(*(l[start:] for start in range(0, 5)))) # == []
Explanations:
l[start:] generates a sub-sequence starting from index start
*list or *generator passes all the elements to the enclosing function zip, as if it were written zip(elem1, elem2, ...)
Note:
AFAIK, this code is about as lazy as it can be, although note that slicing a plain list does create copies. Not thoroughly tested.
Related
This is just a general algorithmic question, but I was wondering if there is a more efficient/faster way of comparing every element of a list with every other element of the list, without redundant or duplicate checks.
I have a straightforward implementation:
for item in items:
    for comparator in items:
        if comparator == item:
            pass
        else:
            pass  # do something...
But the problem with this is that it performs duplicate checks and doesn't scale well. Is there a methodology that can do this more efficiently or more quickly?
We can do better than a plain double loop that merely avoids comparing an item with itself. We can also eliminate 'mirrored checks' (e.g. if we check a == b we don't need to check b == a). This is exactly what combinations from itertools produces: each time we add another item to the input, it contributes only n - 1 new comparisons (one against each existing item).
The example below generates the combinations of length 2 and creates tuples of the two numbers and the result of an equality check.
>>> from itertools import combinations
>>> [(a, b, a == b) for a, b in combinations([1,2,3,1,2], 2)]
[(1, 2, False), (1, 3, False), (1, 1, True), (1, 2, False), (2, 3, False), (2, 1, False), (2, 2, True), (3, 1, False), (3, 2, False), (1, 2, False)]
To avoid comparing duplicates, use enumerate() and skip the case where the indexes are the same.
for i, item in enumerate(items):
    for j, comparator in enumerate(items):
        if i == j:
            continue
        if comparator != item:
            pass  # do something
There's no way to avoid the scaling problem -- you have to loop over every pair of elements to achieve your goal. That's inherently O(n^2).
for i in list1:
    if i in list2:
        continue
    else:
        pass  # do something
I'm not sure if this is what you need.
Parallel computing using multiprocessing/threading might make it faster, although for CPU-bound pure-Python comparisons a thread pool is limited by the GIL, so a process pool is more likely to help.
from multiprocessing.dummy import Pool as ThreadPool
def action(enum_elem):
    index, item = enum_elem
    for index2, comparator in enumerate(list1):
        if index == index2:
            continue
        if comparator != item:
            pass  # do something

pool = ThreadPool(5)
pool.map(action, list(enumerate(list1)))
I'm trying to solve a problem from Rosalind:
Return: The total number of signed permutations of length n, followed by a list of
all such permutations (you may list the signed permutations in any order).
I have an idea for a solution in Python, but I can't carry it through to the end. Consider, for example, n = 2.
from itertools import permutations

numbers = [1, -1, 2, -2]
ordering = permutations(numbers, n)
So now I've got some tuples as a result:
(1, -1) (1, 2) (1, -2) (-1, 1) (-1, 2) (-1, -2) (2, 1) (2, -1) (2, -2) (-2, 1)
(-2, -1) (-2, 2)
I need to exclude those that have elements of equal modulus. For example, (-1, 1). Is it possible to implement this, and if possible, how?
A pythonic solution using list comprehension:
filtered_perms = [(x,y) for x,y in ordering if abs(x) != abs(y)]
Edit:
Code that works fine with python 3.7:
import itertools as itt
# create permutation generator object
perms = itt.permutations([-2, -1, 1, 2], 2)
# here generator is turned into a list with certain permutations excluded
filtered_perms = [(x,y) for x,y in perms if abs(x) != abs(y)]
# print whole list
print(filtered_perms)
# print first permutation
print(filtered_perms[0])
# print length of the list
print(len(filtered_perms))
Edit2:
To fix the problem of ordering appearing to have no elements (it is a one-shot generator that gets exhausted once iterated):
ordering = list(itertools.permutations([-2, -1, 1, 2],2))
after that, ordering will be a list of all elements from itertools.permutations.
The solutions proposed before are correct, but if you want to keep processing the results after generating the permutations, it would be a good idea to have a generator instead of a list. For that, I recommend designing your own generator function based on itertools.permutations:
from itertools import permutations

def signed_permutations(iterable, r=2):
    for item in permutations(iterable, r):
        if abs(item[0]) != abs(item[1]):
            yield item
And you can use it as:
for item in signed_permutations(numbers):
do_something(item)
Or if you just want to create a list:
sigperm = list(signed_permutations(numbers))
The filter function is probably what you're looking for.
list(filter(lambda pair: abs(pair[0]) != abs(pair[1]), ordering))
The condition might be wrong; I'm not sure what you mean by equal modulus.
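For instance, a quick sketch reusing the numbers from the question:
from itertools import permutations

numbers = [1, -1, 2, -2]
ordering = permutations(numbers, 2)
# keep only the pairs whose absolute values differ
print(list(filter(lambda pair: abs(pair[0]) != abs(pair[1]), ordering)))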
I have a list:
L = [1,2,3,4,5,6,7,8]
I want to iterate over consecutive elements in the list such that, when it comes to the last element, i.e. 8, it pairs with the first element, 1.
The final output I want is:
[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,1]
I tried using this way:
for first,second in zip(L, L[1:]):
print([first,second])
But I am getting only this result:
[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8]
How do I pair the last element with the first? I have heard about the negative indexing property of lists.
You can simply extend the second list in zip() with a list containing only the first item, something like:
for first, second in zip(L, L[1:] + L[0:1]): # or simply zip(L, L[1:] + L[:1])
print([first, second])
You can use cycle to cycle the lists (in combination with islice to skip the first element):
from itertools import cycle, islice
L = [1,2,3,4,5,6,7,8]
rest_L_cycled = islice(cycle(L), 1, None)
items = zip(L, rest_L_cycled)
print(list(items))
This is easily extensible. Note that it relies on the fact that zip halts on the shorter list (the second argument is an infinite cycle). It also does everything lazily and does not create any intermediate list (well, except for the printed list) :-)
You can just append the front element(s) to the back.
for first,second in zip(L, L[1:] + L[:1]):
print([first,second])
You can also iterate through the indexes of L, and for the index of the second item of the output tuples, simply use the remainder of the length of L:
[(L[i], L[(i + 1) % len(L)]) for i in range(len(L))]
You can simply concatenate the list of pairs from zip(L, L[1:]) with the pair formed by the last element L[-1] and the first one L[0], and iterate over the result (in Python 3 the zip object has to be wrapped in list() before concatenating):
for first, second in list(zip(L, L[1:])) + [(L[-1], L[0])]:
    print([first, second])
It gives the desired outcome.
It's not a one-liner, but:
>>> L = [1,2,3,4,5,6,7,8]
>>> z = list(zip(L[:-1], L[1:]))
>>> z.append((L[-1], L[0]))
>>> z
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 1)]
for i in range(len(L)):
    first = L[i]
    second = L[(i + 1) % len(L)]
    print([first, second])
You can simply use itertools.zip_longest with a fillvalue of the first item.
from itertools import zip_longest
list(map(tuple, zip_longest(L, L[1:], fillvalue=L[0])))
This is a version (over-)using itertools:
from itertools import islice, cycle
L = [1,2,3,4,5,6,7,8]
for a, b in zip(L, islice(cycle(L), 1, None)):
print(a, b)
The idea is to cycle over the second argument - that way zip runs until L itself is exhausted. This does not create any new lists.
I don't know how to properly ask this question, but here is what I am trying to do.
lists = []
for x in range(3):
for y in range(3):
if x!=y:
lists.append([x,y])
Is there a simple solution so it doesn't give me lists that are the same but reversed,
for example [2,0] and [0,2]?
I know I could go through the lists and remove them afterwards, but is there a solution that doesn't create them in the first place? (Sorry, my English isn't perfect.)
You can use itertools.combinations
>>> from itertools import combinations
>>> list(combinations(range(3), 2))
[(0, 1), (0, 2), (1, 2)]
With the above example we take any combination of two elements from range(3) without repeating any elements.
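For example, to get the deduplicated [x, y] lists the question asks for, a small sketch:
from itertools import combinations

lists = [list(pair) for pair in combinations(range(3), 2)]
print(lists)   # [[0, 1], [0, 2], [1, 2]]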
Sure: if you add all pairs with y > x instead of all possible pairs, only one of each pair (x, y) and (y, x) will appear.
lists = []
for x in range(3):
for y in range(x + 1, 3):
lists.append([x,y])
If you don't want those "duplicates", you want a combination
a combination is a way of selecting items from a collection, such that (unlike permutations) the order of selection does not matter
>>> import itertools
>>> list(itertools.combinations(iterable=range(3), r=2))
[(0, 1), (0, 2), (1, 2)]
Above I have used combinations() from the Python module itertools.
Explanation
I've set r=2 because you want a subsequence length of 2 (the form you described as [x, y])
The iterable=range(3) parameter is just the collection of elements that the combinations are drawn from; range(3) yields the values 0, 1, 2.
The list() applied to the end result simply forces the iterator to be consumed so its contents can be printed; otherwise itertools.combinations returns an iterator that yields the elements one by one.
Easy:
for x in range(3):
    for y in range(x + 1, 3):
        lists.append([x, y])
One frequently finds expressions of this type in Python questions on SO, either for just accessing all items of the iterable:
for i in range(len(a)):
print(a[i])
Which is just a cumbersome way of writing:
for e in a:
print(e)
Or for assigning to elements of the iterable:
for i in range(len(a)):
a[i] = a[i] * 2
Which should be the same as:
for i, e in enumerate(a):
a[i] = e * 2
# Or if it isn't too expensive to create a new iterable
a = [e * 2 for e in a]
Or for filtering over the indices:
for i in range(len(a)):
if i % 2 == 1: continue
print(a[i])
Which could be expressed like this:
for e in a[::2]:
print(e)
Or when you just need the length of the list, and not its content:
for _ in range(len(a)):
doSomethingUnrelatedToA()
Which could be:
for _ in a:
doSomethingUnrelatedToA()
In Python we have enumerate, slicing, filter, sorted, etc. As Python's for constructs are intended to iterate over iterables and not only over ranges of integers, are there real-world use cases where you need for i in range(len(a))?
If you need to work with the indices of a sequence, then yes, you use it, e.g. for the equivalent of numpy.argsort:
>>> a = [6, 3, 1, 2, 5, 4]
>>> sorted(range(len(a)), key=a.__getitem__)
[2, 3, 1, 5, 4, 0]
Short answer: mathematically speaking, no; in practical terms, yes, for example for Intentional Programming.
Technically, the answer would be "no, it's not needed" because it's expressible using other constructs. But in practice, I use for i in range(len(a)) (or for _ in range(len(a)) if I don't need the index) to make it explicit that I want to iterate as many times as there are items in a sequence, without needing to use the items in the sequence for anything.
So, "Is there a need?" Yes: I need it to express the meaning/intent of the code for readability purposes.
See also: https://en.wikipedia.org/wiki/Intentional_programming
And obviously, if there is no collection associated with the iteration at all, for ... in range(N) is the only option, so as not to resort to i = 0; while i < N: ...; i += 1.
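A trivial sketch of that "repeat N times with no associated collection" case (ping is a hypothetical stand-in for whatever is being repeated):
N = 3

def ping():            # hypothetical action; nothing here depends on a loop index
    print("ping")

for _ in range(N):     # the loop variable is deliberately unused
    ping()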
What if you need to access two elements of the list simultaneously?
for i in range(len(a[0:-1])):
something_new[i] = a[i] * a[i+1]
You can use this, but it's probably less clear:
for i, _ in enumerate(a[0:-1]):
something_new[i] = a[i] * a[i+1]
Personally I'm not 100% happy with either!
Going by the comments as well as personal experience, I say no, there is no need for range(len(a)). Everything you can do with range(len(a)) can be done in another (usually far more efficient) way.
You gave many examples in your post, so I won't repeat them here. Instead, I will give an example for those who say "What if I want just the length of a, not the items?". This is one of the only times you might consider using range(len(a)). However, even this can be done like so:
>>> a = [1, 2, 3, 4]
>>> for _ in a:
...     print(True)
...
True
True
True
True
>>>
Clements' answer (as shown by Allik) can also be reworked to remove range(len(a)):
>>> a = [6, 3, 1, 2, 5, 4]
>>> sorted(range(len(a)), key=a.__getitem__)
[2, 3, 1, 5, 4, 0]
>>> # Note however that, in this case, range(len(a)) is more efficient.
>>> [x for x, _ in sorted(enumerate(a), key=lambda i: i[1])]
[2, 3, 1, 5, 4, 0]
>>>
So, in conclusion, range(len(a)) is not needed. Its only upside is readability (its intention is clear). But that is just preference and code style.
Sometimes matplotlib requires range(len(y)): e.g., with y = array([1, 2, 5, 6]), plot(y) works fine but scatter(y) does not; one has to write scatter(range(len(y)), y). (Personally, I think this is a bug in scatter; plot and its friends scatter and stem should use the same calling sequences as much as possible.)
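A minimal sketch of that situation (assuming matplotlib is installed; y is just example data):
import matplotlib.pyplot as plt

y = [1, 2, 5, 6]
plt.plot(y)                    # plot infers x = 0..len(y)-1 on its own
plt.scatter(range(len(y)), y)  # scatter needs the x values spelled out
plt.show()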
It's nice to have when you need to use the index for some kind of manipulation and having the current element doesn't suffice. Take, for instance, a binary tree that's stored in an array. If you have a method that must return a list of tuples containing each node's direct children, then you need the index.
# 0 -> 1,2 : 1 -> 3,4 : 2 -> 5,6 : 3 -> 7,8 ...
def direct_children(nodes):
    children = []
    for i in range(len(nodes)):
        leftNode = None
        rightNode = None
        if i * 2 + 1 < len(nodes):
            leftNode = nodes[i * 2 + 1]
        if i * 2 + 2 < len(nodes):
            rightNode = nodes[i * 2 + 2]
        children.append((leftNode, rightNode))
    return children

nodes = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
children = direct_children(nodes)
Of course, if the element you're working on is an object, you can just call a get-children method. But yeah, you only really need the index if you're doing some sort of manipulation.
Sometimes, you really don't care about the collection itself. For instance, creating a simple model fit line to compare an "approximation" with the raw data:
from math import sqrt

fib_raw = [1, 1, 2, 3, 5, 8, 13, 21]  # Fibonacci numbers

phi = (1 + sqrt(5)) / 2
phi2 = (1 - sqrt(5)) / 2
def fib_approx(n): return (phi**n - phi2**n) / sqrt(5)

x = range(len(fib_raw))
y = [fib_approx(n) for n in x]
# Now plot to compare fib_raw and y
# Compare error, etc.
In this case, the values of the Fibonacci sequence itself were irrelevant. All we needed here was the size of the input sequence we were comparing with.
If you have to iterate over the first len(a) items of an object b (that is larger than a), you should probably use range(len(a)):
for i in range(len(a)):
do_something_with(b[i])
I have a use case I don't believe any of your examples cover.
boxes = [b1, b2, b3]
items = [i1, i2, i3, i4, i5]
for j in range(len(boxes)):
boxes[j].putitemin(items[j])
I'm relatively new to Python, though, so I'm happy to learn a more elegant approach.
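For what it's worth, a sketch of how this can be written without indices using zip (the Box class and putitemin method here are hypothetical stand-ins for the objects in the snippet above):
class Box:
    def __init__(self):
        self.contents = []
    def putitemin(self, item):
        self.contents.append(item)

boxes = [Box(), Box(), Box()]
items = ["i1", "i2", "i3", "i4", "i5"]

# zip pairs each box with the item in the same position and stops at the shorter list
for box, item in zip(boxes, items):
    box.putitemin(item)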
Very simple example:
def loadById(self, id):
if id in range(len(self.itemList)):
self.load(self.itemList[id])
I can't quickly think of a solution that does not use the range-len composition.
But this should probably be done with try..except instead, to stay Pythonic, I guess.
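A sketch of that try/except variant (same hypothetical itemList and load attributes as above; note that, unlike the range check, this would also accept negative indices):
def loadById(self, id):
    try:
        item = self.itemList[id]
    except IndexError:
        return              # id out of range, nothing to load
    self.load(item)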
One problem with for i, num in enumerate(a) is that num does not change when you change a[i]. For example, this loop:
for i, num in enumerate(a):
while num > 0:
a[i] -= 1
will never end.
Of course, you could still use enumerate while swapping each use of num for a[i], but that kind of defeats the whole purpose of enumerate, so using for i in range(len(a)) just becomes more logical and readable.
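For comparison, a quick sketch of the index-based version, which does terminate because a[i] is re-read on every pass:
a = [2, 1, 3]
for i in range(len(a)):
    while a[i] > 0:
        a[i] -= 1
print(a)   # every element has been counted down to 0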
Having a range of indices is useful for some more sophisticated problems in combinatorics. For example, to get all possible partitions of a list into three non-empty sections, the most straightforward approach is to find all possible combinations of distinct endpoints between the first and second section and between the second and third section. This is equivalent to ordered pairs of integers chosen from the valid indices into the list (except zero, since that would make the first partition empty). Thus:
>>> from itertools import combinations
>>> def three_parts(sequence):
... for i, j in combinations(range(1, len(sequence)), 2):
... yield (sequence[:i], sequence[i:j], sequence[j:])
...
>>> list(three_parts('example'))
[('e', 'x', 'ample'), ('e', 'xa', 'mple'), ('e', 'xam', 'ple'), ('e', 'xamp', 'le'), ('e', 'xampl', 'e'), ('ex', 'a', 'mple'), ('ex', 'am', 'ple'), ('ex', 'amp', 'le'), ('ex', 'ampl', 'e'), ('exa', 'm', 'ple'), ('exa', 'mp', 'le'), ('exa', 'mpl', 'e'), ('exam', 'p', 'le'), ('exam', 'pl', 'e'), ('examp', 'l', 'e')]
My code is:
s=["9"]*int(input())
for i in range(len(s)):
    while not set(s[i])<=set('01'):s[i]=input(i)
print(bin(sum([int(x,2)for x in s]))[2:])
It is a binary adder, but I don't think the range(len(...)) or the loop body can be replaced to make it smaller/better.
I think it's useful for tqdm if you have a large loop and you want to track progress. This will output a progress bar:
import numpy as np
from tqdm import tqdm

empty_list = np.full(len(items), np.nan)
for i in tqdm(range(len(items))):
    empty_list[i] = do_something(items[i])
This will not show progress, at least in the case I was using it for:
empty_list = np.full(len(items), np.nan)
for i, _ in tqdm(enumerate(items)):
empty_list[i] = do_something(items[i])
It just showed the number of iterations, which is not as helpful.
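That said, if I remember correctly, tqdm accepts a total argument, so a sketch like this should show a full progress bar with enumerate as well (items and do_something are hypothetical stand-ins for the real data and work):
import numpy as np
from tqdm import tqdm

items = list(range(1000))                 # hypothetical data
empty_list = np.full(len(items), np.nan)

def do_something(x):                      # hypothetical stand-in for the real work
    return x * 2

# total tells tqdm how many iterations to expect, since enumerate has no len()
for i, item in tqdm(enumerate(items), total=len(items)):
    empty_list[i] = do_something(item)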