Make this simple for/if block more pythonic - python

I need to store in a list the indexes of those values in 3 lists which exceed a given maximum limit. This is what I got:
# Data lists.
a = [3,4,5,12,6,8,78,5,6]
b = [6,4,1,2,8,784,43,6,2]
c = [8,4,32,6,1,7,2,9,23]
# Maximum limit.
max_limit = 20.
# Store indexes in list.
indexes = []
for i, a_elem in enumerate(a):
    if a_elem > max_limit or b[i] > max_limit or c[i] > max_limit:
        indexes.append(i)
This works but I find it quite ugly. How can I make it more elegant/pythonic?

You could replace your for loop with:
indexes = []
for i, triplet in enumerate(zip(a, b, c)):
    if any(e > max_limit for e in triplet):
        indexes.append(i)
... which you could then reduce to a list comprehension:
indexes = [i for i, t in enumerate(zip(a, b, c)) if any(e > max_limit for e in t)]
... although that seems a little unwieldy to me - this is really about personal taste, but I prefer to keep listcomps simple; the three-line for loop is clearer in my opinion.
As pointed out by user2357112, you can reduce the apparent complexity of the list comprehension with max():
indexes = [i for i, t in enumerate(zip(a, b, c)) if max(t) > max_limit]
... although this won't short-circuit in the same way that the any() version (and your own code) does, so will probably be slightly slower.
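To make the short-circuiting difference concrete, here is a small Python 3 sketch. The `over_limit` helper and the `seen` list are made up purely to count how many elements the any() version actually inspects; they are not part of the original code.

```python
a = [3, 4, 5, 12, 6, 8, 78, 5, 6]
b = [6, 4, 1, 2, 8, 784, 43, 6, 2]
c = [8, 4, 32, 6, 1, 7, 2, 9, 23]
max_limit = 20.

# any() stops at the first element over the limit; max() always scans all three.
with_any = [i for i, t in enumerate(zip(a, b, c)) if any(e > max_limit for e in t)]
with_max = [i for i, t in enumerate(zip(a, b, c)) if max(t) > max_limit]

# Hypothetical instrumentation: record every element that gets compared.
seen = []

def over_limit(e):
    seen.append(e)
    return e > max_limit

counted = [i for i, t in enumerate(zip(a, b, c)) if any(over_limit(e) for e in t)]
print(with_any, with_max, len(seen))
```

Both variants produce the same indexes; the any() version simply skips the remaining elements of a triplet once one of them exceeds the limit, which only matters for speed.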

You could try
    if max(a_elem, b[i], c[i]) > max_limit:
        indexes.append(i)
The logic here is to find out whether any one of these three values is greater than max_limit. If the greatest of the three is greater than max_limit, your condition is satisfied.

I like the exceeders = line best myself
import collections
# Data lists.
a = [3,4,5,12,6,8,78,5,6]
b = [6,4,1,2,8,784,43,6,2]
c = [8,4,32,6,1,7,2,9,23]
Triad = collections.namedtuple('Triad', 'a b c')
triads = [Triad(*args) for args in zip(a, b, c)]
triads = [t for t in zip(a, b, c)] # if you don't need namedtuple
# Maximum limit.
max_limit = 20.
# Store indexes in list.
indexes = [i for i, t in enumerate(triads) if max(t) > max_limit]
print indexes
# store the bad triads themselves in a list for
# greater pythonicity
exceeders = [t for t in triads if max(t) > max_limit]
print exceeders
As I commented above, using parallel arrays to represent data that are related makes simple code much less simple than it need be.
added in response to comment
Perhaps I gave you too many alternatives, so I shall give only one way instead. One feature that all of the answers have in common is that they fuse the separate "data lists" into rows using zip:
triads = [t for t in zip(a, b, c)]
exceeders = [t for t in triads if max(t) > max_limit]
That's it: two lines. The important point is that storing the index of anything in a list is a C-style way of doing things and you asked for a Pythonic way. Keeping a list of indices means that anytime you want to do something with the data at that index, you have to do an indirection. After those two lines execute, exceeders has the value:
[(5, 1, 32), (8, 784, 7), (78, 43, 2), (6, 2, 23)]
Where each member of the list has the "column" of your three data rows that was found to exceed your limit.
Now you might say "but I really wanted the indices instead". If that is so, there is another part of your problem, which you didn't show us, that also relies on list indexing. In that case you are still doing things in a C/C++/Java way and "Pythonic" will remain elusive.
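If both the position and the values are needed later, one middle ground is to keep them together rather than storing bare indices. A minimal Python 3 sketch, where the name `exceeders_by_index` is made up for illustration:

```python
a = [3, 4, 5, 12, 6, 8, 78, 5, 6]
b = [6, 4, 1, 2, 8, 784, 43, 6, 2]
c = [8, 4, 32, 6, 1, 7, 2, 9, 23]
max_limit = 20.

# Map each offending position to its (a, b, c) row, so no second lookup is needed.
exceeders_by_index = {
    i: t for i, t in enumerate(zip(a, b, c)) if max(t) > max_limit
}

for i, row in exceeders_by_index.items():
    print(i, row)
```

This keeps the index available without forcing later code to index back into a, b and c.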

>>> maximums = map(max, zip(a, b, c))
>>> [i for i, num in enumerate(maximums) if num > max_limit]
[2, 5, 6, 8]
Old answer
Previously, I posted the mess below. The list comp above is much more manageable.
>>> next(zip(*filter(lambda i: i[1] > max_limit, enumerate(map(max, zip(a, b, c))))))
(2, 5, 6, 8)

m = lambda l: [i for i, e in enumerate(l) if e>max_limit]
indexes = sorted(set(m(a) + m(b) + m(c)))

Related

generate list of no repeated tuples and doesn't have both of (a,b) and (b,a) tuples, python

How can I generate a list of tuples whose elements are not repeated? In addition, if there is (a,b) tuple in list, (b,a) would not be generated in this list.
I use code below from here, but it doesn't provide second condition:
[tuple(i) for i in np.random.randint(5242, size=(500,2))]
I'm not sure you're going to get any kind of one-liner to do that cleanly. I'd just do something like:
import random
num_set = set()
while len(num_set) < 500:
    a, b = random.randint(0, 5242), random.randint(0, 5242)
    if (b, a) not in num_set:
        num_set.add((a, b))
num_list = list(num_set)
Looks like you're interested in something more like a set of sets rather than purely tuples. If your objects are sortable, you can use this common hack:
included_set = set()
included_list = list()
input_list = np.random.randint(5242, size=(500,2))
for (a, b) in input_list:
    sorted_version = tuple(sorted((a, b)))
    if sorted_version not in included_set:
        # Store the sorted form so that (b, a) is recognised later.
        included_set.add(sorted_version)
        included_list.append((a, b))
If your objects are not sortable, but are hashable and comparable, you could tweak the above to work anyway:
for (a, b) in input_list:
    if (a, b) not in included_set and (b, a) not in included_set:
        included_set.add((a, b))
        included_list.append((a, b))
Note that you only need to keep separate included_list and included_set if you want to retain the ordering of the input list. If not, and if you don't care about the tuple ordering (a, b), just use:
uniques = {tuple(sorted(tup)) for tup in input_list}
You could just use random.sample from the Python standard library:
nums = random.sample(range(5242), 1000)
res = [tuple(v) for v in zip(nums[::2], nums[1::2])]
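Because random.sample draws without replacement, every number in nums is distinct, so no pair can repeat, appear mirrored, or contain the same value twice. A quick Python 3 check of that claim; the seed is arbitrary and only there to make the run repeatable:

```python
import random

random.seed(0)  # arbitrary seed, only for reproducibility
nums = random.sample(range(5242), 1000)
res = list(zip(nums[::2], nums[1::2]))

# Treat each pair as unordered: (a, b) and (b, a) collapse to the same frozenset.
unordered = {frozenset(pair) for pair in res}
print(len(res), len(unordered))
```

The trade-off is that each number appears in at most one pair, which is stricter than the question requires.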

How does exact list comparison work?

I was making a code in which I needed to compare two lists for exact matches, and I found this code (I had to add print so it would output the result):
a = [1,2,3,4,5]
b = [9,8,7,6,5]
print [i for i, j in zip(a, b) if i == j]
This code outputs [5] because it prints the list, and if I change the code to
a = [1,2,3,5,4]
b = [9,8,7,6,5]
print [i for i, j in zip(a, b) if i == j]
It outputs [] because the list is empty.
This is all fine and good because it solves my list comparison problem, but I have almost no idea why or how it works. I would greatly appreciate either a detailed or partial explanation if you have one.
That's called a "list comprehension", let's break it down here:
[i for i, j in zip(a, b) if i == j] can be roughly understood as making a list of [i], where i (and j) come from zip(a, b) but only if i == j.
zip(a, b) takes two arrays, a and b, and combines them in such a way that the end result looks like [(1, 9), (2, 8), ...]
So effectively, you're processing the result of zip(a, b) and iterating over tuples i, j to return only if i == j is "truthy". Explaining "truthy" is a bit out of scope for this answer, but in this case, the expression i == j evaluates to True if i and j have the same value. I.e., 5 == 5 is True, where 5 == 4 is False.
This is a list comprehension, which is a python syntax feature. You can read about it in the python tutorial here.
It is being combined with the zip() built-in function. Documentation for that function is here.
A tl;dr is: the zip function makes pairs from both lists, and the list comprehension will filter out the pairs that don't compare equal, selecting only the pairs that match
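Spelled out as an ordinary loop, the comprehension does roughly the following. This is a Python 3 sketch, and `matches` is just an illustrative name:

```python
a = [1, 2, 3, 4, 5]
b = [9, 8, 7, 6, 5]

matches = []
# zip pairs elements by position: (1, 9), (2, 8), (3, 7), (4, 6), (5, 5)
for i, j in zip(a, b):
    if i == j:  # keep only positions where both lists hold the same value
        matches.append(i)

print(matches)  # [5]
```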
To get the code working with the random indexing of the elements, you can try the following:
a = [1,2,3,5,4]
b = [9,8,7,6,5]
print [i for i in a for j in b if i == j]
This will output:
[5]
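Note that this variant no longer compares the lists position by position; it reports values of a that occur anywhere in b. A Python 3 sketch of the difference, plus a set-based alternative (the names `same_position`, `anywhere` and `anywhere_fast` are illustrative only):

```python
a = [1, 2, 3, 5, 4]
b = [9, 8, 7, 6, 5]

# Compares a[k] with b[k] only.
same_position = [i for i, j in zip(a, b) if i == j]

# Compares every element of a with every element of b.
anywhere = [i for i in a for j in b if i == j]

# A set lookup avoids the quadratic scan. Unlike the nested loop, it lists
# a value once per occurrence in a, regardless of how often it occurs in b.
b_values = set(b)
anywhere_fast = [i for i in a if i in b_values]

print(same_position, anywhere, anywhere_fast)
```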

Fixing a 4 nested for loops in Python

So i'm trying to implement the agglomerative clustering algorithm and to check the distances between each cluster i use this:
a, b = None, None
c = max
for i in range(len(map)-1):
    for n in range(len(map[i])):
        for j in range(i+1, len(map)):
            for m in range(len(map[j])):
                # dist is distance func.
                d = dist(map[i][n], map[j][m])
                if c > d:
                    a, b, c = i, j, d
print(a, ' ', b)
return a, b
map looks like this: { 0: [[1,2,3], [2,2,2]], 1: [[3,3,3]], 2: [[4,4,4], [5,5,5]] }
What I expect from this is for each row item to compare with every row/col of every other row. So something like this:
comparisons:
[1,2,3] and [3,3,3], [1,2,3] and [4,4,4], [1,2,3] and [5,5,5], [2,2,2] and [3,3,3] and so on
When I run this it only works 1 time and fails any subsequent try after at line 6 with KeyError.
I suspect that the problem is either here or in merging clusters.
If map is a dict of values, you have a general problem with your indexing:
for m in range(len(map[j])):
You use range() to create numerical indices. However, what you need j to be in this example is a valid key of the dictionary map.
EDIT:
That is - of course - assuming that you did not use 0-based incremented integers as the keys of map, in which case you might as well have gone with a list. In general you seem to be relying on the ordering provided by a list or OrderedDict (or dict in Python 3.6+ as an implementation detail). See for j in range(i+1, len(map)): as a good example. Therefore I would advise using a list.
EDIT 2: Alternatively, create a list of the map.keys() and use it to index the map:
a, b = None, None
c = max
keys = list(map.keys())
for i in range(len(map)-1):
    for n in range(len(map[keys[i]])):
        for j in range(i+1, len(map)):
            for m in range(len(map[keys[j]])):
                #dist is distance func.
                d = dist(map[keys[i]][n], map[keys[j]][m])
                if c > d:
                    a, b, c = i, j, d
print(a, ' ', b)
return a, b
Before accessing map[j], check whether j is a valid key, like:
if j in map.keys():
    #whatever
or put it in try/except:
try:
    #...
except KeyError:
    #....
Edit:
It's better to use a for loop like this:
for i in map.keys():
    #.....
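Iterating over the keys directly can be combined with itertools so that the four nested index loops disappear. The following is only a sketch in Python 3 under stated assumptions: the names `closest_clusters` and `clusters` are made up (the latter avoids shadowing the built-in map), and `euclidean` stands in for the question's dist function, which is not shown.

```python
from itertools import combinations, product
import math

def closest_clusters(clusters, dist):
    """Return the pair of keys whose clusters contain the two closest points."""
    best_pair, best_dist = None, math.inf
    # combinations() yields each unordered pair of keys once, whatever the keys are.
    for key_i, key_j in combinations(clusters, 2):
        # product() pairs every point of one cluster with every point of the other.
        for p, q in product(clusters[key_i], clusters[key_j]):
            d = dist(p, q)
            if d < best_dist:
                best_pair, best_dist = (key_i, key_j), d
    return best_pair

def euclidean(p, q):
    # Assumed distance function; the question does not show its own dist().
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

clusters = {0: [[1, 2, 3], [2, 2, 2]], 1: [[3, 3, 3]], 2: [[4, 4, 4], [5, 5, 5]]}
print(closest_clusters(clusters, euclidean))
```

Because the function returns keys rather than positions, it keeps working after clusters are merged and the remaining keys are no longer consecutive integers.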

removing duplicate tuples in python

I have a list of 50 numbers, [0,1,2,...49] and I would like to create a list of tuples without duplicates, where i define (a,b) to be a duplicate of (b,a). Similarly, I do not want tuples of the form (a,a).
I have this:
pairs = set([])
mylist = range(0,50)
for i in mylist:
    for j in mylist:
        pairs.update([(i,j)])
set((a,b) if a<=b else (b,a) for a,b in pairs)
print len(pairs)
>>> 2500
I get 2500 whereas I expect to get, i believe, 1225 (n(n-1)/2).
What is wrong?
You want all combinations. Python provides a module, itertools, with all sorts of combinatorial utilities like this. Where you can, I would stick with using itertools; it is almost certainly faster and more memory efficient than anything you would cook up yourself. It is also battle-tested. You should not reinvent the wheel.
>>> import itertools
>>> combs = list(itertools.combinations(range(50),2))
>>> len(combs)
1225
>>>
However, as others have noted, in the case where you have a sequence (i.e. something indexable) such as a list, and you want N choose k, where k=2 the above could simply be implemented by a nested for-loop over the indices, taking care to generate your indices intelligently:
>>> numbers = list(range(50))
>>> result = []
>>> for i in range(len(numbers)):
...     for j in range(i + 1, len(numbers)):
...         result.append((numbers[i], numbers[j]))
...
>>> len(result)
1225
However, itertools.combinations takes any iterable, and also takes a second argument, r, which deals with cases where k can be something like 7 (and you don't want to write a staircase of nested loops).
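As a small Python 3 illustration of both points, changing r is all it takes to go from pairs to triples, and the input can be a generator rather than a list. The variable names here are illustrative only:

```python
from itertools import combinations

# r=2 reproduces the pairs; r=3 gives unordered triples without any extra loops.
pairs = list(combinations(range(50), 2))
triples = list(combinations(range(50), 3))

# combinations() accepts any iterable, e.g. a generator, not just a sequence.
squares = (n * n for n in range(4))
square_pairs = list(combinations(squares, 2))

print(len(pairs), len(triples), square_pairs)
```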
Your approach essentially takes the cartesian product, and then filters. This is inefficient, but if you wanted to do that, the best way is to use frozensets:
>>> combinations = set()
>>> for i in numbers:
...     for j in numbers:
...         if i != j:
...             combinations.add(frozenset([i,j]))
...
>>> len(combinations)
1225
And one more pass to make things tuples:
>>> combinations = [tuple(fz) for fz in combinations]
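One caveat with the frozenset route: sets are unordered, so tuple(fz) does not promise which element comes first. If a predictable (smaller, larger) layout matters, sorting each set is a simple fix. A Python 3 sketch, with `combos` and `pairs` as illustrative names:

```python
from itertools import combinations

numbers = range(50)
combos = {frozenset((i, j)) for i in numbers for j in numbers if i != j}

# Sort inside each pair for a fixed (small, large) layout, then sort the pairs.
pairs = sorted(tuple(sorted(fz)) for fz in combos)

print(len(pairs), pairs[:3])
print(pairs == list(combinations(numbers, 2)))
```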
Try this:
pairs = set([])
mylist = range(0,50)
for i in mylist:
    for j in mylist:
        if (i < j):
            pairs.add((i,j))
print len(pairs)
The problem in your code snippet is that you filter out unwanted values but you don't assign the result back to pairs, so the length stays the same. Also, the filter yields the wrong result because it considers (20,20) valid, for instance.
But you should just create the proper list at once:
pairs = set()
for i in range(0,50):
    for j in range(i+1,50):
        pairs.add((i,j))
print (len(pairs))
result:
1225
With that method you don't even need a set since it's guaranteed that you don't have duplicates in the first place:
pairs = []
for i in range(0,50):
    for j in range(i+1,50):
        pairs.append((i,j))
or using list comprehension:
pairs = [(i,j) for i in range(0,50) for j in range(i+1,50)]

Sort list of lists by unique reversed absolute condition

Context - developing algorithm to determine loop flows in a power flow network.
Issue:
I have a list of lists, each list represents a loop within the network determined via my algorithm. Unfortunately, the algorithm will also pick up the reversed duplicates.
i.e.
L1 = [a, b, c, -d, -a]
L2 = [a, d, c, -b, -a]
(Please note that c should not be negative, it is correct as written due to the structure of the network and defined flows)
Now these two loops are equivalent, simply following the reverse structure throughout the network.
I wish to retain L1, whilst discarding L2 from the list of lists.
Thus if I have a list of 6 loops, of which 3 are reversed duplicates I wish to retain all three.
Additionally, The loop does not have to follow the format specified above. It can be shorter, longer, and the sign structure (e.g. pos pos pos neg neg) will not occur in all instances.
I have been attempting to sort this by reversing the list and comparing the absolute values.
I am completely stumped and any assistance would be appreciated.
Based upon some of the code provided by mgibson I was able to create the following.
def Check_Dup(Loops):
    Act = []
    while Loops:
        L = Loops.pop()
        Act.append(L)
        Loops = Popper(Loops, L)
    return Act
def Popper(Loops, L):
    # Iterate over a copy: removing from a list while looping over it skips items.
    for loop in Loops[:]:
        Rev = loop[::-1]
        if all(abs(x) == abs(y) for x, y in zip(L, Rev)):
            Loops.remove(loop)
    return Loops
This code should run until there are no loops left, discarding the duplicates each time. I'm accepting mgibson's answer as it provided the necessary keys to create the solution.
I'm not sure I get your question, but reversing a list is easy:
a = [1,2]
a_rev = a[::-1] #new list -- if you just want an iterator, reversed(a) also works.
To compare the absolute values of a and a_rev:
all( abs(x) == abs(y) for x,y in zip(a,a_rev) )
which can be simplified to:
all( abs(x) == abs(y) for x,y in zip(a,reversed(a)) )
Now, in order to make this as efficient as possible, I would first sort the arrays based on the absolute value:
your_list_of_lists.sort(key=lambda x: [abs(v) for v in x])
Now you know that if two lists are going to be equal, they have to be adjacent in the list and you can just pull that out using enumerate:
def cmp_list(x,y):
    return True if x == y else all( abs(a) == abs(b) for a,b in zip(x,y) )
duplicate_idx = [ idx for idx,val in enumerate(your_list_of_lists[1:])
                  if cmp_list(val,your_list_of_lists[idx]) ]
#now remove duplicates:
for idx in reversed(duplicate_idx):
    _ = your_list_of_lists.pop(idx)
If your (sub) lists are either strictly increasing or strictly decreasing, this becomes MUCH simpler.
lists = list(set( tuple(sorted(x)) for x in your_list_of_lists ) )
I don't see how they can be equivalent if you have c in both directions - one of them must be -c
>>> a,b,c,d = range(1,5)
>>> L1 = [a, b, c, -d, -a]
>>> L2 = [a, d, -c, -b, -a]
>>> L1 == [-x for x in reversed(L2)]
True
now you can write a function to collapse those two loops into a single value
>>> def normalise(loop):
...     return min(loop, [-x for x in reversed(loop)])
...
>>> normalise(L1)
[1, 2, 3, -4, -1]
>>> normalise(L2)
[1, 2, 3, -4, -1]
A good way to eliminate duplicates is to use a set, we just need to convert the lists to tuples
>>> L=[L1, L2]
>>> set(tuple(normalise(loop)) for loop in L)
set([(1, 2, 3, -4, -1)])
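If the original orientation and order of the loops should be preserved, the same normalisation can be used as a lookup key instead of as the stored value. The following Python 3 sketch assumes this answer's convention that a reversed loop has every sign flipped (including c); `drop_reversed_duplicates` and `L3` are made-up names for illustration.

```python
def normalise(loop):
    """Pick one representative of a loop and its negated reversal."""
    mirrored = [-x for x in reversed(loop)]
    return tuple(min(loop, mirrored))

def drop_reversed_duplicates(loops):
    seen = set()
    kept = []
    for loop in loops:
        key = normalise(loop)
        if key not in seen:    # first time this loop appears in either direction
            seen.add(key)
            kept.append(loop)  # keep the loop exactly as it was written
    return kept

a, b, c, d = range(1, 5)
L1 = [a, b, c, -d, -a]
L2 = [a, d, -c, -b, -a]
L3 = [a, c, -b]
print(drop_reversed_duplicates([L1, L2, L3]))
```

Here L2 is dropped as the reversed duplicate of L1, while L3 survives because it matches neither direction of L1.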
[pair[0] for pair in frozenset(tuple(sorted((c, negReversed(c)))) for c in cycles)]
Where:
def negReversed(cycle):
    return tuple(-x for x in cycle[::-1])
and where cycles must be tuples.
This takes each cycle, computes its duplicate, and sorts them (putting them in a pair that are canonically equivalent). The set frozenset(...) uniquifies any duplicates. Then you extract the canonical element (in this case I arbitrarily chose it to be pair[0]).
Keep in mind that your algorithm might be returning cycles starting in arbitrary places. If this is the case (i.e. your algorithm might return either [1,2,-3] or [-3,1,2]), then you need to consider these as equivalent necklaces.
There are many ways to canonicalize necklaces. The above way is less efficient because we don't care about canonicalizing the necklace directly: we just treat the entire equivalence class as the canonical element, by turning each cycle (a,b,c,d,e) into {(a,b,c,d,e), (e,a,b,c,d), (d,e,a,b,c), (c,d,e,a,b), (b,c,d,e,a)}. In your case since you consider negatives to be equivalent, you would turn each cycle into {(a,b,c,d,e), (e,a,b,c,d), (d,e,a,b,c), (c,d,e,a,b), (b,c,d,e,a), (-a,-b,-c,-d,-e), (-e,-a,-b,-c,-d), (-d,-e,-a,-b,-c), (-c,-d,-e,-a,-b), (-b,-c,-d,-e,-a)}. Make sure to use frozenset for the equivalence classes, as a plain set is not hashable and so cannot be a member of another set:
[next(iter(cls)) for cls in {frozenset(eqClass(c)) for c in cycles}]
where:
def eqClass(cycle):
    for rotation in rotations(cycle):
        yield rotation
        yield tuple(-x for x in rotation)
where rotations is something like the approach in "Efficient way to shift a list in python", but yields tuples.
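For completeness, here is one possible shape for the missing rotations helper, as a Python 3 sketch. It follows this answer's notion of equivalence (any rotation, optionally with all signs flipped). The names `rotations`, `eq_class` and `unique_cycles` are hypothetical, and min() is used instead of an arbitrary pick so that the chosen representative is deterministic.

```python
def rotations(cycle):
    """Yield every rotation of cycle as a tuple (hypothetical helper)."""
    cycle = tuple(cycle)
    for k in range(len(cycle)):
        yield cycle[k:] + cycle[:k]

def eq_class(cycle):
    """Yield each rotation of cycle together with its sign-flipped twin."""
    for rotation in rotations(cycle):
        yield rotation
        yield tuple(-x for x in rotation)

def unique_cycles(cycles):
    # The smallest member of each equivalence class serves as its canonical form.
    return sorted({min(eq_class(c)) for c in cycles})

print(unique_cycles([(1, 2, -3), (-3, 1, 2), (-1, -2, 3), (1, 2, 3)]))
```

The first three input cycles are rotations or sign flips of one another and collapse to a single entry; (1, 2, 3) forms its own class.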
