Count elements in a nested list in an elegant way - python

I have nested tuples in a list like
l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')]
I want to know how many 'a' and 'b' values there are in the list in total. I currently use the following code to get the result:
amount_a_and_b = len([None for _, elem2, elem3 in l if elem2 == 'a' or elem3 == 'b'])
But this gives amount_a_and_b = 1. How do I get the right answer?
Also, is there a more elegant way (less code or higher performance or using builtins) to do this?

I'd flatten the list with itertools.chain.from_iterable() and pass it to a collections.Counter() object:
from collections import Counter
from itertools import chain
counts = Counter(chain.from_iterable(l))
amount_a_and_b = counts['a'] + counts['b']
Or use sum() to count how many times a value appears in the flattened sequence:
from itertools import chain
amount_a_and_b = sum(1 for v in chain.from_iterable(l) if v in {'a', 'b'})
The two approaches are pretty much comparable in speed on Python 3.5.1 on my Macbook Pro (OS X 10.11):
>>> from timeit import timeit
>>> from collections import Counter
>>> from itertools import chain
>>> l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')] * 1000 # make it interesting
>>> def counter():
...     counts = Counter(chain.from_iterable(l))
...     counts['a'] + counts['b']
...
>>> def summing():
...     sum(1 for v in chain.from_iterable(l) if v in {'a', 'b'})
...
>>> timeit(counter, number=1000)
0.5640139860006457
>>> timeit(summing, number=1000)
0.6066895100011607
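As for why the original expression returned 1: it tests positions (second element equal to 'a', or third element equal to 'b') and counts each matching tuple once, rather than counting occurrences of the values. A minimal sketch of the difference; the names `matches` and `total` are only for illustration:

```python
l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')]

# The original test is positional: second element 'a' OR third element 'b'.
# Only the first tuple satisfies it, and a matching tuple is counted once.
matches = [t for t in l if t[1] == 'a' or t[2] == 'b']

# Counting every element equal to 'a' or 'b', wherever it appears.
total = sum(1 for t in l for v in t if v in ('a', 'b'))

print(len(matches), total)
```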

You want to avoid putting data into a data structure you never use. The [...] syntax constructs a new list and fills it with the content you put in ..., after which the length of the list is taken and the list is discarded. If the list is very large, this uses a lot of memory, and it is inelegant in general. You can instead use iterators to loop over the existing data structure, e.g., like so:
sum(sum(c in ('a', 'b') for c in t) for t in l)
The c in ('a', 'b') predicate is a bool which evaluates to a 0 or 1 when cast to an int, causing the sum() to only count the tuple entry if the predicate evaluates to True.
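To make the bool-to-int behaviour concrete, here is a small sketch that keeps the per-tuple counts visible before adding them up. The name `per_tuple` is just illustrative:

```python
l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')]

# bool is a subclass of int, so True adds 1 and False adds 0.
per_tuple = [sum(c in ('a', 'b') for c in t) for t in l]
total = sum(per_tuple)

print(per_tuple, total)
```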

Just for fun, a functional method using reduce:
>>> l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')]
>>> from functools import reduce
>>> reduce(lambda x, y: (1 if 'a' in y else 0) + (1 if 'b' in y else 0) + x, l, 0)
4
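One caveat, as far as I can tell: `'a' in y` is a membership test, so a tuple that contains the same letter twice still contributes only 1. That happens to give the right total for the sample data. A sketch comparing it with an occurrence count, using a made-up list that contains a repeated letter:

```python
from functools import reduce

# Hypothetical data with a repeated 'a' in the first tuple.
l = [(1, 'a', 'a'), (2, 'b', 'c')]

# Membership test: each tuple contributes at most 1 per letter.
by_membership = reduce(
    lambda acc, t: acc + ('a' in t) + ('b' in t), l, 0)

# Occurrence count: tuple.count() counts every repeat.
by_count = reduce(
    lambda acc, t: acc + t.count('a') + t.count('b'), l, 0)

print(by_membership, by_count)
```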

You can iterate over both the list and the sub-lists in one list comprehension:
len([i for sub_list in l for i in sub_list if i in ("a", "b")])
I think that's fairly concise.
To avoid creating a temporary list, you could use a generator expression to create a sequence of 1s and pass that to sum:
sum(1 for sub_list in l for i in sub_list if i in ("a", "b"))

Although this question already has an accepted answer, I'm wondering why all of them are so complex. I would think that this would suffice:
>>> l = [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'e', 'a')]
>>> total = sum(tup.count('a') + tup.count('b') for tup in l)
>>> total
4
Or
>>> total = sum(1 for tup in l for v in tup if v in {'a', 'b'})

Related

Merge elements in list of lists

I have a list of lists like this:
A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]
How can I merge all the elements of each inner list to get the result A = ['baaaa', 'baaaa']?
I would prefer to do this outside of a loop, if possible, to speed up the code.
If you don't want to write a loop, you can use map and str.join:
>>> list(map(''.join, A))
['baaaa', 'baaaa']
However, the loop using a list comprehension is almost as short to write, and I think is clearer:
>>> [''.join(e) for e in A]
['baaaa', 'baaaa']
You can use str.join:
>>> ["".join(t) for t in A]
['baaaa', 'baaaa']
>>> list(map(''.join, A))  # with map
['baaaa', 'baaaa']
>>> help(str.join)
Help on method_descriptor:

join(...)
    S.join(iterable) -> str

    Return a string which is the concatenation of the strings in the
    iterable.  The separator between elements is S.
Use the join method of the empty string. This means: "make a string concatenating every element of a tuple (for example ('b', 'a', 'a', 'a', 'a')) with '' (the empty string) between each of them".
Thus, what you are looking for is:
[''.join(x) for x in A]
If you prefer functional programming, you can achieve the same result with reduce.
Note that reduce was a built-in function in Python 2, but in Python 3 it was moved to the functools module:
from functools import reduce
The import is only required on Python 3.
A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]
result = [reduce(lambda a, b: a+b, i) for i in A]
If you don't want to use a loop or even a list comprehension, here is another way:
list(map(lambda i: reduce(lambda a, b: a+b, i), A))
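For comparison, a short sketch showing that the join and reduce versions give the same result. The last line covers an assumption that is not in the question, namely tuples holding non-string items, which str.join would reject unless they are converted first:

```python
from functools import reduce

A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]

joined = [''.join(t) for t in A]

# reduce concatenates pairwise, creating an intermediate string at each step.
reduced = [reduce(lambda a, b: a + b, t) for t in A]

# join() needs strings; convert first if the tuples hold other types
# (hypothetical input, not from the question).
mixed = [''.join(map(str, t)) for t in [(1, 'a'), (2, 'b')]]

print(joined, reduced, mixed)
```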

How to split up a list of numbers and insert into another list

Currently I have a file with 6 rows of numbers and each row containing 9 numbers. The point is to test each row of numbers in the file if it completes a magic square. So for example, say a row of numbers from the file is 4 3 8 9 5 1 2 7 6. The first three numbers need to be the first row in a matrix. The next three numbers need to be the second row, and same for the third.
Therefore you would need to end up with a matrix of:
[['4','3','8'],['9','5','1'],['2','7','6']]
I need to test the matrix to see if it is a valid magic square (Rows add up to 15, columns add to 15, and diagonals add to 15).
My code is currently:
def readfile(fname):
    """Return a list of lines from the file"""
    f = open(fname, 'r')
    lines = f.read()
    lines = lines.split()
    f.close()
    return lines

def assignValues(lines):
    magicSquare = []
    rows = 3
    columns = 3
    for row in range(rows):
        magicSquare.append([0] * columns)
    for row in range(len(magicSquare)):
        for column in range(len(magicSquare[row])):
            magicSquare[row][column] = lines[column]
    return magicSquare

def main():
    lines = readfile(input_fname)
    matrix = assignValues(lines)
    print(matrix)
Whenever I run my code to test it, I'm getting:
[['4', '3', '8'], ['4', '3', '8'], ['4', '3', '8']]
So as you can see I am only getting the first 3 numbers into my matrix.
Finally, my question is: how would I go about filling my matrix with the following 6 numbers of the line? I'm not sure if it is something I can do in my loop, or if I am splitting my lines wrong, or if I am completely on the wrong track.
Thanks.
To test if each row in your input file contains magic square data you need to re-organize the code slightly. I've used a different technique to Francis to fill the matrix. It might be a bit harder to understand how zip(*[iter(seq)] * size) works, but it's a very useful pattern. Please let me know if you need an explanation for it.
My code uses a list of tuples for the matrix, rather than a list of lists, but tuples are more suitable here anyway, since the data in the matrix doesn't need to be modified. Also, I convert the input data from str into int, since you need to do arithmetic on the numbers to test if matrix is a magic square.
#! /usr/bin/env python
def make_square(seq, size):
    return zip(*[iter(seq)] * size)

def main():
    fname = 'mydata'
    size = 3
    with open(fname, 'r') as f:
        for line in f:
            nums = [int(s) for s in line.split()]
            matrix = make_square(nums, size)
            print matrix
            #Now call the function to test if the data in matrix
            #really is a magic square.
            #test_square(matrix)

if __name__ == '__main__':
    main()
Here's a modified version of make_square() that returns a list of lists instead of a list of tuples, but please bear in mind that a list of tuples is actually better than a list of lists if you don't need the mutability that lists give you.
def make_square(seq, size):
    square = zip(*[iter(seq)] * size)
    return [list(t) for t in square]
I suppose I should mention that there's actually only one possible 3 x 3 magic square that uses all the numbers from 1 to 9, not counting rotations and reflections. But I guess there's no harm in doing a brute-force demonstration of that fact. :)
Also, I have Python code that I wrote years ago (when I was first learning Python) which generates magic squares of size n x n for odd n >= 5. Let me know if you'd like to see it.
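The code above is written for Python 2, where zip() returns a list. In Python 3, zip() returns a lazy iterator, so a sketch of the same idea would wrap it in list(). The is_magic() helper below is hypothetical: it is one possible stand-in for the test_square() call that the answer leaves commented out, not the author's implementation.

```python
def make_square(seq, size):
    # Python 3: materialise the zip iterator so the rows can be reused.
    return list(zip(*[iter(seq)] * size))

def is_magic(square):
    # Hypothetical helper, not from the answer: all rows, all columns
    # and both diagonals must share the same sum.
    size = len(square)
    target = sum(square[0])
    rows = all(sum(row) == target for row in square)
    cols = all(sum(col) == target for col in zip(*square))
    diag = sum(square[i][i] for i in range(size)) == target
    anti = sum(square[i][size - 1 - i] for i in range(size)) == target
    return rows and cols and diag and anti

nums = [int(s) for s in '4 3 8 9 5 1 2 7 6'.split()]
matrix = make_square(nums, 3)
print(matrix, is_magic(matrix))
```

For a 3 x 3 square built from 1 to 9 the shared sum is 15, which matches the condition in the question.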
zip and iterator objects
Here's some code that briefly illustrates what the zip() and iter() functions do.
''' Fun with zip '''
numbers = [1, 2, 3, 4, 5, 6]
letters = ['a', 'b', 'c', 'd', 'e', 'f']
#Using zip to create a list of tuples containing pairs of elements of numbers & letters
print zip(numbers, letters)
#zip works on other iterable objects, including strings
print zip(range(1, 7), 'abcdef')
#zip can handle more than 2 iterables
print zip('abc', 'def', 'ghi', 'jkl')
#zip can be used in a for loop to process two (or more) iterables simultaneously
for n, l in zip(numbers, letters):
    print n, l
#Using zip in a list comprehension to make a list of lists
print [[l, n] for n, l in zip(numbers, letters)]
#zip stops if one of the iterables runs out of elements
print [[n, l] for n, l in zip((1, 2), letters)]
print [(n, l) for n, l in zip((3, 4), letters)]
#Turning an iterable into an iterator object using the iter function
iletters = iter(letters)
#When we take some elements from an iterator object it remembers where it's up to
#so when we take more elements from it, it continues from where it left off.
print [[n, l] for n, l in zip((1, 2, 3), iletters)]
print [(n, l) for n, l in zip((4, 5), iletters)]
#This list will just contain a single tuple because there's only 1 element left in iletters
print [(n, l) for n, l in zip((6, 7), iletters)]
#Rebuild the iletters iterator object
iletters = iter('abcdefghijkl')
#See what happens when we zip multiple copies of the same iterator object.
print zip(iletters, iletters, iletters)
#It can be convenient to put multiple copies of an iterator object into a list
iletters = iter('abcdefghijkl')
gang = [iletters] * 3
#The gang consists of 3 references to the same iterator object
print gang
#We can pass each iterator in the gang to zip as a separate argument
#by using the "splat" syntax
print zip(*gang)
#A more compact way of doing the same thing:
print zip(* [iter('abcdefghijkl')]*3)
Here's the same code running in the interactive interpreter so you can easily see the output of each statement.
>>> numbers = [1, 2, 3, 4, 5, 6]
>>> letters = ['a', 'b', 'c', 'd', 'e', 'f']
>>>
>>> #Using zip to create a list of tuples containing pairs of elements of numbers & letters
... print zip(numbers, letters)
[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f')]
>>>
>>> #zip works on other iterable objects, including strings
... print zip(range(1, 7), 'abcdef')
[(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f')]
>>>
>>> #zip can handle more than 2 iterables
... print zip('abc', 'def', 'ghi', 'jkl')
[('a', 'd', 'g', 'j'), ('b', 'e', 'h', 'k'), ('c', 'f', 'i', 'l')]
>>>
>>> #zip can be used in a for loop to process two (or more) iterables simultaneously
... for n, l in zip(numbers, letters):
...     print n, l
...
1 a
2 b
3 c
4 d
5 e
6 f
>>> #Using zip in a list comprehension to make a list of lists
... print [[l, n] for n, l in zip(numbers, letters)]
[['a', 1], ['b', 2], ['c', 3], ['d', 4], ['e', 5], ['f', 6]]
>>>
>>> #zip stops if one of the iterables runs out of elements
... print [[n, l] for n, l in zip((1, 2), letters)]
[[1, 'a'], [2, 'b']]
>>> print [(n, l) for n, l in zip((3, 4), letters)]
[(3, 'a'), (4, 'b')]
>>>
>>> #Turning an iterable into an iterator object using the iter function
... iletters = iter(letters)
>>>
>>> #When we take some elements from an iterator object it remembers where it's up to
... #so when we take more elements from it, it continues from where it left off.
... print [[n, l] for n, l in zip((1, 2, 3), iletters)]
[[1, 'a'], [2, 'b'], [3, 'c']]
>>> print [(n, l) for n, l in zip((4, 5), iletters)]
[(4, 'd'), (5, 'e')]
>>>
>>> #This list will just contain a single tuple because there's only 1 element left in iletters
... print [(n, l) for n, l in zip((6, 7), iletters)]
[(6, 'f')]
>>>
>>> #Rebuild the iletters iterator object
... iletters = iter('abcdefghijkl')
>>>
>>> #See what happens when we zip multiple copies of the same iterator object.
... print zip(iletters, iletters, iletters)
[('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i'), ('j', 'k', 'l')]
>>>
>>> #It can be convenient to put multiple copies of an iterator object into a list
... iletters = iter('abcdefghijkl')
>>> gang = [iletters] * 3
>>>
>>> #The gang consists of 3 references to the same iterator object
... print gang
[<iterator object at 0xb737eb8c>, <iterator object at 0xb737eb8c>, <iterator object at 0xb737eb8c>]
>>>
>>> #We can pass each iterator in the gang to zip as a separate argument
... #by using the "splat" syntax
... print zip(*gang)
[('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i'), ('j', 'k', 'l')]
>>>
>>> #A more compact way of doing the same thing:
... print zip(* [iter('abcdefghijkl')]*3)
[('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i'), ('j', 'k', 'l')]
>>>
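Why this groups the items: zip builds each output tuple by taking one item from each of its arguments in turn, and here every argument is the same iterator, so each tuple consumes three consecutive items. A rough Python 3 sketch of that, pulling the first two groups by hand with next(); the names `it`, `first`, `second` and `rest` are illustrative:

```python
it = iter('abcdefghijkl')

# zip(it, it, it) effectively does this for every output tuple:
first = (next(it), next(it), next(it))
second = (next(it), next(it), next(it))

# The remaining items are grouped the same way by zip itself.
# In Python 3, zip() is lazy, so list() is needed to see the result.
rest = list(zip(*[it] * 3))

print(first, second, rest)
```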
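It always gets only the first 3 values, because the index into lines depends only on column:
magicSquare[row][column] = lines[column]
The index also has to advance with the row and with the line of the file, thus: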
def assignValues(lines):
    rows = 3
    columns = 3
    squares = []
    # since the input is already split, the size of 'lines' divided by 9
    # is equal to the number of rows of numbers in the file
    for line in range(len(lines) // 9):
        magicSquare = []
        for row in range(rows):
            magicSquare.append([0] * columns)
        for row in range(len(magicSquare)):
            for column in range(len(magicSquare[row])):
                # convert to int so the sums can be tested later
                magicSquare[row][column] = int(lines[(9*line)+(3*row)+column])
        squares.append(magicSquare)
    return squares  # one 3x3 matrix per line of the file
Note that (3*row)+column moves the index 3 positions to the right for each row of the square,
and that (9*line)+(3*row)+column additionally moves it 9 positions (a whole line of the file) to the right for each line.
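The index arithmetic can be checked in isolation. This is a minimal sketch assuming a made-up input of two lines that has already been split into tokens; `square_at` is a hypothetical helper, not part of the answer:

```python
# Hypothetical input: two lines of the file, already split into tokens.
lines = '4 3 8 9 5 1 2 7 6 1 2 3 4 5 6 7 8 9'.split()

def square_at(lines, line):
    # 9*line skips whole squares, 3*row skips rows inside a square.
    return [[int(lines[9 * line + 3 * row + column]) for column in range(3)]
            for row in range(3)]

first = square_at(lines, 0)
second = square_at(lines, 1)
print(first, second)
```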
Once you have this, you are ready to test for a magic square:
def testMagicSquare(matrix):
    # matrix is a list of 3x3 squares, one per line of the file
    for a in range(len(matrix)):
        square = matrix[a]
        test1 = 1  # stays true only if every row sums to 15
        test2 = 1  # stays true only if every column sums to 15
        for b in range(3):
            if sum(square[b]) != 15:
                test1 = 0
            if square[0][b] + square[1][b] + square[2][b] != 15:
                test2 = 0
        # flag true only if both diagonals sum to 15
        if (square[0][0] + square[1][1] + square[2][2] == 15 and
                square[0][2] + square[1][1] + square[2][0] == 15):
            test3 = 1
        else:
            test3 = 0
        if test1 > 0 and test2 > 0 and test3 > 0:
            print('line ' + str(a) + ' is a magic square')
        else:
            print('line ' + str(a) + ' is not a magic square')

finding and moving a tuple in a list of tuples

What is the most efficient way of finding a certain tuple in a list, based on e.g. the second element of that tuple, and moving that tuple to the top of the list?
Something of the form:
LL=[('a','a'),('a','b'),('a','c'),('a','d')]
LL.insert(0,LL.pop(LL.index( ... )))
where I would like something in index() that would give me the position of the tuple that has 'c' as second element.
Is there a classic python 1-line approach to do that?
>>> LL.insert(0,LL.pop([x for x, y in enumerate(LL) if y[1] == 'c'][0]))
>>> LL
[('a', 'c'), ('a', 'a'), ('a', 'b'), ('a', 'd')]
>>>
To find the position you can use:
positions = [i for i, tup in enumerate(LL) if tup[1] == 'c']
You can now take the index of the desired element, pop it and push to the beginning of the list
pos = positions[0]
LL.insert(0, LL.pop(pos))
But you can also sort your list using the item in the tuple as key:
sorted(LL, key=lambda tup: tup[1] == 'c', reverse=True)
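Python's sort is stable, so the other elements keep their relative order, but note that sorted() returns a new list and sorts the whole list just to move one element.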
This takes 2 lines; the one-line solutions above are all less efficient:
>>> LL=[('a','a'),('a','b'),('a','c'),('a','d')]
>>> i = next((i for i, (x, y) in enumerate(LL) if y == 'c'), 0) # 0 default index
>>> LL[0], LL[i] = LL[i], LL[0]
>>> LL
[('a', 'c'), ('a', 'b'), ('a', 'a'), ('a', 'd')]
This does nothing if the index is not found
>>> LL=[('a','a'),('a','b'),('a','c'),('a','d')]
>>> i = next((i for i, (x, y) in enumerate(LL) if y == 'e'), 0) # 0 default index
>>> LL[0], LL[i] = LL[i], LL[0]
>>> LL
[('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'd')]
The problem with terse, pythonic, 'fancy schmancy' solutions is the code might not be easily maintained and/or reused in other closely aligned contexts.
It seems best to just use 'boiler plate' code to do the search, and then continue with the application specific requirements.
So here is an example of easy to understand search code that can be easily 'plugged into' when these questions come up, including those situations when we need to know if the key is found.
def searchTupleList(list_of_tuples, coord_value, coord_index):
    for i in range(0, len(list_of_tuples)):
        if list_of_tuples[i][coord_index] == coord_value:
            return i # matching index in list
    return -1 # not found
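A possible way to plug this search into the original task is sketched below. `move_to_front` is a hypothetical wrapper, not part of the answer; it moves the first match to position 0 in place and reports whether the key was found:

```python
def searchTupleList(list_of_tuples, coord_value, coord_index):
    for i, tup in enumerate(list_of_tuples):
        if tup[coord_index] == coord_value:
            return i   # matching index in list
    return -1          # not found

def move_to_front(list_of_tuples, coord_value, coord_index):
    # Hypothetical wrapper: moves the first match to position 0 in place
    # and reports whether anything was found.
    i = searchTupleList(list_of_tuples, coord_value, coord_index)
    if i == -1:
        return False
    list_of_tuples.insert(0, list_of_tuples.pop(i))
    return True

LL = [('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'd')]
found = move_to_front(LL, 'c', 1)
missing = move_to_front(LL, 'e', 1)
print(LL, found, missing)
```

Unlike the swap-based version, insert/pop keeps the remaining tuples in their original order.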

1d list indexing python: enhance MaskableList

A common problem of mine is the following:
As input I have (n is some int >1)
W = numpy.array(...)
L = list(...)
where
len(W) == n
>> true
shape(L)[0] == n
>> true
And I want to sort the list L according to the values of W and a comparator. My idea was to do the following:
def my_zip_sort(W,L):
    srt = argsort(W)
    return zip(L[srt],W[srt])
This should work like this:
a = ['a', 'b', 'c', 'd']
b = zeros(4)
b[0]=3;b[1]=2;b[2]=1;b[3]=4
my_zip_sort(b,a)
>> [('c',1),('b',2),('a',3),('d',4)]
But this does not work, because a plain list cannot be indexed with an array:
TypeError: only integer arrays with one element can be converted to an index
Thus, I need to do another loop:
def my_zip_sort(W,L):
    srt = argsort(W)
    res = list()
    for i in range(len(L)):
        res.append((L[srt[i]],W[srt[i]]))
    return res
I found a thread about a MaskableList, but this does not work for me (as you can read in the comments), because I would not only need to hold or discard particular values of my list, but also need to re-order them:
a.__class__
>> msk.MaskableList
srt = argsort(b)
a[srt]
>> ['a', 'b', 'd']
Concluding:
I want to find a way to sort a list of objects by constraints in an array. I found a way myself, which is kind of nice except for the list indexing. Can you help me write a class that works similarly to MaskableList for this task and performs well?
You don't need to extend list to avoid the for-loop. A list comprehension is sufficient and probably the best you can do here, if you expect a new list of tuples:
def my_zip_sort(W, L):
    srt = argsort(W)
    return [(L[i], W[i]) for i in srt]
Example:
import numpy as np
n = 5
W = np.random.randint(10, size=n)
L = [chr(ord('A') + i) for i in W]
L # => ['A', 'C', 'H', 'G', 'C']
srt = np.argsort(W)
result = [(L[i], W[i]) for i in srt]
print(result)
# => [('A', 0), ('C', 2), ('C', 2), ('G', 6), ('H', 7)]
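If NumPy is not a requirement, the same idea can be sketched in plain Python. This swaps numpy.argsort for sorted() over the index range, which is an assumption of this example rather than something from the answer:

```python
def my_zip_sort(W, L):
    # Plain-Python stand-in for numpy.argsort: indices that would sort W.
    srt = sorted(range(len(W)), key=W.__getitem__)
    return [(L[i], W[i]) for i in srt]

W = [3, 2, 1, 4]
L = ['a', 'b', 'c', 'd']
result = my_zip_sort(W, L)
print(result)
```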

Building a list of tuples from two lists

I wrote this function:
def buildAllPairs(l1, l2):
    l=[]
    for s in l1:
        for p in l2:
            l.append((s, p))
    return l
but it works only when I use numbers in the lists; with letters I get a NameError. Could somebody tell me why this is happening?
Use the itertools.product function:
>>> import itertools
>>> list(itertools.product([1, 'a'], [2, 'b']))
[(1, 2), (1, 'b'), ('a', 2), ('a', 'b')]
Note that itertools.product() itself returns an itertools.product object, essentially a generator, instead of a list.
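The function in the question looks fine as written. A NameError with letters most likely means they were passed without quotes, so Python looked them up as variable names. That is an assumption, since the failing call is not shown. A short sketch:

```python
import itertools

def buildAllPairs(l1, l2):
    return [(s, p) for s in l1 for p in l2]

# Letters must be quoted strings; buildAllPairs([a, b], [1, 2]) would
# raise NameError unless variables named a and b exist.
pairs = buildAllPairs(['a', 'b'], [1, 2])

# itertools.product yields the same pairs lazily, in the same order.
same = list(itertools.product(['a', 'b'], [1, 2]))

print(pairs)
```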
