Split list into dictionary - python

I am trying to split my list into a dictionary and I wrote a function for it. It takes in a list and gets the list length. If the length of the list is 253 , it creates a dictionary with 26 keys - this is calculated by rounding up the number to a higher 10 (250 will create 25 keys, 251-259 will create 26 keys). Each key will store a chuck of the original list (as of now I am storing 10 elements per list). I feel
def limit_files(file_list, at_a_time=10):
l = file_list
n = len(l)
d, r = divmod(n, at_a_time)
num_keys = d + 1 if r else d
slice = n // num_keys
vals = (l[i:i+slice] for i in range(0, n, slice+1))
dct = dict(zip(range(1,num_keys+1),vals))
return (dct)
I just wanted to know if there is way to improve the code

You can use itertools.izip_longest to group items in your list as equally divided chunks.
import itertools
def limit_files(file_list, at_a_time=10):
d, r = divmod(len(file_list), at_a_time)
num_keys = d + 1 if r > 0 else d
chunks = itertools.izip_longest(*([iter(x)] * at_a_time))
return dict((x, y) for x, y in enumerate(chunks))
Note that, it will be padding with None values to fill out the last chunk.
>>> limit_files([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
{0: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
1: (11, 12, None, None, None, None, None, None, None, None)}
Alternatively, without using padding values, you can just iterate through the range and call itertools.islice until the iterator exhausts as follows:
def limit_files(file_list, at_a_time=10):
d, r = divmod(len(file_list), at_a_time)
num_keys = d + 1 if r > 0 else d
iterator = iter(file_list)
dictionary = {}
for chunk_id in xrange(num_keys):
chunk = list(itertools.islice(iterator, at_a_time))
if not chunk:
break
dictionary[chunk_id] = chunk
return dictionary
>>> limit_files([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
{0: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 1: [11, 12]}

Related

Watching a counter, tally total and counting missed counts

I am attempting to create a piece of code that will watch a counter with an output something like:
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]
I want the code to be able to tally the total and tell me how many counts are missed for example if this happened:
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24, 25, 26, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,1,2]
I would get a total of 92 still, but get feedback that 8 are missing.
I have gotten very close with the following code:
Blk_Tot = 0
CBN = 0
LBN = 0
x = 0
y = 0
z = 0
MissedBlocks = 0
for i in range(len(a1)):
CBN = a1[i]
if CBN - LBN <= 0:
if LBN == 30:
y = 30 - abs(CBN - LBN)
elif LBN < 30:
z = 30 - LBN
y = 30 - abs(CBN - LBN) + z
print(z)
Blk_Tot = Blk_Tot + y
else:
x = CBN - LBN
Blk_Tot = Blk_Tot + x
if x > 1:
MissedBlocks = MissedBlocks - 1 + x
LBN = CBN
print(Blk_Tot)
print(MissedBlocks)
If I delete anywhere between 1 and 30 it works perfectly, however if I delete across 30, say 29,30,1,2 it breaks.I don't expect it to be able to miss 30 in a row and still be able to come up with a proper total however.
Anyone have any ideas on how this might be achieved? I feel like I am missing an obvious answer :D
Sorry I think I was unclear, a1 is a counter coming from an external device that counts from 1 to 30 and then wraps around to 1 again. Each count is actually part of a message to show that the message was received; so say 1 2 4, I know that the 3rd message is missing. What I am trying to do is found out the total that should have been recieved and how many are missing from the count.
Update after an idea from the posts below, another method of doing this maybe:
Input:
123456
List[1,2,3,4,5,6]
1.Check first input to see which part of the list it is in and start from there (in case we don't start from zero)
2.every time an input is received check if that matches the next value in the array
3.if not then how many steps does it take to find that value
You don't need to keep track if you past the 30 line.
Just compare with the ideal sequence and count the missing numbers.
No knowledge if parts missing at the end.
No knowledge if more than 30 parts are missing in a block.
from itertools import cycle
def idealSeqGen():
for i in cycle(range(1,31)):
yield(i)
def receivedSeqGen():
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,
5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1,2]
for i in a1:
yield(i)
receivedSeq = receivedSeqGen()
idealSeq = idealSeqGen()
missing = 0
ideal = next(idealSeq)
try:
while True:
received = next(receivedSeq)
while received != ideal:
missing += 1
ideal = next(idealSeq)
ideal = next(idealSeq)
except StopIteration:
pass
print (f'There are {missing} items missing')
Edit
The loop part can be a little bit simpler
missing = 0
try:
while True:
ideal = next(idealSeq)
received = next(receivedSeq)
while received != ideal:
missing += 1
ideal = next(idealSeq)
except StopIteration:
pass
print (f'There are {missing} items missing')
In general, if you want to count the number of differences between two lists, you can easily use a dictionary. The other answer would also work, but it is highly inefficient for even slightly larger lists.
def counter(lst):
# create a dictionary with count of each element
d = {}
for i in lst:
if d.get(i, None):
d[i] += 1
else:
d[i] = 1
return d
def compare(d1, d2):
# d1 and d2 are dictionaries
ans = 0
for i in d1.values():
if d2.get(i, None):
# comapares the common values in both lists
ans += abs(d1[i]-d2[i])
d2[i] = 0
else:
#for elements only in the first list
ans += d1[i]
for i in d2.values():
# for elements only in the second list
if d2[i]>0:
ans += d2[i]
return ans
l1 = [...]
l2 = [...]
print(compare(counter(l1), counter(l2)))
New code to check for missing elements from a repeating sequence pattern
Now that I have understood your question more clearly, here's the code. The assumption in this code is the list will always be in ascending order from 1 thru 30 and then repeats again from 1. There can be missing elements between 1 and 30 but the order will always be in ascending order between 1 and 30.
If the source data is as shown in list a1, then the code will result in 8 missing elements.
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,
5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1,2]
a2 = a1.copy()
c = 1
missing = 0
while a2:
if a2[0] == c:
c+=1
a2.pop(0)
elif a2[0] > c:
missing +=1
c+=1
elif a2[0] < c:
missing += 31-c
c = 1
if c == 31: c=1
print (f'There are {missing} items missing in the list')
The output of this will be:
There are 8 items missing in the list
Let me know if this addresses your question
earlier code to compare two lists
You cannot use set as the items are repeated. So you need to sort them and find out how many times each element is in both lists. The difference will give you the missing counts. You may have an element in a1 but not in a2 or vice versa. So finding out the absolute count of missing items will give you the results.
I will update the response with better variables in my next update.
Here's how I did it:
code with comments:
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]
a2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,
5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,1,2]
#step 1: Find out which list is longer. We will use that as the master list
if len(a1) > len(a2):
master_list = a1.copy()
second_list = a2.copy()
else:
master_list = a2.copy()
second_list = a1.copy()
#step 2: We must sort both master and second list
# so we can compare against each other
master_list.sort()
second_list.sort()
#set the counter to zero
missing = 0
#iterate through the master list and find all values in master against second list
#for each iteration, remove the value in master[0] from both master and second list
#when you have iterated through the full list, you will get an empty master_list
#this will help you to use while statement to iterate until master_list is empty
while master_list:
#pick the first element of master list to search for
x = master_list[0]
#count the number of times master_list[0] is found in both master and second list
a_count = master_list.count(x)
b_count = second_list.count(x)
#absolute difference of both gives you how many are missing from each other
#master may have 4 occurrences and second may have 2 occurrences. abs diff is 2
#master may have 2 occurrences and second may have 5 occurrences. abs diff is 3
missing += abs(a_count - b_count)
#now remove all occurrences of master_list[0] from both master and second list
master_list = [i for i in master_list if i != x]
second_list = [i for i in second_list if i != x]
#iterate until master_list is empty
#you may end up with a few items in second_list that are not found in master list
#add them to the missing items list
#thats your absolute total of all missing items between lists a1 and a2
#if you want to know the difference between the bigger list and shorter one,
#then don't add the missing items from second list
missing += len(second_list)
#now print the count of missig elements between the two lists
print ('Total number of missing elements are:', missing)
The output from this is:
Total number of missing elements are: 7
If you want to find out which elements are missing, then you need to add a few more lines of code.
In the above example, elements 27,28,29,30, 4, 5 are missing from a2 and 31 from a1. So total number of missing elements is 7.
code without comments:
a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30]
a2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,
5, 6, 7, 8, 9, 10, 11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,1,2]
if len(a1) > len(a2):
master_list = a1.copy()
second_list = a2.copy()
else:
master_list = a2.copy()
second_list = a1.copy()
master_list.sort()
second_list.sort()
missing = 0
while master_list:
x = master_list[0]
a_count = master_list.count(x)
b_count = second_list.count(x)
missing += abs(a_count - b_count)
master_list = [i for i in master_list if i != x]
second_list = [i for i in second_list if i != x]
missing += len(second_list)
print ('Total number of missing elements are:', missing)

How to account for ties in returning top n values?

I have a dictionary of top values that I have collected from Twitter, here's a small example:
{'aberdeen': 5,
'amsterdam': 6,
'amsterdam?fussa': 6,
'anchorage': 12,
'andalucia?granada': 5,
'ann arbor': 6,
'aral': 6,
'arlington': 6,
'asia?london': 6,
I have two functions of returning the top 'n' results:
def top_items(item_counts, n=3):
counts = Counter(item_counts)
most_common = counts.most_common(n)
for item in zip(*most_common):
return list(item)
and
def top_items2(item_counts, n=3):
top_count = sorted(item_counts.items(), key=lambda x: x[1], reverse=True)[:n]
return [x[0] for x in top_count]
Both retrieve the top n results. However, I'm having issues trying to account for ties. In this example, the top cities are ['los angeles', 'chicago', 'denver'], however denver and nyc both have a count of 8, but it returns denver because its first alphabetically.
How could I edit my code to return the keys of my dictionary of the top n values including ties?
Thanks!
def top_items(item_counts, n=3):
counts = Counter(item_counts)
top_n_values = set([x[1] for x in c.most_common(n)])
output = {k:v for k,v in d.items() if v in top_n_values}
return sorted(output, key=output.get, reverse=True)
d = {'aberdeen': 5,
'amsterdam': 6,
'amsterdam?fussa': 6,
'anchorage': 12,
'andalucia?granada': 5,
'ann arbor': 6,
'aral': 6,
'arlington': 6,
'asia?london': 6,}
top_items(d)
Output
['anchorage',
'amsterdam',
'amsterdam?fussa',
'ann arbor',
'aral',
'arlington',
'asia?london']
You can first find the top values by sorting and compare them to all the values:
di = {'aberdeen': 5,
'amsterdam': 6,
'amsterdam?fussa': 6,
'anchorage': 12,
'andalucia?granada': 5,
'ann arbor': 6,
'aral': 6,
'arlington': 6,
'asia?london': 6
}
def getTopResults(di: dict, n):
final = {}
topValues = sorted(di.values())[:n+1]
for k, v in di.items():
if v in topValues:
final[k] = v
return final
print(getTopResults(di, 2)) # {'aberdeen': 5, 'amsterdam': 6, 'amsterdam?fussa': 6, 'andalucia?granada': 5, 'ann arbor': 6, 'aral': 6, 'arlington': 6, 'asia?london': 6}
This function does same in a recursive way, assuming input list is already sorted in asc or desc order:
get_next(3, [])
get_next(3, [1,2,3,4,5])
get_next(3, [1,1,1,1,1,2,2,3,4,5,6])
get_next(3, [1,2,2,2,2,3,4,5])
def get_next(n,xs, last=None):
if len(xs)>0:
head, *tail=xs
if head == last:
n+=1 # need to go 1 further if tie
if n>0:
return [head] + get_next(n-1, tail, head)
return []

Python adding a list to a slice of another list

Here's basic problem:
>>> listb = [ 1, 2, 3, 4, 5, 6, 7 ]
>>> slicea = slice(2,5)
>>> listb[slicea]
[3, 4, 5]
>>> lista = listb[slicea]
>>> lista
[3, 4, 5]
>>> listb[slicea] += lista
>>> listb
[1, 2, 3, 4, 5, 3, 4, 5, 6, 7]
listb should be
[1, 2, 6, 8, 10, 6, 7]
But 3, 4, 5 was inserted after 3, 4, 5 not added to it.
tl;dr
I have this code that's not working:
self.lib_tree.item(song)['values'][select_values] = adj_list
self.lib_tree.item(album)['values'][select_values] += adj_list
self.lib_tree.item(artist)['values'][select_values] += adj_list
The full code is this:
def toggle_select(self, song, album, artist):
# 'values' 0=Access, 1=Size, 2=Selected Size, 3=StatTime, 4=StatSize,
# 5=Count, 6=Seconds, 7=SelSize, 8=SelCount, 9=SelSeconds
# Set slice to StatSize, Count, Seconds
total_values = slice(4, 7) # start at index, stop before index
select_values = slice(7, 10) # start at index, stop before index
tags = self.lib_tree.item(song)['tags']
if "songsel" in tags:
# We will toggle off and subtract from selected parent totals
tags.remove("songsel")
self.lib_tree.item(song, tags=(tags))
# Get StatSize, Count and Seconds
adj_list = [element * -1 for element in \
self.lib_tree.item(song)['values'][total_values]]
else:
tags.append("songsel")
self.lib_tree.item(song, tags=(tags))
# Get StatSize, Count and Seconds
adj_list = self.lib_tree.item(song)['values'][total_values] # 1 past
self.lib_tree.item(song)['values'][select_values] = adj_list
self.lib_tree.item(album)['values'][select_values] += adj_list
self.lib_tree.item(artist)['values'][select_values] += adj_list
if self.debug_toggle < 10:
self.debug_toggle += 1
print('artist,album,song:',self.lib_tree.item(artist, 'text'), \
self.lib_tree.item(album, 'text'), \
self.lib_tree.item(song, 'text'))
print('adj_list:',adj_list)
The adj_list has the correct values showing up in debug.
How do I add a list of values to the slice of a list?
The behavior you want is not a feature of any Python built-in type; + with built-in sequences means concatenation, not element-wise addition. But numpy arrays will do what you want, so I'd suggest looking into numpy. Simple example:
>>> import numpy as np
>>> a = np.array([2,3,4], dtype=np.int64)
>>> b = np.array([5,6,7], dtype=np.int64)
>>> a += b
>>> a
array([ 7, 9, 11])
>>> print(a)
[ 7 9 11]
>>> print(a.tolist())
[7, 9, 11]
Note that the output (both repr and str forms) looks a little different from Python lists, but you can convert back to a plain Python list if needed.

Various list concatenation method and their performance

I was working on an algorithm and in that, we are trying to write every line in the code such that it adds up a good performance to the final code.
In one situation we have to add lists (more than two specifically). I know some of the ways to join more than two lists also I have looked upon StackOverflow but none of the answers are giving account on the performance of the method.
Can anyone show, what are the ways we can join more than two lists and their respective performance?
Edit : The size of the list is varying from 2 to 13 (to be specific).
Edit Duplicate : I have been specifically asking for the ways we can add and their respected questions and in duplicate question its limited to only 4 methods
There are multiples ways using which you can join more than two list.
Assuming that we have three list,
a = ['1']
b = ['2']
c = ['3']
Then, for joining two or more lists in python,
1)
You can simply concatenate them,
output = a + b + c
2)
You can do it using list comprehension as well,
res_list = [y for x in [a,b,c] for y in x]
3)
You can do it using extend() as well,
a.extend(b)
a.extend(c)
print(a)
4)
You can do it by using * operator as well,
res = [*a,*b,*c]
For calculating performance, I have used timeit module present in python.
The performance of the following methods are;
4th method < 1st method < 3rd method < 2nd [method on the basis of
time]
That means If you are going to use " * operator " for concatenation of more than two lists then you will get the best performance.
Hope you got what you were looking for.
Edit:: Image showing performance of all the methods (Calculated using timeit)
I did some simple measurements, here are my results:
import timeit
from itertools import chain
a = [*range(1, 10)]
b = [*range(1, 10)]
c = [*range(1, 10)]
tests = ("""output = list(chain(a, b, c))""",
"""output = a + b + c""",
"""output = [*chain(a, b, c)]""",
"""output = a.copy();output.extend(b);output.extend(c);""",
"""output = [*a, *b, *c]""",
"""output = a.copy();output+=b;output+=c;""",
"""output = a.copy();output+=[*b, *c]""",
"""output = a.copy();output += b + c""")
results = sorted((timeit.timeit(stmt=test, number=1, globals=globals()), test) for test in tests)
for i, (t, stmt) in enumerate(results, 1):
print(f'{i}.\t{t}\t{stmt}')
Prints on my machine (AMD 2400G, Python 3.6.7):
1. 6.010000106471125e-07 output = [*a, *b, *c]
2. 7.109999842214165e-07 output = a.copy();output += b + c
3. 7.720000212430023e-07 output = a.copy();output+=b;output+=c;
4. 7.820001428626711e-07 output = a + b + c
5. 1.0520000159885967e-06 output = a.copy();output+=[*b, *c]
6. 1.4030001693754457e-06 output = a.copy();output.extend(b);output.extend(c);
7. 1.4820000160398195e-06 output = [*chain(a, b, c)]
8. 2.525000127207022e-06 output = list(chain(a, b, c))
If you are going to concatenate a variable number of lists together, your input is going to be a list of lists (or some equivalent collection). The performance tests need to take this into account because you are not going to be able to do things like list1+list2+list3.
Here are some test results (1000 repetitions):
option1 += loop 0.00097 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
option2 itertools.chain 0.00138 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
option3 functools.reduce 0.00174 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
option4 comprehension 0.00188 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
option5 extend loop 0.00127 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
option6 deque 0.00180 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
This would indicate that a += loop through the list of lists is the fastest approach
And the source to produce them:
allLists = [ list(range(10)) for _ in range(5) ]
def option1():
result = allLists[0].copy()
for lst in allLists[1:]:
result += lst
return result
from itertools import chain
def option2(): return list(chain(*allLists))
from functools import reduce
def option3():
return list(reduce(lambda a,b:a+b,allLists))
def option4(): return [ e for l in allLists for e in l ]
def option5():
result = allLists[0].copy()
for lst in allLists[1:]:
result.extend(lst)
return result
from collections import deque
def option6():
result = deque()
for lst in allLists:
result.extend(lst)
return list(result)
from timeit import timeit
count = 1000
t = timeit(lambda:option1(), number = count)
print(f"option1 += loop {t:.5f}",option1()[:15])
t = timeit(lambda:option2(), number = count)
print(f"option2 itertools.chain {t:.5f}",option2()[:15])
t = timeit(lambda:option3(), number = count)
print(f"option3 functools.reduce {t:.5f}",option3()[:15])
t = timeit(lambda:option4(), number = count)
print(f"option4 comprehension {t:.5f}",option4()[:15])
t = timeit(lambda:option5(), number = count)
print(f"option5 extend loop {t:.5f}",option5()[:15])
t = timeit(lambda:option6(), number = count)
print(f"option6 deque {t:.5f}",option6()[:15])

Compare equal length lists to find positions that share the same element

I want to compare a list of lists that have the same length, but differ in their content. My script should return only the positions that share exactly the same element (in all lists).
For example:
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
and as a result I get a list of positions p = [3,7] as in all list we have '4' and '8' at positions 3 and 7, respectively.
These elements can be strings as well, I'm just giving an example with integers. Thanks for any help!
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
p = [i for i, j in enumerate(zip(*l)) if all(j[0]==k for k in j[1:])]
# p == [3] - because of some typo in your original list, probably too many elements in the second list.
This is just the one-liner (list comprehension) version of this, more verbose:
p = []
for i, j in enumerate(zip(*l)):
if all(j[0]==k for k in j[1:]):
p.append(i)
zip(*l) gives you:
[(1, 9, 5, 0),
(2, 8, 6, 0),
(3, 8, 7, 1),
(4, 4, 4, 4),
(5, 3, 9, 7),
(6, 4, 9, 6),
(7, 5, 9, 3),
(8, 7, 8, 8)]
enumerate() puts numbers 0, 1, 2, ... to each tuple within that list.
all(j[0]==k for k in j[1:]) compares the first element of the tuple with all remaining elements and returns True if all of them are equal, False otherwise (it returns False as soon as it finds a different element, so it's faster)
I liked eumiro solution, but I did with a set
p = [i for i, j in enumerate(zip(*l)) if len(set(j)) == 1]
l = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,4,5,7,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
r = []
for i in range(len(l[0])):
e = l[0][i]
same = True
for j in range(1, len(l)):
if e != l[j][i]:
same = False
break
if same:
r.append(i)
print r
prints only [3], as l[1] does not have 8 at position 7. It have one more element.
li = [[1,2,3,4,5,6,7,8],[9,8,8,4,3,6,5,8],[5,6,7,4,9,9,9,8],[0,0,1,4,7,6,3,8]]
first = li[0]
r = range(len(first))
for current in li[1:]:
r = [ i for i in r if current[i]==first[i]]
print [first[i] for i in r]
result
[4, 8]
.
Comparing execution's times:
from time import clock
li = [[1,2,3,4,5,6,7,8,9,10],
[9,8,8,4,5,6,5,8,9,13],
[5,6,7,4,9,9,9,8,9,12],
[0,0,1,4,7,6,3,8,9,5]]
n = 10000
te = clock()
for turn in xrange(n):
first = li[0]
r = range(len(first))
for current in li[1:]:
r = [ i for i in r if current[i]==first[i]]
x = [first[i] for i in r]
t1 = clock()-te
print 't1 =',t1
print x
te = clock()
for turn in xrange(n):
y = [j[0] for i, j in enumerate(zip(*li)) if all(j[0]==k for k in j[1:])]
t2 = clock()-te
print 't2 =',t2
print y
print 't2/t1 =',t2/t1
print
result
t1 = 0.176347273187
[4, 8, 9]
t2 = 0.579408755442
[4, 8, 9]
t2/t1 = 3.28561221827
.
With
li = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,2,22,26,24,25],
[9,8,8,4,5,6,5,8,9,13,18,12,15,14,15,15,4,16,19,20,2,158,35,24,13],
[5,6,7,4,9,9,9,8,9,12,45,12,4,19,15,20,24,18,19,20,2,58,23,24,25],
[0,0,1,4,7,6,3,8,9,5,12,12,12,15,15,15,5,3,14,20,9,18,28,24,14]]
result
t1 = 0.343173188632
[4, 8, 9, 12, 15, 20, 24]
t2 = 1.21259110432
[4, 8, 9, 12, 15, 20, 24]
t2/t1 = 3.53346690385

Categories