Compare rotated lists in Python

I'm trying to compare two lists to determine if one is a rotation (cyclic permutation) of the other, e.g.:
a = [1, 2, 3]
b = [1, 2, 3] or [2, 3, 1] or [3, 1, 2]
are all matches, whereas:
b = [3, 2, 1] is not
To do this I've got the following code:
def _matching_lists(a, b):
    return not [i for i, j in zip(a,b) if i != j]
def _compare_rotated_lists(a, b):
    rotations = [b[i:] + b[:i] for i in range(len(b))]
    matches = [i for i in range(len(rotations)) if _matching_lists(a, rotations[i])]
    return matches
This builds a list of all possible rotations of b and then compares each one. Is it possible to do this without building the intermediate list? Performance isn't important since the lists will typically only be four items long. My primary concern is clarity of code.
The lists will always have the same length.
Best answer (keeping the list of matching rotations) seems to be:
def _compare_rotated_lists(a, b):
    return [i for i in range(len(b)) if a == b[i:] + b[:i]]

You don't need the function _matching_lists, as you can just use ==:
>>> [1,2,3] == [1,2,3]
True
>>> [1,2,3] == [3,1,2]
False
I suggest using any() to return as soon as a match is found, and using a generator expression to avoid constructing a list of rotations in memory:
def _compare_rotated_lists(a, b):
    """Return `True` if the list `a` is equal to a rotation of the list `b`."""
    return any(a == b[i:] + b[:i] for i in range(len(b)))
You might consider checking that the lists are the same length, to reject the easy case quickly.
return len(a) == len(b) and any(a == b[i:] + b[:i] for i in range(len(b)))
As discussed in comments, if you know that the elements of a and b are hashable, you can do the initial comparison using collections.Counter:
return Counter(a) == Counter(b) and any(a == b[i:] + b[:i] for i in range(len(b)))
and if you know that the elements of a and b are comparable, you can do the initial comparison using sorted:
return sorted(a) == sorted(b) and any(a == b[i:] + b[:i] for i in range(len(b)))
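Putting the pieces of this answer together, a minimal self-contained sketch might look like the following. The name is_rotation and the decision to treat two empty lists as rotations of each other are assumptions made for illustration, not part of the answer above.

```python
def is_rotation(a, b):
    """Return True if list `a` equals some rotation of list `b`."""
    # Cheap rejection first: lists of different length can never match.
    if len(a) != len(b):
        return False
    # Assumption: two empty lists count as rotations of each other.
    if not a:
        return True
    # any() stops at the first matching rotation; no list of rotations is built.
    return any(a == b[i:] + b[:i] for i in range(len(b)))


print(is_rotation([1, 2, 3], [2, 3, 1]))  # True
print(is_rotation([1, 2, 3], [3, 2, 1]))  # False
```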

If I understood correctly, you want to find out whether b is a permutation of a, but not its reverse? There's a very simple, readable, and general solution:
>>> from itertools import permutations
>>> a = (1, 2, 3)
>>> b = (3, 1, 2)
>>> c = (3, 2, 1)
>>> results = set(permutations(a)) - set((a, tuple(sorted(a, reverse=True))))
>>> b in results
True
>>> c in results
False

How about:
def canon(seq):
    n = seq.index(min(seq))
    return seq[n:] + seq[:n]
def is_rotation(a, b):
    return canon(a) == canon(b)
print(is_rotation('abcd', 'cdab')) # True
print(is_rotation('abcd', 'cdba')) # False
No need to generate all rotations just to find out if two lists are rotations of each other.
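One caveat worth sketching: canon picks the first occurrence of the minimum, so it can give different results for two rotations of the same sequence when the minimum value appears more than once. The helper canon_smallest below is a hypothetical alternative that uses the lexicographically smallest rotation as the canonical form; it does build every rotation, so it trades the shortcut for correctness with duplicates.

```python
def canon(seq):
    # Same as the answer above: rotate so the first minimum comes first.
    n = seq.index(min(seq))
    return seq[n:] + seq[:n]


def canon_smallest(seq):
    # Hypothetical alternative: the smallest of all rotations is the same
    # for every rotation of a sequence, even when values repeat.
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


a = [1, 2, 1, 3]
b = [1, 3, 1, 2]  # b is a rotated by two positions

print(canon(a) == canon(b))                    # False
print(canon_smallest(a) == canon_smallest(b))  # True
```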

I tested this code with a few examples, and it worked well.
def compare(a,b):
    firstInA = a[0]
    firstInB = b.index(firstInA)
    secondInA = a[1]
    secondInB = b.index(secondInA)
    if (secondInB == firstInB + 1) or (secondInB == 0 and firstInB == 2):
        return True
    else:
        return False
I tried:
a = [1,2,3]
b = [1,2,3]
print(compare(a,b))
c = [1,2,3]
d = [3,1,2]
print(compare(c,d))
e = [1,2,3]
f = [3,2,1]
print(compare(e,f))
They returned True,True,False
This only works with lists of size 3. If you want more, add a thirdInA and thirdInB within the if statement. You will always need one fewer check than the length of the list, because if you know all but one item is in place, there is only one spot left for the last one.
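The same position-based idea can be written for any length with a loop instead of one variable per element. This is only a sketch under the assumption that the elements are distinct (b.index finds just the first occurrence); the name compare_any_length is made up for illustration.

```python
def compare_any_length(a, b):
    """Position-based rotation check; assumes the elements are distinct."""
    if len(a) != len(b):
        return False
    if not a:
        return True
    try:
        # Where does the first item of a sit in b?
        start = b.index(a[0])
    except ValueError:
        return False
    n = len(b)
    # Every following item of a must follow in b, wrapping around the end.
    return all(a[k] == b[(start + k) % n] for k in range(n))


print(compare_any_length([1, 2, 3, 4], [3, 4, 1, 2]))  # True
print(compare_any_length([1, 2, 3, 4], [4, 3, 2, 1]))  # False
```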

Related

Is there a short way to check uniqueness of values without using 'if' and multiple 'and's?

I am writing some code and I need to compare some values. The point is that none of the variables should have the same value as another. For example:
a=1
b=2
c=3
if a != b and b != c and a != c:
    # do something
Now, it is easy to see that with more variables the if statement becomes very long and full of ands. Is there a short way to tell Python that no two variables should have the same value?
You can try making sets.
a, b, c = 1, 2, 3
if len({a,b,c}) == 3:
    # Do something
If your variables are kept as a list, it becomes even more simple:
a = [1,2,3,4,4]
if len(set(a)) == len(a):
    # Do something
Here is the official documentation of Python sets.
This works only for hashable objects such as integers, as given in the question. For non-hashable objects, see @chepner's more general solution.
This is definitely the way you should go for hashable objects, since it takes O(n) time for n objects. The combinatorial method for non-hashable objects takes O(n^2) time.
Assuming hashing is not an option, use itertools.combinations and all.
from itertools import combinations
if all(x != y for x, y in combinations([a,b,c], 2)):
    # All values are unique
It depends a bit on the kind of values that you have.
If they are well-behaved and hashable, you can (as others have already pointed out) simply use a set to count the unique values. If that count doesn't equal the total number of values, at least two values are equal.
def all_distinct(*values):
    return len(set(values)) == len(values)
all_distinct(1, 2, 3) # True
all_distinct(1, 2, 2) # False
Hashable values and lazy
In case you really have lots of values and want to abort as soon as one match is found you could also lazily create the set. It's more complicated and probably slower if all values are distinct but it provides short-circuiting in case a duplicate is found:
def all_distinct(*values):
    seen = set()
    seen_add = seen.add
    last_count = 0
    for item in values:
        seen_add(item)
        new_count = len(seen)
        if new_count == last_count:
            return False
        last_count = new_count
    return True
all_distinct(1, 2, 3) # True
all_distinct(1, 2, 2) # False
However if the values are not hashable this will not work because set requires hashable values.
Unhashable values
In case you don't have hashable values you could use a plain list to store the already processed values and just check if each new item is already in the list:
def all_distinct(*values):
    seen = []
    for item in values:
        if item in seen:
            return False
        seen.append(item)
    return True
all_distinct(1, 2, 3) # True
all_distinct(1, 2, 2) # False
all_distinct([1, 2], [2, 3], [3, 4]) # True
all_distinct([1, 2], [2, 3], [1, 2]) # False
This will be slower because checking whether a value is in a list requires comparing it to each item in the list.
A (3rd-party) library solution
In case you don't mind an additional dependency, you could also use one of my libraries (available on PyPI and conda-forge) for this task: iteration_utilities.all_distinct. This function can handle both hashable and unhashable values (and a mix of these):
from iteration_utilities import all_distinct
all_distinct([1, 2, 3]) # True
all_distinct([1, 2, 2]) # False
all_distinct([[1, 2], [2, 3], [3, 4]]) # True
all_distinct([[1, 2], [2, 3], [1, 2]]) # False
General comments
Note that all of the above-mentioned approaches rely on the fact that equality means "not not-equal", which is the case for (almost) all built-in types but isn't necessarily the case in general!
However, I want to point out chepner's answer, which doesn't require hashability of the values and doesn't rely on "equality means not not-equal" because it explicitly checks for !=. It also short-circuits, so it behaves like your original approach using and.
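A small illustration of where "equal" and "not not-equal" can disagree, using a float NaN as an example I picked myself (it is not taken from the answer). NaN compares as != to itself, while sets treat the very same object as one element:

```python
from itertools import combinations

nan = float("nan")
values = [nan, nan]  # the same NaN object, twice

# Set-based check: the set keeps a single entry for the repeated object.
by_set = len(set(values)) == len(values)

# Explicit != check: NaN is not equal to anything, including itself.
by_not_equal = all(x != y for x, y in combinations(values, 2))

print(by_set)        # False -> "there is a duplicate"
print(by_not_equal)  # True  -> "all values differ"
```

Which of the two answers is the right one depends on what "the same value" should mean in your program.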
Performance
To get a rough idea about the performance I'm using another of my libraries (simple_benchmark)
I used distinct hashable inputs (left) and unhashable inputs (right). For hashable inputs the set-approaches performed best, while for unhashable inputs the list-approaches performed better. The combinations-based approach seemed slowest in both cases.
I also tested the performance in case there are duplicates. For convenience I considered the case where the first two elements are equal (otherwise the setup was identical to the previous case).
from iteration_utilities import all_distinct
from itertools import combinations
from simple_benchmark import BenchmarkBuilder
# First benchmark
b1 = BenchmarkBuilder()
@b1.add_function()
def all_distinct_set(values):
    return len(set(values)) == len(values)
@b1.add_function()
def all_distinct_set_sc(values):
    seen = set()
    seen_add = seen.add
    last_count = 0
    for item in values:
        seen_add(item)
        new_count = len(seen)
        if new_count == last_count:
            return False
        last_count = new_count
    return True
@b1.add_function()
def all_distinct_list(values):
    seen = []
    for item in values:
        if item in seen:
            return False
        seen.append(item)
    return True
b1.add_function(alias='all_distinct_iu')(all_distinct)
@b1.add_function()
def all_distinct_combinations(values):
    return all(x != y for x, y in combinations(values, 2))
@b1.add_arguments('number of hashable inputs')
def argument_provider():
    for exp in range(1, 12):
        size = 2**exp
        yield size, range(size)
r1 = b1.run()
r1.plot()
# Second benchmark
b2 = BenchmarkBuilder()
b2.add_function(alias='all_distinct_iu')(all_distinct)
b2.add_functions([all_distinct_combinations, all_distinct_list])
@b2.add_arguments('number of unhashable inputs')
def argument_provider():
    for exp in range(1, 12):
        size = 2**exp
        yield size, [[i] for i in range(size)]
r2 = b2.run()
r2.plot()
# Third benchmark
b3 = BenchmarkBuilder()
b3.add_function(alias='all_distinct_iu')(all_distinct)
b3.add_functions([all_distinct_set, all_distinct_set_sc, all_distinct_combinations, all_distinct_list])
@b3.add_arguments('number of hashable inputs')
def argument_provider():
    for exp in range(1, 12):
        size = 2**exp
        yield size, [0, *range(size)]
r3 = b3.run()
r3.plot()
# Fourth benchmark
b4 = BenchmarkBuilder()
b4.add_function(alias='all_distinct_iu')(all_distinct)
b4.add_functions([all_distinct_combinations, all_distinct_list])
@b4.add_arguments('number of hashable inputs')
def argument_provider():
    for exp in range(1, 12):
        size = 2**exp
        yield size, [[0], *[[i] for i in range(size)]]
r4 = b4.run()
r4.plot()
The slightly neater way is to stick all the variables in a list, then create a new set from the list. If the list and the set aren't the same length, some of the variables were equal, since sets can't contain duplicates:
vars = [a, b, c]
no_dupes = set(vars)
if len(vars) != len(no_dupes):
    # Some of them had the same value
This assumes the values are hashable, which they are in your example.
You can use all with list.count as well. It is reasonable; it may not be the best approach, but it is worth mentioning:
>>> a, b, c = 1, 2, 3
>>> l = [a, b, c]
>>> all(l.count(i) < 2 for i in l)
True
>>> a, b, c = 1, 2, 1
>>> l = [a, b, c]
>>> all(l.count(i) < 2 for i in l)
False
>>>
Also this solution works with unhashable objects in the list.
A way that only works with hashable objects in the list:
>>> a, b, c = 1, 2, 3
>>> l = [a, b, c]
>>> len({*l}) == len(l)
True
>>>
Actually:
>>> from timeit import timeit
>>> timeit(lambda: {*l}, number=1000000)
0.5163292075532642
>>> timeit(lambda: set(l), number=1000000)
0.7005311807841572
>>>
{*l} is faster than set(l), more info here.

Create a complement of list preserving duplicate values

Given list a = [1, 2, 2, 3] and its sublist b = [1, 2] find a list complementing b in such a way that sorted(a) == sorted(b + complement). In the example above the complement would be a list of [2, 3].
It is tempting to use list comprehension:
complement = [x for x in a if x not in b]
or sets:
complement = list(set(a) - set(b))
However, both of these ways will return complement = [3].
An obvious way of doing it would be:
complement = a[:]
for element in b:
    complement.remove(element)
But that feels deeply unsatisfying and not very Pythonic. Am I missing an obvious idiom or is this the way?
As pointed out below, what about performance? This is O(n^2). Is there a more efficient way?
The only more declarative and thus Pythonic way that pops into my mind and that improves performance for large b (and a) is to use some sort of counter with decrement:
from collections import Counter
class DecrementCounter(Counter):
    def decrement(self,x):
        if self[x]:
            self[x] -= 1
            return True
        return False
Now we can use list comprehension:
b_count = DecrementCounter(b)
complement = [x for x in a if not b_count.decrement(x)]
Here we thus keep track of the counts in b, for each element in a we look whether it is part of b_count. If that is indeed the case we decrement the counter and ignore the element. Otherwise we add it to the complement. Note that this only works, if we are sure such complement exists.
After you have constructed the complement, you can check if the complement exists with:
not bool(+b_count)
If this is False, then such complement cannot be constructed (for instance a=[1] and b=[1,3]). So a full implementation could be:
b_count = DecrementCounter(b)
complement = [x for x in a if not b_count.decrement(x)]
if +b_count:
    raise ValueError('complement cannot be constructed')
If dictionary lookup runs in O(1) (which it usually does, only in rare occasions it is O(n)), then this algorithm runs in O(|a|+|b|) (so the sum of the sizes of the lists). Whereas the remove approach will usually run in O(|a|×|b|).
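For comparison, the same counting idea can be sketched with a plain Counter and an explicit loop, without subclassing. The function name list_complement is hypothetical; the behaviour (keep the order of a, raise ValueError when b is not contained in a) follows the description above.

```python
from collections import Counter


def list_complement(a, b):
    """Return the items of `a` left over after removing the items of `b`."""
    remaining = Counter(b)  # how many of each value still need to be removed
    result = []
    for x in a:
        if remaining[x] > 0:
            remaining[x] -= 1   # this occurrence is "used up" by b
        else:
            result.append(x)    # not needed by b, so it belongs to the complement
    # Anything still left in the counter was in b but missing from a.
    if any(remaining.values()):
        raise ValueError('complement cannot be constructed')
    return result


print(list_complement([1, 2, 2, 3], [1, 2]))  # [2, 3]
```

Each element of a and b is visited once, which is where the O(|a|+|b|) estimate comes from, assuming dictionary lookups take constant time.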
To reduce the complexity of your already valid approach, you could use collections.Counter (which is a specialized dictionary with fast lookup) to count the items in both lists.
Then update the counts by subtracting one counter from the other, and in the end keep only the items whose count is > 0 and rebuild the list by chaining them with itertools.chain:
from collections import Counter
import itertools
a = [1, 2, 2, 2, 3]
b = [1, 2]
print(list(itertools.chain.from_iterable(x*[k] for k,x in (Counter(a)-Counter(b)).items() if x > 0)))
result:
[2, 2, 3]
O(n log n)
a = [1, 2, 2, 3]
b = [1, 2]
a.sort()
b.sort()
L = []
i = j = 0
while i < len(a) and j < len(b):
    if a[i] < b[j]:
        L.append(a[i])
        i += 1
    elif a[i] > b[j]:
        L.append(b[j])
        j += 1
    else:
        i += 1
        j += 1
while i < len(a):
    L.append(a[i])
    i += 1
while j < len(b):
    L.append(b[j])
    j += 1
print(L)
If the order of elements in the complement doesn't matter, then collections.Counter is all that is needed:
from collections import Counter
a = [1, 2, 3, 2]
b = [1, 2]
complement = list((Counter(a) - Counter(b)).elements()) # complement = [2, 3]
If the order of items in the complement should be the same order as in the original list, then use something like this:
from collections import Counter, defaultdict
from itertools import count
a = [1,2,3,2]
b = [2,1]
c = Counter(b)
d = defaultdict(count)
complement = [x for x in a if next(d[x]) >= c[x]] # complement = [3, 2]
Main idea: if the values are not unique, make them unique
def add_duplicate_position(items):
    element_counter = {}
    for item in items:
        element_counter[item] = element_counter.setdefault(item,-1) + 1
        yield element_counter[item], item
assert list(add_duplicate_position([1, 2, 2, 3])) == [(0, 1), (0, 2), (1, 2), (0, 3)]
def create_complementary_list_with_duplicates(a,b):
    a = list(add_duplicate_position(a))
    b = set(add_duplicate_position(b))
    return [item for _,item in [x for x in a if x not in b]]
a = [1, 2, 2, 3]
b = [1, 2]
assert create_complementary_list_with_duplicates(a,b) == [2, 3]

Check if all elements of one array are in another array

I have these two arrays:
A = [1,2,3,4,5,6,7,8,9,0]
And:
B = [4,5,6,7]
Is there a way to check if B is a sublist in A with the same exact order of items?
issubset should help you
set(B).issubset(set(A))
e.g.:
>>> A= [1,2,3,4]
>>> B= [2,3]
>>> set(B).issubset(set(A))
True
edit: wrong, this solution does not take the order of the elements into account!
How about this:
A = [1,2,3,4,5,6,7,8,9,0]
B = [4,5,6,7]
C = [7,8,9,0]
D = [4,6,7,5]
def is_slice_in_list(s,l):
    len_s = len(s) #so we don't recompute length of s on every iteration
    return any(s == l[i:len_s+i] for i in range(len(l) - len_s+1))
Result:
>>> is_slice_in_list(B,A)
True
>>> is_slice_in_list(C,A)
True
>>> is_slice_in_list(D,A)
False
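Using slicing:
def is_sublist(A, B):
    # +1 so that the final window, ending at the last item of A, is checked too
    for i in range(len(A) - len(B) + 1):
        if A[i:i+len(B)] == B:
            return True
    return False
Something like that will work as long as A is at least as long as B.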
I prefer to use index to identify the starting point. With this small example, it is faster than the iterative solutions:
def foo(A,B):
    n=-1
    while True:
        try:
            n = A.index(B[0],n+1)
        except ValueError:
            return False
        if A[n:n+len(B)]==B:
            return True
Times with this are fairly constant regardless of B (long, short, present or not). Times for the iterative solutions vary with where B starts.
To make this more robust I've tested against
A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 9, 8, 7, 6, 5, 4, 3, 2, 1]
which is longer, and repeats values.
You can use scipy.linalg.hankel to create all the sub-arrays in one line and then check if your array is in there. A quick example is as follows:
from scipy import linalg
A = [1,2,3,4,5,6,7,8,9,0]
B = [4,5,6,7]
hankel_mat = linalg.hankel(A, A[::-1][:len(B)])[:-1*len(B)+1] # Creating a matrix with a shift of 1 between rows, with len(B) columns
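(hankel_mat == B).all(axis=1).any() # True if some row equals B, i.e. B exists in the same order inside A (a plain `B in hankel_mat` would also match on a single equal element)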
This will not work if B is longer than A, but in that case I believe there is no point in checking :)
import array
def Check_array(c):
    count = 0
    count2 = 0
    a = array.array('i', [4, 11, 20, -4, -3, 11, 3, 0, 50])
    b = array.array('i', [20, -3, 0])
    for i in range(0,len(b)):
        for j in range(count2,len(a)):
            if a[j]==b[i]:
                count = count + 1
                count2 = j
                break
    if count == len(b):
        return True
    else:
        return False
res = Check_array(8)
print(res)
By converting your lists to strings, you can easily verify whether the string "4567" is in the string "1234567890".
stringA = ''.join([str(_) for _ in A])
stringB = ''.join([str(_) for _ in B])
stringB in stringA
>>> True
As a one-liner (because it's cooler):
isBinA = ''.join([str(_) for _ in B]) in ''.join([str(_) for _ in A])
isBinA
>>> True
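One thing to watch with the string approach: once elements can have more than one digit, joining without a separator can report matches that are not there. The sketch below contrasts the plain join with a delimited variant; both function names are made up for illustration, and the delimited version assumes that the separator never appears inside str(x) of an element.

```python
def naive_contains(B, A):
    # Same idea as above: join everything and use a substring test.
    return ''.join(str(x) for x in B) in ''.join(str(x) for x in A)


def delimited_contains(B, A, sep=","):
    # Surround every element with the separator so element boundaries
    # have to line up for the substring test to succeed.
    def to_text(seq):
        return sep + "".join(str(x) + sep for x in seq)
    return to_text(B) in to_text(A)


A = [1, 23]
B = [12, 3]
print(naive_contains(B, A))      # True, although B is not a sublist of A
print(delimited_contains(B, A))  # False
```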
A = [1,2,3,4,5,6,7,8,9,0]
B = [4,5,6,7]
(A and B) == B
True

How can I compare the values of two lists in python?

I want to compare the values of two lists.
For example:
a = [1, 2, 3]
b = [1, 2, 3]
I need to check if a is same as b or not. How do I do that?
a == b
This is a very simple test, it checks if all the values are equal.
If you want to check if a and b both reference the same list, you can use is.
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b # a and b have the same values but refer to different lists in memory
False
>>> a = [1, 2, 3]
>>> b = a
>>> a is b # both refer to the same list
True
simply use
a == b
the operator == will compare the value of a and b, no matter whether they refer to the same object.
@jamylak's answer is what I would go with. But if you're looking for "several options", here's a bunch:
>>> a = [1,2,3]
>>> b = [1,2,3]
>>> a == b
True
OR
def check(a,b):
    if len(a) != len(b):
        return False
    for i in range(len(a)):
        if a[i] != b[i]:
            return False
    return True
OR
>>> len(a)==len(b) and all(a[i]==b[i] for i in range(len(a)))
True
OR
def check(a,b):
    if len(a) != len(b):
        return False
    for i,j in zip(a,b):
        if i != j:
            return False
    return True
OR
>>> all(i==j for i,j in zip(a,b))
True
OR (if the list is made up of just numbers; note that is compares identity, so this relies on CPython caching small integers and is not reliable in general)
>>> all(i is j for i,j in zip(a,b))
True
Hope that satiates your appetite ;]

Pythonic way to check that the lengths of lots of lists are the same

I have a number of lists that I'm going to use in my program, but I need to be sure that they are all the same length, or I'm going to get problems later on in my code.
What's the best way to do this in Python?
For example, if I have three lists:
a = [1, 2, 3]
b = ['a', 'b']
c = [5, 6, 7]
I could do something like:
l = [len(a), len(b), len(c)]
if max(l) == min(l):
    # They're the same
Is there a better or more Pythonic way to do this?
Assuming you have a non-empty list of lists, e.g.
my_list = [[1, 2, 3], ['a', 'b'], [5, 6, 7]]
you could use
n = len(my_list[0])
if all(len(x) == n for x in my_list):
    # whatever
This will short-circuit, so it will stop checking when the first list with a wrong length is encountered.
len(set(len(x) for x in l)) <= 1
Later I ended up writing:
def some(x):
    """Replacement for len(set(x)) > 1"""
    if isinstance(x, (set, frozenset)):
        return len(x) > 1
    s = set()
    for e in x:
        s.add(e)
        if len(s) > 1:
            return True
    return False
def lone(x):
    """Replacement for len(set(x)) <= 1"""
    return not some(x)
Which allows the above to be written as:
lone(len(x) for x in l)
This will stop taking the lengths of the lists as soon as it finds a list with a different length.
A bit of functional Python:
>>> len(set(map(len, (a, b, c)))) == 1
False
Each call to max and min will traverse the whole list, but you don't really need to do that; you can check for the desired property with one traversal:
def allsamelength(lst_of_lsts):
    if len(lst_of_lsts) in (0,1): return True
    lfst = len(lst_of_lsts[0])
    return all(len(lst) == lfst for lst in lst_of_lsts[1:])
This will also short-circuit if one of the lists has a different length from the first.
If l is a list of lengths:
l = [len(a), len(b), len(c)]
if len(set(l))==1:
    print('Yay. List lengths are same.')
Otherwise, using the original lists, one could create a list of lists:
d=[a,b,c]
if len(set(len(x) for x in d)) ==1:
    print('Yay. List lengths are same.')
