assertAlmostEqual in Python unit-test for collections of floats - python

The assertAlmostEqual(x, y) method in Python's unit testing framework tests whether x and y are approximately equal assuming they are floats.
The problem with assertAlmostEqual() is that it only works on floats. I'm looking for a method like assertAlmostEqual() which works on lists of floats, sets of floats, dictionaries of floats, tuples of floats, lists of tuples of floats, sets of lists of floats, etc.
For instance, let x = 0.1234567890, y = 0.1234567891. x and y are almost equal because they agree on each and every digit except for the last one. Therefore self.assertAlmostEqual(x, y) is True because assertAlmostEqual() works for floats.
I'm looking for a more generic assertAlmostEquals() which also evaluates the following calls to True:
self.assertAlmostEqual_generic([x, x, x], [y, y, y]).
self.assertAlmostEqual_generic({1: x, 2: x, 3: x}, {1: y, 2: y, 3: y}).
self.assertAlmostEqual_generic([(x,x)], [(y,y)]).
Is there such a method or do I have to implement it myself?
Clarifications:
assertAlmostEquals() has an optional parameter named places and the numbers are compared by computing the difference rounded to number of decimal places. By default places=7, hence self.assertAlmostEqual(0.5, 0.4) is False while self.assertAlmostEqual(0.12345678, 0.12345679) is True. My speculative assertAlmostEqual_generic() should have the same functionality.
Two lists are considered almost equal if they have almost equal numbers in exactly the same order. formally, for i in range(n): self.assertAlmostEqual(list1[i], list2[i]).
Similarly, two sets are considered almost equal if they can be converted to almost equal lists (by assigning an order to each set).
Similarly, two dictionaries are considered almost equal if the key set of each dictionary is almost equal to the key set of the other dictionary, and for each such almost equal key pair there's a corresponding almost equal value.
In general: I consider two collections almost equal if they're equal except for some corresponding floats which are just almost equal to each other. In other words, I would like to really compare objects but with a low (customized) precision when comparing floats along the way.

if you don't mind using NumPy (which comes with your Python(x,y)), you may want to look at the np.testing module which defines, among others, a assert_almost_equal function.
The signature is np.testing.assert_almost_equal(actual, desired, decimal=7, err_msg='', verbose=True)
>>> x = 1.000001
>>> y = 1.000002
>>> np.testing.assert_almost_equal(x, y)
AssertionError:
Arrays are not almost equal to 7 decimals
ACTUAL: 1.000001
DESIRED: 1.000002
>>> np.testing.assert_almost_equal(x, y, 5)
>>> np.testing.assert_almost_equal([x, x, x], [y, y, y], 5)
>>> np.testing.assert_almost_equal((x, x, x), (y, y, y), 5)

As of python 3.5 you may compare using
math.isclose(a, b, rel_tol=1e-9, abs_tol=0.0)
As described in pep-0485.
The implementation should be equivalent to
abs(a-b) <= max( rel_tol * max(abs(a), abs(b)), abs_tol )

Here's how I've implemented a generic is_almost_equal(first, second) function:
First, duplicate the objects you need to compare (first and second), but don't make an exact copy: cut the insignificant decimal digits of any float you encounter inside the object.
Now that you have copies of first and second for which the insignificant decimal digits are gone, just compare first and second using the == operator.
Let's assume we have a cut_insignificant_digits_recursively(obj, places) function which duplicates obj but leaves only the places most significant decimal digits of each float in the original obj. Here's a working implementation of is_almost_equals(first, second, places):
from insignificant_digit_cutter import cut_insignificant_digits_recursively
def is_almost_equal(first, second, places):
'''returns True if first and second equal.
returns true if first and second aren't equal but have exactly the same
structure and values except for a bunch of floats which are just almost
equal (floats are almost equal if they're equal when we consider only the
[places] most significant digits of each).'''
if first == second: return True
cut_first = cut_insignificant_digits_recursively(first, places)
cut_second = cut_insignificant_digits_recursively(second, places)
return cut_first == cut_second
And here's a working implementation of cut_insignificant_digits_recursively(obj, places):
def cut_insignificant_digits(number, places):
'''cut the least significant decimal digits of a number,
leave only [places] decimal digits'''
if type(number) != float: return number
number_as_str = str(number)
end_of_number = number_as_str.find('.')+places+1
if end_of_number > len(number_as_str): return number
return float(number_as_str[:end_of_number])
def cut_insignificant_digits_lazy(iterable, places):
for obj in iterable:
yield cut_insignificant_digits_recursively(obj, places)
def cut_insignificant_digits_recursively(obj, places):
'''return a copy of obj except that every float loses its least significant
decimal digits remaining only [places] decimal digits'''
t = type(obj)
if t == float: return cut_insignificant_digits(obj, places)
if t in (list, tuple, set):
return t(cut_insignificant_digits_lazy(obj, places))
if t == dict:
return {cut_insignificant_digits_recursively(key, places):
cut_insignificant_digits_recursively(val, places)
for key,val in obj.items()}
return obj
The code and its unit tests are available here: https://github.com/snakile/approximate_comparator. I welcome any improvement and bug fix.

If you don't mind using the numpy package then numpy.testing has the assert_array_almost_equal method.
This works for array_like objects, so it is fine for arrays, lists and tuples of floats, but does it not work for sets and dictionaries.
The documentation is here.

There is no such method, you'd have to do it yourself.
For lists and tuples the definition is obvious, but note that the other cases you mention aren't obvious, so it's no wonder such a function isn't provided. For instance, is {1.00001: 1.00002} almost equal to {1.00002: 1.00001}? Handling such cases requires making a choice about whether closeness depends on keys or values or both. For sets you are unlikely to find a meaningful definition, since sets are unordered, so there is no notion of "corresponding" elements.

You may have to implement it yourself, while its true that list and sets can be iterated the same way, dictionaries are a different story, you iterate their keys not values, and the third example seems a bit ambiguous to me, do you mean to compare each value within the set, or each value from each set.
heres a simple code snippet.
def almost_equal(value_1, value_2, accuracy = 10**-8):
return abs(value_1 - value_2) < accuracy
x = [1,2,3,4]
y = [1,2,4,5]
assert all(almost_equal(*values) for values in zip(x, y))

None of these answers work for me. The following code should work for python collections, classes, dataclasses, and namedtuples. I might have forgotten something, but so far this works for me.
import unittest
from collections import namedtuple, OrderedDict
from dataclasses import dataclass
from typing import Any
def are_almost_equal(o1: Any, o2: Any, max_abs_ratio_diff: float, max_abs_diff: float) -> bool:
"""
Compares two objects by recursively walking them trough. Equality is as usual except for floats.
Floats are compared according to the two measures defined below.
:param o1: The first object.
:param o2: The second object.
:param max_abs_ratio_diff: The maximum allowed absolute value of the difference.
`abs(1 - (o1 / o2)` and vice-versa if o2 == 0.0. Ignored if < 0.
:param max_abs_diff: The maximum allowed absolute difference `abs(o1 - o2)`. Ignored if < 0.
:return: Whether the two objects are almost equal.
"""
if type(o1) != type(o2):
return False
composite_type_passed = False
if hasattr(o1, '__slots__'):
if len(o1.__slots__) != len(o2.__slots__):
return False
if any(not are_almost_equal(getattr(o1, s1), getattr(o2, s2),
max_abs_ratio_diff, max_abs_diff)
for s1, s2 in zip(sorted(o1.__slots__), sorted(o2.__slots__))):
return False
else:
composite_type_passed = True
if hasattr(o1, '__dict__'):
if len(o1.__dict__) != len(o2.__dict__):
return False
if any(not are_almost_equal(k1, k2, max_abs_ratio_diff, max_abs_diff)
or not are_almost_equal(v1, v2, max_abs_ratio_diff, max_abs_diff)
for ((k1, v1), (k2, v2))
in zip(sorted(o1.__dict__.items()), sorted(o2.__dict__.items()))
if not k1.startswith('__')): # avoid infinite loops
return False
else:
composite_type_passed = True
if isinstance(o1, dict):
if len(o1) != len(o2):
return False
if any(not are_almost_equal(k1, k2, max_abs_ratio_diff, max_abs_diff)
or not are_almost_equal(v1, v2, max_abs_ratio_diff, max_abs_diff)
for ((k1, v1), (k2, v2)) in zip(sorted(o1.items()), sorted(o2.items()))):
return False
elif any(issubclass(o1.__class__, c) for c in (list, tuple, set)):
if len(o1) != len(o2):
return False
if any(not are_almost_equal(v1, v2, max_abs_ratio_diff, max_abs_diff)
for v1, v2 in zip(o1, o2)):
return False
elif isinstance(o1, float):
if o1 == o2:
return True
else:
if max_abs_ratio_diff > 0: # if max_abs_ratio_diff < 0, max_abs_ratio_diff is ignored
if o2 != 0:
if abs(1.0 - (o1 / o2)) > max_abs_ratio_diff:
return False
else: # if both == 0, we already returned True
if abs(1.0 - (o2 / o1)) > max_abs_ratio_diff:
return False
if 0 < max_abs_diff < abs(o1 - o2): # if max_abs_diff < 0, max_abs_diff is ignored
return False
return True
else:
if not composite_type_passed:
return o1 == o2
return True
class EqualityTest(unittest.TestCase):
def test_floats(self) -> None:
o1 = ('hi', 3, 3.4)
o2 = ('hi', 3, 3.400001)
self.assertTrue(are_almost_equal(o1, o2, 0.0001, 0.0001))
self.assertFalse(are_almost_equal(o1, o2, 0.00000001, 0.00000001))
def test_ratio_only(self):
o1 = ['hey', 10000, 123.12]
o2 = ['hey', 10000, 123.80]
self.assertTrue(are_almost_equal(o1, o2, 0.01, -1))
self.assertFalse(are_almost_equal(o1, o2, 0.001, -1))
def test_diff_only(self):
o1 = ['hey', 10000, 1234567890.12]
o2 = ['hey', 10000, 1234567890.80]
self.assertTrue(are_almost_equal(o1, o2, -1, 1))
self.assertFalse(are_almost_equal(o1, o2, -1, 0.1))
def test_both_ignored(self):
o1 = ['hey', 10000, 1234567890.12]
o2 = ['hey', 10000, 0.80]
o3 = ['hi', 10000, 0.80]
self.assertTrue(are_almost_equal(o1, o2, -1, -1))
self.assertFalse(are_almost_equal(o1, o3, -1, -1))
def test_different_lengths(self):
o1 = ['hey', 1234567890.12, 10000]
o2 = ['hey', 1234567890.80]
self.assertFalse(are_almost_equal(o1, o2, 1, 1))
def test_classes(self):
class A:
d = 12.3
def __init__(self, a, b, c):
self.a = a
self.b = b
self.c = c
o1 = A(2.34, 'str', {1: 'hey', 345.23: [123, 'hi', 890.12]})
o2 = A(2.34, 'str', {1: 'hey', 345.231: [123, 'hi', 890.121]})
self.assertTrue(are_almost_equal(o1, o2, 0.1, 0.1))
self.assertFalse(are_almost_equal(o1, o2, 0.0001, 0.0001))
o2.hello = 'hello'
self.assertFalse(are_almost_equal(o1, o2, -1, -1))
def test_namedtuples(self):
B = namedtuple('B', ['x', 'y'])
o1 = B(3.3, 4.4)
o2 = B(3.4, 4.5)
self.assertTrue(are_almost_equal(o1, o2, 0.2, 0.2))
self.assertFalse(are_almost_equal(o1, o2, 0.001, 0.001))
def test_classes_with_slots(self):
class C(object):
__slots__ = ['a', 'b']
def __init__(self, a, b):
self.a = a
self.b = b
o1 = C(3.3, 4.4)
o2 = C(3.4, 4.5)
self.assertTrue(are_almost_equal(o1, o2, 0.3, 0.3))
self.assertFalse(are_almost_equal(o1, o2, -1, 0.01))
def test_dataclasses(self):
#dataclass
class D:
s: str
i: int
f: float
#dataclass
class E:
f2: float
f4: str
d: D
o1 = E(12.3, 'hi', D('hello', 34, 20.01))
o2 = E(12.1, 'hi', D('hello', 34, 20.0))
self.assertTrue(are_almost_equal(o1, o2, -1, 0.4))
self.assertFalse(are_almost_equal(o1, o2, -1, 0.001))
o3 = E(12.1, 'hi', D('ciao', 34, 20.0))
self.assertFalse(are_almost_equal(o2, o3, -1, -1))
def test_ordereddict(self):
o1 = OrderedDict({1: 'hey', 345.23: [123, 'hi', 890.12]})
o2 = OrderedDict({1: 'hey', 345.23: [123, 'hi', 890.0]})
self.assertTrue(are_almost_equal(o1, o2, 0.01, -1))
self.assertFalse(are_almost_equal(o1, o2, 0.0001, -1))

Use Pandas
Another way is to convert each of the two dicts etc into pandas dataframes and then use pd.testing.assert_frame_equal() to compare the two. I have used this successfully to compare lists of dicts.
Previous answers often don't work on structures involving dictionaries, but this one should. I haven't exhaustively tested this on highly nested structures, but imagine pandas would handle them correctly.
Example 1: compare two dicts
To illustrate this I will use your example data of a dict, since the other methods don't work with dicts. Your dict was:
x, y = 0.1234567890, 0.1234567891
{1: x, 2: x, 3: x}, {1: y, 2: y, 3: y}
Then we can do:
pd.testing.assert_frame_equal(
pd.DataFrame.from_dict({1: x, 2: x, 3: x}, orient='index') ,
pd.DataFrame.from_dict({1: y, 2: y, 3: y}, orient='index') )
This doesn't raise an error, meaning that they are equal to a certain degree of precision.
However if we were to do
pd.testing.assert_frame_equal(
pd.DataFrame.from_dict({1: x, 2: x, 3: x}, orient='index') ,
pd.DataFrame.from_dict({1: y, 2: y, 3: y + 1}, orient='index') ) #add 1 to last value
then we are rewarded with the following informative message:
AssertionError: DataFrame.iloc[:, 0] (column name="0") are different
DataFrame.iloc[:, 0] (column name="0") values are different (33.33333 %)
[index]: [1, 2, 3]
[left]: [0.123456789, 0.123456789, 0.123456789]
[right]: [0.1234567891, 0.1234567891, 1.1234567891]
For further details see pd.testing.assert_frame_equal documentation , particularly parameters check_exact, rtol, atol for info about how to specify required degree of precision either relative or actual.
Example 2: Nested dict of dicts
a = {i*10 : {1:1.1,2:2.1} for i in range(4)}
b = {i*10 : {1:1.1000001,2:2.100001} for i in range(4)}
# a = {0: {1: 1.1, 2: 2.1}, 10: {1: 1.1, 2: 2.1}, 20: {1: 1.1, 2: 2.1}, 30: {1: 1.1, 2: 2.1}}
# b = {0: {1: 1.1000001, 2: 2.100001}, 10: {1: 1.1000001, 2: 2.100001}, 20: {1: 1.1000001, 2: 2.100001}, 30: {1: 1.1000001, 2: 2.100001}}
and then do
pd.testing.assert_frame_equal( pd.DataFrame(a), pd.DataFrame(b) )
- it doesn't raise an error: all values fairly similar.
However, if we change a value e.g.
b[30][2] += 1
# b = {0: {1: 1.1000001, 2: 2.1000001}, 10: {1: 1.1000001, 2: 2.1000001}, 20: {1: 1.1000001, 2: 2.1000001}, 30: {1: 1.1000001, 2: 3.1000001}}
and then run the same test, we get the following clear error message:
AssertionError: DataFrame.iloc[:, 3] (column name="30") are different
DataFrame.iloc[:, 3] (column name="30") values are different (50.0 %)
[index]: [1, 2]
[left]: [1.1, 2.1]
[right]: [1.1000001, 3.1000001]

Looking at this myself, I used the addTypeEqualityFunc method of the UnitTest library in combination with math.isclose.
Sample setup:
import math
from unittest import TestCase
class SomeFixtures(TestCase):
#classmethod
def float_comparer(cls, a, b, msg=None):
if len(a) != len(b):
raise cls.failureException(msg)
if not all(map(lambda args: math.isclose(*args), zip(a, b))):
raise cls.failureException(msg)
def some_test(self):
self.addTypeEqualityFunc(list, self.float_comparer)
self.assertEqual([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])

I would still use self.assertEqual() for it stays the most informative when shit hits the fan. You can do that by rounding, eg.
self.assertEqual(round_tuple((13.949999999999999, 1.121212), 2), (13.95, 1.12))
where round_tuple is
def round_tuple(t: tuple, ndigits: int) -> tuple:
return tuple(round(e, ndigits=ndigits) for e in t)
def round_list(l: list, ndigits: int) -> list:
return [round(e, ndigits=ndigits) for e in l]
According to the python docs (see https://stackoverflow.com/a/41407651/1031191) you can get away with rounding issues like 13.94999999, because 13.94999999 == 13.95 is True.

You can also recursively call the already present unittest.assertAlmostEquals() and keep track of what element you are comparing, by adding a method to your unittest.
E.g. for lists of lists and list of tuples of floats:
def assertListAlmostEqual(self, first, second, delta=None, context=None):
"""Asserts lists of lists or tuples to check if they compare and
shows which element is wrong when comparing two lists
"""
self.assertEqual(len(first), len(second), msg="List have different length")
context = [first, second] if context is None else context
for i in range(0, len(first)):
if isinstance(first[0], tuple):
context.append(i)
self.assertListAlmostEqual(first[i], second[i], delta, context=context)
if isinstance(first[0], list):
context.append(i)
self.assertListAlmostEqual(first[i], second[i], delta, context=context)
elif isinstance(first[0], float):
msg = "Difference in \n{} and \n{}\nFaulty element index={}".format(context[0], context[1], context[2:]+[i]) \
if context is not None else None
self.assertAlmostEqual(first[i], second[i], delta, msg=msg)
Outputs something like:
line 23, in assertListAlmostEqual
self.assertAlmostEqual(first[i], second[i], delta, msg=msg)
AssertionError: 5.0 != 6.0 within 7 places (1.0 difference) : Difference in
[(0.0, 5.0), (8.0, 2.0), (10.0, 1.999999), (11.0, 1.9999989090909092)] and
[(0.0, 6.0), (8.0, 2.0), (10.0, 1.999999), (11.0, 1.9999989)]
Faulty element index=[0, 1]

An alternative approach is to convert your data into a comparable form by e.g turning each float into a string with fixed precision.
def comparable(data):
"""Converts `data` to a comparable structure by converting any floats to a string with fixed precision."""
if isinstance(data, (int, str)):
return data
if isinstance(data, float):
return '{:.4f}'.format(data)
if isinstance(data, list):
return [comparable(el) for el in data]
if isinstance(data, tuple):
return tuple([comparable(el) for el in data])
if isinstance(data, dict):
return {k: comparable(v) for k, v in data.items()}
Then you can:
self.assertEquals(comparable(value1), comparable(value2))

Related

How to sort mixed-type sequences (e.g. with numbers and strings) in python3.x? [duplicate]

I'm trying to replicate (and if possible improve on) Python 2.x's sorting behaviour in 3.x, so that mutually orderable types like int, float etc. are sorted as expected, and mutually unorderable types are grouped within the output.
Here's an example of what I'm talking about:
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 2.x
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 3.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()
My previous attempt at this, using a class for the key parameter to sorted() (see
Why does this key class for sorting heterogeneous sequences behave oddly?) is fundamentally broken, because its approach of
Trying to compare values, and
If that fails, falling back to comparing the string representation of their types
can lead to intransitive ordering, as explained by BrenBarn's excellent answer.
A naïve approach, which I initially rejected without even trying to code it, would be to use a key function that returns a (type, value) tuple:
def motley(value):
return repr(type(value)), value
However, this doesn't do what I want. In the first place, it breaks the natural ordering of mutually orderable types:
>>> sorted([0, 123.4, 5, -6, 7.89])
[-6, 0, 5, 7.89, 123.4]
>>> sorted([0, 123.4, 5, -6, 7.89], key=motley)
[7.89, 123.4, -6, 0, 5]
Secondly, it raises an exception when the input contains two objects of the same intrinsically unorderable type:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
... which admittedly is the standard behaviour in both Python 2.x and 3.x – but ideally I'd like such types to be grouped together (I don't especially care about their ordering, but it would seem in keeping with Python's guarantee of stable sorting that they retain their original order).
I can work around the first of these problems for numeric types by special-casing them:
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
return repr(typeinfo), value
... which works as far as it goes:
>>> sorted([0, 'one', 2.3, 'four', -5], key=motley)
[-5, 0, 2.3, 'four', 'one']
... but doesn't account for the fact that there may be other distinct (possibly user-defined) types which are mutually orderable, and of course still fails with intrinsically unorderable types:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
Is there another approach which solves both the problem of arbitrary, distinct-but-mutually-orderable types and that of intrinsically unorderable types?
Stupid idea: make a first pass to divide all the different items in groups that can be compared between each other, sort the individual groups and finally concatenate them. I assume that an item is comparable to all members of a group, if it is comparable with the first member of a group. Something like this (Python3):
import itertools
def python2sort(x):
it = iter(x)
groups = [[next(it)]]
for item in it:
for group in groups:
try:
item < group[0] # exception if not comparable
group.append(item)
break
except TypeError:
continue
else: # did not break, make new group
groups.append([item])
print(groups) # for debugging
return itertools.chain.from_iterable(sorted(group) for group in groups)
This will have quadratic running time in the pathetic case that none of the items are comparable, but I guess the only way to know that for sure is to check all possible combinations. See the quadratic behavior as a deserved punishment for anyone trying to sort a long list of unsortable items, like complex numbers. In a more common case of a mix of some strings and some integers, the speed should be similar to the speed of a normal sort. Quick test:
In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j, -5.5, 13 , 15.3, 'aa', 'zz']
In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]
It seems to be a 'stable sort' as well, since the groups are formed in the order the incomparable items are encountered.
This answer aims to faithfully re-create the Python 2 sort order, in Python 3, in every detail.
The actual Python 2 implementation is quite involved, but object.c's default_3way_compare does the final fallback after instances have been given a chance to implement normal comparison rules. This is after individual types have been given a chance to compare (via the __cmp__ or __lt__ hooks).
Implementing that function as pure Python in a wrapper, plus emulating the exceptions to the rules (dict and complex numbers specifically) gives us the same Python 2 sorting semantics in Python 3:
from numbers import Number
# decorator for type to function mapping special cases
def per_type_cmp(type_):
try:
mapping = per_type_cmp.mapping
except AttributeError:
mapping = per_type_cmp.mapping = {}
def decorator(cmpfunc):
mapping[type_] = cmpfunc
return cmpfunc
return decorator
class python2_sort_key(object):
_unhandled_types = {complex}
def __init__(self, ob):
self._ob = ob
def __lt__(self, other):
_unhandled_types = self._unhandled_types
self, other = self._ob, other._ob # we don't care about the wrapper
# default_3way_compare is used only if direct comparison failed
try:
return self < other
except TypeError:
pass
# hooks to implement special casing for types, dict in Py2 has
# a dedicated __cmp__ method that is gone in Py3 for example.
for type_, special_cmp in per_type_cmp.mapping.items():
if isinstance(self, type_) and isinstance(other, type_):
return special_cmp(self, other)
# explicitly raise again for types that won't sort in Python 2 either
if type(self) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(self).__name__))
if type(other) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(other).__name__))
# default_3way_compare from Python 2 as Python code
# same type but no ordering defined, go by id
if type(self) is type(other):
return id(self) < id(other)
# None always comes first
if self is None:
return True
if other is None:
return False
# Sort by typename, but numbers are sorted before other types
self_tname = '' if isinstance(self, Number) else type(self).__name__
other_tname = '' if isinstance(other, Number) else type(other).__name__
if self_tname != other_tname:
return self_tname < other_tname
# same typename, or both numbers, but different type objects, order
# by the id of the type object
return id(type(self)) < id(type(other))
#per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
if len(a) != len(b):
return len(a) < len(b)
adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
if adiff is _s:
# All keys in a have a matching value in b, so the dicts are equal
return False
bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
if adiff != bdiff:
return python2_sort_key(adiff) < python2_sort_key(bdiff)
return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])
I incorporated handling dictionary sorting as implemented in Python 2, since that'd be supported by the type itself via a __cmp__ hook. I've stuck to the Python 2 ordering for the keys and values as well, naturally.
I've also added special casing for complex numbers, as Python 2 raises an exception when you try sort to these:
>>> sorted([0.0, 1, (1+0j), False, (2+3j)])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
You may have to add more special cases if you want to emulate Python 2 behaviour exactly.
If you wanted to sort complex numbers anyway you'll need to consistently put them with the non-numbers group; e.g.:
# Sort by typename, but numbers are sorted before other types
if isinstance(self, Number) and not isinstance(self, complex):
self_tname = ''
else:
self_tname = type(self).__name__
if isinstance(other, Number) and not isinstance(other, complex):
other_tname = ''
else:
other_tname = type(other).__name__
Some test cases:
>>> sorted([0, 'one', 2.3, 'four', -5], key=python2_sort_key)
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 123.4, 5, -6, 7.89], key=python2_sort_key)
[-6, 0, 5, 7.89, 123.4]
>>> sorted([{1:2}, {3:4}], key=python2_sort_key)
[{1: 2}, {3: 4}]
>>> sorted([{1:2}, None, {3:4}], key=python2_sort_key)
[None, {1: 2}, {3: 4}]
Not running Python 3 here, but maybe something like this would work. Test to see if doing a "less than" compare on "value" creates an exception and then do "something" to handle that case, like convert it to a string.
Of course you'd still need more special handling if there are other types in your list that are not the same type but are mutually orderable.
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
try:
x = value < value
except TypeError:
value = repr(value)
return repr(typeinfo), value
>>> print sorted([0, 'one', 2.3, 'four', -5, (2+3j), (1-3j)], key=motley)
[-5, 0, 2.3, (1-3j), (2+3j), 'four', 'one']
One way for Python 3.2+ is to use functools.cmp_to_key().
With this you can quickly implement a solution that tries to compare the values and then falls back on comparing the string representation of the types. You can also avoid an error being raised when comparing unordered types and leave the order as in the original case:
from functools import cmp_to_key
def cmp(a,b):
try:
return (a > b) - (a < b)
except TypeError:
s1, s2 = type(a).__name__, type(b).__name__
return (s1 > s2) - (s1 < s2)
Examples (input lists taken from Martijn Pieters's answer):
sorted([0, 'one', 2.3, 'four', -5], key=cmp_to_key(cmp))
# [-5, 0, 2.3, 'four', 'one']
sorted([0, 123.4, 5, -6, 7.89], key=cmp_to_key(cmp))
# [-6, 0, 5, 7.89, 123.4]
sorted([{1:2}, {3:4}], key=cmp_to_key(cmp))
# [{1: 2}, {3: 4}]
sorted([{1:2}, None, {3:4}], key=cmp_to_key(cmp))
# [None, {1: 2}, {3: 4}]
This has the disadvantage that the three-way compare is always conducted, increasing the time complexity. However, the solution is low overhead, short, clean and I think cmp_to_key() was developed for this kind of Python 2 emulation use case.
I tried to implement the Python 2 sorting c code in python 3 as faithfully as possible.
Use it like so: mydata.sort(key=py2key()) or mydata.sort(key=py2key(lambda x: mykeyfunc))
def default_3way_compare(v, w): # Yes, this is how Python 2 sorted things :)
tv, tw = type(v), type(w)
if tv is tw:
return -1 if id(v) < id(w) else (1 if id(v) > id(w) else 0)
if v is None:
return -1
if w is None:
return 1
if isinstance(v, (int, float)):
vname = ''
else:
vname = type(v).__name__
if isinstance(w, (int, float)):
wname = ''
else:
wname = type(w).__name__
if vname < wname:
return -1
if vname > wname:
return 1
return -1 if id(type(v)) < id(type(w)) else 1
def py2key(func=None): # based on cmp_to_key
class K(object):
__slots__ = ['obj']
__hash__ = None
def __init__(self, obj):
self.obj = func(obj) if func else obj
def __lt__(self, other):
try:
return self.obj < other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) < 0
def __gt__(self, other):
try:
return self.obj > other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) > 0
def __eq__(self, other):
try:
return self.obj == other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) == 0
def __le__(self, other):
try:
return self.obj <= other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) <= 0
def __ge__(self, other):
try:
return self.obj >= other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) >= 0
return K
We can solve this problem in the following way.
Group by type.
Find which types are comparable by attempting to compare a single representative of each type.
Merge groups of comparable types.
Sort merged groups, if possible.
yield from (sorted) merged groups
We can get a deterministic and orderable key function from types by using repr(type(x)). Note that the 'type hierarchy' here is determined by the repr of the types themselves. A flaw in this method is that if two types have identical __repr__ (the types themselves, not the instances), you will 'confuse' types. This can be solved by using a key function that returns a tuple (repr(type), id(type)), but I have not implemented that in this solution.
The advantage of my method over Bas Swinkel's is a cleaner handling of a group of un-orderable elements. We do not have quadratic behavior; instead, the function gives up after the first attempted ordering during sorted()).
My method functions worst in the scenario where there are an extremely large number of different types in the iterable. This is a rare scenario, but I suppose it could come up.
def py2sort(iterable):
by_type_repr = lambda x: repr(type(x))
iterable = sorted(iterable, key = by_type_repr)
types = {type_: list(group) for type_, group in groupby(iterable, by_type_repr)}
def merge_compatible_types(types):
representatives = [(type_, items[0]) for (type_, items) in types.items()]
def mergable_types():
for i, (type_0, elem_0) in enumerate(representatives, 1):
for type_1, elem_1 in representatives[i:]:
if _comparable(elem_0, elem_1):
yield type_0, type_1
def merge_types(a, b):
try:
types[a].extend(types[b])
del types[b]
except KeyError:
pass # already merged
for a, b in mergable_types():
merge_types(a, b)
return types
def gen_from_sorted_comparable_groups(types):
for _, items in types.items():
try:
items = sorted(items)
except TypeError:
pass #unorderable type
yield from items
types = merge_compatible_types(types)
return list(gen_from_sorted_comparable_groups(types))
def _comparable(x, y):
try:
x < y
except TypeError:
return False
else:
return True
if __name__ == '__main__':
print('before py2sort:')
test = [2, -11.6, 3, 5.0, (1, '5', 3), (object, object()), complex(2, 3), [list, tuple], Fraction(11, 2), '2', type, str, 'foo', object(), 'bar']
print(test)
print('after py2sort:')
print(py2sort(test))
I'd like to recommend starting this sort of task (like imitation of another system's behaviour very close to this one) with detailed clarifying of the target system. How should it work with different corner cases. One of the best ways to do it - write a bunch of tests to ensure correct behaviour. Having such tests gives:
Better understandind which elements should precede which
Basic documenting
Makes system robust against some refactoring and adding functionality. For example if one more rule is added - how to get sure previous are not gets broken?
One can write such test cases:
sort2_test.py
import unittest
from sort2 import sorted2
class TestSortNumbers(unittest.TestCase):
"""
Verifies numbers are get sorted correctly.
"""
def test_sort_empty(self):
self.assertEqual(sorted2([]), [])
def test_sort_one_element_int(self):
self.assertEqual(sorted2([1]), [1])
def test_sort_one_element_real(self):
self.assertEqual(sorted2([1.0]), [1.0])
def test_ints(self):
self.assertEqual(sorted2([1, 2]), [1, 2])
def test_ints_reverse(self):
self.assertEqual(sorted2([2, 1]), [1, 2])
class TestSortStrings(unittest.TestCase):
"""
Verifies numbers are get sorted correctly.
"""
def test_sort_one_element_str(self):
self.assertEqual(sorted2(["1.0"]), ["1.0"])
class TestSortIntString(unittest.TestCase):
"""
Verifies numbers and strings are get sorted correctly.
"""
def test_string_after_int(self):
self.assertEqual(sorted2([1, "1"]), [1, "1"])
self.assertEqual(sorted2([0, "1"]), [0, "1"])
self.assertEqual(sorted2([-1, "1"]), [-1, "1"])
self.assertEqual(sorted2(["1", 1]), [1, "1"])
self.assertEqual(sorted2(["0", 1]), [1, "0"])
self.assertEqual(sorted2(["-1", 1]), [1, "-1"])
class TestSortIntDict(unittest.TestCase):
"""
Verifies numbers and dict are get sorted correctly.
"""
def test_string_after_int(self):
self.assertEqual(sorted2([1, {1: 2}]), [1, {1: 2}])
self.assertEqual(sorted2([0, {1: 2}]), [0, {1: 2}])
self.assertEqual(sorted2([-1, {1: 2}]), [-1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
Next one may have such sorting function:
sort2.py
from numbers import Real
from decimal import Decimal
from itertools import tee, filterfalse
def sorted2(iterable):
"""
:param iterable: An iterable (array or alike)
entity which elements should be sorted.
:return: List with sorted elements.
"""
def predicate(x):
return isinstance(x, (Real, Decimal))
t1, t2 = tee(iterable)
numbers = filter(predicate, t1)
non_numbers = filterfalse(predicate, t2)
sorted_numbers = sorted(numbers)
sorted_non_numbers = sorted(non_numbers, key=str)
return sorted_numbers + sorted_non_numbers
Usage is quite simple and is documented in tests:
>>> from sort2 import sorted2
>>> sorted2([1,2,3, "aaa", {3:5}, [1,2,34], {-8:15}])
[1, 2, 3, [1, 2, 34], 'aaa', {-8: 15}, {3: 5}]
To avoid the use of exceptions and going for a type based solution, i came up with this:
#! /usr/bin/python3
import itertools
def p2Sort(x):
notImpl = type(0j.__gt__(0j))
it = iter(x)
first = next(it)
groups = [[first]]
types = {type(first):0}
for item in it:
item_type = type(item)
if item_type in types.keys():
groups[types[item_type]].append(item)
else:
types[item_type] = len(types)
groups.append([item])
#debuggng
for group in groups:
print(group)
for it in group:
print(type(it),)
#
for i in range(len(groups)):
if type(groups[i][0].__gt__(groups[i][0])) == notImpl:
continue
groups[i] = sorted(groups[i])
return itertools.chain.from_iterable(group for group in groups)
x = [0j, 'one', 2.3, 'four', -5, 3j, 0j, -5.5, 13 , 15.3, 'aa', 'zz']
print(list(p2Sort(x)))
Note that an additional dictionary to hold the different types in list and a type holding variable (notImpl) is needed. Further note, that floats and ints aren't mixed here.
Output:
================================================================================
05.04.2017 18:27:57
~/Desktop/sorter.py
--------------------------------------------------------------------------------
[0j, 3j, 0j]
<class 'complex'>
<class 'complex'>
<class 'complex'>
['one', 'four', 'aa', 'zz']
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
[2.3, -5.5, 15.3]
<class 'float'>
<class 'float'>
<class 'float'>
[-5, 13]
<class 'int'>
<class 'int'>
[0j, 3j, 0j, 'aa', 'four', 'one', 'zz', -5.5, 2.3, 15.3, -5, 13]
Here is one method of accomplishing this:
lst = [0, 'one', 2.3, 'four', -5]
a=[x for x in lst if type(x) == type(1) or type(x) == type(1.1)]
b=[y for y in lst if type(y) == type('string')]
a.sort()
b.sort()
c = a+b
print(c)
#martijn-pieters I don't know if list in python2 also has a __cmp__ to handle comparing list objects or how it was handled in python2.
Anyway, in addition to the #martijn-pieters's answer, I used the following list comparator, so at least it doesn't give different sorted output based on different order of elements in the same input set.
#per_type_cmp(list)
def list_cmp(a, b):
for a_item, b_item in zip(a, b):
if a_item == b_item:
continue
return python2_sort_key(a_item) < python2_sort_key(b_item)
return len(a) < len(b)
So, joining it with original answer by Martijn:
from numbers import Number
# decorator for type to function mapping special cases
def per_type_cmp(type_):
try:
mapping = per_type_cmp.mapping
except AttributeError:
mapping = per_type_cmp.mapping = {}
def decorator(cmpfunc):
mapping[type_] = cmpfunc
return cmpfunc
return decorator
class python2_sort_key(object):
_unhandled_types = {complex}
def __init__(self, ob):
self._ob = ob
def __lt__(self, other):
_unhandled_types = self._unhandled_types
self, other = self._ob, other._ob # we don't care about the wrapper
# default_3way_compare is used only if direct comparison failed
try:
return self < other
except TypeError:
pass
# hooks to implement special casing for types, dict in Py2 has
# a dedicated __cmp__ method that is gone in Py3 for example.
for type_, special_cmp in per_type_cmp.mapping.items():
if isinstance(self, type_) and isinstance(other, type_):
return special_cmp(self, other)
# explicitly raise again for types that won't sort in Python 2 either
if type(self) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(self).__name__))
if type(other) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(other).__name__))
# default_3way_compare from Python 2 as Python code
# same type but no ordering defined, go by id
if type(self) is type(other):
return id(self) < id(other)
# None always comes first
if self is None:
return True
if other is None:
return False
# Sort by typename, but numbers are sorted before other types
self_tname = '' if isinstance(self, Number) else type(self).__name__
other_tname = '' if isinstance(other, Number) else type(other).__name__
if self_tname != other_tname:
return self_tname < other_tname
# same typename, or both numbers, but different type objects, order
# by the id of the type object
return id(type(self)) < id(type(other))
#per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
if len(a) != len(b):
return len(a) < len(b)
adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
if adiff is _s:
# All keys in a have a matching value in b, so the dicts are equal
return False
bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
if adiff != bdiff:
return python2_sort_key(adiff) < python2_sort_key(bdiff)
return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])
#per_type_cmp(list)
def list_cmp(a, b):
for a_item, b_item in zip(a, b):
if a_item == b_item:
continue
return python2_sort_key(a_item) < python2_sort_key(b_item)
return len(a) < len(b)
PS: It makes more sense to create it as an comment but I didn't have enough reputation to make a comment. So, I'm creating it as an answer instead.

How can I get 2.x-like sorting behaviour in Python 3.x?

I'm trying to replicate (and if possible improve on) Python 2.x's sorting behaviour in 3.x, so that mutually orderable types like int, float etc. are sorted as expected, and mutually unorderable types are grouped within the output.
Here's an example of what I'm talking about:
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 2.x
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 'one', 2.3, 'four', -5]) # Python 3.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()
My previous attempt at this, using a class for the key parameter to sorted() (see
Why does this key class for sorting heterogeneous sequences behave oddly?) is fundamentally broken, because its approach of
Trying to compare values, and
If that fails, falling back to comparing the string representation of their types
can lead to intransitive ordering, as explained by BrenBarn's excellent answer.
A naïve approach, which I initially rejected without even trying to code it, would be to use a key function that returns a (type, value) tuple:
def motley(value):
return repr(type(value)), value
However, this doesn't do what I want. In the first place, it breaks the natural ordering of mutually orderable types:
>>> sorted([0, 123.4, 5, -6, 7.89])
[-6, 0, 5, 7.89, 123.4]
>>> sorted([0, 123.4, 5, -6, 7.89], key=motley)
[7.89, 123.4, -6, 0, 5]
Secondly, it raises an exception when the input contains two objects of the same intrinsically unorderable type:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
... which admittedly is the standard behaviour in both Python 2.x and 3.x – but ideally I'd like such types to be grouped together (I don't especially care about their ordering, but it would seem in keeping with Python's guarantee of stable sorting that they retain their original order).
I can work around the first of these problems for numeric types by special-casing them:
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
return repr(typeinfo), value
... which works as far as it goes:
>>> sorted([0, 'one', 2.3, 'four', -5], key=motley)
[-5, 0, 2.3, 'four', 'one']
... but doesn't account for the fact that there may be other distinct (possibly user-defined) types which are mutually orderable, and of course still fails with intrinsically unorderable types:
>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
Is there another approach which solves both the problem of arbitrary, distinct-but-mutually-orderable types and that of intrinsically unorderable types?
Stupid idea: make a first pass to divide all the different items in groups that can be compared between each other, sort the individual groups and finally concatenate them. I assume that an item is comparable to all members of a group, if it is comparable with the first member of a group. Something like this (Python3):
import itertools
def python2sort(x):
it = iter(x)
groups = [[next(it)]]
for item in it:
for group in groups:
try:
item < group[0] # exception if not comparable
group.append(item)
break
except TypeError:
continue
else: # did not break, make new group
groups.append([item])
print(groups) # for debugging
return itertools.chain.from_iterable(sorted(group) for group in groups)
This will have quadratic running time in the pathetic case that none of the items are comparable, but I guess the only way to know that for sure is to check all possible combinations. See the quadratic behavior as a deserved punishment for anyone trying to sort a long list of unsortable items, like complex numbers. In a more common case of a mix of some strings and some integers, the speed should be similar to the speed of a normal sort. Quick test:
In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j, -5.5, 13 , 15.3, 'aa', 'zz']
In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]
It seems to be a 'stable sort' as well, since the groups are formed in the order the incomparable items are encountered.
This answer aims to faithfully re-create the Python 2 sort order, in Python 3, in every detail.
The actual Python 2 implementation is quite involved, but object.c's default_3way_compare does the final fallback after instances have been given a chance to implement normal comparison rules. This is after individual types have been given a chance to compare (via the __cmp__ or __lt__ hooks).
Implementing that function as pure Python in a wrapper, plus emulating the exceptions to the rules (dict and complex numbers specifically) gives us the same Python 2 sorting semantics in Python 3:
from numbers import Number
# decorator for type to function mapping special cases
def per_type_cmp(type_):
try:
mapping = per_type_cmp.mapping
except AttributeError:
mapping = per_type_cmp.mapping = {}
def decorator(cmpfunc):
mapping[type_] = cmpfunc
return cmpfunc
return decorator
class python2_sort_key(object):
_unhandled_types = {complex}
def __init__(self, ob):
self._ob = ob
def __lt__(self, other):
_unhandled_types = self._unhandled_types
self, other = self._ob, other._ob # we don't care about the wrapper
# default_3way_compare is used only if direct comparison failed
try:
return self < other
except TypeError:
pass
# hooks to implement special casing for types, dict in Py2 has
# a dedicated __cmp__ method that is gone in Py3 for example.
for type_, special_cmp in per_type_cmp.mapping.items():
if isinstance(self, type_) and isinstance(other, type_):
return special_cmp(self, other)
# explicitly raise again for types that won't sort in Python 2 either
if type(self) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(self).__name__))
if type(other) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(other).__name__))
# default_3way_compare from Python 2 as Python code
# same type but no ordering defined, go by id
if type(self) is type(other):
return id(self) < id(other)
# None always comes first
if self is None:
return True
if other is None:
return False
# Sort by typename, but numbers are sorted before other types
self_tname = '' if isinstance(self, Number) else type(self).__name__
other_tname = '' if isinstance(other, Number) else type(other).__name__
if self_tname != other_tname:
return self_tname < other_tname
# same typename, or both numbers, but different type objects, order
# by the id of the type object
return id(type(self)) < id(type(other))
#per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
if len(a) != len(b):
return len(a) < len(b)
adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
if adiff is _s:
# All keys in a have a matching value in b, so the dicts are equal
return False
bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
if adiff != bdiff:
return python2_sort_key(adiff) < python2_sort_key(bdiff)
return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])
I incorporated handling dictionary sorting as implemented in Python 2, since that'd be supported by the type itself via a __cmp__ hook. I've stuck to the Python 2 ordering for the keys and values as well, naturally.
I've also added special casing for complex numbers, as Python 2 raises an exception when you try sort to these:
>>> sorted([0.0, 1, (1+0j), False, (2+3j)])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
You may have to add more special cases if you want to emulate Python 2 behaviour exactly.
If you wanted to sort complex numbers anyway you'll need to consistently put them with the non-numbers group; e.g.:
# Sort by typename, but numbers are sorted before other types
if isinstance(self, Number) and not isinstance(self, complex):
self_tname = ''
else:
self_tname = type(self).__name__
if isinstance(other, Number) and not isinstance(other, complex):
other_tname = ''
else:
other_tname = type(other).__name__
Some test cases:
>>> sorted([0, 'one', 2.3, 'four', -5], key=python2_sort_key)
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 123.4, 5, -6, 7.89], key=python2_sort_key)
[-6, 0, 5, 7.89, 123.4]
>>> sorted([{1:2}, {3:4}], key=python2_sort_key)
[{1: 2}, {3: 4}]
>>> sorted([{1:2}, None, {3:4}], key=python2_sort_key)
[None, {1: 2}, {3: 4}]
Not running Python 3 here, but maybe something like this would work. Test to see if doing a "less than" compare on "value" creates an exception and then do "something" to handle that case, like convert it to a string.
Of course you'd still need more special handling if there are other types in your list that are not the same type but are mutually orderable.
from numbers import Real
from decimal import Decimal
def motley(value):
numeric = Real, Decimal
if isinstance(value, numeric):
typeinfo = numeric
else:
typeinfo = type(value)
try:
x = value < value
except TypeError:
value = repr(value)
return repr(typeinfo), value
>>> print sorted([0, 'one', 2.3, 'four', -5, (2+3j), (1-3j)], key=motley)
[-5, 0, 2.3, (1-3j), (2+3j), 'four', 'one']
One way for Python 3.2+ is to use functools.cmp_to_key().
With this you can quickly implement a solution that tries to compare the values and then falls back on comparing the string representation of the types. You can also avoid an error being raised when comparing unordered types and leave the order as in the original case:
from functools import cmp_to_key
def cmp(a,b):
try:
return (a > b) - (a < b)
except TypeError:
s1, s2 = type(a).__name__, type(b).__name__
return (s1 > s2) - (s1 < s2)
Examples (input lists taken from Martijn Pieters's answer):
sorted([0, 'one', 2.3, 'four', -5], key=cmp_to_key(cmp))
# [-5, 0, 2.3, 'four', 'one']
sorted([0, 123.4, 5, -6, 7.89], key=cmp_to_key(cmp))
# [-6, 0, 5, 7.89, 123.4]
sorted([{1:2}, {3:4}], key=cmp_to_key(cmp))
# [{1: 2}, {3: 4}]
sorted([{1:2}, None, {3:4}], key=cmp_to_key(cmp))
# [None, {1: 2}, {3: 4}]
This has the disadvantage that the three-way compare is always conducted, increasing the time complexity. However, the solution is low overhead, short, clean and I think cmp_to_key() was developed for this kind of Python 2 emulation use case.
I tried to implement the Python 2 sorting c code in python 3 as faithfully as possible.
Use it like so: mydata.sort(key=py2key()) or mydata.sort(key=py2key(lambda x: mykeyfunc))
def default_3way_compare(v, w): # Yes, this is how Python 2 sorted things :)
tv, tw = type(v), type(w)
if tv is tw:
return -1 if id(v) < id(w) else (1 if id(v) > id(w) else 0)
if v is None:
return -1
if w is None:
return 1
if isinstance(v, (int, float)):
vname = ''
else:
vname = type(v).__name__
if isinstance(w, (int, float)):
wname = ''
else:
wname = type(w).__name__
if vname < wname:
return -1
if vname > wname:
return 1
return -1 if id(type(v)) < id(type(w)) else 1
def py2key(func=None): # based on cmp_to_key
class K(object):
__slots__ = ['obj']
__hash__ = None
def __init__(self, obj):
self.obj = func(obj) if func else obj
def __lt__(self, other):
try:
return self.obj < other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) < 0
def __gt__(self, other):
try:
return self.obj > other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) > 0
def __eq__(self, other):
try:
return self.obj == other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) == 0
def __le__(self, other):
try:
return self.obj <= other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) <= 0
def __ge__(self, other):
try:
return self.obj >= other.obj
except TypeError:
return default_3way_compare(self.obj, other.obj) >= 0
return K
We can solve this problem in the following way.
Group by type.
Find which types are comparable by attempting to compare a single representative of each type.
Merge groups of comparable types.
Sort merged groups, if possible.
yield from (sorted) merged groups
We can get a deterministic and orderable key function from types by using repr(type(x)). Note that the 'type hierarchy' here is determined by the repr of the types themselves. A flaw in this method is that if two types have identical __repr__ (the types themselves, not the instances), you will 'confuse' types. This can be solved by using a key function that returns a tuple (repr(type), id(type)), but I have not implemented that in this solution.
The advantage of my method over Bas Swinkel's is a cleaner handling of a group of un-orderable elements. We do not have quadratic behavior; instead, the function gives up after the first attempted ordering during sorted()).
My method functions worst in the scenario where there are an extremely large number of different types in the iterable. This is a rare scenario, but I suppose it could come up.
def py2sort(iterable):
by_type_repr = lambda x: repr(type(x))
iterable = sorted(iterable, key = by_type_repr)
types = {type_: list(group) for type_, group in groupby(iterable, by_type_repr)}
def merge_compatible_types(types):
representatives = [(type_, items[0]) for (type_, items) in types.items()]
def mergable_types():
for i, (type_0, elem_0) in enumerate(representatives, 1):
for type_1, elem_1 in representatives[i:]:
if _comparable(elem_0, elem_1):
yield type_0, type_1
def merge_types(a, b):
try:
types[a].extend(types[b])
del types[b]
except KeyError:
pass # already merged
for a, b in mergable_types():
merge_types(a, b)
return types
def gen_from_sorted_comparable_groups(types):
for _, items in types.items():
try:
items = sorted(items)
except TypeError:
pass #unorderable type
yield from items
types = merge_compatible_types(types)
return list(gen_from_sorted_comparable_groups(types))
def _comparable(x, y):
try:
x < y
except TypeError:
return False
else:
return True
if __name__ == '__main__':
print('before py2sort:')
test = [2, -11.6, 3, 5.0, (1, '5', 3), (object, object()), complex(2, 3), [list, tuple], Fraction(11, 2), '2', type, str, 'foo', object(), 'bar']
print(test)
print('after py2sort:')
print(py2sort(test))
I'd like to recommend starting this sort of task (like imitation of another system's behaviour very close to this one) with detailed clarifying of the target system. How should it work with different corner cases. One of the best ways to do it - write a bunch of tests to ensure correct behaviour. Having such tests gives:
Better understandind which elements should precede which
Basic documenting
Makes system robust against some refactoring and adding functionality. For example if one more rule is added - how to get sure previous are not gets broken?
One can write such test cases:
sort2_test.py
import unittest
from sort2 import sorted2
class TestSortNumbers(unittest.TestCase):
"""
Verifies numbers are get sorted correctly.
"""
def test_sort_empty(self):
self.assertEqual(sorted2([]), [])
def test_sort_one_element_int(self):
self.assertEqual(sorted2([1]), [1])
def test_sort_one_element_real(self):
self.assertEqual(sorted2([1.0]), [1.0])
def test_ints(self):
self.assertEqual(sorted2([1, 2]), [1, 2])
def test_ints_reverse(self):
self.assertEqual(sorted2([2, 1]), [1, 2])
class TestSortStrings(unittest.TestCase):
"""
Verifies numbers are get sorted correctly.
"""
def test_sort_one_element_str(self):
self.assertEqual(sorted2(["1.0"]), ["1.0"])
class TestSortIntString(unittest.TestCase):
"""
Verifies numbers and strings are get sorted correctly.
"""
def test_string_after_int(self):
self.assertEqual(sorted2([1, "1"]), [1, "1"])
self.assertEqual(sorted2([0, "1"]), [0, "1"])
self.assertEqual(sorted2([-1, "1"]), [-1, "1"])
self.assertEqual(sorted2(["1", 1]), [1, "1"])
self.assertEqual(sorted2(["0", 1]), [1, "0"])
self.assertEqual(sorted2(["-1", 1]), [1, "-1"])
class TestSortIntDict(unittest.TestCase):
"""
Verifies numbers and dict are get sorted correctly.
"""
def test_string_after_int(self):
self.assertEqual(sorted2([1, {1: 2}]), [1, {1: 2}])
self.assertEqual(sorted2([0, {1: 2}]), [0, {1: 2}])
self.assertEqual(sorted2([-1, {1: 2}]), [-1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
Next one may have such sorting function:
sort2.py
from numbers import Real
from decimal import Decimal
from itertools import tee, filterfalse
def sorted2(iterable):
"""
:param iterable: An iterable (array or alike)
entity which elements should be sorted.
:return: List with sorted elements.
"""
def predicate(x):
return isinstance(x, (Real, Decimal))
t1, t2 = tee(iterable)
numbers = filter(predicate, t1)
non_numbers = filterfalse(predicate, t2)
sorted_numbers = sorted(numbers)
sorted_non_numbers = sorted(non_numbers, key=str)
return sorted_numbers + sorted_non_numbers
Usage is quite simple and is documented in tests:
>>> from sort2 import sorted2
>>> sorted2([1,2,3, "aaa", {3:5}, [1,2,34], {-8:15}])
[1, 2, 3, [1, 2, 34], 'aaa', {-8: 15}, {3: 5}]
To avoid the use of exceptions and going for a type based solution, i came up with this:
#! /usr/bin/python3
import itertools
def p2Sort(x):
notImpl = type(0j.__gt__(0j))
it = iter(x)
first = next(it)
groups = [[first]]
types = {type(first):0}
for item in it:
item_type = type(item)
if item_type in types.keys():
groups[types[item_type]].append(item)
else:
types[item_type] = len(types)
groups.append([item])
#debuggng
for group in groups:
print(group)
for it in group:
print(type(it),)
#
for i in range(len(groups)):
if type(groups[i][0].__gt__(groups[i][0])) == notImpl:
continue
groups[i] = sorted(groups[i])
return itertools.chain.from_iterable(group for group in groups)
x = [0j, 'one', 2.3, 'four', -5, 3j, 0j, -5.5, 13 , 15.3, 'aa', 'zz']
print(list(p2Sort(x)))
Note that an additional dictionary to hold the different types in list and a type holding variable (notImpl) is needed. Further note, that floats and ints aren't mixed here.
Output:
================================================================================
05.04.2017 18:27:57
~/Desktop/sorter.py
--------------------------------------------------------------------------------
[0j, 3j, 0j]
<class 'complex'>
<class 'complex'>
<class 'complex'>
['one', 'four', 'aa', 'zz']
<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>
[2.3, -5.5, 15.3]
<class 'float'>
<class 'float'>
<class 'float'>
[-5, 13]
<class 'int'>
<class 'int'>
[0j, 3j, 0j, 'aa', 'four', 'one', 'zz', -5.5, 2.3, 15.3, -5, 13]
Here is one method of accomplishing this:
lst = [0, 'one', 2.3, 'four', -5]
a=[x for x in lst if type(x) == type(1) or type(x) == type(1.1)]
b=[y for y in lst if type(y) == type('string')]
a.sort()
b.sort()
c = a+b
print(c)
#martijn-pieters I don't know if list in python2 also has a __cmp__ to handle comparing list objects or how it was handled in python2.
Anyway, in addition to the #martijn-pieters's answer, I used the following list comparator, so at least it doesn't give different sorted output based on different order of elements in the same input set.
#per_type_cmp(list)
def list_cmp(a, b):
for a_item, b_item in zip(a, b):
if a_item == b_item:
continue
return python2_sort_key(a_item) < python2_sort_key(b_item)
return len(a) < len(b)
So, joining it with original answer by Martijn:
from numbers import Number
# decorator for type to function mapping special cases
def per_type_cmp(type_):
try:
mapping = per_type_cmp.mapping
except AttributeError:
mapping = per_type_cmp.mapping = {}
def decorator(cmpfunc):
mapping[type_] = cmpfunc
return cmpfunc
return decorator
class python2_sort_key(object):
_unhandled_types = {complex}
def __init__(self, ob):
self._ob = ob
def __lt__(self, other):
_unhandled_types = self._unhandled_types
self, other = self._ob, other._ob # we don't care about the wrapper
# default_3way_compare is used only if direct comparison failed
try:
return self < other
except TypeError:
pass
# hooks to implement special casing for types, dict in Py2 has
# a dedicated __cmp__ method that is gone in Py3 for example.
for type_, special_cmp in per_type_cmp.mapping.items():
if isinstance(self, type_) and isinstance(other, type_):
return special_cmp(self, other)
# explicitly raise again for types that won't sort in Python 2 either
if type(self) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(self).__name__))
if type(other) in _unhandled_types:
raise TypeError('no ordering relation is defined for {}'.format(
type(other).__name__))
# default_3way_compare from Python 2 as Python code
# same type but no ordering defined, go by id
if type(self) is type(other):
return id(self) < id(other)
# None always comes first
if self is None:
return True
if other is None:
return False
# Sort by typename, but numbers are sorted before other types
self_tname = '' if isinstance(self, Number) else type(self).__name__
other_tname = '' if isinstance(other, Number) else type(other).__name__
if self_tname != other_tname:
return self_tname < other_tname
# same typename, or both numbers, but different type objects, order
# by the id of the type object
return id(type(self)) < id(type(other))
#per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
if len(a) != len(b):
return len(a) < len(b)
adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
if adiff is _s:
# All keys in a have a matching value in b, so the dicts are equal
return False
bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
if adiff != bdiff:
return python2_sort_key(adiff) < python2_sort_key(bdiff)
return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])
#per_type_cmp(list)
def list_cmp(a, b):
for a_item, b_item in zip(a, b):
if a_item == b_item:
continue
return python2_sort_key(a_item) < python2_sort_key(b_item)
return len(a) < len(b)
PS: It makes more sense to create it as an comment but I didn't have enough reputation to make a comment. So, I'm creating it as an answer instead.

Extending grouping code to handle more general inputs

I need to group either a list of floats, or a list of (named)tuples of varying length, into groups based on whether or not the key is bigger or smaller than a given value.
For example given a list of powers of 2 less than 1, and a list of cutoffs:
twos = [2**(-(i+1)) for i in range(0,10)]
cutoffs = [0.5, 0.125, 0.03125]
Then function
split_into_groups(twos, cutoffs)
should return
[[0.5], [0.25, 0.125], [0.0625, 0.03125], [0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
I've implemented the function like this:
def split_by_prob(items, cutoff, groups, key=None):
for k,g in groupby(enumerate(items), lambda (j,x): x<cutoff):
groups.append((map(itemgetter(1),g)))
return groups
def split_into_groups(items, cutoffs, key=None):
groups = items
final = []
for i in cutoffs:
groups = split_by_prob(groups,i,[],key)
if len(groups) > 1:
final.append(groups[0])
groups = groups.pop()
else:
final.append(groups[0])
return final
final.append(groups)
return final
The tests that these currently pass are:
>>> split_by_prob(twos, 0.5, [])
[[0.5], [0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
>>> split_into_groups(twos, cutoffs)
[[0.5], [0.25, 0.125], [0.0625, 0.03125], [0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625]]
>>> split_into_groups(twos, cutoffs_p10)
[[0.5, 0.25, 0.125], [0.0625, 0.03125, 0.015625], [0.0078125, 0.00390625, 0.001953125], [0.0009765625]]
Where cutoffs_p10 = [10**(-(i+1)) for i in range(0,5)]
I can straightforwardly extend this to a list of tuples of the form
items = zip(range(0,10), twos)
by changing
def split_by_prob(items, cutoff, groups, key=None):
for k,g in groupby(enumerate(items), lambda (j,x): x<cutoff):
groups.append((map(itemgetter(1),g)))
return groups
to
def split_by_prob(items, cutoff, groups, key=None):
for k,g in groupby(enumerate(items), lambda (j,x): x[1]<cutoff):
groups.append((map(itemgetter(1),g)))
return groups
How do I go about extending the original method by adding a key that defaults to a list of floats (or ints etc) but one that could handle tuples and namedtuples?
For example something like:
split_into_groups(items, cutoffs, key=items[0])
would return
[[(0,0.5)], [(1,0.25), (2,0.125)], [(3,0.0625), (4,0.03125)], [(5,0.015625), (6,0.0078125), (7,0.00390625), (8,0.001953125), (9,0.0009765625)]]
In my answer I assume, the cutoffs are at the end in increasing order - just to simplify the situation.
Discriminator detecting a slot
class Discriminator(object):
def __init__(self, cutoffs):
self.cutoffs = sorted(cutoffs)
self.maxslot = len(cutoffs)
def findslot(self, num):
cutoffs = self.cutoffs
for slot, edge in enumerate(self.cutoffs):
if num < edge:
return slot
return self.maxslot
grouper to put items into slots
from collections import defaultdict
def grouper(cutoffs, items, key=None):
if not key:
key = lambda itm: itm
discr = Discriminator(cutoffs)
result = defaultdict(list)
for item in items:
num = key(item)
result[discr.findslot(num)].append(item)
return result
def split_into_groups(cutoffs, numbers, key=None):
groups = grouper(cutoffs, numbers, key)
slot_ids = sorted(groups.keys())
return [groups[slot_id] for slot_id in slot_ids]
Conclusions about Discriminator and grouper
Proposed Discriminator works even for unsorted items.
Conclusions about key
In fact, providing the key functions is easier, than it originally looked.
It is just a function provided via parameter, so it becomes an alias for the transformation function to call to get the value, we want to use for comparing, grouping etc.
There is special case of None, for such a situation we have to use some identity function.
Simplest one is
func = lambda itm: itm
Note: all the functions above were tested by a test suite (incl. use of key function, but I removed it from this answer as it was becoming far too long.

Python: determining whether any item in sequence is equal to any other

I'd like to compare multiple objects and return True only if all objects are not equal among themselves. I tried using the code below, but it doesn't work. If obj1 and obj3 are equal and obj2 and obj3 are not equal, the result is True.
obj1 != obj2 != obj3
I have more than 3 objects to compare. Using the code below is out of question:
all([obj1 != obj2, obj1 != obj3, obj2 != obj3])
#Michael Hoffman's answer is good if the objects are all hashable. If not, you can use itertools.combinations:
>>> all(a != b for a, b in itertools.combinations(['a', 'b', 'c', 'd', 'a'], 2))
False
>>> all(a != b for a, b in itertools.combinations(['a', 'b', 'c', 'd'], 2))
True
If the objects are all hashable, then you can see whether a frozenset of the sequence of objects has the same length as the sequence itself:
def all_different(objs):
return len(frozenset(objs)) == len(objs)
Example:
>>> all_different([3, 4, 5])
True
>>> all_different([3, 4, 5, 3])
False
If the objects are unhashable but orderable (for example, lists) then you can transform the itertools solution from O(n^2) to O(n log n) by sorting:
def all_different(*objs):
s = sorted(objs)
return all(x != y for x, y in zip(s[:-1], s[1:]))
Here's a full implementation:
def all_different(*objs):
try:
return len(frozenset(objs)) == len(objs)
except TypeError:
try:
s = sorted(objs)
return all(x != y for x, y in zip(s[:-1], s[1:]))
except TypeError:
return all(x != y for x, y in itertools.combinations(objs, 2))
from itertools import combinations
all(x != y for x, y in combinations(objs, 2))
You can check that all of the items in a list are unique by converting it to a set.
my_obs = [obj1, obj2, obj3]
all_not_equal = len(set(my_obs)) == len(my_obs)

How to convert generator or iterator to list recursively

I want to convert generator or iterator to list recursively.
I wrote a code in below, but it looks naive and ugly, and may be dropped case in doctest.
Q1. Help me good version.
Q2. How to specify object is immutable or not?
import itertools
def isiterable(datum):
return hasattr(datum, '__iter__')
def issubscriptable(datum):
return hasattr(datum, "__getitem__")
def eagerlize(obj):
""" Convert generator or iterator to list recursively.
return a eagalized object of given obj.
This works but, whether it return a new object, break given one.
test 1.0 iterator
>>> q = itertools.permutations('AB', 2)
>>> eagerlize(q)
[('A', 'B'), ('B', 'A')]
>>>
test 2.0 generator in list
>>> q = [(2**x for x in range(3))]
>>> eagerlize(q)
[[1, 2, 4]]
>>>
test 2.1 generator in tuple
>>> q = ((2**x for x in range(3)),)
>>> eagerlize(q)
([1, 2, 4],)
>>>
test 2.2 generator in tuple in generator
>>> q = (((x, (y for y in range(x, x+1))) for x in range(3)),)
>>> eagerlize(q)
([(0, [0]), (1, [1]), (2, [2])],)
>>>
test 3.0 complex test
>>> def test(r):
... for x in range(3):
... r.update({'k%s'%x:x})
... yield (n for n in range(1))
>>>
>>> def creator():
... r = {}
... t = test(r)
... return r, t
>>>
>>> a, b = creator()
>>> q = {'b' : a, 'a' : b}
>>> eagerlize(q)
{'a': [[0], [0], [0]], 'b': {'k2': 2, 'k1': 1, 'k0': 0}}
>>>
test 3.1 complex test (other dict order)
>>> a, b = creator()
>>> q = {'b' : b, 'a' : a}
>>> eagerlize(q)
{'a': {'k2': 2, 'k1': 1, 'k0': 0}, 'b': [[0], [0], [0]]}
>>>
test 4.0 complex test with tuple
>>> a, b = creator()
>>> q = {'b' : (b, 10), 'a' : (a, 10)}
>>> eagerlize(q)
{'a': ({'k2': 2, 'k1': 1, 'k0': 0}, 10), 'b': ([[0], [0], [0]], 10)}
>>>
test 4.1 complex test with tuple (other dict order)
>>> a, b = creator()
>>> q = {'b' : (b, 10), 'a' : (a, 10)}
>>> eagerlize(q)
{'a': ({'k2': 2, 'k1': 1, 'k0': 0}, 10), 'b': ([[0], [0], [0]], 10)}
>>>
"""
def loop(obj):
if isiterable(obj):
for k, v in obj.iteritems() if isinstance(obj, dict) \
else enumerate(obj):
if isinstance(v, tuple):
# immutable and iterable object must be recreate,
# but realy only tuple?
obj[k] = tuple(eagerlize(list(obj[k])))
elif issubscriptable(v):
loop(v)
elif isiterable(v):
obj[k] = list(v)
loop(obj[k])
b = [obj]
loop(b)
return b[0]
def _test():
import doctest
doctest.testmod()
if __name__=="__main__":
_test()
To avoid badly affecting the original object, you basically need a variant of copy.deepcopy... subtly tweaked because you need to turn generators and iterators into lists (deepcopy wouldn't deep-copy generators anyway). Note that some effect on the original object is unfortunately inevitable, because generators and iterators are "exhausted" as a side effect of iterating all the way on them (be it to turn them into lists or for any other purpose) -- therefore, there is simply no way you can both leave the original object alone and have that generator or other iterator turned into a list in the "variant-deepcopied" result.
The copy module is unfortunately not written to be customized, so the alternative ares, either copy-paste-edit, or a subtle (sigh) monkey-patch hinging on (double-sigh) the private module variable _deepcopy_dispatch (which means your patched version might not survive a Python version upgrade, say from 2.6 to 2.7, hypothetically). Plus, the monkey-patch would have to be uninstalled after each use of your eagerize (to avoid affecting other uses of deepcopy). So, let's assume we pick the copy-paste-edit route instead.
Say we start with the most recent version, the one that's online here. You need to rename module, of course; rename the externally visible function deepcopy to eagerize at line 145; the substantial change is at lines 161-165, which in said version, annotated, are:
161 : copier = _deepcopy_dispatch.get(cls)
162 : if copier:
163 : y = copier(x, memo)
164 : else:
165 : tim_one 18729 try:
We need to insert between line 163 and 164 the logic "otherwise if it's iterable expand it to a list (i.e., use the function _deepcopy_list as the copier". So these lines become:
161 : copier = _deepcopy_dispatch.get(cls)
162 : if copier:
163 : y = copier(x, memo)
elif hasattr(cls, '__iter__'):
y = _deepcopy_list(x, memo)
164 : else:
165 : tim_one 18729 try:
That's all: just there two added lines. Note that I've left the original line numbers alone to make it perfectly clear where exactly these two lines need to be inserted, and not numbered the two new lines. You also need to rename other instances of identifier deepcopy (indirect recursive calls) to eagerize.
You should also remove lines 66-144 (the shallow-copy functionality that you don't care about) and appropriately tweak lines 1-65 (docstrings, imports, __all__, etc).
Of course, you want to work off a copy of the plaintext version of copy.py, here, not the annotated version I've been referring to (I used the annotated version just to clarify exactly where the changes were needed!-).

Categories