Implementing __getitem__ - python

Is there a way to implement __getitem__ in a way that supports integer and slice indices without manually checking the type of the argument?
I see a lot of examples of this form, but it seems very hacky to me.
def __getitem__(self, key):
    if isinstance(key, int):
        pass  # do integery foo here
    if isinstance(key, slice):
        pass  # do slicey bar here
On a related note, why does this problem exist in the first place? Sometimes receiving an int and sometimes a slice is weird design. Calling foo[4] should call foo.__getitem__(slice(4,5,1)) or similar.

You could use exception handling; assume key is a slice object and call the indices() method on it. If that fails it must've been an integer:
def __getitem__(self, key):
    try:
        # indices() returns a (start, stop, step) tuple, so expand it into range()
        return [self.somelist[i] * 5 for i in range(*key.indices(self.length))]
    except AttributeError:
        # not a slice object (no `indices` attribute)
        return self.somelist[key] * 5
Most use-cases for custom containers don't need to support slicing, and historically, the __getitem__ method only ever had to handle integers (for sequences, that is); the __getslice__() method was there to handle slicing instead. When __getslice__ was deprecated, for backwards compatibility and for simpler APIs it was easier to have __getitem__ handle both integers and slice objects.
And that is ignoring the fact that outside sequences, key doesn't have to be an integer. Custom classes are free to support any key type they like.
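For comparison, here is a minimal sketch of a container that branches on the key type explicitly instead of catching AttributeError. The class name TimesFive is hypothetical and only mirrors the snippet above; operator.index() and slice.indices() are standard library calls.

```python
import operator


class TimesFive:
    """Hypothetical container that returns each stored item multiplied by 5."""

    def __init__(self, items):
        self.somelist = list(items)

    def __len__(self):
        return len(self.somelist)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices() clamps start/stop/step to the container length
            return [self.somelist[i] * 5 for i in range(*key.indices(len(self)))]
        # operator.index() accepts int-like objects and rejects floats and strings
        return self.somelist[operator.index(key)] * 5


box = TimesFive([1, 2, 3, 4])
print(box[1])     # 10
print(box[1:3])   # [10, 15]
print(box[::-1])  # [20, 15, 10, 5]
```

Using operator.index() rather than isinstance(key, int) is a design choice: it also admits objects that define __index__, which is how the built-in sequences treat integer-like keys.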

Related

I'm trying to validate a python class property, but it isn't raising the error I tried to raise. How should I correct this? [duplicate]

How do I check if an object is of a given type, or if it inherits from a given type?
How do I check if the object o is of type str?
Beginners often wrongly expect the string to already be "a number" - either expecting Python 3.x input to convert type, or expecting that a string like '1' is also simultaneously an integer. This is the wrong canonical for those questions. Please carefully read the question and then use How do I check if a string represents a number (float or int)?, How can I read inputs as numbers? and/or Asking the user for input until they give a valid response as appropriate.
Use isinstance to check if o is an instance of str or any subclass of str:
if isinstance(o, str):
To check if the type of o is exactly str, excluding subclasses of str:
if type(o) is str:
See Built-in Functions in the Python Library Reference for relevant information.
Checking for strings in Python 2
For Python 2, this is a better way to check if o is a string:
if isinstance(o, basestring):
because this will also catch Unicode strings. unicode is not a subclass of str; both str and unicode are subclasses of basestring. In Python 3, basestring no longer exists since there's a strict separation of strings (str) and binary data (bytes).
Alternatively, isinstance accepts a tuple of classes. This will return True if o is an instance of any subclass of any of (str, unicode):
if isinstance(o, (str, unicode)):
The most Pythonic way to check the type of an object is... not to check it.
Since Python encourages Duck Typing, you should just try...except to use the object's methods the way you want to use them. So if your function is looking for a writable file object, don't check that it's a subclass of file, just try to use its .write() method!
Of course, sometimes these nice abstractions break down and isinstance(obj, cls) is what you need. But use sparingly.
isinstance(o, str) will return True if o is a str or is of a type that inherits from str.
type(o) is str will return True if and only if o is a str. It will return False if o is of a type that inherits from str.
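A quick illustration of the difference, using the built-in bool type, which is a subclass of int (the variable name flag is arbitrary):

```python
flag = True

# bool inherits from int, so isinstance() accepts it
print(isinstance(flag, int))  # True
# type() compares the exact class, so the subclass is rejected
print(type(flag) is int)      # False
print(type(flag) is bool)     # True
```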
After the question was asked and answered, type hints were added to Python. Type hints in Python allow types to be checked but in a very different way from statically typed languages. Type hints in Python associate the expected types of arguments with functions as runtime accessible data associated with functions and this allows for types to be checked. Example of type hint syntax:
def foo(i: int):
    return i
foo(5)
foo('oops')
In this case we want an error to be triggered for foo('oops') since the annotated type of the argument is int. The added type hint does not cause an error to occur when the script is run normally. However, it adds attributes to the function describing the expected types that other programs can query and use to check for type errors.
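The stored hint can be inspected directly through the function's __annotations__ attribute. A small sketch, reusing the foo function from the example:

```python
def foo(i: int):
    return i


# The hint is stored on the function object; nothing enforces it at call time.
print(foo.__annotations__)  # {'i': <class 'int'>}
print(foo('oops'))          # runs without error and prints: oops
```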
One of these other programs that can be used to find the type error is mypy:
mypy script.py
script.py:12: error: Argument 1 to "foo" has incompatible type "str"; expected "int"
(You might need to install mypy from your package manager. I don't think it comes with CPython but seems to have some level of "officialness".)
Type checking this way is different from type checking in statically typed compiled languages. Because types are dynamic in Python, type checking must be done at runtime, which imposes a cost -- even on correct programs -- if we insist that it happen at every chance. Explicit type checks may also be more restrictive than needed and cause unnecessary errors (e.g. does the argument really need to be of exactly list type or is anything iterable sufficient?).
The upside of explicit type checking is that it can catch errors earlier and give clearer error messages than duck typing. The exact requirements of a duck type can only be expressed with external documentation (hopefully it's thorough and accurate) and errors from incompatible types can occur far from where they originate.
Python's type hints are meant to offer a compromise where types can be specified and checked but there is no additional cost during usual code execution.
The typing module offers generic types that can be used in type hints to express needed behaviors without requiring particular types. For example, it includes types such as Iterable and Callable for hints to specify the need for any type with those behaviors.
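As a rough sketch of what such behavior-based hints look like, the function below accepts any callable from int to int and any iterable of ints. The names apply_all and double are made up for this illustration.

```python
from typing import Callable, Iterable, List


def apply_all(func: Callable[[int], int], values: Iterable[int]) -> List[int]:
    """Apply func to every value; any iterable of ints is acceptable."""
    return [func(v) for v in values]


def double(n: int) -> int:
    return n * 2


# A list, a range and a generator all satisfy Iterable[int].
print(apply_all(double, [1, 2, 3]))             # [2, 4, 6]
print(apply_all(double, range(3)))              # [0, 2, 4]
print(apply_all(double, (n for n in (5, 6))))   # [10, 12]
```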
While type hints are the most Pythonic way to check types, it's often even more Pythonic to not check types at all and rely on duck typing. Type hints are relatively new and the jury is still out on when they're the most Pythonic solution. A relatively uncontroversial but very general comparison: Type hints provide a form of documentation that can be enforced, allow code to generate earlier and easier to understand errors, can catch errors that duck typing can't, and can be checked statically (in an unusual sense but it's still outside of runtime). On the other hand, duck typing has been the Pythonic way for a long time, doesn't impose the cognitive overhead of static typing, is less verbose, and will accept all viable types and then some.
In Python 3.10, you can use | in isinstance:
>>> isinstance('1223', int | str)
True
>>> isinstance('abcd', int | str)
True
isinstance(o, str)
Link to docs
You can check for type of a variable using __name__ of a type.
Ex:
>>> a = [1,2,3,4]
>>> b = 1
>>> type(a).__name__
'list'
>>> type(a).__name__ == 'list'
True
>>> type(b).__name__ == 'list'
False
>>> type(b).__name__
'int'
For more complex type validations I like typeguard's approach of validating based on python type hint annotations:
from typeguard import check_type
from typing import List
try:
    check_type('mylist', [1, 2], List[int])
except TypeError as e:
    print(e)
You can perform very complex validations in very clean and readable fashion.
check_type('foo', [1, 3.14], List[Union[int, float]])
# vs
isinstance(foo, list) and all(isinstance(a, (int, float)) for a in foo)
I think the cool thing about using a dynamic language like Python is you really shouldn't have to check something like that.
I would just call the required methods on your object and catch an AttributeError. Later on this will allow you to call your methods with other (seemingly unrelated) objects to accomplish different tasks, such as mocking an object for testing.
I've used this a lot when getting data off the web with urllib2.urlopen(), which returns a file-like object. This can in turn be passed to almost any method that reads from a file, because it implements the same read() method as a real file.
But I'm sure there is a time and place for using isinstance(), otherwise it probably wouldn't be there :)
The accepted answer answers the question in that it provides the answers to the asked questions.
Q: What is the best way to check whether a given object is of a given type? How about checking whether the object inherits from a given type?
A: Use isinstance, issubclass, type to check based on types.
As other answers and comments are quick to point out however, there's a lot more to the idea of "type-checking" than that in python. Since the addition of Python 3 and type hints, much has changed as well. Below, I go over some of the difficulties with type checking, duck typing, and exception handling. For those that think type checking isn't what is needed (it usually isn't, but we're here), I also point out how type hints can be used instead.
Type Checking
Type checking is not always an appropriate thing to do in python. Consider the following example:
def sum(nums):
    """Expect an iterable of integers and return the sum."""
    result = 0
    for n in nums:
        result += n
    return result
To check if the input is an iterable of integers, we run into a major issue. The only way to check if every element is an integer would be to loop through to check each element. But if we loop through the entire iterator, then there will be nothing left for intended code. We have two options in this kind of situation.
Check as we loop.
Check beforehand but store everything as we check.
Option 1 has the downside of complicating our code, especially if we need to perform similar checks in many places. It forces us to move type checking from the top of the function to everywhere we use the iterable in our code.
Option 2 has the obvious downside that it destroys the entire purpose of iterators. The entire point is to not store the data because we shouldn't need to.
One might also think that if checking all of the elements is too much, then perhaps we can just check whether the input itself is iterable, but there isn't actually a base class that every iterable inherits from. Any type implementing __iter__ is iterable.
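The standard library does offer collections.abc.Iterable, but it is an abstract base class that recognises anything defining __iter__, not a class that iterables must inherit from. A small sketch of what it does and does not detect (the classes Countdown and OldStyle are made up for illustration):

```python
from collections.abc import Iterable


class Countdown:
    """Defines __iter__ without inheriting from anything special."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


class OldStyle:
    """Iterable only through the legacy __getitem__ protocol."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


print(isinstance(Countdown(3), Iterable))  # True: __iter__ is detected
print(isinstance(OldStyle(), Iterable))    # False: there is no __iter__ ...
print(list(OldStyle()))                    # ... yet iteration works: [0, 1, 2]
```

So even this check is only an approximation, and it says nothing about the element types.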
Exception Handling and Duck Typing
An alternative approach would be to forgo type checking altogether and focus on exception handling and duck typing instead. That is to say, wrap your code in a try-except block and catch any errors that occur. Alternatively, don't do anything and let exceptions rise naturally from your code.
Here's one way to go about catching an exception.
def sum(nums):
    """Try to catch exceptions?"""
    try:
        result = 0
        for n in nums:
            result += n
        return result
    except TypeError as e:
        print(e)
Compared to the options before, this is certainly better. We're checking as we run the code. If there's a TypeError anywhere, we'll know. We don't have to place a check everywhere that we loop through the input. And we don't have to store the input as we iterate over it.
Furthermore, this approach enables duck typing. Rather than checking for specific types, we have moved to checking for specific behaviors and look for when the input fails to behave as expected (in this case, looping through nums and being able to add n).
However, the exact reasons which make exception handling nice can also be their downfall.
A float isn't an int, but it satisfies the behavioral requirements to work.
It is also bad practice to wrap the entire code with a try-except block.
At first these may not seem like issues, but here's some reasons that may change your mind.
A user can no longer expect our function to return an int as intended. This may break code elsewhere.
Since exceptions can come from a wide variety of sources, using the try-except on the whole code block may end up catching exceptions you didn't intend to. We only wanted to check if nums was iterable and had integer elements.
Ideally we'd like to catch exceptions our code generates and raise, in their place, more informative exceptions. It's not fun when an exception is raised from someone else's code with no explanation other than a line you didn't write and that some TypeError occurred.
In order to fix the exception handling in response to the above points, our code would then become this... abomination.
def sum(nums):
    """
    Try to catch all of our exceptions only.
    Re-raise them with more specific details.
    """
    result = 0
    try:
        iter(nums)
    except TypeError as e:
        raise TypeError("nums must be iterable")
    for n in nums:
        try:
            result += int(n)
        except TypeError as e:
            raise TypeError("stopped mid iteration since a non-integer was found")
    return result
You can kinda see where this is going. The more we try to "properly" check things, the worse our code is looking. Compared to the original code, this isn't readable at all.
We could argue perhaps this is a bit extreme. But on the other hand, this is only a very simple example. In practice, your code is probably much more complicated than this.
Type Hints
We've seen what happens when we try to modify our small example to "enable type checking". Rather than focusing on trying to force specific types, type hinting allows for a way to make types clear to users.
from typing import Iterable
def sum(nums: Iterable[int]) -> int:
    result = 0
    for n in nums:
        result += n
    return result
Here are some advantages to using type-hints.
The code actually looks good now!
Static type analysis may be performed by your editor if you use type hints!
They are stored on the function/class, making them dynamically usable e.g. typeguard and dataclasses.
They show up for functions when using help(...).
No need to sanity check if your input type is right based on a description or worse lack thereof.
You can "type" hint based on structure e.g. "does it have this attribute?" without requiring subclassing by the user.
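The last point, hinting on structure, can be sketched with typing.Protocol (available since Python 3.8). The names SupportsWrite, MemoryLog and log_hello are hypothetical and exist only for this example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsWrite(Protocol):
    """Anything with a write() method, regardless of its base classes."""

    def write(self, data: str) -> int:
        ...


class MemoryLog:
    """Hypothetical class; note that it does not inherit from SupportsWrite."""

    def __init__(self):
        self.lines = []

    def write(self, data: str) -> int:
        self.lines.append(data)
        return len(data)


def log_hello(target: SupportsWrite) -> None:
    target.write("hello")


log = MemoryLog()
log_hello(log)
print(log.lines)                       # ['hello']
print(isinstance(log, SupportsWrite))  # True, based on method presence only
print(isinstance(42, SupportsWrite))   # False
```

The runtime isinstance() check only looks for the presence of write(); it does not verify the method's signature, so it is a loose check at best.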
The downside to type hinting?
Type hints are nothing more than syntax and special text on their own. It isn't the same as type checking.
In other words, it doesn't actually answer the question because it doesn't provide type checking. Regardless, however, if you are here for type checking, then you should be type hinting as well. Of course, if you've come to the conclusion that type checking isn't actually necessary but you want some semblance of typing, then type hints are for you.
To Hugo:
You probably mean list rather than array, but that points to the whole problem with type checking - you don't want to know if the object in question is a list, you want to know if it's some kind of sequence or if it's a single object. So try to use it like a sequence.
Say you want to add the object to an existing sequence, or if it's a sequence of objects, add them all
try:
    my_sequence.extend(o)
except TypeError:
    my_sequence.append(o)
One trick with this is if you are working with strings and/or sequences of strings - that's tricky, as a string is often thought of as a single object, but it's also a sequence of characters. Worse than that, as it's really a sequence of single-length strings.
I usually choose to design my API so that it only accepts either a single value or a sequence - it makes things easier. It's not hard to put a [ ] around your single value when you pass it in if need be.
(Though this can cause errors with strings, as they do look like (are) sequences.)
If you have to check for the type of str or int, please use isinstance. As already mentioned by others, the reason is to also include subclasses. One important example of subclasses, from my perspective, are Enums with a data type, like IntEnum or StrEnum, which are a pretty nice way to define related constants. However, it is kind of annoying if libraries do not accept those as such types.
Example:
import enum
class MyEnum(str, enum.Enum):
    A = "a"
    B = "b"
print(f"is string: {isinstance(MyEnum.A, str)}") # True
print(f"is string: {type(MyEnum.A) == str}") # False!!!
print(f"is string: {type(MyEnum.A.value) == str}") # True
In Python, you can use the built-in isinstance() function to check if an object is of a given type, or if it inherits from a given type.
To check if the object o is of type str, you would use the following code:
if isinstance(o, str):
    pass  # o is of type str
You can also use the type() function to check the object type.
if type(o) == str:
    pass  # o is of type str
You can also check if the object is a subclass of a particular class using the issubclass() function.
if issubclass(type(o), str):
    pass  # o is a subclass of str
A simple way to check type is to compare it with something whose type you know.
>>> a = 1
>>> type(a) == type(1)
True
>>> b = 'abc'
>>> type(b) == type('')
True
I think the best way is to type your variables well. You can do this by using the "typing" library.
Example:
from typing import NewType
UserId = NewType('UserId', int)
some_id = UserId(524313)
See https://docs.python.org/3/library/typing.html.

What is the best way to work with unique item, list and generator [duplicate]

I want to write a function that accepts a parameter which can be either a sequence or a single value. The type of value is str, int, etc., but I don't want it to be restricted to a hardcoded list.
In other words, I want to know if the parameter X is a sequence or something I have to convert to a sequence to avoid special-casing later. I could do
type(X) in (list, tuple)
but there may be other sequence types I'm not aware of, and no common base class.
-N.
Edit: See my "answer" below for why most of these answers don't help me. Maybe you have something better to suggest.
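As of 2.6, use abstract base classes. (In Python 3.3 and later they live in collections.abc; the collections.Sequence alias was removed in Python 3.10.)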
>>> import collections
>>> isinstance([], collections.Sequence)
True
>>> isinstance(0, collections.Sequence)
False
Furthermore, ABCs can be customized to account for exceptions, such as not considering strings to be sequences. Here is an example:
import abc
import collections
class Atomic(object):
    __metaclass__ = abc.ABCMeta
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, collections.Sequence) or NotImplemented
Atomic.register(basestring)
After registration the Atomic class can be used with isinstance and issubclass:
assert isinstance("hello", Atomic) == True
This is still much better than a hard-coded list, because you only need to register the exceptions to the rule, and external users of the code can register their own.
Note that in Python 3 the syntax for specifying metaclasses changed and the basestring abstract superclass was removed, which requires something like the following to be used instead:
import collections.abc
class Atomic(metaclass=abc.ABCMeta):
    @classmethod
    def __subclasshook__(cls, other):
        # collections.Sequence was removed in Python 3.10; use collections.abc
        return not issubclass(other, collections.abc.Sequence) or NotImplemented
Atomic.register(str)
If desired, it's possible to write code which is compatible with both Python 2.6+ and 3.x, but doing so requires using a slightly more complicated technique which dynamically creates the needed abstract base class, thereby avoiding syntax errors due to the metaclass syntax difference. This is essentially the same as what Benjamin Peterson's six module's with_metaclass() function does.
class _AtomicBase(object):
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, collections.Sequence) or NotImplemented
class Atomic(abc.ABCMeta("NewMeta", (_AtomicBase,), {})):
    pass
try:
    unicode = unicode
except NameError:  # 'unicode' is undefined, assume Python >= 3
    Atomic.register(str)  # str includes unicode in Py3, make both Atomic
    Atomic.register(bytes)  # bytes will also be considered Atomic (optional)
else:
    # basestring is the abstract superclass of both str and unicode types
    Atomic.register(basestring)  # make both types of strings Atomic
In versions before 2.6, there are type checkers in the operator module.
>>> import operator
>>> operator.isSequenceType([])
True
>>> operator.isSequenceType(0)
False
The problem with all of the above mentioned ways is that str is considered a sequence (it's iterable, has __getitem__, etc.) yet it's usually treated as a single item.
For example, a function may accept an argument that can either be a filename or a list of filenames. What's the most Pythonic way for the function to detect the first from the latter?
Based on the revised question, it sounds like what you want is something more like:
def to_sequence(arg):
    '''
    determine whether an arg should be treated as a "unit" or a "sequence"
    if it's a unit, return a 1-tuple with the arg
    '''
    def _multiple(x):
        return hasattr(x,"__iter__")
    if _multiple(arg):
        return arg
    else:
        return (arg,)
>>> to_sequence("a string")
('a string',)
>>> to_sequence( (1,2,3) )
(1, 2, 3)
>>> to_sequence( xrange(5) )
xrange(5)
This isn't guaranteed to handle all types, but it handles the cases you mention quite well, and should do the right thing for most of the built-in types.
When using it, make sure whatever receives the output of this can handle iterables.
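Note that the session above is Python 2. In Python 3, str defines __iter__, so the hasattr check would treat a string as a sequence. A possible Python 3 adaptation, assuming that text and bytes should count as single units (the name to_sequence_py3 is illustrative, not part of the original answer):

```python
from collections.abc import Iterable


def to_sequence_py3(arg):
    """Return arg unchanged if it is a non-string iterable, else a 1-tuple."""
    if isinstance(arg, (str, bytes)):
        return (arg,)  # treat text and bytes as single units
    if isinstance(arg, Iterable):
        return arg
    return (arg,)


print(to_sequence_py3("a string"))      # ('a string',)
print(to_sequence_py3((1, 2, 3)))       # (1, 2, 3)
print(to_sequence_py3(5))               # (5,)
print(list(to_sequence_py3(range(3))))  # [0, 1, 2]
```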
IMHO, the python way is to pass the list as *list. As in:
myfunc(item)
myfunc(*items)
Sequences are described here:
https://docs.python.org/2/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange
So sequences are not the same as iterable objects. I think sequence must implement
__getitem__, whereas iterable objects must implement __iter__.
So, for example, strings are sequences and (in Python 2) don't implement __iter__, while xrange objects are sequences and don't implement __getslice__.
But from what you seem to want to do, I'm not sure you want sequences, but rather iterable objects.
So go for hasattr(X, "__getitem__") if you want sequences, but rather hasattr(X, "__iter__") if you don't want strings, for example.
In cases like this, I prefer to just always take the sequence type or always take the scalar. Strings won't be the only types that would behave poorly in this setup; rather, any type that has an aggregate use and allows iteration over its parts might misbehave.
The simplest method would be to check if you can turn it into an iterator. ie
try:
    it = iter(X)
    # Iterable
except TypeError:
    pass  # Not iterable
If you need to ensure that it's a restartable or random access sequence (ie not a generator etc), this approach won't be sufficient however.
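One rough way to tell a one-shot iterator from a re-iterable container is that iterators return themselves from iter(). This is a heuristic sketch, not a guarantee, and the helper name is_one_shot is made up:

```python
def is_one_shot(obj):
    """Heuristic: iterators (including generators) return themselves from iter()."""
    try:
        return iter(obj) is obj
    except TypeError:
        return False  # not iterable at all


numbers = [1, 2, 3]
gen = (n for n in numbers)

print(is_one_shot(numbers))  # False: a list can be iterated repeatedly
print(is_one_shot(gen))      # True: a generator is consumed by iteration
print(list(gen), list(gen))  # [1, 2, 3] []
```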
As others have noted, strings are also iterable, so if you need to exclude them (particularly important if recursing through items, as list(iter('a')) gives ['a'] again), then you may need to specifically exclude them (in Python 3, use str instead of basestring) with:
if not isinstance(X, basestring):
I'm new here so I don't know what's the correct way to do it. I want to respond to the answers:
The problem with all of the above mentioned ways is that str is considered a sequence (it's iterable, has __getitem__, etc.) yet it's usually treated as a single item.
For example, a function may accept an argument that can either be a filename or a list of filenames. What's the most Pythonic way for the function to detect the first from the latter?
Should I post this as a new question? Edit the original one?
I think what I would do is check whether the object has certain methods that indicate it is a sequence. I'm not sure if there is an official definition of what makes a sequence. The best I can think of is, it must support slicing. So you could say:
is_sequence = '__getslice__' in dir(X)
You might also check for the particular functionality you're going to be using.
As pi pointed out in the comment, one issue is that a string is a sequence, but you probably don't want to treat it as one. You could add an explicit test that the type is not str.
If strings are the problem, detect a sequence and filter out the special case of strings:
def is_iterable(x):
    if type(x) == str:
        return False
    try:
        iter(x)
        return True
    except TypeError:
        return False
You're asking the wrong question. You don't try to detect types in Python; you detect behavior.
Write another function that handles a single value. (let's call it _use_single_val).
Write one function that handles a sequence parameter. (let's call it _use_sequence).
Write a third parent function that calls the two above. (call it use_seq_or_val). Surround each call with an exception handler to catch an invalid parameter (i.e. not single value or sequence).
Write unit tests to pass correct & incorrect parameters to the parent function to make sure it catches the exceptions properly.
def _use_single_val(v):
    print(v + 1)  # this will fail if v is not a value type
def _use_sequence(s):
    print(s[0])  # this will fail if s is not indexable
def use_seq_or_val(item):
    try:
        _use_single_val(item)
        return  # handled as a single value
    except TypeError:
        pass
    try:
        _use_sequence(item)
        return  # handled as a sequence
    except TypeError:
        pass
    raise TypeError("item not a single value or sequence")
EDIT: Revised to handle the "sequence or single value" asked about in the question.
Revised answer:
I don't know if your idea of "sequence" matches what the Python manuals call a "Sequence Type", but in case it does, you should look for the __contains__ method. That is the method Python uses to implement the check "if something in object:"
if hasattr(X, '__contains__'):
    print("X is a sequence")
My original answer:
I would check if the object that you received implements an iterator interface:
if hasattr(X, '__iter__'):
    print("X is a sequence")
For me, that's the closest match to your definition of sequence since that would allow you to do something like:
for each in X:
    print(each)
You could pass your parameter in the built-in len() function and check whether this causes an error. As others said, the string type requires special handling.
According to the documentation the len function can accept a sequence (string, list, tuple) or a dictionary.
You could check that an object is a string with the following code:
x.__class__ == "".__class__

Python isinstance function

I cannot understand why the isinstance function requires a tuple as its second parameter instead of accepting any iterable.
isinstance(some_object, (some_class1, some_class2))
works fine, but
isinstance(some_object, [some_class1, some_class2])
raises a TypeError.
The reason seems to be "allowing only tuples is enough, it's simpler, it avoids the danger of some corner cases, and it seemed neater to the BDFL" (i.e. Guido). (Kudos to @Caleb for posting the key link in the comments.)
Here is an excerpt from this email conversation with Guido van Rossum that specifically addresses the case of other iterables for the isinstance function. (Click on the link for the complete conversation.)
On Thu, Jan 2, 2014 at 1:37 PM, James Powell wrote:
This is driven by a real-world example wherein a large number of
prefixes stored in a set, necessitating:
any('spam'.startswith(c) for c in prefixes)
# or
'spam'.startswith(tuple(prefixes))
Neither of these strikes me as bad. Also, depending on whether the set
of prefixes itself changes dynamically, it may be best to lift the
tuple() call out of the startswith() call.
...
However, .startswith doesn't seem to be the only example of this, and
the other examples are free of the string/iterable ambiguity:
isinstance(x, {int, float})
But this is even less likely to have a dynamically generated argument.
And there could still be another ambiguity here: a metaclass could
conceivably make its instances (i.e. classes) iterable.
It is exactly as it should behave, according to the docs: https://docs.python.org/3/library/functions.html#isinstance
If classinfo is a tuple of type objects (or recursively, other such tuples), return true if object is an instance of any of the types. If classinfo is not a type or tuple of types and such tuples, a TypeError exception is raised.
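The documented behaviour can be checked in a few lines. The variable names value and allowed are arbitrary:

```python
value = 3.5
allowed = [int, float]  # e.g. a list built at runtime

print(isinstance(value, (int, float)))         # True
print(isinstance(value, (str, (int, float))))  # True: nested tuples are allowed

try:
    isinstance(value, allowed)
except TypeError as exc:
    print("list rejected:", exc)

# Converting the list to a tuple first makes it acceptable.
print(isinstance(value, tuple(allowed)))       # True
```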
Because a string is also "some iterable". So you could write:
isinstance(some_object, 'foobar')
and it would check if some_object is an instance of f, o, b, a or r.
This wouldn't work obviously, so isinstance would need to check if the second argument is not a string. Since isinstance needs to do a type check, it might as well make sure the second argument is always a tuple.
Because this is the way the language was designed...
When you write code that can accept more than one type, it is easier to have fixed types that you can directly test than to use duck typing. For example, as strings are iterable, when you want to accept either a string or a sequence of strings you must first test for the string type.
Here I can imagine no strong reason for limiting to the tuple type, but no strong reason either to extend it to any sequence. You could try to propose it on the python-ideas list.
At a high level, you need a container type for isinstance checks, so you have tuples, lists, sets, and dicts as built-in containers. Most likely, they decided on a tuple over a set because the expected use case for isinstance is a small number of types, and a tuple is faster to check for containment than a set.
Mutability really isn't a consideration. If they really needed immutability, they could have just re-wrapped the iterable into a tuple before processing.

How to write __getitem__ cleanly?

In Python, when implementing a sequence type, I often (relatively speaking) find myself writing code like this:
class FooSequence(collections.abc.Sequence):
    # Snip other methods
    def __getitem__(self, key):
        if isinstance(key, int):
            pass  # Get a single item
        elif isinstance(key, slice):
            pass  # Get a whole slice
        else:
            raise TypeError('Index must be int, not {}'.format(type(key).__name__))
The code checks the type of its argument explicitly with isinstance(). This is regarded as an antipattern within the Python community. How do I avoid it?
I cannot use functools.singledispatch, because that's quite deliberately incompatible with methods (it will attempt to dispatch on self, which is entirely useless since we're already dispatching on self via OOP polymorphism). It works with @staticmethod, but what if I need to get stuff out of self?
Casting to int() and then catching the TypeError, checking for a slice, and possibly re-raising is still ugly, though perhaps slightly less so.
It might be cleaner to convert integers into one-element slices and handle both situations with the same code, but that has its own problems (return 0 or [0]?).
As much as it seems odd, I suspect that the way you have it is the best way to go about things. Patterns generally exist to encompass common use cases, but that doesn't mean that they should be taken as gospel when following them makes life more difficult. The main reason that PEP 443 gives for balking at explicit typechecking is that it is "brittle and closed to extension". However, that mainly applies to custom functions that take a number of different types at any time. From the Python docs on __getitem__:
For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.
The Python documentation explicitly states the two types that should be accepted, and what to do if an item that is not of those two types is provided. Given that the types are provided by the documentation itself, it's unlikely to change (doing so would break far more implementations than just yours), so it's likely not worth the trouble to go out of your way to code against Python itself potentially changing.
If you're set on avoiding explicit typechecking, I would point you toward this SO answer. It contains a concise implementation of a @methdispatch decorator (not my name, but I'll roll with it) that lets @singledispatch work with methods by forcing it to check args[1] (arg) rather than args[0] (self). Using that should allow you to use custom single dispatch with your __getitem__ method.
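As a variant of the decorator approach, the standard library has shipped functools.singledispatchmethod since Python 3.8, which dispatches on the first argument after self. A sketch of how it might be applied here; this FooSequence is a hypothetical stand-in that keeps its items in a plain list and does not inherit from Sequence, to keep the example short.

```python
from functools import singledispatchmethod


class FooSequence:
    """Hypothetical stand-in that keeps its items in a plain list."""

    def __init__(self, items):
        self.items = list(items)

    @singledispatchmethod
    def __getitem__(self, key):
        # Fallback for any key type without a registered handler
        raise TypeError(
            'Index must be int or slice, not {}'.format(type(key).__name__))

    @__getitem__.register
    def _(self, key: int):
        return self.items[key]               # get a single item

    @__getitem__.register
    def _(self, key: slice):
        return FooSequence(self.items[key])  # get a whole slice


seq = FooSequence([10, 20, 30])
print(seq[0])         # 10
print(seq[1:].items)  # [20, 30]
```

Dispatch here is still type-based under the hood, so it trades the visible isinstance() chain for registration decorators rather than removing the type check.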
Whether or not you consider either of these "pythonic" is up to you, but remember that while The Zen of Python notes that "Special cases aren't special enough to break the rules", it then immediately notes that "practicality beats purity". In this case, just checking for the two types that the documentation explicitly states are the only things __getitem__ should support seems like the practical way to me.
The antipattern is for code to do explicit type checking, which means using the type() function. Why? Because then a subclass of the target type will no longer work. For instance, __getitem__ can use an int, but using type() to check for it means an int-subclass, which would work, will fail only because type() does not return int.
When a type-check is necessary, isinstance is the appropriate way to do it as it does not exclude subclasses.
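A small sketch of the difference, assuming a hypothetical int subclass named Position: a list accepts it as an index, a type() comparison rejects it, and isinstance accepts it.

```python
class Position(int):
    """Hypothetical int subclass, used only for illustration."""


def is_int_strict(key):
    # Exact type comparison: excludes subclasses of int.
    return type(key) is int


def is_int_lenient(key):
    # isinstance accepts int and any subclass of int.
    return isinstance(key, int)


letters = ["a", "b", "c"]
key = Position(1)

item = letters[key]            # the built-in list accepts the subclass
strict = is_int_strict(key)    # rejected by the type() check
lenient = is_int_lenient(key)  # accepted by the isinstance check
```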
When writing __dunder__ methods, type checking is necessary and expected -- using isinstance().
In other words, your code is perfectly Pythonic, and its only problem is the error message (it doesn't mention slices).
I'm not aware of a way to avoid doing it once. That's just the tradeoff of using a dynamically-typed language in this way. However, that doesn't mean you have to do it over and over again. I would solve it once by creating an abstract class with split out method names, then inherit from that class instead of directly from Sequence, like:
import collections.abc

class UnannoyingSequence(collections.abc.Sequence):
    def __getitem__(self, key):
        if isinstance(key, int):
            return self.getitem(key)
        elif isinstance(key, slice):
            return self.getslice(key)
        else:
            raise TypeError('Index must be int or slice, not {}'.format(type(key).__name__))

    # default implementation in terms of getitem
    def getslice(self, key):
        # Get a whole slice, one item at a time
        return [self.getitem(i) for i in range(*key.indices(len(self)))]

class FooSequence(UnannoyingSequence):
    def getitem(self, key):
        # Get a single item
        ...

    # optional efficient, type-specific implementation not in terms of getitem
    def getslice(self, key):
        # Get a whole slice
        ...
This cleans up FooSequence enough that I might even do it this way if I only had the one derived class. I'm sort of surprised the standard library doesn't already work that way.
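To make the idea concrete, here is a self-contained sketch with one derived class. The Squares class is hypothetical, and the default getslice shown here is only one plausible way to build a slice from single-item lookups.

```python
import collections.abc


class UnannoyingSequence(collections.abc.Sequence):
    def __getitem__(self, key):
        # The type check lives here once; subclasses never repeat it.
        if isinstance(key, int):
            return self.getitem(key)
        if isinstance(key, slice):
            return self.getslice(key)
        raise TypeError(
            "Index must be int or slice, not {}".format(type(key).__name__)
        )

    def getslice(self, key):
        # Default: resolve the slice against len(self), then fetch item by item.
        return [self.getitem(i) for i in range(*key.indices(len(self)))]


class Squares(UnannoyingSequence):
    """Hypothetical sequence of the first n square numbers."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def getitem(self, index):
        # Negative indexes are the subclass's responsibility.
        if index < 0:
            index += self._n
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


squares = Squares(5)
last = squares[-1]
middle = squares[1:4]
everything = list(squares)
```

Squares only implements __len__ and getitem; slicing and iteration come from the base classes.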
To stay pythonic, you have to work with the semantics rather than the type of the objects. So if you have some parameter as accessor to a sequence, just use it like that. Use the abstraction for a parameter as long as possible. If you expect a set of user identifiers, do not expect a set, but rather some data structure with a method add. If you expect some text, do not expect a unicode object, but rather some container for characters featuring encode and decode methods.
I assume in general you want to do something like "Use the behavior of the base implementation unless some special value is provided". If you want to implement __getitem__, you can use a case distinction where something different happens if one special value is provided. I'd use the following pattern:
class FooSequence(collections.abc.Sequence):
    # Snip other methods
    def __getitem__(self, key):
        try:
            if key == SPECIAL_VALUE:
                return SOMETHING_SPECIAL
            else:
                return self.our_baseclass_instance[key]
        except AttributeError:
            raise TypeError('Wrong type: {}'.format(type(key).__name__))
If you want to distinguish between a single value (in Perl terminology, a "scalar") and a sequence (in Java terminology, a "collection"), then it is pythonically fine to determine whether an iterator is implemented. You can either use a try/except pattern or hasattr, as I do here:
>>> a = 42
>>> b = [1, 3, 5, 7]
>>> c = slice(1, 42)
>>> hasattr(a, "__iter__")
False
>>> hasattr(b, "__iter__")
True
>>> hasattr(c, "__iter__")
False
>>>
Applied to our example:
class FooSequence(collections.abc.Sequence):
    # Snip other methods
    def __getitem__(self, key):
        try:
            if hasattr(key, "__iter__"):
                return map(lambda x: WHATEVER(x), key)
            else:
                return self.our_baseclass_instance[key]
        except AttributeError:
            raise TypeError('Wrong type: {}'.format(type(key).__name__))
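A runnable sketch of the same pattern, assuming a hypothetical Picker class that wraps a list and treats any key with an __iter__ attribute as a collection of indexes. Note that in Python 3 a str also has __iter__, so a string key would be treated as a collection here.

```python
class Picker:
    """Hypothetical wrapper: iterable keys select several items at once."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, key):
        if hasattr(key, "__iter__"):
            # The key behaves like a collection: look up each element in turn.
            return [self._items[k] for k in key]
        # Anything else (int, slice, ...) is handed to the wrapped list,
        # which raises TypeError itself for unsupported key types.
        return self._items[key]


picker = Picker("abcd")
one = picker[1]
several = picker[[0, 2]]
reordered = picker[(3, 0)]
sliced = picker[1:3]
```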
Dynamic languages like Python and Ruby use duck typing. A duck is an animal that walks like a duck, swims like a duck and quacks like a duck. It is a duck because of how it behaves, not because somebody calls it a "duck".

Python: check if an object is NOT an "array-type"

I'm looking for a way to test if an object is not of a "list-ish" type, that is - not only that the object is not iterable (e.g. - you can also run iter on a string, or on a simple object that implements iter) but that the object is not in the list family. I define the "list" family as list/tuple/set/frozenset, or anything that inherits from those, however - as there might be something that I'm missing, I would like to find a more general way than running isinstance against all of those types.
I thought of two possible ways to do it, but both seem somewhat awkward as they very much test against every possible list type, and I'm looking for a more general solution.
First option:
return not isinstance( value, (frozenset, list, set, tuple,) )
Second option:
return not hasattr(value, '__iter__')
Is testing for the __iter__ attribute enough? Is there a better way for finding whether an object is not a list-type?
Thanks in advance.
Edit:
(Quoted from comment to @Rosh Oxymoron's solution):
Thinking about the definition better now, I believe it would be more right to say that I need to find everything that is not array-type in definition, but it can still be a string/other simple object...Checking against collections.Iterable will still give me True for objects which implement the __iter__ method.
There is no term 'list-ish' and there is no magic built-in check_if_value_is_an_instance_of_some_i_dont_know_what_set_of_types.
Your solution with not isinstance( value, (frozenset, list, set, tuple,) ) is pretty good - it is clear and explicit.
There is no such family – it's not well-defined, and naturally there's no way to check for it. The closest thing possible is an iterable that is not a string. You can test the object for iterability and then explicitly check whether it is a string:
import collections.abc

if isinstance(ob, collections.abc.Iterable) and not isinstance(ob, str):
    print("An iterable container")
A better approach would be to always ask for an iterable object, and when you need to pass a single string S, pass [S] instead. The ability to pass a string is a feature, e.g.:
alphabet = set('abcdefghijklmnopqrstuvwxyz')
If you special-case string, you will:
Break the ability to use your function with the most natural way to pass a collection of characters.
Create inconsistency for user-defined string types and/or other containers that are string-ish (e.g. array.array can represent a chunk of data, just like string).
A string can be used to represent both a single piece of text and a collection of characters, and because of the second it is also list-ish.
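A minimal sketch of that convention, assuming a hypothetical helper to_items that never special-cases strings: a bare string is read as a collection of characters, and a single piece of text is passed wrapped in a list.

```python
def to_items(values):
    """Hypothetical helper: always treats its argument as an iterable."""
    return set(values)


chars = to_items("abc")       # a string is a collection of characters
one_word = to_items(["abc"])  # wrap the string to pass it as a single item
```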
