Python: can I safely unpickle untrusted data? - python

The pickle module documentation says right at the beginning:
Warning:
The pickle module is not intended to be secure against erroneous or
maliciously constructed data. Never unpickle data received from an
untrusted or unauthenticated source.
However, further down under restricting globals it seems to describe a way to make unpickling data safe using a whitelist of allowed objects.
Does this mean that I can safely unpickle untrusted data if I use a RestrictedUnpickler that allows only some "elementary" types, or are there additional security issues that are not addressed by this method? If there are, is there another way to make unpickling safe (obviously at the cost of not being able to unpickle every stream)?
With "elementary types" I mean precisely the following:
bool
str, bytes, bytearray
int, float, complex
tuple, list, dict, set and frozenset

In this answer we're going to explore what exactly the pickle protocol allows an attacker to do. This means we're only going to rely on documented features of the protocol, not implementation details (with a few exceptions). In other words, we'll assume that the source code of the pickle module is correct and bug-free and allows us to do exactly what the documentation says and nothing more.
What does the pickle protocol allow an attacker to do?
Pickle allows classes to customize how their instances are pickled. During the unpickling process, we can:
Call (almost) any class's __setstate__ method (as long as we manage to unpickle an instance of that class).
Invoke arbitrary callables with arbitrary arguments, thanks to the __reduce__ method (as long as we can gain access to the callable somehow).
Invoke (almost) any unpickled object's append, extend and __setitem__ methods, once again thanks to __reduce__.
Access any attribute that Unpickler.find_class allows us to.
Freely create instances of the following types: str, bytes, list, tuple, dict, int, float, bool. This is not documented, but these types are built into the protocol itself and don't go through Unpickler.find_class.
The most useful (from an attacker's perspective) feature here is the ability to invoke callables. If they can access exec or eval, they can make us execute arbitrary code. If they can access os.system or subprocess.Popen they can run arbitrary shell commands. Of course, we can deny them access to these with Unpickler.find_class. But how exactly should we implement our find_class method? Which functions and classes are safe, and which are dangerous?
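To make the "invoke arbitrary callables" point concrete, here is a minimal and deliberately harmless sketch. The class name Payload is made up for illustration, and sorted stands in for whatever callable a real attacker would choose.

```python
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # A real attacker would name something like os.system here instead.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Payload())
result = pickle.loads(data)  # sorted([3, 1, 2]) runs during unpickling
print(result)  # [1, 2, 3]
```

Note that the unpickled result is not a Payload at all: it is whatever the chosen callable returned.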
An attacker's toolbox
Here I'll try to explain some methods an attacker can use to do evil things. Giving an attacker access to any of these functions/classes means you're in danger.
Arbitrary code execution during unpickling:
exec and eval (duh)
os.system, os.popen, subprocess.Popen and all other subprocess functions
types.FunctionType, which allows creating a function from a code object (which can be created with compile or types.CodeType)
typing.get_type_hints. Yes, you read that right. How, you ask? Well, typing.get_type_hints evaluates forward references. So all you need is an object with __annotations__ like {'x': 'os.system("rm -rf /")'} and get_type_hints will run the code for you.
functools.singledispatch. I see you shaking your head in disbelief, but it's true. Single-dispatch functions have a register method, which internally calls typing.get_type_hints.
... and probably a few more
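As a harmless illustration of the typing.get_type_hints item above, the following sketch uses a string annotation that records a side effect when it is evaluated. The names annotated and side_effects are invented for this example, and passing globalns explicitly is only there to keep the sketch self-contained.

```python
import typing

side_effects = []


# The annotation is a string, so get_type_hints treats it as a forward
# reference and evaluates it. The expression appends to the list and then
# yields int, so the final type check passes.
def annotated(x: 'side_effects.append("evaluated") or int'):
    pass


hints = typing.get_type_hints(annotated, globalns={"side_effects": side_effects})
print(hints["x"])         # <class 'int'>
print(side_effects[0])    # evaluated
```

Replace the append call with anything else and it would run just the same, which is why the function is dangerous to expose.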
Accessing things without going through Unpickler.find_class:
Just because our find_class method prevents an attacker from accessing something directly doesn't mean there's no indirect way of accessing that thing.
Attribute access: Everything is an object in Python, and objects have lots of attributes. For example, an object's class can be accessed as obj.__class__, a class's parents can be accessed as cls.__bases__, etc.
getattr
operator.attrgetter
object.__getattribute__
Tools.scripts.find_recursionlimit.RecursiveBlowup5.__getattr__
... and many more
Indexing: Lots of things are stored in lists, tuples and dicts - being able to index data structures opens many doors for an attacker.
operator.itemgetter
list.__getitem__, dict.__getitem__, etc
... and almost certainly some more
See Ned Batchelder's Eval is really dangerous to find out how an attacker can use these to gain access to pretty much everything.
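A small sketch of why attribute access alone is so powerful: starting from an empty tuple, operator.attrgetter is enough to reach object, and from there every class loaded in the interpreter. The variable names are illustrative only.

```python
import operator

# Walk from a harmless empty tuple up to `object` using only attribute access.
get_bases = operator.attrgetter("__class__.__bases__")
root = get_bases(())[0]  # tuple -> (object,) -> object

# From `object`, every class currently loaded is reachable.
reachable = root.__subclasses__()
print(root is object, len(reachable) > 0)  # True True
```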
Code execution after unpickling:
An attacker doesn't necessarily have to do something dangerous during the unpickling process - they can also try to return a dangerous object and let you call a dangerous function by accident. Maybe you call typing.get_type_hints on the unpickled object, or maybe you expect to unpickle a CuteBunny but instead unpickle a FerociousDragon and get your hand bitten off when you try to .pet() it. Always make sure the unpickled object is of the type you expect, its attributes are of the types you expect, and it doesn't have any attributes you don't expect it to have.
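One possible shape for such a check, assuming (purely for illustration) that the application expects a plain dict mapping str to int. The function name validate_scores is hypothetical.

```python
def validate_scores(obj):
    """Accept only a plain dict mapping str to int; reject everything else."""
    # Exact type checks instead of isinstance, so that subclasses with
    # overridden methods are rejected as well.
    if type(obj) is not dict:
        raise TypeError(f"expected dict, got {type(obj).__name__}")
    for key, value in obj.items():
        if type(key) is not str or type(value) is not int:
            raise TypeError(f"unexpected entry {key!r}: {value!r}")
    return obj


print(validate_scores({"alice": 3, "bob": 5}))  # {'alice': 3, 'bob': 5}
```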
At this point, it should be obvious that there aren't many modules/classes/functions you can trust. When you implement your find_class method, never ever write a blacklist - always write a whitelist, and only include things you're sure can't be abused.
So what's the answer to the question?
If you really only allow access to bool, str, bytes, bytearray, int, float, complex, tuple, list, dict, set and frozenset then you're most likely safe. But let's be honest - you should probably use JSON instead.
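A minimal sketch of such a whitelist, modelled on the "restricting globals" example in the pickle documentation and limited to the types listed in the question. The names SAFE_BUILTINS and restricted_loads are choices made for this example.

```python
import builtins
import io
import pickle

SAFE_BUILTINS = {
    "bool", "str", "bytes", "bytearray", "int", "float", "complex",
    "tuple", "list", "dict", "set", "frozenset",
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Whitelist: only the listed builtins may be looked up by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but refuses every global outside the whitelist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps({"a": [1, 2.5], "b": {1, 2}})))
```

Anything that needs a global outside the set, such as a pickled reference to print, raises pickle.UnpicklingError instead of being loaded.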
In general, I think most classes are safe - with exceptions like subprocess.Popen, of course. The worst thing an attacker can do is call the class - which generally shouldn't do anything more dangerous than return an instance of that class.
What you really need to be careful about is allowing access to functions (and other non-class callables), and how you handle the unpickled object.

I'd go so far as saying that there is no safe way to use pickle to handle untrusted data.
Even with restricted globals, the dynamic nature of Python is such that a determined hacker still has a chance of finding a way back to the __builtins__ mapping and from there to the Crown Jewels.
See Ned Batchelder's blog posts on circumventing restrictions on eval() that apply in equal measure to pickle.
Remember that pickle is still a stack language and you cannot foresee all possible objects produced from allowing arbitrary calls even to a limited set of globals. The pickle documentation also doesn't mention the EXT* opcodes that allow calling copyreg-installed extensions; you'll have to account for anything installed in that registry too here. All it takes is one vector allowing an object call to be turned into a getattr equivalent for your defences to crumble.
At the very least, apply a cryptographic signature to your data so you can validate its integrity. You'll limit the risks, but if an attacker ever managed to steal your signing secrets (keys) then they could again slip you a hacked pickle.
I would instead use an existing innocuous format like JSON and add type annotations; e.g. store data in dictionaries with a type key and convert when loading the data.
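A rough sketch of the signing idea using the standard hmac module. SECRET_KEY, sign and verify_and_load are hypothetical names, and a real deployment would load the key from secure storage rather than hard-coding it.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key for this sketch
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(payload: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag."""
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload


def verify_and_load(blob: bytes):
    """Unpickle only if the tag matches; otherwise refuse."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


blob = sign(pickle.dumps([1, 2, 3]))
print(verify_and_load(blob))  # [1, 2, 3]
```

This only proves that the data came from someone holding the key; it does nothing to make the pickle itself harmless.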
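The JSON approach could look roughly like this. The "type" key layout and the encode/decode helper names are assumptions made for the sketch, not an established convention.

```python
import json


def encode(obj):
    # json.dumps calls this only for objects it cannot serialize natively.
    if isinstance(obj, (set, frozenset)):
        return {"type": "set", "items": list(obj)}
    if isinstance(obj, complex):
        return {"type": "complex", "real": obj.real, "imag": obj.imag}
    raise TypeError(f"cannot serialize {type(obj).__name__}")


def decode(dct):
    # object_hook: called for every decoded JSON object, innermost first.
    kind = dct.get("type")
    if kind == "set":
        return set(dct["items"])
    if kind == "complex":
        return complex(dct["real"], dct["imag"])
    return dct


text = json.dumps({"tags": {"a", "b"}, "z": 1 + 2j}, default=encode)
restored = json.loads(text, object_hook=decode)
print(restored == {"tags": {"a", "b"}, "z": 1 + 2j})  # True
```

A limitation of this simple layout is that ordinary dictionaries which happen to contain a "type" key with one of these values would be converted too, so a less collision-prone key may be preferable in practice.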

This idea has also been discussed on the python-ideas mailing list when addressing the problem of adding a safe pickle alternative to the standard library. For example here:
To make it safer I would have a restricted unpickler as the default (for load/loads) and force people to override it if they want to loosen restrictions. To be really explicit, I would make load/loads only work with built-in types.
And also here:
I've always wanted a version of pickle.loads() that takes a list of classes that are allowed to be instantiated.
Is the following enough for you: http://docs.python.org/3.4/library/pickle.html#restricting-globals ?
Indeed, it is. Thanks for pointing it out! I've never gotten past the module interface part of the docs. Maybe the warning at the top of the page could also mention that there are ways to mitigate the safety concerns, and point to #restricting-globals?
Yes, that would be a good idea :-)
So I don't know why the documentation has not been changed, but in my opinion, using a RestrictedUnpickler to restrict the types that can be unpickled is a safe solution. Of course there could be bugs in the library that compromise the system, but there could also be a bug in OpenSSL that shows random memory data to everyone who asks.

Can you safely change a Python object's type in a C extension?

Question
Suppose that I have implemented two Python types using the C extension API and that the types are identical (same data layouts/C struct) with the exception of their names and a few methods. Assuming that all methods respect the data layout, can you safely change the type of an object from one of these types into the other in a C function?
Notably, as of Python 3.9, there appears to be a function Py_SET_TYPE, but the documentation is not clear as to whether/when this is safe to do. I'm interested in knowing both how to use this function safely and whether types can be safely changed prior to version 3.9.
Motivation
I'm writing a Python C extension to implement a Persistent Hash Array Mapped Trie (PHAMT); in case it's useful, the source code is here (as of writing, it is at this commit). A feature I would like to add is the ability to create a Transient Hash Array Mapped Trie (THAMT) from a PHAMT. THAMTs can be created from PHAMTs in O(1) time and can be mutated in-place efficiently. Critically, THAMTs have the exact same underlying C data-structure as PHAMTs—the only real difference between a PHAMT and a THAMT is a few methods encapsulated by their Python types. This common structure allows one to very efficiently turn a THAMT back into a PHAMT once one has finished performing a set of edits. (This pattern typically reduces the number of memory allocations when performing a large number of updates to a PHAMT).
A very convenient way to implement the conversion from THAMT to PHAMT would be to simply change the type pointers of the THAMT objects from the THAMT type to the PHAMT type. I am confident that I can write code that safely navigates this change, but I can imagine that doing so might, for example, break the Python garbage collector.
(To be clear: the motivation is just context as to how the question arose. I'm not looking for help implementing the structures described in the Motivation, I'm looking for an answer to the Question, above.)
The supported way
It is officially possible to change an object's type in Python, as long as the memory layouts are compatible... but this is mostly limited to types not implemented in C. With some restrictions, it is possible to do
# Python attribute assignment, not C struct member assignment
obj.__class__ = some_new_class
to change an object's class, with one of the restrictions being that both the old and new classes must be "heap types", which all classes implemented in Python are and most classes implemented in C are not. (types.ModuleType and subclasses of that type are also specifically permitted, despite types.ModuleType not being a heap type. See the source for exact restrictions.)
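At the Python level, the supported route looks like the following sketch. The Persistent and Transient classes are simplified stand-ins invented here, not the questioner's C types.

```python
class Persistent:
    def __init__(self, items):
        self.items = items

    def kind(self):
        return "persistent"


class Transient:
    def __init__(self, items):
        self.items = items

    def kind(self):
        return "transient"


obj = Transient([1, 2, 3])
obj.__class__ = Persistent  # allowed: both are heap types with compatible layouts
print(obj.kind(), obj.items)  # persistent [1, 2, 3]


class MyInt(int):
    pass


error = None
try:
    (1).__class__ = MyInt  # rejected: int is not a heap type
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```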
If you want to create a heap type from C, you can, but the interface is pretty different from the normal way of defining Python types from C. Plus, for __class__ assignment to work, you have to not set the Py_TPFLAGS_IMMUTABLETYPE flag, and that means that people will be able to monkey-patch your classes in ways you might not like (or maybe you see that as an upside).
If you want to go that route, I suggest looking at the CPython 3.10 _functools module source code for an example. (They set the Py_TPFLAGS_IMMUTABLETYPE flag, which you'll have to make sure not to do.)
The unsupported way
There was an attempt at one point to allow __class__ assignment for non-heap types, as long as the memory layouts worked. It got abandoned because it caused problems with some built-in immutable types, where the interpreter likes to reuse instances. For example, allowing (1).__class__ = SomethingElse would have caused a lot of problems. You can read more in the big comment in the source code for the __class__ setter. (The comment is slightly out of date, particularly regarding the Py_TPFLAGS_IMMUTABLETYPE flag, which was added after the comment was written.)
As far as I know, this was the only problem, and I don't think any more problems have been added since then. The interpreter isn't going to aggressively reuse instances of your classes, so as long as you're not doing anything like that, and the memory layouts are compatible, I think changing the type of your objects should work for now, even for non-heap-types. However, it is not officially supported, so even if I'm right about this working for now, there's no guarantee it'll keep working.
Py_SET_TYPE only sets an object's type pointer. It doesn't do any refcount fixing that might be needed. It's a very low-level operation. If neither the old class nor the new class is a heap type, no extra refcount fixing is needed, but if the old class is a heap type, you will have to decref the old class, and if the new class is a heap type, you will have to incref the new class.
If you need to decref the old class, make sure to do it after changing the object's class and possibly incref'ing the new class.
According to the language reference, chapter 3 "Data model" (see here):
An object’s type determines the operations that the object supports (e.g., “does it have a length?”) and also defines the possible values for objects of that type. The type() function returns an object’s type (which is an object itself). Like its identity, an object’s type is also unchangeable.[1]
which, to my mind states that the type must never change, and changing it would be illegal as it would break the language specification. The footnote however states that
[1] It is possible in some cases to change an object’s type, under certain controlled conditions. It generally isn’t a good idea though, since it can lead to some very strange behaviour if it is handled incorrectly.
I don't know of any method to change the type of an object from within python itself, so the "possible" may indeed refer to the CPython function.
As far as I can see a PyObject is defined internally as a
struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
};
So the reference counting should still work. On the other hand you will segfault the interpreter if you set the type to something that is not a PyTypeObject, or if the pointer is free()d, so the usual caveats.
Apart from that I agree that the specification is a little ambiguous, but the question of "legality" may not have a good answer. The long and short of it seems to me to be "do not change types unless you know what you are doing, and if you are not hacking on CPython itself you do not know what you are doing".
Edit: The Py_SET_TYPE function was added in Python 3.9 based on this commit. Apparently, people used to just set the type using
Py_TYPE(obj) = typeobj;
So the inclusion (without being formally announced as far as I can see) is more akin to adding a convenience function.

subclassing dict; dict.update returns incorrect value - python bug?

I needed to make a class that extended dict and ran into an interesting problem illustrated by the dumb example in the image below.
Why is d.update() ignoring the class's __getitem__?
EDIT: This is in python2.7 which does not appear to contain collections.UserDict
Thinking UserDict.UserDict is the equivalent I tried this, and it gets closer, but still behaves interestingly.
This is an example of the open-closed principle (the class is open for extension but closed for modification). It is a good thing to have because it allows subclassers to extend or override a method without unintentionally triggering behavior changes in others and without breaking the class's invariants.
We even do this in pure python code as well; for example, inside the pure python ordered dict code, the class local call from __init__() to update() is done using name mangling. This allows a subclasser to override update() without accidentally breaking __init__().
Sometimes, this is inconvenient. It means that a subclasser has to override every method whose behavior they want to change including get(), update(), and others. However, there are offsetting benefits (protection of internal invariants, preventing implementation details from leaking from the abstraction, and allowing users to assume the methods are independent of one another).
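A small sketch of the behavior being discussed, using a made-up Doubling subclass: overriding __getitem__ changes indexing only, while get() and update() keep reading the stored values directly.

```python
class Doubling(dict):
    """Hypothetical subclass that doubles values on item access only."""

    def __getitem__(self, key):
        return 2 * super().__getitem__(key)


d = Doubling(a=1)
plain = {}
plain.update(d)  # copies the stored values; does not call Doubling.__getitem__

print(d["a"])      # 2
print(d.get("a"))  # 1
print(plain["a"])  # 1
```

To change what get() or update() see, those methods have to be overridden as well.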
This style (chosen by Guido from the outset) is the default for the builtin types (otherwise we would forever be fighting segfaulting invariant violations) and for some pure python classes.
We do document when there is a departure from the default. For example, the cmd module uses the framework design pattern, letting the user define various do_action() methods. Also, some of the http modules do the same, specifically documenting that a user's do_GET() method is called and that is how you attach customized HTTP event handlers.
In the absence of specifically documented method hooks (i.e. those listed above or methods like dict.__missing__()), a subclasser should presume method independence. Otherwise, how are you to know whether __getitem__() calls get() under the hood or vice-versa?
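For contrast, here is a sketch of a documented hook: dict.__getitem__ calls __missing__ on subclasses when a key is absent, but other accessors do not. The class name Fallback is invented for the example.

```python
class Fallback(dict):
    """Hypothetical subclass using the documented __missing__ hook."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only, when the key is not present.
        return "fallback"


d = Fallback()
print(d["x"])      # fallback
print(d.get("x"))  # None
print("x" in d)    # False
```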
FWIW, this isn't unique to Python. It comes up quite a bit in object oriented programming. Correctly designed classes either document root methods that affect the behavior of other methods or they are presumed to be independent.
There may need to be a FAQ for this, but nothing is broken or wrong here (other than Python having way too many dict variants to choose from). If someone mistakenly assumes or believes that __getitem__() must be called by the other accessor methods, they find out very quickly that assumption is wrong (that is if they run even minimal tests on the code).

Do Pickle and Dill have similar levels of risk of containing malicious script?

Dill is obviously a very useful module, and it seems as long as you manage the files carefully it is relatively safe. But I was put off by the statement:
Thus dill is not intended to be secure against erroneously or maliciously constructed data. It is left to the user to decide whether the data they unpickle is from a trustworthy source.
I read it at https://pypi.python.org/pypi/dill. It's left to the user to decide how to manage their files.
If I understand correctly, once it has been pickled by dill, you can not easily find out what the original script will do without some special skill.
MY QUESTION IS: although I don't see a warning, does a similar situation also exist for pickle?
Dill is built on top of pickle, and the warnings apply just as much to pickle as they do to dill.
Pickle uses a stack language to effectively execute arbitrary Python code. An attacker can sneak in instructions to open up a backdoor to your machine, for example. Don't ever use pickled data from untrusted sources.
The documentation includes an explicit warning:
Warning: The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
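On the point about not being able to see what a pickle will do: the standard pickletools module can walk the opcode stream without executing it. The sketch below pickles a reference to sorted with protocol 3 (chosen here so that the global shows up in a single GLOBAL opcode) and lists the globals the stream refers to. The variable names are illustrative.

```python
import pickle
import pickletools

# A pickle that merely references the builtin `sorted`.
data = pickle.dumps(sorted, protocol=3)

# genops() decodes the opcodes one by one without running them.
globals_used = [
    arg for opcode, arg, pos in pickletools.genops(data)
    if opcode.name == "GLOBAL"
]
print(globals_used)  # ['builtins sorted']
```

With protocol 4 and later the module and name are pushed as separate strings and combined by a STACK_GLOBAL opcode, so this simple filter would need extending. pickletools.dis(data) prints a full disassembly for manual inspection.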
Yes
Because Pickle allows you to override the object serialization and deserialization, via
object.__getstate__()
Classes can further influence how their instances are pickled; if the
class defines the method __getstate__(), it is called and the returned
object is pickled as the contents for the instance, instead of the
contents of the instance’s dictionary. If the __getstate__() method is
absent, the instance’s __dict__ is pickled as usual.
object.__setstate__(state)
Upon unpickling, if the class defines __setstate__(), it is called
with the unpickled state. In that case, there is no requirement for
the state object to be a dictionary. Otherwise, the pickled state must
be a dictionary and its items are assigned to the new instance’s
dictionary.
Because these functions can execute arbitrary code at the user's permission level, it is relatively easy to write a malicious deserializer -- e.g. one that deletes all the files on your hard disk.
Although I don't see a warning, does a similar situation also exist for pickle?
Never assume that something is safe to use just because nobody says it is dangerous.
That being said, Pickle docs do say the same:
Warning The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
So yes, that security risk exists on pickle, too.
To explain the background: pickle and dill restore the state of Python objects. In CPython, the default Python implementation, this means restoring PyObject structs, which contain a length field. Modifying that field, for example, leads to funky effects and might have arbitrary effects on your Python process's memory.
By the way, even assuming that the data is not malicious doesn't mean you can un-pickle or un-dill just about anything that comes e.g. from a different Python version. So, to me, that question is a bit of a theoretical one: If you need portable objects, you will have to implement a rock-solid serialization/deserialization mechanism that transports the data you need transported, and nothing more or less.

Python 3: Determine if object supports IO

Some Python methods work on various input sources. For example, the XML element tree parse method takes an object which can either be a string, (in which case the API treats it like a filename), or an object that supports the IO interface, like a file object or io.StringIO.
So, obviously the parse method is doing some kind of interface sniffing to figure out which course of action to take. I guess the simplest way to achieve this would be to check if the input parameter is a string by saying isinstance(x, str), and if so treat it as a file name, else treat it as an IO object.
But for better error-checking, I would think it would be best to check if x supports the IO interface. What is the standard, idiomatic way to check if an object supports a specified interface?
One way, I suppose, would be to just say:
if "read" in x.__class__.__dict__: # check if object has a read method
But just because x has a "read" method doesn't necessarily mean it supports the IO interface, so I assume I should also check for every method in the IO interface. Is this usually the best way to go about doing this? Or should I just forget about checking the interface, and just let a possible AttributeError get handled further up the stack?
Python strongly encourages duck typing: Just assume the object that was passed in is valid and try to use it. This way, your code is as flexible as possible. Of course, if the actions of your code depend on the type of the object that is passed in, you do need some kind of type checking. I suggest keeping this type checking to a minimum though, and going for isinstance(x, str).
If you pass in an object that neither is a string nor supports an IO interface, this will result in an AttributeError. If this happens, this is a bug in the calling code. This exception shouldn't be handled anywhere -- instead the bug should be fixed!
That said, you could use
isinstance(x, io.IOBase)
to test for the built-in classes supporting the I/O protocol. This would restrict your code to classes that actually derive from io.IOBase though -- a superficial and unnecessary restriction.
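A minimal sketch of the isinstance(x, str) approach recommended above. The function name read_text is hypothetical, and treating every string as a file name mirrors the behavior described in the question.

```python
import io


def read_text(source):
    """Read all text from a file name (str) or any object with a read() method."""
    if isinstance(source, str):
        # Strings are treated as file names.
        with open(source, encoding="utf-8") as handle:
            return handle.read()
    # Anything else is assumed to be file-like; a wrong argument surfaces
    # as AttributeError, which points at a bug in the caller.
    return source.read()


print(read_text(io.StringIO("hello")))  # hello
```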
Or should I just forget about checking the interface, and just let a possible AttributeError get handled further up the stack?
The general pythonic principle seems to be doing whatever you want to do with the object you get and just capturing any exception it might cause. This is so-called duck typing. It does not necessarily mean you should let those exceptions slip from your function to the calling code, though. You can handle them in the function itself if it's capable of doing so in a meaningful way.
Yeah, python is all about duck typing, and it's perfectly acceptable to check for a few methods to decide whether an object supports the IO interface. Sometimes it even makes sense to just try calling your methods in a try/except block and catch TypeError or ValueError so you know if it really supports the same interface (but use this sparingly). I'd say use hasattr instead of looking at __class__.__dict__, but otherwise that's the approach I would take.
(In general, I'd check first if there wasn't already a method somewhere in the standard library to handle stuff like this, since it can be error-prone to decide what constitutes the "IO interface" yourself. For example, there are a few handy gems in the types and inspect modules for related interface checking.)
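To illustrate why hasattr is preferable to inspecting __class__.__dict__, here is a short sketch with a made-up subclass that inherits read() rather than defining it.

```python
import io


class InheritingReader(io.StringIO):
    """Hypothetical subclass that inherits read() without redefining it."""


reader = InheritingReader("data")

in_class_dict = "read" in reader.__class__.__dict__  # misses inherited methods
has_read = hasattr(reader, "read")                    # follows normal attribute lookup
is_io = isinstance(reader, io.IOBase)                 # stricter: requires the io hierarchy

print(in_class_dict, has_read, is_io)  # False True True
```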
