Handling Python class redefinition warnings in pylint and mypy - python

SQLAlchemy backrefs tend to cause circular imports as they get complex, so I came up with a way to "re-open" a Python class like in Ruby:
def reopen(cls):
    """
    Moves the contents of the decorated class into an existing class and returns that.

    Usage::

        from .other_module import ExistingClass

        @reopen(ExistingClass)
        class ExistingClass:
            @property
            def new_property(self):
                pass

    This is equivalent to::

        def new_property(self):
            pass

        ExistingClass.new_property = property(new_property)
    """
    def decorator(temp_cls):
        for attr, value in temp_cls.__dict__.items():
            # Skip the standard Python attributes, process the rest
            if attr not in ('__dict__', '__doc__', '__module__', '__weakref__'):
                setattr(cls, attr, value)
        return cls
    return decorator
This is a simplified version; the full implementation has more safety checks and tests.
This code works fairly well for inserting bi-directional SQLAlchemy relationships and helper methods wherever SQLAlchemy's existing relationship+backref mechanism doesn't suffice.
However:
mypy raises error "name 'ExistingClass' already defined (possibly by an import)"
pylint raises E0102 "class already defined"
I could ignore both errors by putting an annotation on the line (# type: ignore and # skipcq), but overuse of those annotations can let bugs slip through. It would be nice to tell mypy and pylint that this use is okay.
How can I do that?
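(For what it's worth, if you do end up suppressing, both tools accept narrowly scoped suppressions rather than blanket ignores, so unrelated errors on the same line stay visible. A hedged sketch; to my knowledge the codes shown are the ones each tool reports for this redefinition case:)

    from .other_module import ExistingClass

    @reopen(ExistingClass)
    class ExistingClass:  # type: ignore[no-redef]  # pylint: disable=function-redefined
        @property
        def new_property(self):
            pass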

As a workaround, I've switched to this pattern:
@reopen(ExistingClass)
class __ExistingClass:
    @property
    def new_property(self):
        pass

>>> __ExistingClass is ExistingClass
True
Pylint and Mypy no longer complain about redefinition. However, Mypy will not recognise ExistingClass.new_property as an addition to the class, so it kicks the problem downstream to code that uses it.
My impression is that such class extension hacks, while discussed multiple times, have not gained enough traction as a coding pattern, and in Mypy's case this will need a custom plugin.

Related

Type hinting for object with autospec'd dependencies

I am creating tests for some controller objects that obviously have dependencies. I want to test that it's interacting correctly with the dependencies without instantiating them for obvious reasons (database connections). So I have a class like
class A:
    def __init__(self, some_dependency: InterfaceB):
        self.some_dependency: InterfaceB = some_dependency

    def do_thing(self, x):
        self.some_dependency.do_subtask(necessary_data=x)
And I want to test that it's correctly passing the value down to its dependency so I do something like this
def test_do_thing():
    a: A = A(some_dependency=create_autospec(InterfaceB))
    a.do_thing(1)
    assert a.some_dependency.do_subtask.called_with(necessary_data=1)
but the code completion/linter is going to complain that do_subtask is a function and doesn't have a called_with property, so I have to do
do_subtask: Mock = cast(Mock, a.some_dependency.do_subtask)
assert do_subtask.called_with(necessary_data=1)
which, if you imagine a controller with multiple dependencies, or even just using multiple methods of one dependency, gets pretty tedious and not very DRY and is in fact very WET (What? Eeek Terrible). Is there any solution that does any one of the following, in order of preference:
Magically hints that A is of type A, but all of its dependencies are autospeced, so they're of the type that they are, but then their methods would be mocks?
A type hint that says a variable is an autospec of a type, so it has all the methods of the type, but they're actually Mock objects?
Just a less repetitive way to declare to the pycharm linter that they're mocks
I don't know if my answer will be useful for you, and if not, excuse me; surely not all your questions will be answered by the code below. I haven't checked anything about PyCharm hints.
I have executed the code below in my IDE (PyCharm) and the test passes successfully.
I have changed your assert because I use unittest.mock.
import unittest
from unittest.mock import create_autospec

class InterfaceB:
    def do_subtask(self, necessary_data):
        pass

class A:
    def __init__(self, some_dependency: InterfaceB):
        self.some_dependency: InterfaceB = some_dependency

    def do_thing(self, x):
        self.some_dependency.do_subtask(necessary_data=x)

class MyTestCase(unittest.TestCase):
    def test_do_thing(self):
        a: A = A(some_dependency=create_autospec(InterfaceB))
        a.do_thing(1)
        # ---> Note that I have changed your assert!
        # assert a.some_dependency.do_subtask.called_with(necessary_data=1)
        a.some_dependency.do_subtask.assert_called_once_with(necessary_data=1)

if __name__ == '__main__':
    unittest.main()
As you can see, I have defined a class InterfaceB with a method do_subtask().
The method test_do_thing() verifies only that executing do_thing(1) calls do_subtask(necessary_data=1) exactly once.
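If you still want the cast-to-Mock approach from the question but with less repetition, one option (my suggestion, not part of the answer above) is a tiny typed helper built on typing.cast:

    from typing import cast
    from unittest.mock import Mock

    def as_mock(obj: object) -> Mock:
        """Tell the type checker / IDE that an autospec'd attribute is really a Mock."""
        return cast(Mock, obj)

    # usage inside the test above:
    # as_mock(a.some_dependency.do_subtask).assert_called_once_with(necessary_data=1)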

Make python linter "understand" metaclass changes

I am playing around with Python metaclasses, trying to write a metaclass that changes or adds methods dynamically for its subclasses.
For example, here is a metaclass whose purpose is to find async methods in the subclass (whose names also end with the string "_async") and add a "synchronized" version of each one:
import asyncio

class AsyncClientMetaclass(type):
    @staticmethod
    def async_func_to_sync(func):
        # run_synchronized is assumed to be defined elsewhere; it runs a coroutine to completion
        return lambda *_args, **_kwargs: run_synchronized(func(*_args, **_kwargs))

    def __new__(mcs, *args, **kwargs):
        cls = super().__new__(mcs, *args, **kwargs)
        _, __, d = args
        for key, value in d.items():
            if asyncio.iscoroutinefunction(value) and key.endswith('_async'):
                sync_func_name = key[:-len('_async')]
                if sync_func_name in d:
                    continue
                if isinstance(value, staticmethod):
                    value = value.__func__
                setattr(cls, sync_func_name, mcs.async_func_to_sync(value))
        return cls
# usage
class SleepClient(metaclass=AsyncClientMetaclass):
    async def sleep_async(self, seconds):
        await asyncio.sleep(seconds)
        return f'slept for {seconds} seconds'

c = SleepClient()
res = c.sleep(2)
print(res)  # prints "slept for 2 seconds"
This example works great, the only problem is that the python linter warns about using the non async method that the metaclass has created (for the example above, the warning is Unresolved attribute reference 'sleep' for class 'SleepClient')
For now, I am adding pylint: disable whenever I am using a sync method created by the metaclass, but I am wondering if is there any way to add a custom linter rule with the metaclass, so the linter will know those methods will be created dynamically.
And do you think there is a better way to achieve this than using a metaclass?
Thanks!
As put by Chepner: no static code analyser can know about these methods - not linters, nor type-annotation checking tools like MyPy - unless you give them a hint.
Maybe there is one way out: static type annotators will consume a parallel ".pyi" stub file, placed side by side with the corresponding ".py" file, that can list class interfaces, and, I may be wrong, but whatever the tool finds there will supersede what it "sees" in the actual .py file.
So, you could instrument your metaclass to, aside from generating the actual methods, render their signatures and the signatures for the "real" methods and attributes of the class as source code, and record those in the proper ".pyi" file. You will have to run this code once, before the linter can find its way - but it is the only workaround I can think of.
In other words, to be clear:
make a mechanism, called by the metaclass, that checks the existence and timestamp of the appropriate ".pyi" file for each class it is modifying, and generates it if needed. By checking the timestamp, or by generating the file only when some "--build" variable is active, there should be no runtime penalty, and static type checkers (and possibly some linters) should be pleased.
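A rough sketch of that mechanism (my illustration only; the file handling and naming are assumptions, and you would call it from the metaclass's __new__ after the sync methods have been added):

    import os
    import sys

    def write_stub(cls, sync_names):
        """Write or refresh a .pyi stub next to cls's module, listing the generated sync methods."""
        module = sys.modules.get(cls.__module__)
        if module is None or not getattr(module, "__file__", None):
            return
        py_path = module.__file__
        pyi_path = os.path.splitext(py_path)[0] + ".pyi"
        # Skip regeneration when the stub is already newer than the source file
        if os.path.exists(pyi_path) and os.path.getmtime(pyi_path) >= os.path.getmtime(py_path):
            return
        lines = ["class {}:".format(cls.__name__)]
        for name in sync_names:
            lines.append("    def {}(self, *args, **kwargs): ...".format(name))
        with open(pyi_path, "w") as fh:
            fh.write("\n".join(lines) + "\n")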

Class invariants in Python

Class invariants can definitely be useful in coding, as they give instant feedback when a clear programming error has been detected, and they also improve code readability by being explicit about what arguments and return values can be. I'm sure this applies to Python too.
However, in Python, testing arguments generally seems not to be the "Pythonic" way to do things, as it goes against the duck-typing idiom.
My questions are:
What is Pythonic way to use assertions in code?
For example, if I had the following function:

    def do_something(name, path, client):
        assert isinstance(name, str)
        assert path.endswith('/')
        assert hasattr(client, "connect")
More generally, when are there too many assertions?
I'd be happy to hear your opinions on this!
Short Answer:
Are assertions Pythonic?
Depends how you use them. Generally, no. Making generalized, flexible code is the most Pythonic thing to do, but when you need to check invariants:
Use type hinting to help your IDE perform type inference so you can avoid potential pitfalls.
Make robust unit tests.
Prefer try/except clauses that raise more specific exceptions.
Turn attributes into properties so you can control their getters and setters.
Use assert statements only for debug purposes.
Refer to this Stack Overflow discussion for more info on best practices.
Long Answer
You're right. It's not considered Pythonic to have strict class invariants, but there is a built-in way to designate the preferred types of parameters and returns called type hinting, as defined in PEP 484:
[Type hinting] aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information.
The format is this:
def greeting(name: str) -> str:
    return 'Hello ' + name
The typing library provides even further functionality. However, there's a huge caveat...
While these annotations are available at runtime through the usual __annotations__ attribute, no type checking happens at runtime. Instead, the proposal assumes the existence of a separate off-line type checker which users can run over their source code voluntarily. Essentially, such a type checker acts as a very powerful linter.
Whoops. Well, you could use an external tool while testing to check when invariance is broken, but that doesn't really answer your question.
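For instance, running mypy over the greeting example above flags a bad call during analysis rather than at runtime (the file name is illustrative and the exact message wording can vary between mypy versions):

    # example.py
    def greeting(name: str) -> str:
        return 'Hello ' + name

    greeting(42)

    # $ mypy example.py
    # error: Argument 1 to "greeting" has incompatible type "int"; expected "str"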
Properties and try/except
The best way to handle an error is to make sure it never happens in the first place. The second best way is to have a plan when it does. Take, for example, a class like this:
class Dog(object):
    """Canis lupus familiaris."""

    name = str()
    """The name you call it."""

    def __init__(self, name: str):
        """What're you gonna name him?"""
        self.name = name

    def speak(self, repeat=0):
        """Make dog bark. Can optionally be repeated."""
        print("{dog} stares at you blankly.".format(dog=self.name))
        for i in range(repeat):
            print("{dog} says: 'Woof!'".format(dog=self.name))
If you want your dog's name to be an invariant, this won't actually prevent self.name from being overwritten. It also doesn't prevent parameters that could crash speak(). However, if you make self.name a property...
class Dog(object):
    """Canis lupus familiaris."""

    _name = str()
    """The name on the microchip."""

    def __init__(self, name: str):
        """What're you gonna name him?"""
        if not name or not name.isalpha():
            raise ValueError("Name must exist and be pronounceable.")
        self._name = name

    def speak(self, repeat=0):
        """Make dog bark. Can optionally be repeated."""
        try:
            print("{dog} stares at you blankly.".format(dog=self.name))
            if repeat < 0:
                raise ValueError("Cannot negatively bark.")
            for i in range(repeat):
                print("{dog} says: 'Woof!'".format(dog=self.name))
        except (ValueError, TypeError) as e:
            raise RuntimeError("Dog unable to speak.") from e

    @property
    def name(self):
        """Gets name. The name on the collar."""
        return self._name
Since our property doesn't have a setter, self.name is essentially invariant; that value can't change unless someone is aware of the underlying self._name attribute. Furthermore, since we've added try/except clauses to process the specific errors we're expecting, we've provided a more concise control flow for our program.
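A quick illustration of the read-only behaviour (a hypothetical session, assuming the Dog class above):

    rex = Dog("Rex")
    print(rex.name)     # Rex
    rex.name = "Spot"   # AttributeError: the property has no setter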
So When Do You Use Assertions?
There might not be a 100% "Pythonic" way to perform assertions since you should be doing those in your unit tests. However, if it's critical at runtime for data to be invariant, assert statements can be used to pinpoint possible trouble spots, as explained in the Python wiki:
Assertions are particularly useful in Python because of Python's powerful and flexible dynamic typing system. In the same example, we might want to make sure that ids are always numeric: this will protect against internal bugs, and also against the likely case of somebody getting confused and calling by_name when they meant by_id.
For example:
from types import *

class MyDB:
    ...
    def add(self, id, name):
        assert type(id) is IntType, "id is not an integer: %r" % id
        assert type(name) is StringType, "name is not a string: %r" % name
Note that the "types" module is explicitly "safe for import *"; everything it exports ends in "Type".
That takes care of data type checking. For classes, you use isinstance(), as you did in your example:
You can also do this for classes, but the syntax is a little different:
class PrintQueueList:
    ...
    def add(self, new_queue):
        assert new_queue not in self._list, \
            "%r is already in %r" % (self, new_queue)
        assert isinstance(new_queue, PrintQueue), \
            "%r is not a print queue" % new_queue
I realize that's not the exact way our function works but you get the idea: we want to protect against being called incorrectly. You can also see how printing the string representation of the objects involved in the error will help with debugging.
For proper form, attach a message to your assertions as in the examples above
(e.g. assert <statement>, "<message>"); the message is automatically included in the resulting AssertionError to assist you with debugging. It can also give some insight, in a consumer's bug report, into why the program is crashing.
Checking isinstance() should not be overused: if it quacks like a duck, there's perhaps no need to enquire too deeply into whether it really is. Sometimes it can be useful to pass values that were not anticipated by the original programmer.
Places to consider putting assertions:
checking parameter types, classes, or values
checking data structure invariants
checking "can't happen" situations (duplicates in a list, contradictory state variables.)
after calling a function, to make sure that its return is reasonable
Assertions can be beneficial if they're properly used, but you shouldn't become dependent on them for data that doesn't need to be explicitly invariant. You might need to refactor your code if you want it to be more Pythonic.
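One practical caveat that is easy to forget (standard Python behaviour, not specific to this answer): assert statements are removed entirely when the interpreter runs with the -O flag, so they must never guard behaviour you rely on in production:

    def withdraw(balance, amount):
        assert amount > 0, "amount must be positive"  # stripped under python -O
        return balance - amount

    # run normally: withdraw(100, -5) raises AssertionError
    # run with python -O: the assert is skipped and 105 is returned silently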
Please have a look at the icontract library. We developed it to bring design-by-contract into Python with informative error messages. Here is an example of a class invariant:
>>> import icontract

>>> @icontract.inv(lambda self: self.x > 0)
... class SomeClass:
...     def __init__(self) -> None:
...         self.x = 100
...
...     def some_method(self) -> None:
...         self.x = -1
...
...     def __repr__(self) -> str:
...         return "some instance"
...
>>> some_instance = SomeClass()
>>> some_instance.some_method()
Traceback (most recent call last):
  ...
icontract.ViolationError: self.x > 0:
self was some instance
self.x was -1

What is the point of calling super in custom error classes in python?

So I have a simple custom error class in Python that I created based on the Python 2.7 documentation:
class InvalidTeamError(Exception):
    def __init__(self, message='This user belongs to a different team'):
        self.message = message
This gives me warning W0231: __init__ method from base class %r is not called in PyLint so I go look it up and am given the very helpful description of "explanation needed." I'd normally just ignore this error but I have noticed that a ton code online includes a call to super in the beginning of the init method of custom error classes so my question is: Does doing this actually serve a purpose or is it just people trying to appease a bogus pylint warning?
This was a valid pylint warning: by not using the superclass __init__ you can miss out on implementation changes in the parent class. And, indeed, you have - because BaseException.message has been deprecated as of Python 2.6.
Here would be an implementation which will avoid your warning W0231 and will also avoid python's deprecation warning about the message attribute.
class InvalidTeamError(Exception):
    def __init__(self, message='This user belongs to a different team'):
        super(InvalidTeamError, self).__init__(message)
This is a better way to do it, because the implementation for BaseException.__str__ only considers the 'args' tuple, it doesn't look at message at all. With your old implementation, print InvalidTeamError() would have only printed an empty string, which is probably not what you wanted!
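A quick way to see the difference (a hedged sketch; OldError and NewError are just illustrative names for the two versions above):

    class OldError(Exception):
        def __init__(self, message='This user belongs to a different team'):
            self.message = message          # args stays empty

    class NewError(Exception):
        def __init__(self, message='This user belongs to a different team'):
            super(NewError, self).__init__(message)   # message lands in args

    print(str(OldError()))   # '' - BaseException.__str__ has nothing to show
    print(str(NewError()))   # 'This user belongs to a different team'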
Looking at the CPython 2.7 source code, there should be no problem avoiding that call to the superclass __init__, and yes, it's done just because it's generally good practice to call the base class __init__ in your __init__.
https://github.com/python/cpython/blob/master/Objects/exceptions.c see line 60 for BaseException init and line 456 how Exception derives from BaseException.

Python's equivalent of .Net's sealed class

Does python have anything similar to a sealed class? I believe it's also known as final class, in java.
In other words, in python, can we mark a class so it can never be inherited or expanded upon? Did python ever considered having such a feature? Why?
Disclaimers
Actually trying to understand why sealed classes even exist. The answer here (and in many, many, many, many, many, really many other places) did not satisfy me at all, so I'm trying to look from a different angle. Please avoid theoretical answers to this question, and focus on the title! Or, if you insist, at least please give one very good and practical example of a sealed class in csharp, pointing out what would break big time if it were unsealed.
I'm no expert in either language, but I do know a bit of both. Just yesterday while coding on csharp I got to know about the existence of sealed classes. And now I'm wondering if python has anything equivalent to that. I believe there is a very good reason for its existence, but I'm really not getting it.
You can use a metaclass to prevent subclassing:
class Final(type):
    def __new__(cls, name, bases, classdict):
        for b in bases:
            if isinstance(b, Final):
                raise TypeError("type '{0}' is not an acceptable base type".format(b.__name__))
        return type.__new__(cls, name, bases, dict(classdict))

class Foo:
    __metaclass__ = Final

class Bar(Foo):
    pass
gives:
>>> class Bar(Foo):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in __new__
TypeError: type 'Foo' is not an acceptable base type
The __metaclass__ = Final line makes the Foo class 'sealed'.
Note that you'd use a sealed class in .NET as a performance measure; since there won't be any subclassing methods can be addressed directly. Python method lookups work very differently, and there is no advantage or disadvantage, when it comes to method lookups, to using a metaclass like the above example.
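Note also that the __metaclass__ class attribute is only honoured by Python 2; under Python 3 the equivalent (my adaptation of the code above, not part of the original answer) is to pass the metaclass as a keyword argument:

    class Foo(metaclass=Final):
        pass

    class Bar(Foo):   # TypeError: type 'Foo' is not an acceptable base type
        pass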
Before we talk Python, let's talk "sealed":
I, too, have heard that the advantage of .Net sealed / Java final / C++ entirely-nonvirtual classes is performance. I heard it from a .Net dev at Microsoft, so maybe it's true. If you're building a heavy-use, highly-performance-sensitive app or framework, you may want to seal a handful of classes at or near the real, profiled bottleneck. Particularly classes that you are using within your own code.
For most applications of software, sealing a class that other teams consume as part of a framework/library/API is kinda...weird.
Mostly because there's a simple work-around for any sealed class, anyway.
I teach "Essential Test-Driven Development" courses, and in those three languages, I suggest consumers of such a sealed class wrap it in a delegating proxy that has the exact same method signatures, but they're override-able (virtual), so devs can create test-doubles for these slow, nondeterministic, or side-effect-inducing external dependencies.
[Warning: below snark intended as humor. Please read with your sense of humor subroutines activated. I do realize that there are cases where sealed/final are necessary.]
The proxy (which is not test code) effectively unseals (re-virtualizes) the class, resulting in v-table look-ups and possibly less efficient code (unless the compiler optimizer is competent enough to in-line the delegation). The advantages are that you can test your own code efficiently, saving living, breathing humans weeks of debugging time (in contrast to saving your app a few million microseconds) per month... [Disclaimer: that's just a WAG. Yeah, I know, your app is special. ;-]
So, my recommendations: (1) trust your compiler's optimizer, (2) stop creating unnecessary sealed/final/non-virtual classes that you built in order to either (a) eke out every microsecond of performance at a place that is likely not your bottleneck anyway (the keyboard, the Internet...), or (b) create some sort of misguided compile-time constraint on the "junior developers" on your team (yeah...I've seen that, too).
Oh, and (3) write the test first. ;-)
Okay, yes, there's always link-time mocking, too (e.g. TypeMock). You got me. Go ahead, seal your class. Whatevs.
Back to Python: The fact that there's a hack rather than a keyword is probably a reflection of the pure-virtual nature of Python. It's just not "natural."
By the way, I came to this question because I had the exact same question. Working on the Python port of my ever-so-challenging and realistic legacy-code lab, and I wanted to know if Python had such an abominable keyword as sealed or final (I use them in the Java, C#, and C++ courses as a challenge to unit testing). Apparently it doesn't. Now I have to find something equally challenging about untested Python code. Hmmm...
Python does have classes that can't be extended, such as bool or NoneType:
>>> class ExtendedBool(bool):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'bool' is not an acceptable base type
However, such classes cannot be created from Python code. (In the CPython C API, they are created by not setting the Py_TPFLAGS_BASETYPE flag.)
Python 3.6 will introduce the __init_subclass__ special method; raising an error from it will prevent creating subclasses. For older versions, a metaclass can be used.
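A minimal sketch of the __init_subclass__ approach (my illustration, assuming Python 3.6+):

    class Sealed:
        def __init_subclass__(cls, **kwargs):
            raise TypeError("Sealed cannot be subclassed")

    class Broken(Sealed):   # raises TypeError at class-definition time
        pass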
Still, the most “Pythonic” way to limit usage of a class is to document how it should not be used.
Similar in purpose to a sealed class, and useful for reducing memory usage (Usage of __slots__?), is the __slots__ attribute, which prevents monkey patching instances of a class. Because it is too late to put a __slots__ into the class namespace by the time the metaclass's __new__ is called, we have to put it into the namespace at the earliest possible point, i.e. during __prepare__. Additionally, this throws the TypeError a little bit earlier. Using mcs for the isinstance comparison removes the need to hardcode the metaclass name inside itself. The disadvantage is that all unslotted attributes are read-only. Therefore, if we want to set specific attributes during initialization or later, they have to be slotted specifically. This is feasible, e.g., by using a dynamic metaclass that takes slots as an argument.
def Final(slots=[]):
    if "__dict__" in slots:
        raise ValueError("Having __dict__ in __slots__ breaks the purpose")

    class _Final(type):
        @classmethod
        def __prepare__(mcs, name, bases, **kwargs):
            for b in bases:
                if isinstance(b, mcs):
                    msg = "type '{0}' is not an acceptable base type"
                    raise TypeError(msg.format(b.__name__))
            namespace = {"__slots__": slots}
            return namespace

    return _Final

class Foo(metaclass=Final(slots=["_z"])):
    y = 1

    def __init__(self, z=1):
        self.z = z

    @property
    def z(self):
        return self._z

    @z.setter
    def z(self, val: int):
        if not isinstance(val, int):
            raise TypeError("Value must be an integer")
        else:
            self._z = val

    def foo(self):
        print("I am sealed against monkey patching")
Here, attempting to overwrite foo.foo will throw AttributeError: 'Foo' object attribute 'foo' is read-only, and attempting to add foo.x will throw AttributeError: 'Foo' object has no attribute 'x'. The limiting power of __slots__ would be broken by inheriting, but because Foo has the metaclass Final, you can't inherit from it. It would also be broken if __dict__ were in the slots, so we throw a ValueError in that case. To conclude, defining setters and getters for slotted properties lets you limit how the user can overwrite them.
foo = Foo()

# attributes are accessible
foo.foo()
print(foo.y)

# changing slotted attributes is possible
foo.z = 2

# %%
# overwriting unslotted attributes won't work
foo.foo = lambda: print("Guerilla patching attempt")

# overwriting an accordingly defined property won't work
foo.z = foo.foo

# expanding won't work
foo.x = 1

# %% inheriting won't work
class Bar(Foo):
    pass
In that regard, Foo could not be inherited or expanded upon. The disadvantage is that all attributes have to be explicitly slotted, or are limited to a read-only class variable.
Python 3.8 has that feature in the form of the typing.final decorator:
from typing import final

class Base:
    @final
    def done(self) -> None:
        ...

class Sub(Base):
    def done(self) -> None:  # Error reported by type checker
        ...

@final
class Leaf:
    ...

class Other(Leaf):  # Error reported by type checker
    ...
See https://docs.python.org/3/library/typing.html#typing.final
