PyTorch Module with attrs cannot get parameter list - python

The attrs package somehow breaks PyTorch's parameters() method for a module. I am wondering if anyone has any workarounds or solutions, so that the two packages can integrate seamlessly?
If not, any advice on which GitHub repository to post the issue to? My instinct would be to post this on the attrs GitHub, but the stack trace is almost entirely within PyTorch's codebase.
Python 3.7.3
attrs==19.1.0
torch==1.1.0.post2
torchvision==0.3.0
import attr
import torch

class RegularModule(torch.nn.Module):
    pass

@attr.s
class AttrsModule(torch.nn.Module):
    pass

module = RegularModule()
print(list(module.parameters()))
module = AttrsModule()
print(list(module.parameters()))
The actual output is:
$ python attrs_pytorch.py
[]
Traceback (most recent call last):
  File "attrs_pytorch.py", line 18, in <module>
    print(list(module.parameters()))
  File "/usr/local/anaconda3/envs/bgg/lib/python3.7/site-packages/torch/nn/modules/module.py", line 814, in parameters
    for name, param in self.named_parameters(recurse=recurse):
  File "/usr/local/anaconda3/envs/bgg/lib/python3.7/site-packages/torch/nn/modules/module.py", line 840, in named_parameters
    for elem in gen:
  File "/usr/local/anaconda3/envs/bgg/lib/python3.7/site-packages/torch/nn/modules/module.py", line 784, in _named_members
    for module_prefix, module in modules:
  File "/usr/local/anaconda3/envs/bgg/lib/python3.7/site-packages/torch/nn/modules/module.py", line 975, in named_modules
    if self not in memo:
TypeError: unhashable type: 'AttrsModule'
The expected output is:
$ python attrs_pytorch.py
[]
[]

You may get it to work with one workaround and by using dataclasses (which you should, as it has been in the standard library since Python 3.7, which you are apparently using). Though I think a simple __init__ is more readable. One could do something similar with the attrs library (by disabling hashing); I just prefer a solution that uses the standard library where possible.
The reason it still fails (once you have handled the hashing-related errors) is that torch.nn.Module.__init__() is never called, and that call is what creates the _parameters attribute and other framework-specific data.
First solving hashing with dataclasses:
@dataclasses.dataclass(eq=False)
class AttrsModule(torch.nn.Module):
    pass
This solves the hashing issue because, as stated in the documentation's section on hash and eq:
By default, dataclass() will not implicitly add a __hash__() method unless it is safe to do so.
PyTorch needs modules to be hashable: the traceback above fails on the membership test `if self not in memo` in named_modules (it may also matter for the C++ backend, correct me if I'm wrong). Furthermore:
If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).
So you are fine using torch.nn.Module's __hash__ function (refer to the dataclasses documentation if any further errors arise).
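The hashing behaviour quoted above can be checked without PyTorch. The following is a minimal sketch using only the standard library; Base is a hypothetical stand-in for torch.nn.Module (any class that relies on the inherited id-based hash), not PyTorch code, and is_hashable is a helper introduced only for this illustration.

```python
import dataclasses


class Base:
    """Hypothetical stand-in for torch.nn.Module; it inherits object's id-based __hash__."""


@dataclasses.dataclass  # eq=True by default, so __hash__ is set to None
class DefaultEq(Base):
    pass


@dataclasses.dataclass(eq=False)  # __hash__ is left untouched and inherited from Base
class NoEq(Base):
    pass


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(DefaultEq()))  # False
print(is_hashable(NoEq()))       # True
```

The first class reproduces the "unhashable type" failure from the question in miniature; the second shows why eq=False is enough to avoid it.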
This leaves you with the error:
AttributeError: 'AttrsModule' object has no attribute '_parameters'
This happens because the torch.nn.Module constructor is not called. A quick and dirty fix:
@dataclasses.dataclass(eq=False)
class AttrsModule(torch.nn.Module):
    def __post_init__(self):
        super().__init__()
__post_init__ is a function called after __init__ (who would have guessed), where you can initialize the torch-specific attributes.
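To see why the generated __init__ skips the parent constructor, here is a small sketch that runs without PyTorch. Base is again a hypothetical stand-in whose __init__ creates a _parameters attribute, loosely mimicking what the answer describes for torch.nn.Module; the size field is added purely for illustration.

```python
import dataclasses


class Base:
    """Hypothetical stand-in for torch.nn.Module: __init__ creates framework state."""

    def __init__(self):
        self._parameters = {}

    def parameters(self):
        return iter(self._parameters.values())


@dataclasses.dataclass(eq=False)
class WithoutPostInit(Base):
    size: int = 3  # the generated __init__ only assigns fields


@dataclasses.dataclass(eq=False)
class WithPostInit(Base):
    size: int = 3

    def __post_init__(self):
        # Runs at the end of the generated __init__; call the parent constructor here.
        super().__init__()


try:
    list(WithoutPostInit().parameters())
except AttributeError as exc:
    print("AttributeError:", exc)

print(list(WithPostInit().parameters()))  # []
```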
Still, I would advise against using those two modules together. For example, your code destroys PyTorch's __repr__, so repr=False should be passed to the dataclasses.dataclass decorator, which gives this final code (with the obvious collisions between the libraries eliminated, I hope):
import dataclasses
import torch

class RegularModule(torch.nn.Module):
    pass

@dataclasses.dataclass(eq=False, repr=False)
class AttrsModule(torch.nn.Module):
    def __post_init__(self):
        super().__init__()

module = RegularModule()
print(list(module.parameters()))
module = AttrsModule()
print(list(module.parameters()))
For more on attrs, please see hynek's answer and his blog post.

attrs has a chapter on hashability that also explains the pitfalls of hashing in Python: https://www.attrs.org/en/stable/hashing.html
You’ll have to decide what behavior is adequate for your concrete problem. For more general information check out https://hynek.me/articles/hashes-and-equality/ — turns out hashing is surprisingly tricky in Python.

Related

How to dynamically assign class methods from within class?

import util

class C():
    save = util.save

setattr(C, 'load', util.load)
C.save is visible to the linter, but C.load isn't. There is thus some difference between assigning class methods from within the class itself and from outside it. The same goes for documentation builders; e.g. Sphinx won't acknowledge :meth:`C.load`, and one needs :func:`util.load` instead, which is misleading if load is meant to be C's method. An IDE (Spyder) also fails to "go to" the method from self.load in code.
The end goal is to make the linter (plus docs and IDE) recognize load as C's method just like C.save, but the class method assignment needs to be dynamic (context). Can this be accomplished?
Note: the purpose of dynamic assignment is to automatically pull methods from modules (e.g. util) instead of having to manually update C upon method addition / removal.
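For readers unsure what "automatically pull methods from modules" might look like, here is a minimal sketch of the dynamic assignment being described. It does not solve the linter problem; it only illustrates the setup. The in-memory util module and its save/load functions are hypothetical stand-ins for the asker's real module.

```python
import inspect
import types

# Hypothetical stand-in for the asker's util module, built in memory.
util = types.ModuleType("util")


def save(self):
    return "saved"


def load(self):
    return "loaded"


util.save = save
util.load = load


class C:
    pass


# Attach every plain function found in util as a method of C.
for name, func in inspect.getmembers(util, inspect.isfunction):
    setattr(C, name, func)

print(sorted(n for n in vars(C) if not n.startswith("_")))  # ['load', 'save']
print(C().load())  # loaded
```

Because the method names only exist at run time, static tools have nothing to resolve, which is the behaviour the question describes.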
Disclaimer: This solution does not work in all use cases, see comments. I leave it here, since it might still be useful under some circumstances.
I don't know about the linter and Sphinx, but an IDE like PyCharm will recognize the method if you declare it upfront using type hinting. In the code below, without the line load: Callable[[], None], I get the warning 'Unresolved attribute reference', but with the line there are no warnings in the file. Check the docs for more information about type hinting.
Notes:
Even with the more general load: Callable, the type checker is satisfied.
If you don't always declare a callable, a valid declaration is load: Optional[Callable] = None. This means that the default value is None. If you then call it without setting it, you will get an error, but you got that already anyway, that's unrelated to this typing.
p.s. I don't have your utils, so I defined some functions in the file itself.
from typing import Callable

def save():
    pass

def load():
    pass

class C:
    load: Callable[[], None]
    save = save

setattr(C, 'load', load)
C.load()

Is it possible to uniformly save any object in a JSON file?

I'm working on a web-server type of application and as part of multi-language communication I need to serialize objects in a JSON file. The issue is that I'd like to create a function which can take any custom object and save it at run time rather than limiting the function to what type of objects it can store based on structure.
Apologies if this question is a duplicate, however from what I have searched the other questions and answers do not seem to tackle the dynamic structure aspect of the problem, thus leading me to open this question.
The function is going to be used to communicate between PHP server code and Python scripts, hence the need for such a solution
I have attempted to use the json.dump(data,outfile) function, however the issue is that I need to convert such objects to a legal data structure first
JSON is a rigidly structured format, and Python's json module, by design, won't try to coerce types it doesn't understand.
Check out this SO answer. While __dict__ might work in some cases, it's often not exactly what you want. One option is to write one or more classes that inherit JSONEncoder and provides a method that turns your type or types into basic types that json.dump can understand.
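As a rough illustration of the JSONEncoder route, the sketch below subclasses json.JSONEncoder and overrides its default() method. Point and PointEncoder are hypothetical names invented for this example.

```python
import json


class Point:
    """Hypothetical custom type that json cannot encode on its own."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is called only for objects the base encoder cannot handle.
        if isinstance(o, Point):
            return {"x": o.x, "y": o.y}
        # Let the base class raise TypeError for anything else.
        return super().default(o)


encoded = json.dumps({"origin": Point(0, 0)}, cls=PointEncoder)
print(encoded)  # {"origin": {"x": 0, "y": 0}}
```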
Another option would be to write a parent class, e.g. JSONSerializable, and have these data types inherit from it the way you'd use an interface in some other languages. Making it an abstract base class would make sense, but I doubt that's important to your situation. Define a method on your base class, e.g. def dictify(self), and either implement it if it makes sense to have a default behavior or just have it raise NotImplementedError.
Note that I'm not calling the method serialize, because actual serialization will be handled by json.dump.
from abc import ABC

class JSONSerializable(ABC):
    def dictify(self):
        raise NotImplementedError("Missing dictify implementation!")

class YourDataType(JSONSerializable):
    def __init__(self):
        self.something = None
        # etc etc

    def dictify(self):
        return {"something": self.something}

class YourIncompleteDataType(JSONSerializable):
    # No dictify(self) implementation
    pass
Example usage:
>>> valid = YourDataType()
>>> valid.something = "really something"
>>> valid.dictify()
{'something': 'really something'}
>>>
>>> invalid = YourIncompleteDataType()
>>> invalid.dictify()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in dictify
NotImplementedError: Missing dictify implementation!
Basically, though: You do need to handle this yourself, possibly on a per-type basis, depending on how different your types are. It's just a matter of what method of formatting your types for serialization is the best for your use case.
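One way to connect a dictify()-style method to json.dump or json.dumps is the default= hook, which json calls for any object it cannot encode itself. This is a sketch under assumptions: Serializable, User, and to_basic are hypothetical names, not part of the answer's code.

```python
import json


class Serializable:
    """Hypothetical base class mirroring the dictify() idea above."""

    def dictify(self):
        raise NotImplementedError("Missing dictify implementation!")


class User(Serializable):
    def __init__(self, name, tags):
        self.name = name
        self.tags = tags

    def dictify(self):
        return {"name": self.name, "tags": self.tags}


def to_basic(obj):
    """Fallback for json.dumps: called only for objects json cannot encode itself."""
    if isinstance(obj, Serializable):
        return obj.dictify()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


payload = json.dumps({"users": [User("ann", ["admin"])]}, default=to_basic)
print(payload)  # {"users": [{"name": "ann", "tags": ["admin"]}]}
```

The hook is applied recursively, so nested custom objects are converted as long as each one provides dictify().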

How to introspect a function defined in a Cython C extension module

Python's inspect module doesn't seem to be able to inspect the signatures of "built-in" functions, which include functions defined in C extension modules, like those defined by Cython. Is there any way to get the signature of a Python function you have defined in such a module, and specifically in Cython? I am looking to be able to find the available keyword arguments.
MWE:
# mwe.pyx
def example(a, b=None):
    pass
and
import pyximport; pyximport.install()
import mwe
import inspect
inspect.signature(mwe.example)
yields:
Traceback (most recent call last):
  File "mwe_py.py", line 5, in <module>
    inspect.signature(mwe.example)
  File "/nix/store/134l79vxb91w8mhxxkj6kb5llf7dmwpm-python3-3.4.5/lib/python3.4/inspect.py", line 2063, in signature
    return _signature_internal(obj)
  File "/nix/store/134l79vxb91w8mhxxkj6kb5llf7dmwpm-python3-3.4.5/lib/python3.4/inspect.py", line 1965, in _signature_internal
    skip_bound_arg=skip_bound_arg)
  File "/nix/store/134l79vxb91w8mhxxkj6kb5llf7dmwpm-python3-3.4.5/lib/python3.4/inspect.py", line 1890, in _signature_from_builtin
    raise ValueError("no signature found for builtin {!r}".format(func))
ValueError: no signature found for builtin <built-in function example>
In Python 3.4.5 and Cython 0.24.1
I've retracted my duplicate suggestion (saying that it was impossible...) having investigated further. It seems to work fine with reasonably recent versions of Cython (v0.23.4) and Python 3.4.4.
import cython
import inspect
scope = cython.inline("""def f(a,*args,b=False): pass """)
print(inspect.getfullargspec(scope['f']))
gives the output
FullArgSpec(args=['a'], varargs='args', varkw=None, defaults=None, kwonlyargs=['b'], kwonlydefaults={'b': False}, annotations={})
Also mentioned in the documentation is the compilation option "binding" which apparently makes this detail more accessible (although I didn't need it).
I have a feeling that this may depend on improvements to inspect made relatively recently (possibly this fix) so if you're using Python 2 you're probably out of luck.
Edit: your example works if you use the binding compilation option:
import cython

@cython.binding(True)
def example(a, b=None):
    pass
I suspect that inline adds it automatically (but the code to do inline is sufficiently convoluted that I can't find proof of that either way). You can also set it as a file-level option.
The answer above using the binding decorator works for me when running code that has been cythonized. But, when I was running that same code within a Django 2.2 app, the application would fail on start with an error that cython has no attribute 'binding'. To avoid this I have added this "special cython header" at the top of my file containing the cythonized function as documented here to achieve the same results.
# cython: binding=True
def example(a, b=None):
    pass
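Once the signature is exposed, the available keyword arguments can be listed the same way as for an ordinary Python function. The sketch below uses a pure-Python stand-in for the Cython function so that it runs without Cython; the extra keyword-only parameter c is added purely for illustration.

```python
import inspect


def example(a, b=None, *, c=False):
    """Pure-Python stand-in for the Cython function (c is added for illustration)."""


signature = inspect.signature(example)

# Parameters that may be passed by keyword.
keyword_kinds = (
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.KEYWORD_ONLY,
)
keyword_names = [
    name for name, p in signature.parameters.items() if p.kind in keyword_kinds
]

# Parameters that have a default value.
defaults = {
    name: p.default
    for name, p in signature.parameters.items()
    if p.default is not inspect.Parameter.empty
}

print(keyword_names)  # ['a', 'b', 'c']
print(defaults)       # {'b': None, 'c': False}
```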

IPython representation of classes

I was trying IPython with a module I created and it does not show the actual representation of class objects. Instead it shows something like
TheClass.__module__ + '.' + TheClass.__name__
I heavily use metaclasses in this module and I have really meaningful class representations that should be shown to the user.
Is there an IPython specific method I can change to make the right representation available instead of this namespace thingy that is quite useless in this application?
Or, if that's not possible, how can I customize my version of IPython to show the information I want?
EDIT
As complementary information, if I get a class and change the __module__ attribute to e.g. None, it blows with this traceback when trying to show the representation:
Traceback (most recent call last):
... [Huge traceback] ...
File "C:\Python32\lib\site-packages\IPython\lib\pretty.py", line 599, in _type_pprint
name = obj.__module__ + '.' + obj.__name__
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
So my expectations were right and this function is used to show class objects:
def _type_pprint(obj, p, cycle):
I tried customizing it in my class but I don't think I'm doing it right. This module IPython.lib.pretty does have a big dictionary linking type (the parent of metaclasses) with this function.
EDIT 2
Things I tried:
Adding the _repr_pretty_ function to the metaclass. It does work with instances but not with classes...
Using the function IPython.lib.pretty.for_type(typ, func). It only changes the big dictionary I mentioned above, but not the copy of it made by the RepresentationPrinter instance... So this function has no use at all?!
Calling the magic function %pprint. It disables (or enables) the pretty-print feature, using the default Python __repr__ for all objects. That's bad because the pretty printing of lists, dicts and many others is quite nice.
The first approach is more of what I want because it does not affect the environment and is specific for this class.
This is only an issue with IPython 0.12 and older versions. It is now possible to do:
class A(type):
    def _repr_pretty_(cls, p, cycle):
        p.text(repr(cls))

    def __repr__(cls):
        return 'This Information'

class B:  # or for Py3K: class B(metaclass=A):
    __metaclass__ = A
and it'll show the desired representation for B.
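The hook can be exercised outside IPython to check that it delegates to the metaclass __repr__. This is a sketch in Python 3 syntax; FakePrinter is a hypothetical stand-in that only records text() calls and is not IPython's real printer class.

```python
class A(type):
    def _repr_pretty_(cls, p, cycle):
        # Delegate to the metaclass __repr__ below.
        p.text(repr(cls))

    def __repr__(cls):
        return 'This Information'


class B(metaclass=A):
    pass


class FakePrinter:
    """Hypothetical stand-in for a pretty printer; it only records text() calls."""

    def __init__(self):
        self.chunks = []

    def text(self, s):
        self.chunks.append(s)


printer = FakePrinter()
B._repr_pretty_(printer, False)  # roughly what a pretty printer would invoke
print(repr(B))         # This Information
print(printer.chunks)  # ['This Information']
```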

Problem grabbing parameters using execfile within a class

I have a problem similar to the first problem in this question, which as far as I can see went unanswered.
I have a file "config.py" which contains a lot of parameters to be used by a class (this config.py file will change), however I can't get these to propagate into the class via execfile.
In an example piece of code:
class Class():
    def __init__(self):
        execfile("config.py")
        print x

# config.py
x = "foo"

>>> t = Class()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __init__
NameError: global name 'x' is not defined
Any help welcome, or any better methods of retrieving parameters from a file to be used in a class.
Many Thanks.
I don't get what you're trying to do (and I don't like it, but that is just me), but to fix your problem do this (tested in Python 2.6):
class Class():
    def __init__(self):
        execfile('config.py', locals())  # Not recommended, maybe you want globals().
        print x
But from the doc:
Note: The default locals act as described for function locals() below: modifications to the default locals dictionary should not be attempted. Pass an explicit locals dictionary if you need to see effects of the code on locals after function execfile() returns. execfile() cannot be used reliably to modify a function’s locals.
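The documentation's advice to pass an explicit dictionary can be sketched as follows. This is written for Python 3, where execfile() no longer exists and exec() on the file's source takes its place. The config source is inlined as a string here instead of being read from config.py, and load_config is a hypothetical helper name.

```python
def load_config(source):
    """Run config source in its own namespace and return the names it defines."""
    namespace = {}
    exec(source, namespace)
    # exec() adds __builtins__ to the namespace; drop it and other dunder names.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


class Class:
    def __init__(self, source):
        # Keep the settings in a dict instead of trying to inject local variables.
        self.config = load_config(source)


t = Class('x = "foo"\n')
print(t.config["x"])  # foo
```

Reading the values back from the dictionary avoids relying on changes to a function's locals, which the quoted note says is unreliable.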
And about:
Any help welcome, or any better methods of retrieving parameters from a file to be used in a class.
You can use import.
Even though it might be convenient to keep configuration settings in a Python file, I would recommend against it. I think it opens up a whole set of problems that you don't really want to have to deal with. Anything could be placed in your configuration file, including malicious code.
I would use either the json module or the ConfigParser module to hold my configuration.
If you have trouble choosing between those two, I would recommend the json module. JSON is a simple yet flexible format for structured data.
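A minimal sketch of the json approach, written for Python 3. The file name config.json and the key x are assumptions carried over from the question's example; the temporary directory exists only to keep the sketch self-contained.

```python
import json
import os
import tempfile


class Class:
    def __init__(self, config_path):
        # Parse the file as data; nothing in it is executed.
        with open(config_path) as handle:
            config = json.load(handle)
        self.x = config["x"]


# Write a throwaway config file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as handle:
        json.dump({"x": "foo"}, handle)
    t = Class(path)

print(t.x)  # foo
```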
