Any way to bypass namedtuple's 255-argument limitation?
I'm using a namedtuple to hold sets of strings and their corresponding values.
I'm not using a dictionary, because I want the strings accessible as attributes.
Here's my code:
from collections import namedtuple
# Shortened for readability :-)
strings = namedtuple("strings", ['a0', 'a1', 'a2', ..., 'a400'])
my_strings = strings(value0, value1, value2, ..., value400)
Ideally, once my_strings is initialized, I should be able to do this:
print(my_strings.a1)
and get value1 printed back.
However, I get the following error instead:
strings(value0, value1, value2, ...value400)
^
SyntaxError: more than 255 arguments
It seems Python functions (including namedtuple's __init__()) do not accept more than 255 arguments when called.
Is there any way to bypass this issue and have named tuples with more than 255 items? Why is there a 255 arguments limit anyway?
This is a limit built into CPython; in versions before Python 3.7 you cannot specify more than 255 explicit arguments in a call, nor define a callable with more than 255 parameters. This applies to any function definition or call, not just named tuples.
Note that this limit has been lifted in Python 3.7 and newer, where the number of arguments is effectively unbounded. See What is a maximum number of arguments in a Python function?
It is the generated code for the class that is hitting this limit: namedtuple generates a __new__ method with one named parameter per field, and a function with more than 255 parameters simply cannot be defined in those CPython versions.
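To see which side of the limit a given interpreter is on, you can compile a wide signature at runtime (a small sketch; the exact error wording may vary by version):

import sys

# Build a def statement with 300 parameters; CPython < 3.7 refuses to
# compile it, while 3.7+ accepts it without complaint.
params = ", ".join("a{}".format(i) for i in range(300))
source = "def f({}):\n    pass\n".format(params)
try:
    exec(source)
    print("no limit here, running", sys.version.split()[0])
except SyntaxError as exc:
    print("SyntaxError:", exc)  # 'more than 255 arguments' on CPython < 3.7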
You'll have to ask yourself, however, if you really should be using a different structure instead. It looks like you have a list-like piece of data to me; 400 numbered names is a sure sign of your data bleeding into your names.
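If attribute access is really all you need and tuple behaviour is not, one alternative worth weighing (a sketch, assuming Python 3.3+ for types.SimpleNamespace) is to build the namespace from a dict. Unpacking with ** never hits the old limit, because the arguments are not written out explicitly in the source:

from types import SimpleNamespace

values = range(400)  # stand-in for your 400 values
my_strings = SimpleNamespace(**{'a{}'.format(i): v for i, v in enumerate(values)})
print(my_strings.a1)  # -> 1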
You can work around this by creating your own subclass, manually:
from operator import itemgetter
from collections import OrderedDict

class strings(tuple):
    __slots__ = ()
    _fields = tuple('a{}'.format(i) for i in range(400))

    def __new__(cls, *args, **kwargs):
        req = len(cls._fields)
        if len(args) + len(kwargs) > req:
            raise TypeError(
                '__new__() takes {} positional arguments but {} were given'.format(
                    req, len(args) + len(kwargs)))
        if kwargs.keys() > set(cls._fields):
            raise TypeError(
                '__new__() got an unexpected keyword argument {!r}'.format(
                    (kwargs.keys() - set(cls._fields)).pop()))
        missing = req - len(args)
        if kwargs.keys() & set(cls._fields[:-missing]):
            raise TypeError(
                '__new__() got multiple values for argument {!r}'.format(
                    (kwargs.keys() & set(cls._fields[:-missing])).pop()))
        try:
            for field in cls._fields[-missing:]:
                args += (kwargs[field],)
                missing -= 1
        except KeyError:
            pass
        if len(args) < req:
            raise TypeError('__new__() missing {} positional argument{}: {}'.format(
                missing, 's' if missing > 1 else '',
                ' and '.join(filter(None, [
                    ', '.join(map(repr, cls._fields[-missing:-1])),
                    repr(cls._fields[-1])]))))
        return tuple.__new__(cls, args)

    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new strings object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != len(cls._fields):
            raise TypeError('Expected %d arguments, got %d' % (
                len(cls._fields), len(result)))
        return result

    def __repr__(self):
        'Return a nicely formatted representation string'
        format = '{}({})'.format(
            self.__class__.__name__,
            ', '.join('{}=%r'.format(n) for n in self._fields))
        return format % self

    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))

    __dict__ = property(_asdict)

    def _replace(self, **kwds):
        'Return a new strings object replacing specified fields with new values'
        result = self._make(map(kwds.pop, self._fields, self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % list(kwds))
        return result

    def __getnewargs__(self):
        'Return self as a plain tuple. Used by copy and pickle.'
        return tuple(self)

    def __getstate__(self):
        'Exclude the OrderedDict from pickling'
        return None

for i, name in enumerate(strings._fields):
    setattr(strings, name,
            property(itemgetter(i), doc='Alias for field number {}'.format(i)))
This version of the named tuple avoids the long argument lists altogether, but otherwise behaves exactly like the original. The somewhat verbose __new__ method is not strictly needed but does closely emulate the original behaviour when arguments are incomplete. Note the construction of the _fields attribute; replace this with your own to name your tuple fields.
Pass in a generator expression to set your arguments:
s = strings(i for i in range(400))
or if you have a list of values:
s = strings(iter(list_of_values))
Either technique bypasses the limits on function signatures and function call argument counts.
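The _make() alternate constructor, which the class above keeps from the namedtuple API and which takes a single iterable, works for the same reason:

s = strings._make(range(400))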
Demo:
>>> s = strings(i for i in range(400))
>>> s
strings(a0=0, a1=1, a2=2, a3=3, a4=4, a5=5, a6=6, a7=7, ..., a396=396, a397=397, a398=398, a399=399)
>>> s.a391
391
namedtuple out of the box doesn't support what you are trying to do.
So the following might achieve the goal, whether the argument count grows from 400 to 450, or shrinks to something saner.
def customtuple(*keys):
    class string(object):
        _keys = keys
        _dict = {}  # class-level fallback so __getattr__ never recurses

        def __init__(self, *args):
            if len(args) != len(self._keys):
                raise Exception("No go forward")
            # bypass the frozen __setattr__ to give each instance its own
            # dict (the original class-level dict was shared by all instances)
            object.__setattr__(self, '_dict', dict(zip(self._keys, args)))

        def __setattr__(self, *args):
            raise BaseException("Not allowed")

        def __getattr__(self, arg):
            try:
                return self._dict[arg]
            except KeyError:
                raise BaseException("Name not defined")

        def __repr__(self):
            return ("string(%s)"
                    % ", ".join("%s=%r" % (key, self._dict[key])
                                for key in self._keys))

    return string
>>> strings = customtuple(*['a'+str(x) for x in range(1, 401)])
>>> s = strings(*['a'+str(x) for x in range(2, 402)])
>>> s.a1
'a2'
>>> s.a1 = 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/hus787/p.py", line 15, in __setattr__
def __setattr__(self, *args):
BaseException: Not allowed
Here is my version of a replacement for namedtuple that supports more than 255 arguments. The idea was not to be functionally equivalent but rather to improve on some aspects (in my opinion). This is for Python 3.4+ only:
class SequenceAttrReader(object):
    """ Class to function similar to collections.namedtuple but allowing more than 255 keys.
    Initialize with an attribute string (space separated), then load in data via a sequence,
    then access the keys as properties, i.e.

        csv_line = SequenceAttrReader('a b c')
        csv_line.load_data([1, 2, 3])
        print(csv_line.b)
        >> 2
    """
    _attr_string = None
    _attr_list = []
    _data_list = []

    def __init__(self, attr_string):
        if not attr_string:
            raise AttributeError('SequenceAttrReader not properly initialized, please use a non-empty string')
        self._attr_string = attr_string
        self._attr_list = attr_string.split(' ')

    def __getattr__(self, name):
        if not self._attr_string or not self._attr_list or not self._data_list:
            raise AttributeError('SequenceAttrReader not properly initialized or loaded')
        try:
            index = self._attr_list.index(name)
        except ValueError:
            raise AttributeError("'{name}' not in attribute string".format(name=name)) from None
        try:
            value = self._data_list[index]
        except IndexError:
            raise AttributeError("No data loaded for attribute '{name}'".format(name=name)) from None
        return value

    def __str__(self):
        return str(self._data_list)

    def __repr__(self):
        return 'SequenceAttrReader("{attr_string}")'.format(attr_string=self._attr_string)

    def load_data(self, data_list):
        if not self._attr_list:
            raise AttributeError('SequenceAttrReader not properly initialized')
        if not data_list:
            raise AttributeError('SequenceAttrReader needs to load a non-empty sequence')
        self._data_list = data_list
This is probably not the most efficient way if you are doing a lot of individual lookups; converting it internally to a dict may be better. I'll work on an optimized version once I have more time, or at least see what the performance difference is.
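For what it's worth, a minimal sketch of that dict optimisation, under hypothetical names: a subclass that precomputes a name-to-position map once, so each attribute lookup is a dict hit instead of a list scan.

class IndexedAttrReader(SequenceAttrReader):
    """Hypothetical variant caching a name -> position map for O(1) lookups."""

    def __init__(self, attr_string):
        super().__init__(attr_string)
        self._attr_index = {name: i for i, name in enumerate(self._attr_list)}

    def __getattr__(self, name):
        # look the index up in the cached dict instead of scanning the list;
        # going through __dict__ avoids recursing into __getattr__ itself
        index = self.__dict__.get('_attr_index', {}).get(name)
        if index is None:
            raise AttributeError("'{name}' not in attribute string".format(name=name))
        try:
            return self._data_list[index]
        except IndexError:
            raise AttributeError("No data loaded for attribute '{name}'".format(name=name)) from None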
Related
Subclassing Template to provide default arguments
I am subclassing Template from string to give it some extra defaulting capabilities. The idea is for its look-up to extend beyond the passed dict to the locals() first, then to the globals() and finally default (e.g., to '-'). So this is what I wrote:

class MyTemplate(Template):
    def substitute_default(*args, **kws):
        if not args:
            raise TypeError("descriptor 'substitute' of 'Template' object needs an argument")
        self, *args = args  # allow the "self" keyword be passed
        if len(args) > 1:
            raise TypeError('Too many positional arguments')
        if not args:
            mapping = kws
        elif kws:
            mapping = ChainMap(kws, args[0])
        else:
            mapping = args[0]

        def convert(mo):
            named = mo.group('named') or mo.group('braced')
            if named is not None:
                val = mapping.get(named, locals().get(named, globals().get(named, '-')))
                return '%s' % (val,)
            if mo.group('escaped') is not None:
                return self.delimiter
            if mo.group('invalid') is not None:
                self._invalid(mo)
            raise ValueError('Unrecognized named group in pattern', self.pattern)

        return self.pattern.sub(convert, self.template)

The line with the juice is this:

val = mapping.get(named, locals().get(named, globals().get(named, '-')))

I am testing it like so:

a = 'global_foo'

def f():
    b = 'local_foo'
    t = MyTemplate('''a=$a, b=$b, c=$c, d=$d''')
    text = t.substitute_default({'c': 'foo', 'd': 'bar'})
    print(text)

f()  # -> a=global_foo, b=-, c=foo, d=bar

As you can see, the globals() look-up works but the locals() one does not. Does anyone have an idea as to why this might be the case? Is there a better way to do it?
The problem is that locals() inside convert refers to convert's own local scope, not to the locals of f, which is what you want. You have to pass the locals() dictionary in somehow, either in the constructor or at the call site, for it to work.
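One way to act on that advice (a sketch only, not the asker's API; sys._getframe is CPython-specific): capture the caller's frame explicitly and build the lookup chain by hand, with a dict subclass supplying the '-' default.

import sys
from string import Template

class _DefaultingMap(dict):
    # any name not found anywhere falls back to '-'
    def __missing__(self, key):
        return '-'

def substitute_default(template, mapping):
    # priority: explicit mapping, then the caller's locals, then globals
    caller = sys._getframe(1)
    merged = _DefaultingMap(caller.f_globals)  # lowest priority
    merged.update(caller.f_locals)             # then the caller's locals
    merged.update(mapping)                     # the explicit mapping wins
    return Template(template).substitute(merged)

def f():
    b = 'local_foo'
    print(substitute_default('a=$a, b=$b, c=$c, d=$d', {'c': 'foo', 'd': 'bar'}))

a = 'global_foo'
f()  # -> a=global_foo, b=local_foo, c=foo, d=bar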
Class Factory do not work with isinstance
I am trying to write a version of python's namedtuple that takes the types of the fields, and double checks that the fields are the correct types at creation. In order to do this, I created a factory function that returns a custom class. However, it's impossible to use instances of these classes with isinstance. Is there an easy way to fix this? Edit: To make the question more clear, neither of the if statements at the end of my code prints anything, because the conditions evaluate to false. I'd like a simple way to determine which objects are instances of the class. Code below.

from collections import namedtuple

# TYPED VERSION OF A NAMEDTUPLE
def typedtuple(name, fields, types):
    if len(fields) != len(types):
        raise TypeError("typedtuple must have the same number of fields and types")
    Base = namedtuple(name, fields)

    class TypedTuple(Base):
        def __new__(cls, *args):
            if len(fields) < len(args):
                raise TypeError("Too many arguments to typedtuple"
                                "{}; got {} and expected {}".format(
                                    name, len(args), len(fields)))
            if len(fields) > len(args):
                raise TypeError("Not enough arguments to typedtuple"
                                "{}; got {} and expected {}".format(
                                    name, len(args), len(fields)))
            i = 0
            for a, t in zip(args, types):
                if not isinstance(a, t):
                    raise TypeError("Wrong type of argument to typedtuple"
                                    "{}; {} at position {} is not of type {}".format(
                                        name, a, i, t))
                i += 1
            return Base(*args)

    TypedTuple.__name__ = name
    return TypedTuple

GaussianInteger = typedtuple("GaussianInteger", ["real", "imag"], [int, int])

point = GaussianInteger(1, 2)
print point
print type(point)
print GaussianInteger
if isinstance(point, GaussianInteger):
    print "it is an instance"
if type(point) == GaussianInteger:
    print "it is the type of"
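For what it's worth, the failure is in __new__: it returns Base(*args), so every object produced is an instance of the plain namedtuple Base, never of TypedTuple. A hedged sketch of the usual fix, constructing through the subclass instead:

class TypedTuple(Base):
    def __new__(cls, *args):
        # ... keep the same length and type checks as in the question ...
        # building through cls (not Base) is what makes
        # isinstance(point, GaussianInteger) come out true
        return super(TypedTuple, cls).__new__(cls, *args)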
Dynamically subclass an Enum base class
I've set up a metaclass and base class pair for creating the line specifications of several different file types I have to parse. I have decided to go with using enumerations because many of the individual parts of the different lines in the same file often have the same name. Enums make it easy to tell them apart. Additionally, the specification is rigid and there will be no need to add more members, or extend the line specifications later. The specification classes work as expected. However, I am having some trouble dynamically creating them:

>>> C1 = LineMakerMeta('C1', (LineMakerBase,), dict(a = 0))
AttributeError: 'dict' object has no attribute '_member_names'

Is there a way around this? The example below works just fine:

class A1(LineMakerBase):
    Mode = 0, dict(fill=' ', align='>', type='s')
    Level = 8, dict(fill=' ', align='>', type='d')
    Method = 10, dict(fill=' ', align='>', type='d')
    _dummy = 20  # so that Method has a known length

A1.format(**dict(Mode='DESIGN', Level=3, Method=1))
# produces ' DESIGN 3 1'

The metaclass is based on enum.EnumMeta, and looks like this:

import enum

class LineMakerMeta(enum.EnumMeta):
    "Metaclass to produce formattable LineMaker child classes."
    def _iter_format(cls):
        "Iteratively generate formatters for the class members."
        for member in cls:
            yield member.formatter
    def __str__(cls):
        "Returns string line with all default values."
        return cls.format()
    def format(cls, **kwargs):
        "Create formatted version of the line populated by the kwargs members."
        # build resulting string by iterating through members
        result = ''
        for member in cls:
            # determine value to be injected into member
            try:
                try:
                    value = kwargs[member]
                except KeyError:
                    value = kwargs[member.name]
            except KeyError:
                value = member.default
            value_str = member.populate(value)
            result = result + value_str
        return result

And the base class is as follows:

class LineMakerBase(enum.Enum, metaclass=LineMakerMeta):
    """A base class for creating Enum subclasses used for populating lines of a file.

    Usage:
        class LineMaker(LineMakerBase):
            a = 0, dict(align='>', fill=' ', type='f'), 3.14
            b = 10, dict(align='>', fill=' ', type='d'), 1
            b = 15, dict(align='>', fill=' ', type='s'), 'foo'
        #   ^-start ^---spec dictionary                  ^--default
    """
    def __init__(member, start, spec={}, default=None):
        member.start = start
        member.spec = spec
        if default is not None:
            member.default = default
        else:
            # assume value is numerical for all provided types other than 's' (string)
            default_or_set_type = member.spec.get('type', 's')
            default = {'s': ''}.get(default_or_set_type, 0)
            member.default = default

    @property
    def formatter(member):
        """Produces a formatter in form of '{0:<format>}' based on the member.spec
        dictionary. The member.spec dictionary makes use of these keys ONLY (see
        the string.format docs): fill align sign width grouping_option precision type"""
        try:
            # get cached value
            return '{{0:{}}}'.format(member._formatter)
        except AttributeError:
            # add width to format spec if not there
            member.spec.setdefault('width', member.length if member.length != 0 else '')
            # build formatter using the available parts in the member.spec dictionary
            # any missing parts will simply not be present in the formatter
            formatter = ''
            for part in 'fill align sign width grouping_option precision type'.split():
                try:
                    spec_value = member.spec[part]
                except KeyError:
                    # missing part
                    continue
                else:
                    # add part
                    sub_formatter = '{!s}'.format(spec_value)
                    formatter = formatter + sub_formatter
            member._formatter = formatter
            return '{{0:{}}}'.format(formatter)

    def populate(member, value=None):
        "Injects the value into the member's formatter and returns the formatted string."
        formatter = member.formatter
        if value is not None:
            value_str = formatter.format(value)
        else:
            value_str = formatter.format(member.default)
        if len(value_str) > len(member) and len(member) != 0:
            raise ValueError(
                'Length of object string {} ({}) exceeds available'
                ' field length for {} ({}).'
                .format(value_str, len(value_str), member.name, len(member)))
        return value_str

    @property
    def length(member):
        return len(member)

    def __len__(member):
        """Returns the length of the member field. The last member has no length.
        Lengths are based on simple subtraction of starting positions."""
        # get cached value
        try:
            return member._length
        # calculate member length
        except AttributeError:
            # compare by member values because member could be an alias
            members = list(type(member))
            try:
                next_index = next(
                    i+1 for i, m in enumerate(type(member))
                    if m.value == member.value
                )
            except StopIteration:
                raise TypeError(
                    'The member value {} was not located in the {}.'
                    .format(member.value, type(member).__name__)
                )
            try:
                next_member = members[next_index]
            except IndexError:
                # last member defaults to no length
                length = 0
            else:
                length = next_member.start - member.start
            member._length = length
            return length
This line:

C1 = enum.EnumMeta('C1', (), dict(a = 0))

fails with exactly the same error message. The __new__ method of EnumMeta expects an instance of enum._EnumDict as its last argument. _EnumDict is a subclass of dict and provides an instance variable named _member_names, which of course a regular dict doesn't have. When you go through the standard mechanism of enum creation, this all happens correctly behind the scenes. That's why your other example works just fine.

This line:

C1 = enum.EnumMeta('C1', (), enum._EnumDict())

runs with no error. Unfortunately, the constructor of _EnumDict is defined as taking no arguments, so you can't initialize it with keywords as you apparently want to do.

In the implementation of enum that's backported to Python 3.3, the following block of code appears in the constructor of EnumMeta. You could do something similar in your LineMakerMeta class:

def __new__(metacls, cls, bases, classdict):
    if type(classdict) is dict:
        original_dict = classdict
        classdict = _EnumDict()
        for k, v in original_dict.items():
            classdict[k] = v

In the official implementation, in Python 3.5, the if statement and the subsequent block of code are gone for some reason. Therefore classdict must be an honest-to-god _EnumDict, and I don't see why this was done. In any case the implementation of Enum is extremely complicated and handles a lot of corner cases. I realize this is not a cut-and-dried answer to your question but I hope it will point you to a solution.
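Applying the same idea to the class in the question might look like the following sketch (untested across Python versions; it leans on EnumMeta.__prepare__ to build a properly initialized _EnumDict rather than touching the private class directly):

import enum

class LineMakerMeta(enum.EnumMeta):
    def __new__(metacls, cls, bases, classdict):
        # accept a plain dict by copying it into the _EnumDict that
        # EnumMeta.__prepare__ would normally have produced
        if type(classdict) is dict:
            enum_dict = metacls.__prepare__(cls, bases)
            for key, value in classdict.items():
                enum_dict[key] = value
            classdict = enum_dict
        return super().__new__(metacls, cls, bases, classdict)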
Create your LineMakerBase class, and then use it like so:

C1 = LineMakerBase('C1', dict(a=0))

The metaclass was not meant to be used the way you are trying to use it. Check out this answer for advice on when metaclass subclasses are needed.

Some suggestions for your code:

The double try/except in format seems clearer as:

for member in cls:
    if member in kwargs:
        value = kwargs[member]
    elif member.name in kwargs:
        value = kwargs[member.name]
    else:
        value = member.default

This code:

# compare by member values because member could be an alias
members = list(type(member))

would be clearer with list(member.__class__), and has a false comment: listing an Enum class will never include the aliases (unless you have overridden that part of EnumMeta).

Instead of the complicated __len__ code you have now, and as long as you are subclassing EnumMeta, you should extend __new__ to automatically calculate the lengths once:

# untested
def __new__(metacls, cls, bases, clsdict):
    # let the main EnumMeta code do the heavy lifting
    enum_cls = super(LineMakerMeta, metacls).__new__(metacls, cls, bases, clsdict)
    # go through the members and calculate the lengths
    canonical_members = [
        member
        for name, member in enum_cls.__members__.items()
        if name == member.name
        ]
    last_member = None
    for next_member in canonical_members:
        next_member.length = 0
        if last_member is not None:
            last_member.length = next_member.start - last_member.start
        last_member = next_member
    return enum_cls
The simplest way to create Enum subclasses on the fly is using Enum itself:

>>> from enum import Enum
>>> MyEnum = Enum('MyEnum', {'a': 0})
>>> MyEnum
<enum 'MyEnum'>
>>> MyEnum.a
<MyEnum.a: 0>
>>> type(MyEnum)
<class 'enum.EnumMeta'>

As for your custom methods, it might be simpler if you used regular functions, precisely because Enum implementation is so special.
Elegant pattern for mutually exclusive keyword args?
Sometimes in my code I have a function which can take an argument in one of two ways. Something like:

def func(objname=None, objtype=None):
    if objname is not None and objtype is not None:
        raise ValueError("only 1 of the ways at a time")
    if objname is not None:
        obj = getObjByName(objname)
    elif objtype is not None:
        obj = getObjByType(objtype)
    else:
        raise ValueError("not given any of the ways")
    doStuffWithObj(obj)

Is there any more elegant way to do this? What if the arg could come in one of three ways? If the types are distinct I could do:

def func(objnameOrType):
    if type(objnameOrType) is str:
        getObjByName(objnameOrType)
    elif type(objnameOrType) is type:
        getObjByType(objnameOrType)
    else:
        raise ValueError("unk arg type: %s" % type(objnameOrType))

But what if they are not? This alternative seems silly:

def func(objnameOrType, isName=True):
    if isName:
        getObjByName(objnameOrType)
    else:
        getObjByType(objnameOrType)

cause then you have to call it like func(mytype, isName=False) which is weird.
How about using something like a command dispatch pattern:

def funct(objnameOrType):
    dispatcher = {str: getObjByName,
                  type1: getObjByType1,
                  type2: getObjByType2}
    t = type(objnameOrType)
    obj = dispatcher[t](objnameOrType)
    doStuffWithObj(obj)

where type1, type2, etc. are actual python types (e.g. int, float, etc.).
Sounds like it should go to https://codereview.stackexchange.com/ Anyway, keeping the same interface, I may try

arg_parsers = {
    'objname': getObjByName,
    'objtype': getObjByType,
    ...
}

def func(**kwargs):
    assert len(kwargs) == 1  # replace this with your favorite exception
    (argtypename, argval) = next(iter(kwargs.items()))
    obj = arg_parsers[argtypename](argval)
    doStuffWithObj(obj)

or simply create 2 functions?

def funcByName(name): ...
def funcByType(type_): ...
One way to make it slightly shorter is

def func(objname=None, objtype=None):
    if [objname, objtype].count(None) != 1:
        raise TypeError("Exactly 1 of the ways must be used.")
    if objname is not None:
        obj = getObjByName(objname)
    else:
        obj = getObjByType(objtype)

I have not yet decided if I would call this "elegant". Note that you should raise a TypeError if the wrong number of arguments was given, not a ValueError.
For whatever it's worth, similar kinds of things happen in the Standard Libraries; see, for example, the beginning of GzipFile in gzip.py (shown here with docstrings removed):

class GzipFile:
    myfileobj = None
    max_read_chunk = 10 * 1024 * 1024  # 10Mb

    def __init__(self, filename=None, mode=None,
                 compresslevel=9, fileobj=None):
        if mode and 'b' not in mode:
            mode += 'b'
        if fileobj is None:
            fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
        if filename is None:
            if hasattr(fileobj, 'name'):
                filename = fileobj.name
            else:
                filename = ''
        if mode is None:
            if hasattr(fileobj, 'mode'):
                mode = fileobj.mode
            else:
                mode = 'rb'

Of course this accepts both filename and fileobj keywords and defines a particular behavior in the case that it receives both; but the general approach seems pretty much identical.
I use a decorator:

from functools import wraps

def one_of(kwarg_names):
    # assert that one and only one of the given kwarg names are passed to the decorated function
    def inner(f):
        @wraps(f)
        def wrapped(*args, **kwargs):
            count = 0
            for kw in kwargs:
                if kw in kwarg_names and kwargs[kw] is not None:
                    count += 1
            assert count == 1, f'exactly one of {kwarg_names} required, got {kwargs}'
            return f(*args, **kwargs)
        return wrapped
    return inner

Used as:

@one_of(['kwarg1', 'kwarg2'])
def my_func(kwarg1='default', kwarg2='default'):
    pass

Note that this only accounts for non-None values that are passed as keyword arguments. E.g. multiple of the kwarg_names may still be passed if all but one of them have a value of None. To allow for passing none of the kwargs simply assert that the count is <= 1.
It sounds like you're looking for function overloading, which isn't implemented in Python 2. In Python 2, your solution is nearly as good as you can expect to get. You could probably bypass the extra argument problem by allowing your function to process multiple objects and return a generator:

import types

all_types = set([getattr(types, t) for t in dir(types) if t.endswith('Type')])

def func(*args):
    for arg in args:
        if arg in all_types:
            yield getObjByType(arg)
        else:
            yield getObjByName(arg)

Test:

>>> getObjByName = lambda a: {'Name': a}
>>> getObjByType = lambda a: {'Type': a}
>>> list(func('IntType'))
[{'Name': 'IntType'}]
>>> list(func(types.IntType))
[{'Type': <type 'int'>}]
The built-in sum() can be used on a list of boolean expressions. In Python, bool is a subclass of int, and in arithmetic operations, True behaves as 1, and False behaves as 0. This means that this rather short code will test mutual exclusivity for any number of arguments:

def do_something(a=None, b=None, c=None):
    if sum([a is not None, b is not None, c is not None]) != 1:
        raise TypeError("specify exactly one of 'a', 'b', or 'c'")

Variations are also possible:

def do_something(a=None, b=None, c=None):
    if sum([a is not None, b is not None, c is not None]) > 1:
        raise TypeError("specify at most one of 'a', 'b', or 'c'")
I occasionally run into this problem as well, and it is hard to find an easily generalisable solution. Say I have more complex combinations of arguments that are delineated by a set of mutually exclusive arguments and want to support additional arguments for each (some of which may be required and some optional), as in the following signatures:

def func(mutex1: str, arg1: bool): ...
def func(mutex2: str): ...
def func(mutex3: int, arg1: Optional[bool] = None): ...

I would use object orientation to wrap the arguments in a set of descriptors (with names depending on the business meaning of the arguments), which can then be validated by something like pydantic:

from typing import Optional
from pydantic import BaseModel, Extra

# Extra.forbid ensures validation error if superfluous arguments are provided
class BaseDescription(BaseModel, extra=Extra.forbid):
    pass  # arguments common to all descriptions go here

class Description1(BaseDescription):
    mutex1: str
    arg1: bool

class Description2(BaseDescription):
    mutex2: str

class Description3(BaseDescription):
    mutex3: int
    arg1: Optional[bool]

You could instantiate these descriptions with a factory:

class DescriptionFactory:
    _class_map = {
        'mutex1': Description1,
        'mutex2': Description2,
        'mutex3': Description3,
    }

    @classmethod
    def from_kwargs(cls, **kwargs) -> BaseDescription:
        kwargs = {k: v for k, v in kwargs.items() if v is not None}
        set_fields = kwargs.keys() & cls._class_map.keys()
        try:
            [set_field] = set_fields
        except ValueError:
            raise ValueError(f"exactly one of {list(cls._class_map.keys())} must be provided")
        return cls._class_map[set_field](**kwargs)

    @classmethod
    def validate_kwargs(cls, func):
        def wrapped(**kwargs):
            return func(cls.from_kwargs(**kwargs))
        return wrapped

Then you can wrap your actual function implementation like this and use type checking to see which arguments were provided:

@DescriptionFactory.validate_kwargs
def func(desc: BaseDescription):
    if isinstance(desc, Description1):
        ...  # use desc.mutex1 and desc.arg1
    elif isinstance(desc, Description2):
        ...  # use desc.mutex2
    ...  # etc.

and call as func(mutex1='', arg1=True), func(mutex2=''), func(mutex3=123) and so on. This is not overall shorter code, but it performs argument validation in a very descriptive way according to your specification, raises useful pydantic errors when validation fails, and results in accurate static types in each branch of the function implementation. Note that if you're using Python 3.10+, structural pattern matching could simplify some parts of this.
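As a sketch of that last remark (hypothetical, Python 3.10+ only; class patterns check isinstance and then look up the named attributes, so they work with these pydantic models):

def func(desc: BaseDescription):
    match desc:
        case Description1(mutex1=m1, arg1=a1):
            ...  # use m1 and a1
        case Description2(mutex2=m2):
            ...  # use m2
        case Description3(mutex3=m3):
            ...  # use m3 and desc.arg1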
How can I get the values of the locals of a function after it has been executed?
Suppose I have a function like f(a, b, c=None). The aim is to call the function like f(*args, **kwargs), and then construct a new set of args and kwargs such that:

1. If the function had default values, I should be able to acquire their values. For example, if I call it like f(1, 2), I should be able to get the tuple (1, 2, None) and/or the dictionary {'c': None}.
2. If the value of any of the arguments was modified inside the function, get the new value. For example, if I call it like f(1, 100000, 3) and the function does if b > 500: b = 5, modifying the local variable, I should be able to get the tuple (1, 5, 3).

The aim here is to create a decorator that finishes the job of a function. The original function acts as a preamble setting up the data for the actual execution, and the decorator finishes the job.

Edit: I'm adding an example of what I'm trying to do. It's a module for making proxies for other classes.

class Spam(object):
    """A fictional class that we'll make a proxy for"""
    def eggs(self, start, stop, step):
        """A fictional method"""
        return range(start, stop, step)

class ProxyForSpam(clsproxy.Proxy):
    proxy_for = Spam

    @clsproxy.signature_preamble
    def eggs(self, start, stop, step=1):
        start = max(0, start)
        stop = min(100, stop)

And then, we'll have that:

ProxyForSpam().eggs(-10, 200) -> Spam().eggs(0, 100, 1)
ProxyForSpam().eggs(3, 4)     -> Spam().eggs(3, 4, 1)
There are two recipes available here, one which requires an external library and another that uses only the standard library. They don't quite do what you want, in that they actually modify the function being executed to obtain its locals() rather than obtaining the locals() after function execution, which is impossible, since the local stack no longer exists once the function finishes executing. Another option is to see what debuggers, such as WinPDB or even the pdb module, do. I suspect they use the inspect module (possibly along with others) to get the frame inside which a function is executing and retrieve locals() that way. EDIT: After reading some code in the standard library, the file you want to look at is probably bdb.py, which should be wherever the rest of your Python standard library is. Specifically, look at set_trace() and related functions. This will give you an idea of how the Python debugger breaks into a running function. You might even be able to use it directly. To get the frame to pass to set_trace(), look at the inspect module.
I've stumbled upon this very need today and wanted to share my solution.

import sys

def call_function_get_frame(func, *args, **kwargs):
    """
    Calls the function *func* with the specified arguments and keyword
    arguments and snatches its local frame before it actually executes.
    """
    frame = None
    trace = sys.gettrace()
    def snatch_locals(_frame, name, arg):
        nonlocal frame
        if frame is None and name == 'call':
            frame = _frame
            sys.settrace(trace)
        return trace
    sys.settrace(snatch_locals)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(trace)
    return frame, result

The idea is to use sys.settrace() to catch the frame of the next 'call' event. Tested on CPython 3.6.

Example usage:

import types

def namespace_decorator(func):
    frame, result = call_function_get_frame(func)
    try:
        module = types.ModuleType(func.__name__)
        module.__dict__.update(frame.f_locals)
        return module
    finally:
        del frame

@namespace_decorator
def mynamespace():
    eggs = 'spam'
    class Bar:
        def hello(self):
            print("Hello, World!")

assert mynamespace.eggs == 'spam'
mynamespace.Bar().hello()
I don't see how you could do this non-intrusively -- after the function is done executing, it doesn't exist any more -- there's no way you can reach inside something that doesn't exist. If you can control the functions that are being used, you can do an intrusive approach like

def fn(x, y, z, vars):
    '''
    vars is an empty dict that we use to pass things back to the caller
    '''
    x += 1
    y -= 1
    z *= 2
    vars.update(locals())

>>> updated = {}
>>> fn(1, 2, 3, updated)
>>> print updated
{'y': 1, 'x': 2, 'z': 6, 'vars': {...}}

...or you can just require that those functions return locals() -- as @Thomas K asks above, what are you really trying to do here?
Witchcraft below, read on your OWN danger(!)

I have no clue what you want to do with this, it's possible but it's an awful hack... Anyways, I HAVE WARNED YOU(!), be lucky if such things don't work in your favorite language...

from inspect import getargspec, ismethod
import inspect

def main():
    @get_modified_values
    def foo(a, f, b):
        print a, f, b
        a = 10
        if a == 2:
            return a
        f = 'Hello World'
        b = 1223

    e = 1
    c = 2
    foo(e, 1000, b = c)

# intercept a function and retrieve the modifed values
def get_modified_values(target):
    def wrapper(*args, **kwargs):
        # get the applied args
        kargs = getcallargs(target, *args, **kwargs)

        # get the source code
        src = inspect.getsource(target)
        lines = src.split('\n')

        # oh noes string patching of the function
        unindent = len(lines[0]) - len(lines[0].lstrip())
        indent = lines[0][:len(lines[0]) - len(lines[0].lstrip())]

        lines[0] = ''
        lines[1] = indent + 'def _temp(_args, ' + lines[1].split('(')[1]
        setter = []
        for k in kargs.keys():
            setter.append('_args["%s"] = %s' % (k, k))

        i = 0
        while i < len(lines):
            indent = lines[i][:len(lines[i]) - len(lines[i].lstrip())]

            if lines[i].find('return ') != -1 or lines[i].find('return\n') != -1:
                for e in setter:
                    lines.insert(i, indent + e)
                i += len(setter)
            elif i == len(lines) - 2:
                for e in setter:
                    lines.insert(i + 1, indent + e)
                break
            i += 1

        for i in range(0, len(lines)):
            lines[i] = lines[i][unindent:]

        data = '\n'.join(lines) + "\n"

        # setup variables
        frame = inspect.currentframe()
        loc = inspect.getouterframes(frame)[1][0].f_locals
        glob = inspect.getouterframes(frame)[1][0].f_globals
        loc['_temp'] = None

        # compile patched function and call it
        func = compile(data, '<witchstuff>', 'exec')
        eval(func, glob, loc)
        loc['_temp'](kargs, *args, **kwargs)

        # there you go....
        print kargs
        # >> {'a': 10, 'b': 1223, 'f': 'Hello World'}

    return wrapper

# from python 2.7 inspect module
def getcallargs(func, *positional, **named):
    """Get the mapping of arguments to values.

    A dict is returned, with keys the function argument names (including the
    names of the * and ** arguments, if any), and values the respective bound
    values from 'positional' and 'named'."""
    args, varargs, varkw, defaults = getargspec(func)
    f_name = func.__name__
    arg2value = {}

    # The following closures are basically because of tuple parameter unpacking.
    assigned_tuple_params = []
    def assign(arg, value):
        if isinstance(arg, str):
            arg2value[arg] = value
        else:
            assigned_tuple_params.append(arg)
            value = iter(value)
            for i, subarg in enumerate(arg):
                try:
                    subvalue = next(value)
                except StopIteration:
                    raise ValueError('need more than %d %s to unpack' %
                                     (i, 'values' if i > 1 else 'value'))
                assign(subarg, subvalue)
            try:
                next(value)
            except StopIteration:
                pass
            else:
                raise ValueError('too many values to unpack')
    def is_assigned(arg):
        if isinstance(arg, str):
            return arg in arg2value
        return arg in assigned_tuple_params

    if ismethod(func) and func.im_self is not None:
        # implicit 'self' (or 'cls' for classmethods) argument
        positional = (func.im_self,) + positional
    num_pos = len(positional)
    num_total = num_pos + len(named)
    num_args = len(args)
    num_defaults = len(defaults) if defaults else 0
    for arg, value in zip(args, positional):
        assign(arg, value)
    if varargs:
        if num_pos > num_args:
            assign(varargs, positional[-(num_pos-num_args):])
        else:
            assign(varargs, ())
    elif 0 < num_args < num_pos:
        raise TypeError('%s() takes %s %d %s (%d given)' % (
            f_name, 'at most' if defaults else 'exactly', num_args,
            'arguments' if num_args > 1 else 'argument', num_total))
    elif num_args == 0 and num_total:
        raise TypeError('%s() takes no arguments (%d given)' %
                        (f_name, num_total))
    for arg in args:
        if isinstance(arg, str) and arg in named:
            if is_assigned(arg):
                raise TypeError("%s() got multiple values for keyword "
                                "argument '%s'" % (f_name, arg))
            else:
                assign(arg, named.pop(arg))
    if defaults:    # fill in any missing values with the defaults
        for arg, value in zip(args[-num_defaults:], defaults):
            if not is_assigned(arg):
                assign(arg, value)
    if varkw:
        assign(varkw, named)
    elif named:
        unexpected = next(iter(named))
        if isinstance(unexpected, unicode):
            unexpected = unexpected.encode(sys.getdefaultencoding(), 'replace')
        raise TypeError("%s() got an unexpected keyword argument '%s'" %
                        (f_name, unexpected))
    unassigned = num_args - len([arg for arg in args if is_assigned(arg)])
    if unassigned:
        num_required = num_args - num_defaults
        raise TypeError('%s() takes %s %d %s (%d given)' % (
            f_name, 'at least' if defaults else 'exactly', num_required,
            'arguments' if num_required > 1 else 'argument', num_total))
    return arg2value

main()

Output:

1 1000 2
{'a': 10, 'b': 1223, 'f': 'Hello World'}

There you go... I'm not responsible for any small children that get eaten by demons or something the like (or if it breaks on complicated functions).

PS: The inspect module is the pure EVIL.
Since you are trying to manipulate variables in one function, and do some job based on those variables in another function, the cleanest way to do it is to have those variables be an object's attributes. It could be a dictionary, one defined inside the decorator, so that access to it inside the decorated function would be as a "nonlocal" variable. That cleans up the default-parameter dictionary approach that @bgporter proposed:

def eggs(self, a, b, c=None):
    # nonlocal parms  ## uncomment in Python 3
    parms["a"] = a
    ...

To be even cleaner, you probably should have all these parameters as attributes of the instance (self), so that no "magical" variable has to be used inside the decorated function.
As for doing it "magically" without having the parameters set as attributes of a certain object explicitly, nor having the decorated function return the parameters themselves (which is also an option), that is, to have it work transparently with any decorated function, I can't think of a way that does not involve manipulating the bytecode of the function itself.
If you can think of a way to make the wrapped function raise an exception at return time, you could trap the exception and check the execution trace.
If it is so important to do it automatically that you consider altering the function bytecode an option, feel free to ask me further.
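For what it's worth, a minimal sketch of that decorator-owned-dict idea (all names here are made up for illustration): the preamble fills a dict supplied by the decorator, and a finisher function consumes it to do the real work.

def finishes_with(finisher):
    # hypothetical decorator: the wrapped preamble fills `parms`,
    # then `finisher` is called with the collected values
    def decorator(preamble):
        def wrapper(*args, **kwargs):
            parms = {}
            preamble(parms, *args, **kwargs)
            return finisher(**parms)
        return wrapper
    return decorator

def run_eggs(start, stop, step):
    return range(start, stop, step)

@finishes_with(run_eggs)
def eggs(parms, start, stop, step=1):
    parms['start'] = max(0, start)
    parms['stop'] = min(100, stop)
    parms['step'] = step

print(list(eggs(-10, 200)))  # the range(0, 100, 1) from the question's example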