The Python docs (Python2 and Python3) state that identifiers must not start with a digit. From my understanding this is solely a compiler constraint (see also this question). So is there anything wrong about starting dynamically created identifiers with a digit? For example:
type('3Tuple', (object,), {})
setattr(some_object, '123', 123)
Edit
Admittedly, the second example (using setattr) might be less relevant, as one could introspect the object via dir and discover the attribute '123', but could not retrieve it via some_object.123.
So I'll elaborate a bit more on the first example (which appears more relevant to me).
The user should be provided with fixed-length tuples, and because the tuple length is arbitrary and not known in advance, a proxy function for retrieving such tuples is used (it could also be a class implementing __call__ or __getattr__):
def NTuple(number_of_elements):
    # Add methods here.
    return type('{0}Tuple'.format(number_of_elements),
                (object,),
                {'number_of_elements': number_of_elements})
The typical use case involves referencing instances of those dynamically created classes, not the classes themselves, for example:
limits = NTuple(3)(1, 2, 3)
But still the class name provides some useful information (as opposed to just using 'Tuple'):
>>> limits.__class__.__name__
'3Tuple'
Also those class names will not be relevant for any code at compile time, hence it doesn't introduce any obstacles for the programmer/user.
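A minimal sketch of the behaviour described above, assuming Python 3 (for str.isidentifier). The names ThreeTuple, Box and box are made up for illustration. It suggests that such names are accepted at run time but stay reachable only through getattr and similar tools, never through dotted syntax:

```python
# type() accepts any string as the class name; it is not checked
# against the identifier grammar that the compiler enforces.
ThreeTuple = type('3Tuple', (object,), {})
print(ThreeTuple.__name__)            # the name is stored as given
print('3Tuple'.isidentifier())        # not a valid identifier in source code


class Box(object):
    """Hypothetical plain object used only for this demonstration."""


box = Box()
setattr(box, '123', 123)              # stored in the instance __dict__
value = getattr(box, '123')           # the only way to read it back
print(value, '123' in vars(box))
# box.123 would be a SyntaxError, so dotted access is impossible.
```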
After I learned about different data types, I learned that once an object of a given type is created, it has innate methods that can do 'things'.
Playing around, I noticed that, while some methods return a value, others make changes to the original data stored.
Is there any specific term for these two types of methods and is there any intuition or logic as to which methods return a value and which make changes?
For example:
abc = "something"
defg = [12, 34, 11, 45, 132, 1]
abc.capitalize()  # this returns a value
defg.sort()       # this changes the original list
Is there any specific term for these two types of methods
A method that changes an object's state (e.g. list.sort()) is usually called a "mutator" (it "mutates" the object). There's no general name for methods that return values: they could be "getters" (methods that take no arguments and return part of the object's state), alternative constructors (methods that are called on the class itself and provide an alternative way to construct an instance of the class), or just methods that take some arguments, do some computation based on both the arguments and the object's state, and return a result. A method can also do all of these at once (do some computation AND change the object's state AND return a value).
is there any intuition or logic as to which methods return a value and which make changes?
Some Python objects are immutable (strings, numbers, tuples, etc.), so when you're working with one of those types you know you won't find any mutators. Apart from this special case, no: you will have to check the docs. The only naming convention here is that methods whose names start with "set_" and take one argument will change the object's state based on that argument (and most often return nothing), and that methods whose names start with "get_" and take no arguments will return information about the object's state and change nothing (you'll often see the former called "setters" and the latter "getters"). But like any convention, it's only followed by those who follow it; in other words, don't assume that because a method name starts with "get_" or "set_" it will indeed behave as expected.
Strings are immutable, so all libraries that do string manipulation will return a new string.
For the other types, you will have to refer to the library documentation.
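To make the distinction concrete, here is a small sketch using only built-in types. The variable names are illustrative. It contrasts a method that returns a new value, a mutator that changes the object in place, and a built-in function that leaves its input alone:

```python
word = "something"
numbers = [12, 34, 11, 45, 132, 1]

# str is immutable, so capitalize() can only return a new string.
capitalized = word.capitalize()

# list.sort() is a mutator: it reorders the list in place and returns None.
sort_result = numbers.sort()

# sorted() is the non-mutating counterpart: it builds and returns a new list.
unsorted = [3, 1, 2]
sorted_copy = sorted(unsorted)

print(word, capitalized)
print(sort_result, numbers)
print(unsorted, sorted_copy)
```

A common hint in the standard library is that in-place mutators return None, which is why chaining something like numbers.sort().reverse() fails.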
Suppose I have a module PyFoo.py that has a function bar. I want bar to print all of the local variables associated with the namespace that called it.
For example:
#! /usr/bin/env python
import PyFoo as pf
var1 = 'hi'
print locals()
pf.bar()
The two last lines would give the same output. So far I've tried defining bar as such:
def bar(x=locals):
    print x()

def bar(x=locals()):
    print x
But neither works. The first ends up being what's local to bar's namespace (which I guess is because that's when it's evaluated), and the second is as if I passed in globals (which I assume is because it's evaluated during import).
Is there a way I can have the default value of argument x of bar be all variables in the namespace which called bar?
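The timing behind the second attempt can be seen in a small sketch (Python 3 syntax; record and evaluations are hypothetical helpers, not part of PyFoo). A default value is evaluated once, when the def statement runs, and is then reused for every call:

```python
evaluations = []


def record():
    """Hypothetical helper that notes each time it is evaluated."""
    evaluations.append("evaluated")
    return len(evaluations)


# record() runs exactly once, right here, while the function is being defined.
def bar(x=record()):
    return x


first = bar()
second = bar()
print(first, second, len(evaluations))
```

This is why a default of locals() captures the namespace where bar is defined (the module's globals at import time) rather than the caller's.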
EDIT 2018-07-29:
As has been pointed out, what was given was an XY Problem; as such, I'll give the specifics.
The module I'm putting together will allow the user to create various objects that represent different aspects of a numerical problem (e.g. various topology definitions, boundary conditions, constitutive models, etc.) and define how any given object interacts with any other object(s). The idea is for the user to import the module, define the various model entities that they need, and then call a function which will take all objects passed to it, make the adjustments needed to ensure compatibility between them, and then write out a text file that represents the entire numerical problem.
The module has a function generate that accepts each of the various types of aspects of the numerical problem. The default value for all arguments is an empty list. If a non-empty list is passed, then generate will use those instances for generating the completed numerical problem. If an argument is an empty list, then I'd like it to take in all instances in the namespace that called generate (which I will then parse out the appropriate instances for the argument).
EDIT 2018-07-29:
Sorry for any lack of understanding on my part (I'm not that strong of a programmer), but I think I might understand what you're saying with respect to an instance being declared or registered.
From my limited understanding, could this be done by creating some sort of registry dataset (like a list or dict) in the module, created when the module is imported, that all module classes take in by default? During class initialization, self could be appended to said dataset, and then the generate function would take the registry as a default value for one of its arguments?
There's no way you can do what you want directly.
locals just returns the local variables in whatever namespace it's called in. As you've seen, you have access to the namespace the function is defined in at the time of definition, and you have access to the namespace of the function itself from within the function, but you don't have access to any other namespaces.
You can do what you want indirectly… but it's almost certainly a bad idea. At least this smells like an XY problem, and whatever it is you're actually trying to do, there's probably a better way to do it.
But occasionally it is necessary, so in case you have one of those cases:
The main good reason to want to know the locals of your caller is for some kind of debugging or other introspection function. And the way to do introspection is almost always through the inspect library.
In this case, what you want to inspect is the interpreter call stack. The calling function will be the first frame on the call stack behind your function's own frame.
You can get the raw stack frame:
inspect.currentframe().f_back
… or you can get a FrameInfo representing it:
inspect.stack()[1]
As explained at the top of the inspect docs, a frame object's local namespace is available as:
frame.f_locals
Note that this has all the same caveats that apply to getting your own locals with locals: what you get isn't the live namespace, but a mapping that, even if it is mutable, can't be used to modify the namespace (or, worse in 2.x, one that may or may not modify the namespace, unpredictably), and that has all cell and free variables flattened into their values rather than their cell references.
Also, see the big warning in the docs about not keeping frame objects alive unnecessarily (or calling their clear method if you need to keep a snapshot but not all of the references, but I think that only exists in 3.x).
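Putting those pieces together, a minimal sketch might look like the following (Python 3 syntax; caller_locals and demo are made-up names). It copies the caller's f_locals into a plain dict and drops the frame reference promptly, in line with the warning in the inspect docs:

```python
import inspect


def caller_locals():
    """Return a snapshot of the local variables of whoever called us."""
    # currentframe() can return None on implementations without frame support.
    frame = inspect.currentframe()
    try:
        # f_back is the caller's frame; dict() takes a copy of its locals.
        return dict(frame.f_back.f_locals)
    finally:
        # Avoid keeping the frame (and everything it references) alive.
        del frame


def demo():
    var1 = 'hi'
    return caller_locals()


print(demo())
```

The result is only a snapshot: changing the returned dict does not change the caller's variables.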
Update:
As of CPython 3.6, dictionaries have a version (thank you pylang for showing this to me).
If they added the same version to list and made it public, all 3 asserts from my original post would pass! It would definitely meet my needs. Their implementation differs from what I envisioned, but I like it.
As it is, I don't feel I can use dictionary version:
It isn't public. Jake VanderPlas shows how to expose it in a post, but he cautions that it is "definitely not code you should use for any purpose beyond simply having fun". I agree with his reasons.
In all of my use cases, the data is conceptually arrays of elements each of which has the same structure. A list of tuples is a natural fit. Using a dictionary would make the code less natural and probably more cumbersome.
Does anyone know if there are plans to add version to list?
Are there plans to make it public?
If there are plans to add version to list and make it public, I would feel awkward putting forward an incompatible VersionedList now. I would just implement the bare minimum I need and get by.
Original post below
Turns out that many of the times I wanted an immutable list, a VersionedList would have worked almost as well (sometimes even better).
Has anyone implemented a versioned list?
Is there a better, more Pythonic, concept that meets my needs? (See motivation below.)
What I mean by a versioned list is:
A class that behaves like a list
Any change to an instance or elements in the instance results in instance.version() being updated. So, if alist is a normal list:
a = VersionedList(alist)
a_version = a.version()
change(a)
assert a_version != a.version()
reverse_last_change(a)
If a list was hashable, hash() would achieve the above and meet all the needs identified in the motivation below. We need to define 'version()' in a way that doesn't have all of the same problems as 'hash()'.
If identical data in two lists is highly unlikely to ever happen except at initialization, we aren't going to have a reason to test for deep equality. From https://docs.python.org/3.5/reference/datamodel.html#object.__hash__: "The only required property is that objects which compare equal have the same hash value." If we don't impose this requirement on 'version()', it seems likely that 'version()' won't have all of the same problems that make lists unhashable. So, unlike hash, identical contents don't mean the same version:
#contents of 'a' are now identical to original, but...
assert a_version != a.version()
b = VersionedList(alist)
c = VersionedList(alist)
assert b.version() != c.version()
For VersionedList, it would be good if any attempt to modify the result of __getitem__ automatically resulted in a copy instead of modifying the underlying implementation data. I think that the only other option would be to have __getitem__ always return a copy of the elements, and this would be very inefficient for all of the use cases I can think of. I think we need to restrict the elements to immutable objects (deeply immutable, for example: exclude tuples with list elements). I can think of 3 ways to achieve this:
Only allow elements that can't contain mutable elements (int, str, etc are fine, but exclude tuples). (This is far too limiting for my cases)
Add code to __init__, __setitem__, etc to traverse inputs to deeply check for mutable sub-elements. (expensive, any way to avoid this?)
Also allow more complex elements, but require that they are deeply immutable. Perhaps require that they expose a deeply_immutable attribute. (This turns out to be easy for all the use cases I have)
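To make the requirements concrete, here is one possible minimal sketch, not an existing library class. It assumes the elements are deeply immutable (it does not check this), and it draws versions from a shared counter so that no two states of any list ever share a version, even when their contents are identical:

```python
import itertools
from collections.abc import MutableSequence

# Shared counter: every new list and every change gets a fresh number.
_version_counter = itertools.count(1)


class VersionedList(MutableSequence):
    """List-like container whose version changes on every modification."""

    def __init__(self, iterable=()):
        self._items = list(iterable)
        self._version = next(_version_counter)

    def version(self):
        return self._version

    def _bump(self):
        self._version = next(_version_counter)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value
        self._bump()

    def __delitem__(self, index):
        del self._items[index]
        self._bump()

    def insert(self, index, value):
        self._items.insert(index, value)
        self._bump()


alist = [(1, 2), (3, 4)]
a = VersionedList(alist)
a_version = a.version()
a.append((5, 6))   # append, pop, extend, etc. come from MutableSequence
print(a_version != a.version())
```

Because MutableSequence builds append, extend, pop, remove and reverse on top of the five methods defined here, every one of them bumps the version without extra code.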
Motivation:
If I am analyzing a dataset, I often have to perform multiple steps that return large datasets (note: since the dataset is ordered, it is best represented by a List not a set).
If at the end of several steps (ex: 5) it turns out that I need to perform different analysis (ex: back at step 4), I want to know that the dataset from step 3 hasn't accidentally been changed. That way I can start at step 4 instead of repeating steps 1-3.
I have functions (control-points, first-derivative, second-derivative, offset, outline, etc) that depend on and return array-valued objects (in the linear algebra sense). The base 'array' is knots.
control-points() depends on: knots, algorithm_enum
first-derivative() depends on: control-points(), knots
offset() depends on: first-derivative(), control-points(), knots, offset_distance
outline() depends on: offset(), end_type_enum
If offset_distance changes, I want to avoid having to recalculate first-derivative() and control-points(). To avoid recalculation, I need to know that nothing has accidentally changed the resultant 'arrays'.
If 'knots' changes, I need to recalculate everything and not depend on the previous resultant 'arrays'.
To achieve this, knots and all of the 'array-valued' objects could be VersionedList.
FYI: I had hoped to take advantage of an efficient class like numpy.ndarray. In most of my use cases, the elements logically have structure. Having to mentally keep track of multi-dimensions of indexes meant implementing and debugging the algorithms was many times more difficult with ndarray. An implementation based on lists of namedtuples of namedtuples turned out to be much more sustainable.
Private dicts in 3.6
In Python 3.6, dictionaries gained a private version tag (PEP 509) and a compact representation (issue 27350), which track versions and preserve insertion order respectively. These features hold for the CPython 3.6 implementation. Although the version tag is not exposed to Python code, Jake VanderPlas gives a detailed demonstration in his blog post of how to expose it from CPython within normal Python. We can use his approach to:
determine when a dictionary has been updated
preserve the order
Example
import numpy as np
d = {"a": np.array([1,2,3]),
     "c": np.array([1,2,3]),
     "b": np.array([8,9,10]),
    }
for i in range(3):
    print(d.get_version())  # monkey-patch
# 524938
# 524938
# 524938
Notice the version number does not change until the dictionary is updated, as shown below:
d.update({"c": np.array([10, 11, 12])})
d.get_version()
# 534448
In addition, the insertion order is preserved (the following was tested in restarted sessions of Python 3.5 and 3.6):
list(d.keys())
# ['a', 'c', 'b']
You may be able to take advantage of this new dictionary behavior, saving you from implementing a new datatype.
Details
For those interested, get_version() above is a monkey-patched method available on any dictionary, implemented in Python 3.6 using the following code, modified from Jake VanderPlas' blog post. This code was run prior to calling get_version().
import types
import ctypes
import sys
assert (3, 6) <= sys.version_info < (3, 7)  # valid only in Python 3.6

py_ssize_t = ctypes.c_ssize_t

# Emulate the PyObjectStruct from CPython
class PyObjectStruct(ctypes.Structure):
    _fields_ = [('ob_refcnt', py_ssize_t),
                ('ob_type', ctypes.c_void_p)]

# Create a DictStruct class to wrap existing dictionaries
class DictStruct(PyObjectStruct):
    _fields_ = [("ma_used", py_ssize_t),
                ("ma_version_tag", ctypes.c_uint64),
                ("ma_keys", ctypes.c_void_p),
                ("ma_values", ctypes.c_void_p),
                ]

    def __repr__(self):
        return (f"DictStruct(size={self.ma_used}, "
                f"refcount={self.ob_refcnt}, "
                f"version={self.ma_version_tag})")

    @classmethod
    def wrap(cls, obj):
        assert isinstance(obj, dict)
        return cls.from_address(id(obj))

assert object.__basicsize__ == ctypes.sizeof(PyObjectStruct)
assert dict.__basicsize__ == ctypes.sizeof(DictStruct)

# Code for monkey-patching existing dictionaries
class MappingProxyStruct(PyObjectStruct):
    _fields_ = [("mapping", ctypes.POINTER(DictStruct))]

    @classmethod
    def wrap(cls, D):
        assert isinstance(D, types.MappingProxyType)
        return cls.from_address(id(D))

assert types.MappingProxyType.__basicsize__ == ctypes.sizeof(MappingProxyStruct)

def mappingproxy_setitem(obj, key, val):
    """Set an item in a read-only mapping proxy"""
    proxy = MappingProxyStruct.wrap(obj)
    ctypes.pythonapi.PyDict_SetItem(proxy.mapping,
                                    ctypes.py_object(key),
                                    ctypes.py_object(val))

mappingproxy_setitem(dict.__dict__,
                     'get_version',
                     lambda self: DictStruct.wrap(self).ma_version_tag)
For my application (modelling hardware registers with named bit fields), I'd like to support syntax like this to access the fields:
device.REG0.F0 = 1 # access some defined subset of bits
print device.REG0.F0
but also allow access to the whole register as an integer:
device.REG0 = 123 # access all bits
print device.REG0
To support this, __getattr__() on the outer object needs to determine whether it is part of an access to some innermost field (in which case return the register object for further __get/setattr__() processing), or simply an access to a whole register (in which case return the integer value).
I have a half-assed proof-of-concept working by looking at the source text of caller's context via the inspect module, but it's easily broken. Is there some more reliable way to get maybe AST or other syntactic information about the 'current spot' in the code?
Or, are there alternative approaches which give the desired syntax:
some way for an object to implicitly yield an integer in appropriate contexts?
some way to simulate properties of a property?
some other magic?
Note: I'm aware that I can achieve the required functionality with different syntax. It's this specific syntax which is important to me.
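For what it's worth, one direction that seems to keep this syntax without inspecting the caller's source is to always hand back a register object that behaves like an integer where it matters. The sketch below is an assumption-laden illustration in Python 3 syntax, not a tested hardware model: Register, Device and the (lsb, width) field layout are all made up.

```python
class Register(object):
    """Hypothetical register: an integer value plus named bit fields."""

    def __init__(self, fields, value=0):
        # fields maps a field name to (least significant bit, width).
        self.__dict__['_fields'] = dict(fields)
        self.__dict__['value'] = value

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for field names.
        fields = self.__dict__.get('_fields', {})
        if name not in fields:
            raise AttributeError(name)
        lsb, width = fields[name]
        return (self.value >> lsb) & ((1 << width) - 1)

    def __setattr__(self, name, new):
        fields = self.__dict__['_fields']
        if name in fields:
            lsb, width = fields[name]
            mask = ((1 << width) - 1) << lsb
            new = (self.value & ~mask) | ((new << lsb) & mask)
            self.__dict__['value'] = new
        else:
            object.__setattr__(self, name, new)

    def __int__(self):
        return self.value

    __index__ = __int__   # lets the register be used where an int is needed

    def __eq__(self, other):
        return int(self) == int(other)

    def __repr__(self):
        return str(self.value)   # print(device.REG0) shows the integer


class Device(object):
    def __init__(self, **registers):
        self.__dict__['_registers'] = dict(registers)

    def __getattr__(self, name):
        registers = self.__dict__.get('_registers', {})
        if name not in registers:
            raise AttributeError(name)
        return registers[name]

    def __setattr__(self, name, value):
        registers = self.__dict__.get('_registers', {})
        if name in registers:
            registers[name].value = int(value)   # whole-register write
        else:
            object.__setattr__(self, name, value)


device = Device(REG0=Register({'F0': (0, 4), 'F1': (4, 4)}))
device.REG0 = 0x23      # access all bits
device.REG0.F0 = 1      # access some defined subset of bits
print(device.REG0, device.REG0.F0, device.REG0.F1)
```

The trade-off is that device.REG0 is still a Register, not a real int, so isinstance(device.REG0, int) is False and arithmetic would need further special methods.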
I'm coding a poker hand evaluator as my first programming project. I've made it through three classes, each of which accomplishes its narrowly-defined task very well:
HandRange = a string-like object (e.g. "AA"). getHands() returns a list of tuples for each specific hand within the string:
[(Ad,Ac),(Ad,Ah),(Ad,As),(Ac,Ah),(Ac,As),(Ah,As)]
Translation = a dictionary that maps the return list from getHands to values that are useful for a given evaluator (yes, this can probably be refactored into another class).
{'As':52, 'Ad':51, ...}
Evaluator = takes a list from HandRange (as translated by Translation), enumerates all possible hand matchups and provides win % for each.
My question: what should my "domain" class for using all these classes look like, given that I may want to connect to it via either a shell UI or a GUI? Right now, it looks like an assembly line process:
user_input = HandRange()
x = Translation.translateList(user_input)
y = Evaluator.getEquities(x)
This smells funny in that it feels like it's procedural when I ought to be using OO.
In a more general way: if I've spent so much time ensuring that my classes are well defined, narrowly focused, orthogonal, whatever ... how do I actually manage work flow in my program when I need to use all of them in a row?
Thanks,
Mike
Don't make a fetish of object orientation -- Python supports multiple paradigms, after all! Think of your user-defined types, AKA classes, as building blocks that gradually give you a "language" that's closer to your domain rather than to general purpose language / library primitives.
At some point you'll want to code "verbs" (actions) that use your building blocks to perform something (under command from whatever interface you'll supply -- command line, RPC, web, GUI, ...) -- and those may be module-level functions as well as methods within some encompassing class. You'll surely want a class if you need multiple instances, and most likely also if the actions involve updating "state" (instance variables of a class being much nicer than globals) or if inheritance and/or polymorphism come into play; but, there is no a priori reason to prefer classes to functions otherwise.
If you find yourself writing static methods, yearning for a singleton (or Borg) design pattern, writing a class with no state (just methods) -- these are all "code smells" that should prompt you to check whether you really need a class for that subset of your code, or rather whether you may be overcomplicating things and should use a module with functions for that part of your code. (Sometimes after due consideration you'll unearth some different reason for preferring a class, and that's all right too, but the point is, don't just pick a class over a module w/functions "by reflex", without critically thinking about it!).
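As an illustration of such a "verb", here is a sketch of a plain module-level function that chains the three building blocks. The function name compute_equities and the three stand-in functions are hypothetical; they only mimic the shapes described in the question (a list of hand tuples, a card-to-number mapping, a result per hand) so that the example runs on its own:

```python
def compute_equities(range_text, get_hands, translate, get_equities):
    """Module-level 'verb': run the whole pipeline for one range string."""
    hands = get_hands(range_text)
    translated = translate(hands)
    return get_equities(translated)


# Stand-ins for HandRange.getHands, Translation.translateList and
# Evaluator.getEquities; the real classes would be passed in instead.
def fake_get_hands(range_text):
    return [("As", "Ad")]


def fake_translate(hands):
    table = {"As": 52, "Ad": 51}
    return [tuple(table[card] for card in hand) for hand in hands]


def fake_get_equities(hands):
    # Placeholder result only; no real equity calculation happens here.
    return {hand: 0.0 for hand in hands}


result = compute_equities("AA", fake_get_hands, fake_translate,
                          fake_get_equities)
print(result)
```

Either a shell UI or a GUI can call the same function, which keeps the interface code free of pipeline details without needing a stateless wrapper class.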
You could create a Poker class that ties these all together and initialize all of that stuff in the __init__() method:
class Poker(object):
    def __init__(self, user_input=HandRange()):
        self.user_input = user_input
        self.translation = Translation.translateList(user_input)
        # Feed the translated list into the evaluator.
        self.evaluator = Evaluator.getEquities(self.translation)
        # and so on...

p = Poker()
# etc, etc...