Python mutable string class not working correctly

I created a mutable String class in Python, based on the built-in str class.
I can change the first character, but when I call capitalize(), it uses the old value instead:
class String(str):
    def __init__(self, string):
        self.string = list(string)
    def __repr__(self):
        return "".join(self.string)
    def __str__(self):
        return "".join(self.string)
    def __setitem__(self, index, value):
        self.string[index] = value
    def __getitem__(self, index):
        if type(index) == slice:
            return "".join(self.string[index])
        return self.string[index]
    def __delitem__(self, index):
        del self.string[index]
    def __add__(self, other_string):
        return String("".join(self.string) + other_string)
    def __len__(self):
        return len(self.string)
text = String("cello world")
text[0] = "h"
print(text)
print(text.capitalize())
Expected output:
hello world
Hello world
Actual output:
hello world
Cello world

Your implementation inherits from str, so it brings along all the methods that str implements. However, the implementation of the str.capitalize() method is not designed to take that into account. Methods like str.capitalize() return a new str object with the required change applied.
Moreover, the Python built-in types do not store their state in a __dict__ mapping of attributes, but use internal struct data structures only accessible at the C level; your self.string attribute is not where the (C equivalent of) str.__new__() stores the string data. The str.capitalize() method bases its return value on the value stored in that internal data structure when the instance was created, which can't be altered from Python code.
You'll have to shadow all the str methods that return a new value, including str.capitalize(), to make them behave differently. If you want those methods to change the value in place instead of returning a new instance, you have to do so yourself:
class String(str):
    # ...
    def capitalize(self):
        """Capitalize the string, in place."""
        self.string[:] = ''.join(self.string).capitalize()
        return self  # or return None, like other mutable types would do
That can be a lot of work, writing methods like these for every possible str method that returns an updated value. Instead, you could use a __getattribute__ hook to redirect methods:
_MUTATORS = {'capitalize', 'lower', 'upper', 'replace'}  # add as needed

class String(str):
    # ...
    def __getattribute__(self, name):
        if name in _MUTATORS:
            def mutator(*args, **kwargs):
                orig = getattr(''.join(self.string), name)
                self.string[:] = orig(*args, **kwargs)
                return self  # or return None for Python type consistency
            mutator.__name__ = name
            return mutator
        return super().__getattribute__(name)
Demo with the __getattribute__ method above added to your class:
>>> text = String("cello world")
>>> text[0] = "h"
>>> print(text)
hello world
>>> print(text.capitalize())
Hello world
>>> print(text)
Hello world
One side note: the __repr__ method should use repr() to return a proper representation, not just the value:
def __repr__(self):
    return repr(''.join(self.string))
Also, take into account that most Python APIs that are coded in C and take a str value as input are likely to use the C API for Unicode strings, so they not only ignore your custom implementations entirely but, like the original str.capitalize() method, also ignore the self.string attribute. They too interact with the internal str data.
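For example, a small illustration of that last point, reusing the question's example data:
text = String("cello world")
text[0] = "h"
print(text)                   # hello world -- our __str__ reads self.string
print("-".join([text, "x"]))  # cello world-x -- str.join() reads the internal str data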

This approach has some drawbacks compared to the answers already suggested: there is more overhead because you don't get to track the characters as a plain list, and isinstance(s, str) won't work, for example.
Another way to accomplish this is to subclass collections.UserString. It's a wrapper around the built-in string type that stores it as a member named data. So you could do something like
from collections import UserString
class String(UserString):
    def __init__(self, string):
        super().__init__(string)
    def __setitem__(self, index, value):
        data_list = list(self.data)
        data_list[index] = value
        self.data = "".join(data_list)
    # etc.
And then you will get capitalize and the other string methods for free.
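A quick check along the lines of the question's example (assuming only the __setitem__ shown above has been added):
text = String("cello world")
text[0] = "h"
print(text)               # hello world
print(text.capitalize())  # Hello world
print(text.upper())       # HELLO WORLD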

You inherited str's definition of capitalize, which ignores your class's behaviors and just uses the underlying data of the "real" str.
Inheriting from a built-in type like this effectively requires you to reimplement every method, or do some metaprogramming with __getattribute__; otherwise, the underlying type's behaviors will be inherited unmodified.

Related

Python Expert: how to inherit built-in class and override every member function w.r.t. the base-class member function?

It is known that in Python, due to optimization concerns, we cannot add/modify member functions of a built-in class, e.g., adding a sed function to the built-in str class to perform re.sub(). Thus, the only way to achieve this is to inherit from the class (subclassing), i.e.:
class String(str):
    def __init__(self, value='', **kwargs):
        super().__init__()
    def sed(self, src, tgt):
        return String(re.sub(src, tgt, self))
The problem with this is that after subclassing, member functions return base-class instances instead of inherited-class instances. For example, I would like to chain String edits: String(' A b C d E [!] ').sed(...).lower().sed(...).strip().sed('\[.*\]', '').split() and so on. However, functions such as .lower() and .strip() return a str instead of a String, so I cannot call .sed(...) afterwards. And I do not want to keep casting to String after every function call.
So I did a manual override of every base-class method as follows:
class String(str):
    for func in dir(str):
        if not func.startswith('_'):
            exec(f'{func} = lambda *args, **kwargs: [(String(i) if type(i) == str else i) for i in [str.{func}(*args, **kwargs)]][0]')
    def __init__(self, value='', **kwargs):
        super().__init__()
    def sed(self, src, tgt):
        return String(re.sub(src, tgt, self))
However, not every member function returns a simple str object: functions such as .split() return a list of str, and other functions like .isalpha() or .find() return a boolean or an integer. In general, I want to add more string-morphing functions and do not want to manually override member functions of each return type just so they return inherited-class objects rather than base-class objects. So is there a more elegant way of doing this? Thanks!
Python's built-in classes are not designed to support that style of inheritance
easily. Also, the whole idea seems flawed to my eye. Even if you do figure out
a way to solve the problem as you've framed it, what's the advantage over good
old functions?
# Special String objects with new methods.
s = String('foo bar')
result = s.sed('...', '...')
# Regular str instances passed to ordinary functions.
s = 'foo bar'
result = sed(s, '...', '...')
That said, here's one way to try. I have not tested it
extensively, it might have a flaw, and I would never use it in real code.
The basic idea is to capture objects returned during low-level
attribute access, and if the object is callable return
a wrapped version of it that will perform the needed
data conversions.
import re
from functools import wraps

class String(str):
    def __getattribute__(self, attr):
        obj = object.__getattribute__(self, attr)
        return wrapped(obj) if callable(obj) else obj
    def __init__(self, value='', **kwargs):
        super().__init__()
    def sed(self, src, tgt):
        return re.sub(src, tgt, self)

def wrapped(func):
    @wraps(func)
    def wrapper(*xs, **kws):
        obj = func(*xs, **kws)
        return convert(obj)
    return wrapper

def convert(obj):
    if isinstance(obj, str):
        return String(obj)
    elif isinstance(obj, list):
        return [convert(x) for x in obj]
    elif isinstance(obj, tuple):
        return tuple(convert(x) for x in obj)
    else:
        return obj
Demo:
s = String('foo bar')
got = s.sed('foo', 'bzz').upper().split()
print(got)
print(type(got))
print(type(got[0]))
Output:
['BZZ', 'BAR']
<class 'list'>
<class '__main__.String'>

More efficient way of setting default method argument to instance attribute [duplicate]

I want to pass a default argument to an instance method using the value of an attribute of the instance:
class C:
    def __init__(self, format):
        self.format = format
    def process(self, formatting=self.format):
        print(formatting)
When trying that, I get the following error message:
NameError: name 'self' is not defined
I want the method to behave like this:
C("abc").process() # prints "abc"
C("abc").process("xyz") # prints "xyz"
What is the problem here, why does this not work? And how could I make this work?
You can't really define this as the default value, since the default value is evaluated when the method is defined which is before any instances exist. The usual pattern is to do something like this instead:
class C:
    def __init__(self, format):
        self.format = format
    def process(self, formatting=None):
        if formatting is None:
            formatting = self.format
        print(formatting)
self.format will only be used if formatting is None.
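With that change, the calls from the question behave as intended:
C("abc").process()       # prints "abc"
C("abc").process("xyz")  # prints "xyz"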
To demonstrate the point of how default values work, see this example:
def mk_default():
    print("mk_default has been called!")

def myfun(foo=mk_default()):
    print("myfun has been called.")

print("about to test functions")
myfun("testing")
myfun("testing again")
And the output here:
mk_default has been called!
about to test functions
myfun has been called.
myfun has been called.
Notice how mk_default was called only once, and that happened before the function was ever called!
In Python, the name self is not special. It's just a convention for the parameter name, which is why there is a self parameter in __init__. (Actually, __init__ is not very special either, and in particular it does not actually create the object... that's a longer story)
C("abc").process() creates a C instance, looks up the process method in the C class, and calls that method with the C instance as the first parameter. So it will end up in the self parameter if you provided it.
Even if you had that parameter, though, you would not be allowed to write something like def process(self, formatting=self.format), because self is not in scope yet at the point where you set the default value. In Python, the default value for a parameter is calculated when the def statement is executed, and "stuck" to the function. (This is the same reason why, if you use a default like [], that list will remember changes between calls to the function.)
How could I make this work?
The traditional way is to use None as a default, and check for that value and replace it inside the function. You may find it is a little safer to make a special value for the purpose (an object instance is all you need, as long as you hide it so that the calling code does not use the same instance) instead of None. Either way, you should check for this value with is, not ==.
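A minimal sketch of that sentinel idea (the _MISSING name here is just for illustration, not part of the question's code):
_MISSING = object()  # private sentinel; callers never see or pass this

class C:
    def __init__(self, format):
        self.format = format
    def process(self, formatting=_MISSING):
        if formatting is _MISSING:  # compare with "is", not "=="
            formatting = self.format
        print(formatting)

C("abc").process()      # abc
C("abc").process(None)  # None is now a legitimate value to pass explicitly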
Since you want to use self.format as a default argument, the method needs to be instance-specific (i.e. there is no way to define this default at class level). Instead you can define the specific method during the class's __init__, for example. This is where you have access to instance-specific attributes.
One approach is to use functools.partial in order to obtain an updated (specific) version of the method:
from functools import partial

class C:
    def __init__(self, format):
        self.format = format
        self.process = partial(self.process, formatting=self.format)
    def process(self, formatting):
        print(formatting)

c = C('default')
c.process()
# c.process('custom')  # Doesn't work!
c.process(formatting='custom')
Note that with this approach you can only pass the corresponding argument by keyword, since if you provided it by position, this would create a conflict in partial.
Another approach is to define and set the method in __init__:
from types import MethodType

class C:
    def __init__(self, format):
        self.format = format
        def process(self, formatting=self.format):
            print(formatting)
        self.process = MethodType(process, self)

c = C('test')
c.process()
c.process('custom')
c.process(formatting='custom')
This also allows passing the argument by position; however, the method resolution order becomes less apparent (which can affect IDE inspection, for example, but I suppose there are IDE-specific workarounds for that).
Another approach would be to create a custom type for this kind of "instance attribute default", together with a special decorator that performs the corresponding getattr argument filling:
import inspect

class Attribute:
    def __init__(self, name):
        self.name = name

def decorator(method):
    signature = inspect.signature(method)
    def wrapper(self, *args, **kwargs):
        bound = signature.bind(*((self,) + args), **kwargs)
        bound.apply_defaults()
        bound.arguments.update({k: getattr(self, v.name) for k, v in bound.arguments.items()
                                if isinstance(v, Attribute)})
        return method(*bound.args, **bound.kwargs)
    return wrapper

class C:
    def __init__(self, format):
        self.format = format
    @decorator
    def process(self, formatting=Attribute('format')):
        print(formatting)

c = C('test')
c.process()
c.process('custom')
c.process(formatting='custom')
You can't access self in the method definition. My workaround is this -
class Test:
    def __init__(self):
        self.default_v = 20
    def test(self, v=None):
        v = v or self.default_v
        print(v)
Test().test()
> 20
Test().test(10)
> 10
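One caveat worth noting with the v or self.default_v shortcut: any falsy argument (0, "", an empty list, False) is silently replaced by the default, so the explicit is None check from the earlier answer is safer when such values are legitimate inputs:
Test().test(0)   # prints 20, not 0, because 0 is falsy
Test().test("")  # prints 20 as well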
"self" need to be pass as the first argument to any class functions if you want them to behave as non-static methods.
it refers to the object itself. You could not pass "self" as default argument as it's position is fix as first argument.
In your case instead of "formatting=self.format" use "formatting=None" and then assign value from code as below:
[EDIT]
class c:
    def __init__(self, cformat):
        self.cformat = cformat
    def process(self, formatting=None):
        print("Formatting---", formatting)
        if formatting is None:
            formatting = self.cformat
            print(formatting)
            return formatting
        else:
            print(formatting)
            return formatting
c("abc").process() # prints "abc"
c("abc").process("xyz") # prints "xyz"
Note: do not use "format" as a variable name, because it shadows the built-in format() function in Python.
Instead of creating a chain of if-thens spanning your default arguments, one can make use of a 'defaults' dictionary and create new instances of a class by using eval():
class foo():
    def __init__(self, arg=None):  # default added so foo() below works without arguments
        self.arg = arg

class bar():
    def __init__(self, *args, **kwargs):
        # default values are given in a dictionary
        defaults = {'foo1': 'foo()', 'foo2': 'foo()'}
        for key in defaults.keys():
            # if the key is passed through kwargs, use that value
            if key in kwargs:
                setattr(self, key, kwargs[key])
            # if the key is not passed through kwargs,
            # create a new instance of the default value
            else:
                setattr(self, key, eval(defaults[key]))
I throw this at the beginning of every class that instantiates another class as a default argument. It avoids Python evaluating the default when the function is defined. I would love a cleaner, more Pythonic approach, but this works.

Classes Python and developing Stack

I have created my own LIFO container class Stack that supports push, pop, len() and an isEmpty check. All methods appear to work in my example calls, except that when I print a created instance of this class (in my example, s) I get the memory location of the object, when I want to see its actual contents.
class Stack:
    x = []
    def __init__(self, x=None):
        if x == None:
            self.x = []
        else:
            self.x = x
    def isEmpty(self):
        return len(self.x) == 0
    def push(self, p):
        self.x.append(p)
    def pop(self):
        return self.x.pop()
    def __len__(self):
        return len(self.x)
s = Stack()
s.push('plate 1')
s.push('plate 2')
s.push('plate 3')
print(s)
print(s.isEmpty())
print(len(s))
print(s.pop())
print(s.pop())
print(s.pop())
print(s.isEmpty())
Running the line print(s) gives me <__main__.Stack object at 0x00000000032CD748>, when I would expect and am looking for ['plate 1', 'plate 2', 'plate 3'].
You need to also override __str__ or __repr__ if you want your class to have a different representation when printing. Something like:
def __str__(self):
    return str(self.x)
should do the trick. __str__ is what is called by the str function (and implicitly called by print). The default __str__ simply returns the result of __repr__ which defaults to that funny string with the type and the memory address.
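With that method added to the Stack class above, the question's own example now shows the contents:
s = Stack()
s.push('plate 1')
s.push('plate 2')
s.push('plate 3')
print(s)  # ['plate 1', 'plate 2', 'plate 3']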
You need to override the default implementation of __repr__. Otherwise it will use the default implementation, which simply returns the type and memory address of the instance.
def __repr__(self):
    return str(self.x)
Yes, override __str__ and/or __repr__.
Remember that __repr__ should, where possible, return a string that could be eval'ed to recreate an equal object.
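A minimal sketch of such an eval-friendly __repr__ for this Stack class (assuming the constructor accepts a list, as it does above):
def __repr__(self):
    return "Stack({!r})".format(self.x)

# repr(s) then gives "Stack(['plate 1', 'plate 2', 'plate 3'])",
# and eval(repr(s)) rebuilds an equivalent Stack.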

Python: renaming method names on-the-fly

I have many files using classes with the following syntax:
o = module.CreateObject()
a = o.get_Field
and now the implementation has changed from 'get_XXX' and 'set_XXX' to just 'XXX':
o = module.CreateObject()
a = o.Field
This implementation is an external package, which I don't want to change. Is it possible to write a wrapper which will intercept all calls to 'get_XXX' on the fly and replace them with calls to the new name 'XXX'?
o = MyRenamer(module.CreateObject())
a = o.get_Field # works as before, o.Field is called
a = o.DoIt() # works as before, o.DoIt is called
It needs to intercept all calls, not just those to a finite set of fields, decide based on the method name whether to modify it, and cause a method with the modified name to be called.
If you want to continue to use get_Field and set_Field on an object that has switched to using properties (where you simply access or assign to Field), it's possible to use a wrapper object:
class NoPropertyAdaptor(object):
    def __init__(self, obj):
        self.obj = obj
    def __getattr__(self, name):
        if name.startswith("get_"):
            return lambda: getattr(self.obj, name[4:])
        elif name.startswith("set_"):
            return lambda value: setattr(self.obj, name[4:], value)
        else:
            return getattr(self.obj, name)
This will have problems if you are using extra syntax, like indexing or iteration on the object, or if you need to recognize the type of the object using isinstance.
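For illustration, with a made-up Example class standing in for the real object (not part of the question):
class Example(object):
    Field = "hello"

o = NoPropertyAdaptor(Example())
print(o.get_Field())  # hello -- forwarded to Example().Field
o.set_Field("world")
print(o.Field)        # world -- plain attribute access passes through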
A more sophisticated solution would be to create a subclass that does the name rewriting and force the object to use it. This isn't exactly wrapping, since outside code still deals with the object directly (so magic methods and isinstance will work as expected). This approach will work for most objects, but it might fail for types that have fancy metaclass magic going on and for some built-in types:
def no_property_adaptor(obj):
    class wrapper(obj.__class__):
        def __getattr__(self, name):
            if name.startswith("get_"):
                return lambda: getattr(self, name[4:])
            elif name.startswith("set_"):
                return lambda value: setattr(self, name[4:], value)
            else:
                return super(wrapper, self).__getattr__(name)
    obj.__class__ = wrapper
    return obj
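Again for illustration only (Example is a hypothetical stand-in class):
class Example(object):
    Field = "hello"

o = no_property_adaptor(Example())
print(isinstance(o, Example))  # True -- the object keeps behaving as its original type
print(o.get_Field())           # hello
o.set_Field("world")
print(o.Field)                 # world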
You can 'monkey patch' any Python class; import the class directly and add a property:
import original_module

@property
def get_Field(self):
    return self.Field

original_module.OriginalClass.get_Field = get_Field
You'd need to enumerate what fields you wanted to access this way:
def addField(fieldname, cls):
    @property
    def get_Field(self):
        return getattr(self, fieldname)
    setattr(cls, 'get_{}'.format(fieldname), get_Field)

for fieldname in ('Foo', 'Bar', 'Baz'):
    addField(fieldname, original_module.OriginalClass)
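A self-contained sketch of the effect, with a hypothetical OriginalClass standing in for the class from the external package:
class OriginalClass(object):
    def __init__(self):
        self.Foo, self.Bar, self.Baz = 1, 2, 3

for fieldname in ('Foo', 'Bar', 'Baz'):
    addField(fieldname, OriginalClass)

o = OriginalClass()
print(o.get_Foo)  # 1 -- the old get_XXX attribute style keeps working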

Mapping obj.method({argument:value}) to obj.argument(value)

I don't know if this will make sense, but...
I'm trying to dynamically assign methods to an object.
#translate this
object.key(value)
#into this
object.method({key:value})
To be more specific in my example, I have an object (which I didn't write), let's call it motor, which has some generic methods: set, status and a few others. Some take a dictionary as an argument and some take a list. To change the motor's speed, and see the result, I use:
motor.set({'move_at':10})
print motor.status('velocity')
The motor object, then formats this request into a JSON-RPC string, and sends it to an IO daemon. The python motor object doesn't care what the arguments are, it just handles JSON formatting and sockets. The strings move_at and velocity are just two of what might be hundreds of valid arguments.
What I'd like to do is the following instead:
motor.move_at(10)
print motor.velocity()
I'd like to do it in a generic way since I have so many different arguments I can pass. What I don't want to do is this:
# create a new function for every possible argument
def move_at(self, x):
    return self.set({'move_at': x})
def velocity(self):
    return self.status('velocity')
# and a hundred more...
I did some searching on this which suggested the solution lies with lambdas and meta programming, two subjects I haven't been able to get my head around.
UPDATE:
Based on the code from user470379 I've come up with the following...
# This is what I have now....
class Motor(object):
    def set(self, a_dict):
        print "Setting a value", a_dict
    def status(self, a_list):
        print "requesting the status of", a_list
        return 10

# Now to extend it....
class MyMotor(Motor):
    def __getattr__(self, name):
        def special_fn(*value):
            # What we return depends on how many arguments there are.
            if len(value) == 0: return self.status((name))
            if len(value) == 1: return self.set({name: value[0]})
        return special_fn
    def __setattr__(self, attr, value):  # This is based on some other answers
        self.set({attr: value})

x = MyMotor()
x.move_at = 20  # Uses __setattr__
x.move_at(10)   # May remove this style from __getattr__ to simplify code.
print x.velocity()
output:
Setting a value {'move_at': 20}
Setting a value {'move_at': 10}
10
Thank you to everyone who helped!
What about creating your own __getattr__ for the class that returns a function created on the fly? IIRC, there are some tricky cases to watch out for between __getattr__ and __getattribute__ that I don't recall off the top of my head; I'm sure someone will post a comment to remind me:
def __getattr__(self, name):
    def set_fn(value):
        # "self" and "name" are captured from the enclosing scope
        return self.set({name: value})
    return set_fn
Then what should happen is that calling an attribute that doesn't exist (ie: move_at) will call the __getattr__ function and create a new function that will be returned (set_fn above). The name variable of that function will be bound to the name parameter passed into __getattr__ ("move_at" in this case). Then that new function will be called with the arguments you passed (10 in this case).
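A minimal runnable sketch of this (the Motor stub here is just a dummy standing in for the real JSON-RPC object):
class Motor(object):
    def set(self, a_dict):
        print("Setting a value %s" % a_dict)

class MyMotor(Motor):
    def __getattr__(self, name):
        def set_fn(value):
            return self.set({name: value})
        return set_fn

m = MyMotor()
m.move_at(10)  # prints: Setting a value {'move_at': 10}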
Edit
A more concise version using lambdas (untested):
def __getattr__(self, name):
    return lambda value: self.set({name: value})
There are a lot of different potential answers to this, but many of them will probably involve subclassing the object and/or writing or overriding the __getattr__ function.
Essentially, the __getattr__ function is called whenever python can't find an attribute in the usual way.
Assuming you can subclass your object, here's a simple example of what you might do (it's a bit clumsy but it's a start):
class foo(object):
    def __init__(self):
        print "initting " + repr(self)
        self.a = 5
    def meth(self):
        print self.a

class newfoo(foo):
    def __init__(self):
        super(newfoo, self).__init__()
        def meth2():  # Or, use a lambda: ...
            print "meth2: " + str(self.a)  # but you don't have to
        self.methdict = {"meth2": meth2}
    def __getattr__(self, name):
        return self.methdict[name]
f = foo()
g = newfoo()
f.meth()
g.meth()
g.meth2()
Output:
initting <__main__.foo object at 0xb7701e4c>
initting <__main__.newfoo object at 0xb7701e8c>
5
5
meth2: 5
You seem to have certain "properties" of your object that can be set by
obj.set({"name": value})
and queried by
obj.status("name")
A common way to go in Python is to map this behaviour to what looks like simple attribute access. So we write
obj.name = value
to set the property, and we simply use
obj.name
to query it. This can easily be implemented using the __getattr__() and __setattr__() special methods:
class MyMotor(Motor):
    def __init__(self, *args, **kw):
        # bypass our own __setattr__ while the flag does not exist yet
        object.__setattr__(self, "_init_flag", True)
        Motor.__init__(self, *args, **kw)
        self._init_flag = False
    def __getattr__(self, name):
        return self.status(name)
    def __setattr__(self, name, value):
        if self._init_flag or hasattr(self, name):
            return Motor.__setattr__(self, name, value)
        return self.set({name: value})
Note that this code disallows the dynamic creation of new "real" attributes of Motor instances after the initialisation. If this is needed, corresponding exceptions could be added to the __setattr__() implementation.
Instead of setting with function-call syntax, consider using assignment (with =). Similarly, just use attribute syntax to get a value, instead of function-call syntax. Then you can use __getattr__ and __setattr__:
class OtherType(object):  # this is the one you didn't write
    # dummy implementations for the example:
    def set(self, D):
        print "setting", D
    def status(self, key):
        return "<value of %s>" % key

class Blah(object):
    def __init__(self, parent):
        object.__setattr__(self, "_parent", parent)
    def __getattr__(self, attr):
        return self._parent.status(attr)
    def __setattr__(self, attr, value):
        self._parent.set({attr: value})
obj = Blah(OtherType())
obj.velocity = 42 # prints setting {'velocity': 42}
print obj.velocity # prints <value of velocity>
