Python multiple dataframe methods in single line

New to Python. Can anyone explain how multiple methods are chained together in a single line of code, and how do we write a class with such a capability?
Here is the code snippet I got, but I don't know exactly how or why it works.
df = df.div(100).add(1.01).cumprod()
Thanks

Each intermediate method (div and add) returns a new pd.DataFrame, which is why the calls can be chained in a single line.
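For illustration, the chained line is equivalent to this step-by-step version (a sketch, assuming df holds numeric data such as percentage returns):
tmp = df.div(100)    # divide every element by 100; returns a new DataFrame
tmp = tmp.add(1.01)  # add 1.01 to every element; returns another new DataFrame
df = tmp.cumprod()   # cumulative product down each column, again a new DataFrame
Because each call returns a fresh DataFrame, the intermediate names can be dropped and the calls chained. You can also implement the same pattern in a custom class if you want: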
class Calculator:
    def __init__(self, value):
        self.value = value
    def add(self, other):
        # Return a new instance of Calculator with the method result
        return Calculator(self.value + other)
    def subtract(self, other):
        return Calculator(self.value - other)
    def result(self):
        # Since this does not return the same instance, you can't
        # do something like Calculator(4).add(2).result().add(5)
        return self.value
Then you can call it like this:
print(Calculator(5).add(10).subtract(3).result())
# Outputs 12
The above class has the advantage of being immutable (provided you treat self.value as private).
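For instance, the immutable version leaves the original object untouched (a quick check, not part of the original snippet):
a = Calculator(5)
b = a.add(10)      # returns a brand-new Calculator
print(a.result())  # 5  -- a keeps its original value
print(b.result())  # 15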
However, you can also do a mutable approach:
class Calculator:
    def __init__(self, value):
        self.value = value
    def add(self, other):
        self.value += other
        # Return the same instance of Calculator you're acting on
        return self
    def subtract(self, other):
        self.value -= other
        return self
    def result(self):
        return self.value
In this version, no new instance is created; instead, all the calculations mutate the same Calculator object. You can call it the same way:
print(Calculator(5).add(10).subtract(3).result())
# Outputs 12
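The difference becomes visible if you keep a reference to the original object (a quick check against the mutable class above):
a = Calculator(5)
b = a.add(10)
print(b is a)      # True -- add() returned the same instance
print(a.result())  # 15  -- a itself was modified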

Related

Pythonic way to improve subclasses of MutableSequence and MutableMapping

Recently I needed to implement some data structures that can tell whether they have been modified since a given point in time. Currently I have a ChangeDetectable class that implements the check, but the implementation of setting the flag is left to the child classes. Here's a minimal example:
ChangeDetectable class:
class ChangeDetectable:
    def __init__(self):
        self._changed = False
    def save_state(self):
        self._changed = False
    def has_changed(self) -> bool:
        return self._changed
List-like class:
from collections.abc import MutableSequence

class MyList(MutableSequence, ChangeDetectable):
    def __init__(self):
        super().__init__()
        self._list = list()
    def __getitem__(self, item):
        return self._list.__getitem__(item)
    def __setitem__(self, key, value):
        self._list.__setitem__(key, value)
        self._changed = True
    def __delitem__(self, key):
        self._list.__delitem__(key)
        self._changed = True
    def __len__(self):
        return self._list.__len__()
    def insert(self, index, item):
        self._list.insert(index, item)
        self._changed = True
Dict-like class:
from collections.abc import MutableMapping

class MyDict(MutableMapping, ChangeDetectable):
    def __init__(self):
        super().__init__()
        self._dict = dict()
    def __getitem__(self, key):
        return self._dict.__getitem__(key)
    def __setitem__(self, key, value):
        self._dict.__setitem__(key, value)
        self._changed = True
    def __delitem__(self, key):
        self._dict.__delitem__(key)
        self._changed = True
    def __iter__(self):
        return self._dict.__iter__()
    def __len__(self):
        return self._dict.__len__()
So my question is: right now the children have to implement the right write operations. For instance, MyList needs an insert method and MyDict does not. Is there a way to implement all the methods I could possibly need in the parent and then inherit only the ones I need in each child?
This could make the code cleaner because it would use super(), but I wouldn't want MyDict to end up with an insert method.
Thank you.
Let me give you a general answer.
Abstract Base Classes
The official documentation for this is quite good: https://docs.python.org/3/library/abc.html.
Also please take a look at this SO question: What are metaclasses in Python?
You create a class like MyABC and define in it all the methods that must be implemented, marking them with @abstractmethod. For example, if MyList and MyDict must both implement values, then you define a method values in MyABC. MyList and MyDict inherit from MyABC and must implement values, as seen in this answer.
If MyDict needs something specific to it as a dictionary, it simply defines and implements that method (say, myKeys) itself.
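A minimal sketch of that layout (MyABC, values and myKeys are the illustrative names used above, not a real library API):
from abc import ABC, abstractmethod

class MyABC(ABC):
    @abstractmethod
    def values(self):
        """Every concrete subclass must provide values()."""

class MyList(MyABC):
    def __init__(self, data=None):
        self._list = list(data or [])
    def values(self):
        return list(self._list)

class MyDict(MyABC):
    def __init__(self, data=None):
        self._dict = dict(data or {})
    def values(self):
        return list(self._dict.values())
    def myKeys(self):
        # dict-specific extra; defined only here, never forced on MyList
        return list(self._dict.keys())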
This is answered quite fully at Why use Abstract Base Classes in Python?.

What is the correct way to use objects as keys in a ZODB OOBTree?

In ZODB (in Python 3.x) I would like to be able to store objects as keys in a BTrees.OOBTree.OOBTree(). Here is an example of the error I get when I try (see the comment):
from BTrees.OOBTree import OOBTree as Btree

class Test:
    pass

bt=Btree()
t=Test()
bt[t]=None #TypeError: Object has default comparison
So, I read somewhere that __eq__ may need to be defined to remove that error, but although that seemed to fix the previous problem it seems to cause more problems. Example:
[EDIT: It should be noted that I've found some problems with inheriting OOBTree (and TreeSet) as I do here. Apparently, they don't save properly; so, it's not the same as inheriting Persistent, even though they inherit Persistent.]
from BTrees.OOBTree import OOBTree as Btree

class Test:
    def __eq__(self, other): #Maybe this isn't the way to define the method
        return self==other

bt=Btree()
t=Test()
bt[t]=None
t in bt #TypeError: unorderable types: Test() < Test()
What is the correct way to use objects as keys in a BTree or OOBTree? I do need to test whether the key exists, too.
For those who don't know, BTrees in ZODB are pretty much like scalable Python dictionaries (they should be workable with more key-value pairs than a regular Python dictionary) designed for persistence.
I think this answer can help with your problem.
Basically, you have to reimplement three methods on your object:
__eq__ (equality check)
__ne__ (inequality check)
__hash__ (to make the object usable as a dictionary key)
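A minimal sketch of those three methods (assuming equality is based on an integer key attribute; note that BTree keys also need ordering via __lt__ and friends, as the fuller answer below shows):
class Test(object):
    def __init__(self, key):
        self.key = key
    def __eq__(self, other):
        return isinstance(other, Test) and self.key == other.key
    def __ne__(self, other):
        return not self.__eq__(other)
    def __hash__(self):
        return hash(self.key)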
Although Eliot Berriot's answer led me to the answers I needed, I figured I would post the full answer that helped me so others don't have to spend extra time figuring stuff out. (I'm going to speak to myself in the second person.)
First of all (I didn't really ask about it, but it's something you might be tempted to do), don't inherit from OOBTree or OOTreeSet, as this causes problems. Instead, make your own classes that inherit from Persistent and hold an OOBTree or an OOTreeSet inside, if you want something like an inherited OOBTree (and define the methods needed to make them behave like a dictionary or a set if you want that).
Next, you need to create a persistent ID system for every object you put in an OOBTree or OOTreeSet, because these structures malfunction if your objects don't carry a unique integer that ZODB can track them by. You need to define the methods Eliot mentioned, as well as some similar ones, and they must compare that integer ID rather than the object itself. That is, any class whose instances will be keys of an OOBTree or members of an OOTreeSet should define __eq__, __ne__, __hash__, __lt__, __le__, __gt__, and __ge__. To keep a persistent ID counter, you will also need a small counter class or something similar (storing a plain integer as a value in an OOBTree did not work for me, unless I did it wrong), and that counter class needs an ID too.
Finally, if you use objects as keys, don't also use things like strings as keys in the same OOBTree, or you'll run into mysterious issues: strings don't share your objects' ID-based comparison, so comparing string keys against object keys raises an error because they're not designed to compare.
Here is a working example of Python 3.x code that allows you to use objects as keys in an OOBTree, and it will allow you to iterate over persistent objects in an OOBTree (and use them as keys). It also shows you how it can save and load the objects.
Sorry it's kind of long, but it should give you a good idea of how this can work:
import transaction, ZODB, ZODB.FileStorage
from persistent import Persistent
from BTrees.OOBTree import OOBTree as OOBTree
from BTrees.OOBTree import OOTreeSet as OOTreeSet

class Btree(Persistent):
    def __init__(self, ID=None, **attr):
        #I like to use entirely uppercase variables to represent ones you aren't supposed to access outside of the class (because it doesn't have the restrictions that adding _ and __ to the beginning do, and because you don't really need all caps for constants in Python)
        Persistent.__init__(self)
        self.DS=OOBTree() #DS stands for data structure
        self.DS.update(attr)
        if ID==None:
            self.ID=-1 #To give each object a unique id. The value, -1, is replaced.
            self.ID_SET=False
        else:
            self.ID=ID #You should remember what you’re putting here, and it should be negative.
            self.ID_SET=True
    def clear(self):
        self.DS.clear()
    def __delitem__(self, key):
        del self.DS[key]
    def __getitem__(self, key):
        return self.DS[key]
    def __len__(self):
        return len(self.DS)
    def __iadd__(self, other):
        self.DS.update(other)
    def __isub__(self, other):
        for x in other:
            try:
                del self.DS[x]
            except KeyError:
                pass
    def __contains__(self, key):
        return self.DS.has_key(key)
    def __setitem__(self, key, value):
        self.DS[key]=value
    def __iter__(self):
        return iter(self.DS)
    def __eq__(self, other):
        return self.id==other.id
    def __ne__(self, other):
        return self.id!=other.id
    def __hash__(self):
        return self.id
    def __lt__(self, other):
        return self.id<other.id
    def __le__(self, other):
        return self.id<=other.id
    def __gt__(self, other):
        return self.id>other.id
    def __ge__(self, other):
        return self.id>=other.id
    @property
    def id(self):
        if self.ID_SET==False:
            print("Warning. self.id_set is False. You are accessing an id that has not been set.")
        return self.ID
    @id.setter
    def id(self, num):
        if self.ID_SET==True:
            raise ValueError("Once set, the id value may not be changed.")
        else:
            self.ID=num
            self.ID_SET=True
    def save(self, manager, commit=True):
        if self.ID_SET==False:
            self.id=manager.inc()
            manager.root.other_set.add(self)
        if commit==True:
            transaction.commit()
class Set(Persistent):
    def __init__(self, ID=None, *items):
        Persistent.__init__(self)
        self.DS=OOTreeSet()
        if ID==None:
            self.ID=-1 #To give each object a unique id. The value, -1, is replaced automatically when saved by the project for the first time (which should be done right after the object is created).
            self.ID_SET=False
        else:
            if ID>=0:
                raise ValueError("Manual values should be negative.")
            self.ID=ID #You should remember what you’re putting here, and it should be negative.
            self.ID_SET=True
        self.update(items)
    def update(self, items):
        self.DS.update(items)
    def add(self, *items):
        self.DS.update(items)
    def remove(self, *items):
        for x in items:
            self.DS.remove(x)
    def has(self, *items):
        for x in items:
            if not self.DS.has_key(x):
                return False
        return True
    def __len__(self):
        return len(self.DS)
    def __iadd__(self, other):
        self.DS.update(other)
    def __isub__(self, other):
        self.remove(*other)
    def __contains__(self, other):
        return self.DS.has_key(other)
    def __iter__(self):
        return iter(self.DS)
    def __eq__(self, other):
        return self.id==other.id
    def __ne__(self, other):
        return self.id!=other.id
    def __hash__(self):
        return self.id
    def __lt__(self, other):
        return self.id<other.id
    def __le__(self, other):
        return self.id<=other.id
    def __gt__(self, other):
        return self.id>other.id
    def __ge__(self, other):
        return self.id>=other.id
    @property
    def id(self):
        if self.ID_SET==False:
            print("Warning. self.id_set is False. You are accessing an id that has not been set.")
        return self.ID
    @id.setter
    def id(self, num):
        if self.ID_SET==True:
            raise ValueError("Once set, the id value may not be changed.")
        else:
            self.ID=num
            self.ID_SET=True
    def save(self, manager, commit=True):
        if self.ID_SET==False:
            self.id=manager.inc()
            manager.root.other_set.add(self)
        if commit==True:
            transaction.commit()
class Counter(Persistent):
    #This is for creating a persistent id count object (using a plain integer outside of a class doesn't seem to work).
    def __init__(self, value=0):
        self.value=value
        self.ID_SET=False
        self.id=value
    #The following methods are so it will fit fine in a BTree (they don't have anything to do with self.value)
    def __eq__(self, other):
        return self.id==other.id
    def __ne__(self, other):
        return self.id!=other.id
    def __hash__(self):
        return self.id
    def __lt__(self, other):
        return self.id<other.id
    def __le__(self, other):
        return self.id<=other.id
    def __gt__(self, other):
        return self.id>other.id
    def __ge__(self, other):
        return self.id>=other.id
    @property
    def id(self):
        if self.ID_SET==False:
            print("Warning. self.id_set is False. You are accessing an id that has not been set.")
        return self.ID
    @id.setter
    def id(self, num):
        if self.ID_SET==True:
            raise ValueError("Once set, the id value may not be changed.")
        else:
            self.ID=num
            self.ID_SET=True
class Manager:
    def __init__(self, filepath):
        self.filepath=filepath
        self.storage = ZODB.FileStorage.FileStorage(filepath)
        self.db = ZODB.DB(self.storage)
        self.conn = self.db.open()
        self.root = self.conn.root
        print("Database opened.\n")
        try:
            self.root.other_dict #This holds arbitrary stuff, like the Counter. String keys.
        except AttributeError:
            self.root.other_dict=OOBTree()
            self.root.other_dict["id_count"]=Counter()
        try:
            self.root.other_set #set other
        except AttributeError:
            self.root.other_set=OOTreeSet() #This holds all our Btree and Set objects (they are put here when saved to help them be persistent).
    def inc(self): #This increments our Counter and returns the new value to become the integer id of a new object.
        self.root.other_dict["id_count"].value+=1
        return self.root.other_dict["id_count"].value
    def close(self):
        self.db.pack()
        self.db.close()
        print("\nDatabase closed.")
class Btree2(Btree):
    #To prove that we can inherit our own classes we created that inherit Persistent (but inheriting OOBTree or OOTreeSet causes issues)
    def __init__(self, ID=None, **attr):
        Btree.__init__(self, ID, **attr)

m=Manager("/path/to/database/test.fs")

try:
    m.root.tree #Causes an AttributeError if this is the first time you ran the program, because it doesn't exist.
    print("OOBTree loaded.")
except AttributeError:
    print("Creating OOBTree.")
    m.root.tree=OOBTree()
    for i in range(5):
        key=Btree2()
        key.save(m, commit=False) #Saving without committing adds it to the manager's OOBTree and gives it an integer ID. This needs to be done right after creating an object (whether or not you commit).
        value=Btree2()
        value.save(m, commit=False)
        m.root.tree[key]=value #Assigning key and value (which are both objects) to the OOBTree
    transaction.commit() #Commit the transactions

try:
    m.root.set
    print("OOTreeSet loaded.")
except AttributeError:
    print("Creating OOTreeSet")
    m.root.set=OOTreeSet()
    for i in range(5):
        item=Set()
        item.save(m, commit=False)
        m.root.set.add(item)
    transaction.commit()
#Doing the same with an OOTreeSet (since objects in them suffered from the same problem as objects as keys in an OOBTree)

for x in m.root.tree:
    print("Key: "+str(x.id))
    print("Value: "+str(m.root.tree[x].id))
    if x in m.root.tree:
        print("Comparison works for "+str(x.id))

print("\nOn to OOTreeSet.\n")

for x in m.root.set:
    if x in m.root.set:
        print("Comparison works for "+str(x.id))

m.close()

How to implement selection sort using forward iterators?

I'm working on a homework assignment where I have to implement selection sort using forward iterators, for both Python lists and (singly) linked lists.
Here is the code I have for the iterators:
from abc import *

class ForwardIterator(metaclass=ABCMeta):
    @abstractmethod
    def getNext(self):
        return
    @abstractmethod
    def getItem(self):
        return
    @abstractmethod
    def getLoc(self):
        return
    @abstractmethod
    def clone(self):
        return
    def __eq__(self, other):
        return self.getLoc() == other.getLoc()
    def __ne__(self, other):
        return not (self == other)
    def __next__(self):
        if self.getLoc() == None:
            raise StopIteration
        else:
            item = self.getItem()
            self.getNext()
            return item

class ForwardAssignableIterator(ForwardIterator):
    @abstractmethod
    def setItem(self, item):
        """Sets the item at the current position."""
        return

class PythonListFAIterator(ForwardAssignableIterator):
    def __init__(self, lst, startIndex):
        self.lst = lst
        self.curIndex = startIndex
    def getNext(self):
        self.curIndex += 1
    def getItem(self):
        if self.curIndex < len(self.lst):
            return self.lst[self.curIndex]
        else:
            return None
    def setItem(self, item):
        if self.curIndex < len(self.lst):
            self.lst[self.curIndex] = item
    def getLoc(self):
        if self.curIndex < len(self.lst):
            return self.curIndex
        else:
            return None
    def clone(self):
        return PythonListFAIterator(self.lst, self.curIndex)
The LinkedListFAIterator is similar to PythonListFAIterator, plus a getStartIterator and an __iter__ method.
I don't know how to write the code to implement selection sort with one parameter, an FAIterator (the forward iterator). Please help me. I know I should find the minimum element and put it at the beginning of the list. I also know that I should use the clone method to create multiple iterators that keep track of multiple locations at once. But I don't know how to write the code.
Please give me some hints.
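One possible sketch of that approach, written against the PythonListFAIterator above (selection_sort is an illustrative name, not part of the assignment's starter code); a linked-list iterator with the same interface would work unchanged:
def selection_sort(start):
    # The outer iterator marks the position currently being filled.
    outer = start.clone()
    while outer.getLoc() is not None:
        # Scan from the current position for the smallest remaining item.
        smallest = outer.clone()
        scan = outer.clone()
        while scan.getLoc() is not None:
            if scan.getItem() < smallest.getItem():
                smallest = scan.clone()
            scan.getNext()
        # Swap the smallest item into the current position.
        tmp = outer.getItem()
        outer.setItem(smallest.getItem())
        smallest.setItem(tmp)
        outer.getNext()

data = [5, 2, 9, 1]
selection_sort(PythonListFAIterator(data, 0))
print(data)  # [1, 2, 5, 9]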

Comparison of hashable objects

I have a tuple of Python objects, from which I need a list of objects with no duplicates, using set() (the duplicate check is to be done on an attribute). This code gives a simple illustration:
class test:
    def __init__(self, t):
        self.t = t
    def __repr__(self):
        return repr(self.t)
    def __hash__(self):
        return self.t

l = (test(1), test(2), test(-1), test(1), test(3), test(2))
print l
print set(l)
However, it did not work. I can do it by iterating over l, but any idea why set() is not working? Here is the official documentation.
From the documentation you linked to:
The set classes are implemented using dictionaries. Accordingly, the requirements for set elements are the same as those for dictionary keys; namely, that the element defines both __eq__() and __hash__().
To be more specific, if a == b then your implementation must be such that hash(a) == hash(b). The reverse is not required.
Also, you should probably call hash in __hash__ to handle long integers
class Test:
    def __init__(self, t):
        self.t = t
    def __repr__(self):
        return repr(self.t)
    def __hash__(self):
        return hash(self.t)
    def __eq__(self, other):
        return isinstance(other, Test) and self.t == other.t
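With __eq__ and __hash__ defined like that, set() collapses the duplicates as intended (a quick check reusing the question's data):
l = (Test(1), Test(2), Test(-1), Test(1), Test(3), Test(2))
print(set(l))  # duplicates removed; prints something like set([1, 2, -1, 3])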
Small nitpicks:
Your implementation of __eq__ doesn't give the other object a chance to run its own __eq__. The class must also treat its members as immutable, since the hash must stay constant. You don't want to break your dicts, do you?
class Test:
    def __init__(self, t):
        self._t = t

    @property
    def t(self):
        return self._t

    def __repr__(self):
        return repr(self._t)

    def __hash__(self):
        return hash(self._t)

    def __eq__(self, other):
        if not isinstance(other, Test):
            return NotImplemented  # don't know how to handle `other`
        return self.t == other.t

Decorating arithmetic operators | should I be using a metaclass?

I'd like to implement an object that keeps its value within a given range after arithmetic operations have been applied to it. The code below works fine, but I'm pointlessly rewriting the methods. Surely there's a more elegant way of doing this. Is a metaclass the way to go?
import random

def check_range(_operator):
    def decorator1(instance, _val):
        value = _operator(instance, _val)
        if value > instance._upperbound:
            value = instance._upperbound
        if value < instance._lowerbound:
            value = instance._lowerbound
        instance.value = value
        return Range(value, instance._lowerbound, instance._upperbound)
    return decorator1
class Range(object):
    '''
    however you add, multiply or divide, it will always stay within boundaries
    '''
    def __init__(self, value, lowerbound, upperbound):
        '''
        @param lowerbound:
        @param upperbound:
        '''
        self._lowerbound = lowerbound
        self._upperbound = upperbound
        self.value = value

    def init(self):
        '''
        set a random value within bounds
        '''
        self.value = random.uniform(self._lowerbound, self._upperbound)

    def __str__(self):
        return self.__repr__()

    def __repr__(self):
        return "<Range: %s>" % (self.value)

    @check_range
    def __mul__(self, other):
        return self.value * other

    @check_range
    def __div__(self, other):
        return self.value / float(other)

    def __truediv__(self, other):
        return self.__div__(other)

    @check_range
    def __add__(self, other):
        return self.value + other

    @check_range
    def __sub__(self, other):
        return self.value - other
It is possible to use a metaclass to apply a decorator to a set of function names, but I don't think that this is the way to go in your case. Applying the decorator in the class body on a function-by-function basis as you've done, with the @decorator syntax, is I think a very good option. (By the way, I think you've got a bug in your decorator: you probably do not want to set instance.value to anything; arithmetic operators usually don't mutate their operands.)
Another approach I might use in your situation, kind of avoiding decorators all together, is to do something like this:
import operator

class Range(object):
    def __init__(self, value, lowerbound, upperbound):
        self._lowerbound = lowerbound
        self._upperbound = upperbound
        self.value = value

    def __repr__(self):
        return "<Range: %s>" % (self.value)

    def _from_value(self, val):
        val = max(min(val, self._upperbound), self._lowerbound)
        # NOTE: it's nice to use type(self) instead of writing the class
        # name explicitly; it then continues to work if you change the
        # class name, or use a subclass
        return type(self)(val, self._lowerbound, self._upperbound)

    def _make_binary_method(fn):
        # this is NOT a method, just a helper function that is used
        # while the class body is being evaluated
        def bin_op(self, other):
            return self._from_value(fn(self.value, other))
        return bin_op

    __mul__ = _make_binary_method(operator.mul)
    __div__ = _make_binary_method(operator.truediv)
    __truediv__ = __div__
    __add__ = _make_binary_method(operator.add)
    __sub__ = _make_binary_method(operator.sub)
rng = Range(7, 0, 10)
print rng + 5
print rng * 50
print rng - 10
print rng / 100
printing
<Range: 10>
<Range: 10>
<Range: 0>
<Range: 0.07>
I suggest that you do NOT use a metaclass in this circumstance, but here is one way you could. Metaclasses are a useful tool, and if you're interested, it's nice to understand how to use them for when you really need them.
def check_range(fn):
    def wrapper(self, other):
        value = fn(self, other)
        value = max(min(value, self._upperbound), self._lowerbound)
        return type(self)(value, self._lowerbound, self._upperbound)
    return wrapper

class ApplyDecoratorsType(type):
    def __init__(cls, name, bases, attrs):
        for decorator, names in attrs.get('_auto_decorate', ()):
            for name in names:
                fn = attrs.get(name, None)
                if fn is not None:
                    setattr(cls, name, decorator(fn))

class Range(object):
    __metaclass__ = ApplyDecoratorsType
    _auto_decorate = (
        (check_range,
         '__mul__ __div__ __truediv__ __add__ __sub__'.split()),
    )

    def __init__(self, value, lowerbound, upperbound):
        self._lowerbound = lowerbound
        self._upperbound = upperbound
        self.value = value

    def __repr__(self):
        return "<Range: %s>" % (self.value)

    def __mul__(self, other):
        return self.value * other

    def __div__(self, other):
        return self.value / float(other)

    def __truediv__(self, other):
        return self / other

    def __add__(self, other):
        return self.value + other

    def __sub__(self, other):
        return self.value - other
As it is wisely said about metaclasses: if you wonder whether you need them, then you don't.
I don't fully understand your problem, but I would create a BoundedValue class and use only instances of that class inside the class you are proposing.
class BoundedValue(object):
    default_lower = 0
    default_upper = 1

    def __init__(self, upper=None, lower=None):
        self.upper = upper or BoundedValue.default_upper
        self.lower = lower or BoundedValue.default_lower

    @property
    def val(self):
        return self._val

    @val.setter
    def val(self, value):
        assert self.lower <= value <= self.upper
        self._val = value

v = BoundedValue()
v.val = 0.5  # Correctly assigns the value 0.5
print v.val  # prints 0.5
v.val = 10   # Throws assertion error
Of course you could (and should) replace the assertion with the actual behavior you are looking for; you can also change the constructor to take the initial value. I chose to make it a post-construction assignment via the property val.
Once you have this object, you can create your classes and use BoundedValue instances, instead of floats or ints.
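For example, a quick sketch of how another class might carry a BoundedValue instead of a raw float (Temperature and its attribute names are made up for illustration):
class Temperature(object):
    def __init__(self, celsius):
        # keep the reading inside fixed bounds instead of using a bare float
        self.reading = BoundedValue(upper=150, lower=-273)
        self.reading.val = celsius

t = Temperature(21.5)
print(t.reading.val)  # 21.5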
