Groupby in DataFrame with class objects - python

Let's say I have the following two objects, which I use in a DataFrame:
online_price = OnlinePrice(article_id=1,
                           date=datetime.date(2020, 11, 5),
                           min_sale_price=62.99,
                           max_sale_price=83.94,
                           avg_sale_price=83.74,
                           median_sale_price=100.00)
online_price_group = OnlinePriceGroup(title='Test OnlinePriceGroup',
                                      user_id=1,
                                      selection_group_id=1,
                                      periodic_task_id=1,
                                      min_price_multiplier=30,
                                      calculator_method=enums.OnlineGroupCalculatorMethod.USE_LOWEST_OF_RECOMMENDATION_AND_ONLINE_PRICE,
                                      status=enums.OnlineGroupStatus.ACTIVE,
                                      allow_higher_than_current_prices=enums.OnlineGroupUseHigherPriceThanCurrent.ALLOW)
I want to group by 'status', for example to separate ACTIVE from INACTIVE.
Doing so gives me this error: '<' not supported between instances of 'OnlinePriceGroup' and 'OnlinePriceGroup'
Thanks in advance.

This error occurs because you are using custom classes without defining how two instances of the class compare with each other. How do you determine whether group 1 is larger than group 2? That logic needs to be added first.
A minimal example of this can be seen below:
class GroupClass:
    def __init__(self, val):
        self.val = val
    def __gt__(self, other):
        return self.val > other.val
    def __ge__(self, other):
        return self.val >= other.val
    def __eq__(self, other):
        return self.val == other.val
With the above, you would be able to run
print(GroupClass(2) > GroupClass(3))
print(GroupClass(2) >= GroupClass(3))
print(GroupClass(2) == GroupClass(3))
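Since the reported error is about '<', it may help to see a variant that defines __lt__ directly. The sketch below is only an illustration under a few assumptions: it uses functools.total_ordering to derive the remaining comparisons from __eq__ and __lt__, and it adds __hash__, because a class that defines __eq__ without __hash__ becomes unhashable, and objects used as grouping keys generally need to be both sortable and hashable. The sample values are made up.

```python
from functools import total_ordering


@total_ordering  # fills in __le__, __gt__ and __ge__ from __eq__ and __lt__
class GroupClass:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        if not isinstance(other, GroupClass):
            return NotImplemented
        return self.val == other.val

    def __lt__(self, other):
        if not isinstance(other, GroupClass):
            return NotImplemented
        return self.val < other.val

    def __hash__(self):
        # Keep hashing consistent with __eq__: equal objects hash equally
        return hash(self.val)

    def __repr__(self):
        return f"GroupClass({self.val})"


# Made-up sample data, only for illustration
items = [GroupClass(3), GroupClass(1), GroupClass(3), GroupClass(2)]
ordered = sorted(items)        # works because __lt__ is defined
unique = sorted(set(items))    # works because __hash__ and __eq__ agree
print(ordered)
print(unique)
```

If only one attribute matters for the grouping (such as status in the question), grouping on that attribute's value instead of on the whole object may be the simpler route.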

Related

Python multiple dataframe methods in single line

I'm new to Python. Can anyone explain how multiple methods can be chained together in a single line of code, and how to write a class with that capability?
Here is the code snippet I found, but I don't know exactly how or why it works.
df = df.div(100).add(1.01).cumprod()
Thanks
Each intermediate method (div and add) returns a pd.DataFrame, so the next method can be called directly on its result. That is why you can put them on a single line. You can also implement this in a custom class if you want:
class Calculator:
    def __init__(self, value):
        self.value = value
    def add(self, other):
        # Return a new instance of Calculator with the method result
        return Calculator(self.value + other)
    def subtract(self, other):
        return Calculator(self.value - other)
    def result(self):
        # Since this does not return a Calculator instance, you can't
        # do something like Calculator(4).add(2).result().add(5)
        return self.value
Then you can call it like this:
print(Calculator(5).add(10).subtract(3).result())
# Outputs 12
The above class has the advantage of being immutable (provided self.value is treated as private).
However, you can also take a mutable approach:
class Calculator:
    def __init__(self, value):
        self.value = value
    def add(self, other):
        self.value += other
        # Return the same instance of Calculator you're acting on
        return self
    def subtract(self, other):
        self.value -= other
        return self
    def result(self):
        return self.value
In the above, no new instance of the same class is created. Instead, all the calculations are done inside the calculator. You can call it the same way:
print(Calculator(5).add(10).subtract(3).result())
# Outputs 12
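The practical difference between the two approaches shows up when a reference to the starting object is kept around. The sketch below is a trimmed-down illustration; the class names ImmutableCalculator and MutableCalculator are hypothetical and only there so both variants can sit side by side.

```python
class ImmutableCalculator:
    """Hypothetical name for the first variant: each call returns a new object."""

    def __init__(self, value):
        self.value = value

    def add(self, other):
        return ImmutableCalculator(self.value + other)


class MutableCalculator:
    """Hypothetical name for the second variant: each call returns self."""

    def __init__(self, value):
        self.value = value

    def add(self, other):
        self.value += other
        return self


base = ImmutableCalculator(5)
derived = base.add(10)
print(base.value, derived.value)   # the original object is left untouched

shared = MutableCalculator(5)
same = shared.add(10)
print(shared.value, same is shared)  # the original object was changed in place
```

With the immutable variant, intermediate results can be reused safely. With the mutable one, every name bound to the object sees the updated value.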

TypeError: '<' not supported between instances of 'node' and 'node' [duplicate]

I am using Python 3.8. I have the list below:
[['Bangalore', 116.0], ['Mumbai', 132.0], ['Kolkata', 234.0]]
Then I created a node from each entry and added the nodes to a successors list like this:
successors = [<__main__.Node object at 0x7f89582eb7c0>, <__main__.Node object at 0x7f89582eb790>, <__main__.Node object at 0x7f89582eb7f0>]
I have created a fringe list and add each successor node to it. After that, I sort it based on the distance value, and I get the error '<' not supported between instances of 'node' and 'node':
for succ_node in successors:
    fringe.append(succ_node)
fringe.sort()  # <- error here
This is my node class:
class Node:
    def __init__(self, parent, name, g):
        self.parent = parent
        self.name = name
        self.g = g
What am I doing wrong?
In Python 3, you must define comparison methods for a class before its instances can be sorted, for example:
class Node:
    def __init__(self, parent, name, g):
        self.parent = parent
        self.name = name
        self.g = g
    # Compare (g, name) tuples: order by distance first, and use the
    # name only to break ties. Combining two separate comparisons with
    # "and" would not give a consistent ordering.
    def __eq__(self, other):
        return (self.g, self.name) == (other.g, other.name)
    def __ne__(self, other):
        return not (self == other)
    def __lt__(self, other):
        return (self.g, self.name) < (other.g, other.name)
    def __gt__(self, other):
        return (self.g, self.name) > (other.g, other.name)
    def __le__(self, other):
        return (self < other) or (self == other)
    def __ge__(self, other):
        return (self > other) or (self == other)
Not all of these methods are always needed; it mainly depends on what kind of comparison you will use. For instance, list.sort() and sorted() only rely on __lt__.
A more detailed description is available here: https://portingguide.readthedocs.io/en/latest/comparisons.html
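If the nodes only ever need to be ordered by distance, sorting with a key function may be enough, and the class then needs no comparison methods at all. The sketch below assumes the Node class from the question and builds the nodes from the sample list; passing None as the parent is an assumption made purely for illustration.

```python
from operator import attrgetter


class Node:
    def __init__(self, parent, name, g):
        self.parent = parent
        self.name = name
        self.g = g


rows = [['Bangalore', 116.0], ['Mumbai', 132.0], ['Kolkata', 234.0]]

# Build the fringe in reverse order so the sort has something to do.
# parent=None is an assumption; the question does not show how nodes are built.
fringe = [Node(None, name, g) for name, g in reversed(rows)]

# Sort by the distance attribute without defining __lt__ on Node
fringe.sort(key=attrgetter('g'))

names = [node.name for node in fringe]
print(names)
```

attrgetter('g') is equivalent to lambda node: node.g here; either form avoids the TypeError because only the g values are compared.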

How to write a custom comparator with custom sort in Python3 to use it in sorted() function

I'm currently stuck on writing a comparator. The basic idea was to write a function that takes two parameters (two lists), but I want to apply it to a list of these lists by passing it to the sorted() function. How should I do it?
Comparator:
def dispersion_sort(frec, srec):
    if isinstance(frec, intervals.Interval) and isinstance(srec, intervals.Interval):
        if frec[DOUBLE_RES_COL] < srec[DOUBLE_RES_COL]:
            return frec
        if frec[DOUBLE_RES_COL] > srec[DOUBLE_RES_COL]:
            return srec
        if frec[DOUBLE_RES_COL].overlaps(srec[DOUBLE_RES_COL]):
            if (frec[DOUBLE_TIME_COL] < srec[DOUBLE_TIME_COL]):
                return frec
            else:
                return srec
    return frec
Sample frec data:
['1', 'Mikhail Nitenko', '#login', '✅', [-0.000509228437634554,0.0007110924383354339], datetime.datetime(2020, 1, 2, 14, 46, 46)]
How I wanted to call it:
results = sorted(results, key=dispersion_sort)
Thanks a lot!
You can use functools.cmp_to_key for this:
from functools import cmp_to_key
results = sorted(results, key=cmp_to_key(dispersion_sort))
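It will transform the old-style comparator function (which takes two arguments) into a new-style key function (which takes one argument). Note that the comparator must return a negative number, zero, or a positive number, so dispersion_sort would need to return such a number rather than frec or srec.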
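As a rough sketch of what a cmp_to_key-compatible comparator looks like, the example below compares simplified records by a score first and by a timestamp second. The record layout, the field names, and the compare_records function are all hypothetical stand-ins and not the interval logic from the question.

```python
from functools import cmp_to_key


def compare_records(first, second):
    """Return a negative number, zero, or a positive number."""
    # Hypothetical fields: "score" decides the order, "time" breaks ties
    if first["score"] != second["score"]:
        return -1 if first["score"] < second["score"] else 1
    if first["time"] != second["time"]:
        return -1 if first["time"] < second["time"] else 1
    return 0


# Made-up sample data, only for illustration
records = [
    {"name": "c", "score": 2.0, "time": 5},
    {"name": "a", "score": 1.0, "time": 9},
    {"name": "b", "score": 2.0, "time": 3},
]

ordered = sorted(records, key=cmp_to_key(compare_records))
names = [record["name"] for record in ordered]
print(names)
```

The important part is the return value: the sign of the number tells sorted() which record comes first, and 0 means the two records are considered equal.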
If you wanted to create a comparator explicitly, you'd implement a custom class that has these magic methods:
class comparator:
    def __init__(self, obj, *args):
        self.obj = obj
    def __lt__(self, other):
        return mycmp(self.obj, other.obj) < 0
    def __gt__(self, other):
        return mycmp(self.obj, other.obj) > 0
    def __eq__(self, other):
        return mycmp(self.obj, other.obj) == 0
    def __le__(self, other):
        return mycmp(self.obj, other.obj) <= 0
    def __ge__(self, other):
        return mycmp(self.obj, other.obj) >= 0
    def __ne__(self, other):
        return mycmp(self.obj, other.obj) != 0
Here, mycmp is a comparator function like the one you showed. You can also put your logic directly in the class itself. Note that these methods must return True or False, which differs from what your current function returns, so adjust it accordingly if you want to use it with this class template.
Once the class is ready, you can pass it in directly: key=comparator

Single function comparison in custom Python classes

I often need to make a class I wrote comparable for various reasons (sorting, set usage, ...), and I don't want to write multiple comparison functions each time. How can I set up a class so that I only have to write a single function for every new class?
My solution is to create an abstract class that other classes can inherit from, overriding the main comparison function (diff()) with the desired comparison logic.
class Comparable:
    '''
    An abstract class that can be inherited to make a class comparable and sortable.
    For proper functionality, function diff must be overridden.
    '''
    def diff(self, other):
        """
        Calculates the difference in value between two objects and returns a number.
        If the returned number is
        - positive, the value of self is greater than that of other.
        - 0, the objects are equivalent in value.
        - negative, the value of self is less than that of other.
        Used in comparison operations.
        Override this function."""
        return 0
    def __eq__(self, other):
        return self.diff(other) == 0
    def __ne__(self, other):
        return self.diff(other) != 0
    def __lt__(self, other):
        return self.diff(other) < 0
    def __le__(self, other):
        return self.diff(other) <= 0
    def __gt__(self, other):
        return self.diff(other) > 0
    def __ge__(self, other):
        return self.diff(other) >= 0
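A possible way to use this base class is sketched below. The Money subclass and its cents attribute are hypothetical and only serve as an example. One caveat worth noting: because Comparable defines __eq__ without __hash__, its subclasses are unhashable by default, so a subclass that should work in sets or as a dict key needs its own __hash__.

```python
class Comparable:
    """Condensed copy of the base class above, repeated so the sketch runs alone."""

    def diff(self, other):
        return 0

    def __eq__(self, other):
        return self.diff(other) == 0

    def __ne__(self, other):
        return self.diff(other) != 0

    def __lt__(self, other):
        return self.diff(other) < 0

    def __le__(self, other):
        return self.diff(other) <= 0

    def __gt__(self, other):
        return self.diff(other) > 0

    def __ge__(self, other):
        return self.diff(other) >= 0


class Money(Comparable):
    """Hypothetical subclass: only diff() and __hash__() are written by hand."""

    def __init__(self, cents):
        self.cents = cents

    def diff(self, other):
        return self.cents - other.cents

    def __hash__(self):
        # Needed for set usage; must agree with equality as defined by diff()
        return hash(self.cents)


amounts = sorted([Money(300), Money(100), Money(200)])
sorted_cents = [m.cents for m in amounts]
distinct = {Money(100), Money(100), Money(250)}
print(sorted_cents, len(distinct))
```

The standard library's functools.total_ordering covers similar ground: it derives the missing ordering methods from __eq__ plus one of __lt__, __le__, __gt__, or __ge__.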

sorting list of objects python

I wrote a class that stores a list of objects which I have also defined.
I would like to be able to call obj_list.sort(), and have the results sorted in ascending order, but it isn't working out exactly how I want it.
If I get the obj data and call sort() three times this is the behavior with my current implementation:
class MyClass():
    def __init__(self):
        self.obj_list = self.set_obj_list()
    def set_obj_list(self):
        data = []
        for x in range(20):
            obj = MyObjClass(x)
            data.append(obj)
        data.sort()
        return data
class MyObjClass():
    def __init__(self, number):
        self.number = number # number is an integer
    def __lt__(self, other):
        return cmp(self.number, other.number)
    def __repr__(self):
        return str(self.number)
a = MyClass()
print a.obj_list
a.obj_list.sort()
print a.obj_list
a.obj_list.sort()
print a.obj_list
a.obj_list.sort()
print a.obj_list
Thank you.
I want it sorted in ascending order, but for sort() to do nothing if already sorted.
__lt__ should return a true value if and only if self is less than other, but cmp(a, b) returns a true value (1 or -1) whenever a does not equal b. You need to change this:
    def __lt__(self, other):
        return cmp(self.number, other.number)
to this:
    def __lt__(self, other):
        return cmp(self.number, other.number) < 0
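The code above is Python 2. Since cmp() no longer exists in Python 3, a Python 3 version of the same fix would compare the numbers directly. The sketch below is a reduced illustration with a shorter, made-up list, showing that repeated calls to sort() then leave an already sorted list unchanged.

```python
class MyObjClass:
    def __init__(self, number):
        self.number = number  # number is an integer

    def __lt__(self, other):
        # True only when self is strictly smaller than other
        return self.number < other.number

    def __repr__(self):
        return str(self.number)


# Made-up sample values, only for illustration
data = [MyObjClass(x) for x in (3, 1, 2)]

data.sort()
first = repr(data)

data.sort()  # sorting again must not change the order
second = repr(data)

print(first)
print(second)
```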
