Long story short, I need to write a data analysing tool using mainly OOP principles. I'm not quite a beginner at python but still not the best. I wrote a function which returned true or false value based on what the user inputted (below):
def secondary_selection():
"""This prints the options the user has to manipulate the data"""
print("---------------------------")
print("Column Statistics [C]")
print("Graph Plotting [G]")
d = input(str("Please select how you want the data to be processed:")).lower()
# Returns as a true/false boolean as it's easier
if d == "c":
return True
elif d == "g":
return False
else:
print("Please enter a valid input")
This function works the way I want it to, but I then tried to import it into different file for use with a class (below):
class Newcastle:
def __init__(self, data, key_typed):
self.data = data[0]
self.key_typed = key_typed
def newcastle_selection(self):
# If function returns True
if self:
column_manipulation()
# If function returns False
if not self:
graph_plotting()
The newcastle_selection(self) function takes the secondary_selection() function as a argument, but the only way I got it to work was the if self statement. As writing something like if true lead to both the column_manipulation and the graph_plotting functions being printed.
I am wondering if there's any better way to write this as I am not quite a beginner at python but still relatively new to it.
Disclaimer: This is for a first year course, and I asked this as a last result.
I'm not sure I've understood really well your code structure, but it looks like a little factory could helps you:
def column_manipulation():
print("Column manipulation")
def graph_plotting():
print("Graph plotting")
class Newcastle:
def __init__(self, data, func):
self.data = data[0]
self._func = func
def newcastle_selection(self):
return self._func()
#classmethod
def factorize(cls, data, key_typed):
if key_typed is True:
return cls(data, column_manipulation)
elif key_typed is False:
return cls(data, graph_plotting)
else:
raise TypeError('Newcastle.factorize expects bool, {} given'.format(
type(key_typed).__name__
))
nc = Newcastle.factorize(["foo"], True)
nc.newcastle_selection()
nc = Newcastle.factorize(["foo"], False)
nc.newcastle_selection()
Output
Column manipulation
Graph plotting
The main idea is to define your class in a generic way, so you store a function given as __init__ parameter within self._func and just call it in newcastle_selection.
Then, you create a classmethod that takes your data & key_typed. This method is in charge of choosing which function (column_manipulation or graph_plotting) will be used for the current instance.
So you don't store useless values like key_typed in your class and don't have to handle specific cases everywhere, only in factorize.
Cleaner and powerful I think (and btw it answer your question "Is this pythonic", this is).
This is simple and basic example of secondary_selection method using class. This would help probably.
class castle:
def __init__(self):
self.data = ''
def get_input(self):
print("1: Column Statistics")
print("2: Graph Plotting")
self.data = input("Please select how you want the data to be processed: ")
def process(self):
if self.data == 1:
return self.column_manipulation()
else:
return self.graph_plotting()
def column_manipulation(self):
return True
def graph_plotting(self):
return False
c = castle()
c.get_input()
result = c.process()
print(result)
Related
I have created this class that works as expected, I want only to expose one method, get_enriched_dataso the other are pretty much private w/ the underscore.
The functionality works, just pretty convinced I am not doing the most pythonic/OOP way:
class MergeClients:
def __init__(self,source_df,extra_info_df,type_f):
self.df_all = pd.merge(source_df,extra_info_df, on='clientID', how='left')
self.avg_age = self._get_avg_age()
self.type_f = 'Medium'
def _filter_by_age(self, age):
return self.df_all[self.df_all['Age'] > age]
def _filter_by_family_type(self, f_type):
return self.df_all[self.df_all['familyType'] == f_type]
def _get_avg_age(self):
return self.df_all['Age'].mean()
def get_enriched_data(self):
self.df_all = self._filter_by_age(self.avg_age)
self.df_all=self._filter_by_family_type(self.type_f)
return self.df_all
But I find the code looks so ugly with so many self references, for example in the get_enriched_datamethod there are three self references per line, how can I correct this? Any direction on how to correctly Python classes is welcome.
Edit:
Example of working code:
main_df = pd.DataFrame({'clientID':[1,2,3,4,5],
'Name':['Peter','Margaret','Marc','Alice','Maria']})
extra_info = pd.DataFrame({'clientID':[1,2,3,4,5],'Age':[19,35,18,65,57],'familyType':['Big','Medium','Single','Medium','Medium']})
family_stats = MergeClients(main_df,extra_info,'Medium')
family_filtered = family_stats.get_enriched_data()
There are some odd things about your code. I will point out one thing about instances: every method has access to all attributes, so you don't always need to pass them as parameters:
class MergeClients:
def __init__(self,source_df,extra_info_df,type_f):
self.df_all = pd.merge(source_df,extra_info_df, on='clientID', how='left')
self.avg_age = self._get_avg_age()
self.type_f = 'Medium'
def _filter_by_age(self): #No need for age param
return self.df_all[self.df_all['Age'] > self.avg_age]
def _filter_by_family_type(self): #No need for f_type param
return self.df_all[self.df_all['familyType'] == self.type_f]
def _get_avg_age(self):
return self.df_all['Age'].mean()
def get_enriched_data(self):
self.df_all = self._filter_by_age()
self.df_all = self._filter_by_family_type()
return self.df_all
Since the two methods in question: _filter_by_age() and _filter_by_family_type() are private by convention, this means that clients of your class are not expected to call them. So if only other methods of this class call these methods and only the ones you have shown, then there is no need to pass parameters which are already attributes.
Alternatively there is the argument that for other private methods where sometimes they should use attributes, but at other times they should take a parameter, then I would make those methods take a parameter as you had originally.
Functions declared within a Python Class can be effectively made 'private' by preceding the name with double underscore. For example:
class Clazz():
def __work(self):
print('Working')
def work(self):
self.__work()
c = Clazz()
c.work()
c.__work()
The output of this would be:
Working
Traceback (most recent call last):
File "/Volumes/G-DRIVE Thunderbolt 3/PythonStuff/play.py", line 575, in
c = Clazz()
AttributeError: 'Clazz' object has no attribute '__work'
In other words, the __work function has been 'hidden'
I'm learning about classes in Python, particularly about nested classes.
I'm trying to execute the below code and I get an error: int object is not callable, but
I don't understand why!
All I want is to create an object that identify Man, and he has hands, and the hands have their own size, length, etc...
I want to be able to set the hand size and get its value in the most elegant and easy way as possible and nothing work for me... I tried the below code and I really thought it would work but it didn't and now I know that "I Don't know" what to do for real.
class Man:
def __init__(self, name):
self.name = name
self.hand = self.Hand_Object() # Here we reference an Object called
# "hand" to the subvlass "Hand_Object".
def length(self , length):
self.length = length
def handsize(self, size=None): # This "handsize()" function will call the
# subclass function "length()" out from the
# Hand_Object vlass when it will be issued
# in the program.
if size==None:
return = self.hand.length()
else:
self.hand.length(size) # The "length()" function of the "Hand_Object"
# class requires a variable, so when we call
# that function we need to add a variable to it.
class Hand_Object:
def length(self, length=None):
if length == None:
return self.length
else:
self.length = length
def fingers(self, fingers):
self.fingers = fingers
myman = Man('shlomi')
myman.handsize(6)
print(myman.handsize()) # Here I get the error.
The issue is the line self.length = length in Hand_Object. You're overwriting the function length with an integer. You should call the function and the variable something different.
i was following this tutorial on decision trees and I tried to recreate it on my own as python project, instead of a notebook, using spyder.
I create different py files where I put different methods and classes, specifically I create a file named tree_structure with the following classes: Leaf, DecisionNode and Question. (I hope it's correct to put more classes in a single py file)
When I tried to use isinstance() in method "classify" in another py file, I was expecting True instead it returned False:
>>>leaf0
<tree_structure.Leaf at 0x11b7c3450>
>>>leaf0.__class__
tree_structure.Leaf
>>>isinstance(leaf0,Leaf)
False
>>>isinstance(leaf0,tree_structure.Leaf)
False
leaf0 was created iteratively from "build_tree" method (I just saved it to t0 for debugging.. during execution is not saved as a variable) :
t0 = build_tree(train_data)
leaf0 = t0.false_branch.false_branch
I tried also using type(leaf0) is Leaf instead of isinstance but it still returns False.
Can someone explain me why this happen?
tree_structure.py
class Question:
def __init__(self,header, column, value):
self.column = column
self.value = value
self.header = header
def match(self, example):
# Compare the feature value in an example to the
# feature value in this question.
val = example[self.column]
if is_numeric(val):
return val >= self.value
else:
return val == self.value
def __repr__(self):
# This is just a helper method to print
# the question in a readable format.
condition = "=="
if is_numeric(self.value):
condition = ">="
return "Is %s %s %s?" % (
self.header[self.column], condition, str(self.value))
class Leaf:
def __init__(self, rows):
self.predictions = class_counts(rows)
class DecisionNode:
def __init__(self,
question,
true_branch,
false_branch):
self.question = question
self.true_branch = true_branch
self.false_branch = false_branch
classifier.py
from tree_structure import Question,Leaf,DecisionNode
def classify(row, node):
# Base case: we've reached a leaf
if isinstance(node, Leaf):
print("----")
return node.predictions
# Decide whether to follow the true-branch or the false-branch.
# Compare the feature / value stored in the node,
# to the example we're considering.
if node.question.match(row):
print("yes")
return classify(row, node.true_branch)
else:
print("no")
return classify(row, node.false_branch)
build_tree
def build_tree(rows,header):
"""Builds the tree.
Rules of recursion: 1) Believe that it works. 2) Start by checking
for the base case (no further information gain). 3) Prepare for
giant stack traces.
"""
gain, question = find_best_split(rows,header)
print("--best question is ''{}'' with information gain: {}".format(question,round(gain,2)))
# Base case: no further info gain
# Since we can ask no further questions,
# we'll return a leaf.
if isinstance(rows,pd.DataFrame):
rows = rows.values.tolist()
if gain == 0:
return Leaf(rows)
# If we reach here, we have found a useful feature / value
# to partition on.
true_rows, false_rows = partition(rows, question)
# Recursively build the true branch.
print("\n----TRUE BRANCH----")
true_branch = build_tree(true_rows,header)
# Recursively build the false branch.
print("\n----FALSE BRANCH----")
false_branch = build_tree(false_rows,header)
# Return a Question node.
# This records the best feature / value to ask at this point,
# as well as the branches to follow
# dependingo on the answer.
return DecisionNode(question, true_branch, false_branch)
I'm trying to return variable name, but i keep getting this:
<classes.man.man object at (some numbers (as example:0x03BDCA50))>
Below is my code:
from classes.man import man
def competition(guy1, guy2, counter1=0, counter2=0):
.......................
some *ok* manipulations
.......................
if counter1>counter2:
return guy1
bob = man(172, 'green')
bib = man(190, 'brown')
print(competition(bob , bib ))
Epilogue
If anyone want to, explain please what I can write instead of __class__ in example below to get variable name.
def __repr__(self):
return self.__class__.__name__
Anyway, thank you for all of your support
There are different ways to approach your problem.
The simplest I can fathom is if you can change the class man, make it accept an optional name in its __init__ and store it in the instance. This should look like this:
class man:
def __init__(number, color, name="John Doe"):
self.name = name
# rest of your code here
That way in your function you could just do with:
return guy1.name
Additionnally, if you want to go an extra step, you could define a __str__ method in your class man so that when you pass it to str() or print(), it shows the name instead:
# Inside class man
def __str__(self):
return self.name
That way your function could just do:
return guy1
And when you print the return value of your function it actually prints the name.
If you cannot alter class man, here is an extremely convoluted and costly suggestion, that could probably break depending on context:
import inspect
def competition(guy1, guy2, counter1=0, counter2=0):
guy1_name = ""
guy2_name = ""
for name, value in inspect.stack()[-1].frame.f_locals.items():
if value is guy1:
guy1_name = name
elif value is guy2:
guy2_name = name
if counter1 > counter2:
return guy1_name
elif counter2 > counter2:
return guy1_name
else:
return "Noone"
Valentin's answer - the first part of it at least (adding a name attribute to man) - is of course the proper, obvious solution.
Now wrt/ the second part (the inspect.stack hack), it's brittle at best - the "variables names" we're interested in might not necessarily be defined in the first parent frame, and FWIW they could as well just come from a dict etc...
Also, it's definitly not the competition() function's responsability to care about this (don't mix domain layer with presentation layer, thanks), and it's totally useless since the caller code can easily solve this part by itself:
def competition(guy1, guy2, counter1=0, counter2=0):
.......................
some *ok* manipulations
.......................
if counter1>counter2:
return guy1
def main():
bob = man(172, 'green')
bib = man(190, 'brown')
winner = competition(bob, bib)
if winner is bob:
print("bob wins")
elif winner is bib:
print("bib wins")
else:
print("tie!")
Python prints the location of class objects in memory if they are passed to the print() function as default. If you want a prettier output for a class you need to define the __repr__(self) function for that class which should return a string that is printed if an object is passed to print(). Then you can just return guy1
__repr__ is the method that defines the name in your case.
By default it gives you the object type information. If you want to print more apt name then you should override the __repr__ method
Check below code for instance
class class_with_overrided_repr:
def __repr__(self):
return "class_with_overrided_repr"
class class_without_overrided_repr:
pass
x = class_with_overrided_repr()
print x # class_with_overrided_repr
x = class_without_overrided_repr()
print x # <__main__.class_without_overrided_repr instance at 0x7f06002aa368>
Let me know if this what you want?
I'm currently self-learning Python, and I'm reading 'Problem Solving with Algorithms and Data Structures' (Brad Miller, David Ranum). I've stumbled upon the basic example of inheritance. Although I can see what it does, I need an explanation, how it works actually. The Code is as follows:
class LogicGate:
def __init__(self,n):
self.name = n
self.output = None
def getName(self):
return self.name
def getOutput(self):
self.output = self.performGateLogic()
return self.output
class BinaryGate(LogicGate):
def __init__(self,n):
LogicGate.__init__(self,n)
self.pinA = None
self.pinB = None
def getPinA(self):
if self.pinA == None:
return int(input("Enter Pin A input for gate "+self.getName()+"-->"))
else:
return self.pinA.getFrom().getOutput()
def getPinB(self):
if self.pinB == None:
return int(input("Enter Pin B input for gate "+self.getName()+"-->"))
else:
return self.pinB.getFrom().getOutput()
def setNextPin(self,source):
if self.pinA == None:
self.pinA = source
else:
if self.pinB == None:
self.pinB = source
else:
print("Cannot Connect: NO EMPTY PINS on this gate")
class AndGate(BinaryGate):
def __init__(self,n):
BinaryGate.__init__(self,n)
def performGateLogic(self):
a = self.getPinA()
b = self.getPinB()
if a==1 and b==1:
return 1
else:
return 0
class OrGate(BinaryGate):
def __init__(self,n):
BinaryGate.__init__(self,n)
def performGateLogic(self):
a = self.getPinA()
b = self.getPinB()
if a ==1 or b==1:
return 1
else:
return 0
class UnaryGate(LogicGate):
def __init__(self,n):
LogicGate.__init__(self,n)
self.pin = None
def getPin(self):
if self.pin == None:
return int(input("Enter Pin input for gate "+self.getName()+"-->"))
else:
return self.pin.getFrom().getOutput()
def setNextPin(self,source):
if self.pin == None:
self.pin = source
else:
print("Cannot Connect: NO EMPTY PINS on this gate")
class NotGate(UnaryGate):
def __init__(self,n):
UnaryGate.__init__(self,n)
def performGateLogic(self):
if self.getPin():
return 0
else:
return 1
class Connector:
def __init__(self, fgate, tgate):
self.fromgate = fgate
self.togate = tgate
tgate.setNextPin(self)
def getFrom(self):
return self.fromgate
def getTo(self):
return self.togate
def main():
g1 = AndGate("G1")
g2 = AndGate("G2")
g3 = OrGate("G3")
g4 = NotGate("G4")
c1 = Connector(g1,g3)
c2 = Connector(g2,g3)
c3 = Connector(g3,g4)
print(g4.getOutput())
main()
I'm mostly doubted by the tgate.setNextPin(self) statement in Connector class __init__. Is it a method call? If it is, why is it called with just one parameter, when there are two actually required by the setNexPin function in UnaryGate and BinaryGate classes (self, source)? How does the fromgate variable ends up as the source arrgument? Does this statement 'initiliaze' anything actually, and what?
Next thing which troubles me is, for example, when I print(type(g4)) before declaring g4.getOutput(), I get <class '__main__.OrGate'>, but when the g4.getOutput() starts, and functions start calling each other, to the point of calling getPin function, if I put print (self.pinA) before return self.pinA.getFrom().getOutput(), I get the <__main__.Connector object at 0x2b387e2f74d0>, although self.Pin is a variable from g4 OrGate instance. How can one variable from one class instance can become object of another class, which isn't inheriting it? Does this have something with a work of setNextPin() function?
Can someone explain this to me as I am new to OOP and thorougly confused by this piece of code. Thank you.
Regarding your first question, tgate.setNextPin(self) is a method call. tgate is an object, presumably an instance of one of the gate types. When you access instance.method, Python gives you a "bound method" object, which works pretty much like a function, but which puts the instance in as the first argument when it is actually called. So, tgate.setNextPin(self) is really calling type(tgate).setNextPin(tgate, self)
Your second question seems to reflect a misunderstanding of what attributes are. There's no requirement that an object's attributes be of its own type. In the various LogicGate sub-classes, the pin/pinA/pinB attributes are either going to be None (signaling that the user should be prompted for an input value) or an instance of a Connector (or something else with a getFrom method). Neither of those values is a LogicGate instance.
As for where the Connector instance you saw came from, its going to be one of the c1 through c3 values you created. Connector instances install themselves onto a pin of their tgate argument, with the setNextPin call you were asking about in your first question. I can't really speak to the g4 gate you are looking at, as it seems to be different than the g4 variable created in your example code's main() function (it is a different type), but I suspect that it is working as designed, and it's just a bit confusing. Try accessing the pinX attributes via g4.pinA rather than inspecting them inside of the methods and you may have a bit less confusion.
Here's a bit of code with outputs that should help you understand things a bit better:
# create a few gates
g1 = AndGate("G1")
g2 = OrGate("G2")
# at this point, no connectors have been hooked up, so all pinX attrs are None:
print("G1 pins:", g1.pinA, g1.pinB) # "None, None"
print("G2 pins:", g2.pinA, g2.pinB) # "None, None"
# test that the gates work independently of each other:
print("G1 output:", g1.getOutput()) # will prompt for two inputs, then print their AND
print("G2 output:", g2.getOutput()) # will prompt for two inputs, then print their OR
# create a connection between the gates
c1 = Connector(g1, g2) # connects output of g1 to first available pin (pinA) of g2
# we can see that g2.pinA has changed
print("G2 pins after connection:", g2.pinA, g2.pinB)
# "<__main__.Connector object at SomeHexAddress>, None"
# now, if we get g2's output, it will automatically request g1's output via the Connector
print("G2 output:", g2.getOutput())
# will prompt for 2 G1 inputs, and one G2 input and print (G1_A AND G1_B) OR G2_B
If you want to play around with these classes more, you might want to add a __str__ (and/or __repr__) method to some or all of the classes. __str__ is used by Python to convert an instance of the class into a string whenever necessary (such as when you pass it as an argument to print or str.format). Here's a quick __str__ implementation for Connector:
def __str__(self):
return "Connector between {} and {}".format(self.fgate.name, self.tgate.name)