I'm trying to split up and tokenize a poem (a haiku in this case), mostly as a way to teach myself how to use nltk and classes. When I run the code below, I get NameError: name 'psplit' is not defined, even though (my thinking is) it's defined when I return it from the split function. Can anyone help me figure out what's going wrong under the hood here?
import nltk
poem = "In the cicada's cry\nNo sign can foretell\nHow soon it must die"
class Intro():
    def __init__(self, poem):
        self.__poem = poem
    def split(self):
        psplit = (poem.split('\n'))
        psplit = str(psplit)
        return psplit
    def tokenizer(self):
        t = nltk.tokenize(psplit)
        return t
i = Intro(poem)
print(i.split())
print(i.tokenizer())
There are some issues in your code:
In the split method you have to use self.__poem to access the poem attribute of your class, as you did in the constructor.
The psplit variable in the split method is only a local variable, so it exists in that method and nowhere else. If you want to make it available in the tokenizer method, you have to either pass it as an argument or store it as an additional attribute:
...
    def tokenizer(self, psplit):
        t = nltk.tokenize(psplit)
        return t
...
psplit = i.split()
print(i.tokenizer(psplit))
Or:
    def __init__(self, poem):
        ...
        self._psplit = None
    ...
    def split(self):
        # Read the poem from the instance and store the result on it
        self._psplit = self.__poem.split('\n')
        self._psplit = str(self._psplit)
    def tokenizer(self):
        t = nltk.tokenize(self._psplit)
        return t
...
i.split()
print(i.tokenizer())
In addition make sure your indentation is correct.
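Putting both fixes together, a minimal sketch might look like the following. It assumes a plain whitespace split as a stand-in tokenizer so that it runs without NLTK or its data files, and the attribute name _lines is made up for illustration. If NLTK is installed, nltk.word_tokenize is probably the callable you want, since nltk.tokenize is a module rather than a function.

```python
class Intro:
    def __init__(self, poem):
        self._poem = poem      # keep the text on the instance
        self._lines = None     # filled in by split()

    def split(self):
        # Store the result on self so other methods can see it.
        self._lines = self._poem.split("\n")
        return self._lines

    def tokenizer(self):
        if self._lines is None:
            self.split()
        # Stand-in tokenizer (assumption): split each line on whitespace.
        return [line.split() for line in self._lines]


poem = "In the cicada's cry\nNo sign can foretell\nHow soon it must die"
intro = Intro(poem)
print(intro.split())
print(intro.tokenizer())
```

Because tokenizer() reads self._lines instead of a local variable, it no longer depends on a name that only exists inside split().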
Related
I have a class in Python that initializes the attributes of an environment. I am attempting to grab the topographyRegistry attribute list of my Environment class in a separate function, which when called, should take in the parameters of 'self' and the topography to be added. When this function is called, it should simply take an argument such as addTopographyToEnvironment(self, "Mountains") and append it to the topographyRegistry of the Environment class.
When implementing what I mentioned above, I ran into an error regarding the 'self' method not being defined. Hence, whenever I call the above line, it gives me:
print (Environment.addTopographyToEnvironment(self, "Mountains"))
^^^^
NameError: name 'self' is not defined
This leads me to believe that I am unaware of and missing a step in my implementation, but I am unsure of what that is exactly.
Here is the relevant code:
class EnvironmentInfo:
    def __init__(self, perceivableFood, perceivableCreatures, regionTopography, lightVisibility):
        self.perceivableFood = perceivableFood
        self.perceivableCreatures = perceivableCreatures
        self.regionTopography = regionTopography
        self.lightVisibility = lightVisibility
class Environment:
    def __init__(self, creatureRegistry, foodRegistry, topographyRegistery, lightVisibility):
        logging.info("Creating new environment")
        self.creatureRegistry = []
        self.foodRegistry = []
        self.topographyRegistery = []
        self.lightVisibility = True
    def displayEnvironment():
        creatureRegistry = []
        foodRegistry = []
        topographyRegistery = ['Grasslands']
        lightVisibility = True
        print (f"Creatures: {creatureRegistry} Food Available: {foodRegistry} Topography: {topographyRegistery} Contains Light: {lightVisibility}")
    def addTopographyToEnvironment(self, topographyRegistery):
        logging.info(
            f"Registering {topographyRegistery} as a region in the Environment")
        self.topographyRegistery.append(topographyRegistery)
def getRegisteredEnvironment(self):
    return self.topographyRegistry
if __name__ == "__main__":
    print (Environment.displayEnvironment()) #Display hardcoded attributes
    print (Environment.addTopographyToEnvironment(self, "Mountains"))#NameError
    print (Environment.getRegisteredEnvironment(self)) #NameError
What am I doing wrong or not understanding when using 'self'?
Edit: In regard to omitting 'self' from the print statement, it still gives me an error indicating a TypeError:
print (Environment.addTopographyToEnvironment("Mountains"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Environment.addTopographyToEnvironment() missing 1 required positional argument: 'topographyRegistery'
Comments
Although you wrote def getRegisteredEnvironment(self):, it wasn't indented under the class, so it's a module-level function rather than a method of Environment.
self is not a keyword. It is the conventional name for the first parameter of a method, and it only has meaning inside a method, not in ordinary functions or module-level code. It refers to the instantiated object the method was called on (e.g. after a = Environment(...), calling a method on a makes self refer to a).
You didn't have your addTopographyToEnvironment class method defined.
In terms of your Environment class, you aren't using the variables you are passing to the class, so I made that change as well. I don't know if that was intentional or not.
As per your comment on the other answer, if you had def my_class_method(self) and you tried to invoke it through an object with an additional argument, like so: a = my_object(); a.my_class_method("Mountains"), you should get an error along the lines of "takes 1 positional argument but 2 were given".
Your main problem is that you are doing Environment.class_method() and not creating an object from the class. Do a = Environment(whatever arguments here) to create an object from the class, then do a.addTopographyToEnvironment("Mountains") to do what you were going to do with "Mountains" and that object. What you have currently may be right and is just missing the proper implementation, but the article below does a great job of explaining the differences between all of them (Class Methods vs Static Methods vs Instance Methods), and is definitely worth the read.
import logging
class EnvironmentInfo:
    def __init__(self, perceivableFood, perceivableCreatures, regionTopography, lightVisibility):
        self.perceivableFood = perceivableFood
        self.perceivableCreatures = perceivableCreatures
        self.regionTopography = regionTopography
        self.lightVisibility = lightVisibility
class Environment:
    def __init__(self, creatureRegistry, foodRegistry, topographyRegistery, lightVisibility):
        logging.info("Creating new environment")
        self.creatureRegistry = creatureRegistry
        self.foodRegistry = foodRegistry
        self.topographyRegistery = topographyRegistery
        self.lightVisibility = lightVisibility
    def displayEnvironment(self):
        creatureRegistry = []
        foodRegistry = []
        topographyRegistery = ['Grasslands']
        lightVisibility = True
        print (f"Creatures: {creatureRegistry} Food Available: {foodRegistry} Topography: {topographyRegistery} Contains Light: {lightVisibility}")
    def addTopographyToEnvironment(self, environment):
        return "Whatever this is supposed to return." + environment
    def getRegisteredEnvironment(self):
        return self.topographyRegistery # same spelling as in __init__
if __name__ == "__main__":
    a = Environment([], [], [], True) # Create an object from the class first
    a.displayEnvironment() # Display hardcoded attributes
    print (a.addTopographyToEnvironment("Mountains"))
    print (a.getRegisteredEnvironment())
Object Instantiation In Python
With all that out of the way, I will answer the question as is posed, "Is there a way to grab list attributes that have been initialized using self and append data to them in Python?". I am assuming you mean the contents of the list and not the attributes of it, the attributes would be "got" or at least printed with dir()
As a simple example:
class MyClass:
    def __init__(self, my_list):
        self.my_list = my_list
if __name__ == "__main__":
    a = MyClass([1, 2, 3, 4, 5])
    print(a.my_list)
    # will print [1, 2, 3, 4, 5]
    a.my_list.append(6)
    print(a.my_list)
    # will print [1, 2, 3, 4, 5, 6]
    print(dir(a.my_list))
    # will print all object methods and object attributes for the list associated with object "a".
Sub Classing In Python
Given what you have above, it looks like you should be using subclassing (inheritance), where the parent class's __init__ is called through the built-in super(). From what I can guess, you'd implement it something like this:
import logging
class EnvironmentInfo:
    def __init__(self, perceivableFood, perceivableCreatures, regionTopography, lightVisibility):
        self.perceivableFood = perceivableFood
        self.perceivableCreatures = perceivableCreatures
        self.regionTopography = regionTopography
        self.lightVisibility = lightVisibility
class Environment(EnvironmentInfo):
    def __init__(self, creatureRegistry, foodRegistry, topographyRegistery, lightVisibility, someOtherThingAvailableToEnvironmentButNotEnvironmentInfo):
        logging.info("Creating new environment")
        # super() must be called; it returns a proxy to the parent class
        super().__init__(foodRegistry, creatureRegistry, topographyRegistery, lightVisibility)
        self.my_var1 = someOtherThingAvailableToEnvironmentButNotEnvironmentInfo
    def displayEnvironment(self):
        creatureRegistry = []
        foodRegistry = []
        topographyRegistery = ['Grasslands']
        lightVisibility = True
        print (f"Creatures: {creatureRegistry} Food Available: {foodRegistry} Topography: {topographyRegistery} Contains Light: {lightVisibility}")
    def addTopographyToEnvironment(self, environment):
        return "Whatever this is supposed to return." + environment
    def getRegisteredEnvironment(self):
        # The parent __init__ stores the topography as regionTopography
        return self.regionTopography
    def methodAvailableToSubClassButNotSuper(self):
        return self.my_var1
if __name__ == "__main__":
    a = Environment([], [], [], True, "Only accessible to the sub class")
    print(a.methodAvailableToSubClassButNotSuper())
As the article describes when talking about super(), methods and attributes from the superclass are available to the subclass.
Extra Resources
Class Methods vs Static Methods vs Instance Methods - "Difference #2: Method Defination" gives an example that would be helpful I think.
What is sub classing in Python? - Just glanced at it; probably an okay read.
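self represents the instance of the class, and you don't have access to it outside of the class's methods. When you call a method on an object, you don't need to pass self, because Python passes it automatically; you only pass the parameters after self. That only works when the call is made on an instance rather than on the class itself, so to call a method like addTopographyToEnvironment(self, newVal), create an object first and call it like this:
env = Environment([], [], [], True)
env.addTopographyToEnvironment("Mountains")
Calling Environment.addTopographyToEnvironment("Mountains") directly on the class binds "Mountains" to self and raises the TypeError shown in the question's edit.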
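As a small illustration of how self gets bound, here is a stripped-down sketch. The snake_case names are simplified stand-ins chosen for this example, not the question's actual class.

```python
class Environment:
    def __init__(self):
        self.topography_registry = []

    def add_topography(self, region):
        self.topography_registry.append(region)
        return self.topography_registry


env = Environment()
env.add_topography("Mountains")            # self is env, filled in automatically
Environment.add_topography(env, "Forest")  # equivalent: instance passed explicitly
print(env.topography_registry)

try:
    # No instance here: "Mountains" is taken as self and region is missing.
    Environment.add_topography("Mountains")
except TypeError as exc:
    print("TypeError:", exc)
```

The two successful calls are interchangeable; the instance form is the one you would normally write.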
I have created this class, which works as expected. I want to expose only one method, get_enriched_data, so the others are effectively private with a leading underscore.
The functionality works, but I'm pretty convinced I am not doing it in the most Pythonic/OOP way:
class MergeClients:
    def __init__(self,source_df,extra_info_df,type_f):
        self.df_all = pd.merge(source_df,extra_info_df, on='clientID', how='left')
        self.avg_age = self._get_avg_age()
        self.type_f = 'Medium'
    def _filter_by_age(self, age):
        return self.df_all[self.df_all['Age'] > age]
    def _filter_by_family_type(self, f_type):
        return self.df_all[self.df_all['familyType'] == f_type]
    def _get_avg_age(self):
        return self.df_all['Age'].mean()
    def get_enriched_data(self):
        self.df_all = self._filter_by_age(self.avg_age)
        self.df_all=self._filter_by_family_type(self.type_f)
        return self.df_all
But I find the code looks ugly with so many self references. For example, in the get_enriched_data method there are three self references per line. How can I correct this? Any direction on how to write Python classes correctly is welcome.
Edit:
Example of working code:
main_df = pd.DataFrame({'clientID':[1,2,3,4,5],
'Name':['Peter','Margaret','Marc','Alice','Maria']})
extra_info = pd.DataFrame({'clientID':[1,2,3,4,5],'Age':[19,35,18,65,57],'familyType':['Big','Medium','Single','Medium','Medium']})
family_stats = MergeClients(main_df,extra_info,'Medium')
family_filtered = family_stats.get_enriched_data()
There are some odd things about your code. I will point out one thing about instances: every method has access to all attributes, so you don't always need to pass them as parameters:
class MergeClients:
    def __init__(self,source_df,extra_info_df,type_f):
        self.df_all = pd.merge(source_df,extra_info_df, on='clientID', how='left')
        self.avg_age = self._get_avg_age()
        self.type_f = 'Medium'
    def _filter_by_age(self): #No need for age param
        return self.df_all[self.df_all['Age'] > self.avg_age]
    def _filter_by_family_type(self): #No need for f_type param
        return self.df_all[self.df_all['familyType'] == self.type_f]
    def _get_avg_age(self):
        return self.df_all['Age'].mean()
    def get_enriched_data(self):
        self.df_all = self._filter_by_age()
        self.df_all = self._filter_by_family_type()
        return self.df_all
The two methods in question, _filter_by_age() and _filter_by_family_type(), are private by convention, which means that clients of your class are not expected to call them. So if only the methods you have shown call them, there is no need to pass parameters that are already attributes.
Alternatively, if a private method should sometimes use an attribute and at other times a caller-supplied value, I would have it take a parameter, as you had originally.
Functions declared within a Python Class can be effectively made 'private' by preceding the name with double underscore. For example:
class Clazz():
    def __work(self):
        print('Working')
    def work(self):
        self.__work()
c = Clazz()
c.work()
c.__work()
The output of this would be:
Working
Traceback (most recent call last):
  File "/Volumes/G-DRIVE Thunderbolt 3/PythonStuff/play.py", line 575, in <module>
    c.__work()
AttributeError: 'Clazz' object has no attribute '__work'
In other words, the __work function has been 'hidden'.
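The hiding comes from name mangling, which is a renaming rather than real access control. The sketch below reuses the Clazz example, changed to return the string instead of printing it, and shows that the method is still reachable under its mangled name.

```python
class Clazz:
    def __work(self):
        return "Working"

    def work(self):
        # Inside the class body this is rewritten to self._Clazz__work()
        return self.__work()


c = Clazz()
print(c.work())
print(hasattr(c, "__work"))   # the unmangled name does not exist on the object
print(c._Clazz__work())       # the mangled name is still accessible
```

So a double underscore discourages outside calls and avoids name clashes in subclasses, but it does not prevent access.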
It does not work. I want to split the data as shown in the lines method of the code below.
class movie_analyzer:
def __init__(self,s):
for c in punctuation:
import re
moviefile = open(s, encoding = "latin-1")
movielist = []
movies = moviefile.readlines()
def lines(movies):
for movie in movies:
if len(movie.strip().split("::")) == 4:
a = movie.strip().split("::")
movielist.append(a)
return(movielist)
movie = movie_analyzer("movies-modified.dat")
movie.lines
It returns that:
You can use the @property decorator to be able to access the result of the method as a property. See this very simple example of how this decorator might be used:
import random
class Randomizer:
    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
    @property
    def rand_num(self):
        return random.randint(self.lower, self.upper)
Then, you can access it like so:
>>> randomizer = Randomizer(0, 10)
>>> randomizer.rand_num
5
>>> randomizer.rand_num
7
>>> randomizer.rand_num
3
Obviously, this is a useless example; however, you can take this logic and apply it to your situation.
Also, one more thing: you are not passing self to lines. You pass movies, which is unneeded because you can just access it using self.movies. However, if you want to access those variables using self you have to set (in your __init__ method):
self.movielist = []
self.movies = moviefile.readlines()
To call a function you use movie.lines() along with the argument. What you are doing is just accessing the method declaration. Also, make sure you use self as argument in method definitions and save the parameters you want your Object to have. And it is usually a good practice to keep your imports at the head of the file.
import re
class movie_analyzer:
    def __init__(self, s):
        # The question's punctuation loop is left out here: its body is not shown
        with open(s, encoding="latin-1") as moviefile:
            self.movies = moviefile.readlines()
        self.movielist = []
    def lines(self):
        for movie in self.movies:
            fields = movie.strip().split("::")
            if len(fields) == 4:
                self.movielist.append(fields)
        return self.movielist
movie = movie_analyzer("movies-modified.dat")
movie.lines() # called with parentheses, so lines is a plain method here
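To make the difference between accessing a method and calling it concrete, here is a small sketch that works on an in-memory list instead of a file. The class name MovieAnalyzer, the parsed property, and the sample record are made up for illustration.

```python
class MovieAnalyzer:
    def __init__(self, raw_lines):
        self.movies = list(raw_lines)

    def lines(self):
        # Keep only records that split into exactly four "::"-separated fields
        return [m.strip().split("::") for m in self.movies
                if len(m.strip().split("::")) == 4]

    @property
    def parsed(self):
        # Same result, but read like an attribute (no parentheses)
        return self.lines()


sample = ["1::Toy Story::1995::Animation\n", "broken line\n"]
analyzer = MovieAnalyzer(sample)
print(callable(analyzer.lines))  # analyzer.lines alone is just the bound method
print(analyzer.lines())          # calling it runs the code
print(analyzer.parsed)           # a property is evaluated on access
```

Pick one style per name: a property is accessed without parentheses, while a plain method must be called with them.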
I'm trying to return the variable name, but I keep getting this:
<classes.man.man object at (some numbers (as example:0x03BDCA50))>
Below is my code:
from classes.man import man
def competition(guy1, guy2, counter1=0, counter2=0):
    .......................
    some *ok* manipulations
    .......................
    if counter1>counter2:
        return guy1
bob = man(172, 'green')
bib = man(190, 'brown')
print(competition(bob , bib ))
Epilogue
If anyone is willing, please explain what I can write instead of __class__ in the example below to get the variable name.
def __repr__(self):
    return self.__class__.__name__
Anyway, thank you for all of your support
There are different ways to approach your problem.
The simplest I can fathom is if you can change the class man, make it accept an optional name in its __init__ and store it in the instance. This should look like this:
class man:
    def __init__(self, number, color, name="John Doe"):
        self.name = name
        # rest of your code here
That way, in your function you could just do:
return guy1.name
Additionally, if you want to go an extra step, you could define a __str__ method in your class man so that when you pass it to str() or print(), it shows the name instead:
# Inside class man
    def __str__(self):
        return self.name
That way your function could just do:
return guy1
And when you print the return value of your function it actually prints the name.
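A self-contained sketch of that approach is below. The parameter names height and eye_color and the "taller one wins" rule are guesses made only so the example runs; the real man class and competition logic are not shown in the question.

```python
class Man:
    def __init__(self, height, eye_color, name="John Doe"):
        self.height = height
        self.eye_color = eye_color
        self.name = name

    def __str__(self):
        # Used by print() and str()
        return self.name

    def __repr__(self):
        # Used in the REPL and when the object sits inside a container
        return f"Man({self.height!r}, {self.eye_color!r}, name={self.name!r})"


def competition(guy1, guy2):
    # Hypothetical rule for illustration only: the taller one wins.
    return guy1 if guy1.height >= guy2.height else guy2


bob = Man(172, "green", name="bob")
bib = Man(190, "brown", name="bib")
winner = competition(bob, bib)
print(winner)     # goes through __str__
print([winner])   # goes through __repr__
```

The function keeps returning the object itself, and the decision about how to display it stays in the class.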
If you cannot alter class man, here is an extremely convoluted and costly suggestion, that could probably break depending on context:
import inspect
def competition(guy1, guy2, counter1=0, counter2=0):
    guy1_name = ""
    guy2_name = ""
    # Search the outermost frame's variables for names bound to each object
    for name, value in inspect.stack()[-1].frame.f_locals.items():
        if value is guy1:
            guy1_name = name
        elif value is guy2:
            guy2_name = name
    if counter1 > counter2:
        return guy1_name
    elif counter2 > counter1:
        return guy2_name
    else:
        return "No one"
Valentin's answer, or at least the first part of it (adding a name attribute to man), is of course the proper, obvious solution.
As for the second part (the inspect.stack hack), it's brittle at best: the "variable names" we're interested in might not necessarily be defined in the first parent frame, and FWIW they could just as well come from a dict etc...
Also, it's definitely not the competition() function's responsibility to care about this (don't mix the domain layer with the presentation layer, thanks), and it's totally useless since the calling code can easily solve this part by itself:
def competition(guy1, guy2, counter1=0, counter2=0):
    .......................
    some *ok* manipulations
    .......................
    if counter1>counter2:
        return guy1
def main():
    bob = man(172, 'green')
    bib = man(190, 'brown')
    winner = competition(bob, bib)
    if winner is bob:
        print("bob wins")
    elif winner is bib:
        print("bib wins")
    else:
        print("tie!")
By default, when an object is passed to print(), Python prints its class name and its location in memory. If you want prettier output for a class, you need to define a __repr__(self) method for that class that returns the string to be printed when an object is passed to print(). Then you can just return guy1.
__repr__ is the method that defines the name in your case.
By default it gives you the object's type information. If you want to print a more apt name, you should override the __repr__ method.
Check below code for instance
class class_with_overrided_repr:
    def __repr__(self):
        return "class_with_overrided_repr"
class class_without_overrided_repr:
    pass
x = class_with_overrided_repr()
print(x) # class_with_overrided_repr
x = class_without_overrided_repr()
print(x) # <__main__.class_without_overrided_repr object at 0x7f06002aa368>
Let me know if this is what you want.
I am working on a bit of code that does nothing important, but one of the things I am trying to make it do is call a function from another class, and the class name is pulled out of a list and put into a variable. Mind you I have literally just learned Python over the last 2 weeks, and barely know my way around how to program.
What I believe that this should do is when getattr() is called, it will pass the attribute 'run_question' that is contained in the respective class with the same name as what is in question_type, and then pass it onto 'running_question'. I know there are probably better ways to do what I am attempting, but I want to know why this method doesn't work how I think it should.
#! /usr/bin/python
from random import randrange
class QuestionRunner(object):
    def __init__(self):
        ##initialize score to zero
        self.score = 0
        ##initialize class with the types of questions
        self.questiontypes = ['Addition', 'Subtraction', 'Division', 'Multiplication']
    ##randomly selects question type from self.questiontypes list
    def random_type(self):
        type = self.questiontypes[randrange(0, 4)]
        return type
    ##question function runner, runs question function from self
    def run_questions(self):
        try:
            question_type = self.random_type()
            running_question = getattr(question_type, 'run_question' )
        except AttributeError:
            print question_type
            print "Attribute error:Attribute not found"
        else: running_question()
class Question(object):
    pass
class Multiplication(Question):
    def run_question(self):
        print "*"
class Division(Question):
    def run_question(self):
        print "/"
class Subtraction(Question):
    def run_question(self):
        print "-"
class Addition(Question):
    def run_question(self):
        print "+"
test = QuestionRunner()
test.run_questions()
This outputs:
[david@leonid mathtest] :( $ python mathtest.py
Division
Attribute error:Attribute not found
[david@leonid mathtest] :) $
Which indicates that I am not getting the run_question attribute as I expect.
I should note that when I put the functions into the QuestionRunner class in the following way, everything works as expected. The main reason I am using classes where it really isn't needed it to actually get a good grasp of how to make them do what I want.
#! /usr/bin/python
from random import randrange
class QuestionRunner(object):
    def __init__(self):
        ##initialize score to zero
        self.score = 0
        ##initialize class with the types of questions
        self.questiontypes = ['addition', 'subtraction', 'division', 'multiplication']
    ##randomly selects question type from self.questiontypes list
    def random_type(self):
        type = self.questiontypes[randrange(0, 4)]
        return type
    ##question function runner, runs question function from self
    def run_questions(self):
        try:
            question_type = self.random_type()
            running_question = getattr(self, question_type)
        except AttributeError:
            exit(1)
        else: running_question()
    def multiplication(self):
        print "*"
    def division(self):
        print "/"
    def addition(self):
        print "+"
    def subtraction(self):
        print "-"
test = QuestionRunner()
test.run_questions()
Any help on why this isn't working would be great, and I appreciate it greatly.
Ah, I have found out the missing concept that was causing my logic to be faulty. I assumed that I could pass the name of an object to getattr, when in reality I have to pass the object itself.
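One way to go from a name to the object itself is a dictionary that maps each class name to its class. The sketch below is written for Python 3 (the question's code is Python 2) and returns the symbol instead of printing it. The QUESTION_TYPES mapping is an addition for illustration, not part of the original code.

```python
from random import choice


class Question:
    def run_question(self):
        raise NotImplementedError


class Addition(Question):
    def run_question(self):
        return "+"


class Subtraction(Question):
    def run_question(self):
        return "-"


class Division(Question):
    def run_question(self):
        return "/"


class Multiplication(Question):
    def run_question(self):
        return "*"


# Hypothetical registry: class name -> class object
QUESTION_TYPES = {cls.__name__: cls
                  for cls in (Addition, Subtraction, Division, Multiplication)}


class QuestionRunner:
    def __init__(self):
        self.score = 0
        self.questiontypes = list(QUESTION_TYPES)

    def random_type(self):
        return choice(self.questiontypes)

    def run_questions(self):
        name = self.random_type()
        question = QUESTION_TYPES[name]()  # look up the class, then instantiate it
        # getattr now receives an object that actually has run_question
        running_question = getattr(question, "run_question")
        return running_question()


print(QuestionRunner().run_questions())
```

Passing the string "Division" to getattr looks for run_question on a str object, which is why the original version raised AttributeError.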