Python: can an object have an object as a "default" representation?

I am just getting started with OOP, so I apologise in advance if my question is as obvious as 2+2. :)
Basically, I created a class that adds attributes and methods to a pandas DataFrame. That's because I sometimes need to do complex but repetitive tasks like merging with a bunch of other tables, dropping duplicates, etc., so it's pleasant to be able to do that in one go with a predefined method. For example, I can create an object like this:
mysupertable = MySupperTable(original_dataframe)
And then do:
mysupertable.complex_operation()
Here original_dataframe is the original pandas DataFrame, which is stored as an attribute of the instance. Now, this is all well and good, but if I want to print (or just access) that original data frame I have to do something like
print(mysupertable.original_dataframe)
Is there a way to have that happen "by default", so that if I just do:
print(mysupertable)
it will print the original data frame, rather than the memory location?
I know there are the __str__ and __repr__ methods that can be implemented in a class to return default string representations for an object. I was just wondering if there is a similar magic method (or something else) to show a particular attribute by default. I tried looking this up, but I think I am not finding the right words to describe what I want to do, because I can't seem to find an answer.
Thank you!
Cheers

In your MySupperTable class, do:
class MySupperTable:
    # ... other stuff in the class
    def __str__(self) -> str:
        return str(self.original_dataframe)
That will make it so that when a MySupperTable is converted to a str, it will convert its original_dataframe to a str and return that.

When you pass an object to print(), it prints the object's string representation, which under the hood is obtained by calling the object's __str__() method. You can define this method yourself, the same way you would define any other method.
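As a rough sketch of both answers, the class below delegates __str__ and __repr__ to the wrapped object. FakeFrame is a hypothetical stand-in for a pandas DataFrame so that the example runs without pandas; the __init__ signature of MySupperTable is also an assumption, since the question does not show it.

```python
class FakeFrame:
    """Hypothetical stand-in for a pandas DataFrame (not a real pandas class)."""

    def __init__(self, rows):
        self.rows = rows

    def __str__(self):
        return "\n".join(str(row) for row in self.rows)

    def __repr__(self):
        return f"FakeFrame({self.rows!r})"


class MySupperTable:
    def __init__(self, original_dataframe):
        self.original_dataframe = original_dataframe

    def __str__(self) -> str:
        # Used by print() and str()
        return str(self.original_dataframe)

    def __repr__(self) -> str:
        # Used by the interactive prompt and when the object sits inside a container
        return f"MySupperTable({self.original_dataframe!r})"


table = MySupperTable(FakeFrame([{"a": 1}, {"a": 2}]))
print(table)        # prints the wrapped object's rows, one per line
print([table])      # containers use __repr__ for their elements
```

Defining __repr__ as well is optional; without it, the interactive prompt would still show the default "object at 0x..." text even though print() uses __str__.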

Related

New Python object instance already containing data

ANSWERED! :)
I have to create a function that
Initializes a new object
Creates and adds data to that object
Returns the object containing the data.
The object basically contains a json dict (geojson) and offers a way of oop with geojson data.
In the python script in question, I create multiple instances of the class Feature.
However, for some reason, the second Feature object I create already has the attributes of the first Feature object - at a different RAM address AND without passing anything to the 2nd object (obj_1 and obj_2 are created in different functions, completely unrelated to each other...)
I am relatively new to OOP with Python so maybe I am missing something obvious.
This is also my first question on stackoverflow so just keep that in mind :)
I just can't wrap my head around this problem though.
This is the code I use (I will only give you the Feature init, if you need more I'll happily provide more!):
class Feature:
    def __init__(self, featuretype: str="Point", coordinates: list=[], \
                 properties: dict={}, dictionary: dict={}):
        # basically loads a provided dictionary or auto-creates one in json format
        if dictionary:
            self.dict = dictionary
        else:
            self.new(featuretype, coordinates, properties)
        self.update()
First function (outside class):
def create_checkerboard(extent, dist_x, dist_y) -> bk.Feature:
    checker_ft = bk.Feature(featuretype="MultiPoint") #init
    checker_ft.gen_grid_adv(extent, dist_x, dist_y, matrixname="checkerboard") #populate
    return checker_ft
The returned checker_ft object now has a checker_ft.dict which is now populated with a MultiPoint grid.
All is well.
Second function (outside class):
def random_grid(extent, grid_spacing_x, grid_spacing_y, shift) -> bk.Feature:
    shift_ft = bk.Feature(featuretype="MultiPoint") #init
    shift_ft.gen_grid(extent, grid_spacing_x, grid_spacing_y) #populate
    return shift_ft
Now, for a reason which is obviously beyond me, the shift_ft.dict contains data from the checker_ft.
And both objects are at different RAM locations:
<bk_functions.Feature object at 0x000001A726A3CF70>
<bk_functions.Feature object at 0x000001A726A1F190>
I hope that this is a simple oversight on my part.
Thank you for your kind attention!
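A likely cause, assuming new() stores the coordinates and properties objects it receives, is the mutable default arguments in __init__. Python evaluates default values once, when the function is defined, so every call that omits the argument shares the same list or dict. The sketch below first shows the effect in isolation and then a common workaround with None as a sentinel. The body that builds self.dict is a hypothetical stand-in for the original new() method, which the question does not show.

```python
def append_point(point, coordinates=[]):
    # The default list is created once and reused by every call that omits it
    coordinates.append(point)
    return coordinates


a = append_point(1)
b = append_point(2)
print(a is b)  # True: both calls returned the same shared list


class Feature:
    def __init__(self, featuretype="Point", coordinates=None,
                 properties=None, dictionary=None):
        # None as a sentinel; fresh containers are created on every call
        coordinates = [] if coordinates is None else coordinates
        properties = {} if properties is None else properties
        if dictionary:
            self.dict = dictionary
        else:
            # Hypothetical stand-in for the original new() method
            self.dict = {
                "type": "Feature",
                "geometry": {"type": featuretype, "coordinates": coordinates},
                "properties": properties,
            }


first = Feature(featuretype="MultiPoint")
first.dict["geometry"]["coordinates"].append([0.0, 0.0])

second = Feature(featuretype="MultiPoint")
print(second.dict["geometry"]["coordinates"])  # []
```

The two Feature objects being at different addresses is consistent with this explanation: the instances are distinct, but they can still hold references to the same default list or dict.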

How can I store documentation in a pickle file?

In Python, I quite often generate a pickle file to preserve the work I have done during programming.
Is there any way to store something like a docstring in the pickle that explains how this pickle was generated and what its meaning is?
Because you can combine all kinds of items into dictionaries, tuples, and lists before pickling them, I would say the most straightforward solution would be to use a dictionary that has a docstring key.
pickle_dict = {'objs': [some, stuff, inhere],
               'docstring': 'explanation of those objects'}
Of course, depending on what you are pickling, you may want key-value pairs for each object instead of a list of objects.
When you open the pickle back up, you can just read the docstring to remember how this pickle came to be.
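A minimal round trip of the dictionary approach might look like this. The payload contents and the wording of the note are made up for illustration, and pickle.dumps/pickle.loads are used instead of a file so the sketch is self-contained.

```python
import pickle

payload = {
    "objs": [[1, 2, 3], {"threshold": 0.5}],
    "docstring": "Hypothetical note: produced by the cleaning step, values in metres.",
}

# pickle.dump(payload, file) and pickle.load(file) work the same way for files
blob = pickle.dumps(payload)
restored = pickle.loads(blob)

print(restored["docstring"])
```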
As an alternative solution, I often just need to save one or two integer values about the pickle. In this case, I choose to save them in the name of the pickle file. Depending on what you are doing, this could be preferred, because you can read the "docstring" without having to unpickle the file.
DataFrames and lists don't typically have docstrings because they are data. The docstring specification says:
A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.
You can create any of these to attach a docstring to the process that uses your data, for example the main class of your module.
class MyClass:
    """My docstring"""
    def __init__(self, df):
        self.df = df # Your dataframe
    ...
Something like this seems like it is closest to what you were asking within the conventions of the language.

Defining class init well: inheritance and new attributes:

With Python 3.5, I want to build a new class which is a tuple wrapping a pair of data structures in order to perform some operations.
It happens that I would also need some attributes. So I wrote this:
class tuplewrapper:
    def __init__(self, dict={}):
        print("initiating")
        data=dict
        self=tuple([dict(),dict()])
        self.keys=iter(self[0])
        self.values=iter(self[1])
        self.put(data)# another function that add data in the dict as self.keys and self.values. *No need to describe, as the problem is in __init__ and mostly about syntax...*
I know this looks similar to the keys and values of a dict, but I need to do some special processing and this class would be very useful for doing it.
The problem is that, since I rebind self to tuple([dict(),dict()]), Python raises an AttributeError, because a tuple has no attributes named keys and values.
Adding those attributes, along with some extra functions, is precisely why I built this class.
So what am I doing wrong?
How do I correct this? I don't know how to use "super", as the documentation is not very explicit about it (and this didn't help). I had assumed that in __init__ I could define things however I wanted, because that is the point of it, but it seems I misunderstood the concept.
So, how do I do this, please?
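One possible direction, offered as a sketch and not as the only fix: assigning to self inside __init__ only rebinds a local name and does not change the object being created. Because tuples are immutable, their contents have to be chosen in __new__, and a tuple subclass can then carry extra attributes set in __init__. The attribute names forward and backward and the body of put() are assumptions for illustration, since the original put() is not shown; the parameter is also renamed to data because a parameter called dict hides the built-in dict().

```python
class TupleWrapper(tuple):
    """A tuple of two dicts that also carries extra attributes (illustrative)."""

    def __new__(cls, data=None):
        # The tuple's contents are fixed here, before __init__ runs
        return super().__new__(cls, (dict(), dict()))

    def __init__(self, data=None):
        # A tuple subclass without __slots__ has a __dict__, so attributes can be added
        self.forward = self[0]
        self.backward = self[1]
        if data:
            self.put(data)

    def put(self, data):
        # Hypothetical behaviour: store each pair in both directions
        for key, value in data.items():
            self[0][key] = value
            self[1][value] = key


wrapper = TupleWrapper({"a": 1})
print(wrapper)
print(wrapper.forward, wrapper.backward)
```

If the class does not really need to be a tuple, keeping the two dicts as ordinary attributes of a plain class would be simpler and avoids __new__ entirely.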

What is the interpreter looking for?

I never realized just how poor a programmer I was until I came across this exercise below. I am to write a Python file that allows all of the tests below to pass without error.
I believe the file I write needs to be a class, but I have absolutely no idea what should be in my class. I know what the question is asking, but not how to make classes or to respond to the calls to the class with the appropriate object(s).
Please review the exercise code below, and then see my questions at the end.
File with tests:
import unittest
from allergies import Allergies
class AllergiesTests(unittest.TestCase):
    def test_ignore_non_allergen_score_parts(self):
        self.assertEqual(['eggs'], Allergies(257).list)
if __name__ == '__main__':
    unittest.main()
1) I don't understand the "list" method at the end of this assertion. Is it the built-in Python function "list()", or is it an attribute that I need to define in my "Allergies" class?
2) What type of object is "Allergies(257).list"?
self.assertEqual(['eggs'], Allergies(257).list)
3) Do I approach this by defining something like the following?
def list(self):
    list_of_allergens = ['eggs','pollen','cat hair', 'shellfish']
    return list_of_allergens[0]
1) I don't understand the "list" method at the end of this assertion. Is it the built-in Python function "list()", or is it an attribute that I need to define in my "Allergies" class?
From the dot (.), you can tell that it's an attribute that you need to define on your Allergies class (or, rather, on each of its instances).*
2) What type of object is "Allergies(257).list"?
Well, what is it supposed to compare equal to? ['eggs'] is a list of strings (well, of string). So, unless you're going to create a custom type that likes to compare equal to lists, you need a list.
3) Do I approach this by defining something like the following?
def list(self):
    list_of_allergens = ['eggs','pollen','cat hair', 'shellfish']
    return ist_of_allergens
No. You're on the wrong track right off the bat. This will make Allergies(257).list into a method. Even if that method returns a list when it's called, the test driver isn't calling it. It has to be a list. (Also, more obviously, ['eggs','pollen','cat hair', 'shellfish'] is not going to compare equal to ['eggs'], and ist_of_allergens isn't the same thing as list_of_allergens.)
So, where is that list going to come from? Well, your class is going to need to assign something to self.list somewhere. And, since the only code from your class that's getting called is your constructor (__new__) and initializer (__init__), that "somewhere" is pretty limited. And you probably haven't learned about __new__ yet, which means you have a choice of one place, which makes it pretty simple.
* Technically, you could use a class attribute here, but that seems less likely to be what they're looking for. For that matter, Allergies doesn't even have to be a class; it could be a function that just defines a new type on the fly, constructs it, and adds list to its dict. But PEP 8 naming standards and "don't make things more complex for no good reason" both point to wanting a class here.
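The difference between a method named list and an attribute named list can be checked directly. The two class names below are hypothetical and exist only for this comparison.

```python
class WithMethod:
    def list(self):
        return ['eggs']


class WithAttribute:
    def __init__(self):
        # Assigned in the initializer, so every instance has its own list
        self.list = ['eggs']


print(WithMethod().list == ['eggs'])     # False: this is a bound method that was never called
print(WithMethod().list() == ['eggs'])   # True, but the test does not call it
print(WithAttribute().list == ['eggs'])  # True
```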
From how it's used, list is an attribute of the object returned by Allergies, which may be a function that returns an object or simply the call to construct an object of type Allergies. In this last case, the whole thing can be easily implemented as:
class Allergies:
    def __init__(self, n):
        # probably you should do something more
        # interesting with n
        if n==257:
            self.list=['eggs']
This looks like one of the exercises from exercism.io.
I have completed the exercise, so I know what's involved.
'list' is supposed to be an attribute of each Allergies instance, and is itself an object of type list. At least that's one straightforward way of dealing with it. I defined it in the __init__ method of the class. In my opinion, it's confusing that they called it 'list', as this clashes with Python's built-in list type.
snippet from my answer:
class Allergies(object):
    allergens = ["eggs", "peanuts",
                 "shellfish", "strawberries",
                 "tomatoes", "chocolate",
                 "pollen","cats"]
    def __init__(self, score):
        # score_breakdown returns a list
        self.list = self.score_breakdown(score) # let the name of this function be a little clue ;)
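One way score_breakdown could work is sketched below. It assumes that the allergen at position i in the list contributes 2**i to the score and that any higher bits are ignored. That scoring rule is an assumption and is not stated in the snippet above; the only thing the visible test confirms is that a score of 257 must yield ['eggs'].

```python
class Allergies:
    allergens = ["eggs", "peanuts", "shellfish", "strawberries",
                 "tomatoes", "chocolate", "pollen", "cats"]

    def __init__(self, score):
        self.list = self.score_breakdown(score)

    def score_breakdown(self, score):
        # Assumed scoring: allergen number i contributes 2**i to the score;
        # bits beyond the known allergens (such as 256) are ignored
        return [name for i, name in enumerate(self.allergens)
                if score & (1 << i)]


print(Allergies(257).list)  # ['eggs']
```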
If I were you, I would go and do some Python tutorials. I would start with the basics, even if it feels like you are covering ground you have already travelled. It's absolutely worth knowing your fundamentals as solidly as possible. For this, I can recommend Udacity or Codecademy.

How to properly document hasattr() use

I see it's not considered pythonic to use isinstance(), and people suggest e.g. to use hasattr().
I wonder what the best way is to document the proper use of a function that uses hasattr().
Example:
I get stock data from different websites (e.g. Yahoo Finance, Google Finance), and there are classes GoogleFinanceData and YahooFinanceData which both have a method get_stock(date).
There is also a function which compares the value of two stocks:
def compare_stocks(stock1,stock2,date):
    if hasattr(stock1,'get_stock') and hasattr(stock2,'get_stock'):
        if stock1.get_stock(date) < stock2.get_stock(date):
            print("stock1 < stock2")
        else:
            print("stock1 > stock2")
The function is used like this:
compare_stocks(GoogleFinanceData('Microsoft'),YahooFinanceData('Apple'),'2012-03-14')
It is NOT used like this:
compare_stocks('Tree',123,'bla')
The question is: How do I let people know which classes they can use for stock1 and stock2? Am I supposed to write a docstring like "stock1 and stock2 ought to have a method get_stock" and people have to look through the source themselves? Or do I put all right classes into one module and reference that file in the docstring?
If all you ever do is call the function with *FinanceData instances, I'd not even bother with testing for the get_stock method; it's an error to pass in anything else and the function should just break if someone passes in strings.
In other words, just document your function as expecting a get_stock() method, and not test at all. Duck typing is for code that needs to accept distinctly different types of input, not for code that only works for one specific type.
I don't see what's unpythonic about using isinstance(). I would create a base class and refer to the base class's documentation.
def compare_stocks(stock1, stock2, date):
    """ Compares stock data of two FinanceData objects at a certain time. """
    if isinstance(stock1, FinanceData) and isinstance(stock2, FinanceData):
        return 'comparison'
class FinanceData(object):
    def get_stock(self, date):
        """ Returns stock data in format XX, expects parameter date in format YY """
        raise NotImplementedError
class GoogleFinanceData(FinanceData):
    def get_stock(self, date):
        """ Implements FinanceData.get_stock() """
        return 'important data'
As you see I don't use duck-typing here, but since you've asked this question in regards to documentation, I think this is the cleaner approach for readability. Whenever another developer sees the compare_stocks function or a get_stock method he knows exactly where he has to look for further information regarding functionality, data structure or implementation details.
Do what you suggest: state in the docstring that the passed arguments should have a get_stock method, since that is what your function requires. Listing classes is a bad idea, because the code may well be used with derived or other classes when it suits someone.
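In current Python (3.8 and later) the same requirement can also be written down as a typing.Protocol, which documents the duck-typed interface without forcing a common base class. This is only a sketch: SupportsGetStock is a hypothetical name, the prices are made up, and YahooFinanceData here is a stand-in and not the asker's real class.

```python
from typing import Protocol


class SupportsGetStock(Protocol):
    """Anything with a get_stock(date) method (hypothetical protocol name)."""

    def get_stock(self, date: str) -> float:
        ...


class YahooFinanceData:
    # Stand-in with made-up data; note that it does not inherit from the protocol
    def __init__(self, prices):
        self.prices = prices

    def get_stock(self, date: str) -> float:
        return self.prices[date]


def compare_stocks(stock1: SupportsGetStock, stock2: SupportsGetStock, date: str) -> str:
    """Compare two stock sources on a given date.

    stock1 and stock2 may be any objects with a get_stock(date) method.
    """
    if stock1.get_stock(date) < stock2.get_stock(date):
        return "stock1 < stock2"
    return "stock1 >= stock2"


cheap = YahooFinanceData({"2012-03-14": 10.0})
pricey = YahooFinanceData({"2012-03-14": 20.0})
print(compare_stocks(cheap, pricey, "2012-03-14"))
```

The annotations are not enforced at runtime; a static type checker can use them, and readers of the signature can see which method is required without searching the source.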
