I enjoy all the Python libraries for scraping websites, and I am experimenting with BeautifulSoup and IMDB just for fun.
As I come from Java, I have some Java practices incorporated into my programming style. I am trying to get the info of a certain movie; I can either create a Movie class or just use a dictionary with keys for the attributes.
My question is: should I just use dictionaries when a class will only contain data and perhaps almost no behaviour? In other languages, creating a type helps you enforce certain restrictions, and because of type checks the IDE helps you program. This is not always the case in Python, so what should I do?
Should I resort to creating a class only when there's both behaviour and data? Or create a Movie class even though it'll probably be just a data container?
This all depends on your model. In this particular case either one is fine, but I'm wondering what the good practice is.
It's fine to use a class just to store attributes. You may also wish to use a namedtuple instead.
The main differences between a dict and a class are the way you access the attributes ([] vs .) and inheritance.
instance.__dict__ is just a dict, after all.
You can even just use a single class for all of those types of objects if you wish:
class Bunch:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

movie = Bunch(title='foo', director='bar', ...)
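The namedtuple alternative mentioned in this answer could look roughly like the sketch below. The field names (title, director, year) are assumptions chosen for illustration; they are not taken from the question.

```python
from collections import namedtuple

# Hypothetical field names, chosen only for illustration.
Movie = namedtuple("Movie", ["title", "director", "year"])

movie = Movie(title="foo", director="bar", year=1999)

# Fields are read with dot access, like attributes on a class instance.
print(movie.title)

# A namedtuple is immutable, so assigning to a field raises AttributeError.
try:
    movie.title = "baz"
except AttributeError:
    print("namedtuple fields cannot be reassigned")

# _replace returns a modified copy and leaves the original untouched.
updated = movie._replace(title="baz")
```

The trade-off compared with Bunch is that the set of fields is fixed up front and instances cannot be mutated, which may or may not suit scraped data that arrives piecemeal.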
In your case you could use a class that inherits from dict (e.g. class MyClass(dict)), so that you can define custom behaviour for your dict-like class, or use UserDict.
It depends on what you really mean by "perhaps almost no behaviour": if dict already provides what you need, stay with it. Otherwise consider subclassing dict and adding your specific behaviour. This has been possible since Python 2.2. Using UserDict is an older approach to the problem.
You could also use a plain dictionary and implement the behaviour externally via some function. I use this approach for prototyping, and eventually refactor the code later to make it Object Oriented (generally more scalable).
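A minimal sketch of that "plain dictionary plus external function" style, assuming hypothetical keys and a hypothetical describe helper:

```python
# Hypothetical keys, chosen only for illustration.
movie = {"title": "foo", "director": "bar", "year": 1999}


def describe(movie):
    """Behaviour lives in a plain function that receives the dict."""
    return "{title} ({year}), directed by {director}".format(**movie)


print(describe(movie))
```

If functions like describe start to accumulate, that is usually the point where refactoring into a class begins to pay off.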
You can see what a dictionary offers by typing this at the interpreter:
>>> help({})
or referring to the docs.
I would stick to KISS (keep it simple, stupid). If you only want to store values, you are better off with a dictionary, because you can dynamically add values at runtime. WRONG: (But you cannot add new fields to a class at runtime.)
So classes are useful if they provide state and behaviour.
EDIT: You can add fields to classes in Python.
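The correction in the EDIT can be checked in a few lines. This is a small sketch with a hypothetical Movie class; it shows that an ordinary instance accepts new attributes after it has been created.

```python
class Movie:
    """Hypothetical data-only class used for illustration."""

    def __init__(self, title):
        self.title = title


movie = Movie("foo")

# A new attribute can be attached at runtime, much like a new dict key.
movie.director = "bar"

# vars() exposes the per-instance dict that stores the attributes.
print(vars(movie))
```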
Using get/set seems to be a common practice in Java (for various reasons), but I hardly see Python code that uses this.
Why do you use or avoid get/set methods in Python?
In Python, you can just access the attribute directly because it is public:
class MyClass:
    def __init__(self):
        self.my_attribute = 0

my_object = MyClass()
my_object.my_attribute = 1 # etc.
If you want to do something on access or mutation of the attribute, you can use properties:
class MyClass:
    def __init__(self):
        self._my_attribute = 0

    @property
    def my_attribute(self):
        # Do something if you want
        return self._my_attribute

    @my_attribute.setter
    def my_attribute(self, value):
        # Do something if you want
        self._my_attribute = value
Crucially, the client code remains the same.
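To make that last point concrete, here is a sketch in which a plain attribute is later given validation through a property. The Temperature class and its absolute-zero check are assumptions made up for the example; the point is only that callers keep writing t.celsius as before.

```python
class Temperature:
    """Hypothetical class whose 'celsius' began as a plain attribute."""

    def __init__(self, celsius=0.0):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Validation added later, without changing how callers use the class.
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value


# Client code is identical to what it would be with a plain attribute.
t = Temperature()
t.celsius = 21.5
print(t.celsius)
```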
Cool link: Python is not Java :)
In Java, you have to use getters and setters because using public fields gives you no opportunity to go back and change your mind later to using getters and setters. So in Java, you might as well get the chore out of the way up front. In Python, this is silly, because you can start with a normal attribute and change your mind at any time, without affecting any clients of the class. So, don't write getters and setters.
Here is what Guido van Rossum says about that in Masterminds of Programming:
What do you mean by "fighting the language"?
Guido: That usually means that they're trying to continue their habits that worked well with a different language. [...] People will turn everything into a class, and turn every access into an accessor method, where that is really not a wise thing to do in Python; you'll have more verbose code that is harder to debug and runs a lot slower. You know the expression "You can write FORTRAN in any language?" You can write Java in any language, too.
No, it's unpythonic. The generally accepted way is to use normal data attributes and replace the ones that need more complex get/set logic with properties.
The short answer to your question is no; you should use properties when needed. Ryan Tomayko provides the long answer in his article Getters/Setters/Fuxors:
The basic value to take away from all this is that you want to strive to make sure every single line of code has some value or meaning to the programmer. Programming languages are for humans, not machines. If you have code that looks like it doesn’t do anything useful, is hard to read, or seems tedious, then chances are good that Python has some language feature that will let you remove it.
Your observation is correct. This is not a normal style of Python programming. Attributes are all public, so you just access (get, set, delete) them as you would with attributes of any object that has them (not just classes or instances). It's easy to tell when Java programmers learn Python because their Python code looks like Java using Python syntax!
I definitely agree with all previous posters, especially @Maximiliano's link to Phillip's famous article and @Max's suggestion that, for anything more complex than the standard way of setting (and getting) class and instance attributes, you use properties (or descriptors, to generalize even more) to customize the getting and setting of attributes! (This includes being able to add your own customized versions of private, protected, friend, or whatever policy you want if you desire something other than public.)
As an interesting demo, in Core Python Programming (chapter 13, section 13.16), I came up with an example of using descriptors to store attributes to disk instead of in memory!! Yes, it's an odd form of persistent storage, but it does show you an example of what is possible!
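This is not the disk-backed example from the book, but a much smaller sketch of the same mechanism. The Typed descriptor and the Movie class are hypothetical names; the descriptor intercepts attribute assignment to enforce a type, which is one kind of custom policy a descriptor can implement.

```python
class Typed:
    """Hypothetical descriptor that enforces a type on assignment."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Called when the owning class is created (Python 3.6+).
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError("expected " + self.expected_type.__name__)
        setattr(instance, self.private_name, value)


class Movie:
    title = Typed(str)
    year = Typed(int)

    def __init__(self, title, year):
        self.title = title  # routed through Typed.__set__
        self.year = year


movie = Movie("foo", 1999)
print(movie.title, movie.year)
```

One descriptor class can serve many attributes, which is the generalization over writing a separate property for each.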
Here's another related post that you may find useful as well:
Python: multiple properties, one setter/getter
I came here looking for that answer (unfortunately I couldn't find it), but I found a workaround elsewhere. The code below could be an alternative for get.
class get_var_lis:
    def __init__(self):
        pass

    def __call__(self):
        return [2,3,4]

    def __iter__(self):
        return iter([2,3,4])

some_other_var = get_var_lis
This is just a workaround. Using the above concept, you could easily build a get/set methodology in Python too.
Our teacher showed one example in class explaining when we should use accessor functions.
class Woman(Human):
    def getAge(self):
        if self.age > 30:
            return super().getAge() - 10
        else:
            return super().getAge()
I am coming from a Java background and I am used to POJOs for defining a data model. Basically, what I want is to define a data model using Python classes with strong attribute type checking, where it is not possible to add more attributes than the ones defined (something similar to a Django model). For example, I am trying to create a class:
class A:
    element1: int
    element2: str
but this does not prevent an A object from having an attribute element3.
What is the pythonic way to achieve that? Is there any framework you would advise for Python 3.6+?
You can do this using __slots__; you can find more information about the __slots__ class attribute here: Slots docs.
Slots are a nice way to work around this space consumption problem. Instead of having a dynamic dict that allows adding attributes to objects dynamically, slots provide a static structure which prohibits additions after the creation of an instance.
Keep in mind that this comes at a cost.
With the older pickle protocols (0 and 1) it breaks serialization unless you define __getstate__ and __setstate__. It also restricts multiple inheritance: a class can't inherit from more than one class that either defines non-empty slots or has an instance layout defined in C code (like list, tuple or int).
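A minimal sketch of the __slots__ approach applied to the class A from the question. Note the assumption baked into it: the type hints are documentation only, so this blocks extra attributes such as element3 but does not enforce that element1 is an int.

```python
class A:
    # Only these attribute names may exist on instances.
    __slots__ = ("element1", "element2")

    def __init__(self, element1: int, element2: str):
        # The annotations are hints; Python does not check them at runtime.
        self.element1 = element1
        self.element2 = element2


a = A(1, "x")

try:
    a.element3 = 3.0  # not listed in __slots__
except AttributeError as exc:
    print("rejected:", exc)
```

Runtime type checking would need something extra on top, for example properties, descriptors, or a validation library.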
I hope this information is sufficient.
My question is not really technical. I use a Python package with well-suited classes that contain numerous attributes and methods. But for my application I would like to store several additional attributes which do not exist in those classes.
In Python, if I set an attribute that does not exist in the class, Python just creates that attribute. It works. But I would like to know whether this is recommended, or whether I should implement a subclass with those additional attributes. For example, it would allow us to document the new attributes, etc.
In a sense, this is slightly subjective but there is a clear answer. Do not add new attributes to the original object, create a new class.
I could argue about at least one clear reason for that. A new attribute is making the object do more than it used to do, so it is surely violating the single responsibility principle.
So, create a new class to hold this data. Yet, we could give you more information if we actually knew what you are trying to do. So, what about adding some more info?
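One possible reading of "create a new class" is a subclass that declares the extra attributes in a single documented place. In this sketch LibraryThing is only a stand-in for the third-party class, and source and tags are made-up attribute names, since the question does not say what the real ones are.

```python
class LibraryThing:
    """Stand-in for a class from the third-party package (hypothetical)."""

    def __init__(self, name):
        self.name = name


class AnnotatedThing(LibraryThing):
    """Adds application-specific attributes in one documented place.

    Attributes:
        source: where the record came from (hypothetical).
        tags: free-form labels attached by the application (hypothetical).
    """

    def __init__(self, name, source, tags=None):
        super().__init__(name)
        self.source = source
        self.tags = list(tags) if tags else []


thing = AnnotatedThing("example", source="import-job")
thing.tags.append("reviewed")
```

Because AnnotatedThing is still a LibraryThing, it can be passed to the package's own functions unchanged.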
It looks like there are multiple ways to do that, but I couldn't find the current best method.
Subclass UserDict
Subclass DictMixin
Subclass dict
Subclass MutableMapping
What is the correct way to do it? I want to abstract actual data which is in a database.
Since your dict-like class isn't in fact a dictionary, I'd go with MutableMapping. Subclassing dict implies dict-like characteristics, including performance characteristics, which won't be true if you're actually hitting a database.
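A sketch of the MutableMapping route. The class name StoreBackedDict is hypothetical, and a plain in-memory dict stands in for the database so that the example runs on its own; in real code each of the five methods would issue a query instead.

```python
from collections.abc import MutableMapping


class StoreBackedDict(MutableMapping):
    """Dict-like wrapper; a plain dict stands in for the database here."""

    def __init__(self):
        # Assumption: replace this with real database access.
        self._store = {}

    # These five abstract methods are all that MutableMapping requires.
    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


d = StoreBackedDict()
d["title"] = "foo"
d.update(director="bar")         # provided by the MutableMapping mixin
print(d.get("year", "unknown"))  # get() is also provided by the mixin
```

Methods such as get, update, pop, items and the in operator come for free from the mixin, built on top of the five methods defined here.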
If you are doing your own thing (e.g. inventing your own wheel) you might as well write the class from scratch (i.e. subclass from object), just providing the correct special members (e.g. __getitem__) and other such functions as described in the object data model, so that it quacks like a dict. Internally, you might even own a number of dicts (has-a) to help your implementation.
This way, you don't have to shoehorn your design to "fit" some existing implementation, and you aren't paying for things you aren't necessarily using. This recommendation is in part because your DB-backed class will already be considerably more complex than a standard dict if you make it account for performance, caching, consistency, optimal querying, etc.
When is a class more useful to use than a function? Is there any hard or fast rule that I should know about? Is it language dependent? I'm intending on writing a script for Python which will parse different types of json data, and my gut feeling is that I should use a class to do this, versus a function.
You should use a class when your routine needs to save state. Otherwise a function will suffice.
First of all, I think this isn't language-dependent (as long as the language lets you define both classes and functions).
As a general rule, a class wraps behaviour. So if you have a certain type of service to implement (with, e.g., several different functions), a class is what you're looking for.
Moreover, classes (or, more correctly, objects) have state, and you can create several instances of a class (so different objects with different states).
No less important, a class can be inherited from, so you can override a specific behaviour with only small changes.
Use a class when you have state: something that should persist across calls.
Use a function in other cases.
Exception: if your class only stores a couple of values and has a single method besides __init__, you are better off with a function.
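The rule of thumb above can be illustrated with the JSON-parsing case from the question. Both names (parse_titles, TitleCollector) and the shape of the JSON, a list of objects with a "title" key, are assumptions made for the example.

```python
import json


def parse_titles(text):
    """Stateless: everything needed arrives as an argument."""
    return [item["title"] for item in json.loads(text)]


class TitleCollector:
    """Stateful: remembers titles across several calls to feed()."""

    def __init__(self):
        self.seen = set()

    def feed(self, text):
        self.seen.update(parse_titles(text))
        return self.seen


collector = TitleCollector()
collector.feed('[{"title": "foo"}]')
collector.feed('[{"title": "bar"}, {"title": "foo"}]')
print(sorted(collector.seen))
```

If the script only ever parses one document at a time, the function alone is enough; the class earns its keep only once something has to be remembered between calls.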
For anything non-trivial, you should probably be using a class. I tend to limit all of my "free-floating" functions to a utils.py file.
This is language-dependent.
Some languages, like Java, insist that you use a class for everything. There's simply no concept of a standalone function.
Python isn't like that. It's perfectly OK, and in fact recommended, to define standalone functions, and related functions can be grouped together in modules. As others have stated, the only time you really want a class in Python is when you have state that you need to keep, i.e., when you are encapsulating the data within the object.