Python - Class variables vs dictionary of values

Let's say I've got a class that represents an object with many properties (simple data types like strings and integers). Should they be represented as instance variables, or would it be more "pythonic" to put them into a dictionary?
For example:
class FruitBasket:
    def __init__(self, apples, oranges, bananas, pears):  # number of apples, oranges, etc.
        self.apples = apples
        self.oranges = oranges
        self.bananas = bananas
        self.pears = pears
class FruitBasket:
    def __init__(self, fruits):  # fruits is a dictionary
        self.fruits = fruits

My general philosophy is to use attributes if the set of items is more or less fixed, and use a dictionary if the set may change on an ad-hoc basis. If your FruitBasket is specifically made to contain apples, oranges, bananas and pears, then use attributes. If it may contain any random assortment of other things (e.g., you might sometimes throw in a pineapple or a raspberry), use a dictionary.
One reason can be seen even in your example code. If you use attributes, you have to specify each one literally in the code (e.g., self.pears). Moreover, you often wind up doing what you did here, where you explicitly pass each item as an argument to __init__. This obviously won't work if you later decide to add new fruits. You could keep adding more arguments to __init__, but that quickly becomes unwieldy.
In addition, if you have a fixed set of items, you'll probably be accessing them individually. That is, if you know you only have apples, oranges, bananas, and pears, you can directly access them by name as you did here (self.apples, self.oranges, etc.). If you don't know ahead of time what fruits may be in the basket, you can't know a priori what names to use, so you'll typically process them by iterating over them. It is very easy to iterate over the items of a dictionary. By contrast, iterating over the attributes of an object is fraught with peril, since you can't easily distinguish the attributes that contain data that the object is "about" (e.g., self.pears) from those that pertain to the structure of the object itself (e.g., self.__init__, self.basketColor, self.basketSize, etc.).
In short, if you don't know ahead of time what will be in the basket, you'll want to iterate over its contents, and if you want to iterate over something's contents, it's best to use a type designed for containment (like a list or dict), because these types cleanly separate the container from its contents.
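To make the distinction above concrete, here is a minimal sketch. The class names FixedBasket and OpenBasket and the color attribute are made up for illustration; they are not part of the question's code.

```python
class FixedBasket:
    """Fixed set of fruits: each count is an attribute."""

    def __init__(self, apples, oranges, color="brown"):
        self.apples = apples
        self.oranges = oranges
        self.color = color  # describes the basket, not its contents


class OpenBasket:
    """Open-ended set of fruits: counts live in a dict."""

    def __init__(self, fruits, color="brown"):
        self.fruits = dict(fruits)  # copy so the caller's dict is not shared
        self.color = color

    def total(self):
        # Iterating the dict touches only the contents, never `color`.
        return sum(self.fruits.values())


fixed = FixedBasket(apples=3, oranges=2)
# vars() returns every instance attribute, so contents and metadata are mixed.
fixed_names = sorted(vars(fixed))

open_basket = OpenBasket({"apples": 3, "oranges": 2})
open_basket.fruits["pineapple"] = 1  # a new fruit needs no code change
open_total = open_basket.total()
```

With attributes, any loop over the object's data has to filter out color by hand; with the dict, the container and its contents are already separate.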

It depends on what you are going to do with it. A dictionary is more flexible: it is easier to expand with new fruits, and you can iterate over it to get them all. Representing the fruits as members saves you some typing, but you have to hard-code every access to them.
A middle ground exists, and it is to use the pattern to synchronize both. Here there is some discussion about how to implement it.
And, before you write another class, remember: Stop writing classes.

You probably want to use a dictionary, or, if the values really form an enumeration, the enum module that is new in Python 3.4: https://docs.python.org/3/library/enum.html
from enum import Enum
animal = Enum('Animal', 'ant bee cat dog')
animal.ant

This and very similar questions have been asked many times before, and there are various AttrDict implementations out there.
However, you should ask yourself if you have any reason at all not to use a dict. If you don't, then the pythonic thing to do is to use a dict, obviously. A class with no methods should probably not be a class at all. You should also consider the fact that not all valid dict keys are valid attribute names.
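The last point can be checked quickly. This is only an illustrative sketch: the sample keys are invented, and types.SimpleNamespace stands in for whatever attribute-style container you might use.

```python
import keyword
from types import SimpleNamespace

# Every one of these is a perfectly good dict key.
counts = {"apples": 3, "passion fruit": 2, "class": 1, 42: 7}

# Only string keys that are identifiers (and not keywords) can be written
# as `obj.name` in source code.
usable = [
    key for key in counts
    if isinstance(key, str) and key.isidentifier() and not keyword.iskeyword(key)
]

# Build an attribute-style object from the keys that survive the filter.
basket = SimpleNamespace(**{key: counts[key] for key in usable})
```

Here only "apples" survives; the key with a space, the keyword and the integer key have no attribute spelling.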

Related

Python typed collections

I am new to python (using python 3.6).
I have some class that represents amounts of some fictional coins.
So an instance could represent say 10 bluecoins or negative sums such as -20 redcoins and so on.
I can now hold several such CoinAmounts in a list.
e.g.
[CoinAmount(coin='blue',amount=-10), CoinAmount(coin='blue',amount=20),
CoinAmount(coin='red',amount=5), CoinAmount(coin='red',amount=-5),
CoinAmount(coin='green',amount=5)]
I want to be able to "compress" the above list by summing each type of coin so that I will have.
[CoinAmount(coin='blue',amount=10), CoinAmount(coin='green',amount=5)]
or
[CoinAmount(coin='blue',amount=10), CoinAmount(coin='red',amount=0), CoinAmount(coin='green',amount=5)]
from which it is easy to derive the former...
My Q's are:
1) Would it make sense to have some sort of ListOfCoinAmounts that subclasses list and adds a compress method? Or should I use some CoinAmountUtils class that has a static method that works on a list and compresses it?
2) Is there a way to ensure that the list actually holds only CoinAmounts, or should this just be assumed and followed (or both, i.e. it can be done but shouldn't be)?
3) In a more general way what is the best practice "pythonic" way to handle a "List of something specific"?

Inheritance, when not used for typing, is mostly a very restricted form of composition / delegation, so inheriting from list is IMHO a bad design.
Having some CoinContainer class that delegates to a list is a much better design, in that 1/ it gives you full control of the API and 2/ it lets you change the implementation as you want (you may find out that a list is not the best container for your needs).
Also it will be easier to implement since you don't have to make sure you override all of the list methods and magic methods, only the ones you need (cf point #1).
wrt/ type-checking, it's usually not considered pythonic: it's the client code's responsibility to make sure it only passes compatible objects. If you really want some type-checking here, at least use an ABC and test against this ABC, not against a fixed type.
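A rough sketch of what such a delegating container could look like. The CoinAmount definition, the add and compress methods and the drop_zero flag are assumptions made for this example; the question does not show its actual class.

```python
from collections import OrderedDict
from typing import NamedTuple


class CoinAmount(NamedTuple):
    coin: str
    amount: int


class CoinContainer:
    """Holds CoinAmounts by delegating to a private list."""

    def __init__(self, amounts=()):
        self._amounts = list(amounts)

    def add(self, coin_amount):
        self._amounts.append(coin_amount)

    def __iter__(self):
        return iter(self._amounts)

    def __len__(self):
        return len(self._amounts)

    def compress(self, drop_zero=True):
        """Return a new container with one summed entry per coin."""
        totals = OrderedDict()  # keeps first-seen order of coins
        for item in self._amounts:
            totals[item.coin] = totals.get(item.coin, 0) + item.amount
        return CoinContainer(
            CoinAmount(coin, amount)
            for coin, amount in totals.items()
            if amount != 0 or not drop_zero
        )


wallet = CoinContainer([
    CoinAmount("blue", -10), CoinAmount("blue", 20),
    CoinAmount("red", 5), CoinAmount("red", -5),
    CoinAmount("green", 5),
])
compressed = list(wallet.compress())
with_zero = list(wallet.compress(drop_zero=False))
```

Only the methods the container needs are exposed, so the internal list could later be swapped for a dict keyed by coin without touching client code.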

1) Subclassing list and having only CoinAmount type of elements in it is a good and cleaner method IMO.
2) Yes, that can be done. You can inherit from the Python list and override the append method to check for types.
A good example here : Overriding append method after inheriting from a Python List
3) A good practice is indeed extending the list and putting your customizations.

Character class VS. Character list

On nearly all of the example programs for pygame, characters are instantiated as classes with some code like this one:
class Character(object):
    def __init__(self, image, stuff):
        self.image = image
        self.stuff = stuff[:]
bob = Character(image, stuff)
I am wondering what the benefit of using a class is over using just a plain list. I could instead of using class instantiation just create a list like this:
bob = [image,stuff[:]]
I was wondering if the reason that people use classes is to have functions that interact directly with the character and are just defined as a part of the class rather than as a separate function that can be used on the character.
Thank you!

Generally, I'd say it's more clear. With the list, you'll end up wondering "what was at index 0? what was at index 1?" and so forth. Then you'd have to trace back through the code to find where bob was defined to make sure.
Additionally, if you create other characters throughout the code, you have to create them all the same way. With the class, you can easily search the codebase for character creations and update it (e.g. if you want to add another property to characters) and if you miss any, python will throw an Exception so you know where to fix it. With the list, it's going to be really hard to find and python won't tell you if you miss any -- You'll get a funky IndexError that you need to trace back to the root cause which is more work.
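A small sketch of the failure modes described above. The armor parameter and the file name are invented for the example.

```python
class Character:
    def __init__(self, image, stuff, armor):  # `armor` was added later
        self.image = image
        self.stuff = stuff[:]
        self.armor = armor


# A call site that was not updated fails right where the character is created.
try:
    Character("bob.png", ["sword"])
    class_error = None
except TypeError as exc:
    class_error = type(exc).__name__

# The list version is created without complaint...
bob = ["bob.png", ["sword"]]
# ...and fails only later, wherever the missing slot is first read.
try:
    bob[2]
    list_error = None
except IndexError as exc:
    list_error = type(exc).__name__
```

The TypeError points at the stale constructor call, while the IndexError points at whichever line happened to read the missing element.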

When using a class you can inherit from another class and define methods, which doesn't apply to lists. But if you know that you will only be using static values like your class Character does, you might check out namedtuple. Here's a simple example of how to use it:
from collections import namedtuple
Character = namedtuple('Character', 'image stuff')
bob = Character(image, stuff)

Why use a class Bob over a list bob in this simple case:
Easy access to an attribute. It's simpler to remember Bob.image than bob[0]. The longer the list is, the harder it gets.
Code readability. I have no idea what the line bob[7]=bob[3]+bob[6] does. With a class, the same line becomes Bob.armor=Bob.shield+Bob.helmet, and I know what it does.
Organization. If some functions are only meant to be used on characters, it's practical to have them declared just after the attributes. A class forces you to have everything related to characters in the same place.
Instead of a list though, you could use a dictionary:
bob = {"image":image, "stuff":stuff[:], ...}
bob["armor"]=bob["shield"]+bob["helmet"]
As with a class, you have an easy access to attributes and code is readable.

Best way to make 'n' objects with each with unique data contents?

Firstly, you should know that I am incredibly new to programming, so I will love any detailed explanations.
So what I am attempting to make is a program that basically creates people. This includes unique characteristics such as their name, income, job, etc. And since I planned to make a large number of 'people,' I hoped I could merely state how many people I wanted made, and I would get each of them as an object of a class. To name them I figured I could do 'person1,' 'person2,' and so on. My trouble came when I found out you can't make strings into objects. (Or rather, it is heavily frowned upon.)
After researching I was able to make each person a dictionary, with a key like 'income' and a value like '60000.' However, when it comes to manipulating the data created, it seems much better to use classes and methods instead.
Thank you, and sorry if this is bad or if I am overlooking something.
Edit: I realized I could ask this better, how can I instantiate a large number of persons, or how do I make the needed variables to instantiate? I suck at explaining things...

It seems to me that you are asking two distinct questions (correct me if I'm wrong). The first - how should you store your data. The second - how can you do that repeatedly with ease.
There are a couple of ways you can store the data. I don't know your exact usecase so I can't say exactly which one would work best (you mentioned creating objects in your question so I'll use that for further examples)
Objects
class Person(object):
    def __init__(self, name, income):
        self.name = name
        self.income = income
Namedtuples
>>> from collections import namedtuple
>>> a = namedtuple("person", ['name', 'income'])
>>> a
<class '__main__.person'>
>>> ab = a("Dannnnno", 100)
>>> ab
person(name='Dannnnno', income=100)
>>> ab.name
'Dannnnno'
>>> ab.income
100
Dictionaries
someperson = {0: {"name": "Dannnno", "income": 100}}
someotherperson = {1: {"name": "kcd", "income": 100}}
As for creating large numbers of them - either create a class like GroupOfPeople or use a function.
Using the Classes example from above (I assume you could translate the other two examples appropriately)
class GroupOfPeople(object):
    def __init__(self, num_people):
        self.people = [Person("Default", 0) for i in range(num_people)]
####
def MakeLotsOfPeople(num_people):
    return [Person("Default", 0) for i in range(num_people)]
You could then edit those separate Person instances to whatever you want. You could also edit the class/function to accept another input (like a filename perhaps) that stored all of your name/income/etc data.
If you want a dictionary of the group of people just replace the list comprehensions with a dictionary comprehension, like so
{i: Person("Default", 0) for i in range(num_people)}
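As a sketch of the "accept another input" idea, the people can be built straight from a collection of records instead of being created with defaults and edited afterwards. The records list and the give_raise method below are hypothetical; in practice the data might come from a file.

```python
class Person:
    def __init__(self, name, income):
        self.name = name
        self.income = income

    def give_raise(self, percent):
        # Integer arithmetic keeps the income a whole number.
        self.income = self.income * (100 + percent) // 100


# Hypothetical input data.
records = [("Ann", 60000), ("Bob", 45000), ("Cy", 52000)]

# One Person per record, stored in a list instead of person1, person2, ...
people = [Person(name, income) for name, income in records]

# Or keyed by name, if lookup by name is more convenient.
people_by_name = {person.name: person for person in people}

people_by_name["Bob"].give_raise(10)
```

The list and the dict hold the same Person objects, so the raise given through people_by_name is also visible through people.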

Look up Object Oriented Programming. This is the concept you are trying to wrap your head around.
http://en.wikipedia.org/wiki/Object-oriented_programming

Is it possible to create unique instances that have the same input?

I am working on code in Python that creates Compound objects (as in chemical compounds) that are composed of Bond and Element objects. These Element objects are created with some inputs about them (name, symbol, atomic number, atomic mass, etc.). I want to populate an array with Element objects that are unique, so I can do something to one and leave the rest unchanged, but they should all carry the information for a 'Hydrogen' element.
This question Python creating multiple instances for a single object/class leads me to believe that I should create sub-classes to Element - ie a Hydrogen object and a Carbon object, etc.
Is this doable without creating sub-classes, and if so how?

Design your object model based on making the concepts make sense, not based on what seems easiest to implement.
If, in your application, hydrogen atoms are a different type of thing than oxygen atoms, then you want to have a Hydrogen class and an Oxygen class, both probably subclasses of an Element class.*
If, on the other hand, there's nothing special about hydrogen or oxygen (e.g., if you don't want to distinguish between, say, oxygen and sulfur, since they both have the same valence), then you don't want subclasses.
Either way, you can create multiple instances. It's just a matter of whether you do it like this:
atoms = [Hydrogen(), Hydrogen(), Oxygen(), Oxygen()]
… or this:
atoms = [Element(1), Element(1), Element(-2), Element(-2)]
If your instances take a lot of arguments, and you want a lot of instances with the same arguments, repeating yourself like this can be a bad thing. But you can use a loop—either an explicit statement, or comprehension—to make it better:
for _ in range(50):
    atoms.append(Element(group=16, valence=2, number=16, weight=32.066))
… or:
atoms.extend(Element(group=16, valence=2, number=16, weight=32.066)
             for _ in range(50))
* Of course you may even want further subclasses, e.g., to distinguish Oxygen-16, Oxygen-17, Oxygen-18, or maybe even different mixtures, like the 99.762% Oxygen-16 with small amounts of -18 and tiny bits of the others that's standard in Earth's atmosphere, vs. the different mixture that was common millions of years ago…
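To confirm that instances built from the same inputs are still separate objects, here is a minimal sketch. The Element signature and the charge attribute are assumptions based on the question's description, not its real code.

```python
class Element:
    def __init__(self, name, symbol, number, mass):
        self.name = name
        self.symbol = symbol
        self.number = number
        self.mass = mass
        self.charge = 0  # per-instance state, made up for this example


hydrogen_data = dict(name="Hydrogen", symbol="H", number=1, mass=1.008)

# Each call to Element(...) builds a separate object.
atoms = [Element(**hydrogen_data) for _ in range(3)]
atoms[0].charge = 1  # only the first atom changes

# Pitfall: list multiplication repeats ONE object three times.
same = [Element(**hydrogen_data)] * 3
same[0].charge = 1  # visible through all three list slots
```

So no subclasses are needed just to get independent hydrogen atoms; what matters is that the constructor is called once per atom.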

Python: iterating through a list of objects within a list of objects

I've made two classes called House and Window. I then made a list containing four Houses. Each instance of House has a list of Windows. I'm trying to iterate over the windows in each house and print its ID. However, I seem to get some odd results :S I'd greatly appreciate any help.
#!/usr/bin/env python
# Minimal house class
class House:
    ID = ""
    window_list = []
# Minimal window class
class Window:
    ID = ""
# List of houses
house_list = []
# Number of windows to build into each of the four houses
windows_per_house = [1, 3, 2, 1]
# Build the houses
for new_house in range(0, len(windows_per_house)):
    # Append the new house to the house list
    house_list.append(House())
    # Give the new house an ID
    house_list[new_house].ID = str(new_house)
    # For each new house build some windows
    for new_window in range(0, windows_per_house[new_house]):
        # Append window to house's window list
        house_list[new_house].window_list.append(Window())
        # Give the window an ID
        house_list[new_house].window_list[new_window].ID = str(new_window)
# Iterate through the windows of each house, printing house and window IDs.
for house in house_list:
    print("House: " + house.ID)
    for window in house.window_list:
        print(" Window: " + window.ID)
####################
# Desired output:
#
# House: 0
#  Window: 0
# House: 1
#  Window: 0
#  Window: 1
#  Window: 2
# House: 2
#  Window: 0
#  Window: 1
# House: 3
#  Window: 0
####################

Currently you are using class attributes instead of instance attributes. Try changing your class definitions to the following:
class House:
    def __init__(self):
        self.ID = ""
        self.window_list = []
class Window:
    def __init__(self):
        self.ID = ""
The way your code is now all instances of House are sharing the same window_list.
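The sharing is easy to see in isolation. In this sketch the name SharedHouse is invented to stand for the question's original class, next to the corrected House.

```python
class SharedHouse:
    window_list = []  # class attribute: one list for every instance


class House:
    def __init__(self):
        self.window_list = []  # instance attribute: a new list per instance


a, b = SharedHouse(), SharedHouse()
a.window_list.append("w0")  # mutates the single class-level list
shared_seen_from_b = list(b.window_list)

c, d = House(), House()
c.window_list.append("w0")
own_seen_from_d = list(d.window_list)
```

Appending through a also changes what b sees, because both names resolve to SharedHouse.window_list; c and d each own a separate list.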

Here's the updated code.
# Minimal house class
class House:
    def __init__(self, id):
        self.ID = id
        self.window_list = []
# Minimal window class
class Window:
    ID = ""
# List of houses
house_list = []
# Number of windows to build into each of the four houses
windows_per_house = [1, 3, 2, 1]
# Build the houses
for new_house in range(len(windows_per_house)):
    # Append the new house to the house list
    house_list.append(House(str(new_house)))
    # For each new house build some windows
    for new_window in range(windows_per_house[new_house]):
        # Append window to house's window list
        house_list[new_house].window_list.append(Window())
        # Give the window an ID
        house_list[new_house].window_list[new_window].ID = str(new_window)
# Iterate through the windows of each house, printing house and window IDs.
for house in house_list:
    print("House: " + house.ID)
    for window in house.window_list:
        print(" Window: " + window.ID)
The actual problem is that window_list was a class attribute holding a mutable list, so every instance appended to the same list and they all ended up sharing it. By moving window_list into __init__, each instance gets its own.

C++, Java, C# etc. have this really strange behaviour regarding instance variables, whereby data (members, or fields, depending on which culture you belong to) that's described within a class {} block belongs to instances, while functions (well, methods, but C++ programmers seem to hate that term and say "member functions" instead) described within the same block belong to the class itself. Strange, and confusing, when you actually think about it.
A lot of people don't think about it; they just accept it and move on. But it actually causes confusion for a lot of beginners, who assume that everything within the block belongs to the instances. This leads to bizarre (to experienced programmers) questions and concerns about the per-instance overhead of these methods, and trouble wrapping their heads around the whole "vtable" implementation concept. (Of course, it's mostly the teachers' collective fault for failing to explain that vtables are just one implementation, and for failing to make clear distinctions between classes and instances in the first place.)
Python doesn't have this confusion. Since in Python, functions (including methods) are objects, it would be bizarrely inconsistent for the compiler to make a distinction like that. So, what happens in Python is what you should intuitively expect: everything within the class indented block belongs to the class itself. And, yes, Python classes are themselves objects as well (which gives a place to put those class attributes), and you don't have to jump through standard library hoops to use them reflectively. (The absence of manifest typing is quite liberating here.)
So how, I hear you protest, do we actually add any data to the instances? Well, by default, Python doesn't restrict you from adding anything to any instance. It doesn't even require you to make different instances of the same class contain the same attributes. And it certainly doesn't pre-allocate a single block of memory to contain all the object's attributes. (It would only be able to contain references, anyway, given that Python is a pure reference-semantics language, with no C# style value types or Java style primitives.)
But obviously, it's a good idea to do things that way, so the usual convention is "add all the data at the time that the instance is constructed, and then don't add any more (or delete any) attributes".
"When it's constructed"? Python doesn't really have constructors in the C++/Java/C# sense, because this absence of "reserved space" means there's no real benefit to considering "initialization" as a separate task from ordinary assignment - except of course the benefit of initialization being something that automatically happens to a new object.
So, in Python, our closest equivalent is the magic __init__ method that is automatically called upon newly-created instances of the class. (There is another magic method called __new__, which behaves more like a constructor, in the sense that it's responsible for the actual creation of the object. However, in nearly every case we just want to delegate to the base object __new__, which calls some built-in logic to basically give us a little pointer-ball that can serve as an object, and point it to a class definition. So there's no real point in worrying about __new__ in almost every case. It's really more analogous to overloading the operator new for a class in C++.) In the body of this method (there are no C++-style initialization lists, because there is no pre-reserved data to initialize), we set initial values for attributes (and possibly do other work), based on the parameters we're given.
Now, if we want to be a little bit neater about things, or efficiency is a real concern, there is another trick up our sleeves: we can use the magic __slots__ attribute of the class to specify the attribute names its instances may have. This is a list of strings, nothing fancy. However, this still doesn't pre-initialize anything; an instance doesn't have an attribute until you assign it. This just prevents you from adding attributes with other names. You can even still delete attributes from an object whose class has specified __slots__. All that happens is that the instances are given a different internal structure, to optimize memory usage and attribute lookup.
The __slots__ usage requires that we derive from the built-in object type, which we should do anyway (although we aren't required in Python 2.x, this is intended only for backwards-compatibility purposes).
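The three __slots__ behaviours just described (nothing is pre-initialized, other names are rejected, deletion still works) can be checked with a short sketch. The colour attribute is only there to trigger the error.

```python
class Window(object):
    __slots__ = ["ID"]  # the only attribute name instances may have


w = Window()

# The slot exists, but nothing is stored in it until it is assigned.
has_id_before = hasattr(w, "ID")
w.ID = "0"

# Any other name is rejected.
try:
    w.colour = "red"
    rejected = False
except AttributeError:
    rejected = True

# Deleting a slot attribute is still allowed.
del w.ID
has_id_after = hasattr(w, "ID")
```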
Ok, so now we can make the code work. But how do we make it right for Python?
First off, just as with any other language, constantly commenting to explain already-self-explanatory things is a bad idea. It distracts the user, and doesn't really help you as a learner of the language, either. You're supposed to know what a class definition looks like, and if you need a comment to tell you that a class definition is a class definition, then reading the code comments isn't the kind of help you need.
With this whole "duck typing" thing, it's poor form to include data type names in variable (or attribute) names. You're probably protesting, "but how am I supposed to keep track of the type otherwise, without the manifest type declaration"? Don't. The code that uses your list of windows doesn't care that your list of windows is a list of windows. It just cares that it can iterate over the list of windows, and thus obtain values that can be used in certain ways that are associated with windows. That's how duck typing works: stop thinking about what the object is, and worry about what it can do.
You'll notice in the code below that I put the string conversion code into the House and Window constructors themselves. This serves as a primitive form of type-checking, and also makes sure that we can't forget to do the conversion. If someone tries to create a House with an ID that can't even be converted to a string, then it will raise an exception. Easier to ask for forgiveness than permission, after all. (Note that you actually have to go out of your way a bit in Python to create
As for the actual iteration... in Python, we iterate by actually iterating over the objects in a container. Java and C# have this concept as well, and you can get at it with the C++ standard library too (although a lot of people don't bother). We don't iterate over indices, because it's a useless and distracting indirection. We don't need to number our "windows_per_house" values in order to use them; we just need to look at each value in turn.
How about the ID numbers, I hear you ask? Simple. Python provides us with a function called 'enumerate', which gives us (index, element) pairs given an input sequence of elements. It's clean, it lets us be explicit about our need for the index to solve the problem (and the purpose of the index), and it's a built-in that doesn't need to be interpreted like the rest of the Python code, so it doesn't incur all that much overhead. (When memory is a concern, it's possible to use a lazy-evaluation version instead.)
But even then, iterating to create each house, and then manually appending each one to an initially-empty list, is too low-level. Python knows how to construct a list of values; we don't need to tell it how. (And as a bonus, we typically get better performance by letting it do that part itself, since the actual looping logic can now be done internally, in native C.) We instead describe what we want in the list, with a list comprehension. We don't have to walk through the steps of "take each window-count in turn, make the corresponding house, and add it to the list", because we can say "a list of houses with the corresponding window-count for each window-count in this input list" directly. That's arguably clunkier in English, but much cleaner in a programming language like Python, because you can skip a bunch of the little words, and you don't have to expend effort to describe the initial list, or the act of appending the finished houses to the list. You don't describe the process at all, just the result. Made-to-order.
Finally, as a general programming concept, it makes sense, whenever possible, to delay the construction of an object until we have everything ready that's needed for that object's existence. "Two-phase construction" is ugly. So we make the windows for a house first, and then the house (using those windows). With list comprehensions, this is simple: we just nest the list comprehensions.
class House(object):
    __slots__ = ['ID', 'windows']
    def __init__(self, id, windows):
        self.ID = str(id)
        self.windows = windows
class Window(object):
    __slots__ = ['ID']
    def __init__(self, id):
        self.ID = str(id)
windows_per_house = [1, 3, 2, 1]
# Build the houses.
houses = [
    House(house_id, [Window(window_id) for window_id in range(window_count)])
    for house_id, window_count in enumerate(windows_per_house)
]
# See how elegant the list comprehensions are?
# If you didn't quite follow the logic there, please try **not**
# to imagine the implicitly-defined process as you trace through it.
# (Pink elephants, I know, I know.) Just understand what is described.
# And now we can iterate and print just as before.
for house in houses:
    print("House: " + house.ID)
    for window in house.windows:
        print(" Window: " + window.ID)

Apart from some indentation errors, you're assigning the IDs and window_lists to the class and not the instances.
You want something like
class House():
    def __init__(self, ID):
        self.ID = ID
        self.window_list = []
etc.
Then, you can do house_list.append(House(str(new_house))) and so on.
