How to directly access class instances through a class dictionary in Python

I need a way of accessing directly an instance of a class through an ID number.
As I tried to explain here, I am importing a .csv file and I want to create an instance of my class Person() for every line in the .csv file, plus I want to be able to directly access these instances using as a key a unique identifier, already present in the .csv file.
What I have done so far, thanks to the help of user433831, is this:
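from sys import argv
from csv import DictReader
from person import Person
def generateAgents(filename):
    # newline='' is what the csv module expects on Python 3; the old 'rU' mode was removed in 3.11
    with open(filename, newline='') as csvfile:
        persons = [Person(**entry) for entry in DictReader(csvfile)]
    return persons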
where person is just the module where I define the class Person() as:
class Person:
    def __init__(self, **kwds):
        self.__dict__.update(kwds)
Now I have a list of my person instances, namely persons, which is already something neat.
But now I need to create a network among these persons, using the networkx module, and I definitely need a way to access directly every person (at present my instances don't have any name).
For example, every person has an attribute called "father_id", which is the unique ID of the father. Persons who have no father alive in the current population have a "father_id" equal to "-1".
Now, to link every person to his/her father, I'd do something like:
import networkx as nx
G = nx.Graph()
for person in persons:
    G.add_edge(person, person_with_id_equal_to_father_id)
My problem is that I am unable to access directly this "person_with_id_equal_to_father_id".
Keep in mind that I will need to do this direct access many many times, so I would need a pretty efficient way of doing it, and not some form of searching in the list (also considering that I have around 150000 persons in my population).
It would be great to implement something like a dictionary feature in my class Person(), with the key of every instance being a unique identifier. This unique identifier is already present in my csv file and therefore I already have it as an attribute of every person.
Thank you for any help, as always greatly appreciated. Also please keep in mind I am a total python newbie (as you can probably tell... ;) )

Simply use a dictionary:
persons_by_id = {p.id: p for p in persons}
Dict comprehensions require Python 2.7 or later. If your version doesn't support this syntax, use the following:
persons_by_id = dict((p.id, p) for p in persons)
Having done either of the above, you can locate the person by their id like so:
persons_by_id[id]
The networkx example becomes:
import networkx as nx
G = nx.Graph()
for person in persons:
    # DictReader yields every field as a string, so compare against "-1", not the integer -1
    if person.father_id != "-1":
        G.add_edge(person, persons_by_id[person.father_id])
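To make the lookup concrete, here is a minimal, self-contained sketch of the whole flow. It assumes the CSV has columns named id and father_id (the question only describes father_id and the "-1" convention; the id column name and the sample rows are made up), and it collects the edges in a plain list instead of a networkx graph so that it runs without extra dependencies.

```python
import csv
import io


class Person:
    def __init__(self, **kwds):
        self.__dict__.update(kwds)


# Hypothetical sample data; real code would open the .csv file instead.
SAMPLE_CSV = """id,name,father_id
1,Adam,-1
2,Ben,1
3,Carl,1
4,Dan,2
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
persons = [Person(**row) for row in reader]

# One pass to build the index; each later lookup is a single dict access.
persons_by_id = {p.id: p for p in persons}

# DictReader yields strings, so the "no father" marker is the string "-1".
edges = [
    (p, persons_by_id[p.father_id])
    for p in persons
    if p.father_id != "-1"
]

for child, father in edges:
    print(child.name, "->", father.name)
```

With networkx, each (child, father) pair could then be passed to G.add_edge(child, father) as in the loop above.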

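Here's an efficient way to do what @Toote suggests:
def generateAgents(filename):
    # newline='' is what the csv module expects on Python 3; the old 'rU' mode was removed in 3.11
    with open(filename, newline='') as csvfile:
        reader = DictReader(csvfile)
        # Key each Person by its own unique ID (assumed here to be the 'id' column), not by father_id
        persons = dict((entry['id'], Person(**entry)) for entry in reader)
    return persons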


Creating Object With A For Loop

Firstly, I do apologise as I'm not quite sure how to word this query within the Python syntax. I've just started learning it today having come from a predominantly PowerShell-based background.
I'm presently trying to obtain a list of projects within our organisation within Google Cloud. I want to display this information in two columns: project name and project number - essentially an object. I then want to be able to query the object to say: where project name is "X", give me the project number.
However, I'm rather having difficulty in creating said object. My code is as follows:
import os
from pprint import pprint
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
credentials = GoogleCredentials.get_application_default()
service = discovery.build('cloudresourcemanager', 'v1', credentials=credentials)
request = service.projects().list()
response = request.execute()
projects = response.get('projects')
The 'projects' variable then seems to be a list, rather than an object I can explore and run queries against. I've tried running things like:
pprint(projects.name)
projects.get('name')
Both of which return the error:
"AttributeError: 'list' object has no attribute 'name'"
I looked into creating a Class within a For loop as well, which nearly gave me what I wanted, but only displayed one project name and project number at a time, rather than the entire collection I can query against:
projects = []
for project in response.get('projects', []):
    class ProjectClass:
        name = project['name']
        projectNumber = project['projectNumber']
    projects.append(ProjectClass.name)
    projects.append(ProjectClass.projectNumber)
I thought if I stored each class in a list it might work, but alas, no such joy! Perhaps I need to have the For loop within the class variables?
Any help with this would be greatly appreciated!
As @Code-Apprentice mentioned in a comment, I think you are missing a critical understanding of object-oriented programming, namely the difference between a class and an object. Think of a class as a "blueprint" for creating objects. That is, your class ProjectClass tells Python that objects of type ProjectClass will have two fields, name and projectNumber. However, ProjectClass itself is just the blueprint, not an object. You then need to create an instance of ProjectClass, which you would do like so:
project_class_1 = ProjectClass()
Great, now you have an object of type ProjectClass, and it will have fields name and projectNumber, which you can reference like so:
project_class_1.name
project_class_1.projectNumber
However, you will notice that all instances of the class that you create will have the same value for name and projectNumber, because those values were defined on the class itself. This just won't do! We need to be able to specify values when we create each instance. Enter __init__(), a special Python method colloquially referred to as the constructor. Python calls it automatically when we create a new instance of our class, and it is responsible for setting up the fields of that instance. Another powerful feature of classes and objects is that you can define a collection of different functions that can be called at will.
class ProjectClass:
    def __init__(self, name, projectNumber):
        self.name = name
        self.projectNumber = projectNumber
Much better. But wait, what's that self variable? Well, just as before we were able to reference the fields of our instance via the "project_class_1" variable name, we need a way to access the fields of our instance when we're running functions that are a part of that instance, right? Enter self. self is not a builtin or a keyword; it is the conventional name for the first parameter of a method, and Python fills it in with a reference to the instance of ProjectClass that the method was called on. That way, we can set fields on the instance that will persist, but not be shared or overwritten by other instances of ProjectClass. It's important to remember that the first parameter of any ordinary method defined on a class will always be the instance, named self by convention (except for some edge cases you don't need to worry about now).
So restructuring your code, you would have something like this:
class ProjectClass:
    def __init__(self, name, projectNumber):
        self.name = name
        self.projectNumber = projectNumber
projects = []
for project in response.get('projects', []):
    projects.append(ProjectClass(project["name"], project["projectNumber"]))
Hopefully I've explained this well and given you a complete answer on how all these pieces fit together. The goal is for you to be able to write that code on your own, not just to hand you the answer!
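As a follow-up sketch for the original goal (given a project name, get the project number), the instances can also be indexed in a dictionary keyed by name. The response dict below is made-up sample data standing in for the API result; it only assumes the name and projectNumber keys used in the question.

```python
class ProjectClass:
    def __init__(self, name, projectNumber):
        self.name = name
        self.projectNumber = projectNumber


# Made-up stand-in for the API response from the question.
response = {
    "projects": [
        {"name": "alpha", "projectNumber": "1001"},
        {"name": "beta", "projectNumber": "1002"},
    ]
}

projects = [
    ProjectClass(p["name"], p["projectNumber"])
    for p in response.get("projects", [])
]

# Index the instances by name so "where name is X" is a single lookup.
projects_by_name = {p.name: p for p in projects}

print(projects_by_name["beta"].projectNumber)
```

If two projects could share a name, the later one would overwrite the earlier one in projects_by_name, so this only fits when names are unique.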

I can't think of a way to avoid dynamic variables in this case

EDIT BELOW
I have read a lot of discussion about dynamic variables in Python, and much of it shows that they are something you generally shouldn't use. However, I have a case where I really can't think of a way to avoid them (hence why I am here).
I am currently reading in a text file that stores the information of different members of a store. Each member has their own information, like their phone number, email, points in their account, etc. I want to create a class and objects that store this information. Don't worry, this is just part of an assignment and they are not real people. Here is the sample code:
class Member:
    def __init__(self, name, phoneNumber, email, points):
        self.name = name
        self.phoneNumber = phoneNumber
        self.email = email
        self.points = points
        self.totalPointsSpent = 0
    # There are methods below that return some calculated results, like total points
    # spent or their ranking. I will not show those here as they are irrelevant.
Each member in the file will be read in and an object created from it. For example, if there are 5 members in that file, five objects will be created, and I want to name them member1, member2, etc. However, this will be confusing when accessing them, as it will be hard to tell them apart, so I add them to a dictionary, with their memberID as the key:
dictionary[memberID] = member1
dictionary[memberID] = member2 #and so on
which would result in a dictionary that look like something like this:
dictionary = {'jk1234':member1,'gh5678':member2,...#etc}
The interesting thing about Python is that a dictionary value does not need to be a simple value; it can be an object. This is something new to me, coming from Java.
However, here is the first problem. I do not know how many members there are in a file, or I should say, the number of members varies from file to file. If the number is 5, I need 5 variables; if there are 8, I need 8, and so on. I have thought of using a while loop, and the code would look something like this:
a = len(#the number of members in the file.)
i = 0
while i <= a:
    member + i = Member(#pass in the information)
but the '+' operator only works for combining strings, not for building identifiers, and thus this cannot work (I think).
Many solutions I have read indicate that I should use a dictionary or a list in such a case. However, since my variables point to class objects, I cannot think of a way to use a list/dictionary as the implementation.
My current solution is to use a tuple to store the members information, and do the calculations elsewhere (i.e. I took out the class methods and defined them as functions) so that the dictionary looks like this:
dictionary = {'jk1234': (name, phoneNumber, email, points),'gh5678':(name, phoneNumber, email, points),...#etc}
However, given the amount of information I need to pass in, this is less than ideal. My current algorithm works, but I want to optimize it and make it more encapsulated.
If you made it this far I appreciate it, and would like to know if there is a better algorithm for this problem. Please excuse my less than a year of Python experience.
EDIT: Okay, I just discovered something really interesting that might seem basic to experienced Python programmers: you can put the objects directly inside the dictionary, instead of naming a variable for each one. For my problem, that would be:
dictionary = {'jk1234': Member(name, phoneNumber, email, points),'gh5678':Member(name, phoneNumber, email, points),...#etc}
#and they are retrievable:
dictionary['jk1234'].get_score() #get_score is a getter function inside the class
And it would return the proper value. This seems to be a good solution. However, I would still love to hear other ways to think about this problem
It looks like you are on the right track with your update.
It's not clear to me if your MemberID is a value for each member, but if it is, you can use it in a loop creating the objects:
d = {}
for member in members_file:
    d[MemberID] = Member(name, phoneNumber, email, points)
This, of course, assumes that MemberID is unique, otherwise you will overwrite existing entries.
You could also use a list instead:
l = []
for member in members_file:
    l.append(Member(name, phoneNumber, email, points))
Further, you could also do the above with list/dict comprehensions which are basically just condensed for-loops.
Dict-comprehension
d = {MemberID: Member(name, phoneNumber, email, points) for member in members_file}
List-comprehension
l = [Member(name, phoneNumber, email, points) for member in members_file]
Whether a list or a dictionary makes more sense is up to your use case.
Also note that the above assumes various things, e.g. on what form you get the data from the file, since you did not provide that information.
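Since the file format was not given, the following is only a sketch under an assumed layout: one member per line, with comma-separated fields in the order memberID, name, phoneNumber, email, points. The sample rows are invented, and an in-memory file stands in for the real one.

```python
import io


class Member:
    def __init__(self, name, phoneNumber, email, points):
        self.name = name
        self.phoneNumber = phoneNumber
        self.email = email
        self.points = points
        self.totalPointsSpent = 0


# Assumed file layout: memberID,name,phoneNumber,email,points (one member per line).
members_file = io.StringIO(
    "jk1234,Jane,555-0100,jane@example.com,120\n"
    "gh5678,Greg,555-0101,greg@example.com,45\n"
)

members = {}
for line in members_file:
    member_id, name, phone, email, points = line.strip().split(",")
    # No per-member variable name is needed; the dictionary key identifies the object.
    members[member_id] = Member(name, phone, email, int(points))

print(members["jk1234"].name, members["jk1234"].points)
```

The loop works for any number of lines, which is the point: the number of members never has to be known in advance.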
You can create a dictionary with a for loop:
d = {}
for i in the_members_in_the_file:
    d[i] = Member(your, parameters, go, here)
You will get your expected dictionary.

Why shouldn't one dynamically generate variable names in python?

Right now I am learning Python and struggling with a few concepts of OOP. One of them is how difficult it is (to me) to dynamically initialize class instances and assign them to dynamically generated variable names, and why I keep reading that I shouldn't do that in the first place.
In most threads with a similar direction, the answer seems to be that it is un-Pythonic to do that.
For example: "generating variable names on fly in python".
Could someone please elaborate?
Take the typical OOP learning case:
LOE = ["graham", "eric", "terry_G", "terry_J", "john", "carol"]
class Employee():
    def __init__(self, name, job="comedian"):
        self.name = name
        self.job = job
Why is it better to do this:
employees = []
for name in LOE:
    emp = Employee(name)
    employees.append(emp)
and then
for emp in employees:
    if emp.name == "eric":
        print(emp.job)
instead of this
for name in LOE:
    globals()[name] = Employee(name)
and
print(eric.job)
Thanks!
If you dynamically generate variable names, you don't know what names exist, and you can't use them in code.
globals()[some_unknown_name] = Foo()
Well, now what? You can't safely do this:
eric.bar()
Because you don't know whether eric exists. You'll end up having to test for eric's existence using dictionaries/lists anyway:
if 'eric' in globals(): ...
So just store your objects in a dictionary or list to begin with:
people = {}
people['eric'] = Foo()
This way you can also safely iterate one data structure to access all your grouped objects without needing to sort them from other global variables.
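Applied to the Employee example from the question, a minimal sketch of the dictionary approach might look like this (the name "michael" is just a made-up example of a key that is not present):

```python
class Employee:
    def __init__(self, name, job="comedian"):
        self.name = name
        self.job = job


LOE = ["graham", "eric", "terry_G", "terry_J", "john", "carol"]

# One dictionary holds every employee, keyed by name.
employees = {name: Employee(name) for name in LOE}

print(employees["eric"].job)

# A missing name can be handled explicitly instead of raising NameError.
missing = employees.get("michael")
print(missing is None)

# Iterating touches only employees, not unrelated global variables.
names = sorted(employees)
```

This keeps the direct access of eric.job (as employees["eric"].job) without the linear search through a list.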
globals() gives you a dict which you can put names into. But you can equally make your own dict and put the names there.
So it comes down to the idea of "namespaces," that is the concept of isolating similar things into separate data structures.
You should do this:
employees = {}
employees['alice'] = ...
employees['bob'] = ...
employees['chuck'] = ...
Now if you have another part of your program where you describe parts of a drill, you can do this:
drill['chuck'] = ...
And you won't have a name collision with Chuck the person. If everything were global, you would have a problem. Chuck could even lose his job.
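A small sketch of that collision, using placeholder strings as values (they are invented for illustration):

```python
employees = {}
drill = {}

employees["chuck"] = "Chuck, the machinist"
drill["chuck"] = "three-jaw chuck"

# Same key, two separate dictionaries: neither entry overwrites the other.
print(employees["chuck"])
print(drill["chuck"])

# With a single shared namespace the second assignment replaces the first.
shared = {}
shared["chuck"] = "Chuck, the machinist"
shared["chuck"] = "three-jaw chuck"
print(shared["chuck"])
```

The shared dictionary behaves the way globals() would if both parts of the program wrote their names into it.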

Names of instances and loading objects from a database

I have, for example, the following class structure.
class Company(object):
    Companycount = 0
    _registry = {}
    def __init__(self, name):
        Company.Companycount += 1
        self._registry[Company.Companycount] = [self]
        self.name = name
k = Company("a firm")
b = Company("another firm")
Whenever I need the objects I can access them by using
Company._registry
which gives out a dictionary of all instances.
Do I need reasonable names for my objects, given that the name of the company is stored as an attribute and I can iterate over Company._registry?
When loading the data from the database does it matter what the name of the instance (here k and b) is? Or can I just use arbitrary strings?
Both your Company._registry and the names k and b are just references to your actual instances. Neither plays any role in what you'd store in the database.
Python's object model has all objects living on a big heap, and your code interacts with the objects via such references. You can make as many references as you like, and objects automatically are deleted when there are no references left. See the excellent Facts and myths about Python names and values article by Ned Batchelder.
You need to decide for yourself whether the Company._registry structure needs to be keyed by name. If you already know the name of the company you want, iterating over a list to find it takes time proportional to the number of companies, whereas a dictionary keyed by name gives you direct access.
If you are going to use an ORM, then you don't really need that structure anyway. Leave it to the ORM to help you find your objects, or give you a sequence of all objects to iterate over. I recommend using SQLAlchemy for this.
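If you do keep a registry, one possible variation (a sketch, not the structure from the question) is to key it by company name, assuming names are unique:

```python
class Company:
    _registry = {}  # class-level dict shared by all instances

    def __init__(self, name):
        self.name = name
        Company._registry[name] = self


Company("a firm")
Company("another firm")

# No variable names such as k or b are needed to find an instance again.
firm = Company._registry["another firm"]
print(firm.name)

for company in Company._registry.values():
    print(company.name)
```

Note that the registry itself holds a reference to every instance, so the objects stay alive for as long as they remain in the dictionary.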
The name doesn't matter, but if you are going to initialize a lot of objects you will still want to keep the names reasonable somehow.

Unpickle class instances by iterating over a list

I am trying to unpickle various class instances which are saved in separate .pkl files by iterating over a list containing all the class instances (each class instance appends itself to the appropriate list when instantiated).
This works:
# LOAD IN INGREDIENT INSTANCES
for each in il:
    with open('Ingredients/{}.pkl'.format(each), 'rb') as f:
        globals()[each] = pickle.load(f)
For example, one ingredient is Aubergine:
print(Aubergine)
output:
Name: Aubergine
Price: £1.00
Portion Size: 1
However, this doesn't work:
# LOAD IN RECIPE INSTANCES
for each in rl:
    with open('Recipes/{}.pkl'.format(each.name), 'rb') as f:
        globals()[each] = pickle.load(f)
I can only assume that the issue stems from each.name being used for the file names of the recipes, whereas each is used for the ingredient file names. This is intentional, however, as the name attribute of the recipes is formatted for the end-user (i.e. contains white space etc.) I think this may be the issue, but I am not sure.
Both the ingredient and recipe classes use:
def __repr__(self):
    return self.name
For example:
I have a recipe class instance SausageAubergineRagu, for which self.name is 'Sausage & Aubergine Ragu', and this is inside the list rl. I have tried testing this individually:
input:
rl
output:
[Sausage & Aubergine Ragu]
So I believe that this code:
# LOAD IN RECIPE INSTANCES
for each in rl:
    with open('Recipes/{}.pkl'.format(each.name), 'rb') as f:
        globals()[each] = pickle.load(f)
...should result in this:
with open('Recipes/Sausage & Aubergine Ragu.pkl', 'rb') as f:
    globals()[SausageAubergineRagu] = pickle.load(f)
But attempting to access the recipe class instances results in a NameError.
One final note - please don't ask why I am doing things this way. Instead help me to address and solve the problem, so I can make it work, and understand what is going on. Appreciated :)
The NameError you are getting is Python telling you that you are trying to use a variable that hasn't been defined yet.
You aren't defining SausageAubergineRagu before you use it in this line:
globals()[SausageAubergineRagu] = pickle.load(f)
In your first example, you are adding keys and values to globals. You are using the items of il (each) as keys, and the unpickled objects as values.
In your second example, you are attempting to do the same thing, but instead of using instances of recipes (each) as keys, you are using SausageAubergineRagu, which is undefined.
How is Python supposed to know what SausageAubergineRagu is? If you want that line to work, you will need to define it first, or use something that is already defined, like each, which is what you do in your other snippet.
Honestly, using instances of custom classes as keys in globals seems bizarre to me anyway (usually people use strings), but since you apparently want to make it work, the answer is simple:
Define SausageAubergineRagu before attempting to use it as a key in a dictionary.
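For comparison, here is a sketch of the more usual string-keyed approach: load every file into one ordinary dictionary keyed by the display name. globals()[key] only creates a usable variable when key is a string that is also a valid identifier, which 'Sausage & Aubergine Ragu' is not, so a regular dictionary is the simpler fit. The recipe names and the temporary folder are made up, and plain dicts stand in for the recipe instances to keep the example self-contained.

```python
import os
import pickle
import tempfile

# Hypothetical recipe names; plain dicts stand in for the recipe instances.
recipe_names = ["Sausage & Aubergine Ragu", "Tomato Soup"]

with tempfile.TemporaryDirectory() as folder:
    # Write one .pkl file per recipe so the sketch has something to load.
    for name in recipe_names:
        with open(os.path.join(folder, "{}.pkl".format(name)), "wb") as f:
            pickle.dump({"name": name}, f)

    # Load every file into one dictionary keyed by the display name.
    recipes = {}
    for name in recipe_names:
        with open(os.path.join(folder, "{}.pkl".format(name)), "rb") as f:
            recipes[name] = pickle.load(f)

print(recipes["Sausage & Aubergine Ragu"]["name"])
```

Each recipe is then reachable as recipes['Sausage & Aubergine Ragu'] without needing a variable named SausageAubergineRagu to exist.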
