how to get class instances out of a dict in python - python

I have a class `Collection` that looks like this:
from pymongo import MongoClient

class Collection():
    def __init__(self, db, collection_name):
        self.db = db
        self.collection_name = collection_name
        # create one shared client on the class the first time an instance is built
        if not hasattr(self.__class__, 'client'):
            self.__class__.client = MongoClient()
        self.data_base = getattr(self.client, self.db)
        self.collection = getattr(self.data_base, self.collection_name)
    def getCollectionKeys(self):
        ....etc.
I cleverly created a function to create a dictionary of class instances as follows:
def getCollections():
    collections_dict = {}
    for i in range(len(db_collection_names)):
        collections_dict[db_collection_names[i]] = Collection(database_name, db_collection_names[i])
    return collections_dict
It works. However, whenever I want to access a class instance, I have to go through the dictionary:
agents_keys = collections_dict['agents'].getCollectionKeys()
I would love to just write:
agents_keys = agents.getCollectionKeys()
Is there a simple way to get those instances "out" of the dict?

You can get a reference to the items in a vanilla Python dictionary by iterating over it in a for loop, or by using a list comprehension.
agent_keys = [x.getCollectionKeys() for x in collections_dict.values()]
or like this:
agent_keys = []
for name in db_collection_names:
    # do something with individual item
    # there could also be some logic here about which keys to append
    agent_keys.append(collections_dict[name].getCollectionKeys())
# now agent_keys is full of all the keys
Here is my mental model of how objects are handled in Python. Feel free to edit if you actually know how it works.
You cannot "take" items out of the dictionary per se. Deleting an entry with the del statement (del collections_dict['agents']) only removes the dictionary's reference to the object; likewise, del on a variable removes the association between a name (what you type in the editor, like "foo" and "bar") and an object (the actual collection of bits in the program that your machine sees). What you can do is get a reference to the object, which in Python is, for all intents and purposes, the object you want.
The dictionary just holds a bunch of references to your database objects.
The expression collections_dict['agents'] evaluates to the very same database object that you put into the dictionary like this:
collections_dict['agents'] = my_particular_object
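To get the shorter spelling the question asks for, it is usually enough to bind a plain name to the dictionary value, since both then refer to the same object. A minimal sketch, assuming a stand-in `Collection` class (the real one wraps MongoClient) and made-up collection names:

```python
class Collection:
    """Hypothetical stand-in for the question's Mongo-backed class."""

    def __init__(self, db, collection_name):
        self.db = db
        self.collection_name = collection_name

    def getCollectionKeys(self):
        # The real method would query MongoDB; this just returns a label.
        return [self.collection_name + "_key"]


db_collection_names = ["agents", "orders"]  # assumed names, for illustration only
collections_dict = {name: Collection("mydb", name) for name in db_collection_names}

# Bind a local name to the instance stored in the dict.
agents = collections_dict["agents"]
agents_keys = agents.getCollectionKeys()

# Both names refer to the same object; nothing was copied or removed.
same_object = agents is collections_dict["agents"]
```

The instance stays in the dictionary, so looping over all collections still works alongside the short names.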

Related

Python instance variable hiding class variables

class Arrival(db.Document):
    ...
    articles = db.ListField(db.EmbeddedDocumentField(ArticlesArrival))
    articles_ncmd = db.ListField(db.EmbeddedDocumentField(ArticlesArrival))
    ...
    def __repr__(self):
        return '<Arrival {}>'.format(self.code)
    def __unicode__(self):
        return self.code
    def modify_article_arrival(self, data, item_id, index):
        index_dict = {'ncmd': self.articles_ncmd, 'cmd': self.articles}
        item_to_update = next(i for i in index_dict[index] if i._id == ObjectId(item_id))
I have a mongodb collection represented by model Arrival. I have two Embedded documents represented by articles and articles_ncmd which contain similar data but whose context of use is different. The modify_article_arrival method allows me to update an element of one of the two Embedded documents according to the id of the element (item_id) and its index. The index can be either cmd or ncmd and is used to determine which of the two Embedded documents to manipulate.
Currently I am storing the references to the two embedded documents in a dictionary keyed by the two indexes (index_dict). I would like to make this dictionary available to the whole class so that other methods can use it. The problem is that if I put it in the __init__ method, the fields of my model defined as class variables are no longer accessible.
def __init__(self):
    self.index = {'ncmd': self.articles_ncmd, 'cmd': self.articles}
def modify_article_arrival(self, data, item_id, index):
    item_to_update = next(i for i in self.index[index] if i._id == ObjectId(item_id))
So my question is: what is the most pythonic strategy to achieve this?
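One possible approach, offered only as a sketch, is to expose the mapping as a property so that __init__ does not need to be overridden at all. The example below uses a plain class with list attributes standing in for the Document model and its fields; the `find_article` helper and the dict-shaped items are assumptions made for illustration:

```python
class Arrival:
    """Plain-Python stand-in; the real model subclasses db.Document."""

    def __init__(self):
        # Stand-ins for the two embedded-document list fields.
        # In the real model these come from the class-level field definitions.
        self.articles = [{"_id": "a1", "qty": 1}]
        self.articles_ncmd = [{"_id": "n1", "qty": 5}]

    @property
    def index(self):
        # Built on each access, so it always reflects the current field values.
        return {"ncmd": self.articles_ncmd, "cmd": self.articles}

    def find_article(self, item_id, index):
        # Hypothetical helper mirroring the lookup in modify_article_arrival.
        return next(i for i in self.index[index] if i["_id"] == item_id)


arrival = Arrival()
found = arrival.find_article("n1", "ncmd")
```

Because the property lives on the class, every method can use self.index without the model's own initialisation being replaced.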

Method __init__ has too many parameters

I'm super new to Python (I started about 3 weeks ago) and I'm trying to make a script that scrapes web pages for information. After it has retrieved the information, it runs it through a function to format it and then passes it to a class that takes 17 variables as parameters. The class uses this information to calculate some other variables and currently has a method to construct a dictionary. The code works as intended, but a plugin I'm using with PyCharm called SonarLint flags 17 variables as too many to use as parameters.
I've had a look for alternate ways to pass the information to the class, such as in a tuple or a list but couldn't find much information that seemed relevant. What's the best practice for passing many variables to a class as parameters? Or shouldn't I be using a class for this kind of thing at all?
I've reduced the number of variables and the amount of code for legibility, but here is the class:
class GenericEvent:
    def __init__(self, type, date_scraped, date_of_event, time, link,
                 blurb):
        countdown_delta = date_of_event - date_scraped
        countdown = countdown_delta.days
        if countdown < 0:
            has_passed = True
        else:
            has_passed = False
        self.type = type
        self.date_scraped = date_scraped
        self.date_of_event = date_of_event
        self.time = time
        self.link = link
        self.countdown = countdown
        self.has_passed = has_passed
        self.blurb = blurb
    def get_dictionary(self):
        event_dict = {}
        event_dict['type'] = self.type
        event_dict['scraped'] = self.date_scraped
        event_dict['date'] = self.date_of_event
        event_dict['time'] = self.time
        event_dict['url'] = self.link
        event_dict['countdown'] = self.countdown
        event_dict['blurb'] = self.blurb
        event_dict['has_passed'] = self.has_passed
        return event_dict
I've been passing the variables as key:value pairs to the class after I've cleaned up the data the following way:
event_info = GenericEvent(type="Lunar"
                          date_scraped=30/01/19
                          date_of_event=28/07/19
                          time=12:00
                          link="www.someurl.com"
                          blurb="Some string.")
and retrieving a dictionary by calling:
event_info.get_dictionary()
I intend to add other methods to the class to be able to perform other operations too (not just to create 1 dictionary) but would like to resolve this before I extend the functionality of the class.
Any help or links would be much appreciated!
One option is a named tuple:
from typing import Any, NamedTuple

class GenericEvent(NamedTuple):
    type: Any
    date_scraped: Any
    date_of_event: Any
    time: Any
    link: str
    blurb: str

    @property
    def countdown(self):
        # derived from the stored dates, so it is not declared as a field
        countdown_delta = self.date_of_event - self.date_scraped
        return countdown_delta.days

    @property
    def has_passed(self):
        return self.countdown < 0

    def get_dictionary(self):
        return {
            **self._asdict(),
            'countdown': self.countdown,
            'has_passed': self.has_passed,
        }
(Replace the Anys with the fields’ actual types, e.g. datetime.datetime.)
Or, if you want it to be mutable, a data class.
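The data class variant might look roughly like the sketch below. It assumes the two date fields are datetime.date values and the rest are strings; those types are guesses, since the question does not state them.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class GenericEvent:
    type: str
    date_scraped: date
    date_of_event: date
    time: str
    link: str
    blurb: str

    @property
    def countdown(self) -> int:
        # Days between scraping and the event; negative once the event is over.
        return (self.date_of_event - self.date_scraped).days

    @property
    def has_passed(self) -> bool:
        return self.countdown < 0

    def get_dictionary(self) -> dict:
        # asdict() covers the declared fields; the derived values are added by hand.
        return {**asdict(self), "countdown": self.countdown, "has_passed": self.has_passed}


event = GenericEvent(
    type="Lunar",
    date_scraped=date(2019, 1, 30),
    date_of_event=date(2019, 7, 28),
    time="12:00",
    link="www.someurl.com",
    blurb="Some string.",
)
event_dict = event.get_dictionary()
```

Unlike the named tuple, fields can be reassigned after construction, and the derived values follow because they are computed on access.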
I don't think there's anything wrong with what you're doing. You could, however, take your parameters in as a single dict object, and then deal with them by iterating over the dict or doing something explicitly with each one. Seems like that would, in your case, make your code messier.
Since all of your parameters to your constructor are named parameters, you could just do this:
def __init__(self, **params):
This would give you a dict named params that you could then process. The keys would be your parameter names, and the values the parameter values.
If you aligned your param names with what you want the keys to be in your get_dictionary method's return value, saving off this parameter as a whole could make that method trivial to write.
Here's an abbreviated version of your code (with a few syntax errors fixed) that illustrates this idea:
from pprint import pprint

class GenericEvent:
    def __init__(self, **params):
        pprint(params)

event_info = GenericEvent(type="Lunar",
                          date_scraped="30/01/19",
                          date_of_event="28/07/19",
                          time="12:00",
                          link="www.someurl.com",
                          blurb="Some string.")
Result:
{'blurb': 'Some string.',
'date_of_event': '28/07/19',
'date_scraped': '30/01/19',
'link': 'www.someurl.com',
'time': '12:00',
'type': 'Lunar'}
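Taking the idea one step further, saving the keyword arguments as a whole makes get_dictionary a one-liner. This is only a sketch: the `params` attribute name is an assumption, and the dictionary keys now follow the parameter names rather than the 'scraped'/'date'/'url' keys used in the question.

```python
class GenericEvent:
    def __init__(self, **params):
        # Keep the keyword arguments as a dict on the instance.
        self.params = dict(params)

    def get_dictionary(self):
        # Return a copy so callers cannot modify the stored values by accident.
        return dict(self.params)


event_info = GenericEvent(type="Lunar",
                          link="www.someurl.com",
                          blurb="Some string.")
event_dict = event_info.get_dictionary()
```

The trade-off is that nothing checks for missing or misspelled parameter names, which an explicit signature would catch.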

Python Dict Switcher results in memory leak

I have read extensively about immutable and mutable objects in Python for a couple of months now, and I am beginning to understand the concept. Still, I cannot work out why my code below leaks memory. The dicts hold references to immutable records of a specific type. In many cases I receive an update of an existing record; the existing record is then updated only if the two records (oldrecord and newrecord) are not equal. However, I have the feeling that newrecord never gets deleted if oldrecord and newrecord match, although all references appear to cease to exist in that case.
My question:
Is the code below good practice for selecting a reference to a dict based on record type or should I do it differently (e.g. through dictSwitcher)?
class myRecordDicts():
    def __init__(self, type1Dict=dict(), type2Dict=dict(),
                 type3Dict=dict(), type4Dict=dict(), type5Dict=dict(), type6Dict=dict(), type7Dict=dict()):
        self.type1Dict = type1Dict
        self.type2Dict = type2Dict
        self.type3Dict = type3Dict
        self.type4Dict = type4Dict
        self.type5Dict = type5Dict
        self.type6Dict = type6Dict
        self.type7Dict = type7Dict
    def dictSelector(self, record):
        dictSwitcher = {
            myCustomRecordType1().name: self.type1Dict,
            myCustomRecordType2().name: self.type2Dict,
            myCustomRecordType3().name: self.type3Dict,
            myCustomRecordType4().name: self.type4Dict,
            myCustomRecordType5().name: self.type5Dict,
            myCustomRecordType6().name: self.type6Dict,
            myCustomRecordType7().name: self.type7Dict,
        }
        return dictSwitcher.get(record.name)
    def AddRecordToDict(self, newrecord):
        dict = self.dictSelector(newrecord)
        recordID = newrecord.id
        if recordID in dict:
            oldrecord = dict[recordID]
            self.MergeExistingRecords(oldrecord, newrecord)
        else:
            dict[recordID] = newrecord
    def MergeExistingRecords(self, oldrecord, newrecord):
        # Basic Compare function
        oldRecordString = oldrecord.SerializeToString()
        newRecordString = newrecord.SerializeToString()
        # no need to do anything if same length
        if not len(oldRecordString) == len(newRecordString):
            oldrecord.CustomMergeFrom(newrecord)
Well, it always seems to go like this: I worked on this problem for hours and could not make progress. Five minutes after formulating the question properly on StackExchange, I found my issue:
I needed to remove the default dict arguments from __init__. Since I never passed dicts when instantiating myRecordDicts(), every instance ended up sharing the same default dict objects, because default argument values are evaluated only once, when the function is defined. The following code does not leak memory:
class myRecordDicts():
    def __init__(self):
        self.type1Dict = dict()
        self.type2Dict = dict()
        self.type3Dict = dict()
        self.type4Dict = dict()
        self.type5Dict = dict()
        self.type6Dict = dict()
        self.type7Dict = dict()
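The underlying behaviour can be reproduced in a few lines. The sketch below uses two made-up classes, `SharedDefault` and `FreshDefault`, to contrast a dict default with the common None-sentinel pattern, which keeps the optional parameter while still giving each instance its own dict:

```python
class SharedDefault:
    def __init__(self, records=dict()):
        # The default dict is created once, when the function is defined,
        # and reused by every call that omits the argument.
        self.records = records


class FreshDefault:
    def __init__(self, records=None):
        # A new dict is created per instance unless one is passed in.
        self.records = {} if records is None else records


a, b = SharedDefault(), SharedDefault()
a.records["id1"] = "record"
shared = a.records is b.records        # both instances hold the same dict

c, d = FreshDefault(), FreshDefault()
c.records["id1"] = "record"
separate = "id1" not in d.records      # d has its own, still empty dict
```

With the shared default, records added through any instance stay reachable for as long as the class exists, which looks like a leak when instances are created and discarded repeatedly.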

Set a nested object property from a dot notation string in python

I'm writing a function that takes in a parent object data and a string inputString that may or may not include dot notation to represent nested objects (e.g. 'nestedObject.itemA'). The function should set the inputString attribute of data to a random string. If inputString refers to a nested object, the function should set the nested object's value to a random string. I can't figure out how to handle this all in a for-loop. I want to do something like this:
split_objects = value.split(".")
for item in split_objects:
    data.__setattr__(item, get_random_string())
However, in the case of nested objects, the above would set the nested object to be a random string, instead of the field inside. Would someone be able to help me with the syntax to handle both cases? Thanks in advance…
You need to get a reference to data.nestedObject before you can use setattr to change data.nestedObject.itemA.
prefix, suffix = value.rsplit(".", 1)
# now prefix is nestedObject and suffix is itemA
ref = getattr(data, prefix)
setattr(ref, suffix, get_random_string())
You need to get the reference as many times as there are dots in inputString. So, if you have an arbitrarily deeply nested structure in data:
value = "nestedObject.nestedObject2.nestedObject3.itemA"
path, attribute = value.rsplit(".", 1)
path = path.split(".")
ref = data
while path:
    element, path = path[0], path[1:]
    ref = getattr(ref, element)
setattr(ref, attribute, get_random_string())
Here is some example code (written for Python 2) that demos a "setField" function I wrote, which is similar to what you are looking for:
def setField(obj, fieldPath, value):
    fields = fieldPath.split(".")
    cur = obj
    # use all but the last field to traverse the objects
    for field in fields[:-1]:
        cur = getattr(cur, field)
    # use the last field as the property within the object to be overwritten (not traversed)
    setattr(cur, fields[-1], value)

# USE CASE EXAMPLE:
class PrintBase:
    def dump(self, level=0):
        for key, value in vars(self).iteritems():
            print " "*(level*4) + key + ":", value
            if isinstance(value, PrintBase):
                value.dump(level+1)

class BottomObject(PrintBase):
    def __init__(self):
        self.fieldZ = 'bottomX'

class MiddleObject(PrintBase):
    def __init__(self):
        self.fieldX = 'middleQ'
        self.fieldY = BottomObject()

class TopObject(PrintBase):
    def __init__(self):
        self.fieldA = 'topA'
        self.fieldB = MiddleObject()

top_obj = TopObject()
print "=== BEFORE ==="
top_obj.dump()
print "=== AFTER ==="
setField(top_obj, 'fieldB.fieldY.fieldZ', '!!!! test value !!!!')
top_obj.dump()
And here is the example output:
=== BEFORE ===
fieldB: <__main__.MiddleObject instance at 0x7f5eb1cc6b48>
    fieldX: middleQ
    fieldY: <__main__.BottomObject instance at 0x7f5eb1cc6b90>
        fieldZ: bottomX
fieldA: topA
=== AFTER ===
fieldB: <__main__.MiddleObject instance at 0x7f5eb1cc6b48>
    fieldX: middleQ
    fieldY: <__main__.BottomObject instance at 0x7f5eb1cc6b90>
        fieldZ: !!!! test value !!!!
fieldA: topA
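For Python 3, the same traversal can be written compactly with functools.reduce, and it covers both cases from the question (with and without dots). This is a sketch under assumptions: `set_field` is a made-up name, and SimpleNamespace objects stand in for the real data object.

```python
import functools
from types import SimpleNamespace


def set_field(obj, field_path, value):
    # Everything before the last dot is a chain of attributes to walk through;
    # the last component is the attribute that gets overwritten.
    *parents, last = field_path.split(".")
    # With no dots, parents is empty and reduce() simply returns obj.
    target = functools.reduce(getattr, parents, obj)
    setattr(target, last, value)


data = SimpleNamespace(name="old", nestedObject=SimpleNamespace(itemA="old"))
set_field(data, "name", "new")
set_field(data, "nestedObject.itemA", "new-nested")
```

A missing intermediate attribute raises AttributeError from getattr, which is probably the desired behaviour for a mistyped path.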

Accessing elements of lists, and calling their functions

Here is Customer class:
class Customer:
    def __init__(self, timestamp, cid, item_count):
        self.time_stamp = timestamp
        self.customer_name = cid
        self.item_count = item_count
    def checkout(self, new_timestamp):
        self.time_stamp = new_timestamp
    def get_cus_name(self):
        return self.customer_name
If I create an empty list of Customer objects like:
customers = [Customer]
And then somewhere else I try to call Customer methods in a loop like:
def checkout_customer(self, cid):
    for cus in self.customers:
        if cus.get_cus_name == cid:
            cus.checkout(self.cur_num_customers + 7)
Why do I get an error when I try to call cus.checkout? My IDE tells me that it expects a Customer but got an int. Why doesn't it pass itself into the 'self' arg here?
However if I just create a Customer object and directly call its methods, it works fine:
def foo(self):
    cus = Customer(1,'pop',2)
    cus.checkout(23)
This is my first time learning Python, and I've been stuck trying to figure out lists and how to access their members. Perhaps my initialization, self.customers = [Customer], is incorrect?
EDIT:
In my constructor of tester class I create an empty list like this:
self.customer = [Customer]
I am able to add customers no problem:
def add_custormer(self, customer):
    self.customers.append(customer)
My problem is not adding customers, but accessing their methods once they are in a list. Doing something like this self.customers[0].checkout(1,'pop',2) gives me an error "Expected type 'Customer' got int".
I am not sure which class checkout_customer lives in, but I am assuming you declare the list self.customers somewhere in it. Note that [Customer] is not an empty list: it is a list whose single element is the Customer class itself. An empty list is declared like this:
self.customers = []
If you intend to add a Customer to the list, you should use something like self.customers.append(Customer(x,y,z)), since you want to add a new customer and doing so requires instantiating the Customer class.
I didn't try the code, but I believe something like this should work:
def foo(self):
    self.customers.append(Customer(1,'pop',2))
    # check out the customer by name, which is what checkout_customer compares against
    self.checkout_customer('pop')
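The difference between a list holding the class and a list holding instances can be seen in a short, self-contained sketch. The Customer class is copied from the question; the variable names `not_empty` and `error` are made up for the demonstration.

```python
class Customer:
    def __init__(self, timestamp, cid, item_count):
        self.time_stamp = timestamp
        self.customer_name = cid
        self.item_count = item_count

    def checkout(self, new_timestamp):
        self.time_stamp = new_timestamp

    def get_cus_name(self):
        return self.customer_name


not_empty = [Customer]      # one element: the class object itself
customers = []              # a genuinely empty list
customers.append(Customer(1, "pop", 2))

for cus in customers:
    # Note the parentheses: without them the bound method is compared, not the name.
    if cus.get_cus_name() == "pop":
        cus.checkout(8)

# Calling through the class puts the int where `self` is expected,
# which leaves new_timestamp without a value.
try:
    not_empty[0].checkout(8)
    error = None
except TypeError as exc:
    error = exc
```

This matches the IDE warning in the question: with the class in the list, the first argument lands in the self slot, so a Customer is expected where an int was given.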
