Using multiprocessing module in class - python

I have the following program and I want to use the multiprocessing module. It uses external files: the PSO class is called from another file, costfunc is a function from another file, and the other arguments are just variables.
Swarm is a list containing as many objects as the value of ps, and each object has multiple attributes which need to be updated at every iteration.
Following Hannu's answer, I implemented multiprocessing.Pool and it is working; however, it is taking much more time than running sequentially.
I would appreciate it if you could tell me why this is happening and how I can make it run faster.
# IMPORT PACKAGES -----------------------------------------------------------+
import random
import multiprocessing
from functools import partial
import numpy as np
# IMPORT FILES --------------------------------------------------------------+
from Reducer import initial
# Particle Class ------------------------------------------------------------+
class Particle:
    def __init__(self, D, bounds_x, bounds_v):
        self.Position_i = []        # particle position
        self.Velocity_i = []        # particle velocity
        self.Cost_i = -1            # individual cost
        self.Position_Best_i = []   # best individual position
        self.Cost_Best_i = -1       # best individual cost
        self.Constraint_Best_i = [] # best individual cost constraints
        self.Constraint_i = []      # individual constraints
        self.Penalty_i = -1         # individual penalty
        x0, v0 = initial(D, bounds_x, bounds_v)
        for i in range(0, D):
            self.Velocity_i.append(v0[i])
            self.Position_i.append(x0[i])

    # evaluate current fitness
    def evaluate(self, costFunc, i):
        self.Cost_i, self.Constraint_i, self.Penalty_i = costFunc(self.Position_i, i)
        # check to see if the current position is an individual best
        if self.Cost_i < self.Cost_Best_i or self.Cost_Best_i == -1:
            self.Position_Best_i = self.Position_i
            self.Cost_Best_i = self.Cost_i
            self.Constraint_Best_i = self.Constraint_i
            self.Penalty_Best_i = self.Penalty_i
        return self
def proxy(gg, costf, i):
    return gg.evaluate(costf, i)
# Swarm Class ---------------------------------------------------------------+
class PSO():
    def __init__(self, costFunc, bounds_x, bounds_v, ps, D, maxiter):
        self.Cost_Best_g = -1       # best group cost
        self.Position_Best_g = []   # best group position
        self.Constraint_Best_g = []
        self.Penalty_Best_g = -1
        # Establish the swarm
        Swarm = []
        for i in range(0, ps):
            Swarm.append(Particle(D, bounds_x, bounds_v))
        # Begin optimization loop
        i = 1
        self.Evol = []
        while i <= maxiter:
            pool = multiprocessing.Pool(processes=4)
            results = pool.map_async(partial(proxy, costf=costFunc, i=i), Swarm)
            pool.close()
            pool.join()
            Swarm = results.get()
            for j in range(0, ps):
                if Swarm[j].Cost_i < self.Cost_Best_g or self.Cost_Best_g == -1:
                    self.Position_Best_g = list(Swarm[j].Position_i)
                    self.Cost_Best_g = float(Swarm[j].Cost_i)
                    self.Constraint_Best_g = list(Swarm[j].Constraint_i)
                    self.Penalty_Best_g = float(Swarm[j].Penalty_i)
            self.Evol.append(self.Cost_Best_g)
            i += 1

You need a proxy function to do the actual function call, and as you need to deliver arguments to the function, you will need functools.partial as well. Consider this:
from time import sleep
from multiprocessing import Pool
from functools import partial

class Foo:
    def __init__(self, a):
        self.a = a
        self.b = None

    def evaluate(self, CostFunction, i):
        xyzzy = CostFunction(i)
        sleep(0.01)
        self.b = self.a * xyzzy
        return self

def CostFunc(i):
    return i * i

def proxy(gg, costf, i):
    return gg.evaluate(costf, i)

def main():
    Swarm = []
    for i in range(0, 10):
        nc = Foo(i)
        Swarm.append(nc)
    for i in range(100, 102):
        p = Pool()
        results = p.map_async(partial(proxy, costf=CostFunc, i=i), Swarm)
        p.close()
        p.join()
        Swarm = []
        for a in results.get():
            Swarm.append(a)
    for s in Swarm:
        print(s.b)

main()
This creates a Swarm list of objects, and each of these objects has an evaluate method, which is the function you need to call. Then we have parameters (CostFunc and an integer, as in your code).
We will now use Pool.map_async to map your Swarm list to your pool. This gives each worker one instance of Foo from your Swarm list, and we have a proxy function that then actually calls evaluate().
However, as map_async only sends an object from the iterable to the target function, instead of using proxy directly as the target function for the pool, we use partial to create the target function and pass in the "fixed" arguments.
And as you apparently want to get the modified objects back, this requires another trick. If you modify the target object in a Pool process, it just modifies the local copy, which is thrown away as soon as the processing completes. There would be no way for the subprocess to modify main-process memory anyway (or vice versa); that would cause a segmentation fault.
Instead, after modifying the object, we return self. When your pool has completed its work, we discard the old Swarm and reassemble it from the result objects.
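As for why the pooled version can be slower than running sequentially: every task ships the object to a worker by pickling it and ships the modified object back the same way, so when evaluate() is cheap the serialization round-trip dominates the actual work. A minimal sketch (not the asker's code) that makes the effect visible:

```python
# Hedged sketch: compare a plain loop against Pool.map on a task so cheap
# that pickling each argument and result costs more than the work itself.
from multiprocessing import Pool
import time

def cheap(x):
    return x * x  # far less work than one pickle round-trip

def main():
    data = list(range(50_000))

    t0 = time.perf_counter()
    serial = [cheap(x) for x in data]
    t_serial = time.perf_counter() - t0

    with Pool(4) as p:
        t0 = time.perf_counter()
        parallel = p.map(cheap, data)
        t_parallel = time.perf_counter() - t0

    assert serial == parallel  # same answers, very different cost profiles
    print(f"serial: {t_serial:.4f}s  pool: {t_parallel:.4f}s")

if __name__ == "__main__":
    main()
```

On a typical machine the serial loop wins here; the pool only pays off once each task does enough real computation to amortize the inter-process traffic.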

Related

How to pass arguments to class after initialized?

I'm trying to create threads to run a class method. However, when I try to pass one class to another, it tries to initialize the class and never gets threaded.
I'm taking a list of tuples and trying to pass that list to the cfThread class, along with the class method that I want to use. From there, I'd like to create a separate thread to run the class's method and act on one of the tuples from the list. The REPLACEME is a placeholder because the class is looking for a tuple but I don't have one to pass to it yet. My end goal is to be able to pass a target (class / function) to a thread class that can create its own queue and manage the threads without my having to do it manually.
Below is a simple example to hopefully do a better job of explaining what I'm trying to do.
#!/bin/python3.10
import concurrent.futures

class math:
    def __init__(self, num) -> None:
        self.num = num
    def add(self):
        return self.num[0] + self.num[1]
    def sub(self):
        return self.num[0] - self.num[1]
    def mult(self):
        return self.num[0] * self.num[1]

class cfThread:
    def __init__(self, target, args):
        self.target = target
        self.args = args
    def run(self):
        results = []
        with concurrent.futures.ThreadPoolExecutor(10) as execute:
            threads = []
            for num in self.args:
                result = execute.submit(self.target, num)
                threads.append(result)
            for result in concurrent.futures.as_completed(threads):
                results.append(result)
        return results

if __name__ == '__main__':
    numbers = [(1,2),(3,4),(5,6)]
    results = cfThread(target=math(REPLACEME).add(), args=numbers).run()
    print(results)
target has to be a callable; you want to wrap your call to add in a lambda expression.
results = cfThread(target=lambda x: math(x).add(), args=numbers)
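Putting it together, a runnable sketch of the example with the lambda in place (and with .result() used to pull values out of the completed futures, since as_completed yields Future objects, not results):

```python
import concurrent.futures

class math:  # the question's class, kept as-is; note it shadows the stdlib module name
    def __init__(self, num) -> None:
        self.num = num
    def add(self):
        return self.num[0] + self.num[1]

class cfThread:
    def __init__(self, target, args):
        self.target = target
        self.args = args
    def run(self):
        results = []
        with concurrent.futures.ThreadPoolExecutor(10) as execute:
            threads = [execute.submit(self.target, num) for num in self.args]
            for future in concurrent.futures.as_completed(threads):
                results.append(future.result())  # unwrap the Future
        return results

if __name__ == '__main__':
    numbers = [(1, 2), (3, 4), (5, 6)]
    results = cfThread(target=lambda x: math(x).add(), args=numbers).run()
    print(sorted(results))  # [3, 7, 11]; as_completed gives no ordering guarantee
```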

Python Classes and inheritance. Calling super method returns Error

I have the following Python code. The code analyses a digital signal for amplification processing, but I am having trouble applying inheritance concepts to it.
I have a parent class and a child class, which takes the parent class as a parameter so that it inherits its properties.
But when I try accessing a method within the parent class using OOP inheritance principles, I get an error. I originally tried extending the class without overriding anything, but accessing the dofilter() method through the extended parent class threw an error. However, if I override the dofilter() method in the child class and then use "super" to call it from the parent class, there is no error, but it returns "NaN", which clearly means I am still doing something wrong. I know data is being passed to the objects; there should be no reason for it to return NaN. It has left me at a bit of an impasse. Can anyone please explain why this is not working?
What I have tried/my summarized script, to paint a better picture of the issue:
# -*- coding: utf-8 -*-
"""
Spyder Editor
This is a temporary script file.
"""
import numpy as np
import matplotlib.pyplot as plt
# =============================================================================
# CLASSES
# =============================================================================
class fir_filter:
    def __init__(self, coefficients):
        self.coefficients = coefficients
        self.buffer = np.zeros(taps)
    def dofilter(self, value, offset):
        result = 0
        # Splice coefficients and buffer arrays into smaller arrays
        buffer_newest = self.buffer[0:offset+1]
        buffer_oldest = self.buffer[offset+1:taps]
        coefficients1 = self.coefficients[0:offset+1]
        coefficients2 = self.coefficients[offset+1:taps]
        # Accumulate
        self.buffer[offset] = value
        for i in range(len(coefficients1)):
            result += coefficients1[i]*buffer_newest[offset-1-i]
        for i in range(len(coefficients2), 0, -1):
            result += coefficients2[len(coefficients2)-i] * buffer_oldest[i-1]
        return result

class matched_filter(fir_filter):
    def __init__(self, coefficients):
        self.coefficients = coefficients
    def dofilter(self, value, offset):
        super().dofilter(value, offset)
        return result

########################################
# START OF MAIN SCRIPT
########################################
#...
# REMOVED - import data files, create vars, perform various calculations.
########################################
#... RESUME CODE; pertinent variables here
m_wavelet = (2/(np.sqrt(3*a*(np.pi**(1/4)))))*(1-((n/a)**2))*np.exp((-n**2)/(2*(a**2)))
m_wavelet = m_wavelet[::-1]
result = np.zeros(l)
offset = 0
plt.figure(6)
plt.plot(m_wavelet)
plt.plot(ecg_3[8100:8800]/len(ecg_3)/300)
########################################
# instantiated object here, no errors thrown
########################################
new_filter = matched_filter(m_wavelet)
########################################
# issue below
########################################
for k in range(len(ecg_3)):
    result[k] = new_filter.dofilter(ecg_3[k], offset)  # <- Every item in the array is "NaN"
    offset += 1
    if (offset == 2000):
        offset = 0
########################################
# More removed code/end of script
########################################
Important bits:
Parent class:
class fir_filter:
    def __init__(self, coefficients):
        self.coefficients = coefficients
        self.buffer = np.zeros(taps)
    def dofilter(self, value, offset):
        result = 0
        # Splice coefficients and buffer arrays into smaller arrays
        buffer_newest = self.buffer[0:offset+1]
        buffer_oldest = self.buffer[offset+1:taps]
        coefficients1 = self.coefficients[0:offset+1]
        coefficients2 = self.coefficients[offset+1:taps]
        # Accumulate
        self.buffer[offset] = value
        for i in range(len(coefficients1)):
            result += coefficients1[i]*buffer_newest[offset-1-i]
        for i in range(len(coefficients2), 0, -1):
            result += coefficients2[len(coefficients2)-i] * buffer_oldest[i-1]
        return result
Child Class:
class matched_filter(fir_filter):
    def __init__(self, coefficients):
        self.coefficients = coefficients
    def dofilter(self, value, offset):
        super().dofilter(value, offset)
        return result
Code and issue:
new_filter = matched_filter(m_wavelet)
for k in range(len(ecg_3)):
    result[k] = new_filter.dofilter(ecg_3[k], offset)
How can I access "dofilter()", in "fir_filter", which is inherited in "matched_filter"?
Any insight would be much appreciated. I can provide more code if necessary.
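A hedged sketch of the two fixes that the excerpts above point at: the child's __init__ should call super().__init__ so the parent sets up self.buffer, and the child's dofilter should return what the parent computed rather than a stray global named result. The filter body here is a simplified stand-in (a plain dot product), not the question's real splice-and-accumulate logic:

```python
class fir_filter:
    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        self.buffer = [0.0] * len(coefficients)   # sized from the coefficients
    def dofilter(self, value, offset):
        # simplified stand-in for the question's splice-and-accumulate loops
        self.buffer[offset] = value
        return sum(c * b for c, b in zip(self.coefficients, self.buffer))

class matched_filter(fir_filter):
    def __init__(self, coefficients):
        super().__init__(coefficients)          # let the parent build self.buffer
    def dofilter(self, value, offset):
        return super().dofilter(value, offset)  # return the parent's result

f = matched_filter([1.0, 0.0])
print(f.dofilter(2.0, 0))  # 2.0, not NaN
```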

Poor Python Multiprocessing Performance

I attempted to speed up my Python program using the multiprocessing module, but I found it quite slow.
A toy example is as follows:
import time
from multiprocessing import Pool, Manager

class A:
    def __init__(self, i):
        self.i = i
    def score(self, x):
        return self.i - x

class B:
    def __init__(self):
        self.i_list = list(range(1000))
        self.A_list = []
    def run_1(self):
        for i in self.i_list:
            self.x = i
            map(self.compute, self.A_list)  # map version
            self.A_list.append(A(i))
    def run_2(self):
        p = Pool()
        for i in self.i_list:
            self.x = i
            p.map(self.compute, self.A_list)  # multicore version
            self.A_list.append(A(i))
    def compute(self, some_A):
        return some_A.score(self.x)

if __name__ == "__main__":
    st = time.time()
    foo = B()
    foo.run_1()
    print("Map: ", time.time()-st)
    st = time.time()
    foo = B()
    foo.run_2()
    print("MultiCore: ", time.time()-st)
The outcomes on my computer (Windows 10, Python 3.5) are:
Map: 0.0009996891021728516
MultiCore: 19.34994912147522
Similar results can be observed on a Linux machine (CentOS 7, Python 3.6).
I guess it is caused by the pickling/unpickling of objects between processes? I tried to use the Manager module but failed to get it to work.
Any help will be appreciated.
Wow, that's impressive (and slow!).
Yes, this is because the objects must be sent to (and pickled for) the workers, which is costly.
So I played with it a little and managed to gain a lot of performance by making the compute method static: you no longer need to share the B object instance. Still very slow, but better.
import time
from multiprocessing import Pool, Manager

class A:
    def __init__(self, i):
        self.i = i
    def score(self, x):
        return self.i - x

x = 0
def static_compute(some_A):
    res = some_A.score(x)
    return res

class B:
    def __init__(self):
        self.i_list = list(range(1000))
        self.A_list = []
    def run_1(self):
        for i in self.i_list:
            x = i
            map(self.compute, self.A_list)  # map version
            self.A_list.append(A(i))
    def run_2(self):
        p = Pool(4)
        for i in self.i_list:
            x = i
            p.map(static_compute, self.A_list)  # multicore version
            self.A_list.append(A(i))
The other reason it is slow, to me, is the fixed cost of using Pool. You're actually launching Pool.map 1000 times. If there is a fixed cost associated with dispatching those tasks, that would make the overall strategy slow. Maybe you should test that with a longer A_list (longer than the i_list, which requires a different algorithm).
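One way to probe that theory (a sketch, not the asker's exact code): keep a single Pool alive for all iterations and pass a module-level function plain data, so each p.map call neither respawns workers nor pickles a whole instance. A bound method like self.compute drags self along, including the ever-growing A_list, into every task:

```python
from multiprocessing import Pool

def score(pair):
    # module-level function: only the small (i, x) tuple is pickled per task
    i, x = pair
    return i - x

def main():
    with Pool(4) as p:  # one pool, reused for every map call
        for x in range(3):
            out = p.map(score, [(i, x) for i in range(5)])
            print(out)  # first line: [0, 1, 2, 3, 4]

if __name__ == "__main__":
    main()
```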
The reasoning behind this is:
the map call in run_1 is performed by the main process alone: when foo.run_1() is called, the main process is mapping for itself, much like telling yourself what to do.
When foo.run_2() is called, the main process maps the work across as many worker processes as the machine provides. If your maximum is 6 processes, the main process is coordinating 6 workers, much like organizing 6 people to report back to you.
Side note:
if you use
p.imap(self.compute, self.A_list)
the results come back in the order of A_list.

Python27: random() after a setstate() doesn't produce the same random number

I have been subclassing Python's random number generator to make a generator that doesn't repeat results (it's going to be used to generate unique ids for a simulator), and I was testing whether its behavior stays consistent after it has been restored from a previous state.
Before people ask:
It's a singleton class
No, there's nothing else that should be using that instance (a tear-down sees to that)
Yes, I tested it without the singleton instance to check
And yes, when I create this subclass I do call the parent's constructor (super(nrRand,self).__init__())
And yes, according to another post I should get consistent results; see: Rolling back the random number generator in python?
Below is my test code:
def test_stateSavingConsitantcy(self):
    start = int(self.r.random())
    for i in xrange(start):
        self.r.random()
    state = self.r.getstate()
    next = self.r.random()
    self.r.setstate(state)
    nnext = self.r.random()
    self.assertEqual(next, nnext, "Number generation not constant got {0} expecting {1}".format(nnext, next))
Any help that can be provided would greatly appreciated
EDIT:
Here is my subclass as requested
class Singleton(type):
    _instances = {}
    def __call__(self, *args, **kwargs):
        if self not in self._instances:
            self._instances[self] = super(Singleton, self).__call__(*args, **kwargs)
        return self._instances[self]

class nrRand(Random):
    __metaclass__ = Singleton
    '''
    classdocs
    '''
    def __init__(self):
        '''
        Constructor
        '''
        super(nrRand, self).__init__()
        self.previous = []
    def random(self):
        n = super(nrRand, self).random()
        while n in self.previous:
            n = super(nrRand, self).random()
        self.previous.append(n)
        return n
    def seed(self, x):
        if x is None:
            x = long(time.time()*1000)
        self.previous = []
        count = x
        nSeed = 0
        while count < 0:
            nSeed = super(nrRand, self).random()
            count -= 1
        super(nrRand, self).seed(nSeed)
        while nSeed < 0:
            super(nrRand, self).seed(nSeed)
            count -= 1
    def getstate(self):
        return (self.previous, super(nrRand, self).getstate())
    def setstate(self, state):
        self.previous = state[0]
        super(nrRand, self).setstate(state[1])
getstate and setstate only manipulate the state the Random class knows about; neither method knows that you also need to roll back the set of previously-generated numbers. You're rolling back the state inherited from Random, but then the object sees that it's already produced the next number and skips it. If you want getstate and setstate to work properly, you'll have to override them to set the state of the set of already-generated numbers.
UPDATE:
def getstate(self):
    return (self.previous, super(nrRand, self).getstate())
This shouldn't directly use self.previous. Since you don't make a copy, you're returning the actual object used to keep track of what numbers have been produced. When the RNG produces a new number, the state returned by getstate reflects the new number. You need to copy self.previous, like so:
def getstate(self):
    return (self.previous[:], super(nrRand, self).getstate())
I also recommend making a copy in setstate:
def setstate(self, state):
    previous, parent_state = state
    self.previous = previous[:]
    super(nrRand, self).setstate(parent_state)
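A quick self-contained check of why the copies matter, using a plain random.Random plus an external previous list as a stand-in for the subclass state:

```python
import random

rng = random.Random(42)
previous = [rng.random() for _ in range(3)]   # some already-generated numbers

state = (previous[:], rng.getstate())          # snapshot with a *copy*
nxt = rng.random()
previous.append(nxt)                           # mutate after the snapshot

saved_previous, saved_rng_state = state
previous = saved_previous[:]
rng.setstate(saved_rng_state)

assert nxt not in previous                     # the copied snapshot was not affected
assert rng.random() == nxt                     # rollback replays the same number
print("rollback OK")
```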

Dictionary vs Object - which is more efficient and why?

What is more efficient in Python in terms of memory usage and CPU consumption - Dictionary or Object?
Background:
I have to load a huge amount of data into Python. I created an object that is just a field container. Creating 4M instances and putting them into a dictionary took about 10 minutes and ~6 GB of memory. Once the dictionary is ready, accessing it is a blink of an eye.
Example:
To check the performance I wrote two simple programs that do the same - one is using objects, other dictionary:
Object (execution time ~18sec):
class Obj(object):
    def __init__(self, i):
        self.i = i
        self.l = []

all = {}
for i in range(1000000):
    all[i] = Obj(i)
Dictionary (execution time ~12sec):
all = {}
for i in range(1000000):
    o = {}
    o['i'] = i
    o['l'] = []
    all[i] = o
Question:
Am I doing something wrong, or is a dictionary just faster than an object? If the dictionary does indeed perform better, can somebody explain why?
Have you tried using __slots__?
From the documentation:
By default, instances of both old and new-style classes have a dictionary for attribute storage. This wastes space for objects having very few instance variables. The space consumption can become acute when creating large numbers of instances.
The default can be overridden by defining __slots__ in a new-style class definition. The __slots__ declaration takes a sequence of instance variables and reserves just enough space in each instance to hold a value for each variable. Space is saved because __dict__ is not created for each instance.
So does this save time as well as memory?
Comparing the three approaches on my computer:
test_slots.py:
class Obj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
        self.i = i
        self.l = []

all = {}
for i in range(1000000):
    all[i] = Obj(i)
test_obj.py:
class Obj(object):
    def __init__(self, i):
        self.i = i
        self.l = []

all = {}
for i in range(1000000):
    all[i] = Obj(i)
test_dict.py:
all = {}
for i in range(1000000):
    o = {}
    o['i'] = i
    o['l'] = []
    all[i] = o
test_namedtuple.py (supported in 2.6):
import collections

Obj = collections.namedtuple('Obj', 'i l')

all = {}
for i in range(1000000):
    all[i] = Obj(i, [])
Run benchmark (using CPython 2.5):
$ lshw | grep product | head -n 1
product: Intel(R) Pentium(R) M processor 1.60GHz
$ python --version
Python 2.5
$ time python test_obj.py && time python test_dict.py && time python test_slots.py
real 0m27.398s (using 'normal' object)
real 0m16.747s (using __dict__)
real 0m11.777s (using __slots__)
Using CPython 2.6.2, including the named tuple test:
$ python --version
Python 2.6.2
$ time python test_obj.py && time python test_dict.py && time python test_slots.py && time python test_namedtuple.py
real 0m27.197s (using 'normal' object)
real 0m17.657s (using __dict__)
real 0m12.249s (using __slots__)
real 0m12.262s (using namedtuple)
So yes (not really a surprise), using __slots__ is a performance optimization. Using a named tuple has similar performance to __slots__.
Attribute access in an object uses dictionary access behind the scenes - so by using attribute access you are adding extra overhead. Plus in the object case, you are incurring additional overhead because of e.g. additional memory allocations and code execution (e.g. of the __init__ method).
In your code, if o is an Obj instance, o.attr is equivalent to o.__dict__['attr'] with a small amount of extra overhead.
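A tiny illustration of that equivalence (with a hypothetical Obj mirroring the one in the question):

```python
class Obj(object):
    def __init__(self, i):
        self.i = i
        self.l = []

o = Obj(7)
assert o.i == o.__dict__['i'] == 7   # attribute access reads the instance dict
o.__dict__['i'] = 8                  # writing via __dict__ is visible as o.i
assert o.i == 8
print(o.__dict__)  # {'i': 8, 'l': []}
```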
Have you considered using a namedtuple? (link for python 2.4/2.5)
It's the new standard way of representing structured data that gives you the performance of a tuple and the convenience of a class.
Its only downside compared with dictionaries is that (like tuples) it doesn't give you the ability to change attributes after creation.
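A short illustration of that trade-off; _replace is the standard namedtuple way to get a modified copy rather than mutating in place:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(1, 2)

try:
    p.x = 10                      # fields cannot be rebound
except AttributeError:
    pass

q = p._replace(x=10)              # build a modified copy instead
assert (q.x, q.y) == (10, 2)
assert p == Point(1, 2)           # the original is untouched
```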
Here is a copy of @hughdbrown's answer for Python 3.6.1; I've made the count 5x larger and added some code to test the memory footprint of the Python process at the end of each run.
Before the downvoters have at it, be advised that this method of counting the size of objects is not accurate.
from datetime import datetime
import os
import psutil

process = psutil.Process(os.getpid())
ITER_COUNT = 1000 * 1000 * 5
RESULT = None

def makeL(i):
    # Use this line to negate the effect of the strings on the test
    # return "Python is smart and will only create one string with this line"
    # Use this if you want to see the difference with 5 million unique strings
    return "This is a sample string %s" % i

def timeit(method):
    def timed(*args, **kw):
        global RESULT
        s = datetime.now()
        RESULT = method(*args, **kw)
        e = datetime.now()
        sizeMb = process.memory_info().rss / 1024 / 1024
        sizeMbStr = "{0:,}".format(round(sizeMb, 2))
        print('Time Taken = %s, \t%s, \tSize = %s' % (e - s, method.__name__, sizeMbStr))
    return timed

class Obj(object):
    def __init__(self, i):
        self.i = i
        self.l = makeL(i)

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
        self.i = i
        self.l = makeL(i)

from collections import namedtuple
NT = namedtuple("NT", ["i", 'l'])

@timeit
def profile_dict_of_nt():
    return [NT(i=i, l=makeL(i)) for i in range(ITER_COUNT)]

@timeit
def profile_list_of_nt():
    return dict((i, NT(i=i, l=makeL(i))) for i in range(ITER_COUNT))

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': makeL(i)}) for i in range(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': makeL(i)} for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_slot():
    return dict((i, SlotObj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_slot():
    return [SlotObj(i) for i in range(ITER_COUNT)]

profile_dict_of_nt()
profile_list_of_nt()
profile_dict_of_dict()
profile_list_of_dict()
profile_dict_of_obj()
profile_list_of_obj()
profile_dict_of_slot()
profile_list_of_slot()
And these are my results:
Time Taken = 0:00:07.018720, profile_dict_of_nt, Size = 951.83
Time Taken = 0:00:07.716197, profile_list_of_nt, Size = 1,084.75
Time Taken = 0:00:03.237139, profile_dict_of_dict, Size = 1,926.29
Time Taken = 0:00:02.770469, profile_list_of_dict, Size = 1,778.58
Time Taken = 0:00:07.961045, profile_dict_of_obj, Size = 1,537.64
Time Taken = 0:00:05.899573, profile_list_of_obj, Size = 1,458.05
Time Taken = 0:00:06.567684, profile_dict_of_slot, Size = 1,035.65
Time Taken = 0:00:04.925101, profile_list_of_slot, Size = 887.49
My conclusions are:
Slots have the best memory footprint and are reasonable on speed.
Dicts are the fastest, but use the most memory.
from datetime import datetime

ITER_COUNT = 1000 * 1000

def timeit(method):
    def timed(*args, **kw):
        s = datetime.now()
        result = method(*args, **kw)
        e = datetime.now()
        print method.__name__, '(%r, %r)' % (args, kw), e - s
        return result
    return timed

class Obj(object):
    def __init__(self, i):
        self.i = i
        self.l = []

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
        self.i = i
        self.l = []

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': []}) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': []} for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_slotobj():
    return dict((i, SlotObj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_slotobj():
    return [SlotObj(i) for i in xrange(ITER_COUNT)]

if __name__ == '__main__':
    profile_dict_of_dict()
    profile_list_of_dict()
    profile_dict_of_obj()
    profile_list_of_obj()
    profile_dict_of_slotobj()
    profile_list_of_slotobj()
Results:
hbrown@hbrown-lpt:~$ python ~/Dropbox/src/StackOverflow/1336791.py
profile_dict_of_dict ((), {}) 0:00:08.228094
profile_list_of_dict ((), {}) 0:00:06.040870
profile_dict_of_obj ((), {}) 0:00:11.481681
profile_list_of_obj ((), {}) 0:00:10.893125
profile_dict_of_slotobj ((), {}) 0:00:06.381897
profile_list_of_slotobj ((), {}) 0:00:05.860749
There is no question.
You have data with no other attributes (no methods, nothing); hence you have a data container (in this case, a dictionary).
I usually prefer to think in terms of data modeling. If there is some huge performance issue, then I can give up something in the abstraction, but only with very good reasons.
Programming is all about managing complexity, and maintaining the correct abstraction is very often one of the most useful ways to achieve that result.
About the reasons an object is slower, I think your measurement is not correct.
You are performing too few assignments inside the for loop, so what you see there is the different time necessary to instantiate a dict (an intrinsic object) versus a "custom" object. Although from the language perspective they are the same, they have quite different implementations.
After that, the assignment time should be almost the same for both, as in the end members are maintained inside a dictionary.
Here are my test runs of the very nice script by @Jarrod-Chesney.
For comparison, I also ran it against Python 2 with "range" replaced by "xrange".
Out of curiosity, I also added similar tests with OrderedDict (ordict) for comparison.
Python 3.6.9:
Time Taken = 0:00:04.971369, profile_dict_of_nt, Size = 944.27
Time Taken = 0:00:05.743104, profile_list_of_nt, Size = 1,066.93
Time Taken = 0:00:02.524507, profile_dict_of_dict, Size = 1,920.35
Time Taken = 0:00:02.123801, profile_list_of_dict, Size = 1,760.9
Time Taken = 0:00:05.374294, profile_dict_of_obj, Size = 1,532.12
Time Taken = 0:00:04.517245, profile_list_of_obj, Size = 1,441.04
Time Taken = 0:00:04.590298, profile_dict_of_slot, Size = 1,030.09
Time Taken = 0:00:04.197425, profile_list_of_slot, Size = 870.67
Time Taken = 0:00:08.833653, profile_ordict_of_ordict, Size = 3,045.52
Time Taken = 0:00:11.539006, profile_list_of_ordict, Size = 2,722.34
Time Taken = 0:00:06.428105, profile_ordict_of_obj, Size = 1,799.29
Time Taken = 0:00:05.559248, profile_ordict_of_slot, Size = 1,257.75
Python 2.7.15+:
Time Taken = 0:00:05.193900, profile_dict_of_nt, Size = 906.0
Time Taken = 0:00:05.860978, profile_list_of_nt, Size = 1,177.0
Time Taken = 0:00:02.370905, profile_dict_of_dict, Size = 2,228.0
Time Taken = 0:00:02.100117, profile_list_of_dict, Size = 2,036.0
Time Taken = 0:00:08.353666, profile_dict_of_obj, Size = 2,493.0
Time Taken = 0:00:07.441747, profile_list_of_obj, Size = 2,337.0
Time Taken = 0:00:06.118018, profile_dict_of_slot, Size = 1,117.0
Time Taken = 0:00:04.654888, profile_list_of_slot, Size = 964.0
Time Taken = 0:00:59.576874, profile_ordict_of_ordict, Size = 7,427.0
Time Taken = 0:10:25.679784, profile_list_of_ordict, Size = 11,305.0
Time Taken = 0:05:47.289230, profile_ordict_of_obj, Size = 11,477.0
Time Taken = 0:00:51.485756, profile_ordict_of_slot, Size = 11,193.0
So, on both major versions, the conclusions of @Jarrod-Chesney still look good.
There is yet another way to reduce memory usage, with the help of the recordclass library, provided the data structure isn't supposed to contain reference cycles.
Let's compare two classes:
class DataItem:
    __slots__ = ('name', 'age', 'address')
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address
and
$ pip install recordclass
>>> import sys
>>> from recordclass import make_dataclass
>>> DataItem2 = make_dataclass('DataItem', 'name age address')
>>> inst = DataItem('Mike', 10, 'Cherry Street 15')
>>> inst2 = DataItem2('Mike', 10, 'Cherry Street 15')
>>> print(inst2)
DataItem(name='Mike', age=10, address='Cherry Street 15')
>>> print(sys.getsizeof(inst), sys.getsizeof(inst2))
64 40
This is possible because dataobject-based subclasses don't support cyclic garbage collection, which is not needed in such cases.
