I'm trying to implement a wrapper around a redis database that does some bookkeeping, and I thought about using descriptors. I have an object with a bunch of fields: frames, failures, etc., and I need to be able to get, set, and increment the field as needed. I've tried to implement an Int-Like descriptor:
class IntType(object):
    def __get__(self, instance, owner):
        # issue a GET database command
        return db.get(instance.name)

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)

    def increment(self, instance, count):
        # issue an INCRBY database command
        db.hincrby(instance.name, count)
class Stream:
    _prefix = 'stream'
    frames = IntType()
    failures = IntType()
    uuid = StringType()

s = Stream()
s.frames.increment(1)  # AttributeError: 'float' object has no attribute 'increment'
It seems like I can't access the increment() method in my descriptor. And I can't have increment defined on the object that __get__ returns, because that would require an additional db query (the GET) when all I want to do is increment! I also don't want increment() on the Stream class, because later on, when Stream grows additional fields like strings or sets, I'd need to type-check the heck out of everything.
Does this work?
class Stream:
    _prefix = 'stream'

    def __init__(self):
        self.frames = IntType()
        self.failures = IntType()
        self.uuid = StringType()
Why not define the magic method __iadd__ as well as __get__ and __set__? This will allow you to do normal augmented assignment (+=) on the attribute. It will also mean you can treat the increment separately from the get and thereby minimise the database accesses.
So change:
def increment(self, instance, count):
    # issue an INCRBY database command
    db.hincrby(instance.name, count)
to:
def __iadd__(self, other):
    # your code goes here
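Here is a minimal sketch of that idea, assuming a redis-py style client named db (with get, set, and incrby) and that each Stream instance has a name attribute to build keys from, as in the question; __set_name__ needs Python 3.6+. Because s.frames += 1 expands to a __get__, then the value's __iadd__, then a __set__, the sketch has __get__ return a lazy proxy and has __set__ recognise it, so the increment costs a single INCRBY:

import redis

db = redis.Redis()  # assumed client; any object with get/set/incrby works

class IntProxy(object):
    """Lazy stand-in for the stored integer; no DB call until it's needed."""
    def __init__(self, key):
        self.key = key

    def __iadd__(self, count):
        db.incrby(self.key, count)    # single INCRBY, no GET round-trip
        return self                   # handed straight back to __set__

    def __int__(self):
        return int(db.get(self.key))  # GET only when the value is read

class IntType(object):
    def __set_name__(self, owner, name):   # records the field name (3.6+)
        self.field = name

    def __get__(self, instance, owner):
        return IntProxy(instance.name + ':' + self.field)

    def __set__(self, instance, val):
        if isinstance(val, IntProxy):      # result of +=, already applied
            return
        db.set(instance.name + ':' + self.field, val)

With that in place, s.frames += 1 issues exactly one INCRBY, and int(s.frames) issues one GET.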
Try this:
class IntType(object):
    def __get__(self, instance, owner):
        class IntValue(object):
            def increment(self, count):
                # issue an INCRBY database command
                db.hincrby(instance.name, count)  # instance captured from __get__

            def getValue(self):
                # issue a GET database command
                return db.get(instance.name)

        return IntValue()

    def __set__(self, instance, val):
        # issue a SET database command
        db.set(instance.name, val)
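With the wrapper in place, the original call works, and each operation costs a single round-trip:

s = Stream()
s.frames.increment(1)        # INCRBY only
print(s.frames.getValue())   # GET only when you actually read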
class Tokenizer:
    def __init__(self):
        self.name = 'MyTokenizer'
        self.tokenizer = Language.create_tokenizer(nlp)

    def __call__(self, text):
        if text:
            with CoreClient(timeout=60000) as client:
                doc = client.annotate(text, output_format='json')
        else:
            doc = Document("")
        ...
My question is about the creation of 'CoreClient', which makes an HTTP request to a server. The current code, using "with ... as client", ensures the client is destroyed once 'client.annotate' has finished. The problem, however, is that a new 'client' object has to be created for every 'text' that gets processed. To avoid this, I would rather create the object once, in the __init__ method:
self.client = CoreClient(timeout=60000)
But then:
1) How do I destroy the 'client' after all requests have been completed? OR
2) Is the current way, creating a CoreClient per request, OK? Creating the object is heavy: it requires a lot of initialization.
EDIT:
def __enter__(self):
    self.start()
    return self

def start(self):
    if self.start_cmd:
        if self.be_quiet:
            # Issue #26: subprocess.DEVNULL isn't supported in python 2.7.
            stderr = open(os.devnull, 'w')
        else:
            stderr = self.stderr
        print(f"Starting server with command: {' '.join(self.start_cmd)}")
        self.server = subprocess.Popen(self.start_cmd,
                                       stderr=stderr,
                                       stdout=stderr)
To make it clearer, I've added the implementation of the __enter__ method. It seems to simply return the object self.
You only need to create the instance of CoreClient once. The with statement just ensures that the __enter__ and __exit__ methods of that instance are called before and after the body of the with statement; you don't need to create a new instance each time.
class Tokenizer:
    def __init__(self):
        self.name = 'MyTokenizer'
        self.tokenizer = Language.create_tokenizer(nlp)
        self.client = CoreClient(timeout=60000)  # create the client once, here

    def __call__(self, text):
        if text:
            with self.client:
                doc = self.client.annotate(text, output_format='json')
        else:
            doc = Document("")
It appears that __enter__ and __exit__ together spin up and tear down a new server each time the CoreClient instance is used as a context manager.
The client will be collected when the Tokenizer instance gets collected. However, unless you are in an active with statement, the CoreClient instance isn't doing anything.
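If the per-call server start/stop is the real cost, one option is to enter the context once and close it explicitly when you are done. This is only a sketch, assuming CoreClient and Document behave as shown in the question and that CoreClient implements the standard __exit__(exc_type, exc_value, traceback):

import atexit

class Tokenizer:
    def __init__(self):
        self.client = CoreClient(timeout=60000)
        self.client.__enter__()        # start the server once
        atexit.register(self.close)    # best-effort cleanup at interpreter exit

    def close(self):
        self.client.__exit__(None, None, None)

    def __call__(self, text):
        if not text:
            return Document("")
        return self.client.annotate(text, output_format='json')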
In this case I wouldn't worry about it, because Python will take care of it when the reference count goes to zero. Also, del does not actually delete an object. It might, but it might not: del just decrements the reference count to an object.
Take this for example:
In [1]: class Test:
   ...:     def __del__(self):
   ...:         print('deleted')
   ...:

In [2]: t = Test()

In [3]: del t
deleted

In [4]: t = Test()

In [5]: t1 = t

In [6]: del t   # Nothing gets printed here because t1 still exists

In [7]: del t1  # reference count goes to 0 and now gets printed
deleted
This is why I think you should just let Python handle the destruction of your objects. Python keeps track of objects reference counts and knows when they are no longer needed. So let it take care of that stuff for you.
import praw
import time

class getPms():
    r = praw.Reddit(user_agent="Test Bot By /u/TheC4T")
    r.login(username='*************', password='***************')
    cache = []
    inboxMessage = []
    file = 'cache.txt'

    def __init__(self):
        cache = self.cacheRead(self, self.file)
        self.bot_run(self)
        self.cacheSave(self, self.file)
        time.sleep(5)
        return self.inboxMessage

    def getPms(self):
        def bot_run():
            inbox = self.r.get_inbox(limit=25)
            print(self.cache)
            # print(r.get_friends())  # this works
            for message in inbox:
                if message.id not in self.cache:
                    # print(message.id)
                    print(message.body)
                    # print(message.subject)
                    self.cache.append(message.id)
                    self.inboxMessage.append(message.body)
                # else:
                #     print("no messages")

        def cacheSave(self, file):
            with open(file, 'w') as f:
                for s in self.cache:
                    f.write(s + '\n')

        def cacheRead(self, file):
            with open(file, 'r') as f:
                cache1 = [line.rstrip('\n') for line in f]
                return cache1

        # while True:  # threading is needed in order to run this as a loop.
        #              # Probably gonna do this in the main method though
        # def getInbox(self):
        #     return self.inboxMessage
The exception is:
cache = self.cacheRead(self, self.file)
AttributeError: 'getPms' object has no attribute 'cacheRead'
I am new to working with classes in Python and need help with what I am doing wrong here; if you need any more information, I can add some. It worked when it was all functions, but now that I've attempted to switch it to a class, it has stopped working.
Your cacheRead function (as well as bot_run and cacheSave) is indented too far, so it's defined in the body of your other function getPms. Thus it is only accessible inside of getPms. But you're trying to call it from __init__.
I'm not sure what you're trying to achieve here because getPms doesn't have anything else in it but three function definitions. As far as I can tell you should just take out the def getPms line and unindent the three functions it contains so they line up with the __init__ method.
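A sketch of that unindented layout follows. As a hypothetical cleanup it also drops the stray self arguments from the call sites and the return from __init__ (which would raise a TypeError); the praw calls are copied from the question and not re-verified:

class getPms():
    r = praw.Reddit(user_agent="Test Bot By /u/TheC4T")
    cache = []
    inboxMessage = []
    file = 'cache.txt'

    def __init__(self):
        self.cache = self.cacheRead(self.file)
        self.bot_run()
        self.cacheSave(self.file)

    def bot_run(self):
        inbox = self.r.get_inbox(limit=25)
        for message in inbox:
            if message.id not in self.cache:
                self.cache.append(message.id)
                self.inboxMessage.append(message.body)

    def cacheSave(self, file):
        with open(file, 'w') as f:
            for s in self.cache:
                f.write(s + '\n')

    def cacheRead(self, file):
        with open(file, 'r') as f:
            return [line.rstrip('\n') for line in f]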
Here are a few points:
Unless you're explicitly inheriting from some specific class, you can omit the parentheses:
class A(object):, class A():, and class A: are equivalent.
Your class and one of its methods have the same name. I'm not sure whether Python gets confused by this, but you probably will. You could name the class PMS and the method get, for example, so you'd call PMS.get(...).
With the present indentation, the cacheRead and cacheSave functions are simply inaccessible from __init__; why not move them up into the class namespace?
When calling methods on self, you don't pass self explicitly as the first argument; Python supplies it for you. So instead of cache = self.cacheRead(self, self.file), write cache = self.cacheRead(self.file).
So I've looked around and read many postings covering the TypeError where a function "takes exactly X arguments but only 1 is given".
I know about self; I don't think I have an issue understanding it. Regardless, I was trying to create a class with some properties, and as long as I have @property in front of my function hwaddr, I get the following error:
Traceback (most recent call last):
  File line 24, in <module>
    db.hwaddr("aaa", "bbbb")
TypeError: hwaddr() takes exactly 3 arguments (1 given)
Here is the code. Why is @property messing me up? If I take it out, the code works as expected:
#!/usr/bin/env python2.7

class Database:
    """An instance of our Mongo systems database"""

    @classmethod
    def __init__(self):
        pass

    @property
    def hwaddr(self, host, interface):
        results = [host, interface]
        return results

db = Database()
print db.hwaddr("aaa", "bbbb")
Process finished with exit code 1
With it gone, the output is:
['aaa', 'bbbb']
Process finished with exit code 0
Properties are syntactic sugar for getters, so the wrapped function is called with only self, and it is called at attribute-access time. A property basically shortens:

print db.hwaddr()

to:

print db.hwaddr

That also explains the traceback: evaluating db.hwaddr already invokes hwaddr(self), and since your function expects three arguments, the TypeError fires before ("aaa", "bbbb") are ever applied. There is no need to use a property here, as you pass two arguments in.
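A quick demonstration of the failure mode (Python 2 syntax and error message, matching the question):

class Demo(object):
    @property
    def hwaddr(self, host, interface):  # a property getter receives only self
        return [host, interface]

d = Demo()
d.hwaddr  # TypeError: hwaddr() takes exactly 3 arguments (1 given)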
Basically, what Dair said: properties don't take parameters, that's what methods are for.
Typically, you will want to use properties in the following scenarios:
To provide read-only access to an internal attribute
To implement a calculated field
To provide read/write access to an attribute, but control what happens when it is set
So the question would be, what does hwaddr do, and does it match any of those use cases? And what exactly are host and interface? What I think you want to do is this:
#!/usr/bin/env python2.7

class Database:
    """An instance of our Mongo systems database"""

    def __init__(self, host, interface):
        self._host = host
        self._interface = interface

    @property
    def host(self):
        return self._host

    @property
    def interface(self):
        return self._interface

    @property
    def hwaddr(self):
        return self._host, self._interface

db = Database("my_host", "my_interface")
print db.host
print db.interface
print db.hwaddr
Here, your Database class will have host and interface read-only properties, that can only be set when instantiating the class. A third property, hwaddr, will produce a tuple with the database full address, which may be convenient in some cases.
Also, note that I removed the classmethod decorator from __init__; constructors should be instance methods.
I'm trying to figure out how to chain class methods to improve a utility class I've been writing - for reasons I'd prefer not to get into :)
Now suppose I wanted to chain class methods on a class instance (in this case for setting the cursor), e.g.:
# initialize the class instance
db = CRUD(table='users', public_fields=['name', 'username', 'email'])

# the desired interface: class_instance.cursor(<cursor>).method(...)
with sql.read_pool.cursor() as c:
    db.cursor(c).get(target='username', where="omarlittle")
The part that's confusing: I would prefer the cursor not to persist as an attribute after .get(...) has been called and returned, and I'd like to require that .cursor(cursor) be called first.
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query.format(fields=self.fields, table=self.table, target=target,
                     where=where)
        self.cursor.execute(query)

    def cursor(self, cursor):
        pass  # this is where I get lost.
If I understand what you're asking, what you want is for the cursor method to return some object with a get method that works as desired. There's no reason the object it returns has to be self; it can instead return an instance of some cursor type.
That instance could have a back-reference to self, or it could get its own copy of whatever internals are needed to be a cursor, or it could be a wrapper around an underlying object from your low-level database library that knows how to be a cursor.
If you look at the DB API 2.0 spec, or implementations of it like the stdlib's sqlite3, that's exactly how they do it: A Database or Connection object (the thing you get from the top-level connect function) has a cursor method that returns a Cursor object, and that Cursor object has an execute method.
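The stdlib version of the pattern, for reference:

import sqlite3

conn = sqlite3.connect(':memory:')  # the top-level connect() gives a Connection
cur = conn.cursor()                 # which hands out Cursor objects
cur.execute('SELECT 1')
print(cur.fetchone())               # (1,)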
So:
class CRUDCursor(object):
    def __init__(self, crud, c):
        self.crud = crud
        self.cursor = however_you_get_an_actual_sql_cursor(c)

    def get(self, target, where):
        # this is strictly for illustration purposes, I realize all
        # the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query = query.format(fields=self.crud.fields(), table=self.crud.table,
                             target=target, where=where)
        self.cursor.execute(query)
        # you may want this to return something as well?
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join([f for f in self.public_fields])

    # no get method

    def cursor(self, cursor):
        return CRUDCursor(self, cursor)
However, there still seems to be a major problem with your example. Normally, after you execute a SELECT statement on a cursor, you want to fetch the rows from that cursor. You're not keeping the cursor object around in your "user" code, and you explicitly don't want the CRUD object to keep its cursor around, so… how do you expect to do that? Maybe get is supposed to return self.cursor.fetch_all() at the end or something?
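If so, a minimal tweak to CRUDCursor.get (a sketch against a DB-API style cursor, which spells it fetchall) would be:

def get(self, target, where):
    query = "SELECT {fields} FROM {table} WHERE {target} = %s".format(
        fields=self.crud.fields(), table=self.crud.table, target=target)
    self.cursor.execute(query, (where,))  # parameterize the value, at least
    return self.cursor.fetchall()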
I have a class which makes requests to a remote API. I'd like to reduce the number of calls I'm making. Some of the methods in my class make the same API calls (but for different reasons), so I'd like them to be able to share a cached API response.
I'm not entirely sure whether it's more Pythonic to use optional parameters or multiple methods, given that the methods have some required parameters when they make an API call.
Here are the approaches as I see them; which do you think is best?
class A:
    def a_method(self, item_id, cached_item_api_response=None):
        """Seems awkward having to supply item_id even
        if cached_item_api_response is given
        """
        api_response = None
        if cached_item_api_response:
            api_response = cached_item_api_response
        else:
            api_response = ...  # make api call using item_id
        ...  # do stuff
Or this:
class B:
    def a_method(self, item_id=None, cached_api_response=None):
        """Seems awkward as it makes no sense NOT to supply EITHER
        item_id or cached_api_response
        """
        api_response = None
        if cached_api_response:
            api_response = cached_api_response
        elif item_id:
            api_response = ...  # make api call using item_id
        else:
            ...  # ERROR
        ...  # do stuff
Or is this more appropriate?
class C:
    """Seems even more awkward to have different method calls"""

    def a_method(self, item_id):
        api_response = ...  # make api call using item_id
        self.api_response_logic(api_response)

    def b_method(self, cached_api_response):
        self.api_response_logic(cached_api_response)

    def api_response_logic(self, api_response):
        ...  # do stuff
Normally, when writing a method, one could argue that a method or object should do one thing and do it well. If your method takes more and more parameters, requiring more and more ifs in your code, that probably means your code is doing more than one thing, especially if those parameters trigger totally different behavior. Instead, the same behavior could perhaps be produced by having different classes that override methods.
Maybe you could use something like:
class BaseClass(object):
    def a_method(self, item_id):
        response = lookup_response(item_id)
        return response

class CachingClass(BaseClass):
    def a_method(self, item_id):
        if item_id in self.cache:
            return self.cache[item_id]
        return super(CachingClass, self).a_method(item_id)

    def uncached_method(self, item_id):
        return super(CachingClass, self).a_method(item_id)
That way you can split the logic of how to lookup the response and the caching while also making it flexible for the user of the API to decide if they want the caching capabilities or not.
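Hypothetical usage of that pair (lookup_response and the cache wiring are assumed to exist elsewhere):

client = CachingClass()
r1 = client.a_method('id-1')         # may hit the API, then caches
r2 = client.a_method('id-1')         # served from the cache
r3 = client.uncached_method('id-1')  # always hits the API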
There is nothing wrong with the method used in your class B. To make it more obvious at a glance that you actually need to include either item_id or cached_api_response, I would put the error checking first:
class B:
    def a_method(self, item_id=None, cached_api_response=None):
        """Requires either item_id or cached_api_response"""
        if not ((item_id == None) ^ (cached_api_response == None)):
            ...  # error
        # or, if you want to allow both,
        if (item_id == None) and (cached_api_response == None):
            ...  # error

        # you don't actually have to do this on one line;
        # also don't use it if cached_api_response can evaluate to False
        api_response = cached_api_response or ...  # make api call using item_id
        ...  # do stuff
Ultimately this is a judgement that must be made for each situation. I would ask myself, which of these two more closely fits:
Two completely different algorithms or actions, with completely different semantics, even though they may be passed similar information
A single conceptual idea, with consistent semantics, but with nuance based on input
If the first is closest, go with separate methods. If the second is closest, go with optional arguments. You might even implement a single method by testing the type of the argument(s) to avoid passing additional arguments.
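For the single-method route, a sketch of dispatching on the argument's type (it assumes a cached response is a dict and an item_id is a string; adjust to your real types):

class A:
    def a_method(self, item_or_response):
        if isinstance(item_or_response, dict):  # already an API response
            api_response = item_or_response
        else:                                   # treat it as an item_id
            api_response = ...  # make api call using item_or_response
        ...  # do stuff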
This is an OO anti-pattern.
class API_Connection(object):
    def do_something_with_api_response(self, response):
        ...

    def do_something_else_with_api_response(self, response):
        ...
You have two methods on an instance and you're passing state between them explicitly? Why are these methods and not bare functions in a module?
Instead, think about using encapsulation to help you by having the instance of the class own the api response.
For example:
class API_Connection(object):
    def __init__(self, api_url):
        self._url = api_url
        self._cached_response = None

    @property
    def response(self):
        """Actually use the _url and get the response when needed."""
        if self._cached_response is None:
            # actually calculate self._cached_response by making our
            # remote call, etc
            self._cached_response = self._get_api_response(self._url)
        return self._cached_response

    def _get_api_response(self, api_param1, ...):
        """Make the request and return the api's response"""

    def do_something_with_api_response(self):
        # just use self.response
        do_something(self.response)

    def do_something_else_with_api_response(self):
        # just use self.response
        do_something_else(self.response)
You have caching, and any method which needs this response can run in any order without making multiple API requests, because the first method that needs self.response will calculate it and every other will use the cached value. Hopefully it's easy to imagine extending this with multiple URLs or RPC calls. If you need a lot of methods that cache their return values like response above, then you should look into a memoization decorator for your methods, like the sketch below.
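One possible shape for such a decorator (the get_response method and expensive_api_call are hypothetical; on Python 3, functools.lru_cache can serve a similar purpose):

import functools

def memoize(method):
    """Cache a method's return value per argument tuple, on the instance."""
    @functools.wraps(method)
    def wrapper(self, *args):
        cache = self.__dict__.setdefault('_memo', {})
        key = (method.__name__, args)
        if key not in cache:
            cache[key] = method(self, *args)
        return cache[key]
    return wrapper

class API_Connection(object):
    @memoize
    def get_response(self, url):
        return expensive_api_call(url)  # hypothetical remote call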
The cached response should be saved in the instance, not passed around like a bag of Skittles -- what if you dropped it?
Is item_id unique per instance, or can an instance make queries for more than one? If it can have more than one, I'd go with something like this:
class A(object):
def __init__(self):
self._cache = dict()
def a_method( item_id ):
"""Gets api_reponse from cache (cache may have to get a current response).
"""
api_response = self._get_cached_response( item_id )
... #do stuff
def b_method( item_id ):
"""'nother method (just for show)
"""
api_response = self._get_cached_response( item_id )
... #do other stuff
def _get_cached_response( self, item_id ):
if item_id in self._cache:
return self._cache[ item_id ]
response = self._cache[ item_id ] = api_call( item_id, ... )
return response
def refresh_response( item_id ):
if item_id in self._cache:
del self._cache[ item_id ]
self._get_cached_response( item_id )
And if you may have to get the most current info about item_id, you can have a refresh_response method.
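Usage would then look like this (api_call is still the assumed remote call from the sketch above):

a = A()
a.a_method('item-1')          # first use fills the cache
a.b_method('item-1')          # reuses the cached response
a.refresh_response('item-1')  # drop and re-fetch when freshness matters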