Since my earlier question was unclear, I am posting this scenario:
class Scraper:
    def __init__(self, url):
        self.start_page = url

    def parse_html(self):
        pass

    def get_all_links(self):
        pass

    def run(self):
        # parse html, get all links, parse them and when done...
        return links
Now, in a task queue like rq:
from rq import Queue
from worker import conn
q = Queue(connection=conn)
result = q.enqueue(what_function, 'http://stackoverflow.com')
I want to know what this what_function would be. I remembered that Django does something similar with its class-based views (CBVs), so I used that analogy, but it wasn't so clear.
I have a class like:
class A:
    def run(self, arg):
        # do something
        pass
I need to pass this to a task queue, so I can do something like
a = A()
b = a.run
# q is the queue object
q.enqueue(b, some_arg)
I'd like to know what other ways there are to do this. For example, Django does it in its class-based views:
class YourListView(ListView):
    # code for your view
which is eventually passed as a function
your_view = YourListView.as_view()
How is it done?
Edit: to elaborate, Django's class-based views are converted to functions because the URL pattern expects a function as its argument. Similarly, you might have a function which accepts the following arguments:
def task_queue(callback_function, *parameters):
    # add to queue and return result when done
but the functionality of callback_function might have been mostly implemented in a class, which has a run() method through which the process is run.
I think you're describing a classmethod:
class MyClass(object):
    @classmethod
    def as_view(cls):
        '''method intended to be called on the class, not an instance'''
        return cls(instantiation, args)  # placeholder: instantiate with whatever args you need
which could be used like this:
call_later = MyClass.as_view
and later called:
call_later()
Most frequently, class methods are used as alternate constructors to create a new instance; for example, dict's fromkeys classmethod:
dict.fromkeys(['foo', 'bar'])
returns a new dict instance:
{'foo': None, 'bar': None}
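The same pattern works for your own classes. Here's a minimal sketch (the Point class and from_string name are made up for illustration):
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    @classmethod
    def from_string(cls, s):
        # alternate constructor: parse "3,4" into Point(3, 4)
        x, y = map(int, s.split(','))
        return cls(x, y)

p = Point.from_string('3,4')  # equivalent to Point(3, 4)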
Update
In your example,
result = q.enqueue(what_function, 'http://stackoverflow.com')
you want to know what could go there as what_function. I saw a very similar example on the RQ home page; that's got to be your own implementation. It's going to be something callable from your code. It's only going to be called with that argument once, so if you're using a class, your __init__ should look more like this if you want Scraper to be your what_function replacement:
class Scraper:
    def __init__(self, url):
        self.start_page = url
        self.run()
    # etc...
If you want to use a class method, that might look like this:
class Scraper:
    def __init__(self, url):
        self.start_page = url

    def parse_html(self):
        pass

    def get_all_links(self):
        pass

    @classmethod
    def run(cls, url):
        instance = cls(url)
        # parse html, get all links, parse them and when done...
        return links
And then your what_function would be Scraper.run.
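Putting it together, the enqueue call would then look something like this (just a sketch; it assumes the worker process can import Scraper from your module):
from rq import Queue
from worker import conn

q = Queue(connection=conn)
# Scraper.run is a plain callable, so it can stand in for what_function
result = q.enqueue(Scraper.run, 'http://stackoverflow.com')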
Related
I am building a piece of software and am using one class solely to store data:
class Data():
    data = [1, 2, 3]
Now the data in this class can be accessed and changed from other classes without instantiating the Data class, which is exactly what I need.
In order to update the rest of the software properly, I have to call functions in other classes whenever the data changes. I looked at the observer pattern in Python but could not get it to work without making data an instance attribute that's only available after instantiation. In other words, all the observer pattern implementations I found required:
class Data():
    def __init__(self):
        self.data = [1, 2, 3]
Obviously, if Data is my publisher/observable, it needs to be instantiated once to get the functionality (as far as I understand), but I am looking for an implementation like:
class Data():
    data = [1, 2, 3]

    def __init__(self):
        self.subscribers = {}

    def register(self, who, callback):
        self.subscribers[who] = callback

    def dispatch(self):
        for subscriber, callback in self.subscribers.items():
            callback()
For the sake of the example, let's use this other class as the Subscriber/Observer, which can also change the data with another function. As this will be the main class handling the software, this is where I instantiate Data to get the observer behavior. It is important, however, that I not have to instantiate Data just to access data, since data will be changed from a lot of other classes:
class B():
    def __init__(self):
        self.data = Data()
        self.data.register(self, self.print_something)

    def print_something(self):
        print("Notification Received")

    def change_data(self):
        Data.data.append(100)
My question now is: how do I automatically send the notification from the Publisher/Observable whenever data gets changed in any way?
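(One way to approach this, sketched here as an assumption rather than a definitive answer: keep everything at class level and route every mutation through a classmethod that triggers dispatch, so no caller ever needs an instance.)
class Data:
    data = [1, 2, 3]
    subscribers = {}

    @classmethod
    def register(cls, who, callback):
        cls.subscribers[who] = callback

    @classmethod
    def change_data(cls, value):
        cls.data.append(value)
        cls.dispatch()  # every change automatically notifies subscribers

    @classmethod
    def dispatch(cls):
        for subscriber, callback in cls.subscribers.items():
            callback()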
I am running python 3.8 on Windows 10.
I am trying to execute the code below, but I get errors.
class base:
    def callme(data):
        print(data)

class A(base):
    def callstream(self):
        B.stream(self)
    def callme(data):
        print("child ", data)

class B:
    def stream(data):
        # The statement below doesn't work, but I want something like it so I can
        # achieve run-time polymorphism, where the method call is not hardcoded
        # to a certain class reference.
        (base)data.callme("streaming data")
        # The statement below works, but it won't call the child class's overridden
        # method. I can use A.callme() to call the child class method, but then it
        # has to be hardcoded to A, which kills the purpose. Any class A or B or XYZ
        # which inherits base should be able to read stream data from the stream
        # class. How do I achieve this in Python? Any class should be able to read
        # the stream data as long as it inherits from the base class. This will give
        # my stream class a generic ability to be used by any client class as long
        # as it inherits the base class.
        #base.callme("streaming data")

def main():
    ob = A()
    ob.callstream()

if __name__ == "__main__":
    main()
I got the output you say you're looking for (in a comment rather than the question -- tsk, tsk) with the following code, based on the code in your question:
class base:
    def callme(self, data):
        print(data)

class A(base):
    def callstream(self):
        B.stream(self)
    def callme(self, data):
        print("child", data)

class B:
    @classmethod
    def stream(cls, data):
        data.callme("streaming data")

def main():
    ob = A()
    ob.callstream()

if __name__ == "__main__":
    main()
Basically, I just made sure the instance methods had self parameters, and since you seem to be using B.stream() as a class method, I declared it as such.
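For reference, running this version prints the overridden child method's output:
child streaming data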
I'm using scrapy and I have the following functioning pipeline class:
class DynamicSQLlitePipeline(object):

    @classmethod
    def from_crawler(cls, crawler):
        # Here, you get whatever value was passed through the "table" parameter
        docket = getattr(crawler.spider, "docket")
        return cls(docket)

    def __init__(self, docket):
        try:
            db_path = "sqlite:///" + settings.SETTINGS_PATH + "\\data.db"
            db = dataset.connect(db_path)
            table_name = docket[0:3]  # FIRST 3 LETTERS
            self.my_table = db[table_name]
        except Exception:
            # traceback.exec_print()
            pass

    def process_item(self, item, spider):
        try:
            test = dict(item)
            self.my_table.insert(test)
            print('INSERTED')
        except IntegrityError:
            print('THIS IS A DUP')
In my spider I have:
custom_settings = {
    'ITEM_PIPELINES': {
        'myproject.pipelines.DynamicSQLlitePipeline': 600,
    }
}
From a recent question I was pointed to What is the 'cls' variable used for in Python classes?
If I understand correctly, in order for the pipeline object to be instantiated (via __init__), it requires a docket number. The docket number only becomes available once the from_crawler class method is run. But what triggers the from_crawler method? Again, the code is working.
The caller of a classmethod has to have a reference to the class object. They may just access it by name, like this:
DynamicSQLlitePipeline.from_crawler(crawler)
… or:
sqlitepipeline.DynamicSQLlitePipeline.from_crawler(crawler)
Or maybe you pass the class object to someone, and they store it and use it later like this:
pipelines[i].from_crawler(crawler)
In Scrapy, the usual way to register a set of pipelines with the framework, according to the docs, is like this:
ITEM_PIPELINES = {
    'myproject.pipelines.PricePipeline': 300,
    'myproject.pipelines.JsonWriterPipeline': 800,
}
(Also see the Extensions user guide, which explains how this fits into a scrapy project.)
Presumably you've done something similar in code you haven't shown us, putting something like 'sqlscraper.pipelines.DynamicSQLlitePipeline' in that dict. At some point, Scrapy goes through that dict, sorts it in order by the values, and instantiates each pipeline. (Because it has the name of the class, as a string, instead of the class object, this is a little trickier, but the details really aren't relevant here.)
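Conceptually, the trigger is the framework itself. Here is a rough sketch of what happens when Scrapy builds the pipeline chain (simplified for illustration, not the actual source; assume crawler and the merged ITEM_PIPELINES dict are in scope):
from scrapy.utils.misc import load_object

# walk the pipelines in priority order
for path, priority in sorted(ITEM_PIPELINES.items(), key=lambda kv: kv[1]):
    pipeline_cls = load_object(path)  # resolve the dotted-name string to a class
    if hasattr(pipeline_cls, 'from_crawler'):
        # this call is what triggers your classmethod, which in turn calls __init__
        pipeline = pipeline_cls.from_crawler(crawler)
    else:
        pipeline = pipeline_cls()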
I'm writing a GUI library, and I'd like to let the programmer provide meta-information about their program which I can use to fine-tune the GUI. I was planning to use function decorators for this purpose, for example like this:
class App:
    @Useraction(description='close the program', hotkey='ctrl+q')
    def quit(self):
        sys.exit()
The problem is that this information needs to be bound to the respective class. For example, if the program is an image editor, it might have an Image class which provides some more Useractions:
class Image:
    @Useraction(description='invert the colors')
    def invert_colors(self):
        ...
However, since the concept of unbound methods was removed in Python 3, there doesn't seem to be a way to find a function's defining class. (I found this old answer, but that doesn't work in a decorator.)
So, since it looks like decorators aren't going to work, what would be the best way to do this? I'd like to avoid having code like
class App:
    def quit(self):
        sys.exit()

Useraction(App.quit, description='close the program', hotkey='ctrl+q')
if at all possible.
For completeness' sake, the @Useraction decorator would look somewhat like this:
class_metadata = defaultdict(dict)

def Useraction(**meta):
    def wrap(f):
        cls = get_defining_class(f)
        class_metadata[cls][f] = meta
        return f
    return wrap
You are using decorators to add metadata to methods. That is fine. It can be done, e.g., this way:
def user_action(description):
    def decorate(func):
        func.user_action = {'description': description}
        return func
    return decorate
Now, you want to collect that data and store it in a global dictionary in the form class_metadata[cls][f] = meta. For that, you need to find all decorated methods and their classes.
The simplest way to do that is probably using metaclasses. In a metaclass, you can define what happens when a class is created. In this case, go through all attributes of the class, find the decorated methods, and store them in the dictionary:
import collections

class UserActionMeta(type):
    user_action_meta_data = collections.defaultdict(dict)

    def __new__(cls, name, bases, attrs):
        rtn = type.__new__(cls, name, bases, attrs)
        for attr in attrs.values():
            if hasattr(attr, 'user_action'):
                UserActionMeta.user_action_meta_data[rtn][attr] = attr.user_action
        return rtn
I have put the global dictionary user_action_meta_data in the metaclass just because it felt logical. It can live anywhere.
Now, just use that in any class:
class X(metaclass=UserActionMeta):

    @user_action('Exit the application')
    def exit(self):
        pass
Static UserActionMeta.user_action_meta_data now contains the data you want:
defaultdict(<class 'dict'>, {<class '__main__.X'>: {<function exit at 0x00000000029F36C8>: {'description': 'Exit the application'}}})
I've found a way to make decorators work with the inspect module, but it's not a great solution, so I'm still open to better suggestions.
Basically, what I'm doing is traversing the interpreter stack until I find the class that is currently being created. Since no class object exists at that point, I extract the class's qualname and module instead.
import inspect

def get_current_class():
    """
    Returns the name of the current module and the name of the class that is
    currently being created. Has to be called in class-level code, for example:

        def deco(f):
            print(get_current_class())
            return f

        def deco2(arg):
            def wrap(f):
                print(get_current_class())
                return f
            return wrap

        class Foo:
            print(get_current_class())

            @deco
            def f(self):
                pass

            @deco2('foobar')
            def f2(self):
                pass
    """
    frame = inspect.currentframe()
    while True:
        frame = frame.f_back
        if '__module__' in frame.f_locals:
            break
    dict_ = frame.f_locals
    cls = (dict_['__module__'], dict_['__qualname__'])
    return cls
Then in a sort of post-processing step, I use the module and class names to find the actual class object.
import sys

def postprocess():
    global class_metadata

    def findclass(module, qualname):
        scope = sys.modules[module]
        for name in qualname.split('.'):
            scope = getattr(scope, name)
        return scope

    class_metadata = {findclass(cls[0], cls[1]): meta
                      for cls, meta in class_metadata.items()}
The problem with this solution is the delayed class lookup. If classes are overwritten or deleted, the post-processing step will find the wrong class or fail altogether. In the following example, postprocess() resolves the name C to the second, empty class, so the metadata ends up attached to the wrong class object:
class C:
    @Useraction(hotkey='ctrl+f')
    def f(self):
        print('f')

class C:
    pass

postprocess()
I'm working on a project in Tornado that relies heavily on the asynchronous features of the library. By following the chat demo, I've managed to get long-polling working with my application; however, I seem to have run into a problem with the way it all works.
Basically what I want to do is be able to call a function on the UpdateManager class and have it finish the asynchronous request for any callbacks in the waiting list. Here's some code to explain what I mean:
update.py:
class UpdateManager(object):
    waiters = []
    attrs = []
    other_attrs = []

    def set_attr(self, attr):
        self.attrs.append(attr)

    def set_other_attr(self, attr):
        self.other_attrs.append(attr)

    def add_callback(self, cb):
        self.waiters.append(cb)

    def send(self):
        for cb in self.waiters:
            cb(self.attrs, self.other_attrs)

class LongPoll(tornado.web.RequestHandler, UpdateManager):

    @tornado.web.asynchronous
    def get(self):
        self.add_callback(self.finish_request)

    def finish_request(self, attrs, other_attrs):
        # Render some JSON to give the client, etc...
        pass

class SetSomething(tornado.web.RequestHandler):
    def post(self):
        # Handle the stuff...
        self.set_attr(some_attr)
(There's more code implementing the URL handlers/server and such; however, I don't believe that's necessary for this question.)
So what I want to do is make it so I can call UpdateManager.send from another place in my application and still have it send the data to the waiting clients. The problem is that when you try to do this:
from update import UpdateManager
UpdateManager.send()
it only gets the UpdateManager class, not the instance of it that is holding user callbacks. So my question is: is there any way to create a persistent object with Tornado that will allow me to share a single instance of UpdateManager throughout my application?
Don't use instance methods - use class methods (after all, you're already using class attributes, you just might not realize it). That way, you don't have to instantiate the object, and can instead just call the methods of the class itself, which acts as a singleton:
class UpdateManager(object):
    waiters = []
    attrs = []
    other_attrs = []

    @classmethod
    def set_attr(cls, attr):
        cls.attrs.append(attr)

    @classmethod
    def set_other_attr(cls, attr):
        cls.other_attrs.append(attr)

    @classmethod
    def add_callback(cls, cb):
        cls.waiters.append(cb)

    @classmethod
    def send(cls):
        for cb in cls.waiters:
            cb(cls.attrs, cls.other_attrs)
This will make...
from update import UpdateManager
UpdateManager.send()
work as you desire it to.
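A quick usage sketch (module and handler names taken from the question; the attribute value is just a placeholder):
# e.g. inside LongPoll.get(), register the waiting client's callback
UpdateManager.add_callback(self.finish_request)

# and anywhere else in the application:
from update import UpdateManager

UpdateManager.set_attr('some value')
UpdateManager.send()  # every waiting callback receives attrs and other_attrs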