I'm writing code that is supposed to collect file names with a recursive function (scan_folder) and write them into an SQLite database with a second function (update_db).
The first issue is that whenever scan_folder() calls itself, update_db() is called immediately afterwards, although it shouldn't be. Because of this, the database gets updated far too often. Maybe I could pop the values that get passed to the second function after it finishes, but I'd like to know why this is happening.
class Sub:
    def __init__(self, parent, scan_type):
        self.database = ConnectionToDatabase()
        self.database_name = ConnectionToDatabase().database_name()
    def scan_folder(self):
        connection = sqlite3.connect(self.database_name)
        try:
            cursor = connection.cursor()
            for file_name in os.listdir(self.parent):
                if file_name.endswith('.srt'):
                    if self.scan_type is True:
                        cursor.execute('SELECT count(*) FROM subs WHERE name = ?', (file_name,))
                else:
                    current_path = "".join((self.parent, "/", file_name))
                    if os.path.isdir(current_path):
                        dot = Sub(current_path, self.scan_type)
                        # I THINK HERE IS THE ERROR, ACCORDING TO PYCHARM DEBUGGER
                        # HERE THE update_db() IS CALLED AND ONLY AFTER IT FINISHES, dot.scan_folder() BEGINS
                        dot.scan_folder()
            connection.close()  # Closes connection that adds subtitle names into the database
        finally:
            self.database.update_database(dirty_files_amount)
Here begins the second function:
class ConnectionToDatabase:
    def __init__(self):
        self.database = './sub_master.db'
    def update_database(self, dirty_files_amount):
        connection_update = sqlite3.connect(self.database)
        cursor = connection_update.cursor()
        for sub_name in to_update:
            cursor.execute('UPDATE subs SET ad_found = 1 WHERE name = ?', (sub_name,))
        connection_update.commit()
        connection_update.close()
This is just a hunch, but right here:
dot = Sub(current_path, self.scan_type)
you're creating a new instance of your Sub class, and its __init__ method contains:
self.database = ConnectionToDatabase()
self.database_name = ConnectionToDatabase().database_name()
Each new Sub therefore goes through your ConnectionToDatabase class again, which is where your update_db resides.
When I call scan_folder, it enters an if/else statement that goes through every file and folder in the current directory. When it doesn't find anything else there, instead of jumping back to the previous directory, it calls update_db first.
The best thing to do is to rewrite the whole thing. As stated previously, the functions are doing too many things.
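One possible explanation for the behaviour in the question, offered as a guess rather than a confirmed diagnosis: the finally block belongs to scan_folder itself, so it runs every time any call to scan_folder finishes, including each nested call. The sketch below reproduces that with a nested dict standing in for the directory tree. The names scan, scan_then_update and log are hypothetical and not part of the original code.

```python
log = []

def scan(tree, depth=0):
    """Walk a nested dict that stands in for a directory tree."""
    try:
        for name, child in tree.items():
            if isinstance(child, dict):   # a sub-folder: recurse
                scan(child, depth + 1)
    finally:
        # Runs when *this* call finishes, so once per folder visited.
        log.append(depth)

def scan_then_update(tree):
    """Collect names during the recursion, then update once at the end."""
    found = []

    def walk(node):
        for name, child in node.items():
            if isinstance(child, dict):
                walk(child)
            elif name.endswith('.srt'):
                found.append(name)

    walk(tree)
    return found   # a single update_database(found) call would go here

tree = {'season1': {'extras': {'b.srt': None}}, 'a.srt': None}
scan(tree)
print(log)                     # [2, 1, 0]
print(scan_then_update(tree))  # ['b.srt', 'a.srt']
```

The deepest folder triggers the finally block first, which would match an update appearing to happen before the recursion "jumps back" to the parent directory.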
I have a class which is an ndb.Model.
I am trying to add pagination so I added this:
@classmethod
def get_next_page(cls, cursor):
    q = cls.query()
    q_forward = q.order(cls.title)
    if cursor:
        cursor = ndb.datastore_query.Cursor(cursor)
    objects, cursor, more = q_forward.fetch_page(10, start_cursor=cursor)
    return objects, cursor.urlsafe(), more
However, fetch_page ALWAYS returns more == False, and the cursor is always empty. If I use offset=5 or offset=10 (or any other offset) instead of a cursor, it works just fine. The cursor does not update, so the query always starts from the first item.
I am testing this locally with a stub context.
What am I missing? I'm very new to this.
I believe it should be ndb._datastore_query.Cursor (see reference) or just do ndb.Cursor
If the cursor came from UI and you had previously made it urlsafe, then you should be doing ndb._datastore_query.Cursor(urlsafe=cursor) or ndb.Cursor(urlsafe=cursor)
Also, when you don't have a cursor, make sure it's explicitly set to None or just do ndb.Cursor() or ndb._datastore_query.Cursor()
import praw
import time
class getPms():
    r = praw.Reddit(user_agent="Test Bot By /u/TheC4T")
    r.login(username='*************', password='***************')
    cache = []
    inboxMessage = []
    file = 'cache.txt'
    def __init__(self):
        cache = self.cacheRead(self, self.file)
        self.bot_run(self)
        self.cacheSave(self, self.file)
        time.sleep(5)
        return self.inboxMessage
    def getPms(self):
        def bot_run():
            inbox = self.r.get_inbox(limit=25)
            print(self.cache)
            # print(r.get_friends())#this works
            for message in inbox:
                if message.id not in self.cache:
                    # print(message.id)
                    print(message.body)
                    # print(message.subject)
                    self.cache.append(message.id)
                    self.inboxMessage.append(message.body)
                # else:
                #     print("no messages")
        def cacheSave(self, file):
            with open(file, 'w') as f:
                for s in self.cache:
                    f.write(s + '\n')
        def cacheRead(self, file):
            with open(file, 'r') as f:
                cache1 = [line.rstrip('\n') for line in f]
            return cache1
        # while True: #threading is needed in order to run this as a loop. Probably gonna do this in the main method though
        # def getInbox(self):
        #     return self.inboxMessage
The exception is:
cache = self.cacheRead(self, self.file)
AttributeError: 'getPms' object has no attribute 'cacheRead'
I am new to working with classes in Python and need help working out what I am doing wrong. If you need any more information, I can add it. It worked when it was all functions, but now that I have tried to switch it to a class, it has stopped working.
Your cacheRead function (as well as bot_run and cacheSave) is indented too far, so it's defined in the body of your other function getPms. Thus it is only accessible inside of getPms. But you're trying to call it from __init__.
I'm not sure what you're trying to achieve here because getPms doesn't have anything else in it but three function definitions. As far as I can tell you should just take out the def getPms line and unindent the three functions it contains so they line up with the __init__ method.
Here are a few points:
Unless you're explicitly inheriting from some specific class, you can omit the parentheses:
In Python 3, class A(object):, class A(): and class A: are equivalent.
Your class and one of its methods have the same name. I'm not sure whether Python gets confused by this, but a reader probably will. You can name your class PMS and your method get, for example, so you'll obtain PMS.get(...)
With the current indentation, the cacheRead and cacheSave functions are simply inaccessible from __init__; why not move them into the class namespace?
When calling member functions, you don't need to pass self as the first argument, since you're already calling the function on this object. So instead of cache = self.cacheRead(self, self.file) you have to write: cache = self.cacheRead(self.file)
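The points above can be put together in a minimal sketch. It leaves out the Reddit calls entirely, and the names MessageCache, cache_read and cache_save are hypothetical stand-ins for the original class and methods. The methods sit at class level, they are called without an explicit self argument, and __init__ returns nothing (Python raises a TypeError if __init__ returns a value other than None).

```python
import os
import tempfile

class MessageCache:
    """Hypothetical stand-in for the getPms class, without the Reddit calls."""

    def __init__(self, path):
        self.path = path
        # self is passed implicitly; only the real argument is given.
        self.cache = self.cache_read(path)

    def cache_read(self, path):
        # Start with an empty cache if the file does not exist yet.
        if not os.path.exists(path):
            return []
        with open(path) as f:
            return [line.rstrip('\n') for line in f]

    def cache_save(self):
        with open(self.path, 'w') as f:
            for message_id in self.cache:
                f.write(message_id + '\n')

path = os.path.join(tempfile.mkdtemp(), 'cache.txt')
first = MessageCache(path)
first.cache.append('abc123')
first.cache_save()

second = MessageCache(path)
print(second.cache)  # ['abc123']
```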
I am running into a bit of an issue with keeping a context manager open through function calls. Here is what I mean:
There is a context-manager defined in a module which I use to open SSH connections to network devices. The "setup" code handles opening the SSH sessions and handling any issues, and the teardown code deals with gracefully closing the SSH session. I normally use it as follows:
from manager import manager
def do_stuff(device):
    with manager(device) as conn:
        output = conn.send_command("show ip route")
        #process output...
        return processed_output
In order to keep the SSH session open and not have to re-establish it across function calls, I would like to add an argument to "do_stuff" which can optionally return the SSH session along with the data returned from the SSH session, as follows:
def do_stuff(device, return_handle=False):
    with manager(device) as conn:
        output = conn.send_command("show ip route")
        #process output...
        if return_handle:
            return (processed_output, conn)
        else:
            return processed_output
I would like to be able to call this function "do_stuff" from another function, as follows, such that it signals to "do_stuff" that the SSH handle should be returned along with the output.
def do_more_stuff(device):
    data, conn = do_stuff(device, return_handle=True)
    output = conn.send_command("show users")
    #process output...
    return processed_output
However the issue that I am running into is that the SSH session is closed, due to the do_stuff function "returning" and triggering the teardown code in the context-manager (which gracefully closes the SSH session).
I have tried converting "do_stuff" into a generator, such that its state is suspended and perhaps causing the context-manager to stay open:
def do_stuff(device, return_handle=False):
    with manager(device) as conn:
        output = conn.send_command("show ip route")
        #process output...
        if return_handle:
            yield (processed_output, conn)
        else:
            yield processed_output
And calling it as such:
def do_more_stuff(device):
    gen = do_stuff(device, return_handle=True)
    data, conn = next(gen)
    output = conn.send_command("show users")
    #process output...
    return processed_output
However this approach does not seem to be working in my case, as the context-manager gets closed, and I get back a closed socket.
Is there a better way to approach this problem? Maybe my generator needs some more work...I think using a generator to hold state is the most "obvious" way that comes to mind, but overall should I be looking into another way of keeping the session open across function calls?
Thanks
I found this question because I was looking for a solution to an analogous problem where the object I wanted to keep alive was a pyvirtualdisplay.display.Display instance with selenium.webdriver.Firefox instances in it.
I also wanted any opened resources to die if an exception were raised during the display/browser instance creations.
I imagine the same could be applied to your SSH connection.
I recognize this is probably only a partial solution and contains less-than-best practices. Help is appreciated.
This answer is the result of an ad lib spike using the following resources to patch together my solution:
https://docs.python.org/3/library/contextlib.html#contextlib.ContextDecorator
http://www.wefearchange.org/2013/05/resource-management-in-python-33-or.html
(I do not yet fully grok what is described here though I appreciate the potential. The second link above eventually proved to be the most helpful by providing analogous situations.)
from pyvirtualdisplay.display import Display
from selenium.webdriver import Firefox
from contextlib import contextmanager, ExitStack
RFBPORT = 5904
def acquire_desktop_display(rfbport=RFBPORT):
    display_kwargs = {'backend': 'xvnc', 'rfbport': rfbport}
    display = Display(**display_kwargs)
    return display
def release_desktop_display(self):
    print("Stopping the display.")
    # browsers apparently die with the display so no need to call quits on them
    self.display.stop()
def check_desktop_display_ok(desktop_display):
    print("Some checking going on here.")
    return True
class XvncDesktopManager:
    max_browser_count = 1
    def __init__(self, check_desktop_display_ok=None, **kwargs):
        self.rfbport = kwargs.get('rfbport', RFBPORT)
        self.acquire_desktop_display = acquire_desktop_display
        self.release_desktop_display = release_desktop_display
        self.check_desktop_display_ok = check_desktop_display_ok \
            if check_desktop_display_ok is None else check_desktop_display_ok
    @contextmanager
    def _cleanup_on_error(self):
        with ExitStack() as stack:
            """push adds a context manager's __exit__() method
            to stack's callback stack."""
            stack.push(self)
            yield
            # The validation check passed and didn't raise an exception
            # Accordingly, we want to keep the resource, and pass it
            # back to our caller
            stack.pop_all()
    def __enter__(self):
        url = 'http://stackoverflow.com/questions/30905121/'\
              'keeping-context-manager-object-alive-through-function-calls'
        self.display = self.acquire_desktop_display(self.rfbport)
        with ExitStack() as stack:
            # add XvncDesktopManager instance's exit method to callback stack
            stack.push(self)
            self.display.start()
            self.browser_resources = [
                Firefox() for x in range(self.max_browser_count)
            ]
            for browser_resource in self.browser_resources:
                for url in (url, ):
                    browser_resource.get(url)
            """This is the last bit of magic.
            ExitStacks have a .close() method which unwinds
            all the registered context managers and callbacks
            and invokes their exit functionality."""
            # capture the function that calls all the exits
            # will be called later outside the context in which it was captured
            self.close_all = stack.pop_all().close
        # if something fails in this context in enter, cleanup
        with self._cleanup_on_error() as stack:
            if not self.check_desktop_display_ok(self):
                msg = "Failed validation for {!r}"
                raise RuntimeError(msg.format(self.display))
        # self is assigned to variable after "as",
        # manually call close_all to unwind callback stack
        return self
    def __exit__(self, *exc_details):
        # had to comment this out, unable to add this to callback stack
        # self.release_desktop_display(self)
        pass
I had a semi-expected result with the following:
kwargs = {
    'rfbport': 5904,
}
_desktop_manager = XvncDesktopManager(check_desktop_display_ok=check_desktop_display_ok, **kwargs)
with ExitStack() as stack:
    # context entered and what is inside the __enter__ method is executed
    # desktop_manager will have an attribute "close_all" that can be called explicitly to unwind the callback stack
    desktop_manager = stack.enter_context(_desktop_manager)
# I was able to manipulate the browsers inside of the display
# and outside of the context
# before calling desktop_manager.close_all()
browser, = desktop_manager.browser_resources
browser.get(url)
# close everything down when finished with resource
desktop_manager.close_all()  # does nothing, not in callback stack
# this functioned as expected
desktop_manager.release_desktop_display(desktop_manager)
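For the SSH case in the question, a simpler option may be to let the outermost caller own the context manager and pass the open connection into the helper functions, so nothing has to outlive its with block. The sketch below is only an illustration under that assumption: FakeConnection and this manager are hypothetical stand-ins for the real SSH module, which is not shown in the question.

```python
from contextlib import contextmanager

class FakeConnection:
    """Hypothetical stand-in for the SSH session object."""

    def __init__(self, device):
        self.device = device
        self.closed = False

    def send_command(self, command):
        if self.closed:
            raise RuntimeError("session is closed")
        return "{}: {}".format(self.device, command)

@contextmanager
def manager(device):
    conn = FakeConnection(device)
    try:
        yield conn
    finally:
        conn.closed = True   # teardown: close the session

def do_stuff(conn):
    # Works on a connection that the caller opened and will close.
    return conn.send_command("show ip route")

def do_more_stuff(device):
    with manager(device) as conn:
        routes = do_stuff(conn)                  # session still open here
        users = conn.send_command("show users")  # same session reused
    return routes, users, conn.closed

print(do_more_stuff("router1"))
# ('router1: show ip route', 'router1: show users', True)
```

The session is opened once, shared by both commands, and closed when the with block in do_more_stuff exits.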
I'm trying to figure out how to chain class methods to improve a utility class I've been writing - for reasons I'd prefer not to get into :)
Now suppose I wanted to chain class methods on a class instance (in this case, to set the cursor), e.g.:
# initialize the class instance
db = CRUD(table='users', public_fields=['name', 'username', 'email'])
#the desired interface class_instance.cursor(<cursor>).method(...)
with sql.read_pool.cursor() as c:
    db.cursor(c).get(target='username', where="omarlittle")
The confusing part is that I would prefer the cursor not to persist as a class attribute after .get(...) has been called and has returned. I'd also like to require that .cursor(cursor) be called first.
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields
    def fields(self):
        return ', '.join([f for f in self.public_fields])
    def get(self, target, where):
        #this is strictly for illustration purposes, I realize all
        #the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        query.format(fields=self.fields, table=self.table, target=target,
                     where=where)
        self.cursor.execute(query)
    def cursor(self, cursor):
        pass # this is where I get lost.
If I understand what you're asking, what you want is for the cursor method to return some object with a get method that works as desired. There's no reason the object it returns has to be self; it can instead return an instance of some cursor type.
That instance could have a back-reference to self, or it could get its own copy of whatever internals are needed to be a cursor, or it could be a wrapper around an underlying object from your low-level database library that knows how to be a cursor.
If you look at the DB API 2.0 spec, or implementations of it like the stdlib's sqlite3, that's exactly how they do it: A Database or Connection object (the thing you get from the top-level connect function) has a cursor method that returns a Cursor object, and that Cursor object has an execute method.
So:
class CRUDCursor(object):
    def __init__(self, c, crud):
        self.crud = crud
        self.cursor = however_you_get_an_actual_sql_cursor(c)
    def get(self, target, where):
        #this is strictly for illustration purposes, I realize all
        #the vulnerabilities this leaves me exposed to.
        query = "SELECT {fields} FROM {table} WHERE {target} = {where}"
        # str.format returns a new string, so keep the result;
        # fields is a method, so it has to be called
        query = query.format(fields=self.crud.fields(), table=self.crud.table,
                             target=target, where=where)
        self.cursor.execute(query)
        # you may want this to return something as well?
class CRUD(object):
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields
    def fields(self):
        return ', '.join([f for f in self.public_fields])
    # no get method
    def cursor(self, cursor):
        # argument order matches CRUDCursor.__init__(c, crud)
        return CRUDCursor(cursor, self)
However, there still seems to be a major problem with your example. Normally, after you execute a SELECT statement on a cursor, you want to fetch the rows from that cursor. You're not keeping the cursor object around in your "user" code, and you explicitly don't want the CRUD object to keep its cursor around, so… how do you expect to do that? Maybe get is supposed to return self.cursor.fetchall() at the end or something?
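As a rough, runnable illustration of this pattern, here is a sketch against an in-memory sqlite3 database. It assumes that get should return the fetched rows and that the cursor passed in is already a real DB API cursor. The whitelist check on target and the ? placeholder for the value are additions made for the sketch, not part of the original design.

```python
import sqlite3

class CRUDCursor:
    def __init__(self, cursor, crud):
        self.cursor = cursor
        self.crud = crud

    def get(self, target, where):
        # Column names cannot be bound as parameters, so only accept
        # names that the CRUD object already knows about.
        if target not in self.crud.public_fields:
            raise ValueError("unknown field: {}".format(target))
        query = "SELECT {fields} FROM {table} WHERE {target} = ?".format(
            fields=self.crud.fields(), table=self.crud.table, target=target)
        self.cursor.execute(query, (where,))   # the value is bound safely
        return self.cursor.fetchall()

class CRUD:
    def __init__(self, table, public_fields):
        self.table = table
        self.public_fields = public_fields

    def fields(self):
        return ', '.join(self.public_fields)

    def cursor(self, cursor):
        # The CRUD object never stores the cursor itself.
        return CRUDCursor(cursor, self)

connection = sqlite3.connect(':memory:')
connection.execute("CREATE TABLE users (name TEXT, username TEXT, email TEXT)")
connection.execute("INSERT INTO users VALUES (?, ?, ?)",
                   ("Omar Little", "omarlittle", "omar@example.com"))

db = CRUD(table='users', public_fields=['name', 'username', 'email'])
rows = db.cursor(connection.cursor()).get(target='username', where='omarlittle')
print(rows)  # [('Omar Little', 'omarlittle', 'omar@example.com')]
```

Because cursor returns a separate object, calling get without first calling cursor is impossible, and nothing cursor-related is left on db afterwards.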
I'm trying to implement a wrapper around a redis database that does some bookkeeping, and I thought about using descriptors. I have an object with a bunch of fields: frames, failures, etc., and I need to be able to get, set, and increment the field as needed. I've tried to implement an Int-Like descriptor:
class IntType(object):
    def __get__(self,instance,owner):
        # issue a GET database command
        return db.get(my_val)
    def __set__(self,instance,val):
        # issue a SET database command
        db.set(instance.name,val)
    def increment(self,instance,count):
        # issue an INCRBY database command
        db.hincrby(instance.name,count)
class Stream:
    _prefix = 'stream'
    frames = IntType()
    failures = IntType()
    uuid = StringType()
s = Stream()
s.frames.increment(1) # AttributeError: 'float' object has no attribute 'increment'
It seems like I can't access the increment() method in my descriptor. I can't have increment be defined in the object that __get__ returns, because that would require an additional db query when all I want to do is increment! I also don't want increment() on the Stream class: later on, when I want additional fields like strings or sets in Stream, I'd need to type check the heck out of everything.
Does this work?
class Stream:
    _prefix = 'stream'
    def __init__(self):
        self.frames = IntType()
        self.failures = IntType()
        self.uuid = StringType()
Why not define the magic method __iadd__ as well as __get__ and __set__? This will allow you to do normal addition with assignment on the class. It also means you can treat the increment separately from the get function and thereby minimise the database accesses.
So change:
def increment(self,instance,count):
    # issue an INCRBY database command
    db.hincrby(instance.name,count)
to:
def __iadd__(self,other):
    # your code goes here
Try this:
class IntType(object):
    def __get__(self,instance,owner):
        class IntValue():
            def increment(self,count):
                # issue an INCRBY database command
                db.hincrby(self.name,count)
            def getValue(self):
                # issue a GET database command
                return db.get(my_val)
        return IntValue()
    def __set__(self,instance,val):
        # issue a SET database command
        db.set(instance.name,val)
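A runnable variant of the same idea is sketched below. The descriptor's __get__ returns a small proxy that only knows its key, so increment issues a single INCRBY and no GET. FakeDB is a hypothetical in-memory stand-in for the Redis client, and the key layout ("prefix:field") and the names IntField and value are assumptions made for the sketch. It relies on __set_name__, which needs Python 3.6 or later.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for a Redis client."""

    def __init__(self):
        self.data = {}
        self.commands = []   # records which commands were issued

    def get(self, key):
        self.commands.append(('GET', key))
        return self.data.get(key, 0)

    def set(self, key, value):
        self.commands.append(('SET', key))
        self.data[key] = value

    def incrby(self, key, amount):
        self.commands.append(('INCRBY', key))
        self.data[key] = self.data.get(key, 0) + amount
        return self.data[key]

db = FakeDB()

class IntField:
    """Proxy returned by the descriptor; touches the db only on demand."""

    def __init__(self, key):
        self.key = key

    def increment(self, count=1):
        return db.incrby(self.key, count)

    def value(self):
        return db.get(self.key)

class IntType:
    def __set_name__(self, owner, name):
        self.name = name   # attribute name, e.g. 'frames'

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return IntField('{}:{}'.format(instance._prefix, self.name))

    def __set__(self, instance, value):
        db.set('{}:{}'.format(instance._prefix, self.name), value)

class Stream:
    _prefix = 'stream'
    frames = IntType()

s = Stream()
s.frames = 10
s.frames.increment(5)
print(s.frames.value())  # 15
print(db.commands)
# [('SET', 'stream:frames'), ('INCRBY', 'stream:frames'), ('GET', 'stream:frames')]
```

The trade-off is that s.frames no longer evaluates directly to a number; reading the value needs an explicit call such as value().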