The race-condition-free way of updating a variable in redis is:
    r = redis.Redis()
    with r.pipeline() as p:
        while 1:
            try:
                p.watch(KEY)
                val = p.get(KEY)
                newval = int(val) + 42
                p.multi()
                p.set(KEY, newval)
                p.execute()  # raises WatchError if anyone else changed KEY
                break
            except redis.WatchError:
                continue  # retry
This is significantly more complex than the straightforward version (which contains a race condition):
    r = redis.Redis()
    val = r.get(KEY)
    newval = int(val) + 42
    r.set(KEY, newval)
So I thought a context manager would make this easier to work with. However, I'm having problems.
My initial idea was:
    with update(KEY) as val:
        newval = val + 42
        somehow return newval to the contextmanager...?
There wasn't an obvious way to do the last line, so I tried:
    @contextmanager
    def update(key, cn=None):
        """Usage::
            with update(KEY) as (p, val):
                newval = int(val) + 42
                p.set(KEY, newval)
        """
        r = cn or redis.Redis()
        with r.pipeline() as p:
            while 1:
                try:
                    p.watch(key)  # --> immediate mode
                    val = p.get(key)
                    p.multi()  # --> back to buffered mode
                    yield (p, val)
                    p.execute()  # raises WatchError if anyone has changed `key`
                    break  # success, break out of while loop
                except redis.WatchError:
                    pass  # someone else got there before us, retry.
This works great as long as no WatchError is raised. When one is raised and caught, I get:
    File "c:\python27\Lib\contextlib.py", line 28, in __exit__
        raise RuntimeError("generator didn't stop")
    RuntimeError: generator didn't stop
What am I doing wrong?
I think the problem is that you yield multiple times (once per retry), but a context manager is only entered once: in a @contextmanager generator, the code before the yield plays the role of __enter__ and the code after it the role of __exit__. So as soon as the yield can be executed more than once, you have a problem.
I’m not perfectly sure how to solve this in a good way, and I can’t test it either, so I’m only giving some suggestions.
First of all, I would avoid yielding the rather internal p; you should yield some object that is specifically made for the update process. For example something like this:
    with update(KEY) as updater:
        updater.value = int(updater.original) + 42
Of course this still doesn’t solve the multiple yields, and you cannot yield that object earlier as you won’t have the original value at that point either. So instead, we could specify a delegate responsible for the value updating instead.
    with update(KEY) as updater:
        updater.process = lambda value: value + 42
This would store a function inside the yielded object which you can then use inside the context manager to keep trying to update the value until it succeeded. And you can yield that updater from the context manager early, before entering the while loop.
Of course, if you have made it this far, there isn’t actually any need for a context manager left. Instead, you can just make a function:
    update(key, lambda value: value + 42)
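A minimal sketch of such a function, assuming the redis-py pipeline calls already used in the question (watch, get, multi, set, execute). The `client` and `watch_error` parameters are additions for illustration and do not appear in the original posts; `watch_error` exists only so the retry loop can be exercised against a stub without a running server.

```python
def update(client, key, func, watch_error=None):
    """Apply func to the value stored at key, retrying until the write succeeds.

    client is assumed to be a redis-py style connection (anything with a
    compatible pipeline() works). watch_error defaults to redis.WatchError.
    """
    if watch_error is None:
        import redis  # imported lazily so the helper can be tried with a stub
        watch_error = redis.WatchError
    with client.pipeline() as p:
        while True:
            try:
                p.watch(key)              # immediate mode: commands run at once
                new_value = func(p.get(key))
                p.multi()                 # back to buffered mode
                p.set(key, new_value)
                p.execute()               # raises the watch error if key changed
                return new_value
            except watch_error:
                continue                  # someone else got there first, retry
```

With a real connection the call would look like `update(redis.Redis(), KEY, lambda value: int(value) + 42)`. Because the retry loop lives inside an ordinary function, nothing is yielded more than once and the "generator didn't stop" error cannot occur.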
I am learning functional programming at the moment, and of course I want to apply what I have learned whenever I can.
I am in the middle of a project where I have to send some http requests to a server, and I want to count how many of these requests returned a status_code 200.
Right now I have some stupid code setup as follows:
    global counter
    while True:
        now_url = "http://127.0.0.1"
        status, value = getStatus(now_url)
        counter += value
Here counter is a global counter: if getStatus gets a status_code of 200, value will be 1, otherwise it will be 0.
So I was thinking maybe instead of using a global counter I could just pass around the state of the previous loop, and I would get rid of the stupid global counter value.
So I tried to implement getStatus in a monadic way, with bind and return, like this:
    def bind(f, arg):
        res = f(arg[0])
        return res[0], arg[1] + res[1]
    def ret(f):
        return (f, 0)
But this is not trivial, since I am not using function composition in the getStatus function, which is defined like this:
    def getStatus(now_url):
        try:
            response = requests.get(now_url)
            if response.status_code == 200:
                return response.status_code, 1
            else:
                return response.status_code, 0
        except Exception as e:
            return e, 0
So the question is how to restructure my code in such a way that I can use the power of monads to count the number of status_code == 200.
Hope you can help :)
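One way to drop the global, sketched under the assumption that the URLs form a finite sequence and that getStatus keeps returning a (status, value) pair, is to thread the running count through a fold. This is a plain fold rather than a full monad, although it does the same bookkeeping that bind does with the second tuple element. The names `count_ok` and `get_status` are hypothetical; the status function is passed in so the sketch can be tried without a network.

```python
from functools import reduce


def count_ok(urls, get_status):
    """Count successful requests by passing the count from step to step."""
    def step(count, url):
        _status, value = get_status(url)   # value is 1 for HTTP 200, else 0
        return count + value               # the new state for the next step
    return reduce(step, urls, 0)           # 0 is the initial count
```

Calling `count_ok(list_of_urls, getStatus)` would then return the number of 200 responses without touching any global state.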
This is my first question on Stack Overflow, and I'm a total Python beginner.
To get more familiar with Python, I want to write a small backup program. The main part is done, but now I want to make it a bit more "portable" by using a config file, which I want to validate.
My class "getBackupOptions" should return a validated dict, enriched with "GlobalOptions" and "BackupOption", so that I finally get a complete "BackupOption" dict when I access "getBackupOptions.BackupOptions".
My question is how to simplify my code. (This example is easy, because it is only the function that checks whether the path should be searched recursively or not.)
For each possible error I have to write a new try/except block. Can I simplify that?
Is there perhaps another way to validate config files/arrays?
    class getBackupOptions:
        def __init__(self,BackupOption,GlobalOptions):
            self.BackupOption = BackupOption
            self.GlobalOptions = GlobalOptions
            self.getRecusive()
        def getRecusive(self):
            try:
                if self.BackupOption['recursive'] != None:
                    pass
                else:
                    raise KeyError
            except KeyError:
                try:
                    if self.GlobalOptions['recursive'] != None:
                        self.BackupOption['recursive'] = self.GlobalOptions['recursive']
                    else:
                        raise KeyError
                except KeyError:
                    print('Recusive in: ' + str(self.BackupOption) + ' and Global is not set!')
                    exit()
At the moment I only catch a KeyError, but what if the key is there but holds something other than "True" or "False"?
Thanks a lot for your help!
You may try this:
    class getBackupOptions:
        def __init__(self,BackupOption,GlobalOptions):
            self.BackupOption = BackupOption
            self.GlobalOptions = GlobalOptions
            self.getRecusive()
        def getRecusive(self):
            if self.BackupOption.get('recursive') == 'True' and self.GlobalOptions.get('recursive') == 'True':
                self.BackupOption['recursive'] = self.GlobalOptions['recursive']
            else:
                print('Recusive in: ' + str(self.BackupOption) + ' and Global is not set!')
                exit()
The get method is used here, so a missing key does not raise a KeyError; get simply returns None.
Any value other than the string 'True' fails the comparison, so it is treated as not set and the else branch runs.
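Note that the snippet above only succeeds when both dicts contain 'True', which differs from the fallback behaviour in the question. A minimal sketch that keeps the fallback (per-backup value first, then the global one) and also rejects values that are neither true nor false is shown below. The helper name `resolve_flag` and the accepted spellings are assumptions for illustration, not part of either post.

```python
def resolve_flag(name, backup_option, global_options):
    """Return a validated boolean for name, preferring the per-backup value."""
    for source in (backup_option, global_options):
        value = source.get(name)          # .get() returns None for a missing key
        if value is None:
            continue                      # not set here, try the next source
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in ("true", "false"):
            return text == "true"
        raise ValueError("%s must be True or False, got %r" % (name, value))
    raise KeyError("%s is set neither per backup nor globally" % name)
```

The same helper can be reused for every boolean option, so each new option costs one call instead of a new nested try/except block. The caller decides whether a KeyError or ValueError should print a message and exit.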
In the function getLink(urls), I have return (cloud,parent,children).
In the main function, I have (cloud,parent,children) = getLink(urls), and I get an error on this line: TypeError: 'NoneType' object is not iterable
parent and children are both lists of http links. Since I can't paste them here: parent is a list containing about 30 links; children is a list containing about 30 items, each of which is about 10-100 links separated by ",".
cloud is a list containing about 100 words, like this: ['official store', 'Java Applets Centre', 'About Google', 'Web History'.....]
I don't know why I get the error. Is there anything wrong with how I pass the parameter? Or is it because the lists take up too much space?
    # crawler(url): read a web page and return a list of its URLs and a list of their names
    def crawler(url):
        try:
            m = urllib.request.urlopen(url)
            msg = m.read()
            ....
            return (list(set(list(links))),list(set(list(titles))) )
        except Exception:
            print("url wrong!")
    # This is the function that goes wrong: it throws an exception here (the error
    # I mentioned), and the while loop ends before len(parent) reaches 100.
    def getLink(urls):
        try:
            newUrl=[]
            parent = []
            children =[]
            cloud =[]
            i=0
            while len(parent)<=100:
                url = urls[i]
                if url in parent:
                    i += 1
                    continue
                (links, titles) = crawler(url)
                parent.append(url)
                children.append(",".join(links))
                cloud = cloud + titles
                newUrl= newUrl+links
                print ("links: ",links)
                i += 1
                if i == len(urls):
                    urls = list(set(newUrl))
                    newUrl = []
                    i = 0
            return (cloud,parent,children)
        except Exception:
            print("can not get links")
    def readfile(file):
        #not related, this function will return a list of url
    def main():
        file='sampleinput.txt'
        urls=readfile(file)
        (cloud,parent,children) = getLink(urls)
    if __name__=='__main__':
        main()
There might be a way that your function ends without reaching the explicit return statement.
Look at the following example code.
    def get_values(x):
        if x:
            return 'foo', 'bar'
    x, y = get_values(1)
    x, y = get_values(0)
When the function is called with 0 as parameter the return is skipped and the function will return None.
You could add an explicit return as the last line of your function. In the example given in this answer it would look like this.
    def get_values(x):
        if x:
            return 'foo', 'bar'
        return None, None
Update after seeing the code
When the exception is triggered in getLink, you just print something and fall off the end of the function. There is no return statement on that path, so Python returns None. The calling function then tries to unpack None into three values, and that fails.
Change your exception handling to return a tuple with three values, as you do when everything is fine. Using None for each value is a good idea because it shows you that something went wrong. Additionally, I wouldn't print anything in the function. Don't mix business logic and input/output.
    except Exception:
        return None, None, None
Then in your main function use the following:
    cloud, parent, children = getLink(urls)
    if cloud is None:
        print("can not get links")
    else:
        # do some more work
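The same thing happens one level down: crawler also has an except branch that only prints, so `(links, titles) = crawler(url)` fails inside getLink whenever a page cannot be read, and that failure is what lands in getLink's own except branch. A small sketch of the failure and of one possible fix follows. Both functions are stand-ins that always fail, not the original crawler, and the URL is a placeholder.

```python
def crawler_printing(url):
    """Mimics the original except branch: prints and falls off the end."""
    try:
        raise IOError("cannot open " + url)   # simulate an unreachable page
    except Exception:
        print("url wrong!")                   # no return, so the result is None


def crawler_returning(url):
    """Same failure, but the except branch returns the usual two-list shape."""
    try:
        raise IOError("cannot open " + url)
    except Exception:
        return [], []


try:
    links, titles = crawler_printing("http://example.invalid")
    unpack_failed = False
except TypeError:
    unpack_failed = True                      # None cannot be unpacked

# With a consistent return shape the caller can keep going.
links, titles = crawler_returning("http://example.invalid")
```

Returning empty lists lets the loop in getLink simply skip a broken page. Returning None values, as suggested above, works as well, provided the caller checks for them before using the result.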
So I'm working on a chemistry project for fun, and I have a function that initializes a list from a text file. What I want to do is make it so the function replaces itself with a list. Here's my first attempt at it, which randomly will or won't work, and I don't know why:
    def periodicTable():
        global periodicTable
        tableAtoms = open('/Users/username/Dropbox/Python/Chem Project/atoms.csv','r')
        listAtoms = tableAtoms.readlines()
        tableAtoms.close()
        del listAtoms[0]
        atoms = []
        for atom in listAtoms:
            atom = atom.split(',')
            atoms.append(Atom(*atom))
        periodicTable = atoms
It gets called in this way:
    def findAtomBySymbol(symbol):
        try:
            periodicTable()
        except:
            pass
        for atom in periodicTable:
            if atom.symbol == symbol:
                return atom
        return None
Is there a way to make this work?
Don't do that. The correct thing to do would be using a decorator that ensures the function is only executed once and caches the return value:
    def cachedfunction(f):
        cache = []
        def deco(*args, **kwargs):
            if cache:
                return cache[0]
            result = f(*args, **kwargs)
            cache.append(result)
            return result
        return deco
    @cachedfunction
    def periodicTable():
        #etc
That said, there's nothing stopping you from replacing the function itself after it has been called, so your approach should generally work. I think the reason it doesn't is because an exception is thrown before you assign the result to periodicTable and thus it never gets replaced. Try removing the try/except block or replacing the blanket except with except TypeError to see what exactly happens.
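To make the caching behaviour concrete, here is a small self-contained run of the decorator from this answer. The loader body and the `load_count` counter are stand-ins for illustration; the real function would read the CSV file.

```python
def cachedfunction(f):
    cache = []                       # holds at most one item: the first result
    def deco(*args, **kwargs):
        if cache:
            return cache[0]          # already computed, skip the call
        result = f(*args, **kwargs)
        cache.append(result)
        return result
    return deco


load_count = 0                       # counts how often the body really runs


@cachedfunction
def periodic_table():
    global load_count
    load_count += 1
    return ["H", "He", "Li"]         # placeholder for the parsed atoms


first = periodic_table()
second = periodic_table()            # served from the cache
```

Both calls return the same list object and the body runs once. Note that this decorator ignores its arguments when deciding whether to reuse the result, which is fine for a loader that takes none.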
This is very bad practice.
What would be better is to have your function remember if it has already loaded the table:
    def periodicTable(_table=[]):
        if _table:
            return _table
        tableAtoms = open('/Users/username/Dropbox/Python/Chem Project/atoms.csv','r')
        listAtoms = tableAtoms.readlines()
        tableAtoms.close()
        del listAtoms[0]
        atoms = []
        for atom in listAtoms:
            atom = atom.split(',')
            atoms.append(Atom(*atom))
        _table[:] = atoms
        return _table  # also return the table on the first (loading) call
The first two lines check to see if the table has already been loaded, and if it has it simply returns it.
I'm creating some objects from files (validators from templates xsd files, to draw together other xsd files, as it happens), and I'd like to recreate the objects when the file on disk changes.
I could create something like:
    def getobj(fname, cache = {}):
        try:
            obj, lastloaded = cache[fname]
            if lastloaded < last_time_written(fname):
                # same stuff as in except clause
        except KeyError:
            obj = create_from_file(fname)
            cache[fname] = (obj, currenttime)
        return obj
However, I would prefer to use someone else's tested code if it exists. Is there an existing library that does something like this?
Update: I'm using python 2.7.1.
Your code (including the cache logic) looks fine.
Consider moving the cache variable outside the function definition. That will make it possible to add other functions to clear or inspect the cache.
If you want to look at code that does something similar, look at the source for the filecmp module: http://hg.python.org/cpython/file/2.7/Lib/filecmp.py The interesting part is how the stat module is used to determine whether a file has changed. Here is the signature function:
    def _sig(st):
        return (stat.S_IFMT(st.st_mode),
                st.st_size,
                st.st_mtime)
Three thoughts.
Use try... except... else for a neater control flow.
File modification times are notoriously unstable -- in particular, they don't necessarily correspond to the most recent time the file was modified!
Python 3 contains a caching decorator: functools.lru_cache. Here's the source.
    def lru_cache(maxsize=100):
        """Least-recently-used cache decorator.

        If *maxsize* is set to None, the LRU features are disabled and the cache
        can grow without bound.

        Arguments to the cached function must be hashable.

        View the cache statistics named tuple (hits, misses, maxsize, currsize) with
        f.cache_info(). Clear the cache and statistics with f.cache_clear().
        Access the underlying function with f.__wrapped__.

        See: http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used
        """
        # Users should only access the lru_cache through its public API:
        # cache_info, cache_clear, and f.__wrapped__
        # The internals of the lru_cache are encapsulated for thread safety and
        # to allow the implementation to change (including a possible C version).

        def decorating_function(user_function,
                    tuple=tuple, sorted=sorted, len=len, KeyError=KeyError):
            hits = misses = 0
            kwd_mark = (object(),)  # separates positional and keyword args
            lock = Lock()  # needed because ordereddicts aren't threadsafe

            if maxsize is None:
                cache = dict()  # simple cache without ordering or size limit

                @wraps(user_function)
                def wrapper(*args, **kwds):
                    nonlocal hits, misses
                    key = args
                    if kwds:
                        key += kwd_mark + tuple(sorted(kwds.items()))
                    try:
                        result = cache[key]
                        hits += 1
                    except KeyError:
                        result = user_function(*args, **kwds)
                        cache[key] = result
                        misses += 1
                    return result
            else:
                cache = OrderedDict()  # ordered least recent to most recent
                cache_popitem = cache.popitem
                cache_renew = cache.move_to_end

                @wraps(user_function)
                def wrapper(*args, **kwds):
                    nonlocal hits, misses
                    key = args
                    if kwds:
                        key += kwd_mark + tuple(sorted(kwds.items()))
                    try:
                        with lock:
                            result = cache[key]
                            cache_renew(key)  # record recent use of this key
                            hits += 1
                    except KeyError:
                        result = user_function(*args, **kwds)
                        with lock:
                            cache[key] = result  # record recent use of this key
                            misses += 1
                            if len(cache) > maxsize:
                                cache_popitem(0)  # purge least recently used cache entry
                    return result

            def cache_info():
                """Report cache statistics"""
                with lock:
                    return _CacheInfo(hits, misses, maxsize, len(cache))

            def cache_clear():
                """Clear the cache and cache statistics"""
                nonlocal hits, misses
                with lock:
                    cache.clear()
                    hits = misses = 0

            wrapper.cache_info = cache_info
            wrapper.cache_clear = cache_clear
            return wrapper

        return decorating_function
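For orientation, a short usage sketch of the public API named in the docstring (cache_info and cache_clear), using functools.lru_cache from Python 3. The `load` function and its recorded `calls` list are hypothetical stand-ins for an expensive file-based constructor.

```python
from functools import lru_cache

calls = []                       # records every real (uncached) invocation


@lru_cache(maxsize=2)
def load(name):
    calls.append(name)
    return name.upper()          # placeholder for building an object from a file


load("a")
load("a")                        # second call is answered from the cache
load("b")
info = load.cache_info()         # named tuple: hits, misses, maxsize, currsize

load.cache_clear()               # drops all entries and resets the statistics
info_after_clear = load.cache_info()
```

The cache is keyed only on the call arguments, so on its own it will not notice that a file on disk has changed. For the use case in the question it would have to be combined with something like a modification-time check, or cleared explicitly.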
Unless there is a specific reason to pass it as an argument, I would make cache a global object.