How to extend logging without aggressively modifying code? [Write clean code] - python

Let's say I have a calculate() method which has a complicated calculation involving many variables, and I want to log the values of those variables at different phases (EDIT: not only for verification but for data-study purposes). For example:
# These assignments are arbitrary,
# but my calculate() method is more complex
def calculate(a, b):
    c = 2*a + b
    d = a - b
    if c > d + 10:
        g = another_calc(a, c)
    else:
        g = another_calc(a, d)
    return c, d, g

def another_calc(a, c_d):
    e = a + c_d
    f = a * c_d
    g = e + f
    return g
You may assume the method will be modified a lot for experimental exploration.
There is not much logging here, and I want to record what happens. For example, I can write aggressive code like this:
# These assignments are arbitrary,
# but my calculate() method is more complex
def calculate(a, b):
    info = {"a": a, "b": b}
    c = 2*a + b
    d = a - b
    info["c"], info["d"] = c, d
    if c > d + 10:
        info["switch"] = "entered c"
        g, info = another_calc(a, c, info)
    else:
        info["switch"] = "entered d"
        g, info = another_calc(a, d, info)
    return c, d, g, info

def another_calc(a, c_d, info):
    e = a + c_d
    f = a * c_d
    g = e + f
    info["e"], info["f"], info["g"] = e, f, g
    return g, info
This serves my purpose (I get the info object, which is then exported as CSV for my further study).
But it is pretty ugly to add more (non-functional) lines to the originally clean calculate() method, changing its signature and return value.
Can I write cleaner code?
I am wondering whether it is possible to use a decorator to wrap this method. Hope you guys have some great answers. Thanks.

One way to write cleaner code (in my opinion) is to wrap the info dictionary inside a class.
Here is my simple code example:
# These assignments are arbitrary,
# but my calculate() method is more complex
def calculate(a, b, logger):
    logger.log("a", a)
    logger.log("b", b)
    c = 2*a + b
    d = a - b
    logger.log("c", c)
    logger.log("d", d)
    if c > d + 10:
        logger.log("switch", "entered c")
        g = another_calc(a, c, logger)
    else:
        logger.log("switch", "entered d")
        g = another_calc(a, d, logger)
    return c, d, g

def another_calc(a, c_d, logger):
    e = a + c_d
    f = a * c_d
    g = e + f
    logger.log("e", e)
    logger.log("f", f)
    logger.log("g", g)
    return g

class Logger(object):
    def __init__(self):
        self.data = []  # instance attribute, so each Logger keeps its own records

    def log(self, key, value):
        self.data.append({key: value})

    def getLog(self):
        return self.data

logger = Logger()
print(calculate(4, 7, logger))
print(logger.getLog())
Pros and cons
I use a separate logger class here because then the calling code does not need to know how the logger is implemented. In the example it just collects key-value pairs, but if needed you can change the implementation by creating a new logger.
Also, you have a way to choose how to print the data or where to send the output. Maybe you could define an interface for Logger.
I used key-value pairs because it looked like that was all you needed.
Now, using the logger, we need to change the method signature. Of course, you could give the parameter a default value of None, but then that None value would have to be checked all the time, which is why I did not define a default. If you own the code and can change every reference to the calculate() method, it should not be a problem.
There is also one interesting thing that could be important later. When you have finished debugging your output and no longer need to log anything, you can just plug in a null object. Using a null object, you can remove all logging without changing the calling code again.
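For illustration, a minimal sketch of such a null object (the name NullLogger is mine, not from the code above); it has the same interface as Logger but discards everything:
class NullLogger(object):
    def log(self, key, value):
        pass  # deliberately do nothing

    def getLog(self):
        return []

# calculate(4, 7, NullLogger()) still runs unchanged, but records nothing.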
I tried to think of a way to use a decorator but did not find a good one. If only the output needs to be logged, then a decorator could work.
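For completeness, a rough sketch of such an output-only decorator (the name log_result is my own, hypothetical choice); it can only see arguments and return values, not the intermediate variables inside calculate():
import functools

def log_result(logger):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # record the call and its result under the function's name
            logger.log(func.__name__, {"args": args, "kwargs": kwargs, "result": result})
            return result
        return wrapper
    return decorator

# Usage sketch:
# @log_result(logger)
# def calculate(a, b): ...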

Related

Pytest fixtures not persisting despite session scope

I have the following pytest tests set up. I would like my_factory to be called only once per name in names, and file_factory to be called only once per filename in filenames:
@pytest.fixture(scope="session", params=names)
def my_factory(request):
    name = request.param
    # create a, b, c which are expensive objects to construct
    return a, b, c

@pytest.fixture(scope="session", params=filenames)
def file_factory(request):
    fname = request.param
    # create f which is a big file to open
    return f

def test_single(my_factory, file_factory):
    a, b, c = my_factory
    f = file_factory
    # process file f with a, b, c
    # assert successful?

def test_double(my_factory, file_factory):
    a, b, c = my_factory
    f = file_factory
    # process file f backward with a, b, c
    # assert successful?
I thought scope='session' would do this for me; however, when I print some debugging statements from within each factory function, I see the print statements multiple times for each name in my_factory and each filename in file_factory. Is there something I am doing wrong, or is it possible to do what I am looking for?
The only other way I could think of was some sort of function that is executed first which creates a map of all of the objects I am looking for, but that seems clunky.
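For what it's worth, a rough sketch of that "build a map once" idea, using a single session-scoped fixture per map that constructs everything up front (make_objects and open_file stand in for your own expensive construction code, and the list values are placeholders):
import pytest

names = ["name1", "name2"]       # placeholder values
filenames = ["file1", "file2"]   # placeholder values

@pytest.fixture(scope="session")
def object_map():
    # each expensive construction happens exactly once per session
    return {name: make_objects(name) for name in names}

@pytest.fixture(scope="session")
def file_map():
    return {fname: open_file(fname) for fname in filenames}

@pytest.mark.parametrize("name", names)
@pytest.mark.parametrize("filename", filenames)
def test_single(object_map, file_map, name, filename):
    a, b, c = object_map[name]
    f = file_map[filename]
    # process file f with a, b, c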

Refactoring nested loop with shared variables

I have a function that processes some quite nested data, using nested loops. Its simplified structure is something like this:
def process_elements(root):
    for a in root.elements:
        if a.some_condition:
            continue
        for b in a.elements:
            if b.some_condition:
                continue
            for c in b.elements:
                if c.some_condition:
                    continue
                for d in c.elements:
                    if d.some_condition:
                        do_something_using_all(a, b, c, d)
This does not look very pythonic to me, so I want to refactor it. My idea was to break it in multiple functions, like:
def process_elements(root):
    for a in root.elements:
        if a.some_condition:
            continue
        process_a_elements(a)

def process_a_elements(a):
    for b in a.elements:
        if b.some_condition:
            continue
        process_b_elements(b)

def process_b_elements(b):
    for c in b.elements:
        if c.some_condition:
            continue
        process_c_elements(c)

def process_c_elements(c):
    for d in c.elements:
        if d.some_condition:
            do_something_using_all(a, b, c, d)  # Problem: I do not have a nor b!
As you can see, at the most nested level I need to do something using all of its "parent" elements. The functions would have their own scopes, so I couldn't access those elements. Passing all the previous elements to each function (like process_c_elements(c, a, b)) looks ugly and not very pythonic to me either...
Any ideas?
I don't know the exact data structures or the complexity of your code, but you could try using a list to pass the object references down the daisy-chained functions, something like the following:
def process_elements(root):
    for a in root.elements:
        if a.some_condition:
            continue
        listobjects = []
        listobjects.append(a)
        process_a_elements(a, listobjects)

def process_a_elements(a, listobjects):
    for b in a.elements:
        if b.some_condition:
            continue
        listobjects.append(b)
        process_b_elements(b, listobjects)

def process_b_elements(b, listobjects):
    for c in b.elements:
        if c.some_condition:
            continue
        listobjects.append(c)
        process_c_elements(c, listobjects)

def process_c_elements(c, listobjects):
    for d in c.elements:
        if d.some_condition:
            listobjects.append(d)
            do_something_using_all(listobjects)

def do_something_using_all(listobjects):
    print(listobjects)
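A different option, not from either of the answers here, is a single generator that keeps the nesting in one place and yields the full (a, b, c, d) tuples:
def iter_matching_elements(root):
    # yields every (a, b, c, d) combination that passes the conditions
    for a in root.elements:
        if a.some_condition:
            continue
        for b in a.elements:
            if b.some_condition:
                continue
            for c in b.elements:
                if c.some_condition:
                    continue
                for d in c.elements:
                    if d.some_condition:
                        yield a, b, c, d

def process_elements(root):
    for a, b, c, d in iter_matching_elements(root):
        do_something_using_all(a, b, c, d)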
FWIW, I've found a solution, which is to encapsulate all the processing inside a class and keep attributes that track the currently processed elements:
class ElementsProcessor:
    def __init__(self, root):
        self.root = root
        self.currently_processed_a = None
        self.currently_processed_b = None

    def process_elements(self):
        for a in self.root.elements:
            if a.some_condition:
                continue
            self.process_a_elements(a)

    def process_a_elements(self, a):
        self.currently_processed_a = a
        for b in a.elements:
            if b.some_condition:
                continue
            self.process_b_elements(b)

    def process_b_elements(self, b):
        self.currently_processed_b = b
        for c in b.elements:
            if c.some_condition:
                continue
            self.process_c_elements(c)

    def process_c_elements(self, c):
        for d in c.elements:
            if d.some_condition:
                do_something_using_all(
                    self.currently_processed_a,
                    self.currently_processed_b,
                    c,
                    d,
                )
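Usage is then just a matter of instantiating the class and calling the entry point:
processor = ElementsProcessor(root)
processor.process_elements()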

How would I run a function given its name?

I have a large number of blending functions:
mix(a, b)
add(a, b)
sub(a, b)
xor(a, b)
...
These functions all take the same inputs and provide different outputs, all of the same type.
However, I do not know which function must be run until runtime.
How would I go about implementing this behavior?
Example code:
def add(a, b):
    return a + b

def mix(a, b):
    return a * b

# Required blend -> decided by other code
blend_name = "add"

a = input("Some input")
b = input("Some other input")
result = run(add, a, b)  # I need a run function
I have looked online, but most searches lead to either running functions from the console, or how to define a function.
I'm not really a big fan of using a dictionary in this case, so here is my approach using getattr. Although technically it's almost the same thing and the principle is almost the same, the code looks cleaner, to me at least:
class operators():
    def add(self, a, b):
        return a + b

    def mix(self, a, b):
        return a * b

# Required blend -> decided by other code
blend_name = "add"

a = input("Some input")
b = input("Some other input")

method = getattr(operators, blend_name)
result = method(operators, a, b)
print(result)  # prints 12 for input 1 and 2 for obvious reasons
EDIT
This is the edited code without getattr, and it looks way cleaner. You can make this class a module and import it as needed; adding new operators is easy, without having to add each operator in two places (as with a dictionary that stores the functions as key/value pairs):
class operators():
    def add(self, a, b):
        return a + b

    def mix(self, a, b):
        return a * b

    def calculate(self, blend_name, a, b):
        return operators.__dict__[blend_name](self, a, b)

# Required blend -> decided by other code
oper = operators()
blend_name = "add"

a = input("Some input")
b = input("Some other input")

result = oper.calculate(blend_name, a, b)
print(result)
You can create a dictionary that maps the function names to their function objects and use that to call them. For example:
functions = {"add": add, "sub": sub} # and so on
func = functions[blend_name]
result = func(a, b)
Or, a little more compact, but perhaps less readable:
result = functions[blend_name](a, b)
You could use the globals() dictionary for the module.
result = globals()[blend_name](a, b)
It would be prudent to add some validation for the values of blend_name, for example:
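A minimal sketch of such a guard, assuming the allowed names are known up front (the set ALLOWED_BLENDS is my own addition):
ALLOWED_BLENDS = {"mix", "add", "sub", "xor"}

def run(blend_name, a, b):
    if blend_name not in ALLOWED_BLENDS:
        raise ValueError("unknown blend function: %r" % blend_name)
    return globals()[blend_name](a, b)

result = run(blend_name, a, b)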

Python function pointer with different arguments

I have defined three functions.
def evaluate1(a, b):
    pass

def evaluate2(a, b):
    pass

def evaluate3(a, b, c):
    pass
What I want to do is use a pointer to record which evaluate function I will use, depending on the test inputs. The logic is as follows:
def test(a, b, c, d):
    # let evaluate_function record which evaluate function I will use
    if c > 1:
        evaluate_function = evaluate3  # not sure
    else:
        if d:
            evaluate_function = evaluate1
        else:
            evaluate_function = evaluate2
    # execute the evaluate function
    evaluate_function(a, b, ?)
However, evaluate3 has different arguments from evaluate1 and evaluate2. How should I handle this? Thanks!
You have come up with a good idea of using a 'function pointer' to select the function. But since you know which function you are selecting at the time, you could also bind up the params:
def test(a, b, c, d):
    # let evaluate_function record which evaluate function I will use
    if c > 1:
        evaluate_function = evaluate3  # not sure
        params = a, b, d
    else:
        if d:
            evaluate_function = evaluate1
            params = a, b
        else:
            evaluate_function = evaluate2
            params = a, c
    # execute the evaluate function
    evaluate_function(*params)
I'll leave it to you to properly select the params.
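Another way to express the same idea is to bind the arguments with functools.partial, so the final call needs no arguments at all (the exact parameters chosen for each branch are illustrative, as in the code above):
from functools import partial

def test(a, b, c, d):
    if c > 1:
        evaluate_function = partial(evaluate3, a, b, d)  # not sure
    elif d:
        evaluate_function = partial(evaluate1, a, b)
    else:
        evaluate_function = partial(evaluate2, a, c)
    # execute the evaluate function
    return evaluate_function()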
Why not just call the evaluate functions directly instead of assigning them to a function pointer, as below? It makes the code more readable:
def evaluate1(a, b):
    print('evaluate1')

def evaluate2(a, b):
    print('evaluate2')

def evaluate3(a, b, c):
    print('evaluate3')

def test(a, b, c=None, d=None):
    # decide which evaluate function to call
    if c and c > 1:
        evaluate3(a, b, c)
    else:
        if d:
            evaluate1(a, b)
        else:
            evaluate2(a, c)

test(1, 2, c=0.1, d=1)
# evaluate1
test(1, 2)
# evaluate2
test(1, 2, 3)
# evaluate3

Why does Pylint want two public methods per class?

I understand from this answer why the warning exists. However, why is its default value 2?
It seems to me that classes with a single public method aside from __init__ are perfectly normal! Is there any caveat to just setting
min-public-methods=1
in the pylintrc file?
The number 2 is completely arbitrary. If min-public-methods=1 is a more fitting policy for your project and better matches your aesthetic opinions about code, then by all means go for it. As was once said, "Pylint doesn't know what's best".
For another perspective, Jack Diederich gave a talk at PyCon 2012 called "Stop Writing Classes".
One of his examples is the class with a single method, which he suggests should be just a function. If the idea is to set up an object containing a load of data and a single method that can be called later (perhaps many times) to act on that data, then you can still do that with a regular function by making an inner function the return value.
Something like:
def complicated(a, b, c, d, e):
    def inner(k):
        return (a*k, b*k, c*k, d*k, e*k)
    return inner

foo = complicated(1, 2, 3, 4, 5)
result = foo(100)
This does seem much simpler to me than:
class Complicated:
    def __init__(self, a, b, c, d, e):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.e = e

    def calc(self, k):
        return (self.a*k, self.b*k, self.c*k, self.d*k, self.e*k)

foo = Complicated(1, 2, 3, 4, 5)
result = foo.calc(100)
The main limitation of the function based approach is that you cannot read back the values of a, b, c, d, and e in the example.
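If you do need to read those values back, one lightweight workaround (my own suggestion, not from the talk) is to attach them to the returned function as an attribute:
def complicated(a, b, c, d, e):
    def inner(k):
        return (a*k, b*k, c*k, d*k, e*k)
    inner.params = (a, b, c, d, e)  # expose the captured values for later inspection
    return inner

foo = complicated(1, 2, 3, 4, 5)
print(foo.params)  # (1, 2, 3, 4, 5)
print(foo(100))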
