Refactor two almost identical methods that change different attributes - python

So I just started working in a new codebase and I'm trying to help refactor some stuff. There are things like huge methods that should be split into chunks and so on. There's this method A that does the exact same thing as method B, with a small difference.
Let's say method A is:
def func_a(data):
    # Do some stuff...
    obj = get_obj_from(data)
    value = 0
    # Somewhere inside a loop
    for item in items:
        value += item.value_a
    obj.attribute_a = value
    # Do some other stuff...
And method B is:
def func_b(data):
    # Do same stuff as func_a()...
    obj = get_obj_from(data)
    value = 0
    count = 0
    # Somewhere inside a loop that does the same as in func_a()
    for item in items:
        value += item.value_b
        count += 1
    obj.attribute_b = value
    avg = value / count
    # Do some other stuff just as in func_a()...
Please notice how, when assigning to obj, a different attribute is used in each method. This gets me to the point where I don't know if the right thing is to keep both methods and just extract the things that are similar. I've been trying hard to turn them into one method that can do both, but I can't quite get it.

You can combine both methods, but you will need to pass a parameter (such as a flag) to select the flow, a or b, like below:
def func_ab(data, is_a=True):
    obj = get_obj_from(data)
    value = 0
    count = 0
    for item in items:
        if is_a:
            value += item.value_a
        else:
            value += item.value_b
            count += 1
    if is_a:
        obj.attribute_a = value
    else:
        obj.attribute_b = value
        avg = value / count
is_a has a default value of True, so for a you can call func_ab(data), and for b you can call func_ab(data, False).
Hope this helps!
You can further refactor:
def func_ab(data, is_a=True):
    obj = get_obj_from(data)
    if is_a:
        obj.attribute_a = sum([item.value_a for item in items])
    else:
        obj.attribute_b = sum([item.value_b for item in items])
        avg = obj.attribute_b / len(items)

You can make use of some of the builtins like getattr and setattr.
def func(data, att_key='a'):
    obj = get_obj_from(data)
    att_name = f'attribute_{att_key}'
    value_name = f'value_{att_key}'
    # sets the obj.{att_name} to the sum of the item.{value_name}
    setattr(obj, att_name, sum([getattr(item, value_name) for item in items]))
    if att_key == 'b':
        avg = obj.attribute_b / len(items)
This is flexible to handle additional cases if you have more than just 'a' and 'b' to worry about.
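A variation on the same idea, as a rough sketch: instead of building the names from a key, pass the attribute names in directly. This assumes items can be handed to the function explicitly and that the per-item values are numeric. The names sum_into, source_attr and target_attr are made up for illustration, and SimpleNamespace only stands in for the real objects.

```python
from types import SimpleNamespace


def sum_into(obj, items, source_attr, target_attr):
    """Sum item.<source_attr> over items and store it on obj.<target_attr>.

    Returns (total, average); average is None when items is empty.
    """
    values = [getattr(item, source_attr) for item in items]
    total = sum(values)
    setattr(obj, target_attr, total)
    average = total / len(values) if values else None
    return total, average


# Stand-in data; the real obj and items come from the codebase.
items = [
    SimpleNamespace(value_a=1, value_b=10),
    SimpleNamespace(value_a=2, value_b=20),
]
obj = SimpleNamespace()

total_a, _ = sum_into(obj, items, "value_a", "attribute_a")
total_b, avg_b = sum_into(obj, items, "value_b", "attribute_b")
print(obj.attribute_a, obj.attribute_b, avg_b)
```

The caller that needs the average (the old func_b) uses the second return value, and the other caller just ignores it, so no flag is needed inside the helper.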

Related

Count frequency of item in tuple with kwargs**

This is what I wrote:
def count_passes(**kwargs)
    count = 0 #Complete this function to count the number of passes
    for pas in kwargs:
        if pas == mark:
            count = count + 1
        return
result = count_passes(math="Fail", science="Fail", history="Pass", english="Pass")
print(f'The number of passes: {count_occurrence(result, "Pass")}')
How do I make it to count how often 'Pass' is in kwargs?
You seem to be missing some code from the question, but here is how you can do the count:
def count_occurrences(mark, **kwargs):
    count = 0 # Complete this function to count the number of passes
    for key, value in kwargs.items():
        if value == mark:
            print(f"Passed {key}")
            count = count + 1
    return count
kwargs is a dict, so you need to use its items() or values() when iterating. Otherwise you're just going through the keywords. Also, the return statement should come after the loop and actually return the count as a value.
In case you wanted to improve on the implementation, here is a lighter way to do the same thing:
def count_occurrences_simpler(mark, **kwargs):
    return sum(1 for v in kwargs.values() if v == mark)
Then just call the function and print the result like you were doing:
result = count_occurrences("Pass", math="Fail", science="Fail", history="Pass", english="Pass")
print(f'The number of passes: {result}')
kwargs is a dictionary of values:
{"math":"Fail", "science":"Fail", "history":"Pass", "english":"Pass"}
in your example.
When you iterate over that dictionary, you are only getting the keys: "math", "science", etc.
To get the value associated with that key, you need to get it from the original dict: kwargs[pas] in your case.
Also, notice that in your code, you are not returning any value, so you are dropping all the work your function is doing to compute count.
Finally, you are returning inside your for loop, right after it starts, so you need to move the return to after the loop.
However, in your case you can use kwargs.items() to get both the key and the values for instance, or even kwargs.values(), since you don't actually use the key in your code:
mark = "Pass"
def count_passes(**kwargs):
    count = 0 #Complete this function to count the number of passes
    for pas in kwargs.values():
        if pas == mark:
            count = count + 1
    return count
result = count_passes(math="Fail", science="Fail", history="Pass", english="Pass")
print(f'The number of passes: {result}')
kwargs is a dictionary, so you need to check the values. One option is to convert the values to a list and use the count function
def count_passes(**kwargs):
    return list(kwargs.values()).count('Pass')
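If you want the count for every mark at once rather than only 'Pass', collections.Counter from the standard library can tally the values in one go. A small sketch; the name count_marks is made up for this example:

```python
from collections import Counter


def count_marks(**kwargs):
    """Return a Counter mapping each mark to how many subjects received it."""
    return Counter(kwargs.values())


marks = count_marks(math="Fail", science="Fail", history="Pass", english="Pass")
print(f"The number of passes: {marks['Pass']}")
```

A Counter returns 0 for a mark that never occurs, so looking up a missing mark does not raise KeyError.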

Can we return one dataframe and one variable from a function in python? [duplicate]

I would like to return two values from a function in two separate variables.
For example:
def select_choice():
    loop = 1
    row = 0
    while loop == 1:
        print('''Choose from the following options?:
1. Row 1
2. Row 2
3. Row 3''')
        row = int(input("Which row would you like to move the card from?: "))
        if row == 1:
            i = 2
            card = list_a[-1]
        elif row == 2:
            i = 1
            card = list_b[-1]
        elif row == 3:
            i = 0
            card = list_c[-1]
    return i
    return card
And I want to be able to use these values separately. When I tried to use return i, card, it returns a tuple and this is not what I want.
You cannot return two values, but you can return a tuple or a list and unpack it after the call:
def select_choice():
    ...
    return i, card # or [i, card]
my_i, my_card = select_choice()
In the line return i, card, the expression i, card creates a tuple. You can also use parentheses, as in return (i, card), but tuples are created by the comma, so the parentheses are not mandatory. You can still use them to make your code more readable or to split the tuple over multiple lines. The same applies to the line my_i, my_card = select_choice().
If you want to return more than two values, consider using a named tuple. It allows the caller of the function to access fields of the returned value by name, which is more readable. You can still access items of the tuple by index. For example, the Schema.loads method in the Marshmallow framework (in its 2.x releases) returns an UnmarshalResult, which is a namedtuple. So you can do:
data, errors = MySchema.loads(request.json())
if errors:
    ...
or
result = MySchema.loads(request.json())
if result.errors:
    ...
else:
    # use `result.data`
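For your own function, a named tuple could look roughly like this. It is only a sketch: the Choice type is made up, and select_choice takes row and rows as parameters here (instead of calling input() and reading list_a, list_b, list_c) so that the example runs on its own.

```python
from collections import namedtuple

# Hypothetical result type for select_choice().
Choice = namedtuple("Choice", ["i", "card"])


def select_choice(row, rows):
    """Return the target index and the last card of the chosen row (1-based)."""
    i = len(rows) - row       # row 1 -> 2, row 2 -> 1, row 3 -> 0
    card = rows[row - 1][-1]  # last card of the chosen row
    return Choice(i, card)


rows = [["2H", "KD"], ["7S"], ["AC", "3C"]]
choice = select_choice(1, rows)
i, card = choice              # unpacks like a plain tuple
print(choice.i, choice.card)  # or access the fields by name
```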
In other cases you may return a dict from your function:
def select_choice():
    ...
    return {'i': i, 'card': card, 'other_field': other_field, ...}
But you might want to consider returning an instance of a utility class (or a Pydantic/dataclass model instance) that wraps your data:
class ChoiceData():
    def __init__(self, i, card, other_field, ...):
        # you can put here some validation logic
        self.i = i
        self.card = card
        self.other_field = other_field
        ...
def select_choice():
    ...
    return ChoiceData(i, card, other_field, ...)
choice_data = select_choice()
print(choice_data.i, choice_data.card)
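The dataclass variant mentioned above might look something like this. The field types and the validation rule are assumptions made for the sake of a runnable example, and select_choice returns fixed placeholder values:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoiceData:
    i: int
    card: str

    def __post_init__(self):
        # Validation hook, playing the role of the logic in __init__ above.
        if self.i < 0:
            raise ValueError("i must be non-negative")


def select_choice():
    # Placeholder values; the real function would compute them.
    return ChoiceData(i=2, card="KD")


choice_data = select_choice()
print(choice_data.i, choice_data.card)
```

The decorator generates __init__, __repr__ and __eq__ for you, so the class body only needs the fields and whatever validation you want.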
I would like to return two values from a function in two separate variables.
What would you expect it to look like on the calling end? You can't write a = select_choice(); b = select_choice() because that would call the function twice.
Values aren't returned "in variables"; that's not how Python works. A function returns values (objects). A variable is just a name for a value in a given context. When you call a function and assign the return value somewhere, what you're doing is giving the received value a name in the calling context. The function doesn't put the value "into a variable" for you, the assignment does (never mind that the variable isn't "storage" for the value, but again, just a name).
When I tried to use return i, card, it returns a tuple and this is not what I want.
Actually, it's exactly what you want. All you have to do is take the tuple apart again.
And I want to be able to use these values separately.
So just grab the values out of the tuple.
The easiest way to do this is by unpacking:
a, b = select_choice()
I think what you want is a tuple. If you use return (i, card), you can get these two results with:
i, card = select_choice()
def test():
    ....
    return r1, r2, r3, ....
>>> ret_val = test()
>>> print(ret_val)
(r1, r2, r3, ....)
Now you can do everything you like with your tuple.
def test():
    r1 = 1
    r2 = 2
    r3 = 3
    return r1, r2, r3
x, y, z = test()
print(x)
print(y)
print(z)
> test.py
1
2
3
And this is an alternative. If you return a list, it is simple to get the values:
def select_choice():
    ...
    return [i, card]
values = select_choice()
print(values[0])
print(values[1])
You can try this:
def select_choice():
    ...
    return x, y
a, b = select_choice()
You can also return more than one value using a list. Check the code below:
def newFn(): #your function
    result = [] #defining blank list which is to be returned
    r1 = 'return1' #first value
    r2 = 'return2' #second value
    result.append(r1) #adding first value in list
    result.append(r2) #adding second value in list
    return result #returning your list
ret_val1 = newFn()[1] #you can get any desired result from it
print(ret_val1) #print/manipulate your result

How to find two items of a list with the same return value of a function on their attribute?

Given a basic class Item:
class Item(object):
    def __init__(self, val):
        self.val = val
a list of objects of this class (the number of items can be much larger):
items = [ Item(0), Item(11), Item(25), Item(16), Item(31) ]
and a function compute that processes and returns a value.
How can I find two items of this list for which compute returns the same value when applied to the attribute val? If nothing is found, an exception should be raised. If more than two items match, simply return any two of them.
For example, let's define compute:
def compute( x ):
    return x % 10
The expected pair would be: (Item(11), Item(31)).
You can check the length of the set of resulting values:
class Item(object):
    def __init__(self, val):
        self.val = val
    def __repr__(self):
        return f'Item({self.val})'
def compute(x):
    return x%10
items = [ Item(0), Item(11), Item(25), Item(16), Item(31)]
c = list(map(lambda x:compute(x.val), items))
if len(set(c)) == len(c): #no two or more equal values exist in the list
    raise Exception("All elements have unique computational results")
To find values with similar computational results, a dictionary can be used:
from collections import Counter
new_d = {i:compute(i.val) for i in items}
d = Counter(new_d.values())
multiple = [a for a, b in new_d.items() if d[b] > 1]
Output:
[Item(11), Item(31)]
A slightly more efficient way to check whether multiple objects with the same computed value exist is to use all, which requires a single pass over the Counter object, whereas using a set with len requires several iterations:
if all(b == 1 for b in d.values()):
    raise Exception("All elements have unique computational results")
Assuming the values returned by compute are hashable (e.g., float values), you can use a dict to store results.
And you don't need to do anything fancy, like a multidict storing all items that produce a result. As soon as you see a duplicate, you're done. Besides being simpler, this also means we short-circuit the search as soon as we find a match, without even calling compute on the rest of the elements.
def find_pair(items, compute):
    results = {}
    for item in items:
        result = compute(item.val)
        if result in results:
            return results[result], item
        results[result] = item
    raise ValueError('No pair of items')
A dictionary val_to_it that contains Items keyed by computed val can be used:
val_to_it = {}
for it in items:
    computed_val = compute(it.val)
    # Check if an Item in val_to_it has the same computed val
    dict_it = val_to_it.get(computed_val)
    if dict_it is None:
        # If not, add it to val_to_it so it can be referred to
        val_to_it[computed_val] = it
    else:
        # We found the two elements!
        res = [dict_it, it]
        break
else:
    raise Exception( "Can't find two items" )
The for block can be rewritten to handle any number n of elements:
for it in items:
    computed_val = compute(it.val)
    dict_lit = val_to_it.get(computed_val)
    if dict_lit is None:
        val_to_it[computed_val] = [it]
    else:
        dict_lit.append(it)
        # Check if we have the expected number of elements
        if len(dict_lit) == n:
            # Found n elements!
            res = dict_lit
            break
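The same n-element idea can be wrapped in a function, with collections.defaultdict taking care of the "is the key already there" check. This is just a sketch: find_group is a made-up name, and it works on plain values rather than Item objects to keep the example short (for Item objects you would call compute(item.val) instead).

```python
from collections import defaultdict


def find_group(values, compute, n=2):
    """Return the first n values that share the same compute() result.

    Raises ValueError if no result is shared by n values.
    """
    groups = defaultdict(list)
    for value in values:
        group = groups[compute(value)]
        group.append(value)
        if len(group) == n:
            # Stop as soon as one group is complete.
            return group
    raise ValueError(f"No {n} values share a computed result")


print(find_group([0, 11, 25, 16, 31], lambda x: x % 10))  # [11, 31]
```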

Reset counter to zero if different day - Python

New to Python, so I have this setup where a file gets created, and I have to add an extension number. The first file will have an extension number of 1, since it is the first. When a second file gets created, the extension number will increment, so it will be 2. So each time a file gets created, the extension number will increment.
Now, if it's a different day, then that extension number will reset to 1, and it will increment as new files are created. So each day, the extension number needs to be reset to 1.
def get_counter(date):
    counter = 1
    now = datetime.datetime.utcnow().strftime('%Y-%m-%d')
    if date != now:
        now = date
        counter = 1
        return counter
    counter += 1
    return counter
I have set up this function but it will not work because the now and counter variable will get overwritten. So will need these variables somewhere else. Just wondering if there is a work around this process or is there a python library that can handle this type of situation. Your suggestions will be appreciated!
You could assign the counter outside of that function and send it as a parameter, that way you don't overwrite it every single time you call your function, like so:
counter = 1
for file_to_be_writen in file_collection:
    counter = get_counter(date, counter)
and leave your function like this:
def get_counter(date, counter):
    now = datetime.datetime.utcnow().strftime('%Y-%m-%d')
    if date == now:
        counter += 1
        return counter
    return counter
When you need to preserve state across function calls that is a hint that you need a custom object. You could use global variables as well but encapsulating the state inside an object is usually better.
Here I implement a class Counter that takes care of everything. It has a __next__ method that returns the next number so the calling code only needs to call next(counter). It also has an __iter__ method so it can be used in for loops.
You need to provide a function (date_getter) that returns the current date when creating an instance. Besides making the code more testable, this allows you to decide whether to use UTC time, local time, the first day of the week so the counter resets each week, etc.
import datetime
class TimeArrowReversedError(Exception):
    pass
class Counter:
    def __init__(self, date_getter):
        self._date_getter = date_getter
        self._current_date = date_getter()
        self._current_value = 0
    def _check_date(self):
        current_date = self._date_getter()
        if self._current_date > current_date:
            message = 'Time arrow got reversed. Abandon all hope.'
            raise TimeArrowReversedError(message)
        if self._current_date < current_date:
            self._current_date = current_date
            self._current_value = 0
    def __next__(self):
        self._check_date()
        self._current_value += 1
        return self._current_value
    def __iter__(self):
        return self
This is the code I used to test it. Note that I am using as date_getter a function that actually returns whatever date I want. I do not want to wait until 23:59 to run the test. Instead I tell the function which date to return (including going backwards in time) and see how the counter behaves.
current_date = datetime.date(year=2018, month=5, day=9)
get_current_date = lambda: current_date
counter = Counter(get_current_date)
n = next(counter)
assert n == 1
n = next(counter)
assert n == 2
for expected, result in zip([3, 4], counter):
    assert expected == result
current_date = current_date + datetime.timedelta(days=1)
n = next(counter)
assert n == 1
n = next(counter)
assert n == 2
current_date = current_date - datetime.timedelta(days=2)
try:
    n = next(counter)
except TimeArrowReversedError:
    pass
else:
    raise AssertionError('"TimeArrowReversedError" expected.')
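Here is a more realistic way in which you could use this class. The date getter has to be a callable that looks up the current date on every call, so it is wrapped in a lambda:
def create_counter():
    return Counter(lambda: datetime.datetime.utcnow().date())
counter = create_counter()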
Print the first couple of numbers:
print(next(counter))
print(next(counter))
Output:
1
2
Using a loop to add numbers to names in a list:
names = ['foo', 'bar']
for name, n in zip(names, counter):
    print('{}_{}'.format(name, n))
Output:
foo_3
bar_4
Now I realize that Counter is a really bad choice because there is already a completely unrelated Counter class in the standard library. But I cannot think of a better name right now so I will leave it as is.
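As for the weekly reset mentioned earlier, a date getter along these lines should do it. It is a sketch only: start_of_week is a made-up name, and treating Monday as the first day of the week is an assumption.

```python
import datetime


def start_of_week(today=None):
    """Return the Monday of the week containing today (defaults to the UTC date)."""
    if today is None:
        today = datetime.datetime.now(datetime.timezone.utc).date()
    # date.weekday() is 0 for Monday, so this steps back to Monday.
    return today - datetime.timedelta(days=today.weekday())


# 2018-05-09 was a Wednesday, so its week started on Monday 2018-05-07.
print(start_of_week(datetime.date(2018, 5, 9)))  # 2018-05-07
```

Passing start_of_week as the date_getter means the value it returns only changes when a new week begins, so the counter would reset once a week instead of once a day.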

Python - How do I keep returning values from a function with a while statement?

So, the code outlined below sends arguments to a function I've created called bsearch. I want main() to call it with the key argument decreasing by 1 from 11 (11, 10, 9, 8, 7...) until it reaches 0, and I want the value of count output each time. Currently it only returns the first count. How do I get it to return after each iteration of the while loop?
def main():
    ilist = [x+1 for x in range(10)]
    key = 11
    start = 0
    end = 10
    while key > 0:
        count = b(ilist,key,start,end)
        key = key -1
        return count
I think you might want to look at a few tutorials but I imagine you want something like this:
def main():
    count_list = []
    for x in range(1,11):
        count_list.append(bsearch(x)) # append your results to a list
    return count_list # return out of the scope of the loop
or using a list comprehension as suggested in the comments:
def main():
    return [bsearch(x) for x in range(1,11)]
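To make this runnable end to end, here is a sketch with a stand-in bsearch. The real bsearch was not posted, so this version (a plain binary search that returns the number of probes it made) is purely an assumption. The keys run from 11 down to 1, as in the question.

```python
def bsearch(ilist, key, start, end):
    """Binary-search sorted ilist[start:end] for key; return the number of probes."""
    count = 0
    while start < end:
        count += 1
        mid = (start + end) // 2
        if ilist[mid] == key:
            break
        if ilist[mid] < key:
            start = mid + 1
        else:
            end = mid
    return count


def main():
    ilist = [x + 1 for x in range(10)]
    # One count per key, from 11 down to 1.
    return [bsearch(ilist, key, 0, len(ilist)) for key in range(11, 0, -1)]


print(main())
```

The important part is that the return sits outside the loop (here, the comprehension), so every count is collected before the function exits.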
