Trying to understand the function of reduce in Python

I recently received an answer to my previous question on Stack Overflow. I asked a follow-up to understand the function better but got no response, so I am asking here.
What do the k and v used in the lambda represent? I thought they stood for the following:
k = dictionary ?
v = string ? # Did I understand it correctly?
dictionary = {"test":"1", "card":"2"}
string = "There istest at the cardboards"
from functools import reduce
res = reduce(lambda k, v: k.replace(v, dictionary[v]), dictionary, string)
Since we use a lambda, I assume it loops over each element of both of these variables. But why k.replace? Isn't k the dictionary? Shouldn't it be v.replace? Somehow this method works. I would appreciate a detailed explanation of how it works. Thank you!

reduce is equivalent to repeatedly calling a function.
The function in this case is a lambda, but a lambda is just an anonymous function:
def f(k, v):
    return k.replace(v, dictionary[v])
The definition of reduce itself is (almost—the None default here is not quite right, nor the len test):
def reduce(func, seq, initial=None):
    if initial is not None:
        ret = initial
        for i in seq:
            ret = func(ret, i)
        return ret
    # initial not supplied, so sequence must be non-empty
    if len(seq) == 0:
        raise TypeError("reduce() of empty sequence with no initial value")
    first = True
    for i in seq:
        if first:
            ret = i
            first = False
        else:
            ret = func(ret, i)
    return ret
So, ask yourself what this would do when called on your lambda function. The:
for i in dictionary
loop will iterate over each key in the dictionary. It will pass that key, along with the stored ret (or the initial argument for the first call), to your function. So you'll get each key, plus the string value that's initially "There istest at the cardboards", as your v (key from dictionary, called i in the expansion of reduce) and k (long string, called ret in the expansion of reduce) arguments.
Note that k is the full text string, not the string used as the key in the dictionary, while v is the word that is the key in the dictionary. I've used the variable names k and v here only because you did too. As noted in a comment, text and word might be better variable names in either the expanded def f(...) or the original lambda function.
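To make the expansion concrete, here is a minimal sketch that writes the same computation as a plain for loop and compares it with the reduce call from the question. The names text and word are the suggested replacements for k and v; nothing else is assumed beyond the question's own data.

```python
from functools import reduce

dictionary = {"test": "1", "card": "2"}
string = "There istest at the cardboards"

# What reduce does when an initial value is given: start from `string`,
# then pass each dictionary key through the function in turn.
text = string
for word in dictionary:  # iterating over a dict yields its keys
    text = text.replace(word, dictionary[word])

# The one-line reduce call from the question computes the same thing.
res = reduce(lambda k, v: k.replace(v, dictionary[v]), dictionary, string)

print(text)  # There is1 at the 2boards
print(res)   # There is1 at the 2boards
```

In both versions the accumulated string travels in the first argument and the current key in the second, which is why the call is k.replace and not v.replace.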
Trace your code execution
Try the same code, except that instead of just:
def f(k, v):
    return k.replace(v, dictionary[v])
you write it as:
def f(text, word):
    print("f(text={!r}, word={!r})".format(text, word))
    replacement = dictionary[word]
    print(" I will now replace {!r} with {!r}".format(word, replacement))
    result = text.replace(word, replacement)
    print(" I got: {!r}".format(result))
    return result
Run the functools.reduce function over function f with dictionary and string as the other two arguments and observe the output.
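A self-contained version of that experiment might look like the sketch below. The calls list is a hypothetical addition used only to record what reduce passed in; the rest follows the tracing function above, slightly shortened.

```python
from functools import reduce

dictionary = {"test": "1", "card": "2"}
string = "There istest at the cardboards"

calls = []  # hypothetical helper: records (text, word) for every call

def f(text, word):
    calls.append((text, word))
    result = text.replace(word, dictionary[word])
    print("f(text={!r}, word={!r}) -> {!r}".format(text, word, result))
    return result

res = reduce(f, dictionary, string)
print(res)
```

On Python 3.7 or later, where dicts keep insertion order, f is called twice: first with the original string and 'test', then with the result of that call and 'card'.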

Related

What can I use instead of lambda in my Python code?

I was wondering if there was a simple alternative to lambda in my code.
def add_attack(self, attack_name):
    if attack_name in self.known_attacks and attack_name not in self.attacks:
        try:
            assert(len(self.attacks) < 4)
            self.attacks[attack_name] = self.known_attacks.get(attack_name)
            return True
        except:
            #find the min value of self.attacks
            minval = min(self.attacks.keys(), key=(lambda k: self.attacks[k]))
            for keys, values in self.attacks.items():
                if self.attacks[minval] == values and min(minval, keys) == keys:
                    minval = keys
            del self.attacks[minval]
            self.attacks[attack_name] = self.known_attacks.get(attack_name)
            return True
    else:
        return False
I'm still learning python, and the lambda function is throwing me off since I haven't learned that much about it yet. Instead of using lambda, can someone help me out with another function to replace lambda? Thanks!
You could define a function for it:
def return_attacks(self, k):
    return self.attacks[k]
And use that function in the key:
minval = min(self.attacks.keys(), key=(self.return_attacks))
I would strongly recommend you get comfortable with lambda functions - and I think it is clear to you now that lambda x : expr(x) is equivalent to func when
def func(x):
    return expr(x)
A lambda should not scare you! It's just a small anonymous function.
It can take any number of arguments, but can only have one expression.
minval = min(self.attacks.keys(), key=(lambda k: self.attacks[k]))
Here minval is assigned the result of the min() call.
The min function accepts an optional key argument. It can be confusing, but this key is not the same thing as a dictionary key: it is a function that min calls on each element to decide how the elements are compared.
Going back to the code:
The line iterates over self.attacks.keys() and returns the key whose value in self.attacks is smallest, because the lambda maps each key k to self.attacks[k].
If you do not want to use lambda, you can write a function in your class that does exactly the same thing.
def find_min_key(self, my_dict):
    return min(my_dict, key=my_dict.get)
You can use this as:
min_val = self.find_min_key(self.attacks)
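As a quick check that these alternatives really are interchangeable, here is a small sketch outside any class. The attacks dict and its values are made-up stand-ins for self.attacks, and power_of is a hypothetical name for the function that replaces the lambda.

```python
# Hypothetical data standing in for self.attacks: attack name -> power.
attacks = {"tackle": 40, "ember": 35, "growl": 50}

def power_of(name):
    """Named-function equivalent of `lambda k: attacks[k]`."""
    return attacks[name]

# Three spellings of the same key function.
weakest_lambda = min(attacks, key=lambda k: attacks[k])
weakest_named = min(attacks, key=power_of)
weakest_get = min(attacks, key=attacks.get)

print(weakest_lambda, weakest_named, weakest_get)  # ember ember ember
```

In each case min compares the values the key function returns but hands back the original element, here the dictionary key.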

python, printing longest length of string in a list

My question is to write a function which returns the longest string and ignores any non-strings, and if there are no strings in the input list, then it should return None.
my answer:
def longest_string(x):
    for i in max(x, key=len):
        if not type(i)==str:
            continue
        if
    return max
longest_string(['cat', 'dog', 'horse'])
I'm a beginner so I have no idea where to start. Apologies if this is quite simple.
This is how I would do it:
def longest_string(x):
    Strings = [i for i in x if isinstance(i, str)]
    return(max(Strings, key=len)) if Strings else None
Based on your code:
def longest_string(x):
    l = 0
    r = None
    for s in x:
        if isinstance(s, str) and len(s) > l:
            l = len(s)
            r = s
    return r
print(longest_string([None, 'cat', 1, 'dog', 'horse']))
# horse
def longest_string(items):
    try:
        return max([x for x in items if isinstance(x, str)], key=len)
    except ValueError:
        return None
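def longest_string(items):
    # Build a list, not a generator: a generator object is always truthy,
    # so the emptiness check below would never fall through to None.
    strings = [s for s in items if isinstance(s, str)]
    longest = max(strings, key=len) if strings else None
    return longest
print(longest_string(['cat', 'dog', 'horse']))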
Your syntax is wrong (second-to-last line: if with no condition) and you are returning max which you did not define manually. In actuality, max is a built-in Python function which you called a few lines above.
In addition, you are not looping through all strings, you are looping through the longest string. Your code should instead be
def longest_string(l):
    strings = [item for item in l if type(item) == str]
    if len(strings):
        return max(strings, key=len)
    return None
You're on the right track. You could iterate over the list and check whether each item is the longest so far:
def longest_string(x):
    # handle the case of an empty list
    if len(x) == 0:
        return None
    current_longest = ""
    # Iterate over the items
    for i in x:
        # Skip non-strings
        if type(i) != str:
            continue
        # if the current string is longer than the longest, replace the string.
        if len(i) > len(current_longest):
            current_longest = i
    # This condition handles lists where none of the elements are strings, which should return None.
    if len(current_longest) > 0:
        return current_longest
    else:
        return None
Since you are a beginner, I recommend you start using Python's built-in functions to filter and search lists. They keep the logic simple and leave less room for bugs.
def longest_string(x):
    x = filter(lambda obj: isinstance(obj, str), x)
    longest = max(list(x), key=lambda obj: len(obj), default=None)
    return longest
Nonetheless, you were on the right track. Just avoid using the names of Python's built-ins (such as max, type, list, etc.) as variable names.
EDIT: I see a lot of answers using one-liner conditionals, list comprehension, etc. I think those are fantastic solutions, but for the level of programming the OP is at, my answer attempts to document each step of the process and be as readable as possible.
First of all, I would highly suggest defining the type of the x argument in your function.
For example; since I see you are passing a list, you can define the type like so:
def longest_string(x: list):
    ....
This not only makes it more readable for potential collaborators but helps enormously when creating docstrings and/or combined with using an IDE that shows type hints when writing functions.
Next, I highly suggest you break down your "specs" into some pseudocode, which is enormously helpful for taking things one step at a time:
returns the longest string
ignores any non-strings
if there are no strings in the input list, then it should return None.
So to elaborate on those "specifications" further, we can write:
Return the longest string from a list.
Ignore any element from the input arg x that is not of type str
if no string is present in the list, return None
From here we can proceed to writing the function.
def longest_string(x: list):
    # Immediately verify the input is the expected type. if not, return None (or raise Exception)
    if type(x) != list:
        return None  # input should always be a list
    # create an empty list to add all strings to
    str_list = []
    # Loop through list
    for element in x:
        # check type. if not string, skip to the next element
        if type(element) != str:
            continue
        # at this point in our loop the element has passed our type check, and is a string.
        # add the element to our str_list
        str_list.append(element)
    # we should now have a list of strings
    # however we should handle an edge case where a list is passed to the function that contains no strings at all, which would mean we now have an empty str_list. let's check that
    if not str_list:  # an empty list evaluates to False. if not str_list is basically saying "if str_list is empty"
        return None
    # if the program has not hit one of the return statements yet, we should now have a list of strings (or at least 1 string). you can check with a simple print statement (eg. print(str_list), print(len(str_list)) )
    # now we can check for the longest string
    # we can use the max() function for this operation
    longest = max(str_list, key=len)
    # return the longest string!
    return longest
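The three requirements listed earlier can be checked against a few inputs. The sketch below uses a compact variant built on max with default=None (available since Python 3.4) instead of the step-by-step version; the sample lists are made up for illustration.

```python
def longest_string(items):
    # Keep only the strings, then let max() pick the longest one.
    strings = [item for item in items if isinstance(item, str)]
    # default=None is what max() returns when `strings` is empty.
    return max(strings, key=len, default=None)

print(longest_string(['cat', 'dog', 'horse']))           # horse
print(longest_string([None, 'cat', 1, 'dog', 'horse']))  # horse
print(longest_string([1, 2.5, None]))                    # None
print(longest_string([]))                                # None
print(longest_string(['cat', 'dog']))                    # cat
```

The last call shows the tie behaviour: when several strings share the maximum length, max returns the first one it encounters.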

My Lempel-Ziv implementation makes encoding longer

I can't work out why my implementation is creating a longer string than the input.
It is implemented according to the description in this document and only this description.
It is designed to act on binary strings only. If anyone can shed some light on why this creates a longer string than it started with, I'd be very grateful!
Main Encoding
def LZ_encode(uncompressed):
    m=uncompressed
    dictionary=dict_gen(m)
    list=[int(bin(i)[2:]) for i in range(1,len(dictionary))]
    pointer_bit=[]
    for k in list:
        pointer_bit=pointer_bit+[(str(chopped_lookup(k,dictionary)),dictionary[k][-1])]
    new_pointer_bit=pointer_length_correct(pointer_bit)
    list_output=[i for sub in new_pointer_bit for i in sub]
    if list_output[-1]=='$':
        output=''.join(list_output[:-1])
    else:
        output=''.join(list_output)
    return output
Component Functions
def dict_gen(m): # Generates Dictionary
    dictionary={0:""}
    j=1
    w=""
    iterator=0
    l=len(m)
    for c in m:
        iterator+=1
        wc= str(str(w) + str(c))
        if wc in dictionary.values():
            w=wc
            if iterator==l:
                dictionary.update({int(bin(j)[2:]): wc+'$'})
        else:
            dictionary.update({int(bin(j)[2:]): wc})
            w=""
            j+=1
    return dictionary
def chopped_lookup(k,dictionary): # Returns entry number of shortened source string
    cut_source_string=dictionary[k][:-1]
    for key, value in dictionary.iteritems():
        if value == cut_source_string:
            return key
from math import ceil, log  # needed by pointer_length_correct
def pointer_length_correct(lst): # Takes the (pointer,bit) list and corrects the length of the pointer
    new_pointer_bit=[]
    for pair in lst:
        n=lst.index(pair)
        if len(str(pair[0]))>ceil(log(n+1,2)):
            while len(str(pair[0]))!=ceil(log(n+1,2)):
                pair = (str(pair[0])[1:],pair[1])
        if len(str(pair[0]))<ceil(log(n+1,2)):
            while len(str(pair[0]))!=ceil(log(n+1,2)):
                pair = (str('0'+str(pair[0])),pair[1])
        new_pointer_bit=new_pointer_bit+[pair]
    return new_pointer_bit

Constructing function from lists within lists in Python

I'm stumped on how to construct a function that works on lists within lists from inside out (I guess that's how you could poorly describe it).
I'm trying to dynamically turn a list like
res = SomeDjangoQuerySet
x = ['neighborhood', ['city', ['metro', 'metro']]]
into:
getattr(getattr(getattr(getattr(res, 'neighborhood'), 'city'), 'metro'), 'metro')
AKA:
getattr(getattr(getattr(getattr(res, x[0]), x[1][0]), x[1][1][0]), x[1][1][1])
Basically, the first value will always be a string, the second value will either be a string or a list. Each list will follow this pattern (string, string OR list). The depth of lists within lists is indeterminate. The innermost first value of the getattr() will be an outside variable ('res' in this case). Any advice?
This sounds like recursion and iteration might be useful. Does this do what you want?
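from functools import reduce  # reduce is not a builtin in Python 3
def flatten(data):
    res = []
    # Test for list/tuple rather than __iter__: in Python 3 strings are
    # iterable too, which would make this recurse forever.
    if isinstance(data, (list, tuple)):
        for el in data:
            res.extend(flatten(el))
    else:
        res.append(data)
    return res
reduce(getattr, flatten(x), res)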
I ended up putting in some time and learning about recursion and found this to be the simplest solution (although, credit to David Zwicker who also provided a working solution).
def recursion(a, b):
    if type(b) is list:
        return recursion(getattr(a, b[0]), b[1])
    else:
        return getattr(a, b)
recursion(res, x)
def nestattr(x, y):
    if isinstance(y, str):
        return getattr(x, y)
    elif isinstance(y, list):
        return nestattr(getattr(x, y[0]), y[1])
nestattr(res, x)
So you start off with the first string in the list, and you have the getattr of (1) the query with (2) that string. Then you recurse using the rest of that list, and if it's a string, you just do the getattr on (1) the result of the previous getattr with (2) this string. Otherwise, if it's still a list, you repeat. I think this is what you're looking for? Correct me if I'm wrong.
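The recursion described above can be tried without Django. In this sketch, types.SimpleNamespace objects are a hypothetical stand-in for the query result, and the attribute values are invented; only the shape of x comes from the question.

```python
from types import SimpleNamespace

def nestattr(obj, path):
    """Follow a nested [name, rest] path of attribute names starting at obj."""
    if isinstance(path, str):
        # Innermost case: a single attribute name.
        return getattr(obj, path)
    # path is a two-element list: take the first attribute, recurse on the rest.
    return nestattr(getattr(obj, path[0]), path[1])

# Hypothetical stand-in for the Django object in the question.
res = SimpleNamespace(
    neighborhood=SimpleNamespace(
        city=SimpleNamespace(
            metro=SimpleNamespace(metro="example metro area")
        )
    )
)
x = ['neighborhood', ['city', ['metro', 'metro']]]

value = nestattr(res, x)
print(value)  # example metro area
```

The result is the same object that res.neighborhood.city.metro.metro reaches directly, which is what the chain of four getattr calls in the question spells out.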

Detect last iteration over dictionary.iteritems() in python

Is there a simple way to detect the last iteration while iterating over a dictionary using iteritems()?
There is an ugly way to do this:
for i, (k, v) in enumerate(your_dict.items()):
    if i == len(your_dict) - 1:
        pass  # do special stuff here
But you should really consider if you need this. I am almost certain that there is another way.
As others have stated, dictionaries have no defined order, so it's hard to imagine why you would need this, but here it is:
last = None
for current in your_dict.iteritems():
    if last is not None:
        pass  # process last
    last = current
# now last contains the last thing in dict.iteritems()
if last is not None:  # last is still None only if the dict was empty
    pass  # process the last item
it = spam_dict.iteritems()
try:
    eggs1 = it.next()
    while True:
        eggs2 = it.next()
        do_something(eggs1)
        eggs1 = eggs2
except StopIteration:
    do_final(eggs1)
Quick and quite dirty. Does it solve your issue?
I know this is late, but here's how I've solved this issue:
dictItemCount = len(my_dict)
dictPosition = 1
for key, value in my_dict.items():
    if dictPosition == dictItemCount:
        print('last item in dictionary')
    dictPosition += 1
This is a special case of this broader question. My suggestion was to create an enumerate-like generator that returns -1 on the last item:
def annotate(gen):
    prev_i, prev_val = 0, next(gen)
    for i, val in enumerate(gen, start=1):
        yield prev_i, prev_val
        prev_i, prev_val = i, val
    yield -1, prev_val
Add gen = iter(gen) if you want it to handle sequences as well as generators.
I recently had this issue. I thought this was the most elegant solution because it allows you to write for i,value,isLast in lastEnumerate(...):
def lastEnumerate(iterator):
    x = list(iterator)
    for i,value in enumerate(x):
        yield i,value,i==len(x)-1
For example:
for i,value,isLast in lastEnumerate(range(5)):
    print(value)
    if not isLast:
        print(',')
The last item in a for loop hangs around after the for loop anyway:
for current_item in my_dict:
    do_something(current_item)
try:
    do_last(current_item)
except NameError:
    print("my_dict was empty")
Note that the NameError only occurs if current_item was not bound before the loop. Looping over an empty dict leaves the name untouched, so if current_item was already in use, it keeps its old value and do_last would run on stale data.
You stated in an above comment that you need this to construct the WHERE clause of an SQL SELECT statement. Perhaps this will help:
def make_filter(colname, value):
    if isinstance(value, str):
        if '%' in value:
            return "%s LIKE '%s'" % (colname, value)
        else:
            return "%s = '%s'" % (colname, value)
    return "%s = %s" % (colname, value)
filters = {'USER_ID':'123456', 'CHECK_NUM':23459, 'CHECK_STATUS':'C%'}
whereclause = 'WHERE '+'\nAND '.join(make_filter(*x) for x in filters.iteritems())
print whereclause
which prints
WHERE CHECK_NUM = 23459
AND CHECK_STATUS LIKE 'C%'
AND USER_ID = '123456'
The approach that makes the most sense is to wrap the loop in some call which contains a hook to call your post-iteration functionality afterwards.
This could be implemented as a context manager and called through a 'with' statement or, for older versions of Python, you could use the old 'try:' ... 'finally:' construct. It could also be wrapped in a class where the dictionary iteration is self-dispatched (a "private" method) and the appendix code follows it in the public method. (The distinction between public and private is a matter of intention and documentation, not enforced by Python.)
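One possible reading of the wrapping idea, sketched with the try/finally variant: the function name for_each_then and its on_item and on_done parameters are hypothetical, chosen only for this illustration.

```python
def for_each_then(mapping, on_item, on_done):
    """Call on_item(key, value) for every entry, then on_done() exactly once."""
    try:
        for key, value in mapping.items():
            on_item(key, value)
    finally:
        # Runs after the loop, even if the dict is empty or on_item raises.
        on_done()

log = []
for_each_then(
    {"a": 1, "b": 2},
    on_item=lambda k, v: log.append((k, v)),
    on_done=lambda: log.append("done"),
)
print(log)  # [('a', 1), ('b', 2), 'done']
```

Because the hook lives outside the loop body, the loop never needs to know which iteration is the last one. If the hook should be skipped when an error occurs, drop the try/finally and call on_done() after the loop instead.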
Another approach is to enumerate your dict and compare the current index against the final one. It's easier to look at and understand, in my opinion:
for n, (key, value) in enumerate(yourDict.items()):
    if n == len(yourDict) - 1:
        print('Found the last iteration!:', n)
OR you could just do something once the iteration is finished:
for key, value in yourDict.items():
    pass
else:
    print('Finished iterating over `yourDict`')
No. When using an iterator you do not know anything about the position - actually, the iterator could be infinite.
Besides that, a dictionary is not ordered (before Python 3.7; from 3.7 on, dicts preserve insertion order). So if you need it, e.g. to insert commas between the elements, you should take the items, sort them and then iterate over the list of (key, value) tuples. While iterating over this list you can easily count the iterations and thus know when you have reached the last element.
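A minimal sketch of that last suggestion, assuming the goal is a comma after every element except the final one. The filters dict and the key=value output format are invented for the example.

```python
filters = {"b": 2, "a": 1, "c": 3}  # hypothetical data

items = sorted(filters.items())  # list of (key, value) tuples in key order
last_index = len(items) - 1      # a list has a known length, a bare iterator does not

parts = []
for index, (key, value) in enumerate(items):
    piece = "{}={}".format(key, value)
    if index != last_index:
        piece += ","  # every element except the last gets a trailing comma
    parts.append(piece)

line = " ".join(parts)
print(line)  # a=1, b=2, c=3
```

When the only reason for spotting the last element is a separator, ", ".join(...) over the formatted pieces gives the same string without any position check.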
