Python: Understanding reduce()'s 'initializer' argument

I'm relatively new to Python and am having trouble with folds, or more specifically
with reduce()'s 'initializer' argument, e.g. reduce(function, iterable[, initializer])
Here is the function...
>>> def x100y(x, y):
...     return x*100 + y
Could someone explain why reduce() produces 44...
>>> reduce(x100y, (), 44)
44
or why it produces 30102 here...
>>> reduce(x100y, [1,2], 3)
30102

From the docs:
reduce(function, iterable[, initializer])
Apply function of two
arguments cumulatively to the items of iterable, from left to right,
so as to reduce the iterable to a single value. For example,
reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and
the right argument, y, is the update value from the iterable. If the
optional initializer is present, it is placed before the items of the
iterable in the calculation, and serves as a default when the iterable
is empty. If initializer is not given and iterable contains only one
item, the first item is returned.
The initializer is effectively placed in front of the items of your iterable, and if the iterable has no elements, the initializer itself is returned (this is why you get 44).
For the second call, the initializer 3 becomes the starting accumulated value, so the computation is
x100y(x100y(3, 1), 2)
which expands to
(3*100 + 1)*100 + 2 = 301*100 + 2 = 30102, exactly the value you observed.
You could also write the same function inline as a lambda: reduce(lambda x, y: x*100 + y, [1, 2], 3).
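If it helps to see the mechanics, here is a rough pure-Python sketch of how the fold treats the initializer. It is a simplified stand-in, not the real functools.reduce implementation:
def my_reduce(function, iterable, initializer=None):
    # Simplified sketch of reduce(): fold the iterable from the left.
    it = iter(iterable)
    if initializer is None:
        # No initializer: the first item becomes the starting accumulator.
        acc = next(it)  # an empty iterable would raise here
    else:
        acc = initializer
    for item in it:
        acc = function(acc, item)
    return acc

print(my_reduce(lambda x, y: x*100 + y, (), 44))     # 44: empty iterable, initializer returned
print(my_reduce(lambda x, y: x*100 + y, [1, 2], 3))  # 30102: (3*100 + 1)*100 + 2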

Related

Python3.6 map object

What is the difference between the following two code snippets:
a1 = map(int,'222211112211')
and
a2 = int('222211112211')
Why can I iterate over a1, which I built with int?
For example: I can do something like this with a1:
for i in a1:
    print(i)
But not with a2?
In a1, you're iterating over a string and converting each of its characters into an int; the result is an iterator of ints (check the documentation for map):
list(a1) # use `list()` to consume the iterator
=> [2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 1, 1]
In a2, you're converting the whole string into a single int; the result is an int. They're very different things!
a2
=> 222211112211
That's why you can iterate over a1, because it's an iterator, not an int. And that's why you can't iterate over a2, it's just an int.
I think the best thing to do in situations like these is to check the actual values in the REPL.
>>> a1 = map(int, '2212')
>>> a1
<map object at 0x7f0036c2a4a8>
>>> list(a1)
[2, 2, 1, 2]
>>> a2 = int('2212')
>>> a2
2212
So a1 is a special map object which turns out to be iterable. It stores each character of '2212', individually converted to an integer. Meanwhile, a2 simply converts the whole string to a simple integer. It would be an error to iterate over a2, but it would also be an error to do integer arithmetic on a1.
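A quick follow-up check in the same session makes both failure modes concrete (tracebacks abbreviated):
>>> for digit in a2:
...     print(digit)
Traceback (most recent call last):
  ...
TypeError: 'int' object is not iterable
>>> a1 + 1
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'map' and 'int'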
In Python 3, the map function returns a lazy iterator (a map object), so the results are not loaded into memory all at once.
A rough Python equivalent of map looks like this:
def map(func, iterable):
    for i in iterable:
        yield func(i)
When you iterate over the map object, each value is produced on the spot instead of the whole result being materialized, which could exhaust memory in some cases.
Calling list(map(...)) loads the entire result into memory, which is what Python 2's map did by default.
a = int('2212')
gives a a single integer value; that's it, it is not an iterable.
a = map(int, '2212')
returns a lazy map object, for example:
<map object at 0x7f97e30d7f98>
It is an iterable which converts each character in the string to an int and yields the results one by one.
So
a = map(int, '2212')
for i in a:
    print(i)
for i in a:
    print(i)
would print
2
2
1
2
Iterating over the stored map object a second time yields nothing, because the iterator was exhausted on the first pass.
If you want the values more than once, convert the map object to a list with list(a) so the results are kept in memory, or, if the result is really long, store it in a separate file and read it back from there.
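For instance, a small sketch of that list() workaround:
a = map(int, '2212')
values = list(a)      # materialize the results once
print(values)         # [2, 2, 1, 2]
print(values)         # [2, 2, 1, 2] - a list can be reused
print(list(a))        # [] - the original map object is already exhausted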
In Python 3, map returns an iterator that you can use in a for loop.
map takes at least two parameters, a function and an iterable argument. Here int is the function and '222211112211', a string, is the iterable object. map applies the function to each value of the iterable. Here, int will be applied to "2", "2", "2", ... "2", "1", "1" individually and make them all integers. map will return an iterator that allows you to loop over the results yielded by the previous step: (2, 2, 2, ..., 1, 1)
For a2, you are creating an integer with the int function, and an integer is not iterable. Thus, you cannot loop over it.
Below is the description of map cited from Python 3 documentation.
map(function, iterable, ...)
: Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted. For cases where the function inputs are already arranged into argument tuples, see itertools.starmap().
One thing worth noting is that in Python 2, map returns a list rather than an iterator. In Python 3, you have to explicitly make it a list like list(map(int, "222")).
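As a small, illustrative example of the multi-iterable form and starmap mentioned in that quote:
from operator import add
from itertools import starmap

# Two iterables consumed in parallel; stops at the shortest one.
print(list(map(add, [1, 2, 3], [10, 20, 30])))          # [11, 22, 33]

# starmap is the variant for arguments already packed into tuples.
print(list(starmap(add, [(1, 10), (2, 20), (3, 30)])))  # [11, 22, 33]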

How does the cmp argument of the sort function work in Python

Can you explain this Python code? How does L.sort(fun) work here?
def fun(a, b):
    return cmp(a[1], b[1])

L = [[2, 1], [4, 5, 3]]
L.sort(fun)
print L
From the official documentation:
The sort() method takes optional arguments for controlling the comparisons.
cmp specifies a custom comparison function of two arguments (list items)
which should return a negative, zero or positive number depending on whether
the first argument is considered smaller than, equal to, or larger than the
second argument: cmp=lambda x,y: cmp(x.lower(), y.lower()).
The default value is None.
So you are controlling the comparison with your own function fun, which compares the values at index 1 of the nested lists.
If you test the comparison separately you will get -1, because a[1] (which is 1) is smaller than b[1] (which is 5); hence the output is [[2, 1], [4, 5, 3]], which is already in sorted order:
a = [2,1]
b = [4,5,3]
cmp(a[1], b[1])
You can try changing the value at index 1 to see how the ordering changes. Something like this:
def fun(a, b):
    return cmp(a[1], b[1])

L = [[2, 6], [4, 5, 3]]
L.sort(fun)
print L
I hope this helps.
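Note that the cmp argument exists only in Python 2. A rough sketch of the Python 3 equivalent, using a key function (or functools.cmp_to_key if you want to keep a comparator):
from functools import cmp_to_key

L = [[2, 6], [4, 5, 3]]

# Preferred in Python 3: sort directly by the element at index 1.
L.sort(key=lambda item: item[1])
print(L)  # [[4, 5, 3], [2, 6]]

# Or keep a two-argument comparator and wrap it with cmp_to_key.
def fun(a, b):
    return (a[1] > b[1]) - (a[1] < b[1])  # stand-in for the removed cmp()

L.sort(key=cmp_to_key(fun))
print(L)  # same order as above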

TypeError in Python due to empty sequence

I am using a simple script:
def get_stories(self, f):
    data = [([], [u'Where', u'is', u'the', u'apple', u'?'], u'office')]
    flatten = lambda data: reduce(lambda x, y: x + y, data)
    data = [(flatten(story), q, answer) for story, q, answer in data]
    return data
TypeError: reduce() of empty sequence with no initial value
But data is not empty!
Why is this error happening?
Thanks a lot for your help.
The error message gives a good hint about how to solve the problem by stating that there is "no initial value". Here's what the docs for reduce have to say:
reduce(function, iterable[, initializer])
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single
value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated
value and the right argument, y, is the update value from the
iterable. If the optional initializer is present, it is placed before
the items of the iterable in the calculation, and serves as a default
when the iterable is empty. If initializer is not given and iterable
contains only one item, the first item is returned. [emphasis added]
Although data itself is not empty, the story in its only element is the empty list [], so flatten ends up calling reduce on an empty sequence with no initializer. Supplying an initial value fixes it; your flatten function should look like this:
flatten = lambda data: reduce(lambda x, y: x + y, data, [])
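A quick illustrative check of what the initializer changes (using made-up sample lists):
from functools import reduce  # reduce lives in functools on Python 3

flatten = lambda data: reduce(lambda x, y: x + y, data, [])

print(flatten([]))             # [] - the initializer is returned for the empty story
print(flatten([[1, 2], [3]]))  # [1, 2, 3]
# Without the third argument, flatten([]) raises:
# TypeError: reduce() of empty sequence with no initial value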

get average length of words using python reduce

I am attempting to get the average length of words in a file using reduce, but I am getting the error "TypeError: object of type 'int' has no len()", which I find baffling because the file is filled with words and I am simply getting the length of each.
def averageWord():
    myWords = open('myfile.txt', 'r').read().split(' ')
    avg = (reduce(lambda x, y: len(x) + len(y), myWords)) / len(myWords)
    print avg
The reduce function works on two elements of the list at a time and then feeds the returned value back in as the left argument together with the next element. So the reduce function is used in the wrong way here: after the first call, the x in your lambda is already an int (the sum of two lengths), and calling len() on an int raises the TypeError. Quoting from the docs:
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).
(emphasis mine)
Using sum along with map is a better way (as mentioned). That is:
avg = sum(map(len, myWords)) / len(myWords)
This gives you the average as expected.
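If you do want to stay with reduce, here is a sketch that keeps the running total as a number by supplying an integer initializer (the filename is just the one from the question):
from functools import reduce  # needed on Python 3

def average_word_length(path='myfile.txt'):
    # .split() with no argument splits on any whitespace.
    words = open(path, 'r').read().split()
    # Start from 0 so the left argument is always the running total,
    # never a word, and len() is only ever applied to strings.
    total = reduce(lambda acc, word: acc + len(word), words, 0)
    return total / len(words)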

How to use python generator expressions to create a oneliner to run a function multiple times and get a list output

I am wondering if there is a simple Pythonic way (maybe using generators) to run a function over each item in a list and get the return values back as a list?
Example:
def square_it(x):
    return x*x

x_set = [0, 1, 2, 3, 4]
squared_set = square_it(x for x in x_set)
I notice that when I do a line by line debug on this, the object that gets passed into the function is a generator.
Because of this, I get an error:
TypeError: unsupported operand type(s) for *: 'generator' and 'generator'
I understand that this generator expression created a generator to be passed into the function, but I am wondering if there is a cool way to accomplish running the function multiple times only by specifying an iterable as the argument? (without modifying the function to expect an iterable).
It seems to me that this ability would be really useful to cut down on lines of code, because you would not need to create a loop to run the function and a variable to save the output in a list.
Thanks!
You want a list comprehension:
squared_set = [square_it(x) for x in x_set]
There's a builtin function, map(), for this common problem.
>>> map(square_it, x_set)
[0, 1, 4, 9, 16]  # Python 2 output; on Python 3, map returns a lazy iterator, so wrap it in list() to see the values.
Alternatively, one can use a generator expression, which is memory-efficient but lazy (meaning the values will not be computed now, only when needed):
>>> (square_it(x) for x in x_set)
<generator object <genexpr> at ...>
Similarly, one can also use a list comprehension, which computes all the values upon creation, returning a list.
Additionally, here's a comparison of generator expressions and list comprehensions.
You want to call the square_it function inside the generator, not on the generator.
squared_set = (square_it(x) for x in x_set)
As the other answers have suggested, I think it is best (most "pythonic") to call your function explicitly on each element, using a list or generator comprehension.
To actually answer the question though, you can wrap your function that operates over scalars with a function that sniffs the input and behaves differently depending on what it sees. For example:
>>> import types
>>> def scaler_over_generator(f):
...     def wrapper(x):
...         if isinstance(x, types.GeneratorType):
...             return [f(i) for i in x]
...         return f(x)
...     return wrapper
>>> def square_it(x):
...     return x * x
>>> square_it_maybe_over = scaler_over_generator(square_it)
>>> square_it_maybe_over(10)
100
>>> square_it_maybe_over(x for x in range(5))
[0, 1, 4, 9, 16]
I wouldn't use this idiom in my code, but it is possible to do.
You could also code it up with a decorator, like so:
>>> @scaler_over_generator
... def square_it(x):
...     return x * x
>>> square_it(x for x in range(5))
[0, 1, 4, 9, 16]
That works if you don't want or need a handle to the original, undecorated function.
Note that there is a difference between list comprehension returning a list
squared_set = [square_it(x) for x in x_set]
and a generator expression returning a generator that you can iterate over:
squared_set = (square_it(x) for x in x_set)
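For example, a quick illustration of that difference:
def square_it(x):
    return x * x

x_set = [0, 1, 2, 3, 4]
squared_list = [square_it(x) for x in x_set]   # values computed immediately
squared_gen = (square_it(x) for x in x_set)    # values computed on demand

print(squared_list)        # [0, 1, 4, 9, 16]
print(list(squared_gen))   # [0, 1, 4, 9, 16]
print(list(squared_gen))   # [] - the generator is single-use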
