get average length of words using python reduce - python

I am attempting to get the average length of words in a file using reduce, but I am getting the error "TypeError: object of type 'int' has no len()", which I find baffling because the file is filled with words and I am simply getting the length of each.
def averageWord():
    myWords = open('myfile.txt', 'r').read().split(' ')
    avg = (reduce(lambda x,y: len(x) + len(y) ,myWords)) / len(myWords)
    print avg

The reduce function works on the first two elements of the list, then on the returned value together with the next element, and so on. In your lambda the first call returns len(x) + len(y), which is an int, so on the second call x is that int and len(x) raises the TypeError. Quoting from the docs:
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).
(emphasis mine)
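Using sum along with map is a better way.
That is
avg = sum(map(len, myWords)) / len(myWords)
This gives you the average as expected. Note that in Python 2, / on two ints truncates the result, so divide by float(len(myWords)) if you need the fractional part.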
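As a minimal sketch of both approaches, assuming Python 3 (where reduce lives in functools and / is true division): the word list below is made-up sample data standing in for the file contents. If you do want reduce, give it an integer initializer so the accumulator is always an int.

```python
from functools import reduce

# Hypothetical sample data standing in for the contents of myfile.txt.
words = "the quick brown fox".split()

# reduce needs an integer accumulator: start at 0 and add each word's length.
total_with_reduce = reduce(lambda acc, word: acc + len(word), words, 0)

# The same total with sum and map, which reads more directly.
total_with_sum = sum(map(len, words))

average = total_with_sum / len(words)
print(total_with_reduce, total_with_sum, average)  # 16 16 4.0
```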

Related

Lambda function error while using python3

I have a piece of code that works in Python 2.7 but not in Python 3.7. Here I am trying to sort by the values of a lambda function.
def get_nearest_available_slot(self):
    """Method to find nearest availability slot in parking
    """
    available_slots = filter(lambda x: x.availability, self.slots.values())
    if not available_slots:
        return None
    return sorted(available_slots, key=lambda x: x.slotNum)[0]
The error I get is:
File "/home/xyz/Desktop/parking-lot/parking-lot-1.4.2/parking_lot/bin/source/parking.py", line 45, in get_nearest_available_slot
return sorted(available_slots, key=lambda x: x.slotNum)[0]
IndexError: list index out of range
What am I doing wrong here?
The answer is simple: it's because of how filter works.
In Python 2, filter is eagerly evaluated, which means that once you call it, it returns a list:
filter(lambda x: x % 2 == 0, [1, 2, 3])
Output:
[2]
Conversely, in Python 3, filter is lazily evaluated; it produces an object you can iterate over once, or an iterator:
<filter at 0x110848f98>
In Python 2, the line if not available_slots stops execution if the result of filter is empty, since an empty list evaluates to False.
However, in Python 3, filter returns an iterator, which always evaluates to True: an iterator has no length, and you cannot tell whether it has been exhausted without trying to get its next element.
Because of this, a case exists where an empty iterator gets passed to sorted, producing another empty list. You cannot access the element at position 0 of an empty list, so you get an IndexError.
To fix this, I suggest evaluating the condition strictly. You could do something like this, replacing sorted with min, since we only need one value:
def get_nearest_available_slot(self):
    """Method to find nearest availability slot in parking
    """
    available_slots = [slot for slot in self.slots.values() if slot.availability]
    if available_slots:
        return min(available_slots, key=lambda x: x.slotNum)
    else:
        return None
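The truthiness difference is easy to check in isolation. The sketch below assumes Python 3.4 or later and uses plain integers instead of slot objects; the helper name nearest is made up for illustration. It also shows that min accepts a default, which avoids the emptiness check altogether.

```python
def nearest(slots):
    """Return the smallest slot number, or None when nothing is available."""
    # default is returned when the iterable is empty (Python 3.4+).
    return min(slots, default=None)

empty = filter(lambda n: n > 10, [1, 2, 3])
print(bool(empty))   # True: a filter object is truthy even if it yields nothing
print(list(empty))   # []

print(nearest(filter(lambda n: n > 10, [1, 2, 3])))  # None
print(nearest(filter(lambda n: n > 1, [1, 2, 3])))   # 2
```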

TypeError in Python due to empty sequence

I am using a simple script:
def get_stories(self, f):
    data = [([], [u'Where', u'is', u'the', u'apple', u'?'],u'office')]
    flatten = lambda data: reduce(lambda x, y: x + y, data)
    data = [(flatten(story), q, answer) for story, q, answer in data]
    return data
TypeError: reduce() of empty sequence with no initial value
But data is not empty!
Why is this error happening?
Thanks a lot for your help.
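The error message gives a good hint about how to solve the problem, by stating that there is "no initial value". Although data itself is not empty, flatten is called on story, the first element of each tuple, and in your data that is the empty list []. Here's what the docs for reduce have to say: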
reduce(function, iterable[, initializer])
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single
value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated
value and the right argument, y, is the update value from the
iterable. If the optional initializer is present, it is placed before
the items of the iterable in the calculation, and serves as a default
when the iterable is empty. If initializer is not given and iterable
contains only one item, the first item is returned. [emphasis added]
So your flatten function should look like this:
flatten = lambda data: reduce(lambda x, y: x + y, data, [])
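A short sketch of the difference the initializer makes, assuming Python 3 (so reduce is imported from functools); the sample lists are made up for illustration.

```python
from functools import reduce

def flatten(list_of_lists):
    # The [] initializer is returned unchanged when list_of_lists is empty.
    return reduce(lambda x, y: x + y, list_of_lists, [])

print(flatten([]))                   # []
print(flatten([["a", "b"], ["c"]]))  # ['a', 'b', 'c']

try:
    reduce(lambda x, y: x + y, [])   # no initializer and nothing to reduce
except TypeError as exc:
    print("TypeError:", exc)
```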

Python: TypeError: 'int' object is not iterable(Think Python 10.3)

I am trying to solve Think Python exercise 10.3:
Write a function that takes a list of numbers and returns the cumulative sum; that is, a new list where the ith element is the sum of the first i + 1 elements from the original list. For example, the cumulative sum of [1, 2, 3] is [1, 3, 6].
I get a TypeError with this code:
def culm_sum(num):
    res= []
    a = 0
    for i in num:
        a += i
        res += a
    return res
When I call culm_sum([1, 2, 3]) I get
TypeError: 'int' object is not iterable
Thank you!
The code you are using to append to your list is incorrect:
res += a
Instead do
res.append(a)
What's wrong with res += a? On a list, += extends the list, so Python expects a to be iterable and behind the scenes tries to do the equivalent of:
for item in a:
    res.append(item)
But since a is an int and is not iterable, you get a TypeError.
Note: I initially thought your error was in for i in num: because the name num sounds like a single integer. Since it is a list of numbers, at least make it plural (nums) so that readers of your code are not confused. (The reader you will usually be helping is future you.)
You are trying to extend your list with an int, which is not iterable, hence the error. Add the element to the list using the append method:
res.append(a)
Or, if you want to keep +=, wrap the element in a list so that there is something to extend with:
res += [a]
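Putting the fix together, a corrected version could look like the sketch below. The names cumulative_sum and running_total are my own, and itertools.accumulate (Python 3.2+) is shown only as a standard-library cross-check, not as the exercise's intended solution.

```python
from itertools import accumulate

def cumulative_sum(nums):
    """Return a list whose i-th element is the sum of nums[:i + 1]."""
    result = []
    running_total = 0
    for n in nums:
        running_total += n
        # append adds a single element; += would need an iterable.
        result.append(running_total)
    return result

print(cumulative_sum([1, 2, 3]))    # [1, 3, 6]
print(list(accumulate([1, 2, 3])))  # [1, 3, 6]
```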

How does the min/max function on a nested list work?

Let's say there is a nested list, like:
my_list = [[1, 2, 21], [1, 3], [1, 2]]
When the function min() is called on this:
min(my_list)
The output received is
[1, 2]
Why and how does it work? What are some use cases of it?
How are lists and other sequences compared in Python?
Lists (and other sequences) in Python are compared lexicographically and not based on any other parameter.
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.
What is lexicographic sorting?
From the Wikipedia page on lexicographic sorting
lexicographic or lexicographical order (also known as lexical order, dictionary order, alphabetical order or lexicographic(al) product) is a generalization of the way the alphabetical order of words is based on the alphabetical order of their component letters.
The min function returns the smallest value in the iterable. Here [1,2] is lexicographically the smallest element of the list, so it is the one returned:
>>> my_list=[[1,2,21],[1,3],[1,2]]
>>> min(my_list)
[1, 2]
What is happening in this case of min?
Going through my_list pairwise, first compare [1,2,21] and [1,3]. From the docs:
If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively.
Thus [1,2,21] is less than [1,3]: their first elements are equal, and the second element of [1,2,21], which is 2, is less than the second element of [1,3], which is 3.
Now comparing [1,2] and [1,2,21], and adding another reference from the docs
If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.
[1,2] is an initial sub-sequence of [1,2,21]. Therefore the value of [1,2] on the whole is smaller than that of [1,2,21]. Hence [1,2] is returned as the output.
This can be validated by using the sorted function
>>> sorted(my_list)
[[1, 2], [1, 2, 21], [1, 3]]
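The two rules quoted from the docs can be written out by hand. The function lex_less below is a hypothetical illustration of the rule for flat lists of comparable items, not how CPython implements list comparison.

```python
def lex_less(a, b):
    """Return True if sequence a sorts before sequence b lexicographically."""
    for x, y in zip(a, b):
        if x != y:
            return x < y        # the first differing pair decides
    # No differing pair: one is a prefix of the other, the shorter is smaller.
    return len(a) < len(b)

print(lex_less([1, 2, 21], [1, 3]))  # True
print(lex_less([1, 2], [1, 2, 21]))  # True
print(lex_less([1, 3], [1, 2, 21]))  # False
```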
What if the list has multiple minimum elements?
If the list contains duplicate min elements the first is returned
>>> my_list=[[1,2],[1,2]]
>>> min(my_list)
[1, 2]
This can be confirmed using the id function call
>>> my_list=[[1,2],[1,2]]
>>> [id(i) for i in my_list]
[140297364849368, 140297364850160]
>>> id(min(my_list))
140297364849368
What do I need to do to prevent lexicographic comparison in min?
If the required comparison is not lexicographic, then the key argument can be used.
The min function has an additional optional argument called key. The key argument takes a function.
The optional key argument specifies a one-argument ordering function
like that used for list.sort(). The key argument, if supplied, must be
in keyword form (for example, min(a,b,c,key=func)).
For example, if we need the smallest element by length, we need to use the len function.
>>> my_list=[[1,2,21],[1,3],[1,2]]
>>> min(my_list,key=len) # Notice the key argument
[1, 3]
As we can see the first shortest element is returned here.
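Any one-argument function can serve as the key. As a small sketch on the same list, here are a few other orderings; the choice of sum and of the last element is just for illustration.

```python
my_list = [[1, 2, 21], [1, 3], [1, 2]]

# Smallest by total: the sums are 24, 4 and 3.
print(min(my_list, key=sum))                    # [1, 2]

# Longest element.
print(max(my_list, key=len))                    # [1, 2, 21]

# Smallest by last element: the last elements are 21, 3 and 2.
print(min(my_list, key=lambda item: item[-1]))  # [1, 2]
```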
What if the list is heterogeneous?
Python 2
If the list is heterogeneous, type names are considered for ordering; see Comparisons:
Objects of different types except numbers are ordered by their type names
Hence if you put an int and a list there, you will get the integer as the smallest, because CPython 2 orders numbers before objects of other types. Similarly '1' is larger than both of these, since the type name str sorts after list.
>>> my_list=[[1,1,21],1,'1']
>>> min(my_list)
1
Python 3 and onwards
However, this confusing behaviour was removed in Python 3. It now raises a TypeError. From What's New in Python 3.0:
The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don’t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other.
>>> my_list=[[1,1,21],1,'1']
>>> min(my_list)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() < list()
But it works for comparable types. For example:
>>> my_list=[1,2.0]
>>> min(my_list)
1
Here we can see that the list contains float values and int values. But as float and int are comparable types, min function works in this case.
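If you really do need a minimum over mixed types in Python 3, you have to say what the ordering means. The sketch below is one hypothetical choice (order by type name, then by string form); it is not a recommendation, and the exact TypeError message varies between Python versions.

```python
mixed = [[1, 1, 21], 1, "1"]

try:
    min(mixed)
except TypeError as exc:
    print("TypeError:", exc)

# Hypothetical ordering: compare the type name first, then the string form.
smallest = min(mixed, key=lambda item: (type(item).__name__, str(item)))
print(smallest)  # 1, because 'int' sorts before 'list' and 'str'
```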
One simple use case for lexicographical sorting is with making a sortable namedtuple class.
from collections import namedtuple
Time = namedtuple('Time', ['hours', 'minutes', 'seconds'])
t1 = Time(hours=8, minutes=15, seconds=30)
t2 = Time(hours=8, minutes=15, seconds=0)
t3 = Time(hours=8, minutes=30, seconds=30)
t4 = Time(hours=7, minutes=15, seconds=30)
assert min(t1, t2, t3, t4) == t4
assert max(t1, t2, t3, t4) == t3
Two lists are compared element wise
Even if sizes of two lists are different the two lists are compared element wise starting the comparison from the first element.
Now suppose that every element of a list has been checked and they are the same and there is no next element in the shorter list. Then the shorter list is declared to be smaller than the longer one.
Examples:
>>> [1,2]<[1,3]
True
>>> [1,2]<[1,2,21]
True
>>> [1,3]<[1,2,21]
False
>>> [1,2,22]<[1,2,21]
False
>>> [1]<[1,2,21]
True
It compares the lists elementwise:
>>> [1,2]<[1,3]
True
>>> [1,2]<[1,2,21]
True

Python: Understanding reduce()'s 'initializer' argument

I'm relatively new to Python and am having trouble
with Folds or more specifically, reduce()'s 'initializer' argument
e.g. reduce(function, iterable[, initializer])
Here is the function...
>>> def x100y(x,y):
...     return x*100+y
Could someone explain why reduce() produces 44...
>>> reduce(x100y, (), 44)
44
or why it produces 30102 here...
>>> reduce(x100y, [1,2], 3)
30102
From the docs:
reduce(function, iterable[, initializer])
Apply function of two
arguments cumulatively to the items of iterable, from left to right,
so as to reduce the iterable to a single value. For example,
reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and
the right argument, y, is the update value from the iterable. If the
optional initializer is present, it is placed before the items of the
iterable in the calculation, and serves as a default when the iterable
is empty. If initializer is not given and iterable contains only one
item, the first item is returned.
The initializer is placed as element 0 in your iterable and if there are no elements in your iterable it is returned. (So this is why you get 44)
Also, x100y is a perfectly valid Python function; the same call can be written with a lambda:
reduce(lambda x,y: x*100+y,[1,2],3)
which is equivalent to
(3*100+1)*100+2, which gives 30102: the initializer 3 is combined with 1 to give 301, and 301 is then combined with 2.
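To see where the initializer enters the calculation, it can help to record every intermediate value. The helper reduce_steps below is a made-up illustration that folds the same way reduce does when an initializer is given; it assumes Python 3, where reduce is imported from functools.

```python
from functools import reduce

def x100y(x, y):
    return x * 100 + y

def reduce_steps(function, iterable, initializer):
    """Fold like reduce with an initializer, keeping each accumulated value."""
    acc = initializer
    steps = [acc]
    for item in iterable:
        acc = function(acc, item)
        steps.append(acc)
    return steps

print(reduce_steps(x100y, [1, 2], 3))  # [3, 301, 30102]
print(reduce_steps(x100y, (), 44))     # [44]: nothing to combine with
print(reduce(x100y, [1, 2], 3))        # 30102
```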
