Advanced Comparison in Python from "Think Python"

I am currently working through a few sections of "Think Python" by Allen B. Downey and I am having trouble understanding the solution to the question in Section 16.1:
Write a boolean function called is_after that takes two Time objects,
t1 and t2, and returns True if t1 follows t2 chronologically and False
otherwise. Challenge: don’t use an if statement.
His solution is the following:
def is_after(t1, t2):
    """Returns True if t1 is after t2; false otherwise."""
    return (t1.hour, t1.minute, t1.second) > (t2.hour, t2.minute, t2.second)
Full solution code shown here.
Questions: Is this operator comparing multiple values at once? How does this work? Where can I read more about it?

Read the docs here for an explanation
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.
To your specific case: t1.hour is compared against t2.hour. If they are equal, t1.minute is compared against t2.minute. If those are equal, t1.second is compared against t2.second. As soon as a pair differs, the result of comparing that pair is returned.
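As a rough illustration of the field-by-field rule described above, here is a minimal sketch. The Time dataclass is a hypothetical stand-in for the book's Time class (only the hour, minute and second attributes matter), and is_after_spelled_out is an illustrative helper that is not part of the book's solution.

```python
from dataclasses import dataclass


@dataclass
class Time:
    # Hypothetical stand-in for the book's Time class.
    hour: int = 0
    minute: int = 0
    second: int = 0


def is_after(t1, t2):
    """Tuple comparison, as in the solution quoted in the question."""
    return (t1.hour, t1.minute, t1.second) > (t2.hour, t2.minute, t2.second)


def is_after_spelled_out(t1, t2):
    """The same ordering written as explicit field-by-field checks."""
    if t1.hour != t2.hour:
        return t1.hour > t2.hour
    if t1.minute != t2.minute:
        return t1.minute > t2.minute
    return t1.second > t2.second


late = Time(10, 30, 0)
early = Time(10, 29, 59)
print(is_after(late, early), is_after_spelled_out(late, early))
```

Both functions should agree for any pair of times; the tuple version simply lets Python do the cascading checks.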

(t1.hour, t1.minute, t1.second) and (t2.hour, t2.minute, t2.second) are tuples. From the docs:
Tuples and lists are compared lexicographically using comparison of corresponding elements.
Meaning that first t1.hour and t2.hour are compared, then the minutes and then the seconds.

From the Python documentation:
Sequence types also support comparisons. In particular, tuples and lists are compared lexicographically by comparing corresponding elements.

It's just comparing tuples. Try (2,3,4) > (1,2,3) in the interpreter and you'll understand. Play around with tuple comparisons and the rules will become pretty evident.

Related

min on Python tuple with zeros [duplicate]

I have been reading the Core Python programming book, and the author shows an example like:
(4, 5) < (3, 5) # Equals false
So, I'm wondering, how/why does it equal false? How does python compare these two tuples?
Btw, it's not explained in the book.

Tuples are compared position by position:
the first item of the first tuple is compared to the first item of the second tuple; if they are not equal (i.e. the first is greater or smaller than the second) then that's the result of the comparison, else the second item is considered, then the third and so on.
See Common Sequence Operations:
Sequences of the same type also support comparisons. In particular, tuples and lists are compared lexicographically by comparing corresponding elements. This means that to compare equal, every element must compare equal and the two sequences must be of the same type and have the same length.
Also Value Comparisons for further details:
Lexicographical comparison between built-in collections works as follows:
For two collections to compare equal, they must be of the same type, have the same length, and each pair of corresponding elements must compare equal (for example, [1,2] == (1,2) is false because the type is not the same).
Collections that support order comparison are ordered the same as their first unequal elements (for example, [1,2,x] <= [1,2,y] has the same value as x <= y). If a corresponding element does not exist, the shorter collection is ordered first (for example, [1,2] < [1,2,3] is true).
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is considered smaller (for example, [1,2] < [1,2,3] returns True).
Note 1: < and > do not mean "smaller than" and "greater than" but "is before" and "is after": so (0, 1) "is before" (1, 0).
Note 2: tuples must not be considered as vectors in a n-dimensional space, compared according to their length.
Note 3: referring to question https://stackoverflow.com/questions/36911617/python-2-tuple-comparison: do not think that a tuple is "greater" than another only if any element of the first is greater than the corresponding one in the second.
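The rules and notes above can be checked directly in the interpreter. The snippet below is only a small sketch; the variable names are made up for illustration.

```python
# Each flag records one comparison discussed above.
first_pair_decides = (4, 5) < (3, 5)      # 4 < 3 is False; the 5s are never consulted
before_not_smaller = (0, 1) < (1, 0)      # 0 < 1 decides, so (0, 1) "is before" (1, 0)
one_element_enough = (1, 100) < (2, 0)    # decided by 1 < 2, even though 100 > 0
prefix_is_smaller = [1, 2] < [1, 2, 3]    # the shorter sequence is ordered first
mixed_types_equal = [1, 2] == (1, 2)      # a list never compares equal to a tuple

print(first_pair_decides, before_not_smaller, one_element_enough,
      prefix_is_smaller, mixed_types_equal)
```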

The Python documentation does explain it.
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.

The python 2.5 documentation explains it well.
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is ordered first (for example, [1,2] < [1,2,3]).
Unfortunately that page seems to have disappeared in the documentation for more recent versions.

I had some confusion before regarding integer comparison, so I will explain it in a more beginner-friendly way with an example.
a = ('A','B','C') # see it as the string "ABC"
b = ('A','B','D')
Single-character strings are ordered by their code points, e.g. ord('A') is 65, and the same goes for the other elements.
So,
>>> a > b # False, because 'C' comes before 'D'
You can think of it as comparing strings: "ABC" > "ABD" is also False.
The same thing goes for integers too.
x = (1,2,2) # see it as the string "122"
y = (1,2,3)
x > y # False
because 1 is not greater than 1, so move to the next; 2 is not greater than 2, so move to the next; 2 is less than 3, so x comes before y lexicographically.
The key point is mentioned in the answer above:
think of it as one element coming before another alphabetically, not as one element being greater than another, and in this case treat all the tuple elements together as one string.

Short code and notation in python

I see code like this:
K = np.array([B[z==i].mean(axis=0) for i in range(k)])
Where B is a 2D array (matrix) and z is a 1D array (vector).
I am wondering what B[z==i] means?

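In B[z==i] you have two operations.
First, z==i is a comparison. Because z is a NumPy array, the comparison is evaluated element by element and produces a boolean array that is True wherever z equals i.
Second, B[...] indexes B with that boolean array, which selects the rows of B at the positions where the mask is True. The for i in range(k) part iterates over the values of i, not over B.
More generally in Python, the objects being compared do not need to have the same type, but objects of unrelated built-in types usually compare unequal.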
A little summary of how comparisons work:
Numbers are compared arithmetically.
Strings are compared lexicographically using the numeric equivalents (the result of the built-in function ord()) of their characters. Unicode and 8-bit strings are fully interoperable in this behavior.
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is ordered first (for example, [1,2] < [1,2,3]).
Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. Outcomes other than equality are resolved consistently, but are not otherwise defined.
Most other types compare unequal unless they are the same object; the choice whether one object is considered smaller or larger than another one is made arbitrarily but consistently within one execution of a program.
This documentation is a little old (it describes Python 2), but you can find more information about comparisons there: source
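For the B[z==i] expression from the question, a small NumPy sketch may help. The arrays below are made-up data, assuming B holds one point per row and z holds a group label for each row; only the last line comes from the question.

```python
import numpy as np

# Hypothetical data: 5 points in 2-D, each assigned to one of k = 2 groups.
B = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0],
              [9.0, 10.0]])
z = np.array([0, 1, 0, 1, 1])
k = 2

mask = (z == 0)   # elementwise comparison: one boolean per entry of z
rows = B[mask]    # boolean indexing: keeps the rows of B where mask is True

# The line from the question: one mean row per group label.
K = np.array([B[z == i].mean(axis=0) for i in range(k)])
print(mask)
print(rows)
print(K)
```

Here K[i] is the column-wise mean of the rows of B whose label in z equals i.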

Python: Improving run-time: Choosing two movie_lengths whose total runtimes will equal the exact flight length

I am working through another coding interview question.
So I'm building a feature for choosing two movies whose total runtimes will equal the exact flight length.
The question asks the following:
Write a function that takes an integer flight_length (in minutes) and a list of integers movie_lengths (in minutes) and returns a boolean indicating whether there are two numbers in movie_lengths whose sum equals flight_length.
I first thought we could nest two loops (the outer choosing first_movie_length, the inner choosing second_movie_length). That’d give us a runtime of O(n^2).
But is it possible that we can do better?
I have the following solution:
def can_two_movies_fill_flight(movie_lengths, flight_length):
    # movie lengths we've seen so far
    movie_lengths_seen = set()
    for first_movie_length in movie_lengths:
        matching_second_movie_length = flight_length - first_movie_length
        if matching_second_movie_length in movie_lengths_seen:
            return True
        movie_lengths_seen.add(first_movie_length)
    # we never found a match, so return False
    return False
I think this solution gives me O(n) time and O(n) space.
Is it possible to use a hash map?

We could sort the movie_lengths first; then we could use binary search to find second_movie_length in O(lg n) time instead of O(n) time.
But sorting would cost O(n lg n), and we can do even better than that.
My Solution:
Using a set as our data structure.
We make one pass through movie_lengths, treating each item as the first_movie_length. At each iteration, we:
1. See if there is a matching_second_movie_length we've seen already (stored in our movie_lengths_seen set) that is equal to flight_length - first_movie_length. If there is, we short-circuit and return True.
2. Keep our movie_lengths_seen set up to date by throwing in the current first_movie_length.
def can_two_movies_fill_flight(movie_lengths, flight_length):
    # movie lengths we've seen so far
    movie_lengths_seen = set()
    for first_movie_length in movie_lengths:
        matching_second_movie_length = flight_length - first_movie_length
        if matching_second_movie_length in movie_lengths_seen:
            return True
        movie_lengths_seen.add(first_movie_length)
    return False
We know users won't watch the same movie twice because we check movie_lengths_seen for matching_second_movie_length before we've put first_movie_length in it!
Efficiency and algorithmic complexity: O(n) time and O(n) space. Note that while optimizing runtime we added a bit of space cost.
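For comparison, the sort-then-binary-search idea mentioned at the start of this answer could look roughly like the sketch below. The function name can_two_movies_fill_flight_sorted is made up for illustration, and the standard-library bisect module is used for the binary search; this is the slower O(n lg n) alternative, not the set-based solution above.

```python
from bisect import bisect_left


def can_two_movies_fill_flight_sorted(movie_lengths, flight_length):
    """O(n lg n) variant: sort once, then binary-search for each complement."""
    lengths = sorted(movie_lengths)
    for i, first_movie_length in enumerate(lengths):
        target = flight_length - first_movie_length
        # Search only to the right of index i so a movie is never paired with itself.
        j = bisect_left(lengths, target, i + 1)
        if j < len(lengths) and lengths[j] == target:
            return True
    return False


print(can_two_movies_fill_flight_sorted([80, 40, 95, 30], 125))
print(can_two_movies_fill_flight_sorted([40], 80))
```

A single 40-minute movie cannot fill an 80-minute flight, but two different 40-minute movies can, which matches the behaviour of the set-based version.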

What happens when you compare lists of different sizes?

In Python, I have the two lists of different sizes:
x = [[0,5,10],[0,10,5]]
y = [100,500,900]
What is the comparison happening at each step when I run:
print x>y
e.g. How does it compare say the first element: [0,5,10] vs 100?

In Python 3, you can't compare those two lists because their elements are not of comparable types.
In Python 2, lists are always greater than integers, period, so your x is always greater than your y regardless of what elements are in x's sublists.

The real problem is how to compare [0,5,10] vs 100, i.e, a list vs an integer.
The answer depends on the Python version. In Python 3.x, the two types can't be compared. In Python 2.x, lists are always greater than integers because CPython 2 orders numbers before objects of every other type (an implementation detail).
In your example, the print statement in
print x>y
suggests that you are using Python 2.x, so the answer is x > y would be True.
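To see the Python 3 behaviour described in these answers, the following sketch catches the error instead of letting it stop the program. The names result and message are just illustrative.

```python
x = [[0, 5, 10], [0, 10, 5]]
y = [100, 500, 900]

try:
    # The first elements, [0, 5, 10] and 100, differ, so Python tries to
    # order a list against an int, which Python 3 refuses to do.
    result = x > y
    message = ""
except TypeError as exc:
    result = None
    message = str(exc)

print(result, message)
```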

In Python 3 that comparison won't work: list and int can't be ordered against each other, so x > y raises a TypeError.

Just to expand on the other answers, the documentation is pretty good on this stuff. From the 2.7 documentation:
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal.
From the 3.5 documentation:
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.

Essentially x == y uses the magic __eq__ method on objects to compare them. Different objects will behave differently, and you can even define your own custom equalities, but comparing objects of unrelated types for equality will usually evaluate to False (Python2 and Python3 docs).
So in your example, [0,5,10] == 100 evaluates to False, not because it checked to see if the elements in the list were equal to 100, but because the two types were incompatible.

Lexicographical Sorting of Word List

I need to merge and sort lists of 100,000+ words lexicographically. I currently do it with a slightly modified bubble sort, but at O(n^2) it takes quite a while. Are there any faster algorithms for sorting lists of words? I'm using Python, but if there is a language that can handle this better I'm open to suggestions.

Use the built-in sort() list method:
>>> words = [ 'baloney', 'aardvark' ]
>>> words.sort()
>>> print(words)
['aardvark', 'baloney']
It uses an O(n lg(n)) sort [1], Timsort (a hybrid of merge sort and insertion sort that is highly tuned for speed).
[1] As pointed out in the comments, this refers to the number of element comparisons, not the number of low-level operations. Since the elements in this case are strings, and comparing two strings takes up to min{|S1|, |S2|} character comparisons, the total complexity is O(n lg(n) * |S|) where |S| is the length of the longest string being sorted. This is true of all comparison sorts, however: the true number of operations varies depending on the cost of the element-comparison function for the type of elements being sorted. Since all comparison sorts use the same comparison function, you can just ignore this subtlety when comparing the algorithmic complexity of these sorts amongst each other.

Any O(n log n) sorting algorithm will probably do better than bubble sort, but it will be O(n log n * |S|).
However, sorting strings can be done in O(n*|S|), where |S| is the length of the average string, using a trie and a simple DFS.
high-level pseudo code:
1. create a trie from your collection.
2. do a DFS on the trie generated, and add each string
to the list when you reach terminal node.
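The two steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: trie_sort is a made-up name, nodes are plain dicts, duplicates are kept by counting them at the terminal node, and the children of each node are sorted on the way down (which adds a small factor that depends on the alphabet size). Building each prefix by string concatenation also costs extra for long words, so this is not a tuned implementation.

```python
def trie_sort(words):
    """Sort strings by inserting them into a trie and walking it depth-first."""
    # Step 1: build the trie. Each node stores its children and how many
    # input words end exactly at that node.
    root = {"children": {}, "count": 0}
    for word in words:
        node = root
        for ch in word:
            node = node["children"].setdefault(ch, {"children": {}, "count": 0})
        node["count"] += 1

    # Step 2: iterative pre-order DFS; the stack holds (node, prefix) pairs.
    result = []
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        result.extend([prefix] * node["count"])
        # Push children in reverse order so the smallest character is popped first.
        for ch in sorted(node["children"], reverse=True):
            stack.append((node["children"][ch], prefix + ch))
    return result


print(trie_sort(["baloney", "aardvark", "ball", "a"]))
```

In practice the built-in sort is likely to be faster for a list of this size, since it runs in C; the sketch is mainly meant to show how the trie walk yields lexicographic order.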
