No cmp keyword for max function in python - python

So, when sorting a list in python, the sorted function can take a cmp keyword to override the __cmp__ function of the objects we are sorting.
I would expect max to have a similar keyword, but it doesn't. Does anyone know why?
And in any case, does anyone know the most pythonic way to get around this? I don't want to override __cmp__ for the classes themselves, and the other options I can think of, like sorted(L,cmp=compare)[-1], seem ugly. What would be a nice way to do this?
The actual example is L=[a1,a2,...,an], where each ak is itself a list of integers, and we want the maximum where ai<aj is meant in the lexicographical sense.

Don't use cmp. It has been removed from Python 3 in favor of key.
Since Python compares strings lexicographically, you could use key = str to find the "maximum" integer:
In [2]: max([10,9], key = str)
Out[2]: 9
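For the lists-of-integers case in the question, here is a minimal sketch (the sample data and the compare function are made up for illustration). Python already compares lists lexicographically, so plain max works, and functools.cmp_to_key can wrap an existing comparison function if one is really needed:

```python
from functools import cmp_to_key

# Hypothetical sample data: each element is a list of integers.
L = [[1, 2, 3], [1, 10], [1, 2]]

# Lists compare element by element (lexicographically) out of the box.
largest = max(L)

def compare(a, b):
    """Old-style comparison: negative, zero, or positive."""
    return (a > b) - (a < b)

# cmp_to_key turns a comparison function into a key function.
largest_via_cmp = max(L, key=cmp_to_key(compare))

print(largest)          # [1, 10]
print(largest_via_cmp)  # [1, 10]
```

Note that key=str compares the string forms, which is why max([10, 9], key=str) returns 9 rather than 10.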

Related

Sorting with a custom method with additional parameter?

So I have a fitness function (returning only true or false for a given pair of arguments) which I would like to use as a key for sorting my list of possible arguments. While normally, I'd be able to do something like:
sorted(possibleArguments, key = fitnessFunction)
Here the problem is that my fitness function looks like this:
def fitnessFunction(arg1, arg2, f):
    return f(*arg1) < f(*arg2)
Of course, in the method where I want to do the sorting, the function used to calculate the fitness is known and doesn't change during the sort, but can I somehow tell Python that's the case? Can I do something like:
sorted(possibleArguments, key = fitnessFunction(one element to be compared, the other one, function I'm currently interested in))
If so, how?
key does not take a comparison function, it converts an element of the list into a comparable item.
BTW, it's no longer possible to pass a comparison function directly to sort in Python 3 (and the __cmp__ method is gone from objects too; functools.cmp_to_key can still wrap an old comparison function into a key), so you'd better get used to it. It was cumbersome: you had to return 0 if equal, negative if lesser, positive if bigger, a bit like strcmp does. Archaic. You could create complex comparison functions, but they could prove unstable. I surely don't miss them.
Fortunately you have the f() function which is enough.
You just have to do this in your case:
sorted(possibleArguments, key = lambda x : f(*x))
The comparisons are done by the sort function. No need for fitnessFunction.
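As a minimal sketch of the key-based approach (the scoring function f and the sample argument tuples below are hypothetical stand-ins for the asker's):

```python
def f(a, b):
    """Hypothetical scoring function; stands in for the asker's f."""
    return a * a + b

# Each candidate is a tuple of arguments that f accepts.
possibleArguments = [(3, 1), (1, 5), (2, 0), (0, 2)]

# key maps each element to a comparable value; sorted does the comparing.
ranked = sorted(possibleArguments, key=lambda x: f(*x))

print(ranked)  # [(0, 2), (2, 0), (1, 5), (3, 1)]
```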

One liner. Plotting a histogram using a method from class instances in a list

I have just learned to use Python's lambda.
Why does this work:
print(max(my_firms, key=lambda firm: firm.num_members()))
but this does not:
plt.hist(my_firms, key=lambda firm: firm.num_members())
That is, I have a list, my_firms, that contains class instances, firm, that have a method num_members(). I want to plot a histogram of the number of members of all firms in my_firms.
Thanks a lot
Not every method will accept a key argument. In fact, most don't. I suspect that matplotlib's hist function is one that doesn't.
In this case, you'll probably want to use a list comprehension to transform the firm objects into numbers of members:
plt.hist([f.num_members() for f in my_firms])
In other places, you'll probably use a generator expression instead, but IIRC, plt.hist expects an array-like object and generators don't quite fit the bill.
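A small sketch of that transformation, assuming a made-up Firm class in place of the asker's. The plotting calls are left as comments so the snippet runs without matplotlib installed:

```python
class Firm:
    """Hypothetical stand-in for the asker's firm class."""

    def __init__(self, members):
        self._members = list(members)

    def num_members(self):
        return len(self._members)

my_firms = [Firm("ab"), Firm("abcd"), Firm("a"), Firm("xy")]

# Build a concrete list of numbers first; this is what gets passed on.
member_counts = [firm.num_members() for firm in my_firms]
print(member_counts)  # [2, 4, 1, 2]

# With matplotlib installed, the plot would then be:
#   import matplotlib.pyplot as plt
#   plt.hist(member_counts)
#   plt.show()
```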
As far as I know, plt.hist doesn't take a keyword argument called key. Check the documentation.
As for your plot, you can probably achieve what you are looking for with a list comprehension like this:
plt.hist([f.num_members() for f in my_firms])

conditionals with dicts Python

I was wondering what the correct way is to check a key:value pair of a dict. Let's say I have this dict:
dict_ = {
    'key1': 'val1',
    'key2': 'val2'
}
I can check a condition like this:
if dict_['key1'] == 'val1':
but I feel like there is a more elegant way that takes advantage of the dict data structure.
What you're doing already does take advantage of the data structure, which is why it's "the one obvious way" to do what you want to do. (You can find examples like this all over the tutorial, the reference docs, and the stdlib implementation.)
However, I can see what you're thinking: the dict is in some sense a container of key-value pairs (even if it's only a collections.Container of keys…), so… shouldn't there be some way to just check whether a key-value pair exists?
Up to Python 2.6, there really isn't.* But in 3.0, the items() method returns a special set-like view of the key-value pairs. And 2.7 backported that functionality, under the name viewitems. So:
('key1', 'val1') in dict_.viewitems()
But I don't think that's really clearer or cleaner; "items" feels like a lower-level way to think of dictionaries than "mappings", which is what both your original code and smci's answer rely on.
It's also less concise, it doesn't work in 2.6 or earlier, many dict-like mapping objects don't support it,** and it's slightly slower on 2.7 to boot, but these points are probably less important, and not what you asked about.
* Well, there is, but only by iterating over all of the items with iteritems, or using items to effectively do the same exhaustive search behind your back, neither of which is what you want.
** In fact, in 2.7, it's not actually possible to support it with a pure-Python class…
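For comparison, a minimal Python 3 sketch of the pair-membership check, where items() is the set-like view (the dict contents are the ones from the question):

```python
dict_ = {'key1': 'val1', 'key2': 'val2'}

# In Python 3, items() returns a set-like view, so a (key, value) tuple
# can be tested for membership directly.
pair_present = ('key1', 'val1') in dict_.items()
pair_absent = ('key1', 'other') in dict_.items()

# The mapping-style check from the question gives the same answer here.
mapping_check = dict_['key1'] == 'val1'

print(pair_present, pair_absent, mapping_check)  # True False True
```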
If you want to avoid raising KeyError when the dict doesn't even contain 'key1':
if dict_.get('key1')=='val1':
(However, raising an exception for a missing key is a perfectly fine Python idiom.)
Otherwise, @Cyber is correct that it's already fine! (What exactly is the problem?)
In Python 2 there is a has_key method:
dict_.has_key('key1')
This returns a boolean, True or False. It was removed in Python 3, where the equivalent is 'key1' in dict_.
Alternatively, you can have the get method return a default value when the key is not present:
dict_.get('key3','Default Value')
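A short sketch of the membership test and of get with and without a default, using the dict from the question:

```python
dict_ = {'key1': 'val1', 'key2': 'val2'}

# Membership test on keys; works in Python 2 and 3.
has_key1 = 'key1' in dict_
has_key3 = 'key3' in dict_

# get returns the stored value, or the supplied default if the key is missing.
value3 = dict_.get('key3', 'Default Value')

# Without a default, get returns None, so the comparison is simply False
# instead of raising KeyError.
safe_check = dict_.get('key3') == 'val1'

print(has_key1, has_key3, value3, safe_check)
```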

Delegate sort based on a condition

I have a list of objects, that are pre-sorted based on some complex criteria that cannot be easily duplicated with attrgetter, for example. I want to further sort a subset of them alphabetically, if both of them have the property: part_of_subset.
How do I do this without re-defining an alphabetic sort function?
def cmp(a, b):
    if a.part_of_subset and b.part_of_subset:
        # sort alphabetically -- must I duplicate alphabetic sort code?
        pass
    return 0
While you can define a comparison function for sorting, it is generally recommended to use a key function. For your application, this key function should return the same value for everything that should be left untouched, and the sort key for the rest. Example:
def my_key(a):
    if a.part_of_subset:
        return (0,)
    return (1, a.sort_key)
collection.sort(key=my_key)
Note that the subset that is sorted will be grouped together into one block after the already sorted elements.
Edited: To get rid of the restriction that sort_key may never be None, and to make the code work in Python 3, I updated the key function. The old version might also have led to strange results in the case that the sort keys are of different types (which does not seem too useful, but anyway).
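To see how that key function behaves, here is a small sketch with a made-up Item type (the field names follow the snippet above; the data is invented). Elements whose key is (0,) keep their relative order because the sort is stable, and the rest are ordered by sort_key and placed after them:

```python
from collections import namedtuple

# Hypothetical element type; the field names follow the snippet above.
Item = namedtuple("Item", ["name", "part_of_subset", "sort_key"])

def my_key(a):
    if a.part_of_subset:
        return (0,)              # identical keys: relative order is preserved
    return (1, a.sort_key)       # these are ordered by sort_key

collection = [
    Item("A", True, "z"),
    Item("B", False, "b"),
    Item("C", True, "y"),
    Item("D", False, "a"),
]

# list.sort is stable, so elements with equal keys never swap places.
collection.sort(key=my_key)

names = [item.name for item in collection]
print(names)  # ['A', 'C', 'D', 'B']
```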
You can delegate the sorting to another function under certain conditions by just saying return cmp(a, b). I'm referring to the built-in Python 2 function cmp (it was removed in Python 3), not your cmp.

Is arr.__len__() the preferred way to get the length of an array in Python? [duplicate]

In Python, is the following the only way to get the number of elements?
arr.__len__()
If so, why the strange syntax?
my_list = [1,2,3,4,5]
len(my_list)
# 5
The same works for tuples:
my_tuple = (1,2,3,4,5)
len(my_tuple)
# 5
And strings, which are sequences of characters:
my_string = 'hello world'
len(my_string)
# 11
It was intentionally done this way so that lists, tuples and other container types didn't all need to explicitly implement a public .length() method; instead, you can just check the len() of anything that implements the 'magic' __len__() method.
Sure, this may seem redundant, but length checking implementations can vary considerably, even within the same language. It's not uncommon to see one collection type use a .length() method while another type uses a .length property, while yet another uses .count(). Having a single built-in function unifies the entry point for all these types. So even objects you may not consider to be lists of elements could still be length-checked. This includes strings, queues, trees, etc.
The functional nature of len() also lends itself well to functional styles of programming.
lengths = map(len, list_of_containers)
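A runnable version of that one-liner, with an invented list of containers. In Python 3, map returns a lazy iterator, so the sketch wraps it in list() to show the values:

```python
# Hypothetical mix of container types, each with its own notion of length.
list_of_containers = [[1, 2, 3], (4, 5), "hello", {"a": 1}, set()]

# In Python 3, map returns a lazy iterator, so wrap it in list() to see the values.
lengths = list(map(len, list_of_containers))

print(lengths)  # [3, 2, 5, 1, 0]
```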
The way you take a length of anything for which that makes sense (a list, dictionary, tuple, string, ...) is to call len on it.
l = [1,2,3,4]
s = 'abcde'
len(l) #returns 4
len(s) #returns 5
The reason for the "strange" syntax is that internally Python translates len(object) into a call to the object's __len__() method. This applies to any object. So, if you are defining a class and it makes sense for it to have a length, just define a __len__() method on it, and then len can be called on its instances.
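As an illustration, a minimal sketch of a user-defined class that supports len (the Playlist class and its contents are hypothetical):

```python
class Playlist:
    """Hypothetical class that is not a built-in container but has a length."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(playlist) ends up here.
        return len(self._tracks)

playlist = Playlist(["intro", "verse", "outro"])

print(len(playlist))       # 3
print(playlist.__len__())  # 3, same result, but len() is the idiomatic spelling
```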
Just use len(arr):
>>> import array
>>> arr = array.array('i')
>>> arr.append(2)
>>> arr.__len__()
1
>>> len(arr)
1
Python uses duck typing: it doesn't care about what an object is, as long as it has the appropriate interface for the situation at hand. When you call the built-in function len() on an object, you are actually calling its internal __len__ method. A custom object can implement this interface and len() will return the answer, even if the object is not conceptually a sequence.
For a complete list of interfaces, have a look here: http://docs.python.org/reference/datamodel.html#basic-customization
The preferred way to get the length of any Python object is to pass it as an argument to the len function. Internally, Python will then try to call the special __len__ method of the object that was passed.
You can use len(arr), as suggested in previous answers, to get the length of the array. If you want the dimensions of a 2D NumPy array, arr.shape returns them as a tuple (rows, columns).
The len(list_name) function takes a list as a parameter and calls the list's __len__() method.
Python suggests users use len() instead of __len__() for consistency, as others have said. However, there are some other benefits:
For some built-in types like list, str, bytearray and so on, the CPython implementation of len() takes a shortcut. It directly returns the ob_size field of a C structure, which is faster than calling __len__().
If you are interested in such details, you could read the book "Fluent Python" by Luciano Ramalho. It contains many interesting details and may help you understand Python more deeply.
