Using max() on a list that contains numbers and strings - python

I have a list that is like this:
self.thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
Basically the list has a name and a number after it, and I'm trying to find out the highest number in the list, which is 394. But for some reason, it picks a name as this.
if max(self.thislist) > 150:
    this = max(self.thislist)  # so this should be 394
    position = self.thislist.index(this)  # the index of it
    temponary = position - 1  # this is so we can find the name that is associated with it
    name = self.thislist[temponary]  # and this retrieves the name
and it retrieves, for example, 'Name', when it should be 394.
So the point is to retrieve a name and the number associated with that name. Any ideas?

By calling max, you're asking it to compare all the values.
In Python 2.x, most values can be compared to each other, even if they're of different types; the comparison will be meaningful in some arbitrary and implementation-specific way (in CPython, it mostly comes down to comparing the names of type objects themselves), but that's rarely if ever useful to you.
In Python 3.x, most values of unrelated types can't be compared to each other, so you'd just get a TypeError instead of a useless answer. But the solution is the same.
If you want to compare the numbers and ignore the names, you can filter out all non-numbers, skip every even element, use a key function that converts all non-numbers to something smaller than any number, or almost anything else that avoids trying to compare the names and the numbers. For example:
if max(self.thislist[1::2]) > 150:
As a side note, using data structures like this is going to make a lot of things more complicated. It seems like what you really want here is not a list of alternating names and numbers, but a dict mapping names to numbers, or a list of name-number pairs, or something similar. Then you could write things more readably. For example, after this:
self.thisdict = dict(zip(self.thislist[::2], self.thislist[1::2]))
… you can do things like:
if max(self.thisdict.itervalues()) > 150:
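To get both the name and its number in one step, as the question asks, here is a minimal Python 3 sketch. It assumes the list strictly alternates name, number; the helper name highest_entry is made up for illustration.

```python
from operator import itemgetter

def highest_entry(flat_list):
    """Return (name, number) for the largest number in an alternating name/number list."""
    # Pair each name (even index) with the number that follows it (odd index).
    pairs = zip(flat_list[::2], flat_list[1::2])
    # Compare the pairs by their number only, so names are never compared to numbers.
    return max(pairs, key=itemgetter(1))

name, number = highest_entry(['Name', 13, 'Name1', 160, 'Name2', 394])
print(name, number)  # Name2 394
```

Because only the numbers are compared, this behaves the same on Python 2 and Python 3.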

In Python 2, you can compare values of different types, and they will then be compared by the name of the type. Since str comes after int, any string will compare higher than any integer. Since this doesn't make any sense, Python 3 has wisely removed this feature.
In order to get what you want, use a custom key:
>>> thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> max(thislist, key = lambda x: x if isinstance(x, (int, long, float)) else 0)
394
(This assumes that there is at least one positive number in the list)
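If the list may hold only negative numbers, the fallback value 0 would win over all of them. One possible variation, sketched here for Python 3 (so without long), maps non-numbers to negative infinity instead. The function name max_number is an assumption for illustration.

```python
def max_number(items):
    """Return the largest int/float in items, treating everything else as -infinity."""
    # Non-numbers get the key -inf, so any real number beats them.
    return max(items, key=lambda x: x if isinstance(x, (int, float)) else float('-inf'))

print(max_number(['Name', -13, 'Name1', -160]))  # -13
```

This still assumes that the list contains at least one number; otherwise max() returns one of the strings.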

That's because strings always compare greater than integers in Python 2. You can use a custom key function to fix that:
>>> lst = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> max(lst, key=lambda x: (isinstance(x, (int, float)), x))
394

I would do it like this:
>>> thislist = ['Name', 13, 'Name1', 160, 'Name2', 394]
>>> names, numbers = thislist[0::2], thislist[1::2]
>>> max(zip(numbers, names))
(394, 'Name2')

You'll want to filter the strings out of the list first, as values of different types have an arbitrary (but consistent) ordering in Python 2.x, which can be checked easily:
>>> 'foo' > 1
True
I'd just filter with a generator expression that only pulls out the numbers and pass that to max:
import numbers
max(x for x in self.thislist if isinstance(x, numbers.Number))
demo:
>>> lst = ['foo', 1, 1.6, 'bar']
>>> max(x for x in lst if isinstance(x, numbers.Number))
1.6
>>> lst = ['foo', 1, 1.6, 2, 'bar']
>>> max(x for x in lst if isinstance(x, numbers.Number))
2

Related

How do I sort a list which is a value of dictionaries?

I need to sort lists that are stored as dictionary values, using a function with a low computational cost. I am not able to share the original code, so please help me with the following example.
I tried the standard approach: I parsed the values, sorted them through an intermediate list, and stored the result in a new dictionary, which is highly computation intensive. I am trying to streamline this and would welcome any suggestions.
Input
a= {'a':1, 'b': [2,8,4,3], 'c':['c',5,7,'a',6]}
Output
a= {'a':1, 'b': [2,3,4,8], 'c':['a','c',5,6,7]}
You do not need to sort the dict, you need to sort all values that are lists inside your dict. You do not need to create any new objects at all:
a= {'a':1, 'b': [2,8,4,3], 'c':['c',5,7,'a',6]} # changed c and a to be strings
for e in a:
    if isinstance(a[e], list):
        a[e].sort()  # sort the lists in place
print(a)
Output:
{'a': 1, 'c': [5, 6, 7, 'a', 'c'], 'b': [2, 3, 4, 8]}
This creates neither new dicts nor new lists; it simply sorts each list in place. You cannot get much faster or cheaper than that unless you have special domain knowledge about your lists that would make a specialized in-place sorter a viable replacement for list.sort().
On Python 3 (thanks @Matthias Profil), comparison between int and str raises TypeError. You can "fix" that with some optional computation (inspired by answers at: python-list-sort-query-when-list-contains-different-element-types):
def IsString(item):
    return isinstance(item, str)
def IsInt(item):
    return isinstance(item, int)
a = {'a': 1, 'b': [2, 8, 4, 3], 'c': ['c', 5, 7, 'a', 6]}  # changed c and a to be strings
for e in a:
    if isinstance(a[e], list):
        try:
            a[e].sort()  # sort the lists in place
        except TypeError:
            str_list = sorted(filter(IsString, a[e]))
            int_list = sorted(filter(IsInt, a[e]))
            a[e] = int_list + str_list  # default to numbers before strings
print(a)
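As an alternative to the try/except above, a single sort key can put numbers before strings in one pass. This is only a sketch for Python 3, assuming the lists hold nothing but ints and strs; the helper name sort_mixed is made up for illustration.

```python
def sort_mixed(values):
    """Sort a list of ints and strs: numbers first, then strings."""
    # Keys are (False, n) for numbers and (True, s) for strings. False sorts before
    # True, and the second element is only compared within the same group.
    return sorted(values, key=lambda v: (isinstance(v, str), v))

data = {'a': 1, 'b': [2, 8, 4, 3], 'c': ['c', 5, 7, 'a', 6]}
for key, value in data.items():
    if isinstance(value, list):
        data[key] = sort_mixed(value)
print(data)  # {'a': 1, 'b': [2, 3, 4, 8], 'c': [5, 6, 7, 'a', 'c']}
```

Unlike list.sort(), sorted() builds a new list for each value, so this trades a little memory for simpler code.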
In general (if your values are lists of comparable items, e.g. numbers only), you could do something like this:
sorted_dict = {key: sorted(value) for key, value in original_dict.items()}
If some of your values are single numbers/strings, you should change sorted(value) to sorted(value) if isinstance(value, list) else value (thanks to user @DeepSpace for pointing this out).
However, the example you give is not valid unless a and c refer to integer values.

comparing contents of two lists python [duplicate]

a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
from collections import Counter
def compare(s, t):
    return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
    return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
    t = list(t)  # make a mutable copy
    try:
        for elem in s:
            t.remove(elem)
    except ValueError:
        return False
    return not t
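The three approaches above can be chained so that the fastest one that works is used. The following is only a sketch; the name same_elements is hypothetical, and the sorted() step assumes that < defines a total order wherever it is defined (which is not the case for sets, for example).

```python
from collections import Counter

def same_elements(s, t):
    """Return True if s and t hold the same elements with the same counts, in any order."""
    s, t = list(s), list(t)
    if len(s) != len(t):
        return False
    try:
        return Counter(s) == Counter(t)   # O(n), needs hashable elements
    except TypeError:
        pass
    try:
        return sorted(s) == sorted(t)     # O(n log n), needs orderable elements
    except TypeError:
        pass
    for elem in s:                        # O(n * n), needs only ==
        try:
            t.remove(elem)
        except ValueError:
            return False
    return not t

print(same_elements([1, 2, 3, 1, 2, 3], [3, 2, 1, 3, 2, 1]))  # True
print(same_elements([{'a': 1}, {'b': 2}], [{'b': 2}, {'a': 1}]))  # True
```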
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort or hash the elements, so you need a fallback like this, which is unfortunately O(n^2):
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
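For the testing case, a small sketch of how assertCountEqual might be used on Python 3.2 or later. The test class and method names are made up for illustration.

```python
import unittest

class TestSameElements(unittest.TestCase):
    def test_order_is_ignored(self):
        self.assertCountEqual([1, 2, 3, 1, 2, 3], [3, 2, 1, 3, 2, 1])

    def test_unhashable_items_are_supported(self):
        self.assertCountEqual([{'a': 1}, [2]], [[2], {'a': 1}])

    def test_duplicates_are_counted(self):
        # Same length and same distinct values, but different counts.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([1, 1, 2], [1, 2, 2])
```

Saved in a module, these tests can be run with python -m unittest.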
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key=id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key=repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
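As a rough illustration of that last point, here is a hypothetical Point class (not from the question) that defines __eq__, __hash__ and __lt__ from the same tuple of fields, so both the sorted() and the Counter approach work on it.

```python
from collections import Counter

class Point:
    """Hypothetical value class, used only to illustrate the idea."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        return (self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Must agree with __eq__: equal points get equal hashes.
        return hash(self._key())

    def __lt__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() < other._key()

a = [Point(1, 2), Point(3, 4), Point(1, 2)]
b = [Point(3, 4), Point(1, 2), Point(1, 2)]
print(sorted(a) == sorted(b))    # True
print(Counter(a) == Counter(b))  # True
```

The hash is only reliable as long as x and y are not changed after the object has been put into a Counter, dict or set.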
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id, a)) == Counter(map(id, b)):
    print("Lists a and b contain the same objects")
Let a and b be lists:
def ass_equal(a, b):
    a = list(a)  # work on a copy so the caller's list is not modified
    try:
        for x in b:
            a.remove(x)  # try to remove every element of b from a; raises ValueError on failure
    except ValueError:
        return False  # b failed to remove some items from a
    return not a  # if a is empty, b has removed them all
I hope the below piece of code might work in your case :-
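if len(a) == len(b) and all(i in a for i in b):
    print('True')
else:
    print('False')
This checks that the lists have the same length and that every element of b also appears in a, regardless of order. Note that it does not compare how often each element occurs, so lists with different duplicates, such as [1, 1, 2] and [1, 2, 2], are also reported as equal.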
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
    empty_dict = {}
    for list_item in list_items:
        list_item = list_item.strip()
        if list_item not in empty_dict:
            empty_dict[list_item] = 1
        else:
            empty_dict[list_item] += 1
    return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
    if count_list(list_1) == count_list(list_2):
        return True
    return False
compare_list(list_1, list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
    if len(a) != len(b):
        return False
    b_set = set(b)
    a_map = defaultdict(lambda: 0)
    b_map = defaultdict(lambda: 0)
    for item1, item2 in zip(a, b):
        if item1 not in b_set:
            return False
        a_map[item1] += 1
        b_map[item2] += 1
    return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
import collections
def lists_equal(l1: list, l2: list) -> bool:
    """
    Equivalent to:
    compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
    ref:
    - https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
    - https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
    """
    compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
    set_comp = set(l1) == set(l2)  # removes duplicates, so on its own it sometimes returns True when it should not :(
    multiset_comp = compare(l1, l2)  # approximates multiset
    return set_comp and multiset_comp  # set_comp is here in case the compare function doesn't work

Why python allows tuple as key for dictionary

tuples in python can have elements of different type. For example:
tup1 = ('physics', 'chemistry', 1997, 2000);
tup2 = (1, 2, 3, 4 );
When used for keys in dictionary, how does python decide the size of key when the element size are varied?
Python dictionaries need not know the size of the key. Python dictionaries accept any object as a key if it provides the __hash__ and __eq__ special methods. Python finds the matching key by key == another, which internally calls key.__eq__(another). This also means, that you can have one dictionary that has strings, integers, tuples of 1 and 100 elements as keys at the same time.
To speed things up, a dictionary organizes these keys into a hash table, that uses a hash code calculated with hash(key) to divide keys; internally hash(key) calls key.__hash__(); a hash code is a simple integer that satisfies two rules:
it must be that hash(key) == hash(another) if key == another
the hash code should be chosen so that if key != another then preferably (but not in every case) hash(key) != hash(another)
In addition in Python hash(x) must be constant to the lifetime of x, which means that the equality of x with regards to other objects must not change either.
A tuple has both __eq__ and __hash__:
>>> t = ('physics', 'chemistry', 1997, 2000)
>>> hash(t)
1710411284653490310
>>> u = ('physics', 'chemistry', 1997, 2000) # another tuple
>>> t is u # they are not the same object
False
>>> hash(t) == hash(u)
True
>>> t == u
True
Now, Python need not even use the hash code at all to find an object in a dictionary, all it would need is to find an element with matching key, comparing each and every to the given key with ==. But this would mean that on a dictionary having n keys, on average n / 2 comparisons would have to be made to find a key. With the hashing trick we can narrow the set of keys to compare to ideally always 1 or handful at maximum, thus a lookup in dictionary should be equally fast were it small or large.
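To make the narrowing step concrete, here is a toy hash table. It is only a sketch of the general idea, with the class name TinyHashMap made up for illustration; it is not how CPython's dict is actually implemented. hash(key) picks one bucket, and == is only used against the few keys in that bucket.

```python
class TinyHashMap:
    """Toy bucketed hash table; illustrative only."""
    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The hash code selects the bucket; the key's size or type does not matter.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:       # equality check within one bucket only
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

m = TinyHashMap()
m[('physics', 'chemistry', 1997, 2000)] = 'long tuple'
m[(1, 2)] = 'short tuple'
m['bar'] = 'string'
print(m[(1, 2)])  # short tuple
```

Keys of different types and lengths coexist because the table only ever asks a key for its hash code and for equality.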
Now, unlike a tuple, a list in Python is a mutable value, and it is impossible to give it an unchanging hash code that would continue to satisfy the two rules given above. Thus Python does not define it at all:
>>> [].__hash__ is None
True
likewise you get an exception if using it as a dictionary key:
>>> {[]: 42}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
how does python decide the size of key when the element size are varied?
The simple answer is that it doesn't: Python allows dictionaries with heterogeneous keys. This is much broader than different-sized tuples:
In [15]: d = {}
In [16]: d[42] = 'foo'
In [17]: d['bar'] = -1
In [18]: d[(1, 2, 3)] = {}
In [19]: d
Out[19]: {42: 'foo', 'bar': -1, (1, 2, 3): {}}
Any hashable object, irrespective of its type, can be used as a key into any dictionary.
The size of the key does not matter for a dictionary.
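Dictionary keys must be hashable, which for the built-in types means immutable.
Since a tuple is immutable, you can use a tuple as a key to a dict, as long as every element of the tuple is itself hashable.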
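One caveat worth checking: a tuple is only usable as a key if everything inside it is hashable too. A small sketch, with the helper name is_hashable made up for illustration:

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, 3)))    # True
print(is_hashable((1, [2, 3])))  # False: the tuple contains a list
print(is_hashable([1, 2, 3]))    # False
```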

"in" statement behavior in lists vs. strings

In Python, asking if a substring exists in a string is pretty straightforward:
>>> their_string = 'abracadabra'
>>> our_string = 'cad'
>>> our_string in their_string
True
However, checking if these same characters are "in" a list fails:
>>> ours, theirs = map(list, [our_string, their_string])
>>> ours in theirs
False
>>> ours, theirs = map(tuple, [our_string, their_string])
>>> ours in theirs
False
I wasn't able to find any obvious reason why checking for elements "in" an ordered (even immutable) iterable would behave differently than a different type of ordered, immutable iterable.
For container types such as lists and tuples, x in container checks if x is an item in the container. Thus with ours in theirs, Python checks if ours is an item in theirs and finds that it is False.
Remember that a list could contain a list. (e.g [['a','b','c'], ...])
>>> ours = ['a','b','c']
>>> theirs = [['a','b','c'], 1, 2]
>>> ours in theirs
True
Are you looking to see if 'cad' is in any of the strings in a list of strings? That would look something like:
stringsToSearch = ['blah', 'foo', 'bar', 'abracadabra']
if any('cad' in s for s in stringsToSearch):
    pass  # 'cad' was in at least one string in the list
else:
    pass  # none of the strings in the list contain 'cad'
From the Python documentation, https://docs.python.org/2/library/stdtypes.html for sequences:
x in s        True if an item of s is equal to x, else False    (1)
x not in s    False if an item of s is equal to x, else True    (1)
(1) When s is a string or Unicode string object the in and not in operations act like a substring test.
For user defined classes, the __contains__ method implements this in test. list and tuple implement the basic notion. string has the added notion of 'substring'. string is a special case among the basic sequences.
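If what you actually want is the substring-style test for lists or tuples, you have to write it yourself. A minimal sketch, assuming a contiguous match is what is meant; the function name contains_sublist is made up for illustration.

```python
def contains_sublist(haystack, needle):
    """Return True if needle occurs as a contiguous run inside haystack."""
    haystack, needle = list(haystack), list(needle)
    n = len(needle)
    # Compare needle against every window of the same length.
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

theirs = list('abracadabra')
print(contains_sublist(theirs, list('cad')))  # True
print(contains_sublist(theirs, list('cab')))  # False
```

Like '' in 'abc' for strings, an empty needle is reported as contained.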

How to get the list with the higher sum of its elements

I was wondering if anyone could help me with a Python problem I have. I have four lists, and each list holds floats (decimals). I'm adding up all the floats that each list contains. The part I'm stuck on is that I want to know which of the four lists has the highest sum. I know I could use if statements, but does anyone know a more efficient way? For instance:
foodmart = [12.33,5.55]
nike = [42.20,69.99]
gas_station = [0.89,45.22]
toy_store = [10.99,15.32]
use max():
>>> max(foodmart,nike,gas_station,toy_store, key=sum)
[42.2, 69.99]
help() on max:
max(iterable[, key=func]) -> value
max(a, b, c, ...[, key=func]) -> value
With a single iterable argument, return its largest item. With two or more arguments, return the largest argument.
Represent the lists as a dict and use max with an optional key function to calculate the sum
Instead of representing the lists in the way you did, use a dictionary. It would be easier to determine the correct shop and work on any number of lists / shops without the need to enumerate them in the max routine. This would be more Pythonic and maintainable
>>> shops = dict()
>>> shops['foodmart'] = [12.33,5.55]
>>> shops['nike'] = [42.20,69.99]
>>> shops['gas_station'] = [0.89,45.22]
>>> shops['toy_store'] = [10.99,15.32]
>>> max(shops, key = lambda k:sum(shops[k]))
'nike'
>>> max([1,2],[3,4],[2,3], key=lambda x: sum(x))
[3, 4]
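If you need the total as well as the shop name, one option is to compute each sum once and then take the maximum over the totals. This is a sketch in Python 3 using the question's data; the variable names totals and best are made up for illustration.

```python
shops = {
    'foodmart': [12.33, 5.55],
    'nike': [42.20, 69.99],
    'gas_station': [0.89, 45.22],
    'toy_store': [10.99, 15.32],
}

# Sum each list once, then pick the shop whose total is largest.
totals = {name: sum(prices) for name, prices in shops.items()}
best = max(totals, key=totals.get)
print(best, format(totals[best], '.2f'))  # nike 112.19
```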
