TL;DR: How can I best use map to filter a list based on logical indexing?
Given a list:
values = ['1', '2', '3', '5', 'N/A', '5']
I would like to map the following function and use the result to filter my list. I could do this with filter and other methods, but I am mostly looking to learn whether this can be done solely using map.
The function:
def is_int(val):
    try:
        x = int(val)
        return True
    except ValueError:
        return False
Attempted solution:
[x for x in list(map(is_int, values)) if x is False]
The above gives me the values I need. However, it does not return the index or allow logical indexing. I have tried to do other ridiculous things like:
[values[x] for x in list(map(is_int, values)) if x is False]
and many others that obviously don't work.
What I thought I could do:
values[[x for x in list(map(is_int, values)) if x is False]]
Expected outcome:
['N/A']
[v for v in values if not is_int(v)]
If you have a parallel list of booleans:
[v for v, b in zip(values, [is_int(x) for x in values]) if not b]
You can get the expected outcome with the simple snippet below, which does not involve map at all:
[x for x in values if is_int(x) is False]
And if you strictly want to use map, the snippet below will do it:
[values[i] for i,y in enumerate(list(map(is_int,values))) if y is False]
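If the positions are wanted as well as the values, the boolean list produced by map can be paired with enumerate. A minimal sketch, reusing is_int and values from the question; the names mask, bad_indices and bad_values are illustrative only:

```python
def is_int(val):
    """Return True if val can be parsed as an int."""
    try:
        int(val)
        return True
    except ValueError:
        return False

values = ['1', '2', '3', '5', 'N/A', '5']

# map produces one boolean per element, in the same order as values.
mask = list(map(is_int, values))

# Keep the positions where the check failed, then look the values up.
bad_indices = [i for i, ok in enumerate(mask) if not ok]
bad_values = [values[i] for i in bad_indices]

print(bad_indices)  # [4]
print(bad_values)   # ['N/A']
```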
map is just not the right tool for the job, as that would transform the values, whereas you just want to check them. If anything, you are looking for filter, but you have to invert the filter function first:
>>> values = ['1', '2', "foo", '3', '5', 'N/A', '5']
>>> not_an_int = lambda x: not is_int(x)
>>> list(filter(not_an_int, values))
['foo', 'N/A']
In practice, however, I would rather use a list comprehension with a condition.
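For inverting a predicate without writing a lambda, the standard library also offers itertools.filterfalse. It is not part of the answer above; this is only a small sketch of that alternative, reusing is_int from the question:

```python
from itertools import filterfalse

def is_int(val):
    """Return True if val can be parsed as an int."""
    try:
        int(val)
        return True
    except ValueError:
        return False

values = ['1', '2', 'foo', '3', '5', 'N/A', '5']

# filterfalse keeps the items for which the predicate returns a false value.
non_ints = list(filterfalse(is_int, values))

print(non_ints)  # ['foo', 'N/A']
```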
You can do this using a bit of help from itertools and by negating the output of your original function since we want it to return True where it is not an int.
from itertools import compress
from operator import not_
list(compress(values, map(not_, map(is_int, values))))
['N/A']
You cannot use map() alone to perform a reduction. By its very definition, map() preserves the number of items (see e.g. here).
On the other hand, reduce operations are meant to do what you want. In Python these are normally implemented with a generator expression or, for programmers inclined toward a more functional style, with filter(). Other non-primitive approaches may exist, but they essentially boil down to one of the two, e.g.:
values = ['1', '2', '3', '5', 'N/A', '5']
list(filter(lambda x: not is_int(x), values))
# ['N/A']
Yet, if what you want is to use the result of map() directly as an index (as in your values[...] attempt), this cannot be done with Python alone.
However, NumPy supports precisely what you want except that the result will not be a list:
import numpy as np
np.array(values)[list(map(lambda x: not is_int(x), values))]
# array(['N/A'], dtype='<U3')
(Or you could have your own container defined in such a way as to implement this behavior).
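A container of that kind could look roughly like the sketch below. MaskList is a hypothetical name, and the choice to accept only a list or tuple of flags with matching length is an assumption made for illustration:

```python
class MaskList(list):
    """Hypothetical list subclass that also accepts a boolean mask as index."""

    def __getitem__(self, index):
        # A list or tuple of flags selects the items whose flag is truthy.
        if isinstance(index, (list, tuple)):
            if len(index) != len(self):
                raise IndexError("mask length must match list length")
            return MaskList(item for item, keep in zip(self, index) if keep)
        # Anything else (int, slice) behaves like a normal list.
        return super().__getitem__(index)


def is_int(val):
    """Return True if val can be parsed as an int."""
    try:
        int(val)
        return True
    except ValueError:
        return False


values = MaskList(['1', '2', '3', '5', 'N/A', '5'])
result = values[list(map(lambda x: not is_int(x), values))]

print(result)  # ['N/A']
```

Ordinary integer and slice indexing still go through the built-in list behavior, so existing code that uses the container as a list keeps working.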
That being said, it is quite common in Python to use generator expressions in place of map() / filter().
filter(func, items)
is roughly equivalent to:
item for item in items if func(item)
while
map(func, items)
is roughly equivalent to:
func(item) for item in items
and their combination:
filter(f_func, map(m_func, items))
is roughly equivalent to:
m_func(item) for item in items if f_func(m_func(item))
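The order of nesting matters when combining the two: the outer filter() sees the values that map() has already transformed. A small check of this, where m_func, f_func and items are hypothetical stand-ins chosen so that the two orders give different results:

```python
items = [1, 2, 3, 4, 5, 6]

def m_func(x):
    """Hypothetical transform: add one."""
    return x + 1

def f_func(x):
    """Hypothetical predicate: keep even numbers."""
    return x % 2 == 0

# filter() wraps map(), so the predicate tests the mapped values.
filter_of_map = list(filter(f_func, map(m_func, items)))
filter_of_map_genexp = list(m_func(item) for item in items if f_func(m_func(item)))

# map() wraps filter(), so the predicate tests the original items.
map_of_filter = list(map(m_func, filter(f_func, items)))
map_of_filter_genexp = list(m_func(item) for item in items if f_func(item))

print(filter_of_map)  # [2, 4, 6]
print(map_of_filter)  # [3, 5, 7]
```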
Not exactly what I had in mind, but something I learnt from this problem: we could do the following (which might be computationally less efficient). This is almost the same as @aws_apprentice's answer. Clearly, one is better off using filter and/or a list comprehension:
from itertools import compress
list(compress(values, list(map(lambda x: not is_int(x), values))))
Or, as suggested by @aws_apprentice, simply:
from itertools import compress
list(compress(values, map(lambda x: not is_int(x), values)))
Related
I have dict in Python with keys of the following form:
mydict = {'0' : 10,
          '1' : 23,
          '2.0' : 321,
          '2.1' : 3231,
          '3' : 3,
          '4.0.0' : 1,
          '4.0.1' : 10,
          '5' : 11,
          # ... etc
          '10' : 32,
          '11.0' : 3,
          '11.1' : 243,
          '12.0' : 3,
          '12.1.0': 1,
          '12.1.1': 2,
          }
Some of the indices have no sub-values, some have one level of sub-values and some have two. If I only had one sub-level I could treat them all as numbers and sort numerically. The second sub-level forces me to handle them all as strings. However, if I sort them like strings I'll have 10 following 1 and 20 following 2.
How can I sort the indices correctly?
Note: What I really want to do is print out the dict sorted by index. If there's a better way to do it than sorting it somehow that's fine with me.
You can sort the keys the way that you want, by splitting them on '.' and then converting each of the components into an integer, like this:
sorted(mydict.keys(), key=lambda a: list(map(int, a.split('.'))))
which returns this:
['0',
'1',
'2.0',
'2.1',
'3',
'4.0.0',
'4.0.1',
'5',
'10',
'11.0',
'11.1',
'12.0',
'12.1.0',
'12.1.1']
You can iterate over that list of keys, and pull the values out of your dictionary as needed.
You could also sort the result of mydict.items(), very similarly:
sorted(mydict.items(), key=lambda a: list(map(int, a[0].split('.'))))
This gives you a sorted list of (key, value) pairs, like this:
[('0', 10),
('1', 23),
('2.0', 321),
('2.1', 3231),
('3', 3),
# ...
('12.1.1', 2)]
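Since the stated goal is to print the dict in index order, the same idea can be written with a named key function. This is a sketch on a shortened copy of the dict; index_key is an illustrative name, and it assumes every key consists only of integers separated by dots:

```python
mydict = {'0': 10, '1': 23, '2.0': 321, '2.1': 3231,
          '10': 32, '11.0': 3, '12.1.1': 2}

def index_key(key):
    """Turn '12.1.1' into (12, 1, 1) so that keys compare numerically."""
    return tuple(int(part) for part in key.split('.'))

sorted_keys = sorted(mydict, key=index_key)

# Print the dict in index order.
for key in sorted_keys:
    print(key, mydict[key])
```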
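Python's sorting functions can take a custom compare function (in Python 3 it has to be wrapped with functools.cmp_to_key), so you just need to define a function that compares keys the way you like:
from functools import cmp_to_key

def version_cmp(a, b):
    '''These keys just look like version numbers to me....'''
    ai = list(map(int, a.split('.')))
    bi = list(map(int, b.split('.')))
    # (ai > bi) - (ai < bi) gives -1, 0 or 1, like Python 2's cmp(ai, bi)
    return (ai > bi) - (ai < bi)

for k in sorted(mydict.keys(), key=cmp_to_key(version_cmp)):
    print(k, mydict[k])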
In this case you are better off using the key parameter to sorted(), though. See Ian Clelland's answer for an example of that.
As an addendum to Ian Clelland's answer, the map() call can be replaced with a list comprehension... if you prefer that style. It may also be more efficient (though negligibly in this case I suspect).
sorted(mydict.keys(), key=lambda a: [int(i) for i in a.split('.')])
For fun & usefulness (for googling ppl, mostly):
import re
f = lambda i: [int(j) if re.match(r"[0-9]+", j) else j for j in re.findall(r"([0-9]+|[^0-9]+)", i)]
cmpg = lambda x, y: cmp(f(x), f(y))
Use it as sorted(list, cmp=cmpg) in Python 2. Python 3 removed cmp() and the cmp= argument, so pass f as key= there instead: sorted(list, key=f).
Additionally, the regexes could be pre-compiled (rarely necessary, though, given the re module's caching).
It can also be modified easily, for example to handle negative values (probably by adding -? to the number regex) and/or float values.
It might not be very efficient, but it is still quite useful.
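In Python 3 an int and a str cannot be ordered against each other, so a key that mixes both can raise TypeError when a digit run lines up with a text run. One possible workaround, shown here only as a sketch with the illustrative name natural_key, is to tag each part so that numbers and text are never compared directly:

```python
import re

def natural_key(text):
    """Split text into digit and non-digit runs for natural ordering."""
    parts = re.findall(r"[0-9]+|[^0-9]+", text)
    # Tag each part: (0, number) for digit runs, (1, string) otherwise,
    # so an int is never compared with a str.
    return [(0, int(p)) if '0' <= p[0] <= '9' else (1, p) for p in parts]

names = ['file10', 'file2', 'file1', '12.1.0', '2.0']
ordered = sorted(names, key=natural_key)

print(ordered)  # ['2.0', '12.1.0', 'file1', 'file2', 'file10']
```

With this tagging, parts that start with digits sort before parts that start with text; that ordering is a design choice of the sketch, not something the original snippet specifies.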
I would do a search on "sorting a python dictionary" and take a look at the answers. I would give PEP-265 a read as well. The sorted() function is what you are looking for.
There is a nice sorting HOWTO on the python web site: http://wiki.python.org/moin/HowTo/Sorting .
It makes a good introduction to sorting, and discusses different techniques to adapt the sorting result to your needs.
A number permutations function that creates a list of all possible arrangements.
I worked on this code for a while. It works well, and I'm trying to find a shorter and more efficient way of writing it.
import itertools
a = [3,7,9]
perms = lambda a:list(sorted(z) for z in map(lambda p:dict.fromkeys([str(sum(v* (10**(len(p) -1 - i)) for i,v in enumerate(item))).zfill(len(a)) for item in itertools.permutations(p)]).keys(), [[int(x) for x in ''.join(str(i) for i in a)]]))[0]
The code returns:
['379', '397', '739', '793', '937', '973']
You can also pass in a numeric string:
a = '468'
perms(a)
['468', '486', '648', '684', '846', '864']
This code is different: instead of returning tuples or lists, it returns a list of results as strings. You can also pass in numeric strings, tuples or lists. Trust me, I checked for duplicates before posting.
Triple digits work pretty well too:
perms('000')
['000']
Other code produces this:
['000', '000', '000', '000', '000', '000']
Also, this code returns an ordered list.
You are already using itertools.permutations, why not just:
import itertools

def perms(iterable):
    return [''.join(p) for p in (map(str, perm) for perm in itertools.permutations(iterable))]
>>> perms('123')
# Result: ['123', '132', '213', '231', '312', '321']
Update: If you wish to avoid duplicates, you can extend the functionality like so by using a set, as elaborated on in Why does Python's itertools.permutations contain duplicates? (When the original list has duplicates)
def unique_perms(iterable):
    perm_set = set()
    permutations = [i for i in itertools.permutations(iterable) if i not in perm_set and not perm_set.add(i)]
    return [''.join(p) for p in (map(str, perm) for perm in permutations)]
This is still significantly faster than the poster's original method, especially for examples like '000', and it is stable (it preserves permutation order).
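An alternative way to drop duplicates while keeping first-seen order is dict.fromkeys, which the question's own code already relies on. The sketch below assumes each element is a single character or digit, since joining multi-character elements could make different permutations collapse into the same string:

```python
import itertools

def unique_perms(iterable):
    """Return each distinct permutation once, as a string, in first-seen order."""
    joined = (''.join(map(str, p)) for p in itertools.permutations(iterable))
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(joined))

print(unique_perms('000'))      # ['000']
print(unique_perms([3, 7, 9]))  # ['379', '397', '739', '793', '937', '973']
print(unique_perms('112'))      # ['112', '121', '211']
```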
Given a list of lists like this:
[["fileA",7],["fileB",4],["fileC",17],["fileD",15]]
How would you return the first element associated with the smallest value? In this instance "fileB", since it has the smallest value (4). I'm guessing the shortest way would be using a list comprehension.
Actually, a list comprehension wouldn't be the best tool for this. Instead, you should use min, its key function, and operator.itemgetter:
>>> from operator import itemgetter
>>> lst = [["fileA",7],["fileB",4],["fileC",17],["fileD",15]]
>>> min(lst, key=itemgetter(1))
['fileB', 4]
>>> min(lst, key=itemgetter(1))[0]
'fileB'
>>>
Without importing anything you can do:
min(lst, key = lambda x: x[1])[0]
I came up with this weird idea which doesn't use anything but simple generator expression:
min((x[1], x[0]) for x in lst)[1]
It relies on the fact that tuples and lists are compared element by element, so the minimum is decided by the first element (and by the second only when the first elements are equal).
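One consequence of comparing whole tuples is that ties on the value are broken by the name, whereas min with a key function returns the first tied entry. A short sketch of the difference; the tied list is made up for illustration:

```python
lst = [["fileA", 7], ["fileB", 4], ["fileC", 17], ["fileD", 15]]

# Swapping puts the number first, so it decides the comparison.
smallest = min((value, name) for name, value in lst)
print(smallest)  # (4, 'fileB')

# Hypothetical data with a tie on the value.
tied = [["fileZ", 4], ["fileB", 4]]

# Tuple comparison falls through to the name on a tie...
by_tuple = min((value, name) for name, value in tied)[1]
# ...while a key function keeps the first entry with the minimal value.
by_key = min(tied, key=lambda pair: pair[1])[0]

print(by_tuple)  # fileB
print(by_key)    # fileZ
```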
You should probably just convert the data to a dictionary, since that immediately seems a lot more sensible than having a list of lists. Then you can manipulate and access your data more easily.
myLists = [["fileA",7],["fileB",4],["fileC",17],["fileD",15]]
myDict = dict(myLists)
min(myDict, key=myDict.get)
# 'fileB'
So, I have x=[(12,), (1,), (3,)] (a list of tuples) and I want x=[12, 1, 3] (a list of integers) in the best way possible. Can you please help?
You didn't say what you mean by "best", but presumably you mean "most pythonic" or "most readable" or something like that.
The list comprehension given by F3AR3DLEGEND is probably the simplest. Anyone who knows how to read a list comprehension will immediately know what it means.
y = [i[0] for i in x]
However, often you don't really need a list, just something that can be iterated over once. If you've got a billion elements in x, building a billion-element y just to iterate over it one element at a time may be a bad idea. So, you can use a generator expression:
y = (i[0] for i in x)
If you prefer functional programming, you might prefer to use map. The downside of map is that you have to pass it a function, not just an expression, which means you either need to use a lambda function, or itemgetter:
import operator
y = map(operator.itemgetter(0), x)
In Python 3, this is equivalent to the generator expression; if you want a list, pass it to list. In Python 2, it returns a list; if you want an iterator, use itertools.imap instead of map.
If you want a more generic flattening solution, you can write one yourself, but it's always worth looking at itertools for generic solutions of this kind, and there is in fact a recipe called flatten that's used to "Flatten one level of nesting". So, copy and paste that into your code (or pip install more-itertools) and you can just do this:
y = flatten(x)
If you look at how flatten is implemented, and then at how chain.from_iterable is implemented, and then at how chain is implemented, you'll notice that you could write the same thing in terms of builtins. But why bother, when flatten is going to be more readable and obvious?
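For reference, the recipe in the itertools documentation is a thin wrapper around chain.from_iterable. Note that it returns an iterator, so it is wrapped in list() here to get an actual list:

```python
from itertools import chain

def flatten(list_of_lists):
    """Flatten one level of nesting."""
    return chain.from_iterable(list_of_lists)

x = [(12,), (1,), (3,)]
y = list(flatten(x))

print(y)  # [12, 1, 3]

# Unlike i[0], this keeps every element of longer tuples.
print(list(flatten([(1, 2), (3,)])))  # [1, 2, 3]
```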
Finally, if you want to reduce the generic version to a nested list comprehension (or generator expression, of course):
y = [j for i in x for j in i]
However, nested list comprehensions are very easy to get wrong, both in writing and reading. (Note that F3AR3DLEGEND, the same person who gave the simplest answer first, also gave a nested comprehension and got it wrong. If he can't pull it off, are you sure you want to try?) For really simple cases, they're not too bad, but still, I think flatten is a lot easier to read.
y = [i[0] for i in x]
This only works for one element per tuple, though.
However, if you have multiple elements per tuple, you can use a slightly more complex list comprehension:
y = [i[j] for i in x for j in range(len(i))]
Reference: List Comprehensions
Just do this:
x = [i[0] for i in x]
Explanation:
>>> x=[(12,), (1,), (3,)]
>>> x
[(12,), (1,), (3,)]
>>> [i for i in x]
[(12,), (1,), (3,)]
>>> [i[0] for i in x]
[12, 1, 3]
This is the most efficient way:
x = [i for i, in x]
or, equivalently
x = [i for (i,) in x]
This is a bit slower:
x = [i[0] for i in x]
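Apart from speed, the two forms differ in how they treat tuples that do not have exactly one element: unpacking raises an error, while indexing silently takes the first element. A small sketch; the mixed list is made up for illustration:

```python
x = [(12,), (1,), (3,)]

unpacked = [i for (i,) in x]
indexed = [i[0] for i in x]
print(unpacked == indexed)  # True

# Hypothetical input where one tuple has two elements.
mixed = [(1,), (2, 3)]

try:
    [i for (i,) in mixed]
    unpack_failed = False
except ValueError:
    # Unpacking insists on exactly one element per tuple.
    unpack_failed = True

print(unpack_failed)            # True
print([i[0] for i in mixed])    # [1, 2]
```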
You can use the map function:
map(lambda y: y[0], x)
I have a list of tuples like this:
[('foo','bar'),('foo1','bar1'),('foofoo','barbar')]
What is the fastest way in python (running on a very low cpu/ram machine) to swap values like this...
[('bar','foo'),('bar1','foo1'),('barbar','foofoo')]
I am currently using:
for x in mylist:
    self.my_new_list.append((x[1], x[0]))
Is there a better or faster way???
You could use map:
map(lambda t: (t[1], t[0]), mylist)
Or list comprehension:
[(t[1], t[0]) for t in mylist]
List comprehensions are preferred and supposedly much faster than map when a lambda is needed. Note, however, that a list comprehension is evaluated strictly, that is, the whole list is built as soon as the expression runs. If you're worried about memory consumption, use a generator instead:
g = ((t[1], t[0]) for t in mylist)
# call next() when you need a value
next(g)
There are some more details here: Python List Comprehension Vs. Map
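To make the lazy behavior concrete, here is a small sketch using the list from the question. The generator produces one swapped tuple per request and can only be consumed once:

```python
mylist = [('foo', 'bar'), ('foo1', 'bar1'), ('foofoo', 'barbar')]

# Nothing is swapped yet; the generator only remembers how to do it.
swapped = ((t[1], t[0]) for t in mylist)

first = next(swapped)   # computes just the first pair
rest = list(swapped)    # consumes whatever is left

print(first)  # ('bar', 'foo')
print(rest)   # [('bar1', 'foo1'), ('barbar', 'foofoo')]
```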
You can use reversed like this:
tuple(reversed((1, 2))) == (2, 1)
To apply it to a list, you can use map or a list/generator comprehension:
map(tuple, map(reversed, tuples)) # map
[tuple(reversed(x)) for x in tuples] # list comprehension
(tuple(reversed(x)) for x in tuples) # generator comprehension
If you're interested primarily in runtime speed, I can only recommend that you profile the various approaches and pick the fastest.
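One way to do that profiling is the standard timeit module. The sketch below is only a starting point: the list size and repeat count are arbitrary assumptions, and the resulting numbers depend on the machine and Python version, so none are shown here:

```python
import timeit

# Arbitrary test data: the question's list repeated to get measurable times.
mylist = [('foo', 'bar'), ('foo1', 'bar1'), ('foofoo', 'barbar')] * 1000

candidates = {
    "unpack": lambda: [(b, a) for a, b in mylist],
    "index": lambda: [(t[1], t[0]) for t in mylist],
    "slice": lambda: [t[::-1] for t in mylist],
    "reversed": lambda: [tuple(reversed(t)) for t in mylist],
    "map+lambda": lambda: list(map(lambda t: (t[1], t[0]), mylist)),
}

# Check that every approach yields the same result before timing it.
expected = candidates["unpack"]()
results_match = all(fn() == expected for fn in candidates.values())

# Time each approach; lower is faster on this machine.
timings = {name: timeit.timeit(fn, number=50) for name, fn in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {seconds:.4f}s")
```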
A fancy way:
[t[::-1] for t in mylist]
When using a list comprehension, I find it more elegant and understandable to unpack into separate variables instead of indexing a single variable, as in the solution provided by @iabdalkader:
[(b, a) for a, b in mylist]
To modify the current list in-place, the most efficient way is:
my_list[:] = [(y, x) for x, y in my_list]
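It assigns to a slice that covers the entire list, so the existing list object is updated in place and every other reference to it sees the swapped tuples. (The comprehension on the right still builds a temporary list first.) See also this answer.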
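The difference between slice assignment and plain rebinding shows up when another name refers to the same list. A minimal sketch with made-up variable names:

```python
my_list = [('foo', 'bar'), ('foo1', 'bar1')]
alias = my_list  # second reference to the same list object

# Slice assignment replaces the contents of the existing object.
my_list[:] = [(y, x) for x, y in my_list]
print(alias is my_list)  # True
print(alias)             # [('bar', 'foo'), ('bar1', 'foo1')]

other = [('a', 'b')]
other_alias = other

# Plain assignment rebinds the name to a new list; the alias is untouched.
other = [(y, x) for x, y in other]
print(other_alias)  # [('a', 'b')]
print(other)        # [('b', 'a')]
```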