Get the index of a key in an OrderedDict in Python

Can an OrderedDict get a key's position, like list's index()?
test = ['a', 'b', 'c', 'd', 'e']
test.index('b')  # returns 1

It's just a one-line program, such as:
print(list(your_ordered_dict).index('your_key'))
Or you can use a lambda, still a one-line program:
f = lambda ordered_dict, key: list(ordered_dict).index(key)
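For example, with a small hypothetical OrderedDict:
from collections import OrderedDict

x = OrderedDict(a=1, b=2, c=3)
f = lambda ordered_dict, key: list(ordered_dict).index(key)
print(f(x, 'b'))  # 1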
Good luck.

Keep it simple.
from collections import OrderedDict
x = OrderedDict(test1='a', test2='b')
print(list(x.keys()).index('test1'))

You can write this in two ways:
list(x).index('b')
next(i for i, k in enumerate(x) if k=='b')
The first one will be a little faster for small dicts, but for huge ones it will be a lot slower and waste a lot of space. (Of course most of the time, OrderedDicts are pretty small.)
Both versions will work for any iterable; there's nothing special about OrderedDict here.
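Here is a rough timing sketch of that trade-off (the dict size, key, and repeat count are arbitrary illustrative choices; absolute numbers will vary by machine):
import timeit
from collections import OrderedDict

big = OrderedDict((str(i), i) for i in range(100000))

# list() materializes every key before .index() even starts searching,
# so it pays the full O(N) cost even for a key near the front:
print(timeit.timeit(lambda: list(big).index('5'), number=100))

# The generator stops as soon as it finds the key and never builds
# an intermediate list:
print(timeit.timeit(
    lambda: next(i for i, k in enumerate(big) if k == '5'), number=100))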

If you take the keys as a list, you can then index like:
Code:
list(x).index('b')
Test Code:
from collections import OrderedDict
x = OrderedDict(a=1, b=2)
print(list(x).index('b'))
Results:
1

The accepted answer list(x).index('b') will be O(N) every time you're searching for the position.
Instead, you can create a mapping key -> position which will be O(1) once the mapping is constructed.
ordered_dict = OrderedDict(a='', b='')
key_to_pos = {k: pos for pos, k in enumerate(ordered_dict)}
assert key_to_pos['b'] == 1

Related

python list of sets find symmetric difference in all elements

Consider this list of sets:
my_input_list = [
    {1, 2, 3, 4, 5},
    {2, 3, 7, 4, 5},
    set(),
    {1, 2, 3, 4, 5, 6},
    set(),
]
I want to get only the exclusive elements, 6 and 7, as the result, either a list or a set (set preferred).
I tried
print reduce(set.symmetric_difference, my_input_list)
but that gives
{2, 3, 4, 5, 6, 7}
And I tried sorting the list by length: smallest first raises an error due to the two empty sets, and largest first gives the same result as unsorted.
Any help or ideas please?
Thanks :)
Looks like the most straightforward solution is to count everything and return the elements that only appear once.
This solution uses chain.from_iterable (to flatten your sets) + Counter (to count things). Finally, use a set comprehension to filter elements with count == 1.
from itertools import chain
from collections import Counter
c = Counter(chain.from_iterable(my_input_list))
print({k for k in c if c[k] == 1})
{6, 7}
A quick note: the empty brace literal {} denotes an empty dict, not an empty set. For the latter, use set().
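A quick illustration:
print(type({}))       # <class 'dict'>
print(type(set()))    # <class 'set'>
print(type({1, 2}))   # <class 'set'> (non-empty braces do make a set)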
You could use itertools.chain and collections.Counter:
from itertools import chain
from collections import Counter
r = {k for k,v in Counter(chain.from_iterable(my_input_list)).items() if v==1}

Categorize list in Python

What is the best way to categorize a list in python?
for example:
totalist is below
totalist[1] = ['A','B','C','D','E']
totalist[2] = ['A','B','X','Y','Z']
totalist[3] = ['A','F','T','U','V']
totalist[4] = ['A','F','M','N','O']
Say I want to get the lists where the first two items are ['A','B'], basically list[1] and list[2]. Is there an easy way to get these without iterating one item at a time? Something like this?
if ['A','B'] in totalist
I know that doesn't work.
You could check the first two elements of each list.
for totalist in all_lists:
    if totalist[:2] == ['A', 'B']:
        # Do something.
Note: the one-liner solutions suggested by Kasramvd are quite nice too. I found my solution more readable, though I should say comprehensions are slightly faster than regular for loops (which I tested myself).
Just for fun, itertools solution to push per-element work to the C layer:
from future_builtins import map # Py2 only; not needed on Py3
from itertools import compress
from operator import itemgetter
# Generator
prefixes = map(itemgetter(slice(2)), totalist)
selectors = map(['A','B'].__eq__, prefixes)
# If you need them one at a time, just skip list wrapping and iterate
# compress output directly
matches = list(compress(totalist, selectors))
This could all be one-lined to:
matches = list(compress(totalist, map(['A','B'].__eq__, map(itemgetter(slice(2)), totalist))))
but I wouldn't recommend it. Incidentally, if totalist might be a generator, not a re-iterable sequence, you'd want to use itertools.tee to double it, adding:
totalist, forselection = itertools.tee(totalist, 2)
and changing the definition of prefixes to map over forselection, not totalist; since compress iterates both iterators in parallel, tee won't have meaningful memory overhead.
Of course, as others have noted, even moving to C, this is a linear algorithm. Ideally, you'd use something like a collections.defaultdict(list) to map from two element prefixes of each list (converted to tuple to make them legal dict keys) to a list of all lists with that prefix. Then, instead of linear search over N lists to find those with matching prefixes, you just do totaldict['A', 'B'] and you get the results with O(1) lookup (and less fixed work too; no constant slicing).
Example precompute work:
from collections import defaultdict
totaldict = defaultdict(list)
for x in totalist:
    totaldict[tuple(x[:2])].append(x)
# Optionally, to prevent autovivification later:
totaldict = dict(totaldict)
Then you can get matches effectively instantly for any two element prefix with just:
matches = totaldict['A', 'B']
You could do this.
>>> for i in totalist:
...     if ['A', 'B'] == i[:2]:
...         print i
Basically you can't do this in Python with a nested list without iterating. But if you are looking for an optimized approach, here are some ways:
Use a simple list comprehension, by comparing the intended list with only first two items of sub lists:
>>> [sub for sub in totalist if sub[:2] == ['A', 'B']]
[['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'X', 'Y', 'Z']]
If you want the indices use enumerate:
>>> [ind for ind, sub in enumerate(totalist) if sub[:2] == ['A', 'B']]
[0, 1]
And here is an approach in NumPy, which is pretty well optimized when you are dealing with large data sets:
>>> import numpy as np
>>>
>>> totalist = np.array([['A','B','C','D','E'],
... ['A','B','X','Y','Z'],
... ['A','F','T','U','V'],
... ['A','F','M','N','O']])
>>> totalist[(totalist[:,:2]==['A', 'B']).all(axis=1)]
array([['A', 'B', 'C', 'D', 'E'],
['A', 'B', 'X', 'Y', 'Z']],
dtype='|S1')
Also, as an alternative to a list comprehension, if you don't want to use an explicit loop and are looking for a functional way, you can use the filter function, which is not as optimized as a list comprehension:
>>> list(filter(lambda x: x[:2]==['A', 'B'], totalist))
[['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'X', 'Y', 'Z']]
You imply that you are concerned about performance (cost). If you need to do this, and if you are worried about performance, you need a different data structure. This will add a little "cost" when you are making the lists, but save you time when filtering them.
If the need to filter based on the first two elements is fixed (it doesn't generalise to the first n elements) then I would add the lists, as they are made, to a dict where the key is a tuple of the first two elements, and the item is a list of lists.
Then you simply retrieve your lists by doing a dict lookup. This is easy to do and will bring potentially large speed-ups, at almost no cost in memory and time while making the lists.
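A minimal sketch of that idea, assuming the lists arrive one at a time (the source lists here are the sample data from the question; prefix_index is a hypothetical name, and the structure mirrors the defaultdict built in the earlier answer, just populated as the lists are created):
from collections import defaultdict

prefix_index = defaultdict(list)  # hypothetical name for the lookup dict
for lst in (['A', 'B', 'C', 'D', 'E'],
            ['A', 'B', 'X', 'Y', 'Z'],
            ['A', 'F', 'T', 'U', 'V']):
    # Key on a tuple of the first two elements (lists can't be dict keys).
    prefix_index[tuple(lst[:2])].append(lst)

# Retrieval is a single O(1) dict lookup:
print(prefix_index['A', 'B'])
# [['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'X', 'Y', 'Z']]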

Grouping tuples using libraries in Python

Using Python, is there an easier way, without writing a bunch of loops, to count the values as shown below? Perhaps using some library such as itertools.groupby?
#original tuple array
[("A","field1"),("A","field1"),("B","field1")]
#output array
[("A","field1", 2), ("B", "field1",1)]
You can use a Counter dict to group, adding the count at the end
l = [("A","field1"),("A","field1"),("B","field1")]
from collections import Counter
print([k+(v,) for k,v in Counter(l).items()])
If you want the output ordered by the first time you encounter a tuple, you can use an OrderedDict to do the counting:
from collections import OrderedDict
d = OrderedDict()
for t in l:
    d.setdefault(t, 0)
    d[t] += 1
print([k+(v,) for k,v in d.items()])
I guess you just want to count the number of occurrences of each tuple... right? If so, you can use Counter.
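For example, on the sample list from the question:
from collections import Counter
l = [("A", "field1"), ("A", "field1"), ("B", "field1")]
print(Counter(l))
# Counter({('A', 'field1'): 2, ('B', 'field1'): 1})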
You can use itertools.groupby. Omit the key parameter and it will just group equal elements (assuming they are consecutive); then add the number of elements in each group.
>>> import itertools
>>> lst = [("A","field1"),("A","field1"),("B","field1")]
>>> [(k + (len(list(g)),)) for k, g in itertools.groupby(lst)]
[('A', 'field1', 2), ('B', 'field1', 1)]
If the elements are not consecutive, then this will not work, and anyway, the solution suggesting collections.Counter seems to be a better fit for the problem.
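That said, if order of first appearance doesn't matter, sorting first makes equal tuples consecutive, which is what groupby requires; a small sketch of that workaround:
import itertools

lst = [("B", "field1"), ("A", "field1"), ("A", "field1")]
# sorted() brings equal tuples next to each other so groupby can group them.
print([k + (len(list(g)),) for k, g in itertools.groupby(sorted(lst))])
# [('A', 'field1', 2), ('B', 'field1', 1)]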

Using set on Dictionary keys

For my program, I wish to cleanly check whether any element in a list is a key in a dictionary. So far, I can only think of looping through the list and checking.
However, is there any way to simplify this process? Is there any way to use sets? Through sets, one can check whether two lists have common elements.
This should be easy using the builtin any function:
any(item in dct for item in lst)
This is quick, efficient and (IMHO) quite readable. What could be better? :-)
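For example, with a hypothetical dict and list:
dct = {'a': 1, 'b': 2}
lst = ['x', 'b', 'y']
print(any(item in dct for item in lst))  # True ('b' is a key)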
Of course, this doesn't tell you which keys are in the dict. If you need that, then your best recourse is to use dictionary view objects:
# python2.7
dct.viewkeys() & lst # Returns a set of the overlap
# python3.x
dct.keys() & lst # Same as above, but for py3.x
You can test for an intersection between the dictionary's keys and the list items using dict.keys:
if the_dict.keys() & the_list:
# the_dict has one or more keys found in the_list
Demo:
>>> the_dict = {'a':1, 'b':2, 'c':3}
>>> the_list = ['x', 'b', 'y']
>>> if the_dict.keys() & the_list:
... print('found key in the_list')
...
found key in the_list
>>>
Note that in Python 2.x, the method is called dict.viewkeys.
As far as efficiency goes, you can't be any more efficient than looping through the list. I would also argue that looping through the list is already a simple process.

Why can't I iterate over a Counter in Python?

Why is it that when I try to do the below, I get "need more than 1 value to unpack"?
for key, value in countstr:
    print key, value
for key,value in countstr:
ValueError: need more than 1 value to unpack
However this works just fine:
for key, value in countstr.most_common():
    print key, value
I don't understand, aren't countstr and countstr.most_common() equivalent?
EDIT:
Thanks for the answers below; then I guess what I don't understand is: if countstr is a mapping, what is countstr.most_common()? -- I'm really new to Python, sorry if I am missing something simple here.
No, they're not. Iterating over a mapping (be it a collections.Counter or a dict or ...) iterates only over the mapping's keys.
And there's another difference: iterating over the keys of a Counter delivers them in no defined order. The order returned by most_common() is defined (sorted in reverse order of value).
No, they aren't equivalent. countstr is a Counter which is a dictionary subclass. Iterating over it yields 1 key at a time. countstr.most_common() is a list which contains 2-tuples (ordered key-value pairs).
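A quick illustration of the difference (the key order from plain iteration may vary by Python version, since older versions don't guarantee dict ordering):
from collections import Counter

countstr = Counter("banana")
print(list(countstr))          # keys only, e.g. ['b', 'a', 'n']
print(countstr.most_common())  # (key, count) pairs: [('a', 3), ('n', 2), ('b', 1)]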
countstr is a Counter, which is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values.
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']
You can iterate the counter's keys and values directly, or copy them into lists if you need indexed access:
s = ["bcdef", "abcdefg", "bcde", "bcdef"]
import collections
counter=collections.Counter(s)
vals = list(counter.values())
keys = list(counter.keys())
vals[0]
keys[0]
Output:
2
'bcdef'
