Finding key with minimum value in OrderedDict - python

I need to find the key with the lowest value in the ordered dictionary, but only among the positions where my_list is True.
from collections import OrderedDict
my_list = [True,False,False,True,True,False,False,False,]
my_lookup = OrderedDict([('a', 2), ('b', 9), ('c', 4), ('d', 7),
('e', 3), ('f', 0), ('g', -5), ('h', 9)])
I know how to do it with a for loop like
mins=[]
i=0
for item in my_lookup.items():
    if my_list[i]:
        mins.append(item)
    i+=1
print min(mins,key=lambda x:x[1])[0]
prints
a
because a has the lowest value among the keys that are also True in my_list.
This works, but it is long. How can I do it with a comprehension or in one line?

You can use itertools.compress, with the dictionary's get method as the key argument to min:
>>> from itertools import compress
>>> min(compress(my_lookup, my_list), key=my_lookup.get)
'a'
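For reference, here is the same idea as a self-contained script. This is a minimal sketch assuming Python 3; the name lowest_key is only illustrative and does not appear in the original answer.

```python
from collections import OrderedDict
from itertools import compress

my_list = [True, False, False, True, True, False, False, False]
my_lookup = OrderedDict([('a', 2), ('b', 9), ('c', 4), ('d', 7),
                         ('e', 3), ('f', 0), ('g', -5), ('h', 9)])

# compress() yields only the keys whose matching flag in my_list is true
# (here 'a', 'd' and 'e'); min() then picks the one with the smallest value.
lowest_key = min(compress(my_lookup, my_list), key=my_lookup.get)
print(lowest_key)
```

Iterating over a dictionary yields its keys, which is why my_lookup can be passed to compress directly.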

You can also combine generator expression and min:
>>> min(((value, key) for ((key, value), flag) in zip(my_lookup.iteritems(), my_list) if flag))[1]
'a'

Two-liner:
from itertools import izip
print min((lookup_key for (lookup_key, keep) in izip(my_lookup.iterkeys(),
                                                     my_list) if keep), key=my_lookup.get)

For what it's worth, a slight variation of the original code performs reasonably (on par with ovgolovin's approach) and is quite readable:
minimum = (None, float('inf'))
for i, item in enumerate(my_lookup.items()):
    if not my_list[i]:
        continue
    if item[1] < minimum[1]:
        minimum = item
print minimum[0]
Even the original code is only slightly slower than this, based on ovgolovin's ideone benchmark. Of course, Jared's solution is considerably faster and shorter.

Related

List of tuples and mean calculation [duplicate]

This question already has answers here:
Finding the average of a list
(25 answers)
Closed 1 year ago.
I have the following list of tuples:
list = [(120, 'x'), (1120, 'y'), (1330, 'x'), (0, 't'), (1, 'x'), (0, 'd'), (2435, 'x')]
I would like to calculate the mean of the first component of all tuples. I did the following:
s = []
for i in range(len(list)):
    a = list[i][0]
    if a != 0:
        s.append(a)
    else:
        pass
mean = sum(s) / len(s)
It works, but my question is whether there is any way to avoid the for loop. I have a very large list of tuples, and because of the computation time I need to find another way if possible.
Building on the for loop method above, how could I find the mean with regard to the weights? For example, the last element in the list is (2435, 'x'), and the number 2435 is very large compared to the 1 in (1, 'x'). Any ideas would be much appreciated. Thanks in advance.
The loop is unavoidable as you need to iterate over all the elements at least once as John describes.
However, you can use an iterator based approach to get rid of creating a list to save on space:
mean = sum(elt[0] for elt in lst)/len(lst)
Update: I realize you only need the mean of elements that are non-zero. You can modify your approach to not store the elements in the list.
total = 0
counts = 0
for elt in lst:
    if elt[0]:
        total += elt[0]
        counts += 1
mean = total/counts
A pandas approach:
import pandas as pd
tuples = [(120, 'x'), (1120, 'y'), (1330, 'x'), (0, 't'), (1, 'x'), (0, 'd'), (2435, 'x')]
df = pd.DataFrame(tuples)
df[0][df[0]!=0].mean() #1001.2
Careful timing would be needed to see if this is any better than what you are currently doing. The actual mean calculation should be faster, but the gain could well be negated by the cost of conversion.
You do need a for loop, but you can use a list comprehension to make it cleaner.
Also, the Python standard library has a very nice statistics module that you can use to calculate the mean.
As an extra note, please do not use list as a variable name: it shadows the built-in list type.
from statistics import mean
mylist = [(120, 'x'), (1120, 'y'), (1330, 'x'), (0, 't'), (1, 'x'), (0,'d'), (2435, 'x')]
m = mean([item[0] for item in mylist if item[0] != 0])
print(m)
1001.2
In Python 2.7
items = [item[0] for item in mylist if item[0] != 0]
mean = float(sum(items))/len(items)  # float() avoids integer division in Python 2
print(mean)
1001.2
Finish up by refactoring the list comprehension to show more meaningful variable names, for example items = [number for number, letter in mylist if number != 0].
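Putting the suggestions above together, a minimal sketch assuming Python 3 might look like this. The names pairs and nonzero_mean are illustrative only.

```python
from statistics import mean

pairs = [(120, 'x'), (1120, 'y'), (1330, 'x'), (0, 't'),
         (1, 'x'), (0, 'd'), (2435, 'x')]

# Unpack each tuple into meaningful names and skip the zero entries.
# mean() accepts any iterable, so no intermediate list is needed.
nonzero_mean = mean(number for number, letter in pairs if number != 0)
print(nonzero_mean)
```

The generator expression still visits every element once; it only avoids storing the filtered values.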

Character count in Python

The task: get a word from the user, then count the occurrences of each character and display them in sorted order (counts descending, characters ascending).
For example, if the user enters "management", the output should be:
a 2
e 2
m 2
n 2
g 1
t 1
This is the code I have written for the task:
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
            lis.remove(j)
this code gives me following output for string "management"
a 2
m 2
e 2
n 2
g 1
t 1
m & e are not sorted.
What is wrong with my code?
The issue with your code is that you remove elements from the list while you are still iterating over it. Presently, you remove "a", whereupon "e" takes its spot, and the loop advances to the next position, skipping "e". Thus "e" is not printed until the next pass of the while loop.
Try separating your printing from your removal, and don't remove elements from a list you're currently iterating over. Instead, add all the other elements to a new list.
string=input().strip()
set1=set(string)
lis=[]
for i in set1:
    lis.append(i)
lis.sort()
while len(lis)>0:
    maxi=0
    for i in lis:
        if string.count(i)>maxi:
            maxi=string.count(i)
    for j in lis:
        if string.count(j)==maxi:
            print(j,maxi)
    dupelis = lis
    lis = []
    for k in dupelis:
        if string.count(k)!=maxi:
            lis.append(k)
management
a 2
e 2
m 2
n 2
g 1
t 1
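The skipping described in this answer can be reproduced in isolation. The following is a minimal sketch, assuming Python 3; the list letters and the helper list visited are hypothetical names used only to show which elements the loop actually sees.

```python
# Sorted distinct letters of "management".
letters = ['a', 'e', 'g', 'm', 'n', 't']
visited = []

for letter in letters:
    visited.append(letter)
    if letter in 'aemn':          # the letters that occur twice
        letters.remove(letter)    # mutating the list shifts later items left

print(visited)   # 'e' and 'n' never appear here: they were skipped
print(letters)   # the skipped letters are still in the list

# Safer: build a new list instead of removing during iteration.
remaining = [letter for letter in ['a', 'e', 'g', 'm', 'n', 't']
             if letter not in 'aemn']
print(remaining)
```

Each removal moves the following element into the index the loop has just finished with, so that element is never visited in the same pass.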
The problem with your code is the assignment of the variable maxi and the two for loops. "e" won't come second because you are assigning maxi as 2 and string.count(i) will be less than maxi.
for i in lis:
    if string.count(i)>maxi:
        maxi=string.count(i)
for j in lis:
    if string.count(j)==maxi:
        print(j,maxi)
There are several ways of achieving what you are looking for. You can try the solutions as others have explained.
you can use a simple Counter for that
from collections import Counter
Counter("management")
Counter({'a': 2, 'e': 2, 'm': 2, 'n': 2, 'g': 1, 't': 1})
I'm not really sure what you are trying to achieve by adding a while loop and then two nested for loops inside it. But the same thing can be achieved by a single for loop.
for i in lis:
    print(i, string.count(i))
With this the output will be:
a 2
e 2
g 1
m 2
n 2
t 1
As answered before, you can use a Counter to get the counts of characters, no need to make a set or list.
For sorting, you'd be well off using the built-in sorted function, which accepts a function in its key parameter. Read more about sorting and lambda functions.
>>> from collections import Counter
>>> c = Counter('management')
>>> sorted(c.items())
[('a', 2), ('e', 2), ('g', 1), ('m', 2), ('n', 2), ('t', 1)]
>>> alpha_sorted = sorted(c.items())
>>> sorted(alpha_sorted, key=lambda x: x[1])
[('g', 1), ('t', 1), ('a', 2), ('e', 2), ('m', 2), ('n', 2)]
>>> sorted(alpha_sorted, key=lambda x: x[1], reverse=True) # Reverse ensures you get descending sort
[('a', 2), ('e', 2), ('m', 2), ('n', 2), ('g', 1), ('t', 1)]
The easiest way to count the characters is to use Counter, as suggested by some previous answers. After that, the trick is to come up with a measure that takes both the count and the character into account to achieve the sorting. I have the following:
from collections import Counter
c = Counter('management')
sc = sorted(c.items(),
            key=lambda x: -1000 * x[1] + ord(x[0]))
for char, count in sc:
    print(char, count)
c.items() gives (character, count) tuples. We can use sorted() to sort them.
The key parameter supplies the sort key. sorted() puts items with lower keys (i.e. keys with smaller values) first, so I have to make a big count produce a small key.
I basically give a lot of negative weight (-1000) to the count (x[1]), then add the ASCII value of the character (ord(x[0])). The result is a sorting order that takes the count into account first and the character second.
An underlying assumption is that ord(x[0]) never exceeds 1000, which should be true of English characters.
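An alternative that avoids the weighting constant, offered as a sketch rather than as this answer's method: Python compares tuples element by element, so a key of (-count, character) sorts by count descending and then by character ascending, with no assumption about how large ord() can get. The example assumes Python 3, and the names counts and ordered are illustrative.

```python
from collections import Counter

counts = Counter('management')

# Negating the count makes larger counts sort first; ties fall through
# to the second tuple element, the character itself, in ascending order.
ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

for char, count in ordered:
    print(char, count)
```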

How can I make a dictionary / collections.Counter that takes into account the index in Python?

I am aware of dictionaries and collections.Counter in Python.
My question is: how can I make one that takes the index in the string into account?
For example for this string: aaabaaa
I would like to make a list of tuples containing each character in progression, keeping a count going left to right and resetting the count once a new character is found.
For example, I would like to see this output:
[('a', 3), ('b', 1), ('a', 3)]
Any idea how to use the dictionary / Counter/ or is there some other data structure built into Python I can use?
Regards
You could use groupby:
from itertools import groupby
m = [(k, sum(1 for _ in v)) for k, v in groupby('aaabaaa')]
print(m)
Output
[('a', 3), ('b', 1), ('a', 3)]
Explanation
The groupby function makes an iterator that returns consecutive keys and groups from the iterable, in this case 'aaabaaa'. The key k is the value that identifies each group, here 'a', 'b' and 'a' in turn. The expression sum(1 for _ in v) counts the number of elements in the group.
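To make the grouping visible, the groups can be materialised as lists before counting. This is a small sketch assuming Python 3; the names text, runs and counts are illustrative.

```python
from itertools import groupby

text = 'aaabaaa'

# Turn each group iterator into a list so its contents can be inspected.
runs = [(key, list(group)) for key, group in groupby(text)]
print(runs)

# The length of each group is the run length the question asks for.
counts = [(key, len(members)) for key, members in runs]
print(counts)
```

Note that groupby only merges adjacent equal items, which is why 'a' appears twice in the result instead of being totalled as a Counter would.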

An algorithm to find transitions in Python

I want to implement an algorithm that gets the index of letter changes.
Given the list below, I want to find the index where each new run of letters begins and collect these in a result list. The first run is the exception: for it, I need the index of its last occurrence. Let me give you an example:
letters=['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C','X','D','X','B','B','A','A','A','A']
Transitions:
'A','A','A','A','A','A','A','A','A','A','A','A'-->'B'-->'C','C'-->'X'-->'D'-->'X'-->'B','B'-->'A','A','A','A'
Here, after the run of A letters ends, B starts, so the result should contain the index of the last A and the index of the first B, and so on. The letter X should not be included in the result list.
Desired result:
[(11, 'A'), (12, 'B'), (13, 'C'), (16, 'D'), (18, 'B'), (20, 'A')]
So far I have written this code, which finds every item except (11, 'A'). How can I modify it to get the desired result?
result = []
for i in range(len(letters)):
    if letters[i]!='X' and letters[i]!=letters[i-1]:
        result.append((i,(letters[i])))
My result:
[(12, 'B'), (13, 'C'), (16, 'D'), (18, 'B'), (20, 'A')] ---> missing (11, 'A').
Now that you've explained you want the first index of every letter after the first, here's a one-liner:
letters=['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C','X','D','X','B','B','A','A','A','A']
[(n+1, b) for (n, (a,b)) in enumerate(zip(letters,letters[1:])) if a!=b and b!='X']
#=> [(12, 'B'), (13, 'C'), (16, 'D'), (18, 'B'), (20, 'A')]
Now, your first entry is different. For this, you need to use a recipe which finds the last index of each item:
import itertools
grouped = [(len(list(g))-1,k) for k,g in (itertools.groupby(letters))]
weird_transitions = [grouped[0]] + [(n+1, b) for (n, (a,b)) in enumerate(zip(letters,letters[1:])) if a!=b and b!='X']
#=> [(11, 'A'), (12, 'B'), (13, 'C'), (16, 'D'), (18, 'B'), (20, 'A')]
Of course, you could avoid creating the whole list of grouped, because you only ever use the first item from groupby. I leave that as an exercise for the reader.
This will also give you an X as the first item, if X is the first (set of) items. Because you say nothing about what you're doing, or why the Xs are there, but omitted, I can't figure out if that's the right behaviour or not. If it's not, then probably use my entire other recipe (in my other answer), and then take the first item from that.
Your question is a bit confusing, but this code should do what you want.
firstChangeFound = False
for i in range(len(letters)):
    if letters[i]!='X' and letters[i]!=letters[i-1]:
        if not firstChangeFound:
            result.append((i-1, letters[i-1])) #Grab the last occurrence of the first character
            result.append((i, letters[i]))
            firstChangeFound = True
        else:
            result.append((i, letters[i]))
You want (Or, you don't, as you finally explained - see my other answer):
import itertools
import functional # get it from pypi
letters=['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C','X','D','X','B','B','A','A','A','A']
grouped = [(len(list(g)),k) for k,g in (itertools.groupby(letters))]
#=> [(12, 'A'), (1, 'B'), (2, 'C'), (1, 'X'), (1, 'D'), (1, 'X'), (2, 'B'), (4, 'A')]
#-1 to take this from counts to indices
filter(lambda (a,b): b!='X',functional.scanl(lambda (a,b),(c,d): (a+c,d), (-1,'X'), grouped))
#=> [(11, 'A'), (12, 'B'), (14, 'C'), (16, 'D'), (19, 'B'), (23, 'A')]
This gives you the last index of each letter run, other than Xs. If you want the first index after the relevant letter, then switch the -1 to 0.
scanl is a reduce which returns intermediate results.
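scanl is not part of the standard library. In Python 3, itertools.accumulate can serve as a rough stand-in, since it also yields every intermediate result of a running reduction. The following sketch is an adaptation under that assumption, not the code from this answer; the names runs, last_indices and result are illustrative.

```python
from itertools import accumulate, groupby

letters = ['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C',
           'X','D','X','B','B','A','A','A','A']

# (run length, letter) for every run, including the 'X' runs.
runs = [(len(list(group)), letter) for letter, group in groupby(letters)]

# The running total of run lengths, minus 1, is the last index of each run.
last_indices = [total - 1 for total in accumulate(length for length, _ in runs)]

# Filter the 'X' runs out last, once the indices are known.
result = [(index, letter)
          for index, (_, letter) in zip(last_indices, runs)
          if letter != 'X']
print(result)
```

The 'X' runs have to stay in the running total, otherwise the indices of the later runs would be too small.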
As a general rule, it makes sense to either filter first or last, unless that is for some reason expensive, or the filtering can easily be accomplished without increasing complexity.
Also, your code is relatively hard to read and understand, because you iterate by index. That's unusual in python, unless manipulating the index numerically. If you're visiting every item, it's usual to iterate directly.
Also, why do you want this particular format? It's usual to have the format as (unique item,data) because that can easily be placed in a dict.
With minimal change to your code, and following Josh Caswell's suggestion:
for i, letter in enumerate(letters[1:], 1):
    if letter != 'X' and letters[i] != letters[i-1]:
        result.append((i, letter))
first_change = result[0][0]
first_stretch = ''.join(letters[:first_change]).rstrip('X')
if first_stretch:
    result.insert(0, (len(first_stretch) - 1, first_stretch[-1]))
Here's a solution which uses groupby to generate a single sequence from which both first and last indices can be extracted.
import itertools
import functools
import operator
letters = ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'C', 'C', 'X', 'D', 'X', 'B', 'B', 'A', 'A', 'A', 'A']
groupbysecond = functools.partial(itertools.groupby,key=operator.itemgetter(1))
def transitions(letters):
    # segregate transition and non-transition indices
    grouped = groupbysecond(enumerate(zip(letters,letters[1:])))
    # extract first such entry from each group
    firsts = (next(l) for k,l in grouped)
    # group those entries together - where multiple, there are first and last
    # indices of the run of letters
    regrouped = groupbysecond((n,a) for n,(a,b) in firsts)
    # special case for first entry, which wants last index of first letter
    kfirst,lfirst = next(regrouped)
    firstitem = (tuple(lfirst)[-1],) if kfirst != 'X' else ()
    # return first item, and first index for all other letters
    return itertools.chain(firstitem,(next(l) for k,l in regrouped if k != 'X'))
letters=['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C','X','D','X','B','B','A','A','A','A']
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
prev = letters[0]
result = []
for i in range(len(letters)):
    if prev!=letters[i]:
        result.append((i-1,prev))
    if letters[i]!='X':
        prev = letters[i]
    else:
        prev = letters[i+1]
result.append((len(letters)-1,letters[-1]))
print result
Results in (not the OP's desired output, sorry, I must have misunderstood; see JSutton's answer):
[(11,'A'), (12,'B'), (14,'C'), (16,'D'), (19,'B'), (23,'A')]
which is actually the index of the last instance of each letter before the letter changes or the list ends.
With the aid of a dictionary to keep the running time linear in the size of the input, here is a solution:
letters=['A','A','A','A','A','A','A','A','A','A','A','A','B','C','C','X','D','X','B','B','A','A','A','A']
def f(letters):
    result = []
    added = {}
    for i in range(len(letters)):
        if (i+1 == len(letters)):
            break
        if letters[i+1]!='X' and letters[i+1]!=letters[i]:
            if(i not in added and letters[i]!='X'):
                result.append((i, letters[i]))
                added[i] = letters[i]
            if(i+1 not in added):
                result.append((i+1, letters[i+1]))
                added[i+1] = letters[i+1]
    return result
Basically, my solution always tries to add both indices where a change occurred. The dictionary, which has constant-time lookup, tells us whether we already added an element, so duplicates are excluded. This takes care of adding the first element. Alternatively, you could use an if statement that flags the first round and only runs once; I would argue that has the same running time. Just do not check whether an element was already added by searching the result list itself: that lookup is linear time at worst, which would make the whole thing O(n^2), which is bad!
Here's my suggestion. It has three steps.
First, find all the starting indexes for each run of letters.
Replace the index in the first non-X run with the index of the end of its run, which will be one less than the start of the following run.
Filter out all X runs.
The code:
def letter_runs(letters):
    prev = None
    results = []
    for index, letter in enumerate(letters):
        if letter != prev:
            prev = letter
            results.append((index, letter))
    if results[0][1] != "X":
        results[0] = (results[1][0]-1, results[0][1])
    else: # if first run is "X" second must be something else!
        results[1] = (results[2][0]-1, results[1][1])
    return [(index, letter) for index, letter in results if letter != "X"]

PySchool- List (Topic 6-22)

I am a beginner in Python and I am trying to solve some questions about lists. I got stuck on one problem that I am not able to solve:
Write a function countLetters(word) that takes in a word as argument
and returns a list that counts the number of times each letter
appears. The letters must be sorted in alphabetical order.
Ex:
>>> countLetters('google')
[('e', 1), ('g', 2), ('l', 1), ('o', 2)]
I am not able to count the occurrences of every character. For sorting I am using sorted(list), and I am also using a dictionary (its items function) to get this output format (a list of tuples). But I am not able to link all these things together.
Use sets!
m = "google"
u = set(m)
sorted([(l, m.count(l)) for l in u])
#=> [('e', 1), ('g', 2), ('l', 1), ('o', 2)]
A hint: Note that you can loop through a string in the same way as a list or other iterable object in python:
def countLetters(word):
    for letter in word:
        print letter
countLetters("ABC")
The output will be:
A
B
C
So instead of printing, use the loop to look at what letter you've got (in your letter variable) and count it somehow.
Finally made it!
import collections
def countch(strng):
    d=collections.defaultdict(int)
    for letter in strng:
        d[letter]+=1
    print sorted(d.items())
This is my solution. Now I can ask for your solutions to this problem; I would love to see your code.
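One more possible take, as a sketch assuming Python 3: collections.Counter does the tallying that the defaultdict loop performs by hand, and returning the list (as the exercise statement asks) rather than printing it makes the function easier to reuse.

```python
from collections import Counter

def countLetters(word):
    # Counter tallies each letter; sorted() orders the (letter, count)
    # pairs alphabetically because tuples compare by first element.
    return sorted(Counter(word).items())

print(countLetters('google'))
```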
