I am running python 2.7.2 on a mac.
I have a simple dictionary:
dictionary= {a,b,c,a,a,b,b,b,b,c,a,w,w,p,r}
I want it to be printed and have the output like this:
Dictionary in alphabetical order:
a 4
b 5
c 2
p 1
r 1
w 2
But what I'm getting is something like this...
a 1
a 1
a 1
a 1
b 1
.
.
.
w 1
This is the code I am using.
new_dict = []
for word in dictionary.keys():
    value = dictionary[word]
    string_val = str(value)
    new_dict.append(word + ": " + string_val)
sorted_dictionary = sorted(new_dict)
for entry in sorted_dictionary:
    print entry
Can you please tell me where is the mistake?
(By the way, I'm not a programmer but a linguist, so please go easy on me.)
What you're using is not a dictionary, it's a set! :)
And sets don't allow duplicates.
What you probably need is not a dictionary, but a list.
A little explanation
Dictionaries have keys, and each unique key has its own value:
my_dict = {1:'a', 2:'b', 3:'c'}
You retrieve values by using the keys:
>>> my_dict[1]
'a'
On the other hand, a list doesn't have keys.
my_list = ['a','b','c']
And you retrieve the values using their index:
>>> my_list[1]
'b'
Keep in mind that indices start counting from zero, not 1.
Solving The Problem
Now, for your problem. First, store the characters as a list:
l = ['a', 'b', 'c', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'a', 'w', 'w', 'p', 'r']
Next, we'll need to know what items are in this list:
items = []
for item in l:
    if item not in items:
        items.append(item)
This is much the same as items = set(l), except that the result is a list, which keeps the order in which the items first appear. I've written it out as a loop so that it is clear what the code does.
Here is the content of items:
>>> items
['a', 'b', 'c', 'w', 'p', 'r']
With that done, we will use the list.count() method to count the number of occurrences of each character in your list, and the built-in function sorted() to sort the items:
for item in sorted(items): #iterates through the sorted items.
    print item, l.count(item)
Result:
a 4
b 5
c 2
p 1
r 1
w 2
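For readers on Python 3, where print is a function, the same steps could be sketched roughly as follows. The names letters, unique_items and counts are illustrative and are not part of the answer above.

```python
# Python 3 sketch of the list.count() approach; variable names are illustrative.
letters = ['a', 'b', 'c', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'a', 'w', 'w', 'p', 'r']

# Collect each distinct item once, keeping first-seen order.
unique_items = []
for item in letters:
    if item not in unique_items:
        unique_items.append(item)

# Pair every distinct item with its number of occurrences, in alphabetical order.
counts = [(item, letters.count(item)) for item in sorted(unique_items)]
for item, count in counts:
    print(item, count)
```

Note that list.count() rescans the whole list for every distinct item, which is fine for a short list like this one but gets slow for long ones.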
Hope this helps!!
Let's start with the obvious, this:
dictionary= {a,b,c,a,a,b,b,b,b,c,a,w,w,p,r}
is not a dictionary. It is a set, and sets do not preserve duplicates. You probably meant to declare that as a list or a tuple.
Now, onto the meat of your problem: you need to implement something to count the items of your collection. Your implementation doesn't really do that. You could roll your own, but really you should use a Counter:
my_list = ['a','b','c','a','a','b','b','b','b','c','a','w','w','p','r']
from collections import Counter
c = Counter(my_list)
c
Out[19]: Counter({'b': 5, 'a': 4, 'c': 2, 'w': 2, 'p': 1, 'r': 1})
Now on to your next problem: in Python 2.7, dictionaries (of all types, including Counter objects) do not preserve key order. You need to call sorted on the dict's items(), which is a list of tuples, then iterate over that to do your printing.
for k,v in sorted(c.items()):
    print('{}: {}'.format(k,v))
a: 4
b: 5
c: 2
p: 1
r: 1
w: 2
A dictionary looks something like this: {key1: content1, key2: content2, ...}, and each key in a dictionary is unique. By contrast, a = {1,2,3,4,5,5,4,5,6} is a set. When you print it out, you will notice that
print a
set([1,2,3,4,5,6])
duplicates are eliminated.
In your case, a better data structure is a list, which can hold duplicates.
If you want to count how many times each element occurs, a better option is collections.Counter, for instance:
import collections as c
cnt = c.Counter()
letters = ['a','b','c','a','a','b','b','b','b','c','a','w','w','p','r']
for item in letters:
    cnt[item] += 1
print cnt
The result would be:
Counter({'b': 5, 'a': 4, 'c': 2, 'w': 2, 'p': 1, 'r': 1})
As you can see, the result behaves like a dictionary.
So by using:
for key in cnt.keys():
    print key, cnt[key]
you can access each key and its count:
a 4
c 2
b 5
p 1
r 1
w 2
You can achieve what you want by modifying this a little bit. Hope this is helpful.
A dictionary cannot be defined as {'a','b'}. If it is defined that way, it is a set, which cannot contain duplicates.
If you're defining a character, put it in quotes unless it is already declared as a variable.
You can't loop with for word in dictionary.keys():, since here dictionary is not a dictionary type.
If you'd like to write the code without relying on any built-in counting helper, try this:
letters=['a','b','c','a','a','b','b','b','b','c','a','w','w','p','r']
counts={}
for x in letters:
    if x in counts:
        counts[x]=counts[x]+1
    else:
        counts[x]=1
for k in counts.keys():
    print k, counts[k]
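The loop above prints the keys in whatever order the dictionary yields them, while the question asks for alphabetical order. A possible variant, assuming Python 3 and using the illustrative names letters and counts, tallies with dict.get() and sorts the keys before printing:

```python
# Sketch: tally with dict.get(), then print the keys in alphabetical order.
letters = ['a', 'b', 'c', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'a', 'w', 'w', 'p', 'r']

counts = {}
for letter in letters:
    # get() returns the default 0 the first time a letter is seen.
    counts[letter] = counts.get(letter, 0) + 1

# Iterating over sorted(counts) visits the keys alphabetically.
for letter in sorted(counts):
    print(letter, counts[letter])
```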
First, a dictionary is an unordered collection (i.e., it has no guaranteed order of its keys).
Second, each dict key must be unique.
Though you could count the frequency of characters using a dict, there's a better solution. The Counter class in Python's collections module is based on a dict and is specifically designed for a task like tallying frequency.
from collections import Counter
letters = ['a', 'b', 'c', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'a', 'w', 'w', 'p', 'r']
cnt = Counter(letters)
print cnt
The contents of the counter are now:
Counter({'b': 5, 'a': 4, 'c': 2, 'w': 2, 'p': 1, 'r': 1})
You can print these conveniently:
for char, freq in sorted(cnt.items()):
    print char, freq
which gives:
a 4
b 5
c 2
p 1
r 1
w 2
Related
I would like to generate a list of combinations. I will try to simplify my problem to make it understandable.
We have 3 variables :
x : number of letters
k : number of groups
n : number of letters per group
I would like to use Python to generate a list of every possible combination, without any duplicates, given that I don't care about the order of the groups or the order of the letters within a group.
As an example, with x = 4, k = 2, n = 2 :
# we start with 4 letters, we want to make 2 groups of 2 letters
letters = ['A','B','C','D']
# here would be a code that generate the list
# Here is the result that is very simple, only 3 combinations exist.
combos = [ ['AB', 'CD'], ['AC', 'BD'], ['AD', 'BC'] ]
Since I don't care about the order of the groups or of the letters within a group, ['AB', 'CD'] and ['DC', 'BA'] are duplicates.
This is a simplification of my real problem, which has those values : x = 12, k = 4, n = 3. I tried to use some functions from itertools, but with that many letters my computer freezes because it's too many combinations.
Another way of seeing the problem : you have 12 players, you want to make 4 teams of 3 players. What are all the possibilities ?
Could anyone help me to find an optimized solution to generate this list?
There will certainly be more sophisticated/efficient ways of doing this, but here's an approach that works in a reasonable amount of time for your example and should be easy enough to adapt for other cases.
It generates unique teams and unique combinations thereof, as per your specifications.
from itertools import combinations
# this assumes that team_size * team_num == len(players) is a given
team_size = 3
team_num = 4
players = list('ABCDEFGHIJKL')
unique_teams = [set(c) for c in combinations(players, team_size)]
def duplicate_player(combo):
    """Returns True if a player occurs in more than one team"""
    return len(set.union(*combo)) < len(players)
result = (combo for combo in combinations(unique_teams, team_num) if not duplicate_player(combo))
result is a generator that can be iterated over or turned into a list with list(result). On kaggle.com, it takes a minute or so to generate the whole list of all possible combinations (a total of 15400, in line with the computations by @beaker and @John Coleman in the comments). Each combination is a tuple of sets, so the list looks like this:
[({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}),
({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'J'}, {'I', 'K', 'L'}),
({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'K'}, {'I', 'J', 'L'}),
...
]
If you want, you can cast them into strings by calling ''.join() on each of them.
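The total of 15400 mentioned above can be checked without generating anything. The sketch below uses the standard counting argument: take all orderings of the players, then divide out the orderings within each team and the orderings of the teams themselves. The function name count_equal_partitions is made up for this illustration.

```python
from math import factorial

def count_equal_partitions(num_players, team_size):
    """Number of ways to split num_players into unordered teams of team_size."""
    if num_players % team_size != 0:
        raise ValueError("team_size must divide num_players")
    num_teams = num_players // team_size
    # Divide out the orderings inside each team and the orderings of the teams.
    return factorial(num_players) // (factorial(team_size) ** num_teams * factorial(num_teams))

print(count_equal_partitions(12, 3))  # 15400
print(count_equal_partitions(4, 2))   # 3
```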
Another solution (players are numbered 0, 1, ...):
import itertools
def equipartitions(base_count: int, group_size: int):
    if base_count % group_size != 0:
        raise ValueError("group_size must divide base_count")
    return set(_equipartitions(frozenset(range(base_count)), group_size))
def _equipartitions(base_set: frozenset, group_size: int):
    if not base_set:
        yield frozenset()
    for combo in itertools.combinations(base_set, group_size):
        for rest in _equipartitions(base_set.difference(frozenset(combo)), group_size):
            yield frozenset({frozenset(combo), *rest})
all_combinations = [
    [tuple(team) for team in combo]
    for combo in equipartitions(12, 3)
]
print(all_combinations)
print(len(all_combinations))
And another:
import itertools
from typing import Iterable
def equipartitions(players: Iterable, team_size: int):
    if len(players) % team_size != 0:
        raise ValueError("team_size must divide the number of players")
    return _equipartitions(set(players), team_size)
def _equipartitions(players: set, team_size: int):
    if not players:
        yield []
        return
    first_player, *other_players = players
    for other_team_members in itertools.combinations(other_players, team_size-1):
        first_team = {first_player, *other_team_members}
        for other_teams in _equipartitions(set(other_players) - set(first_team), team_size):
            yield [first_team, *other_teams]
all_combinations = [
    {''.join(sorted(team)) for team in combo} for combo in equipartitions(players='ABCDEFGHIJKL', team_size=3)
]
print(all_combinations)
print(len(all_combinations))
Firstly, you can use a list comprehension to give you all of the possible combinations (regardless of the duplicates):
comb = [(a,b) for a in letters for b in letters if a != b]
And, afterwards, you can use the sorted function to sort the tuples. After that, to remove the duplicates, you can convert all of the items to a set and then back to a list.
var = [tuple(sorted(sub)) for sub in comb]
var = list(set(var))
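As a cross-check, the deduplicated pairs should be the same as what itertools.combinations produces directly. The sketch below assumes the four-letter example from the question; the name unique_pairs is illustrative.

```python
from itertools import combinations

letters = ['A', 'B', 'C', 'D']

# All ordered pairs of distinct letters, e.g. both ('A', 'B') and ('B', 'A').
comb = [(a, b) for a in letters for b in letters if a != b]

# Sorting each pair makes ('B', 'A') equal to ('A', 'B'); the set then drops repeats.
unique_pairs = sorted(set(tuple(sorted(pair)) for pair in comb))

print(unique_pairs)
print(unique_pairs == list(combinations(letters, 2)))  # True
```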
The list comprehension approach performs n(n-1) iterations. Alternatively, you could use a more verbose way that performs only n(n-1)/2 iterations (both are O(n^2)):
comb = []
for first_letter_idx, _ in enumerate(letters):
    for sec_letter_idx in range(first_letter_idx + 1, len(letters)):
        comb.append(letters[first_letter_idx] + letters[sec_letter_idx])
print(comb)
comb2 = []
for first_letter_idx, _ in enumerate(comb):
    for sec_letter_idx in range(first_letter_idx + 1, len(comb)):
        if (comb[first_letter_idx][0] not in comb[sec_letter_idx]
                and comb[first_letter_idx][1] not in comb[sec_letter_idx]):
            comb2.append([comb[first_letter_idx], comb[sec_letter_idx]])
print(comb2)
This algorithm needs more work to handle dynamic inputs. Maybe with recursion.
Use combinations from itertools
from itertools import combinations
x = list(combinations(['A','B','C','D'],2))
t = []
for i in (x):
    t.append(i[0]+i[1]) # concatenating the strings and adding in a list
g = []
for i in range(len(t)):
    for j in range(i+1,len(t)):
        if not set(t[i]) & set(t[j]): # keep only pairs of groups that share no letter
            g.append([t[i],t[j]])
print(g)
I know this is a frequently asked question, however I do not have access to the Counter module as I'm using v2.6 of Python. I want to count the number of times a specific key appears in a list of dictionaries.
If my data looks like this:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
How would I find out how many times "a" appears? I've tried using len, but that only returns the number of values for one key.
len(data['a'])
You can use list comprehension.
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
sum([1 for d in data if 'a' in d])
Explanation:
Iterate over the dictionaries in the list data and check whether the key 'a' is present in each one; if it is, add 1 to a new list. Then sum the new list.
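A small variation, offered as a sketch: passing a generator expression to sum() gives the same count without building the intermediate list. Generator expressions predate Python 2.6, so this should work there as well. The name count_a is illustrative.

```python
data = [{'a': 1, 'b': 1}, {'a': 1, 'c': 1}, {'b': 1, 'c': 1}, {'a': 1, 'c': 1}, {'a': 1, 'd': 1}]

# Yield a 1 for every dictionary that has the key 'a' and add them up.
count_a = sum(1 for d in data if 'a' in d)
print(count_a)  # 4
```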
You won't have access to collections.Counter, but collections.defaultdict was added in Python 2.5
keys and flatten list
data = [j for i in data for j in i.keys()]
# ['a', 'b', 'a', 'c', 'c', 'b', 'a', 'c', 'a', 'd']
collections.defaultdict
from collections import defaultdict
dct = defaultdict(int)
for key in data:
    dct[key] += 1
# defaultdict(<type 'int'>, {'a': 4, 'c': 3, 'b': 2, 'd': 1})
If you only need the count for a, there are simpler ways to do this, but this will give you the counts of all keys in your list of dictionaries.
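The two steps above can also be combined into one nested loop, which avoids overwriting data with the flattened list. This is only a sketch; the name key_counts is illustrative.

```python
from collections import defaultdict

data = [{'a': 1, 'b': 1}, {'a': 1, 'c': 1}, {'b': 1, 'c': 1}, {'a': 1, 'c': 1}, {'a': 1, 'd': 1}]

key_counts = defaultdict(int)  # missing keys start at 0
for d in data:
    for key in d:              # iterating over a dict yields its keys
        key_counts[key] += 1

print(dict(key_counts))
```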
A one-line solution could be:
len([k for d in data for k in d.keys() if k == 'a'])
For this you could write the following function that would work for data in the structure you provided (a list of dicts):
def count_key(key,dict_list):
    keys_list = []
    for item in dict_list:
        keys_list += item.keys()
    return keys_list.count(key)
Then, you could invoke the function as follows:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
count_a = count_key('a',data)
In this case, count_a will be 4.
This question looks very much like a class assignment. Here is a simple bit of code that will do the job:
n=0
for d in data:
    if 'a' in d:
        n+=1
print(n)
Here n is a counter, the for loop iterates through the list of dictionaries.
The 'a' in d expression evaluates to True if the key 'a' is in the dictionary d, in which case the counter n is incremented. At the end the result is printed. I believe that in Python 2.6 the parentheses around print's argument would be optional (I am using 3.6).
All the questions I've seen do the exact opposite of what I want to do:
Say I have a list:
lst = ['a','b','c']
I am looking to make a dictionary where the key is the element number (starting with 1 instead of 0) and the list element is the value. Like this:
{1:'a', 2:'b', 3:'c'}
But for a long list. I've read a little about enumerate() but everything I've seen has used the list element as the key instead.
I found this:
dict = {tuple(key): idx for idx, key in enumerate(lst)}
But that produces:
{'a':1, 'b':2, 'c':3}
... which is the opposite of what I want. And, also in a weird notation that is confusing to someone new to Python.
Advice is much appreciated! Thanks!
enumerate has a start keyword argument so you can count from whatever number you want. Then just pass that to dict
dict(enumerate(lst, start=1))
You could also write a dictionary comprehension
{index: x for index, x in enumerate(lst, start=1)}
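To see why passing enumerate straight to dict works, it may help to look at what enumerate yields: a sequence of (index, element) pairs, which is exactly the form dict accepts. The names pairs and numbered below are illustrative.

```python
lst = ['a', 'b', 'c']

# enumerate yields (index, element) pairs; start=1 shifts the first index to 1.
pairs = list(enumerate(lst, start=1))
print(pairs)     # [(1, 'a'), (2, 'b'), (3, 'c')]

# dict() turns each pair into a key/value entry.
numbered = dict(pairs)
print(numbered)  # {1: 'a', 2: 'b', 3: 'c'}
```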
By default enumerate starts from 0, but you can change this with its second argument, start. Alternatively, you can add 1 to every index if you want to start from 1 instead of zero:
print({index+1:value for index,value in enumerate(lst)})
output:
{1: 'a', 2: 'b', 3: 'c'}
The above dict comprehension is the same as:
dict_1={}
for index,value in enumerate(lst):
    dict_1[index+1]=value
print(dict_1)
Using Dict Comprehension and enumerate
print({x:y for x,y in enumerate(lst,1)})
{1: 'a', 2: 'b', 3: 'c'}
Using Dict Comprehension, zip and range
print({x:y for x,y in zip(range(1,len(lst)+1),lst)})
{1: 'a', 2: 'b', 3: 'c'}
I think the code below should help.
my_list = ['A', 'B', 'C', 'D']
my_dict = {}
for i in range(len(my_list)):
    # key is the 1-based position, value is the element at that position
    my_dict[i + 1] = my_list[i]
I have a dictionary like:
chars_dict = {'a' : 1, 'c': 2, 'e': 4, 'h': 3, 's': 1}
Simply this dictionary will have characters and their counts with minimum being 1 and maximum being dependent on the characters in the string.
Now, I want to check for existence of count 2 or greater without using for loop. To achieve this, I reversed the above dictionary. Now the dictionary becomes,
rev_chars_dict = {1: ['a', 's'], 2: 'c', 4: 'e', 3: 'h'}
But, how can I check for the existence of keys (here numbers 2 or greater than that) without using for loop? Is there a pythonic way of doing it?
I would like something like,
if >=2 in rev_chars_dict:
    return True
else:
    return False
Why not use a for loop? Is this a homework problem?
max(d.values()) >= 2
where d is the dictionary.
Find the keys whose count is greater than 1:
[key for key, value in chars_dict.items() if value > 1]
For a simple test:
len([key for key, value in chars_dict.items() if value > 1]) > 0
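Another option, sketched here as an alternative to the answers above, is the built-in any(), which stops at the first count that qualifies and also copes with an empty dictionary (where max() would raise ValueError). The name has_repeat is illustrative.

```python
chars_dict = {'a': 1, 'c': 2, 'e': 4, 'h': 3, 's': 1}

# True as soon as one count of 2 or more is found.
has_repeat = any(count >= 2 for count in chars_dict.values())
print(has_repeat)  # True

# With no entries at all, any() simply returns False.
print(any(count >= 2 for count in {}.values()))  # False
```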
I am writing a python program where I will be appending numbers into a list, but I don't want the numbers in the list to repeat. So how do I check if a number is already in the list before I do list.append()?
You could do
if item not in mylist:
    mylist.append(item)
But you should really use a set, like this :
myset = set()
myset.add(item)
EDIT: If order is important but your list is very big, you should probably use both a list and a set, like so:
mylist = []
myset = set()
for item in ...:
    if item not in myset:
        mylist.append(item)
        myset.add(item)
This way, you get fast lookup for element existence, but you keep your ordering. If you use the naive solution, every lookup is O(n), which can be slow if your list is big.
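The list-plus-set pattern can be wrapped in a small helper. This is a sketch only; the function name unique_in_order and the sample numbers are made up for the example.

```python
def unique_in_order(items):
    """Return the distinct items of an iterable, in first-seen order."""
    seen = set()   # fast membership test
    result = []    # preserves insertion order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order([3, 1, 3, 2, 1, 5]))  # [3, 1, 2, 5]
```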
Or, as @larsman pointed out, you can use OrderedDict to the same effect:
from collections import OrderedDict
mydict = OrderedDict()
for item in ...:
    mydict[item] = True
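When all the items are already available in a sequence, OrderedDict.fromkeys() builds the same structure in a single call. A minimal sketch, with made-up sample numbers:

```python
from collections import OrderedDict

numbers = [3, 1, 3, 2, 1, 5]

# fromkeys() keeps the first occurrence of each key, in insertion order.
unique_numbers = list(OrderedDict.fromkeys(numbers))
print(unique_numbers)  # [3, 1, 2, 5]
```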
If you want unique elements, why not use a set (provided, of course, that order does not matter to you):
>>> s = set()
>>> s.add(2)
>>> s.add(4)
>>> s.add(5)
>>> s.add(2)
>>> s
set([2, 4, 5])
If order is a matter of concern, then you can use: -
>>> def addUnique(l, num):
...     if num not in l:
...         l.append(num)
...
...     return l
You can also find an OrderedSet recipe, which is referred to in Python Documentation
If you want your numbers in ascending order you can add them into a set and then sort the set into an ascending list.
s = set()
if number1 not in s:
    s.add(number1)
if number2 not in s:
    s.add(number2)
...
s = sorted(s) #Now a list in ascending order
You could probably use a set object instead. Just add numbers to the set; a set never holds duplicates.
To check if a number is in a list one can use the in keyword.
Let's create a list
exampleList = [1, 2, 3, 4, 5]
Now let's see if it contains the number 4:
contains = 4 in exampleList
print(contains)
>>> True
As you want to append when an element is not in a list, the not in can also help
exampleList2 = ["a", "b", "c", "d", "e"]
notcontain = "e" not in exampleList2
print(notcontain)
>>> False
But, as others have mentioned, you may want to consider using a different data structure, more specifically, set. See examples below (Source):
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
>>> print(basket) # show that duplicates have been removed
{'orange', 'banana', 'pear', 'apple'}
'orange' in basket # fast membership testing
True
'crabgrass' in basket
False
# Demonstrate set operations on unique letters from two words
...
a = set('abracadabra')
b = set('alacazam')
a # unique letters in a
>>> {'a', 'r', 'b', 'c', 'd'}
a - b # letters in a but not in b
>>> {'r', 'd', 'b'}
a | b # letters in a or b or both
>>> {'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
a & b # letters in both a and b
>>> {'a', 'c'}
a ^ b # letters in a or b but not both
>>> {'r', 'd', 'b', 'm', 'z', 'l'}