I have the following program. I am trying to understand list comprehension and set comprehension:
mylist = [i for i in range(1,10)]
print(mylist)
clist = []
for i in mylist:
    if i % 2 == 0:
        clist.append(i)
clist2 = [x for x in mylist if (x%2 == 0)]
print('clist {} clist2 {}'.format(clist,clist2))
#set comprehension
word_list = ['apple','banana','mango','cucumber','doll']
myset = set()
for word in word_list:
    myset.add(word[0])
myset2 = {word[0] for word in word_list}
print('myset {} myset2 {}'.format(myset,myset2))
My question is: why are curly braces used in myset2 = {word[0] for word in word_list}?
I haven't worked with sets in detail before.
Curly braces are used for both dictionary and set comprehensions. Which one is created depends on whether you supply an associated value or not, as in the following (Python 3.4):
>>> a={x for x in range(3)}
>>> a
{0, 1, 2}
>>> type(a)
<class 'set'>
>>> a={x: x for x in range(3)}
>>> a
{0: 0, 1: 1, 2: 2}
>>> type(a)
<class 'dict'>
A set is an unordered, mutable collection of unique elements.
In Python you can use set() to build a set, for example (output shown as Python 2 prints it):
>>> set([1,1,2,3,3])
set([1, 2, 3])
>>> set([3,3,2,5,5])
set([2, 3, 5])
Or use a set comprehension, like a list comprehension but with curly braces:
>>> {x for x in [1,1,5,5,3,3]}
set([1, 3, 5])
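On Python 3 the same results print with brace notation rather than set([...]). A minimal sketch, reusing word_list from the question; the names first_letters, from_constructor and unique_numbers are only illustrative:

```python
word_list = ['apple', 'banana', 'mango', 'cucumber', 'doll']

# Set comprehension: curly braces around a single expression (no "key: value" pair).
first_letters = {word[0] for word in word_list}

# Equivalent construction with the set() constructor and a generator expression.
from_constructor = set(word[0] for word in word_list)

# Duplicates collapse: six inputs, three distinct values.
unique_numbers = {x for x in [1, 1, 5, 5, 3, 3]}

print(first_letters == from_constructor)  # True
print(sorted(unique_numbers))             # [1, 3, 5]
```

Sorting before printing is only there to make the output order predictable, since a set itself has no order.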
I have my program's output as a Python dictionary and I want a list of keys from dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list=s.split("_")
dictn={}
for i in range(len(r)):
    split_review=r[i].split("_")
    counter=0
    for good_word in good_list:
        if good_word in split_review:
            counter=counter+1
    d1={i:counter}
    dictn.update(d1)
print(dictn)
The conditions for ordering the keys:
Keys with the same value keep their original index order in the result list.
Keys with the highest values come first, followed by those with lower values.
dictn = {0: 1, 1: 1, 2: 2}
Expected output = [2,0,1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
You can use the built-in sorted function to order the dictionary's keys in any way you choose.
Check out the documentation, but in the simplest case you can pass the dictionary's .get method as the key, while for more complex orderings you'd define a key function yourself.
Dictionaries are insertion-ordered as of Python 3.7, so another option is to insert the items in sorted order when creating the dictionary, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest:
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
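For the "define a key function yourself" case, here is a small sketch under the assumption that ties should keep their insertion order. The names by_get and by_lambda are illustrative only:

```python
dictn = {0: 1, 1: 1, 2: 2}

# sorted() is stable, and reverse=True still keeps tied elements in their
# original relative order, so keys 0 and 1 (both with value 1) stay as inserted.
by_get = sorted(dictn, key=dictn.get, reverse=True)

# The same ordering with an explicit key function: negate the value to sort descending.
by_lambda = sorted(dictn, key=lambda k: -dictn[k])

print(by_get)     # [2, 0, 1]
print(by_lambda)  # [2, 0, 1]
```

Since sorted() already returns a list, wrapping it in a list comprehension is optional.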
I want to populate a dictionary with the counts of various items in a list, but only when the count exceeds a certain number. (This is in Python 2.7)
For example:
Given x = [2,3,4,2,3,5,6], if I only want numbers that appear twice or more, the output should be
d = {2: 2, 3: 2}.
I wanted to do this with a dictionary comprehension, for example
{(num if x.count(num) >= 2): x.count(num) for num in x}
But this throws an "invalid syntax" error, and it seems I need to set some default key, which means some key I don't want being added to the dictionary which I then have to remove.
What I'm doing now is in two lines:
d = {(num if x.count(num) >= 2 else None): x.count(num) for num in x}
d.pop(None, None)
But is there a way to do it in one, or to do the dictionary comprehension with an if statement without actually adding any default key for the else statement?
Use Counter to count each item in x, then use a dictionary comprehension to pull those values where the count is greater than or equal to your threshold (e.g. 2).
from collections import Counter
x = [2, 3, 4, 2, 3, 5, 6]
threshold = 2
c = Counter(x)
d = {k: v for k, v in c.iteritems() if v >= threshold}
>>> d
{2: 2, 3: 2}
That works:
{ i: x.count(i) for i in x if x.count(i) >= 2}
The if part must be after the for, not before, that's why you get the syntax error.
To avoid counting elements twice, and without any extra import, you could also use two nested comprehensions (the inner one is a generator expression, so no intermediate list is built):
>>> { j: n for j, n in ((i, x.count(i)) for i in x) if n >= 2}
{2: 2, 3: 2}
The test in your expression, (num if x.count(num) >= 2 else None), comes too late: by then you have already instructed the dict comprehension to emit an entry. You have to filter the element out beforehand.
Just move the condition from the ternary to the filter part of the comprehension:
x = [2,3,4,2,3,5,6]
d = {num: x.count(num) for num in x if x.count(num) >= 2}
That said, this method isn't very efficient, because it counts each element twice.
Filter a Counter instead:
import collections
d = {num:count for num,count in collections.Counter(x).items() if count>=2}
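To see the difference between a conditional expression in the key position and a trailing filter clause side by side, here is a short sketch written for Python 3 (so it uses .items() rather than .iteritems()). The names with_placeholder and filtered are illustrative:

```python
from collections import Counter

x = [2, 3, 4, 2, 3, 5, 6]

# A conditional expression in the key position only chooses WHICH key to emit;
# an entry is still produced for every element, hence the unwanted None key.
with_placeholder = {(num if x.count(num) >= 2 else None): x.count(num) for num in x}

# A trailing "if" clause filters elements before any entry is produced.
filtered = {num: count for num, count in Counter(x).items() if count >= 2}

print(None in with_placeholder)  # True
print(filtered)                  # {2: 2, 3: 2}
```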
This should work:
a = [1,2,2,2,3,4,4,5,6,2,2,2]
{n: a.count(n) for n in set(a) if a.count(n) >= 2}
{2: 6, 4: 2}
This should work:
Input:
a = [2,2,2,2,1,1,1,3,3,4]
Code:
x = { i : a.count(i) for i in a }
print(x)
Output:
>>> {2: 4, 1: 3, 3: 2, 4: 1}
I wrote this code to perform as a simple search engine in a list of strings like the example below:
mii(['hello world','hello','hello cat','hellolot of cats']) == {'hello': {0, 1, 2}, 'cat': {2}, 'of': {3}, 'world': {0}, 'cats': {3}, 'hellolot': {3}}
but I constantly get the error
'dict' object has no attribute 'add'
How can I fix it?
def mii(strlist):
    word={}
    index={}
    for str in strlist:
        for str2 in str.split():
            if str2 in word==False:
                word.add(str2)
                i={}
                for (n,m) in list(enumerate(strlist)):
                    k=m.split()
                    if str2 in k:
                        i.add(n)
                index.add(i)
    return { x:y for (x,y) in zip(word,index)}
In Python, when you initialize an object as word = {} you're creating a dict object and not a set object (which I assume is what you wanted). In order to create a set, use:
word = set()
You might have been confused by Python's Set Comprehension, e.g.:
myset = {e for e in [1, 2, 3, 1]}
which results in a set containing elements 1, 2 and 3. Similarly Dict Comprehension:
mydict = {k: v for k, v in [(1, 2)]}
results in a dictionary with key-value pair 1: 2.
x = [1, 2, 3] # is a literal that creates a list (mutable array).
x = [] # creates an empty list.
x = (1, 2, 3) # is a literal that creates a tuple (constant list).
x = () # creates an empty tuple.
x = {1, 2, 3} # is a literal that creates a set.
x = {} # confusingly creates an empty dictionary (hash table), NOT a set, because dictionaries were there first in Python.
Use
x = set() # to create an empty set.
Also note that
x = {"first": 1, "unordered": 2, "hash": 3} # is a literal that creates a dictionary, just to mix things up.
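The literal rules above can be checked directly. A minimal sketch, with variable names chosen only for illustration:

```python
empty_braces = {}
empty_set = set()
set_literal = {1, 2, 3}
dict_literal = {"first": 1}

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(type(set_literal).__name__)   # set
print(type(dict_literal).__name__)  # dict

# This is why the question's code fails: a dict has no .add() method.
print(hasattr(empty_braces, "add"))  # False
print(hasattr(empty_set, "add"))     # True
```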
I see lots of issues in your function:
In Python, {} is an empty dictionary, not a set. To create a set, use the built-in function set().
The condition if str2 in word==False: can never be True because of comparison chaining: it is evaluated as if str2 in word and word==False. An example of this behavior:
>>> 'a' in 'abcd'==False
False
>>> 'a' in 'abcd'==True
False
In the line for (n,m) in list(enumerate(strlist)), you do not need to convert the result of enumerate() to a list; you can iterate over its return value directly (it is an iterator).
Sets have no order, so when you call zip(word,index) there is no guarantee that the elements are paired the way you want.
Do not use str as a variable name.
Given this, you are better off building the dictionary directly from the start rather than using sets.
Code -
def mii(strlist):
    word={}
    for i, s in enumerate(strlist):
        for s2 in s.split():
            word.setdefault(s2,set()).add(i)
    return word
Demo -
>>> def mii(strlist):
...     word={}
...     for i, s in enumerate(strlist):
...         for s2 in s.split():
...             word.setdefault(s2,set()).add(i)
...     return word
...
>>> mii(['hello world','hello','hello cat','hellolot of cats'])
{'cats': {3}, 'world': {0}, 'cat': {2}, 'hello': {0, 1, 2}, 'hellolot': {3}, 'of': {3}}
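The comparison-chaining point in the list above is easy to verify. The sketch below reuses the question's names word and str2 and assumes word is an empty set; chained, expanded and idiomatic are illustrative names:

```python
word = set()
str2 = 'hello'

# Chained form: evaluated as (str2 in word) and (word == False).
chained = str2 in word == False

# What the chain expands to, written out explicitly.
expanded = (str2 in word) and (word == False)

# The idiomatic membership test.
idiomatic = str2 not in word

print(chained, expanded, idiomatic)  # False False True
```

Writing the test as not in avoids the chain entirely.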
def mii(strlist):
    word_list = {}
    for index, s in enumerate(strlist):
        for word in s.split():
            if word not in word_list:
                word_list[word] = [index]
            else:
                word_list[word].append(index)
    return word_list

print(mii(['hello world','hello','hello cat','hellolot of cats']))
Output:
{'of': [3], 'cat': [2], 'cats': [3], 'hellolot': [3], 'world': [0], 'hello': [0, 1, 2]}
I think this is what you wanted.
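If you want sets as values (as in the question's expected output), one further option, not taken from the answers above, is collections.defaultdict. This is only a sketch, and build_index is a hypothetical name:

```python
from collections import defaultdict

def build_index(strlist):
    # Missing keys are created on first access with an empty set as the value.
    index = defaultdict(set)
    for position, text in enumerate(strlist):
        for word in text.split():
            index[word].add(position)
    # Convert back to a plain dict so later lookups of unknown words raise KeyError.
    return dict(index)

result = build_index(['hello world', 'hello', 'hello cat', 'hellolot of cats'])
print(sorted(result['hello']))  # [0, 1, 2]
print(result['cat'])            # {2}
```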
I learned that I can use a dict comprehension in Python to pre-populate a dict:
bounds = {i:1 for i in range(4)}
However, if I try to add other elements to the dict, I get a syntax error:
# raises an error
bounds = {i:1 for i in range(4),5:2}
Is there another concise way to write a dict where most keys have the same value, with exceptions on the tiles?
It's not a set, but a dict.
You can do the following:
>>> x = {i:1 for i in range(4)}
>>> x.update({5:2})
>>> x
{0: 1, 1: 1, 2: 1, 3: 1, 5: 2}
You will, however, not be able to do:
>>> x = {i:1 for i in range(4)}.update({5:2})
>>> x is None
True
This is because update modifies the dict in place and returns None.
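If a single expression is preferred, dictionary unpacking is one possibility, assuming Python 3.5 or later. The sketch below is not part of the answer above, and bounds_alt is an illustrative name:

```python
# Dictionary unpacking (Python 3.5+) merges the comprehension with explicit entries.
bounds = {**{i: 1 for i in range(4)}, 5: 2}

# dict.fromkeys gives every key the same value; later entries override earlier ones.
bounds_alt = {**dict.fromkeys(range(4), 1), 5: 2}

print(bounds)                # {0: 1, 1: 1, 2: 1, 3: 1, 5: 2}
print(bounds == bounds_alt)  # True
```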
I am trying to create a list of lists based on hashes. That is, I want a list of lists of items that hash the same. Is this possible in a single-line comprehension?
Here is the simple code that works without comprehensions:
from collections import defaultdict

def list_of_lists(items):
    items_by_hash = defaultdict(list)
    for item in items:
        items_by_hash[hash(item)].append(item)
    return items_by_hash.values()
For example, let's say we have this simple hash function:
def hash(string):
    import __builtin__
    return __builtin__.hash(string) % 10
Then,
>>> l = ['sam', 'nick', 'nathan', 'mike']
>>> [hash(x) for x in l]
[4, 3, 2, 2]
>>>
>>> list_of_lists(l)
[['nathan', 'mike'], ['nick'], ['sam']]
Is there any way I could do this in a comprehension? I need to be able to reference the dictionary I'm building mid-comprehension, in order to append the next item to the list-value.
This is the best I've got, but it doesn't work:
>>> { hash(word) : [word] for word in l }.values()
[['mike'], ['nick'], ['sam']]
It obviously creates a new list every time which is not what I want. I want something like
{ hash(word) : __this__[hash(word)] + [word] for word in l }.values()
or
>>> dict([ (hash(word), word) for word in l ])
{2: 'mike', 3: 'nick', 4: 'sam'}
but this causes the same problem.
import itertools, operator
[[y[1] for y in x[1]]
 for x in itertools.groupby(sorted((hash(y), y) for y in items),
                            operator.itemgetter(0))]
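The groupby expression above can be unpacked into three steps. In this sketch, bucket is a hypothetical stand-in for the question's hash function (string length modulo 10), chosen so the result is the same on every run; the built-in hash of strings varies between runs on Python 3:

```python
import itertools
import operator

def bucket(word):
    # Hypothetical stand-in for the question's hash(): deterministic across runs.
    return len(word) % 10

items = ['sam', 'nick', 'nathan', 'mike']

# Step 1: pair each item with its bucket and sort, so equal buckets are adjacent.
pairs = sorted((bucket(item), item) for item in items)

# Step 2: groupby yields (key, group) for each run of consecutive equal keys.
groups = itertools.groupby(pairs, key=operator.itemgetter(0))

# Step 3: keep only the item from each (bucket, item) pair.
result = [[item for _, item in group] for _, group in groups]

print(result)  # [['sam'], ['mike', 'nick'], ['nathan']]
```

The sort is required because groupby only merges consecutive elements with equal keys. It also means items within a group come out in sorted order rather than input order.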