Is there a way to change this dictionary output? - python

I want to change the output of my dictionary. This is the initial code:
bow=[[i for i in all_docs[j] if i not in stopwords] for j in range(n_docs)]
bow=list(filter(None,bow))
bow
Here is bow output:
[['lunar',
'satellite',
'needs'],
['glad',
'see',
'griffin'] ]
worddict_two = [ (i,key) for i,key in enumerate(bow)]
worddict_two
From this output :
[(0,
['lunar',
'satellite',
'needs']),
(1,
['glad',
'see',
'griffin'])]
to this output:
[(0,'lunar satellite needs'),
(1,'glad see griffin')]

worddict_two = [ (i, " ".join(key)) for i,key in enumerate(bow)]
This would work. Use join to combine all the items of each list into a single string, with a space as the separator.

You can just join the list with spaces like so
worddict_two = [ (i,' '.join(key)) for i,key in enumerate(bow)]
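A self-contained sketch of the join approach from the answers above. It assumes `bow` holds the two sample lists shown in the question; the variable names are the question's own.

```python
# Sample data copied from the question's bow output
bow = [['lunar', 'satellite', 'needs'], ['glad', 'see', 'griffin']]

# Pair each index with its words joined into one space-separated string
worddict_two = [(i, ' '.join(words)) for i, words in enumerate(bow)]
print(worddict_two)
```

Each inner list is collapsed into one string, so every tuple has exactly two elements regardless of how many words the list held.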

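You can also unpack the words into the tuple, although this gives one tuple element per word rather than a single joined string: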
bow = [
['lunar','satellite','needs'],
['glad','see','griffin']
]
res = [(i,*key) for i,key in enumerate(bow)]
print(res)

Try this:
word_three = [(item[0], ' '.join(item[1])) for item in worddict_two]

Related

Convert a list of bigram tuples to a list of strings

I am trying to create Bigram tokens of sentences.
I have a list of tuples such as
tuples = [('hello', 'my'), ('my', 'name'), ('name', 'is'), ('is', 'bob')]
and I was wondering if there is a way to convert it to a list using Python, so it would look like this:
list = ['hello my', 'my name', 'name is', 'is bob']
thank you
Try this snippet:
list = [' '.join(x) for x in tuples]
join is a string method that concatenates all the items of a list (or tuple), using the string it is called on (here ' ') as the separator.
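As a rough illustration of how the separator works, the sketch below reuses the question's `tuples` data; the names `spaced`, `dashed` and `numbers` are made up for this example.

```python
tuples = [('hello', 'my'), ('my', 'name'), ('name', 'is'), ('is', 'bob')]

# The string that join is called on becomes the separator
spaced = [' '.join(pair) for pair in tuples]
dashed = ['-'.join(pair) for pair in tuples]

# join only accepts strings, so convert other values first
numbers = ' '.join(str(n) for n in (1, 2, 3))

print(spaced)
print(dashed)
print(numbers)
```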
Try this
list = [ '{0} {1}'.format(t[0],t[1]) for t in tuples ]
In general if you want to use both values of a tuple, you can use something like this:
my_list = []
for (first, second) in tuples:
    my_list.append(first + ' ' + second)
In this case
my_list = [' '.join(x) for x in tuples]
should be fine
tuples = [('hello', 'my'), ('my', 'name'), ('name', 'is'), ('is', 'bob')]
result = [k + " " + v for k, v in tuples]
print(result)
output:
['hello my', 'my name', 'name is', 'is bob']

How can I create a new list of sublists?

I am currently trying to do some exercises in Python, and I don't quite understand how to take a list of strings called s and build a new list of sublists called r.
So if I have an input
s = [ 'It is', 'time', 'for', 'tea' ]
the output list r should contain:
[ [0,'It is'], [1,'time'], [2,'for'], [3,'tea'] ]
Can someone please help me understand and get the answer?
I have tried this, but it does not give the answer that I want.
def sub_lists(list1):
    # store all the sublists
    sublist = [[]]
    # first loop
    for i in range(len(list1) + 1):
        # second loop
        for j in range(i + 1, len(list1) + 1):
            # slice the subarray
            sub = list1[i:j]
            sublist.append(sub)
    return sublist
# driver code
s = [ 'It is', 'time', 'for', 'tea' ]
print(sub_lists(s))
You can use enumerate for this:
s = [ 'It is', 'time', 'for', 'tea' ]
r = [[index, value] for index, value in enumerate(s)]
print(r)
Output:
[[0, 'It is'], [1, 'time'], [2, 'for'], [3, 'tea']]
You can use enumerate():
s = [ 'It is', 'time', 'for', 'tea' ]
s = [[i,v] for i,v in enumerate(s)]
print(s)
Or, if you're not ready for that, you can use indexing:
s = [ 'It is', 'time', 'for', 'tea' ]
s = [[i,s[i]] for i in range(len(s))]
print(s)
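To make the enumerate approach a little more concrete, here is a small sketch. It assumes the same sample list `s`; `r_from_one` is a name invented for this example to show the optional `start` argument.

```python
s = ['It is', 'time', 'for', 'tea']

# enumerate yields (index, value) tuples; list() turns each one into a sublist
r = [list(pair) for pair in enumerate(s)]

# The optional start argument changes the first index
r_from_one = [list(pair) for pair in enumerate(s, start=1)]

print(r)
print(r_from_one)
```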

I want to make one list from a list of lists in Python, with a condition (expected output shown)

I have a list of lists, and I want to merge each sublist into one string with '' quotes:
associated_values=[['chennai'], ['printer', 'pc', 'notebook']]
I want this output:
["chennai","'printer','pc','notebook'"]
This code is not working. I want the two lists as two comma-separated string values, the same as the required output.
for i in associated_values:
    s=''
    newlist.append(str(s.join(i)))
This works:
associated_values=[['chennai'], ['printer', 'pc', 'notebook']]
newlist = []
for i in associated_values:
    if len(i) == 1:
        newlist.append("'"+str(i[0]+"'"))
    else:
        s = ''
        for item in i:
            if item != i[0]:
                s += ' ,' + "'"+str(item)+ "'"
            else:
                s += "'"+str(item)+"'"
        newlist.append(s)
print(newlist)
Output
["'chennai'", "'printer' ,'pc' ,'notebook'"]
I hope this is what you want.
The following should give you what you want:
newlist = []
for e in associated_values:
    newlist.append(str(e)[1:-1])
You can solve your problem using this approach:
associated_values=[['chennai'], ['printer', 'pc', 'notebook']]
result = list(map(lambda x: ','.join(map(lambda y: "'" + y + "'" if len(x) > 1 else y, x)), associated_values))
print(result)
# ['chennai', "'printer','pc','notebook'"]
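The nested lambdas above can be hard to read. A more explicit sketch of the same idea follows; `quote_group` is a hypothetical helper name introduced only for this example, and it assumes every item is already a string.

```python
def quote_group(values):
    """Join values with commas, quoting each one only when there are several."""
    if len(values) == 1:
        # A single value is returned as-is, without quotes
        return values[0]
    return ','.join("'" + v + "'" for v in values)

associated_values = [['chennai'], ['printer', 'pc', 'notebook']]
result = [quote_group(group) for group in associated_values]
print(result)
```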
If you want the output as a single flat list, you can use this:
associated_values=[['chennai'], ['printer', 'pc', 'notebook']]
newlist = []
for i in associated_values:
    for item in i:
        newlist.append(item)
print(newlist)
# Output : ['chennai', 'printer', 'pc', 'notebook']
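For flattening, the standard library offers an alternative to the nested loop. This is only a sketch of that alternative, using `itertools.chain.from_iterable`; the name `flat` is made up for the example.

```python
from itertools import chain

associated_values = [['chennai'], ['printer', 'pc', 'notebook']]

# chain.from_iterable walks each sublist in turn and yields its items
flat = list(chain.from_iterable(associated_values))
print(flat)
```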

Remove a substr from string items in a list

list = [ 'u'adc', 'u'toto', 'u'tomato', ...]
What I want is to end up with a list of the kind:
list2 = [ 'adc', 'toto', 'tomato'... ]
Can you please tell me how to do that without using regex?
I'm trying:
for item in list:
    list.extend(str(item).replace("u'",''))
    list.remove(item)
but this ends up giving something of the form [ 'a', 'd', 'd', 'm'...]
In the list I may have an arbitrary number of strings.
In Python 2, you can encode it to "utf-8" like this:
list_a=[ u'adc', u'toto', u'tomato']
list_b=list()
for i in list_a:
    list_b.append(i.encode("utf-8"))
list_b
output:
['adc', 'toto', 'tomato']
Or you can use str function:
list_c = list()
for i in list_a:
    list_c.append(str(i))
list_c
Output:
['adc', 'toto', 'tomato']
Use "u\'"
For example:
l = [ "u'adc", "u'toto", "u'tomato"]
for item in l:
    print(item.replace("u\'", ""))
Will output:
adc
toto
tomato
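If the strings really do start with the two literal characters `u'`, a variant that builds a new list and only touches the start of each string might look like the sketch below. `strip_prefix` is a hypothetical helper written for this example, not something from the question.

```python
def strip_prefix(text, prefix="u'"):
    """Return text without the leading prefix, if it has one."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

items = ["u'adc", "u'toto", "u'tomato"]

# Build a new list instead of modifying the one being iterated over
cleaned = [strip_prefix(item) for item in items]
print(cleaned)
```

Unlike `replace`, this leaves any later occurrence of `u'` inside a string untouched.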
I tried the code from your question, but it raises a syntax error, which means the strings in the list are not declared properly. I have corrected that at line #2 below.
In [1]: list = [ 'u'adc', 'u'toto', 'u'tomato']
File "<ipython-input-1-2c6e581e868e>", line 1
list = [ 'u'adc', 'u'toto', 'u'tomato']
^
SyntaxError: invalid syntax
In [2]: list = [ u'adc', u'toto', u'tomato']
In [3]: list = [ str(item) for item in list ]
In [4]: list
Out[4]: ['adc', 'toto', 'tomato']
In [5]:
Solution-1
input_list = [ u'adc', u'toto', u'tomato']
output_list = list(map(str, input_list))
print(output_list)
And the output looks like:
['adc', 'toto', 'tomato']
Solution-2
input_list = [ u'adc', u'toto', u'tomato']
# Python 2 only: in Python 3, encode() returns bytes objects instead
output_list = list(map(lambda x: x.encode("utf-8"), input_list))
print(output_list)
And the output looks like:
['adc', 'toto', 'tomato']
Try this:
list2 = []
for item in list:
    # strings are immutable, so build a new string instead of assigning to item[x]
    list2.append(item.replace('u', ''))
This goes through every string in the list and removes each 'u' character by replacing it with an empty string. Note that it also removes any 'u' that is part of the word itself. Some more code could allow this to check for combinations of letters ('abc', etc.).
Each item in your list can be treated as a JSON string. Dump each item with json.dumps to get the desired output.
Since the dumped strings come wrapped in double quotes, you need to strip them from the beginning and the end.
import json
list = [ u'adc', u'toto', u'tomato']
print([json.dumps(i).strip('\"') for i in list])
Output:
['adc', 'toto', 'tomato']
Hope it helps!

Find the words whose characters all match those of other words in Python

For example, umbellar and umbrella use the same letters, so they count as equal words.
Input = ["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu","eyra","egma","game","leam","amel","year","meal","yare","gun","alme","ung","male","lame","mela","mage" ]
so output should be :
output=[
["umbellar","umbrella"],
["ago","goa"],
["aery","ayre","eyra","yare","year"],
["alem","alme","amel","lame","leam","male","meal","mela"],
["gnu","gun","ung"],
["egma","game","mage"],
]
from itertools import groupby
def group_words(word_list):
    sorted_words = sorted(word_list, key=sorted)
    grouped_words = groupby(sorted_words, sorted)
    for key, words in grouped_words:
        group = list(words)
        if len(group) > 1:
            yield group
Example:
>>> group_words(["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu","eyra","egma","game","leam","amel","year","meal","yare","gun","alme","ung","male","lame","mela","mage" ])
<generator object group_words at 0x0297B5F8>
>>> list(_)
[['umbellar', 'umbrella'], ['egma', 'game', 'mage'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['aery', 'ayre', 'eyra', 'year', 'yare'], ['goa', 'ago'], ['gnu', 'gun', 'ung']]
They're not equal words, they're anagrams.
Anagrams can be found by sorting by character:
sorted('umbellar') == sorted('umbrella')
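Comparing sorted characters is one way to test for anagrams. As an alternative sketch (not the approach used in this answer), `collections.Counter` compares letter counts without sorting; `is_anagram` is a hypothetical helper name.

```python
from collections import Counter

def is_anagram(a, b):
    """Return True when both words contain the same letters the same number of times."""
    return Counter(a) == Counter(b)

print(is_anagram('umbellar', 'umbrella'))
print(is_anagram('goa', 'gun'))
```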
collections.defaultdict comes in handy:
from collections import defaultdict
input = ["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu",
"eyra","egma","game","leam","amel","year","meal","yare","gun",
"alme","ung","male","lame","mela","mage" ]
D = defaultdict(list)
for i in input:
    key = ''.join(sorted(i))
    D[key].append(i)
output = list(D.values())
And output is (the order of the groups may vary with the Python version) [['umbellar', 'umbrella'], ['goa', 'ago'], ['gnu', 'gun', 'ung'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['egma', 'game', 'mage'], ['aery', 'ayre', 'eyra', 'year', 'yare']]
As others point out, you're looking for all the groups of anagrams in your list of words. Here is a possible solution. This algorithm looks for candidates and selects one (the first element) as the canonical word. It then deletes the rest from the possible words: because the anagram relation is transitive, once you find that a word belongs to an anagram group you don't need to compute it again.
input = ["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu",
"eyra","egma","game","leam","amel","year","meal","yare","gun",
"alme","ung","male","lame","mela","mage" ]
res = dict()
for word in input:
    res[word] = [word]
for word in input:
    # the len test is just to avoid sorting and comparing words of different len
    candidates = [x for x in res
                  if len(x) == len(word) and sorted(x) == sorted(word)]
    if candidates:
        canonical = candidates[0]
        for c in candidates[1:]:
            # we delete all candidates except the canonical one
            del res[c]
            # we add the others to the canonical member
            res[canonical].append(c)
print(list(res.values()))
This algorithm outputs (the order of groups and words may vary with the Python version):
[['year', 'ayre', 'aery', 'yare', 'eyra'], ['umbellar', 'umbrella'],
['lame', 'leam', 'mela', 'amel', 'alme', 'alem', 'male', 'meal'],
['goa', 'ago'], ['game', 'mage', 'egma'], ['gnu', 'gun', 'ung']]
Shang's answer is right, but I was challenged to do the same thing without using groupby().
Here it is.
Adding print statements will help you debug the code and see the output at runtime.
def group_words(word_list):
    global new_list
    list1 = []
    _list0 = []
    _list1 = []
    new_list = []
    for elm in word_list:
        list_elm = list(elm)
        list1.append(list(list_elm))
    for ee in list1:
        ee = sorted(ee)
        ee = ''.join(ee)
        _list1.append(ee)
    _list1 = list(set(_list1))
    for _e1 in _list1:
        for e0 in word_list:
            if len(e0) == len(_e1):
                list_e0 = ''.join(sorted(e0))
                if _e1 == list_e0:
                    _list0.append(e0)
                    _list0 = list(_list0)
        new_list.append(_list0)
        _list0 = []
    return new_list
and output is
[['umbellar', 'umbrella'], ['goa', 'ago'], ['gnu', 'gun', 'ung'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['egma', 'game', 'mage'], ['aery', 'ayre', 'eyra', 'year', 'yare']]
