Replacing elements in list with values from dictionary - python

I want to replace the strings in a list with values from a dictionary. However, for a reason I cannot work out, the length of the list changes after the replacement.
genre_list = ['action, drama, thriller', 'crime, romance, adventure']
list_new = []
categories_ids = {'action': '18',
                  'drama': '13',
                  'thriller': '11',
                  'romance': '1',
                  'adventure': '8',
                  'crime': '3'
                  }
print(len(genre_list))  # length before
for z in genre_list:
    for a, b in categories_ids.items():
        if a in z:
            list_temp = z.replace(z, b)
            list_new.append(list_temp)
print(len(list_new)) # length after
What am I missing here? Thanks in advance.

You append to list_new once for every key of categories_ids that appears in an element of genre_list. The first three keys appear in the first element of genre_list and the other three keys appear in the second element, so list_new ends up with 6 elements in total.
Try this instead:
genre_list = ['action, drama, thriller', 'crime, romance, adventure']
list_new = []
categories_ids = {'action': '18',
                  'drama': '13',
                  'thriller': '11',
                  'romance': '1',
                  'adventure': '8',
                  'crime': '3'
                  }
for z in genre_list:
    for a, b in categories_ids.items():
        z = z.replace(a, b)
    list_new.append(z)  # here is the difference - one append per element in genre_list
print(list_new)  # output: ['18, 13, 11', '3, 1, 8']

Use:
def func(s):
    return ", ".join(categories_ids[w] for w in s.split(", "))
list_new = list(map(func, genre_list))
print(list_new)
This prints:
['18, 13, 11', '3, 1, 8']

You add a new element to list_new every time a key from your dict is found in a string from genre_list. Each string in genre_list contains several keys from your dict, so you end up with several elements in list_new per input string.
You can use a regular expression with a list comprehension:
import re
genre_list = ['action, drama, thriller', 'crime, romance, adventure']
pattern = '|'.join(categories_ids)
def replace(gr):
    return categories_ids[gr.group()]
list_new = [re.sub(pattern, replace, t) for t in genre_list]
# ['18, 13, 11', '3, 1, 8']
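A plain '|'.join pattern also matches keys that occur inside longer words. The following is a minimal sketch, not part of the original answer, that assumes whole-word matching is wanted; the helper name replace_genres is made up for illustration.

```python
import re

categories_ids = {'action': '18', 'drama': '13', 'thriller': '11',
                  'romance': '1', 'adventure': '8', 'crime': '3'}
genre_list = ['action, drama, thriller', 'crime, romance, adventure']

# Escape every key and anchor the alternation on word boundaries,
# so a key is only replaced when it appears as a whole word.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, categories_ids)) + r')\b')

def replace_genres(text):
    # Hypothetical helper: swap each matched genre name for its id.
    return pattern.sub(lambda m: categories_ids[m.group()], text)

list_new = [replace_genres(t) for t in genre_list]
print(list_new)  # ['18, 13, 11', '3, 1, 8']
```

With this pattern a word such as 'dramatic' is left untouched, whereas the unanchored pattern would rewrite its 'drama' prefix.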

Each list element contains more than one key, which is why you end up with more elements in the new list. This can be handled as shown in the code below.
for z in genre_list:
    key_words=''
    for key in z.split(','):
        if key.strip() in categories_ids:
            key_words += categories_ids[key.strip()] +','
    list_new.append(key_words[:-1])
Now both the lists will have same length as given below.
2 ['action, drama, thriller', 'crime, romance, adventure']
2 ['18,13,11', '3,1,8']

Related

how can I compare two lists in python and if I have matches ~> I want matches and next value from another list

a = ['bww', '1', '23', 'honda', '2', '55', 'ford', '11', '88', 'tesla', '15', '1', 'kia', '2', '3']
b = ['ford', 'honda']
It should return all matches and the value that follows each match in list a.
Result -> ['ford', '11', 'honda', '2']
or even better ['ford 11', 'honda 2']
I am new to Python and am asking for help.
Here is a neat one-liner to solve what you are looking for. It uses a list comprehension, which iterates over 2 items (bi-gram) of the list at once and then combines the matching items with their next item using .join()
[' '.join([i,j]) for i,j in zip(a,a[1:]) if i in b] #<------
['honda 2', 'ford 11']
EXPLANATION:
You can use zip(a, a[1:]) to iterate over two items of the list at a time (a bi-gram), as a rolling window of size 2.
Next, compare the first item i of each tuple (i, j) with the elements of list b, using if i in b.
If it matches, keep that tuple and use ' '.join([i,j]) to join the pair into one string, as you expect.
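To make the rolling window concrete, here is a small sketch that prints the first few pairs before filtering. The shortened list and the names pairs and wanted are illustrative assumptions, and converting b to a set is an optional tweak rather than part of the answer above.

```python
a = ['bmw', '1', '23', 'honda', '2', '55', 'ford', '11', '88']
b = ['ford', 'honda']

# zip(a, a[1:]) pairs every element with its successor.
pairs = list(zip(a, a[1:]))
print(pairs[:4])
# [('bmw', '1'), ('1', '23'), ('23', 'honda'), ('honda', '2')]

# A set makes the membership test cheap when b is large.
wanted = set(b)
matches = [f'{i} {j}' for i, j in pairs if i in wanted]
print(matches)  # ['honda 2', 'ford 11']
```

The results follow the order of a, not the order of b, which is why 'honda 2' comes first.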
Rather than changing the data to suit the code (which some responders seem to think is appropriate) try this:
GROUP = 3
a = ['bmw', '1', '23', 'honda', '2', '55', 'ford', '11', '88', 'tesla', '15', '1', 'kia', '2', '3']
b = ['ford', 'honda']
c = [f'{a[i]} {a[i+1]}' for i in range(0, len(a)-1, GROUP) if a[i] in b]
print(c)
Output:
['honda 2', 'ford 11']
Note:
The assumption here is that input data are presented in groups of three but only the first two values in each triplet are needed.
If the assumption about grouping is wrong then:
c = [f'{a[i]} {a[i+1]}' for i in range(len(a)-1) if a[i] in b]
...which will be less efficient
This assumes that all items are strings and that every name in list a is immediately followed by a number.
Code:-
a = ['bww', '1', 'honda', '2', 'ford', '11', 'tesla', '15', 'nissan', '2']
b = ['ford', 'honda']
res=[]
for check in b:
    for index in range(len(a)-1):
        if check==a[index]:
            res.append(check+" "+a[index+1])
print(res)
Output:-
['ford 11', 'honda 2']
List comprehension
Code:-
a = ['bww', '1', 'honda', '2', 'ford', '11', 'tesla', '15', 'nissan', '2']
b = ['ford', 'honda']
res=[check+" "+a[index+1] for check in b for index in range(len(a)-1) if check==a[index]]
print(res) #Same output
I hope this helps:
a = ['bww', 1, 'honda', 2, 'ford', 11, 'tesla', 15, 'nissan', 2]
b = ['ford', 'honda']
ls=[]
for item in b:
    if item in a:
        # a.index(item) finds the first occurrence; the next element is its value
        ls.append(item+" "+str(a[a.index(item)+1]))
print(ls)

Del list and next list element in list if string exist

I have an example:
list = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
for i in range(len(list)):
    if list[i][-1] == "last":
        del(list[i+1])
        del(list[i])
I'd like to delete the sublist whose last item is "last", together with the sublist that follows it.
This code fails every time. I tried different configurations, including replacing the list with a numpy array, but nothing helps.
Traceback:
IndexError: list index out of range
I want the final result to contain only ['3', '4', 'next'].
Please give me some tips or help on how to solve this.
Try this:
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
delete_next = False
to_ret = []
for x in l:
    if x[-1] == 'last':
        delete_next = True
    elif delete_next:
        delete_next = False
    else:
        to_ret.append(x)
This uses a variable to record whether the next element needs to be deleted.
Loop over the list: if the last element of the current sublist is 'last', skip it; otherwise, append the sublist to a new list.
Also, editing a list while iterating over it is not recommended, because the indexes shift as elements are removed, as mentioned in the comments.
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
newlist = []
for i in l:
    if i[-1] == 'last':
        continue
    else:
        newlist.append(i)
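The loop above drops only the sublist that ends in 'last'. To also drop the sublist that follows it, as the question asks, one option is to drive the loop with an explicit iterator and consume one extra element. This is a sketch under that assumption; the function name drop_last_and_next is made up for illustration.

```python
def drop_last_and_next(rows):
    """Return rows without any sublist ending in 'last' or its successor."""
    kept = []
    it = iter(rows)
    for row in it:
        if row and row[-1] == 'last':
            # Consume the following element too; the default of None
            # avoids StopIteration when 'last' is the final sublist.
            next(it, None)
            continue
        kept.append(row)
    return kept

l = [['2 a', 'nnn', 'xxxx', 'last'], ['next, next'], ['3', '4', 'next']]
print(drop_last_and_next(l))  # [['3', '4', 'next']]
```

Because a new list is built, the original is never modified during iteration, so no index can go out of range.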

Forming an array from items in list of lists

I am trying to create an array from data in a list of lists.
ac_name = 'ac'
dat = [['ab=55', 'ac=25', 'db =57', 'dc =44'],
['ab=75','ac =12', 'cg =11', 'pt =95'],
['ab=17', 'ac=62'],
['ab=97', 'aa=501', 'dc=12', 'dd=19']]
So I want to get a list that looks like this
ac = ['ac=25','ac=12','ac=62','']
and from this get
ac_values = [25,12,62,'']
All in all I want to convert dat into one large array.
I know the following doesn't work because it goes through every item, so the output has as many elements as there are items in dat.
ac = []
for d in dat:
    for c in d:
        if ac_name in c:
            ac.append(c)
        else:
            ac.append('')
As I mentioned in a comment, your else block is inside the nested loop, so an empty string is appended for every item that does not match. Instead, use a flag to record whether the if block ran in the inner loop, and append an empty string to the result only when it did not.
In [6]: ac = []
   ...: for d in dat:
   ...:     flag = True
   ...:     for c in d:
   ...:         if ac_name in c:
   ...:             ac.append(c)
   ...:             flag = False
   ...:     if flag:
   ...:         ac.append('')
   ...:
In [7]: ac
Out[7]: ['ac=25', 'ac =12', 'ac=62', '']
But since this is not a very Pythonic way of dealing with the problem, you can instead use a generator expression and the next() function, as follows, to build a dictionary of the expected result. In this form you can easily access the keys or the values as well.
In [19]: result = dict((ind, next((i for i in d if i.startswith(ac_name)), '=').split('=')[1]) for ind, d in enumerate(dat))
In [20]: result
Out[20]: {0: '25', 1: '12', 2: '62', 3: ''}
In [21]: result.keys() # shows number of sub-lists in your original list
Out[21]: dict_keys([0, 1, 2, 3])
In [22]: result.values()
Out[22]: dict_values(['25', '12', '62', ''])
ac_name = 'ac'
datas = [['ab=55', 'ac=25', 'db =57', 'dc =44'],
['ab=75','ac =12', 'cg =11', 'pt =95'],
['ab=17', 'ac=62'],
['ab=97', 'aa=501', 'dc=12', 'dd=19'],
['ab=55', 'ac=25', 'db =57', 'dc =44'],
['ab=75','ac =12', 'cg =11', 'pt =95'],
['ab=17', 'ac=62'],
['ab=97', 'aa=501', 'dc=12', 'dd=19']]
lst = []
for i,data in enumerate(datas):
    for d in data:
        if ac_name in d:
            lst.append(d.split('=')[-1])
    if i == len(lst):
        lst.append('')
print(lst)
Output
['25', '12', '62', '', '25', '12', '62', '']
You can use itertools.chain to flatten your list of lists. Then use a list comprehension to filter and split elements as required.
from itertools import chain
res = [int(i.split('=')[-1]) for i in chain.from_iterable(dat)
       if i.startswith('ac')]
print(res)
[25, 12, 62]
There are many ways to do this, as folks have shown. Here is one way using list comprehensions and higher-order functions (in Python 3, filter() returns a lazy iterator, so it is wrapped in list() before it is tested and indexed):
In [14]: ["" if not kv else kv[0].split('=')[-1].strip() for kv in [list(filter(lambda x: x.startswith(ac_name), xs)) for xs in datas]]
Out[14]: ['25', '12', '62', '']
If an exact key "ac" is desired, you can use regular expressions too:
import re
p = re.compile(ac_name + r'\s*=')
["" if not kv else kv[0].split('=')[-1].strip() for kv in [list(filter(lambda x: p.match(x), xs)) for xs in datas]]
After some puzzling, I found a possible solution.
Process each element in each sublist individually: if it contains 'ac', keep only the part after 'ac='. If not, return an empty string ''.
Then concatenate all elements of each sublist using str.join(). This returns a list of strings holding either the number string, e.g. '25', or an empty string.
Finally, convert each string to an integer if possible. Otherwise, return the (empty) string.
ac = [int(cell_string) if cell_string.isdigit() else cell_string for cell_string in
      [''.join([cell.split('=')[1] if ac_name in cell else '' for cell in row]) for row in dat]]
Output:
[25, 12, 62, '']
edit:
If you want to extend it to multiple column names, e.g.:
col_name = ['ac', 'dc']
Then just extend this:
cols = [[int(cell_string) if cell_string.isdigit() else cell_string for cell_string in
         [''.join([cell.split('=')[1] if name in cell else '' for cell in row]) for row in dat]] for name in col_name]
Output:
[[25, 12, 62, ''], [44, '', '', 12]]
Try this:
ac_name = 'ac'
ac = []
ac_values = []
for value in dat:
    found = False
    for item in value:
        if ac_name in item:
            ac.append(item)
            ac_values.append(item.split('=')[-1])
            found = True
    if not found:
        ac.append(' ')
        ac_values.append(' ')
print(ac)
print(ac_values)
Output:
['ac= 25', 'ac = 12', 'ac=62', ' ']
[' 25', ' 12', '62', ' ']
This works for an ac_name of any length:
ac_name = 'ac'
ac = []
ac_values=[]
for i in dat:
    found=False
    for j in i:
        # startswith() adapts to the length of ac_name (j[:2] did not)
        if j.startswith(ac_name):
            ac.append(j)
            # take the text after '=' so the result does not depend on spacing
            ac_values.append(int(j.split('=')[-1]))
            found=True
    if not found:
        ac.append("")
        ac_values.append("")
print(ac)
print(ac_values)
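Another way to look at the problem, offered here as a sketch rather than as any answerer's method, is to parse every row into a dictionary once and then look up whichever key is needed. The names parse_row and column are hypothetical.

```python
dat = [['ab=55', 'ac=25', 'db =57', 'dc =44'],
       ['ab=75', 'ac =12', 'cg =11', 'pt =95'],
       ['ab=17', 'ac=62'],
       ['ab=97', 'aa=501', 'dc=12', 'dd=19']]

def parse_row(row):
    # Split each 'key = value' cell on the first '=' and trim spaces.
    parsed = {}
    for cell in row:
        key, _, value = cell.partition('=')
        parsed[key.strip()] = value.strip()
    return parsed

def column(rows, name):
    # One entry per row: the integer value, or '' when the key is absent.
    return [int(r[name]) if name in r else '' for r in rows]

rows = [parse_row(r) for r in dat]
print(column(rows, 'ac'))  # [25, 12, 62, '']
print(column(rows, 'dc'))  # [44, '', '', 12]
```

Matching on the exact key avoids the substring problem of ac_name in c, which would also match a cell such as 'bac=1'. The sketch assumes every value is an integer string.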

Splitting list into dictionary based on value type

I have a list in which each item is a word followed by multiple numbers. Some numbers are separated by spaces, others by commas. I'm trying to make the word a key in a dictionary, split the numbers into a separate list, and store that list as the dictionary's value.
Input:
my_list = ['word 1234 123 1', 'word 123 43 564', 'somethingelse 123,4124,56', etc...]
Output:
my_dict = {'word': ['1234', '123', '1'],
'word': ['123', '43', '564'], 'somethingelse': ['123', '4124', '56']}
So far I have experimented with creating words to remove and also different regular expressions:
re.sub(r'([^\s\w]|_)+', '', str(my_list))
stopwords = ['word']
querywords = my_list.split()
resultwords = [word for word in querywords if word.lower() not in stopwords]
result = ' '.join(resultwords)
re.findall('\d+|\D+', my_list)
I wrote this method to split before certain words:
def removeBeforeX(self, x, listIn, listOut):
    for item in listIn:
        if x in item:
            a,b = item.split(x)
            listOut.append(b)
        else:
            listOut.append(item)
    return listOut
I don't necessarily need the word to be the key because keys must be unique in dictionaries, but I do need to separate the numbers but keep them together in a list.
What's the problem?
>>> import re
>>> my_dict={}
>>> my_list = ['word1 1234 123 1', 'word2 123 43 564', 'somethingelse 123,4124,56']
>>> for i in my_list:
...     parts = re.split(r'[, ]', i)
...     my_dict[parts[0]] = parts[1:]
...
>>> my_dict
{'word2': ['123', '43', '564'], 'somethingelse': ['123', '4124', '56'], 'word1': ['1234', '123', '1']}
Obviously, you can't have the same key twice, otherwise the value will be overwritten.
A way without the re module:
>>> for i in my_list:
...     parts = i.replace(',', ' ').split()
...     my_dict[parts[0]] = parts[1:]
...
EDITED
OUTPUT A DICTIONARY
I would use the string's partition method combined with re.findall:
import re
my_dict = {}
my_list = ['wordA 1234 123 1', 'wordB 123 43 564', 'somethingelse 123,4124,56']
for item in my_list:
    # the key is the text before the first space; the value is every run of digits
    my_dict[item.partition(' ')[0]] = re.findall(r'\d+', item)
print(my_dict)
The above code will result in the following dictionary
{'wordA': ['1234', '123', '1'], 'wordB': ['123', '43', '564'], 'somethingelse': ['123', '4124', '56']}
If you expect the same word to appear more than once, you will need to rewrite the loop to check whether the key already exists and append the new values to the existing list.
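One way to handle repeated words, sketched here with collections.defaultdict as an assumption about what is wanted, is to collect one list of numbers per input item under the shared key. The name grouped is illustrative.

```python
import re
from collections import defaultdict

my_list = ['word 1234 123 1', 'word 123 43 564', 'somethingelse 123,4124,56']

# Each key maps to a list of number lists, one per input item,
# so repeated words no longer overwrite each other.
grouped = defaultdict(list)
for item in my_list:
    word, _, rest = item.partition(' ')
    grouped[word].append(re.findall(r'\d+', rest))

print(dict(grouped))
# {'word': [['1234', '123', '1'], ['123', '43', '564']],
#  'somethingelse': [['123', '4124', '56']]}
```

Using append keeps the numbers of each item together; replacing it with extend would flatten them into a single list per word.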
OUTPUT A LIST OF LISTS
If a dictionary is not needed, then I would use re and a list comprehension:
my_newlist = [re.findall(r'\d+', item) for item in my_list]
print(my_newlist)
The above code will result in the following list of lists:
[['1234', '123', '1'], ['123', '43', '564'], ['123', '4124', '56']]

Python looping combinations of 8 objects into 3 groups, 3-3-2

Let's say I have a list of 8 objects, numbered 1-8.
The objects are put into three boxes: 3 in one box, 3 in another box, and 2 in the last box. Mathematically, there are 8C3*5C3=560 ways to do this. I want to loop through these 560 arrangements.
Is there any way in Python to do so?
The result should look like this:
list=['12','345','678'], ['12','346','578'], ..., etc.
Note that ['12','345','678'] and ['12','354','876'] are considered the same for this purpose.
I want to write a for-loop over this list.
Here is the solution I have, but it seems ugly.
import itertools
for c1,c2 in itertools.combinations(range(8),2):
    l2=list(range(8))
    l2.pop(c2)
    l2.pop(c1)
    for c3,c4,c5 in itertools.combinations(l2,3):
        l3=l2[:]
        l3.remove(c5)
        l3.remove(c4)
        l3.remove(c3)
        c6,c7,c8=l3
        print(c1,c2,c3,c4,c5,c6,c7,c8)
from itertools import combinations
def F(seq, parts, indexes=None, res=[], cur=0):
    if indexes is None: # indexes to use for combinations
        indexes = range(len(seq))
    if cur >= len(parts): # base case
        yield [[seq[i] for i in g] for g in res]
        return
    for x in combinations(indexes, r=parts[cur]):
        set_x = set(x)
        new_indexes = [i for i in indexes if i not in set_x]
        for comb in F(seq, parts, new_indexes, res=res + [x], cur=cur + 1):
            yield comb
it = F('12345678', parts=(2,3,3))
for i in range(10):
    print([''.join(g) for g in next(it)])
['12', '345', '678']
['12', '346', '578']
['12', '347', '568']
['12', '348', '567']
['12', '356', '478']
['12', '357', '468']
['12', '358', '467']
['12', '367', '458']
['12', '368', '457']
['12', '378', '456']
Another example:
for c in F('1234', parts=(2,2)):
    print([''.join(g) for g in c])
['12', '34']
['13', '24']
['14', '23']
['23', '14']
['24', '13']
['34', '12']
You could just permute all 8 values (as shown in previous answers), using the permutation generator from an earlier answer, which is reproduced in the code below.
Then store each arrangement as a tuple of tuples so that it can be hashed and de-duplicated. For that, you have to sort the members of each group, so that equivalent arrangements compare equal.
def all_perms(elements):
    if len(elements) <=1:
        yield elements
    else:
        for perm in all_perms(elements[1:]):
            for i in range(len(elements)):
                #nb elements[0:1] works in both string and list contexts
                yield perm[:i] + elements[0:1] + perm[i:]
v = [1,2,3,4,5,6,7,8]
a = {}
for i in all_perms(v):
    # sort inside each group so that reordered groups produce the same key
    k = (tuple(sorted([i[0],i[1]])) , tuple(sorted([i[2],i[3],i[4]])) , tuple(sorted([i[5],i[6],i[7]])))
    if k not in a:
        a[k] = [str(i[0])+str(i[1]), str(i[2])+str(i[3])+str(i[4]), str(i[5])+str(i[6]) + str(i[7])]
x = 0
for i in a.values():
    print(x, i)
    x+=1
For your example on 8 values, this gives 560 combinations.
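The same 560 arrangements can also be generated directly, without enumerating all 8! permutations first. This is a minimal sketch that assumes the items are distinct; the generator name split_2_3_3 is made up for illustration.

```python
from itertools import combinations

def split_2_3_3(items):
    """Yield (pair, triple, triple) groupings of 8 distinct items."""
    items = list(items)
    for pair in combinations(items, 2):
        rest = [x for x in items if x not in pair]
        for first3 in combinations(rest, 3):
            # Whatever is left after both picks forms the last group.
            last3 = tuple(x for x in rest if x not in first3)
            yield pair, first3, last3

groups = list(split_2_3_3('12345678'))
print(len(groups))                         # 560
print([''.join(g) for g in groups[0]])     # ['12', '345', '678']
```

There are 8C2 = 28 ways to choose the pair and 6C3 = 20 ways to choose the first triple, which gives 28 * 20 = 560 without any de-duplication step.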
l would be a list of eight objects, in this example strings:
l = ["O1","02","03","04","04","06","07","08"]
for group in [l[:3],l[3:6],l[6:]]: #get 3 slices of the list into 3's and a 2
    print(group)
Produces:
>>>
['O1', '02', '03']
['04', '04', '06']
['07','08']
