I have a list containing different data types, say numbers and strings:
foo = [5,2,'a',8,4,'b','y',9, 'd','e','g']
Let's say I want to find all consecutive strings in the list and group them together:
bar = [ ['a'],['b','y'],['d','e','g'] ]
How can I do this?
This is a wonderful opportunity to use groupby:
from itertools import groupby
foo = [5,2,'a',8,4,'b','y',9, 'd','e','g']
bar = [list(g) for k, g in groupby(foo, key=lambda x: isinstance(x, str)) if k]
which produces the desired:
[['a'], ['b', 'y'], ['d', 'e', 'g']]
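To see why the `if k` filter is enough, here is a small sketch that lists every run groupby produces before anything is filtered out. The names `runs`, `is_str` and `items` are only for illustration.

```python
from itertools import groupby

foo = [5, 2, 'a', 8, 4, 'b', 'y', 9, 'd', 'e', 'g']

# groupby yields one (key, group) pair per run of consecutive items
# that share the same key; here the key is True for strings.
runs = [(k, list(g)) for k, g in groupby(foo, key=lambda x: isinstance(x, str))]
for is_str, items in runs:
    print(is_str, items)

# Keeping only the runs whose key is True gives the grouped strings.
bar = [items for is_str, items in runs if is_str]
print(bar)
```

The runs alternate between non-strings (key False) and strings (key True), so dropping the False runs leaves exactly the consecutive string groups.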
Iterate through the list. If an element is of type str, append it to one_d_arr; otherwise, append one_d_arr to two_d_arr (provided one_d_arr is not empty) and reset one_d_arr. After the loop, append any remaining one_d_arr so that a trailing run of strings is not lost.
lst = [5,2,'a',8,4,'b','y',9, 'd','e','g', 3]
ind = 0
two_d_arr = []
one_d_arr = []
while(ind < len(lst)):
    cur_element = lst[ind]
    if(isinstance(cur_element, str) == True):
        one_d_arr.append(cur_element)
    else:
        if(len(one_d_arr) != 0):
            two_d_arr.append(one_d_arr)
        one_d_arr = []
    ind = ind+1
if(len(one_d_arr) != 0):
    two_d_arr.append(one_d_arr)
print(two_d_arr)
Without using any import, you can do it with a good old "for loop" iterating over the elements of the list. Here is code that works for any type you want, not only strings:
def group_list(a_list, a_type):
    res = []
    sublist = []
    for elem in a_list:
        if isinstance(elem, a_type):
            # Here the element is of type a_type: append it to a sublist
            sublist.append(elem)
        else:
            # Here the element is not of type a_type: append the sublist (if not empty) to the result list
            if sublist:
                res.append(sublist)
                sublist = []
    # If the last element of the list is of type a_type, the last sublist has not been appended: append it now
    if sublist:
        res.append(sublist)
    return res

foo = [5,2,'a',8,4,'b','y',9, 'd','e','g']
print(group_list(foo,str))
# [['a'], ['b', 'y'], ['d', 'e', 'g']]
I have the following list a with None values for which I want to make a "fill down".
a = [
['A','B','C','D'],
[None,None,2,None],
[None,1,None,None],
[None,None,8,None],
['W','R',5,'Q'],
['H','S','X','V'],
[None,None,None,7]
]
The expected output would be like this:
b = [
['A','B','C','D'],
['A','B',2,'D'],
['A',1,'C','D'],
['A','B',8,'D'],
['W','R',5,'Q'],
['H','S','X','V'],
['H','S','X',7]
]
I wrote the code below and it seems to work, but I was wondering whether there is a built-in method or a more direct way to do it. I know pandas offers something like this, but it requires converting to a DataFrame, and I want to keep working with lists: ideally updating a in place, or, if that is not possible, producing the output in a new list b. Thanks.
b = []
for z in a:
    if None in z:
        # temp is the most recent row that contained no None
        b.append([temp[i] if value is None else value for i, value in enumerate(z)])
    else:
        b.append(z)
        temp = z
You could use a list comprehension for this, but I'm not sure it adds a lot to your solution already.
b = [a[0]]
for a_row in a[1:]:
    # Compare against None explicitly so that falsy values such as 0 are kept
    b.append([i if i is not None else j for i,j in zip(a_row, b[-1])])
I'm not sure if it's by design, but in your example a number is never carried down to the next row. If you want to ensure that only letters are carried down, you can keep track of the letters last seen in each position. Assuming that the first row of a is always letters:
last_seen_letters = a[0]
b = []
for a_row in a:
    b.append(b_row := [i if i is not None else j for i,j in zip(a_row, last_seen_letters)])
    last_seen_letters = [i if isinstance(i, str) else j for i,j in zip(b_row, last_seen_letters)]
First, consider the process of "filling down" into a single row. We have two rows as input: the row above and the row below; we want to consider elements from the two lists pairwise. For each pair, our output is determined by simple logic - use the first value if the second value is None, and the second value otherwise:
def fill_down_new_cell(above, current):
    return above if current is None else current
which we then apply to each pair in the pairwise iteration:
def fill_down_new_row(above, current):
    return [fill_down_new_cell(a, c) for a, c in zip(above, current)]
Next we need to consider overlapping pairs of rows from our original list. Each time, we replace the contents of the "current" row with the fill_down_new_row result by slice-assigning it to the entire row. In this way, we update each row in place, which allows changes to propagate to the next iteration. So:
def fill_down_inplace(rows):
    for above, current in zip(rows, rows[1:]):
        current[:] = fill_down_new_row(above, current)
Let's test it:
>>> a = [
... ['A','B','C','D'],
... [None,None,2,None],
... [None,1,None,None],
... [None,None,8,None],
... ['W','R',5,'Q'],
... ['H','S','X','V'],
... [None,None,None,7]
... ]
>>> fill_down_inplace(a)
>>> import pprint
>>> pprint.pprint(a)
[['A', 'B', 'C', 'D'],
 ['A', 'B', 2, 'D'],
 ['A', 1, 2, 'D'],
 ['A', 1, 8, 'D'],
 ['W', 'R', 5, 'Q'],
 ['H', 'S', 'X', 'V'],
 ['H', 'S', 'X', 7]]
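If you would rather leave the input untouched, the same row-by-row idea can be sketched with itertools.accumulate, which feeds each filled row back in as the "above" row for the next one. This is only a sketch: `fill_down` is a name made up here, and the first row of the result is the same list object as the first input row.

```python
from itertools import accumulate

def fill_down_new_row(above, current):
    # Keep the current cell unless it is None; then take the cell above.
    return [a if c is None else c for a, c in zip(above, current)]

def fill_down(rows):
    # accumulate passes the previous result as `above` and the next
    # input row as `current`, so fills propagate down the table.
    return list(accumulate(rows, fill_down_new_row))

a = [
    ['A', 'B', 'C', 'D'],
    [None, None, 2, None],
    [None, 1, None, None],
    [None, None, 8, None],
    ['W', 'R', 5, 'Q'],
    ['H', 'S', 'X', 'V'],
    [None, None, None, 7],
]
b = fill_down(a)
for row in b:
    print(row)
```

Like the in-place version, this carries numbers down as well as letters, so the third row comes out as ['A', 1, 2, 'D'].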
I have 2 nested arrays
Test = [['c','d','b','t','j','n','k','s','p','t','k'],['l','u','y','r','c','b']]
Sample = [[1,0,1,1,2,0,3,4,0,0,4],[1,0,1,2,0,3]]
Wherever there is a 0 in the Sample array, I want to extract the corresponding letter from the Test array. Both arrays have the same lengths.
Output = [['d','n','p','t'],['u','c']]
This should work:
Test = [['c','d','b','t','j','n','k','s','p','t','k'],['l','u','y','r','c','b']]
Sample = [[1,0,1,1,2,0,3,4,0,0,4],[1,0,1,2,0,3]]
final_list = []
for j in range(len(Test)):
    sub_list = []
    for i in range(len(Test[j])):
        if Sample[j][i] == 0:
            sub_list.append(Test[j][i])
    final_list.append(sub_list)
Here final_list is your expected output.
import numpy as np
res = [list(np.array(a)[np.array(b) == 0]) for a,b in zip(Test, Sample)]
A for loop and zip() do all the work:
final_list = []
for x,y in zip(Test, Sample):
    _list=[] # Temp. list to append to
    for i,j in zip(x,y):
        if j == 0:
            _list.append(i)
    final_list.append(_list) # appending to final list to create list of list
    del _list # optional: the name is rebound to a fresh list on each iteration anyway
final_list
This seems like a job for zip() and list comprehensions:
result = [
    [t for t, s in zip(test, sample) if s == 0]
    for test, sample in zip(Test, Sample)
]
Result:
[['d', 'n', 'p', 't'], ['u', 'c']]
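As a variation on the zip() approach, the standard library's itertools.compress does the "keep the item where the selector is true" step directly. A minimal sketch, using the question's data and building the selectors with `s == 0`:

```python
from itertools import compress

Test = [['c','d','b','t','j','n','k','s','p','t','k'], ['l','u','y','r','c','b']]
Sample = [[1,0,1,1,2,0,3,4,0,0,4], [1,0,1,2,0,3]]

result = [
    # compress keeps each letter whose matching selector is truthy,
    # so the selector is True exactly where the sample value is 0.
    list(compress(test, (s == 0 for s in sample)))
    for test, sample in zip(Test, Sample)
]
print(result)
```

Whether this reads better than the nested comprehension is a matter of taste; both walk each pair of rows once.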
I have 2 lists:
mainlist=[['RD-12',12,'a'],['RD-13',45,'c'],['RD-15',50,'e']]
sublist=[['RD-12',67],['RD-15',65]]
I join both lists on their first element using the code below:
def combinelist(mainlist,sublist):
    dict1 = { e[0]:e[1:] for e in mainlist }
    for e in sublist:
        try:
            dict1[e[0]].extend(e[1:])
        except:
            pass
    result = [ [k] + v for k, v in dict1.items() ]
    return result
It results in the following:
[['RD-12',12,'a',67],['RD-13',45,'c'],['RD-15',50,'e',65]]
As there is no element for 'RD-13' in sublist, I want an empty string in that position.
The final output should be
[['RD-12',12,'a',67],['RD-13',45,'c'," "],['RD-15',50,'e',65]]
Please help me.
Your problem can be solved using a while loop to adjust the length of your sublists until it matches the length of the longest sublist by appending the wanted string.
for row in result:  # "row" instead of "list", to avoid shadowing the built-in
    while len(row) < max(len(l) for l in result):
        row.append(" ")
You could just go through the result list and check where the total number of your elements is 2 instead of 3.
for row in lists:  # "row" instead of "list", to avoid shadowing the built-in
    if len(row) == 2:
        row.append(" ")
UPDATE:
If there are more items in the sublist, just subtract the lists containing the 'keys' of your lists, and then add the desired string.
def combinelist(mainlist,sublist):
    dict1 = { e[0]:e[1:] for e in mainlist }
    list2 = [e[0] for e in sublist]
    for e in sublist:
        try:
            dict1[e[0]].extend(e[1:])
        except:
            pass
    for e in dict1.keys() - list2:
        dict1[e].append(" ")
    result = [[k] + v for k, v in dict1.items()]
    return result
You can try something like this:
mainlist=[['RD-12',12],['RD-13',45],['RD-15',50]]
sublist=[['RD-12',67],['RD-15',65]]
empty_val = ''
# Lists to dictionaries
maindict = dict(mainlist)
subdict = dict(sublist)
result = []
# go through all keys
for k in list(set(list(maindict.keys()) + list(subdict.keys()))):
    # pick the value from each key or a default alternative
    result.append([k, maindict.pop(k, empty_val), subdict.pop(k, empty_val)])
# sort by the key
result = sorted(result, key=lambda x: x[0])
You can set up your empty value to whatever you need.
UPDATE
Following the new conditions, it would look like this:
mainlist=[['RD-12',12,'a'], ['RD-13',45,'c'], ['RD-15',50,'e']]
sublist=[['RD-12',67], ['RD-15',65]]
maindict = {a:[b, c] for a, b, c in mainlist}
subdict = dict(sublist)
result = []
for k in list(set(list(maindict.keys()) + list(subdict.keys()))):
    result.append([k, ])
    result[-1].extend(maindict.pop(k, ' '))
    result[-1].append(subdict.pop(k, ' '))
# sorted() returns a new list, so assign it back
result = sorted(result, key=lambda x: x[0])
Another option is to convert the sublist to a dict, so items are easily and rapidly accessible.
sublist_dict = dict(sublist)
So you can do (it modifies the mainlist):
for i, e in enumerate(mainlist):
    mainlist[i].append(sublist_dict.get(e[0], ""))
#=> [['RD-12', 12, 'a', 67], ['RD-13', 45, 'c', ''], ['RD-15', 50, 'e', 65]]
Or a one liner list comprehension (it produces a new list):
[ e + [sublist_dict.get(e[0], "")] for e in mainlist ]
If you want to skip the missing element:
for i, e in enumerate(mainlist):
    data = sublist_dict.get(e[0])
    # Test against None so that falsy values such as 0 are still appended
    if data is not None:
        mainlist[i].append(data)
print(mainlist)
#=> [['RD-12', 12, 'a', 67], ['RD-13', 45, 'c'], ['RD-15', 50, 'e', 65]]
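Putting the dict lookup together with the " " placeholder from the question, here is a minimal sketch that returns a new list and leaves mainlist alone. The function name `combine_lists` and its `fill` parameter are made up for this example, and it assumes each key appears at most once in sublist.

```python
def combine_lists(mainlist, sublist, fill=" "):
    # Map each key to the remaining values of its sublist row.
    extras = {row[0]: row[1:] for row in sublist}
    # Build new rows; rows without a match get the fill value instead.
    return [row + extras.get(row[0], [fill]) for row in mainlist]

mainlist = [['RD-12', 12, 'a'], ['RD-13', 45, 'c'], ['RD-15', 50, 'e']]
sublist = [['RD-12', 67], ['RD-15', 65]]
print(combine_lists(mainlist, sublist))
```

Because the result is built by iterating over mainlist, the original row order is preserved without a separate sort.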
I have a list of lists with strings in it:
list = [["a","b"],["c","d"],["a", "e"],["f","d"],["x","y"]]
Now I want to merge all lists that share at least one item, like this:
grouped_list = [["a", "b", "e"],["c","d","f"],["x","y"]]
My code so far is this:
list = [["a","b"],["b","c"],["d","e"],["x","y"]]
clist = list.copy()
result = []
counter = 0
del_list = []
def oneofsame(L1, L2):
counter = 0
for i in L1:
for j in L2:
if i == j:
counter += 1
if counter == 0:
return False
else:
return True
for l in list:
try:
del clist[clist.index(l)]
except:
pass
result.append([])
for i in l:
for cl in clist:
if oneofsame(l, cl):
for j in l:
if j not in result[counter]:
result[counter].append(j)
for j in cl:
if j not in result[counter]:
result[counter].append(j)
del_list.append(cl)
else:
result.append(cl)
del_list.append(cl)
for j in del_list:
del clist[clist.index(j)]
del_list = []
counter += 1
del_list = []
cresult = result.copy()
for i in range(len(cresult)-1, 0, -1):
if cresult[i] == []:
del result[i]
print(result)
But this code doesn't merge all of my example input (I can't paste my example input, because it's sensitive data).
Here is a way to do it.
For each pair:
if we find a group that contains one of the values, we append the pair to the group
if we find a second group that contains the other value, we merge the groups.
if we found no matching group, then our pair constitutes a new one.
def group_equals(lst):
    groups = []
    for pair in lst:
        pair = set(pair)
        equals_found = 0
        for idx, group in enumerate(groups):
            if group.intersection(pair):
                equals_found += 1
                if equals_found == 1:
                    # We found a first group that contains one of our values,
                    # we can add our pair to the group
                    group.update(pair)
                    first_group = group
                elif equals_found == 2:
                    # We found a second group that contains the other one of
                    # our values, we merge it with the first one
                    first_group.update(group)
                    del groups[idx]
                    break
        # If none of our values was found, we create a new group
        if not equals_found:
            groups.append(pair)
    return [list(sorted(group)) for group in groups]

tests = [ [["a", "b"], ["c", "d"], ["b", "c"]], # all equal
          [["a","b"],["c","d"],["a", "e"],["f","d"]],
          [["a","b"],["c","d"],["a", "e"],["f","d"],["x","y"]]
        ]
for lst in tests:
    print(group_equals(lst))
# [['a', 'b', 'c', 'd']]
# [['a', 'b', 'e'], ['c', 'd', 'f']]
# [['a', 'b', 'e'], ['c', 'd', 'f'], ['x', 'y']]
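The same merging can also be sketched with a union-find (disjoint-set) structure, which is a different technique from the group scan above and also copes with inner lists of more than two items. The names `merge_groups`, `find` and `union` are made up for this example, and the output is sorted only to make it predictable.

```python
def merge_groups(lists):
    parent = {}  # maps each item to its parent; a root is its own parent

    def find(x):
        # Register x on first sight, then walk up to the root,
        # shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for items in lists:
        for item in items:
            find(item)  # make sure single-item lists are registered too
        for other in items[1:]:
            union(items[0], other)  # everything in one list shares a group

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())

print(merge_groups([["a","b"],["c","d"],["a","e"],["f","d"],["x","y"]]))
print(merge_groups([["a","b"],["c","d"],["b","c"]]))
```

Each item ends up attached to a single root, so two lists land in the same group whenever a chain of shared items connects them, regardless of the order in which the lists appear.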
I hope my below code will solve your problem:
import itertools
import copy
lista = [["a","b"],["c","d"],["a", "e"],["f","d"],["x","y"]] #[["a","b"],["e","d1"],["a", "e"],["a","d"],["d","y"]]
def grouped_list(lista):
    aa = []
    bbc = copy.deepcopy(lista)
    flag = False
    for a, b in itertools.combinations(lista,2):
        bb = a+b
        if len(set(bb)) < len(bb):
            flag = True
            cc = list(set(bb))
            cc.sort()
            if cc not in aa: aa.append(cc)
            if a in lista: lista.remove(a)
            if b in lista: lista.remove(b)
    if lista: aa = aa + lista
    if not flag: return bbc
    else: return grouped_list(aa)
print ("Grouped list -->", grouped_list(lista))
Feel free to ask/suggest anything in the above code.
I created a list, a set and a dict and now I want to remove certain items from them
N = [10**i for i in range(0,3)] #range(3,7) for 1000 to 1M
for i in N:
con_list = []
con_set = set()
con_dict = {}
for x in range (i): #this is the list
con_list.append(x)
print(con_list)
for x in range(i): #this is the set
con_set.add(x)
print(con_set)
for x in range(i): #this is the dict
con_dict = dict(zip(range(x), range(x)))
print(con_dict)
Items to remove:
n = min(10000, int(0.1 * len(con_list)))
indeces_to_delete = sorted(random.sample(range(i),n), reverse=True)
Now if I add this:
for a in indeces_to_delete:
del con_list[a]
print(con_list)
it doesn't work.
I need to do the same for a set and a dict.
Thanks!
You can use pop.
On a dict:
d = {'a': 'test', 'b': 'test2'}
Calling d.pop('b') removes the key/value pair for key 'b'.
On a list:
l = ['a', 'b', 'c']
Calling l.pop(2) removes the third element (list indices start at 0).
Beware with a set:
s = {'a', 'b', 'c'}
Calling s.pop() removes an arbitrary element, as discussed here: In python, is set.pop() deterministic?
Use s.discard('a') to remove the element 'a'.
More info here: https://docs.python.org/2/tutorial/datastructures.html
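Applied to the containers from the question, a minimal sketch could look like the following. It assumes a fixed `size` of 100 instead of the loop over N, and a fixed random seed so the run is repeatable; `indices_to_delete` and `idx` are illustrative names. Note that with the question's formula, a one-element list gives `int(0.1 * 1) == 0`, so nothing would be deleted at that size.

```python
import random

random.seed(0)  # assumption: fixed seed so the run is repeatable

size = 100  # assumption: one fixed size instead of looping over N
con_list = list(range(size))
con_set = set(range(size))
con_dict = dict(zip(range(size), range(size)))

n = min(10000, int(0.1 * len(con_list)))
# Delete from the highest index down, so earlier deletions
# do not shift the positions of items still to be removed.
indices_to_delete = sorted(random.sample(range(size), n), reverse=True)

for idx in indices_to_delete:
    del con_list[idx]        # list: remove by position
    con_set.discard(idx)     # set: remove by value, no error if absent
    con_dict.pop(idx, None)  # dict: remove by key, no error if absent

print(len(con_list), len(con_set), len(con_dict))
```

Because the list holds 0..size-1 in order, position idx and value idx coincide here, which is why the same number can be used for all three containers. With other data, the set and dict need the values or keys to remove rather than list positions.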