Finding index of values in a list dynamically - python

I am having two lists as follows:
list_1
['A-1','A-1','A-1','A-2','A-2','A-3']
list_2
['iPad','iPod','iPhone','Windows','X-box','Kindle']
I would like to split the list_2 based on the index values in list_1. For instance,
list_a1
['iPad','iPod','iPhone']
list_a2
['Windows','X-box']
list_a3
['Kindle']
I know index method, but it needs the value to be matched to be passed along with. In this case, I would like to dynamically find the indexes of the values in list_1 with the same value. Is this possible? Any tips/hints would be deeply appreciated.
Thanks.

There are a few ways to do this.
I'd do it by using zip and groupby.
First:
>>> list(zip(list_1, list_2))
[('A-1', 'iPad'),
('A-1', 'iPod'),
('A-1', 'iPhone'),
('A-2', 'Windows'),
('A-2', 'X-box'),
('A-3', 'Kindle')]
Now:
>>> import itertools, operator
>>> [(key, list(group)) for key, group in
... itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('A-1', [('A-1', 'iPad'), ('A-1', 'iPod'), ('A-1', 'iPhone')]),
('A-2', [('A-2', 'Windows'), ('A-2', 'X-box')]),
('A-3', [('A-3', 'Kindle')])]
So, you just want each group, ignoring the key, and you only want the second element of each element in the group. You can get the second element of each group with another comprehension, or just by unzipping:
>>> [list(zip(*group))[1] for key, group in
... itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('iPad', 'iPod', 'iPhone'), ('Windows', 'X-box'), ('Kindle',)]
I would personally find this more readable as a sequence of separate iterator transformations than as one long expression. Taken to the extreme:
>>> ziplists = zip(list_1, list_2)
>>> pairs = itertools.groupby(ziplists, operator.itemgetter(0))
>>> groups = (group for key, group in pairs)
>>> values = (zip(*group)[1] for group in groups)
>>> [list(value) for value in values]
… but a happy medium of maybe 2 or 3 lines is usually better than either extreme.

Usually I'm the one rushing to a groupby solution ;^) but here I'll go the other way and manually insert into an OrderedDict:
list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
from collections import OrderedDict
d = OrderedDict()
for code, product in zip(list_1, list_2):
d.setdefault(code, []).append(product)
produces a d looking like
>>> d
OrderedDict([('A-1', ['iPad', 'iPod', 'iPhone']),
('A-2', ['Windows', 'X-box']), ('A-3', ['Kindle'])])
with easy access:
>>> d["A-2"]
['Windows', 'X-box']
and we can get the list-of-lists in list_1 order using .values():
>>> d.values()
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]
If you've noticed that no one is telling you how to make a bunch of independent lists with names like list_a1 and so on-- that's because that's a bad idea. You want to keep the data together in something which you can (at a minimum) iterate over easily, and both dictionaries and list of lists qualify.

Maybe something like this?
#!/usr/local/cpython-3.3/bin/python
import pprint
import collections
def main():
list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
result = collections.defaultdict(list)
for list_1_element, list_2_element in zip(list_1, list_2):
result[list_1_element].append(list_2_element)
pprint.pprint(result)
main()

Using itertools.izip_longest and itertools.groupby:
>>> from itertools import groupby, izip_longest
>>> inds = [next(g)[0] for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
First group items of list_1 and find the starting index of each group:
>>> inds
[0, 3, 5]
Now use slicing and izip_longest as we need pairs list_2[0:3], list_2[3:5], list_2[5:]:
>>> [list_2[x:y] for x, y in izip_longest(inds, inds[1:])]
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]
To get a list of dicts you can something like:
>>> inds = [next(g) for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
>>> {k: list_2[ind1: ind2[0]] for (ind1, k), ind2 in
zip_longest(inds, inds[1:], fillvalue=[None])}
{'A-1': ['iPad', 'iPod', 'iPhone'], 'A-3': ['Kindle'], 'A-2': ['Windows', 'X-box']}

You could do this if you want simple code, it's not pretty, but gets the job done.
list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
list_1a = []
list_1b = []
list_1c = []
place = 0
for i in list_1[::1]:
if list_1[place] == 'A-1':
list_1a.append(list_2[place])
elif list_1[place] == 'A-2':
list_1b.append(list_2[place])
else:
list_1c.append(list_2[place])
place += 1

Related

Taking two values from two list (Random Order) of tuples and multiplying

I have two lists and they are lists of tuples.
For example
List1 = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
List2 = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
If the items were in the same order I could use the following code to multiply the two values:
val = [(t1, v1*v2) for (t1, v1), (t2, v2) in zip(tf,idf)]
But my issue is the order of one the lists outputs randomly so the code doesn't work. So essentially I need to see if the word in one list matches the word in the other and then multiply to get an output in a similar way as the list of tuples.
This question excellently demonstrates the advantages of the dictionary data structure and how your problem could benefit from it. So first, we convert your list of tuples to dictionaries (dict-calls) and then you "combine" the two dicts as per your requirement to get the desired result.
lst1 = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
lst2 = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
dct1 = dict(lst1)
dct2 = dict(lst2)
res = {k: v * dct2.get(k, 1) for k, v in dct1.items()}.items()
which produces:
dict_items([('zaidan', 1.8410476297432288e-06), ('zimmerman', 1.8410476297432288e-06), ('ypa', 1.656942866768906e-05)])
And if the dict_item data type is confusing, you can always cast it to a vanilla-list.
res = list(res)
print(res)
# [('zaidan', 1.8410476297432288e-06), ('zimmerman', 1.8410476297432288e-06), ('ypa', 1.656942866768906e-05)]
i would tell you the easiest solution if your data are the same.
just sort it :
ls1 = sorted(ls1, key=lambda tup: tup[0])
ls2 = sorted(ls2, key=lambda tup: tup[0])
val = [(t1, v1*v2) for (t1, v1), (t2, v2) in zip(ls1,ls2)]
If, for any reason, you do not want to use dictionary (although it is a superior solution) but want to do this with lists and tuples, what you are looking for is looping through the lists and checking for equality:
x = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
y = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
z = []
for item in x:
for _item in y:
if item[0] == _item[0]
z.append((item[0], item[1]*_item[1]))
At the end, z will be a list of tuples with the original string at the 0 index and the result of multiplication at the 1 index.

how to separate cam1,2,3,4,5,6 first images from the list

lst = ['Cam218-10-03_16-05-21-54.jpg',
'Cam318-10-03_17-04-21-54.jpg',
'Cam418-10-03_16-04-21-54.jpg',
'Cam218-10-02_16-05-21-54.jpg',
'Cam318-10-02_17-04-21-54.jpg',
'Cam418-10-02_16-04-21-54.jpg',
'Cam218-10-02_16-04-08-31.jpg',
'Cam318-10-02_16-04-08-30.jpg',
'Cam418-10-02_16-04-08-30.jpg',
'Cam518-10-02_16-04-08-35.jpg',
'Cam618-10-02_16-04-08-36.jpg',
'Cam118-10-02_16-04-09-33.jpg',
'Cam218-10-02_16-04-09-33.jpg',
'Cam318-10-02_16-04-09-33.jpg',
'Cam418-10-02_16-04-09-33.jpg',
'Cam518-10-02_16-04-09-33.jpg',
'Cam618-10-02_16-04-09-33.jpg',
'Cam118-10-02_16-04-11-53.jpg',
'Cam218-10-02_16-04-11-53.jpg',
'Cam318-10-02_16-04-11-53.jpg',
'Cam418-10-02_16-04-08-30.jpg',
'Cam118-10-02_16-04-08-31.jpg',
'Cam518-10-02_16-04-11-53.jpg',
'Cam118-10-02_16-04-11-53.jpg']
From this list I want the output:
['Cam118-10-02_16-04-08-31.jpg',
'Cam218-10-02_16-04-08-31.jpg',
'Cam318-10-02_16-04-08-30.jpg',
'Cam418-10-02_16-04-08-30.jpg',
'Cam518-10-02_16-04-08-35.jpg',
'Cam618-10-02_16-04-08-36.jpg']
by using Python. Could anybody help me?
With itertools.groupby - O(n*log(n))
>>> from itertools import groupby
>>> [next(g) for _, g in groupby(sorted(lst), key=lambda cam: cam.partition('-')[0])]
['Cam118-10-02_16-04-08-31.jpg',
'Cam218-10-02_16-04-08-31.jpg',
'Cam318-10-02_16-04-08-30.jpg',
'Cam418-10-02_16-04-08-30.jpg',
'Cam518-10-02_16-04-08-35.jpg',
'Cam618-10-02_16-04-08-36.jpg']
With keeping track of duplicates manually (output not sorted, but potentially useful to other readers) - O(n)
>>> seen = set()
>>> result = []
>>>
>>> for cam in lst:
...: model, *_ = cam.partition('-')
...: if model not in seen:
...: result.append(cam)
...: seen.add(model)
...:
>>> result
['Cam218-10-03_16-05-21-54.jpg',
'Cam318-10-03_17-04-21-54.jpg',
'Cam418-10-03_16-04-21-54.jpg',
'Cam518-10-02_16-04-08-35.jpg',
'Cam618-10-02_16-04-08-36.jpg',
'Cam118-10-02_16-04-09-33.jpg']
you can make if condition to check for the occurrence of the photo tag after sorting the list
list.sort()
i = 1
for item in list:
if(item[3]==str(i)):
i=i+1
print(item)
continue
the result is
Cam118-10-02_16-04-08-31.jpg
Cam218-10-02_16-04-08-31.jpg
Cam318-10-02_16-04-08-30.jpg
Cam418-10-02_16-04-08-30.jpg
Cam518-10-02_16-04-08-35.jpg
Cam618-10-02_16-04-08-36.jpg
if you want to get the first occurrence of item with no regards to its order ascendingly, removing list.sort() shall resolve that.

combine like lists withing a nested list

i have a large nested list, and with in each nested list are two values, a company name and an amount, i am wondering if there is a way to combine the nested lists that have the same name together and then add the values? so for example here is a section of the list
[['Acer', 481242.74], ['Beko', 966071.86], ['Cemex', 187242.16], ['Datsun', 748502.91], ['Equifax', 146517.59], ['Gerdau', 898579.89], ['Haribo', 265333.85], ['Gerdau', 13019.63676], ['Gerdau', 34107.12062], ['Acer', 52153.02848]
i would expect an outcome that looks like the one below
[['Acer',(481242.74+52153.02848)],['Beko', 966071.86],['Cemex', 187242.16],['Datsun', 748502.91],['Equifax', 146517.59],['Gerdau',(898579.89+13019.63676+34107.12062)],['Haribo', 265333.85]]
so essentially im trying to write a code that will go through a nested list and return a list made by finding all the lists with the same [0] element and combining there [1] element
from collections import defaultdict
d = defaultdict(float)
for name, amt in a:
d[name] += amt
What this does is to create a dict where the amount will be zero (float()) by default, and then sum up using the names as keys.
If you really need the result to be a list, you can get it this way:
>>> print d.items()
[('Equifax', 146517.59), ('Haribo', 265333.85), ('Gerdau', 945706.64738), ('Cemex', 187242.16), ('Datsun', 748502.91), ('Beko', 966071.86), ('Acer', 533395.76848)]
defaultdict is probably a good way to go but you can do this with a normal dictionary:
>>> data = [['Acer', 481242.74], ['Beko', 966071.86], ['Cemex', 187242.16], ...]
>>> result = {}
>>> for k, v in data:
... result[k] = result.get(k, 0) + v
>>> result
{'Acer': 533395.76848, 'Beko': 966071.86, 'Cemex': 187242.16, ... }
>>> list(result.items())
[('Acer', 533395.76848), ('Beko', 966071.86), ('Cemex', 187242.16), ...]
from collections import defaultdict
d = defaultdict(list)
l=[['Acer', 481242.74], ['Beko', 966071.86], ['Cemex', 187242.16], ['Datsun', 748502.91], ['Equifax', 146517.59], ['Gerdau', 898579.89], ['Haribo', 265333.85], ['Gerdau', 13019.63676], ['Gerdau', 34107.12062], ['Acer', 52153.02848]]
for k,v in l:
d[k].append(v)
w=[]
for x,y in d.items():
w.append([x,sum(y)])
print w
Prints-
[['Equifax', 146517.59], ['Haribo', 265333.85], ['Gerdau', 945706.64738], ['Cemex', 187242.16], ['Datsun', 748502.91], ['Beko', 966071.86], ['Acer', 533395.76848]]
If want to make it a tuple
map(tuple,w)
Output is-
[('Equifax', 146517.59), ('Haribo', 265333.85), ('Gerdau', 945706.64738), ('Cemex', 187242.16), ('Datsun', 748502.91), ('Beko', 966071.86), ('Acer', 533395.76848)]

Zip Lists together based on many to one relationship

I have two lists and I would like to find a way to link them together (I'm not sure the exact term for doing this) by zipping them.
In list one I have a series of tif files:
list1=['LT50300281984137PAC00_sr_band1.tif',
,'LT50300281984137PAC00_sr_band2.tif'
'LT50300281984137PAC00_sr_band3.tif','LT50300281994260XXX03_sr_band1.tif',
'LT50300281994260XXX03_sr_band2.tif',
'LT50300281994260XXX03_sr_band3.tif']
in list two I have two files:
list2=[LT50300281984137PAC00_mask.tif,LT50300281994260XXX03_mask.tif]
I want to zip the files in list one which start with LT50300281984137PAC00 to the file in list 2 which starts the same way, and the same for the files which start with LT50300281994260XXX03
The code I have tried is:
ziplist=zip(sorted(list1),sorted(list2)
but this returns:
[('LT50300281984137PAC00_sr_band1', 'LT50300281984137PAC00_mask.tif'), ('LT50300281984137PAC00_sr_band2', 'LT50300281994260XXX03_mask.tif')]
I would like this to be returned:
[('LT50300281984137PAC00_sr_band1',LT50300281984137PAC00_sr_band2,LT50300281984137PAC00_sr_band3, 'LT50300281984137PAC00_mask.tif'), ('LT50300281994260XXX03_sr_band1.tif', 'LT50300281994260XXX03_sr_band2.tif','LT50300281994260XXX03_sr_band3.tif','LT50300281994260XXX03_mask.tif')]
You can use itertools.groupby:
from itertools import groupby
list1 = [
'LT50300281984137PAC00_sr_band1.tif',
'LT50300281984137PAC00_sr_band2.tif',
'LT50300281984137PAC00_sr_band3.tif',
'LT50300281994260XXX03_sr_band1.tif',
'LT50300281994260XXX03_sr_band2.tif',
'LT50300281994260XXX03_sr_band3.tif'
]
list2 = [
'LT50300281984137PAC00_mask.tif',
'LT50300281994260XXX03_mask.tif'
]
def extract_key(s):
return s[:s.index('_')]
l = sorted(list1 + list2, key=extract_key)
l = [tuple(items) for s, items in groupby(l, key=extract_key)]
Result:
[('LT50300281984137PAC00_sr_band1.tif', 'LT50300281984137PAC00_sr_band2.tif', 'LT50300281984137PAC00_sr_band3.tif', 'LT50300281984137PAC00_mask.tif'), ('LT50300281994260XXX03_sr_band1.tif', 'LT50300281994260XXX03_sr_band2.tif', 'LT50300281994260XXX03_sr_band3.tif', 'LT50300281994260XXX03_mask.tif')]
The idea is to sort the union of the two lists by the first part of each filename (extract_key). Then use groupby to create groups of the same first part.
You can use list comprehensions and builtin function filter
In [24]: [tuple(filter(lambda x: x.startswith(e.split('_')[0]), list1)+[e]) for e in list2]
Out[24]:
[('LT50300281984137PAC00_sr_band1.tif',
'LT50300281984137PAC00_sr_band2.tif',
'LT50300281984137PAC00_sr_band3.tif',
'LT50300281984137PAC00_mask.tif'),
('LT50300281994260XXX03_sr_band1.tif',
'LT50300281994260XXX03_sr_band2.tif',
'LT50300281994260XXX03_sr_band3.tif',
'LT50300281994260XXX03_mask.tif')]
Can also be done using regex.
import re
list1=['LT50300281984137PAC00_sr_band1.tif'
,'LT50300281984137PAC00_sr_band2.tif',
'LT50300281984137PAC00_sr_band3.tif','LT50300281994260XXX03_sr_band1.tif',
'LT50300281994260XXX03_sr_band2.tif',
'LT50300281994260XXX03_sr_band3.tif']
list2=['LT50300281984137PAC00_mask.tif','LT50300281994260XXX03_mask.tif']
match = re.findall(r'(\b\w+(?:PAC00)\w+.\w+\b)'," ".join(list1))
tuple1 = tuple(match+[list2[0]])
match = re.findall(r'(\b\w+(?:0XXX0)\w+.\w+\b)'," ".join(list1))
tuple2 = tuple(match+[list2[1]])
print [tuple1,tuple2]
Output
[('LT50300281984137PAC00_sr_band1.tif', 'LT50300281984137PAC00_sr_band2.tif', 'LT50300281984137PAC00_sr_band3.tif', 'LT50300281984137PAC00_mask.tif'), ('LT50300281994260XXX03_sr_band1.tif', 'LT50300281994260XXX03_sr_band2.tif', 'LT50300281994260XXX03_sr_band3.tif', 'LT50300281994260XXX03_mask.tif')]
A dictionary will work better here, you can then later repurpose it for what you need:
results = {}
for f in list2:
common = f.split('_')[0]
results[common] = []
for f in list1:
common = f.split('_')[0]
try:
results[common].append(f)
except KeyError:
print('{} not a valid grouper'.format(common))
# To convert into a list of tuples
as_list = [(k,)+tuple(v) for k,v in results.iteritems()]
print(as_list)
I would use itertools.chain and itertools.groupby , with a lambda expression to take only till the first _ for the grouping. Example -
>>> from itertools import chain,groupby
>>> list1=['LT50300281984137PAC00_sr_band1.tif','LT50300281984137PAC00_sr_band2.tif','LT50300281984137PAC00_sr_band3.tif','LT50300281994260XXX03_sr_band1.tif','LT50300281994260XXX03_sr_band2.tif','LT50300281994260XXX03_sr_band3.tif']
>>> list2=['LT50300281984137PAC00_mask.tif','LT50300281994260XXX03_mask.tif']
>>>
>>> chained_sorted = sorted(chain(list1,list2))
>>> ret = []
>>> for i, x in groupby(chained_sorted,lambda x: x.split('_')[0]):
... ret.append(tuple(x))
...
>>> ret
[('LT50300281984137PAC00_mask.tif', 'LT50300281984137PAC00_sr_band1.tif', 'LT50300281984137PAC00_sr_band2.tif', 'LT50300281984137PAC00_sr_band3.tif'), ('LT50300281994260XXX03_mask.tif', 'LT50300281994260XXX03_sr_band1.tif', 'LT50300281994260XXX03_sr_band2.tif', 'LT50300281994260XXX03_sr_band3.tif')]
My first answer on StackOverflow, so please be patient. But I didn't see a need for zip()
mask1, mask2 = list2[0], list2[1]
for b in reversed(list1):
if b[0:20] in mask1:
mask1 = b + " " + mask1
else:
mask2 = b + " " + mask2
ziplist = [tuple(mask1.split()), tuple(mask2.split())]
I think ziplist should now be what you were asking for.

Group one list of string elements based on another list

I want to group by my_list based on another list keys as follows:
my_list = ['apple_2010', 'banana_2010', 'carrot_2010', 'dog_2011', 'eye_2011', 'fig_2011']
keys = ['2010','2010','2010','2011','2011','2011']
for x,y in zip(my_list,keys):
???
The expected answer is:
answer = [['apple_2010', 'banana_2010', 'carrot_2010'],
['dog_2011', 'eye_2011', 'fig_2011']]
>>> my_list = ['apple_2010', 'banana_2010', 'carrot_2010', 'dog_2011', 'eye_2011', 'fig_2011']
>>> keys = ['2010','2010','2010','2011','2011','2011']
>>> print [[value for value in my_list if key in value] for key in set(keys)]
[['dog_2011', 'eye_2011', 'fig_2011'], ['apple_2010', 'banana_2010', 'carrot_2010']]

Categories