I have two lists:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
Is there any way I could group the items in A that share the same sessionid, so that I could print out the following:
1: ["T", "D","Q"]
2: ["D","D"]
The itertools groupby function is designed to do this sort of thing. Some of the other answers here create a dictionary, which is very sensible, but if you don't actually want a dict then you can do this:
from itertools import groupby
from operator import itemgetter
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
for k, g in groupby(zip(sessionid, A), itemgetter(0)):
    print('{}: {}'.format(k, list(list(zip(*g))[1])))
output
1: ['T', 'D', 'Q']
2: ['D', 'D']
operator.itemgetter(0) returns a callable that fetches the item at index 0 of whatever object you pass it; groupby uses this as the key function to determine what items can be grouped together.
Note that this and similar solutions assume that the sessionid indices are sorted. If they aren't then you need to sort the list of tuples returned by zip(sessionid, A) with the same key function before passing them to groupby.
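As a minimal sketch of the unsorted case described above (the shuffled sessionid values here are made up for illustration), sorting the zipped pairs with the same key function before grouping could look like this:

```python
from itertools import groupby
from operator import itemgetter

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [2, 1, 2, 1, 1]  # hypothetical unsorted ids, not from the question

key = itemgetter(0)  # group on the session id, the first item of each pair
# sorted() is stable, so items keep their original relative order within an id
pairs = sorted(zip(sessionid, A), key=key)
grouped = {k: [item for _, item in g] for k, g in groupby(pairs, key)}
print(grouped)  # {1: ['D', 'D', 'D'], 2: ['T', 'Q']}
```

Sorting only by the id (rather than by the whole tuple) is a deliberate choice here: it avoids reordering the items inside each group.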
edited to work correctly on Python 2 and Python 3
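Without itertools, you can use a dictionary:
index = 0
groups = {}  # named "groups" to avoid shadowing the built-in dict
for i in sessionid:
    if i not in groups:
        groups[i] = []
    groups[i].append(A[index])  # append every item, including the first one of each id
    index += 1
print(groups) # {1: ['T', 'D', 'Q'], 2: ['D', 'D']}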
And based on the remarks below:
from collections import defaultdict
groups = defaultdict(list)  # avoid shadowing the built-in dict
for i, item in enumerate(sessionid):
    groups[item].append(A[i])
You could use a dictionary and zip:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
result = {i:[] for i in sessionid}
for i,j in zip(sessionid,A):
    result[i].append(j)
Or you can use defaultdict:
from collections import defaultdict
result = defaultdict(list)
for k, v in zip(sessionid, A):
    result[k].append(v)
Output:
>>> result
{1: ['T', 'D', 'Q'], 2: ['D', 'D']}
One-liner (needs import itertools and import operator first):
{k: list(i for (i, _) in v) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
Without nested loop
{k: list(map(operator.itemgetter(0), v)) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
You can do:
import pandas as pd
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
pd.DataFrame({'A':A, 'id':sessionid}).groupby('id')['A'].apply(list).to_dict()
#Out[10]: {1: ['T', 'D', 'Q'], 2: ['D', 'D']}
You could also convert them into numpy arrays, and use the indices of the session ids you need with np.where
import numpy as np
A = np.asarray(['T', 'D', 'Q', 'D', 'D'])
sessionid = np.asarray([1, 1, 1, 2, 2])
Ind_1 = np.where(sessionid == 1)
Ind_2 = np.where(sessionid == 2)
print(A[Ind_1])
should print ['T' 'D' 'Q']
You could of course turn this into a function which takes N, the desired session, and returns your A values.
Hope this helps!
There are two lists and I want to check how many elements they have in common. Assume list one is l1 = ['a', 'b', 'c', 'd', 'e'] and list two is l2 = ['a', 'f', 'c', 'g']. Since a and c are in both lists, the output should be 2, meaning two elements appear in both lists. Below is my code; I want to count how many 2s are in the combined counter, but I am not sure how to do that.
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
sum = c1+c2
z=sum.count(2)
What you want is set.intersection (if there are no duplicates in each list):
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']
print(len(set(l1).intersection(l2)))
Output:
2
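If the lists might contain repeats, one possible alternative is Counter intersection with the & operator, which keeps the minimum count of each element. This is only a sketch, and the input lists below are made up to show the difference between distinct shared elements and shared occurrences:

```python
from collections import Counter

l1 = ['a', 'a', 'b', 'c']  # hypothetical input with a repeated 'a'
l2 = ['a', 'a', 'c', 'g']

# & keeps each element with the smaller of its two counts
common = Counter(l1) & Counter(l2)
distinct_shared = len(common)            # number of different shared elements
shared_occurrences = sum(common.values())  # shared elements counted with multiplicity
print(distinct_shared)     # 2
print(shared_occurrences)  # 3
```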
Counter is a dict subclass, and dicts have no count() method, so sum.count(2) raises an AttributeError. Instead, you can iterate over the values of the combined counter and count those greater than 1. Like the set approach, this assumes no element is repeated within a single list.
# Duplicate elements in 2 lists
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['a', 'f', 'c', 'g']# a,c are duplicate
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
sum = c1+c2
j = sum.values()
print(sum)
print(j)
v = 0
for i in j:
    if i>1:
        v = v+1
print("Duplicate in lists:", v)
Output:
Counter({'a': 2, 'c': 2, 'b': 1, 'd': 1, 'e': 1, 'f': 1, 'g': 1})
dict_values([2, 1, 2, 1, 1, 1, 1])
Duplicate in lists: 2
I would like to print out the keys of a dictionary, each repeated as many times as its value, interleaved as evenly as possible, like:
a = {'A': 3, 'B': 5}
=> ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
a = {'A': 4, 'B': 1}
=> ['A', 'B', 'A', 'A', 'A']
I know I can use a while loop that emits each key and decrements its count until every count is 0, but is there a better way to do it?
def func(d: dict):
    res = []
    while any(i > 0 for i in d.values()):
        for k, c in d.items():
            if c > 0:
                res.append(k)
                d[k] -= 1
    return res
(I'm assuming you're using a version of Python that guarantees the iteration order of dictionaries)
Here's an itertools-y approach. It creates a generator for each letter that yields the letter the given number of times, and it combines all of them together with zip_longest so they get yielded evenly.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    generators = [repeat(k, v) for k,v in d.items()]
    exhausted = object()
    for round in zip_longest(*generators, fillvalue=exhausted):
        for x in round:
            if x is not exhausted:
                yield x
print(list(iterate_evenly({"A": 3, "B": 5})))
print(list(iterate_evenly({"A": 4, "B": 1})))
Result:
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
You can do the same thing in fewer lines, although it becomes harder to read.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    exhausted = object()
    return [x for round in zip_longest(*(repeat(k, v) for k,v in d.items()), fillvalue=exhausted) for x in round if x is not exhausted]
print(iterate_evenly({"A": 3, "B": 5}))
print(iterate_evenly({"A": 4, "B": 1}))
For a one-liner.
First, create a list with two elements: a list of As and a list of Bs:
>>> d = {'A': 3, 'B': 5}
>>> [[k]*v for k, v in d.items()]
[['A', 'A', 'A'], ['B', 'B', 'B', 'B', 'B']]
[k]*v means: a list containing v copies of k. Second, interleave the As and Bs. We need zip_longest because zip would stop at the end of the shortest list:
>>> import itertools
>>> list(itertools.zip_longest(*[[k]*v for k, v in d.items()]))
[('A', 'B'), ('A', 'B'), ('A', 'B'), (None, 'B'), (None, 'B')]
Now, just flatten the list and remove None values:
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
Other example:
>>> d = {'A': 4, 'B': 1}
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'A', 'A']
You can just use sum with a generator comprehension:
res = sum(([key]*value for key, value in d.items()), [])
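This exploits the fact that sum can "add" anything that supports the + operator, such as lists, when given a matching start value (here []), together with sequence multiplication (["A"]*4 == ["A", "A", "A", "A"]). Note that the result is grouped by key rather than interleaved.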
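Because sum builds a new list on every addition, it can get slow for many keys. A commonly used alternative for the same flattening is itertools.chain.from_iterable; the following is a small sketch using the dictionary from the question:

```python
from itertools import chain

d = {'A': 3, 'B': 5}

# chain.from_iterable walks each repeated list in turn without
# creating intermediate concatenated lists
res = list(chain.from_iterable([key] * value for key, value in d.items()))
print(res)  # ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B']
```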
If you want the order to be randomized, use the random module:
from random import shuffle
shuffle(res)
If, as Thierry Lathuille notes, you want to cycle through the values in the original order, you can use some itertools magic:
from itertools import chain, zip_longest
res = [*filter(
    bool,  # drop Nones
    chain(*zip_longest(
        *([key]*val for key, val in d.items()))
    )
)]
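One caveat: filter(bool, ...) drops every falsy value, not just None, so keys such as 0 or '' would vanish. A hedged sketch of a variant that uses a unique sentinel as the fill value instead (the dictionary with a 0 key is invented for illustration):

```python
from itertools import chain, zip_longest

d = {0: 2, 'B': 3}  # hypothetical dict with a falsy key

missing = object()  # unique sentinel that cannot collide with a real key
rounds = zip_longest(*([key] * val for key, val in d.items()), fillvalue=missing)
# keep everything except the sentinel, so falsy keys survive
res = [x for x in chain.from_iterable(rounds) if x is not missing]
print(res)  # [0, 'B', 0, 'B', 'B']
```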
As an alternative to the replication & zip_longest approach, let's try to simplify the OP's original code:
def function(dictionary):
    result = []
    while dictionary:
        result.extend(dictionary)
        dictionary = {k: v - 1 for k, v in dictionary.items() if v > 1}
    return result
print(function({'A': 3, 'B': 5}))
print(function({'A': 4, 'B': 1}))
OUTPUT
% python3 test.py
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
%
Although it might look otherwise, it's not destructive on the dictionary argument, unlike the OP's original code.
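A quick way to check the non-destructive claim is to keep a reference to the input and inspect it after the call. This sketch repeats the function above with comments added; the variable name counts is just for illustration:

```python
def function(dictionary):
    result = []
    while dictionary:
        result.extend(dictionary)  # extending with a dict appends its keys
        # rebinding the local name builds a new dict each round,
        # so the caller's dictionary is never modified
        dictionary = {k: v - 1 for k, v in dictionary.items() if v > 1}
    return result

counts = {'A': 3, 'B': 5}
result = function(counts)
print(result)  # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
print(counts)  # {'A': 3, 'B': 5}
```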
It could also be done using a sort of the (position,character) tuples formed by expanding each dictionary entry:
a = {'A': 3, 'B': 5}
result = [c for _,c in sorted( (p,c) for c,n in a.items() for p,c in enumerate(c*n))]
print(result) # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
If the dictionary's order is usable, you can forgo the sort and use this:
result = [c for i in range(max(a.values())) for c,n in a.items() if i<n]
I have two lists:
a= [0,0,0,1,1,1,3,3,3]
b= ['a','b','c','d','e','f','g','h','i']
output = [['a','b','c'],['d','e','f'],['g','h','i']]
a and b are lists of the same length.
I need an output list built so that whenever the value in list a changes (from 0 to 1, or from 1 to 3), a new sublist is started in the output.
Can someone please help?
Use groupby:
from itertools import groupby
from operator import itemgetter
a = [0, 0, 0, 1, 1, 1, 3, 3, 3]
b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
output = [list(map(itemgetter(1), group)) for _, group in groupby(zip(a, b), key=itemgetter(0))]
print(output)
Output
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
A simpler method that uses a dictionary and no imports:
a= [0,0,0,1,1,1,3,3,3]
b= ['a','b','c','d','e','f','g','h','i']
d = {e: [] for e in a} # one empty list per unique value of a, in first-seen order
for i, e in enumerate(a): # put each element of b into the list for its key
    d[e].append(b[i])
lofl = list(d.values())
>>> lofl
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
Using groupby, you could do:
from itertools import groupby
a= [0,0,0,1,1,1,3,3,3]
b= ['a','b','c','d','e','f','g','h','i']
iter_b = iter(b)
output = [[next(iter_b) for _ in group] for key, group in groupby(a)]
print(output)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
groupby yields successive groups of identical values of a. For each group, we create a list containing as many of the next elements of b as there are values in the group.
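The same idea can be split into two explicit steps, which some readers may find easier to follow: first measure the length of each run in a, then slice that many elements off an iterator over b. This is only a sketch of an equivalent formulation, using itertools.islice:

```python
from itertools import groupby, islice

a = [0, 0, 0, 1, 1, 1, 3, 3, 3]
b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

# length of each run of identical consecutive values in a
sizes = [sum(1 for _ in group) for _, group in groupby(a)]

iter_b = iter(b)
# take the next n elements of b for each run; iter_b remembers its position
output = [list(islice(iter_b, n)) for n in sizes]
print(output)  # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
```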
Since you added the algorithm tag, I assume you want a solution without so much magic.
>>> def merge_lists(A, B):
...     output = []
...     sub_list = []
...     current = A[0]
...     for i in range(len(A)):
...         if A[i] == current:
...             sub_list.append(B[i])
...         else:
...             output.append(sub_list)
...             sub_list = []
...             sub_list.append(B[i])
...             current = A[i]
...     output.append(sub_list)
...     return output
...
>>> a= [0,0,0,1,1,1,3,3,3]
>>> b= ['a','b','c','d','e','f','g','h','i']
>>> merge_lists(a, b)
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
I have two lists:
list1=[0,0,0,1,1,2,2,3,3,4,4,5,5,5]
list2=['a','b','c','d','e','f','k','o','n','q','t','z','w','l']
dictionary=dict(zip(list1,list2))
I would like to output all the values that share a key on one line per key, so that it prints like this:
0 ['a','b','c']
1 ['d','e']
2 ['f','k']
3 ['o','n']
4 ['q','t']
5 ['z','w','l']
I wrote the following code to do that, but it does not work as I expected:
for k,v in dictionary.items():
    print (k,v)
Could you tell me how to fix the code so that I get the intended results above, please?
Thanks in advance!
Regarding your code:
dictionary = dict(zip(list1, list2))
creates the dictionary:
{0: 'c', 1: 'e', 2: 'k', 3: 'n', 4: 't', 5: 'l'}
which loses all but the last value in each group. You need to process the zipped lists to construct the grouped data. Two ways are with itertools.groupby() or with a defaultdict(list), shown here.
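To make the overwriting visible on a smaller input, here is a short sketch (the three-element lists are made up for illustration). It also shows a plain-dict grouping with setdefault, as one more option alongside the two named above:

```python
list1 = [0, 0, 1]          # hypothetical shortened keys
list2 = ['a', 'b', 'c']    # hypothetical shortened values

# dict() keeps only the last value seen for each key
overwritten = dict(zip(list1, list2))
print(overwritten)  # {0: 'b', 1: 'c'}

# setdefault creates the list the first time a key is seen, then appends
grouped = {}
for k, v in zip(list1, list2):
    grouped.setdefault(k, []).append(v)
print(grouped)  # {0: ['a', 'b'], 1: ['c']}
```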
Use a collections.defaultdict of lists to group the items with keys from list1 and values from list2. Pair the items from each list with zip():
from collections import defaultdict
list1=[0,0,0,1,1,2,2,3,3,4,4,5,5,5]
list2=['a','b','c','d','e','f','k','o','n','q','t','z','w','l']
d = defaultdict(list)
for k,v in zip(list1, list2):
    d[k].append(v)
for k in sorted(d):
    print('{} {!r}'.format(k, d[k]))
Output:
0 ['a', 'b', 'c']
1 ['d', 'e']
2 ['f', 'k']
3 ['o', 'n']
4 ['q', 't']
5 ['z', 'w', 'l']
Since dictionaries do not sort their keys (and before Python 3.7 did not guarantee any iteration order), the output is sorted by key.
The code you've shown does not look anything like what you described.
That aside, you can group values of the same key together by first zipping the lists and then grouping values of the same key using a collections.defaultdict:
from collections import defaultdict
d = defaultdict(list)
for k, v in zip(list1, list2):
    d[k].append(v)
print(d)
# defaultdict(<type 'list'>, {0: ['a', 'b', 'c'], 1: ['d', 'e'], 2: ['f', 'k'], 3: ['o', 'n'], 4: ['q', 't'], 5: ['z', 'w', 'l']})
You can use itertools.groupby for a concise, one-line solution:
import itertools
list1=[0,0,0,1,1,2,2,3,3,4,4,5,5,5]
list2=['a','b','c','d','e','f','k','o','n','q','t','z','w','l']
final_list = {a:[i[-1] for i in list(b)] for a, b in itertools.groupby(zip(list1, list2), key=lambda x: x[0])}
for a, b in final_list.items():
    print(a, b)
Output:
0 ['a', 'b', 'c']
1 ['d', 'e']
2 ['f', 'k']
3 ['o', 'n']
4 ['q', 't']
5 ['z', 'w', 'l']
I would like to index a list with another list like this
L = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Idx = [0, 3, 7]
T = L[ Idx ]
and T should end up being a list containing ['a', 'd', 'h'].
Is there a better way than
T = []
for i in Idx:
    T.append(L[i])
print(T)
# Gives result ['a', 'd', 'h']
T = [L[i] for i in Idx]
If you are using numpy, you can index an array with a list of integers (sometimes called fancy indexing) like this:
>>> import numpy
>>> a=numpy.array(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
>>> Idx = [0, 3, 7]
>>> a[Idx]
array(['a', 'd', 'h'],
dtype='|S1')
...and is probably much faster (if performance is enough of a concern to bother with the numpy import)
T = list(map(lambda i: L[i], Idx))  # list() is needed on Python 3, where map returns an iterator
A functional approach:
a = [1,"A", 34, -123, "Hello", 12]
b = [0, 2, 5]
from operator import itemgetter
print(list(itemgetter(*b)(a)))
[1, 34, 12]
I wasn't happy with any of these approaches, so I came up with a Flexlist class that allows for flexible indexing, either by integer, slice or index-list:
class Flexlist(list):
    def __getitem__(self, keys):
        if isinstance(keys, (int, slice)): return list.__getitem__(self, keys)
        return [self[k] for k in keys]
Which, for your example, you would use as:
L = Flexlist(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
Idx = [0, 3, 7]
T = L[ Idx ]
print(T) # ['a', 'd', 'h']
You could also use the __getitem__ method combined with map like the following:
L = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Idx = [0, 3, 7]
res = list(map(L.__getitem__, Idx))
print(res)
# ['a', 'd', 'h']
L= {'a':'a','d':'d', 'h':'h'}
index= ['a','d','h']
for keys in index:
    print(L[keys])
I would use a dict and put the desired keys in index.
My problem: Find indexes of list.
L = makelist() # Returns a list of different objects
La = np.array(L, dtype = object) # add dtype!
for c in chunks:
    L_ = La[c] # Since La is array, this works.