List comprehension with elements appearing twice - python

Suppose I have list
l = ['a', 'c', 'b']
and what I want is a list where those elements appear twice, one after the other, so
['a', 'a', 'c', 'c', 'b', 'b']
and I want to do this in the most pythonic way possible.
My half solution is doing something like
[[l[i], l[i]] for i in range(len(l))]
which yields
[['a', 'a'], ['c', 'c'], ['b', 'b']]
From here, I'd have to parse (walk) the list to remove the inner lists and obtain a single flat list.
Does anyone have a better idea for doing this in one go? Obviously something like l * 2 wouldn't help, as it gives ['a', 'c', 'b', 'a', 'c', 'b'] and I want equal elements to be adjacent.

l_2 = [item for item in l for i in range(n)]
Link to origin: Stackoverflow: Repeating elements of a list n times
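A minimal runnable version of that one-liner might look like this. The values of l and n are assumptions taken from the question (n is not defined in the answer itself):

```python
l = ['a', 'c', 'b']
n = 2  # assumed repeat count; the answer above leaves n undefined

# The outer loop walks the items; the inner loop emits each item n times.
l_2 = [item for item in l for _ in range(n)]
print(l_2)  # ['a', 'a', 'c', 'c', 'b', 'b']
```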

Using only list comprehension, you can do:
[i for j in my_list for i in [j]*2]
Output:
>>> my_list = ['a', 'c', 'b']
>>> [i for j in my_list for i in [j]*2]
['a', 'a', 'c', 'c', 'b', 'b']

You can zip the list against itself, then flatten it in a list comprehension.
>>> [i for j in zip(l,l) for i in j]
['a', 'a', 'c', 'c', 'b', 'b']

You can use zip function
l = ['a', 'c', 'b']
a = [i for j in zip(l,l) for i in j]
print(a)
Output
['a', 'a', 'c', 'c', 'b', 'b']

More general:
def ntimes(iterable, times=2):
    for elt in iterable:
        for _ in range(times):
            yield elt
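As a usage sketch (the inputs below are only illustrative), the generator can be wrapped in list() to get the result from the question, or called with a different repeat count:

```python
def ntimes(iterable, times=2):
    """Yield each element of `iterable` `times` times in a row."""
    for elt in iterable:
        for _ in range(times):
            yield elt

doubled = list(ntimes(['a', 'c', 'b']))
tripled = list(ntimes('xy', times=3))
print(doubled)  # ['a', 'a', 'c', 'c', 'b', 'b']
print(tripled)  # ['x', 'x', 'x', 'y', 'y', 'y']
```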

Here is a short solution without list comprehension, using the intuitive idea l*2:
sorted(l*2, key=l.index)
#['a', 'a', 'c', 'c', 'b', 'b']
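One caveat worth checking before relying on this: l.index returns the position of the first occurrence, so the trick seems to work as intended only when the elements are unique. The list dup below is a hypothetical input with a repeated value, used to compare against a plain comprehension:

```python
l = ['a', 'c', 'b']
unique_case = sorted(l * 2, key=l.index)
print(unique_case)  # ['a', 'a', 'c', 'c', 'b', 'b']

dup = ['a', 'b', 'a']  # hypothetical input containing a repeated value
by_index = sorted(dup * 2, key=dup.index)
by_comprehension = [x for x in dup for _ in range(2)]
print(by_index)          # ['a', 'a', 'a', 'a', 'b', 'b']
print(by_comprehension)  # ['a', 'a', 'b', 'b', 'a', 'a']
```

Each call to l.index also scans the list from the start, so this approach is likely to be slower than the comprehensions on long lists.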

If you like functional approaches, you can do this:
from itertools import chain, tee
l = ['a', 'c', 'b']
n = 2
list(chain.from_iterable(zip(*tee(l, n))))
While this might not perform as fast as the other answers, it can easily be used for arbitrary iterables (especially when they are infinite or when you don't know when they end) by omitting list().
(Note that some of the other answers can also be adapted for arbitrary iterables by replacing their list comprehension by a generator expression.)
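To illustrate the point about endless iterables, here is a small sketch that uses itertools.count as a stand-in for an infinite source and islice to take only the first few results. Both the source and the cut-off of six items are assumptions made for the example:

```python
from itertools import chain, count, islice, tee

n = 2
numbers = count(1)  # 1, 2, 3, ... without end (hypothetical input)

# No list() around the chain, so nothing is consumed until it is requested.
repeated = chain.from_iterable(zip(*tee(numbers, n)))
first_six = list(islice(repeated, 6))
print(first_six)  # [1, 1, 2, 2, 3, 3]

# The same idea written as a generator expression over a fresh counter.
gen = (x for x in count(1) for _ in range(n))
first_six_gen = list(islice(gen, 6))
print(first_six_gen)  # [1, 1, 2, 2, 3, 3]
```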

Related

concat two lists in a list of lists

Is it possible to concat two lists in a list of lists?
From:
listA = ['a', 'b', 'c']
listB = ['A', 'B', 'C']
listFull = listA + listB
To:
print(listFull)
[['a', 'A'],['b', 'B'],['c', 'C']]
List comprehension using zip()
listFull = [list(x) for x in zip(listA, listB)]
print(listFull)
You can also use map() without looping
listFull = list(map(list, zip(listA, listB)))
[['a', 'A'],['b', 'B'],['c', 'C']]
There are several ways to achieve this.
You need either at least one for loop or a map/zip combination.
Both possibilities are shown here:
listA = ['a', 'b', 'c']
listB = ['A', 'B', 'C']
#1
listFull =[]
for i in range(len(listA)):
    listFull.append([listA[i],listB[i]])
print(listFull)
# output [['a', 'A'], ['b', 'B'], ['c', 'C']]
#2
listfull2 = list(map(list,zip(listA,listB)))
print(listfull2)
# output [['a', 'A'], ['b', 'B'], ['c', 'C']]
"Concatenate" is what the + does, which puts one list after another. What you're describing is called a "zip" operation, and Python has exactly that function built-in.
zip returns an iterable, so if you want a list, you can call list on it.
print(list(zip(listA, listB)))
# Output: [('a', 'A'), ('b', 'B'), ('c', 'C')]
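A short side-by-side sketch of the distinction, using the lists from the question. The variable names concatenated, pairs and nested are only illustrative:

```python
listA = ['a', 'b', 'c']
listB = ['A', 'B', 'C']

concatenated = listA + listB                    # one list after the other
pairs = list(zip(listA, listB))                 # one tuple per position
nested = [list(p) for p in zip(listA, listB)]   # lists instead of tuples

print(concatenated)  # ['a', 'b', 'c', 'A', 'B', 'C']
print(pairs)         # [('a', 'A'), ('b', 'B'), ('c', 'C')]
print(nested)        # [['a', 'A'], ['b', 'B'], ['c', 'C']]
```

Note that zip stops at the end of the shorter input, so lists of unequal length are silently truncated.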

Appending an element in nested list comprehension python

My (derailed) mind would like to do the following:
list1 = [1,2,3]
list2 = ['a','b','c']
list3 = [list([a for a in list2]).append(n) for n in list1]
to output this:
[['a', 'b', 'c', '1'], ['a', 'b', 'c', '2'], ['a', 'b', 'c', '3']]
using only single-line list comprehensions (yes, I'm blinded by Haskell love).
Instead it outputs a list of 3 None type items which is understandable as I'm getting the output of append 3 times.
I think there's a key python idea I'm missing here on how I could make this work (or me being completely illogical), any help would be appreciated :)
No need to over-complicate things.
>>> list1 = [1,2,3]
>>> list2 = ['a','b','c']
>>> [list2 + [x] for x in list1]
[['a', 'b', 'c', 1], ['a', 'b', 'c', 2], ['a', 'b', 'c', 3]]
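The reason the original attempt produced None values can be shown in a few lines: list.append changes the list in place and returns None, whereas + builds and returns a new list. The str(x) variant is an assumption about what the question wanted, since its expected output shows the numbers as strings:

```python
list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']

# append() mutates the copy and returns None, so only Nones are collected.
broken = [list(list2).append(n) for n in list1]
print(broken)  # [None, None, None]

# list2 + [x] creates a new list each time and leaves list2 untouched.
fixed = [list2 + [x] for x in list1]
print(fixed)  # [['a', 'b', 'c', 1], ['a', 'b', 'c', 2], ['a', 'b', 'c', 3]]

# Converting with str() matches the string output shown in the question.
as_strings = [list2 + [str(x)] for x in list1]
print(as_strings)  # [['a', 'b', 'c', '1'], ['a', 'b', 'c', '2'], ['a', 'b', 'c', '3']]
```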

duplicate each element in a list an arbitrary number of times

I am wondering how to duplicate each element in a list an arbitrary number of times, e.g.
l = ['a', 'b', 'c']
Duplicating the elements of l results in a new list,
n = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c']
so 'a' has been duplicated 3 times, 'b' once, 'c' twice. The number of duplicates for each element are decided by numpy.random.poisson e.g. numpy.random.poisson(2).
Here's a NumPy based vectorized approach using np.repeat to create an array -
np.repeat(l, np.random.poisson([2]*len(l)))
If you need a list as output, append .tolist() there -
np.repeat(l, np.random.poisson([2]*len(l))).tolist()
If you would like to keep at least one entry for each element, add a clipping there with np.random.poisson([2]*len(l)).clip(min=1).
Multiply each element in the list by the value returned from numpy.random.poisson(2), join the results, and then feed the string to list (this assumes every element is a single character):
r = list(''.join(i*numpy.random.poisson(2) for i in l))
For one run, this randomly results in:
['a', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
Since you use np either way, I'd go with Divakar's solution (which, for lists larger than your example, executes faster).
>>> l = ['a', 'b', 'c']
>>> n = []
>>> for e in l:
...     n.extend([e] * numpy.random.poisson(2))
...
>>> n
['a', 'a', 'b', 'c']
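Because the Poisson draws differ on every run, the mechanics are easier to follow with fixed counts. The sketch below is a plain-Python analogue, not the NumPy code from the answers; the counts list is a hypothetical stand-in for the random draws, and a count of 0 drops the element entirely:

```python
l = ['a', 'b', 'c']
counts = [3, 0, 2]  # hypothetical fixed counts standing in for the Poisson draws

# Pair each element with its count and emit it that many times.
repeated = [e for e, c in zip(l, counts) for _ in range(c)]
print(repeated)  # ['a', 'a', 'a', 'c', 'c']

# Clamping each count to a minimum of 1 keeps every element at least once.
at_least_one = [e for e, c in zip(l, counts) for _ in range(max(c, 1))]
print(at_least_one)  # ['a', 'a', 'a', 'b', 'c', 'c']
```

Unlike the join-based one-liner, this form also works when the elements are longer strings or other objects.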

Keep strings that occur N times or more

I have a list that is
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
And I used Counter from collections on this list to get the result:
from collections import Counter
counts = Counter(mylist)
#Counter({'a': 3, 'c': 2, 'b': 2, 'd': 1})
Now I want to subset this so that I have all elements that occur some number of times, for example: 2 times or more - so that the output looks like this:
['a', 'b', 'c']
This seems like it should be a simple task - but I have not found anything that has helped me so far.
Can anyone suggest somewhere to look? I am also not attached to using Counter if I have taken the wrong approach. I should note I am new to python so I apologise if this is trivial.
[s for s, c in counts.items() if c >= 2]
# => ['a', 'b', 'c']
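Wrapped in a function with the threshold as a parameter, the same comprehension might look like this. The helper name at_least is hypothetical, and the result order relies on Counter keeping insertion order (Python 3.7 and later):

```python
from collections import Counter

def at_least(items, n):
    """Return the distinct items that occur n times or more."""
    counts = Counter(items)
    return [s for s, c in counts.items() if c >= n]

mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
twice_or_more = at_least(mylist, 2)
three_or_more = at_least(mylist, 3)
print(twice_or_more)  # ['a', 'b', 'c']
print(three_or_more)  # ['a']
```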
Try this...
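def get_duplicatesarrval(arrval):
    dup_array = arrval[:]
    # Remove one occurrence of every distinct value; whatever is left occurred at least twice.
    for i in set(arrval):
        dup_array.remove(i)
    return list(set(dup_array))
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
print(get_duplicatesarrval(mylist))
Result (the order may vary, since sets are unordered):
['a', 'b', 'c']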
The usual way would be to use a list comprehension as @Adaman does.
In the special case of 2 or more, you can also subtract one Counter from another
>>> counts = Counter(mylist) - Counter(set(mylist))
>>> list(counts)
['a', 'b', 'c']
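To see why the subtraction works: Counter(set(mylist)) counts every distinct element exactly once, and subtracting it lowers each count by one while dropping anything that falls to zero. A small sketch, with illustrative variable names:

```python
from collections import Counter

mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']

once_each = Counter(set(mylist))          # every distinct element counted once
remaining = Counter(mylist) - once_each   # counts of zero or less are discarded

print(dict(remaining))    # {'a': 2, 'b': 1, 'c': 1}
print(sorted(remaining))  # ['a', 'b', 'c']
```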
from itertools import groupby
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
res = [i for i,j in groupby(mylist) if len(list(j))>=2]
print(res)
['a', 'b', 'c']
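Note that groupby only groups consecutive equal items, so this approach appears to depend on equal elements being adjacent, as they are in mylist. The list scattered below is a hypothetical input where they are not, which shows the effect of sorting first:

```python
from itertools import groupby

scattered = ['a', 'b', 'a', 'c', 'c', 'd']  # hypothetical input, equal items not adjacent

# Without sorting, the two 'a' entries land in separate groups of size 1.
unsorted_result = [k for k, g in groupby(scattered) if len(list(g)) >= 2]
# Sorting first brings equal items together before grouping.
sorted_result = [k for k, g in groupby(sorted(scattered)) if len(list(g)) >= 2]

print(unsorted_result)  # ['c']
print(sorted_result)    # ['a', 'c']
```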
I think the answers above are better, but I believe this is the simplest method to understand (note that it only removes duplicates, so 'd' is kept as well):
mylist = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd']
newlist=[]
newlist.append(mylist[0])
for i in mylist:
    if i in newlist:
        continue
    else:
        newlist.append(i)
print(newlist)
>>>['a', 'b', 'c', 'd']

How do you remove a sublist if a certain element is not found at a specific position across all sublists?

In other words, if it's found that "f" is in the 4th position of the sublist, return that sublist, otherwise, exclude it if "f" is not found.
List = [['a','b','c','d','f'],['a','b','c','d','e'],['a','b','c','d','e'],['a','b','c','f','f'],['a','b']]
I have the following function which would work if all the sublists were the same size.
def Function(SM):
    return filter(lambda x: re.search("f",str(x[4])),List)
IndexError: list index out of range
Desired_List = [['a','b','c','d','f'],['a','b','c','f','f']]
I'm reluctant to use a for loop, because of the speed and efficiency costs. Are there any alternatives that are just as quick?
You can use list comprehension:
lst = [['a','b','c'], ['a','b','c','d','f'],['a','b','c','d','e'],['a','b','c','d','e'],['a','b','c','f','f'],['a','b']]
lst_desired = [l for l in lst if len(l) >= 5 and l[4] == "f"]
print(lst_desired)
Output
[['a', 'b', 'c', 'd', 'f'], ['a', 'b', 'c', 'f', 'f']]
>>> li=[['a','b','c','d','f'],['a','b','c','d','e'],['a','b','c','d','e'],['a','b','c','f','f'],['a','b']]
>>> list(filter(lambda l: l[4:5]==['f'], li))
[['a', 'b', 'c', 'd', 'f'], ['a', 'b', 'c', 'f', 'f']]
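The slicing version avoids the IndexError because a slice past the end of a list returns an empty list instead of raising. A brief sketch of that behaviour, with illustrative names rows and short, written as a comprehension rather than with filter:

```python
rows = [['a', 'b', 'c', 'd', 'f'], ['a', 'b', 'c', 'd', 'e'],
        ['a', 'b', 'c', 'f', 'f'], ['a', 'b']]

short = ['a', 'b']
past_end = short[4:5]
print(past_end)  # []  (indexing short[4] would raise IndexError instead)

# A one-element slice equals ['f'] only when position 4 exists and holds 'f'.
kept = [row for row in rows if row[4:5] == ['f']]
print(kept)  # [['a', 'b', 'c', 'd', 'f'], ['a', 'b', 'c', 'f', 'f']]
```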
