I am wondering how to duplicate each element in a list an arbitrary number of times, e.g.
l = ['a', 'b', 'c']
Duplicating the elements of l results in a new list,
n = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c']
so 'a' has been duplicated 3 times, 'b' once, and 'c' twice. The number of duplicates for each element is decided by numpy.random.poisson, e.g. numpy.random.poisson(2).
Here's a NumPy based vectorized approach using np.repeat to create an array -
np.repeat(l, np.random.poisson([2]*len(l)))
If you need a list as output, append .tolist() there -
np.repeat(l, np.random.poisson([2]*len(l))).tolist()
If you would like to keep at least one entry for each element, add clipping with np.random.poisson([2]*len(l)).clip(min=1).
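One point worth spelling out: in the question's example, "duplicated 3 times" leaves four copies of 'a', whereas the snippets above use the Poisson draw as the total number of copies, so an element can disappear when the draw is 0. A minimal sketch of both readings, assuming NumPy's Generator API is available; the seed and the names rng, extra, clipped and plus_one are only illustrative:

```python
import numpy as np

l = ['a', 'b', 'c']
rng = np.random.default_rng(0)        # seeded only so a run can be repeated
extra = rng.poisson(2, size=len(l))   # one Poisson draw per element

# Reading 1: the draw is the total number of copies, but never fewer than one.
clipped = np.repeat(l, extra.clip(min=1)).tolist()

# Reading 2: the draw is the number of extra copies on top of the original.
plus_one = np.repeat(l, extra + 1).tolist()

print(extra, clipped, plus_one)
```

With the second reading every element always survives, so no clipping is needed.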
Multiply each element in the list by the value returned from numpy.random.poisson(2), join the results, and then feed the joined string to list (this works because every element here is a single character):
r = list(''.join(i*numpy.random.poisson(2) for i in l))
For one run, this randomly results in:
['a', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c']
Since you use np either way, I'd go with Divakar's solution (which, for lists larger than your example, executes faster).
>>> l = ['a', 'b', 'c']
>>> n = []
>>> for e in l:
... n.extend([e] * numpy.random.poisson(2))
...
>>> n
['a', 'a', 'b', 'c']
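To check the repetition logic independently of the random draw, the counts can be passed in explicitly. A sketch using only the standard library; repeat_each is a hypothetical helper name, and the counts [4, 2, 3] are chosen to reproduce the list n from the question:

```python
from itertools import chain, repeat

def repeat_each(items, counts):
    """Repeat items[k] counts[k] times, keeping the original order."""
    return list(chain.from_iterable(repeat(x, c) for x, c in zip(items, counts)))

l = ['a', 'b', 'c']
n = repeat_each(l, [4, 2, 3])
print(n)
```

Any source of counts (for example one numpy.random.poisson(2) draw per element) can then be plugged in without changing the helper.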
I have the current list
[['A', 'B', 'C'],
['A', 'B', 'D'],
['A', 'B', 'C', 'D'],
['A', 'C', 'D'],
['B', 'C', 'D']]
I want to delete the list ['A', 'B', 'C', 'D']. Is there any way to do this by checking inside the list and removing the list if it has more than 3 elements?
Thanks
I know I could iterate over the whole list and add the elements whose length is 3, but I don't think that is very efficient.
Use a list comprehension!
[ x for x in array ]
will return the array as is. x represents an element in the array, in your case, the sub-arrays that you want to check the length of.
You can add a condition to this to filter:
[ x for x in array if ... ]
This will return the array but only for elements x that pass the condition after the if.
So you will want to check the length of x using the len function. In this case, your list comprehension should only include x when len(x) <= 3
In the spirit of learning I haven't given you the literal answer but you should be able to piece this information together.
Play around with list comprehensions; they are very powerful and used constantly in Python to transform, map, and filter arrays.
b=[['A', 'B', 'C'],
['A', 'B', 'D'],
['A', 'B', 'C', 'D'],
['A', 'C', 'D'],
['B', 'C', 'D']]
c=[]
for i in b:
    if len(i)<=3:
        print(i)
        c.append(i)
This script creates a new list by iterating over each index of the original list and checking the length of the sublist there. If the length is 3, the sublist is added to the new list; otherwise it is skipped.
newlist = []
for a in range(0, len(oldlist)):
    if len(oldlist[a]) == 3:
        newlist.append(oldlist[a])
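For comparison, here is the same kind of filter as a list comprehension. Note that it tests len(row) <= 3, which matches the question (drop lists with more than 3 elements), whereas the loop above keeps only lists of length exactly 3. The names rows and short_rows are illustrative:

```python
rows = [['A', 'B', 'C'],
        ['A', 'B', 'D'],
        ['A', 'B', 'C', 'D'],
        ['A', 'C', 'D'],
        ['B', 'C', 'D']]

# Keep every sublist that has at most 3 elements.
short_rows = [row for row in rows if len(row) <= 3]
print(short_rows)
```

Either way the whole list has to be visited once, so the comprehension is mainly shorter, not asymptotically faster than the explicit loop.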
If you have a list my_list = ['a', 'd', 'e', 'c', 'b', 'f'] and you want to construct a sublist, containing all elements up to a given one, for example my_list_up_to_c = ['a', 'd', 'e'], how can this be done in a way that scales easily? Also can this be made faster by using numpy arrays?
The least amount of code would probably be using .index() (note that this searches only up to the first occurrence of the element in the list):
>>> my_list = ['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list
['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list[:my_list.index('c')] # excluding the specified element
['a', 'd', 'e']
>>> my_list[:my_list.index('c')+1] # including the specified element
['a', 'd', 'e', 'c']
The time complexity of the call to .index() is O(n), meaning it will at most iterate once over the list. The list slicing has complexity O(k) (according to this source), meaning it depends on the size of the slice.
So in the worst case the element you look for is at the end of the list: the search runs to the end of the list (O(n)) and the slice copies the whole list as well (also O(n)). That is about 2n steps in total, which is still linear complexity, i.e. O(n).
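One detail the snippets above do not cover is a missing element: list.index raises ValueError when the value is not in the list. A small sketch that wraps the search and the slice; prefix_up_to is a hypothetical helper, and returning the whole list when nothing matches is an assumption about the desired behaviour, not something the question specifies:

```python
def prefix_up_to(items, target, inclusive=False):
    """Return the elements of items before the first occurrence of target."""
    try:
        stop = items.index(target)   # O(n) search; raises ValueError if absent
    except ValueError:
        return list(items)           # assumption: no match means "keep everything"
    return items[:stop + 1] if inclusive else items[:stop]

my_list = ['a', 'd', 'e', 'c', 'b', 'f']
print(prefix_up_to(my_list, 'c'))
print(prefix_up_to(my_list, 'c', inclusive=True))
```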
Use index() to get the first occurrence of a list item. Then use the slice notation to get the desired part of the list.
>>> my_list = ['a', 'd', 'e', 'c', 'b', 'f']
>>> my_list[:my_list.index('c')]
['a', 'd', 'e']
The itertools solution
In[9]: from itertools import takewhile
In[10]: my_list = ['a', 'd', 'e', 'c', 'b', 'f']
In[11]: list(takewhile(lambda x: x != 'c', my_list))
Out[11]: ['a', 'd', 'e']
In Haskell it would be
takeWhile ((/=) 'c') "adecbf"
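Unlike slicing with .index(), takewhile does not need a list: it consumes any iterable lazily and simply stops at the first element that fails the test. A short Python sketch with a generator as input; the letters generator is made up for illustration:

```python
from itertools import takewhile

def letters():
    """Yield the example letters one at a time (a stand-in for any iterable)."""
    for ch in "adecbf":
        yield ch

prefix = list(takewhile(lambda x: x != 'c', letters()))
no_match = list(takewhile(lambda x: x != 'z', letters()))
print(prefix, no_match)
```

When the element never appears, takewhile yields everything, whereas .index() raises ValueError.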
I want to merge two lists and discard the intersecting elements.
A = ['a', 'b', 'c', 'd']
B = ['a', 'b', 'd', 'e', 'f']
Expected result:
['c', 'e', 'f']
I can get this by:
[i for i in A if i not in B] + [i for i in B if i not in A]
But is there a more convenient way to get the same result without loops, preferably through Pandas?
Best regards
Use sets:
set(A).symmetric_difference(B)
or equivalent:
set(A)^set(B)
(You can convert the result back to a list if need be.)
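Keep in mind that a set has no defined order, so converting it straight back to a list may not give ['c', 'e', 'f'] in that order. Two possible sketches, assuming the elements are hashable; the names sym_diff, in_order and as_sorted are illustrative:

```python
A = ['a', 'b', 'c', 'd']
B = ['a', 'b', 'd', 'e', 'f']

sym_diff = set(A) ^ set(B)          # elements that appear in exactly one list

# Option 1: keep the order in which the elements appear in A, then in B.
in_order = [x for x in A + B if x in sym_diff]

# Option 2: sort the result if a canonical order is enough.
as_sorted = sorted(sym_diff)

print(in_order, as_sorted)
```

Option 1 still uses a comprehension, but each membership test is against a set, so it avoids the repeated list scans of the original one-liner.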
I'm looking for a function that would take a list such as [a,b,c,d] and output a list of all the permutations where adjacent indices are swapped i.e. [[b,a,c,d], [a,c,b,d],[a,b,d,c], [d,b,c,a]]
Thanks
A simple way is to use a for loop and swap the adjacent items. tmp=l[:] makes a shallow copy, so the original list l is not changed (for more details, see What exactly is the difference between shallow copy, deepcopy and normal assignment operation?):
l=['a', 'b', 'c', 'd']
for i in range(len(l)):
    tmp=l[:]
    tmp[i],tmp[i-1]=tmp[i-1],tmp[i]
    print(tmp)
Result:
['d', 'b', 'c', 'a']
['b', 'a', 'c', 'd']
['a', 'c', 'b', 'd']
['a', 'b', 'd', 'c']
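The loop above prints the wrap-around swap (last with first) before the others, because index -1 is used when i is 0. If the results should be returned as a list, in the order given in the question, one possible sketch is the following; adjacent_swaps is a hypothetical function name:

```python
def adjacent_swaps(items):
    """Return one copy of items per position, with that position and the next swapped.

    The last position wraps around and is swapped with the first.
    """
    result = []
    n = len(items)
    for i in range(n):
        tmp = list(items)        # shallow copy, the input is left untouched
        j = (i + 1) % n          # neighbour to the right, wrapping at the end
        tmp[i], tmp[j] = tmp[j], tmp[i]
        result.append(tmp)
    return result

print(adjacent_swaps(['a', 'b', 'c', 'd']))
```

For a two-element list both iterations produce the same swap, so the result contains a duplicate in that case.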
Suppose I have a list
l = ['a', 'c', 'b']
and what I want is a list where those elements appear twice, one after the other, so
['a', 'a', 'c', 'c', 'b', 'b']
and I want to do this in the most pythonic way possible.
My half solution is doing something like
[[l[i], l[i]] for i in range(len(l))]
which yields
[['a', 'a'], ['c', 'c'], ['b', 'b']]
From here, I'd have to parse (walk) the list to remove the inner lists and obtain a single flat list.
Does anyone have a better idea for doing this in one go? Obviously things like l * 2 wouldn't help, as that gives ['a', 'c', 'b', 'a', 'c', 'b'] and I want the same elements adjacent.
l_2 = [item for item in l for _ in range(n)]  # n is the repeat count, 2 in this case
Source: Stack Overflow: Repeating elements of a list n times
Using only list comprehension, you can do:
[i for j in my_list for i in [j]*2]
Output:
>>> my_list = ['a', 'c', 'b']
>>> [i for j in my_list for i in [j]*2]
['a', 'a', 'c', 'c', 'b', 'b']
You can zip the list against itself, then flatten it in a list comprehension.
>>> [i for j in zip(l,l) for i in j]
['a', 'a', 'c', 'c', 'b', 'b']
You can use the zip function
l = ['a', 'c', 'b']
a = [i for j in zip(l,l) for i in j]
print(a)
Output
['a', 'a', 'c', 'c', 'b', 'b']
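The zip approach is written for exactly two copies. A sketch of how it could be extended to n copies by unpacking a list of n references to the same list; n = 3 and the name repeated are arbitrary choices for illustration:

```python
l = ['a', 'c', 'b']
n = 3

# zip(*[l] * n) is zip(l, l, l): each tuple holds n copies of one element.
repeated = [i for group in zip(*[l] * n) for i in group]
print(repeated)
```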
More general:
def ntimes(iterable, times=2):
    for elt in iterable:
        for _ in range(times):
            yield elt
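Since ntimes is a generator, its result has to be consumed, for example with list(). A brief usage sketch that repeats the definition so it runs on its own; the variable names doubled and tripled are illustrative:

```python
def ntimes(iterable, times=2):
    """Yield every element of iterable `times` times in a row."""
    for elt in iterable:
        for _ in range(times):
            yield elt

l = ['a', 'c', 'b']
doubled = list(ntimes(l))               # default: each element twice
tripled = list(ntimes('ab', times=3))   # works for any iterable, here a string
print(doubled, tripled)
```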
Here is a short solution without list comprehension, using the intuitive idea l*2:
sorted(l*2, key=l.index)
#['a', 'a', 'c', 'c', 'b', 'b']
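One caveat with the sorted approach: l.index returns the position of the first occurrence, so equal elements are all pulled together, and each lookup scans the list again. A small sketch of the difference when the list contains a repeated value; the example list with_dupes is made up for illustration:

```python
with_dupes = ['a', 'c', 'a']

# sorted() groups every 'a' at the position of the first 'a'.
via_sorted = sorted(with_dupes * 2, key=with_dupes.index)

# The nested comprehension keeps each occurrence where it was.
via_comprehension = [x for x in with_dupes for _ in range(2)]

print(via_sorted)
print(via_comprehension)
```

For lists with unique elements, such as the one in the question, both give the same result.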
If you like functional approaches, you can do this:
from itertools import chain, tee
l = ['a', 'c', 'b']
n = 2
list(chain.from_iterable(zip(*tee(l, n))))
While this might not perform as fast as the other answers, it can easily be used for arbitrary iterables (especially when they are infinite or when you don't know when they end) by omitting list().
(Note that some of the other answers can also be adapted for arbitrary iterables by replacing their list comprehension by a generator expression.)
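To illustrate the lazy variants, here is a sketch that applies both the tee-based version and a generator expression to an infinite iterator and takes only the first few items. itertools.count and islice are used purely for the demonstration, and the variable names are illustrative:

```python
from itertools import chain, count, islice, tee

n = 2

# tee-based version without list(): nothing is computed until items are requested.
lazy_tee = chain.from_iterable(zip(*tee(count(), n)))

# Generator expression version of the nested comprehension.
lazy_genexp = (item for item in count() for _ in range(n))

first_tee = list(islice(lazy_tee, 6))
first_genexp = list(islice(lazy_genexp, 6))
print(first_tee, first_genexp)
```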