I want to merge two arrays in python in a special way.
The entries with an even index of my output array out shall be the corresponding entries of my first input array in0. The entries with an odd index in out shall be the corresponding entries of my second input array in1.
in0, in1 and out are all the same length.
Example:
The input arrays
in0 = [0, 1, 2, 3]
in1 = [4, 5, 6, 7]
shall be merged into the output array
out = [0, 5, 2, 7]
Is there a nicer way than to loop over the whole length of the inputs and fill my out 'by hand'?
You could use a list comprehension and select values from in0 on even indices and in1 on odd indices:
[in0[i] if i % 2 == 0 else in1[i] for i in range(len(in0))]
# [0, 5, 2, 7]
If you're happy to make a full list copy, this is simple with slicing:
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = in0[:]
>>> out[1::2] = in1[1::2]
>>> out
[0, 5, 2, 7]
If you don't mind some verbosity...
from itertools import cycle
in0 = [0, 1, 2, 3]
in1 = [4, 5, 6, 7]
out = [pair[i] for pair, i in zip(zip(in0, in1), cycle([0,1]))]
How it works:
zip(in0, in1) is a sequence of tuples, (0,4), (1,5), (2,6), (3,7).
cycle([0,1]) is an endless stream of alternating 0s and 1s to be used as indices in the tuples from step 1.
zip(zip(...), cycle(...)) produces pairs of a tuple and an index:
((0,4), 0), ((1,5), 1), ((2,6), 0), ((3,7), 1).
The list comprehension takes the correct element from each tuple.
In the end, the list comprehension is a general version of
[(0,4)[0], (1,5)[1], (2,6)[0], (3,7)[1]]
Without using loops, but not in the exact same order you requested:
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = in0[0::2] + in1[1::2]
>>> out
[0, 2, 5, 7]
EDIT: correcting the output order with itertools:
>>> import itertools
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = list(itertools.chain(*zip(in0[0::2], in1[1::2])))
>>> out
[0, 5, 2, 7]
I have 3 lists.
A_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Q_act = [2, 3]
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
All lists contain integers.
What I am trying to do is to compare Q_act with A_set and then obtain the indices of the matching numbers in A_set.
(Example:
Q_act has the elements [2, 3];
they are located at indices [1, 2] in A_set.)
Afterwards, I will use those indices to obtain the corresponding value in dur and store this in a list called p_dur_Q_act.
(Example: using the result from the previous example, [1, 2],
the values in the dur list corresponding to the indices [1, 2] should be stored in another list called p_dur_Q_act,
i.e. p_dur_Q_act should be [4, 5].)
So, how do I get the index of the common integer element (which is [1,2]) from two separate lists and plug it to another list?
So far, here is the code I have tried.
I wrote this one because it returns the indices, but not [4, 5]:
p_Q = set(Q_act).intersection(A_set)
p_dur_Q_act = [i + 1 for i, x in enumerate(p_Q)]
print(p_dur_Q_act)
I also tried this, but I receive the error TypeError: argument of type 'int' is not iterable:
p_dur_Q_act = [i + 1 for i, x in enumerate(Q_act) if any(elem in x for elem in A_set)]
print(p_dur_Q_act)
Another option is to use the enumerate iterator to generate every index, and then select only the ones you want:
a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_act = [2, 3]
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
p_dur_q_act = [i for i,v in enumerate(a_set) if v in q_act]
print([dur[p] for p in p_dur_q_act]) # [4, 5]
This is more efficient than repeatedly calling index if the number of matches is large, because the number of calls is proportional to the number of matches, while each index call takes time proportional to the length of a_set. The enumerate approach can be made even more efficient by turning q_act into a set, since in scales better with sets than with lists. At these scales, though, there will be no observable difference.
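For example, the set-based variant only changes the membership test (q_act_set is a hypothetical name):
q_act_set = set(q_act)  # average O(1) membership tests
p_dur_q_act = [i for i, v in enumerate(a_set) if v in q_act_set]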
You don't need to map these to index values, though. You can get the same result if you use zip to map a_set to dur and then select the d values whose a values are in q_act.
a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_act = {2, 3}
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
p_dur_q_act = [d for a, d in zip(a_set, dur) if a in q_act]
Use the index function to get the index of an element in the list.
>>> a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> q_act = [2, 3]
>>> dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
>>>
>>> print([dur[a_set.index(q)] for q in set(a_set).intersection(q_act)])
[4, 5]
This is a follow up to a previous question. If I have a NumPy array [0, 1, 2, 2, 3, 4, 2, 2, 5, 5, 6, 5, 5, 2, 2], for each repeat sequence (starting at each index), is there a fast way to then find all matches of that repeat sequence and return the indices of those matches?
Here, the repeat sequences are [2, 2] and [5, 5] (note that the length of the repeat is specified by the user; it is the same for all repeats and can be much greater than 2). The repeats can be found at [2, 6, 8, 11, 13] via:
import numpy as np

def consec_repeat_starts(a, n):
    # start index of every run of n consecutive equal values in a
    N = n - 1
    m = a[:-1] == a[1:]
    return np.flatnonzero(np.convolve(m, np.ones(N, dtype=int)) == N) - N + 1
But for each unique type of repeat sequence (i.e., [2, 2] and [5, 5]) I want to return something like the repeat followed by the indices where the repeat is located:
[([2, 2], [2, 6, 13]), ([5, 5], [8, 11])]
Update
Additionally, given the repeat sequences, can you return the results from a second array? So, look for [2, 2] and [5, 5] in:
[2, 2, 5, 5, 1, 4, 9, 2, 5, 5, 0, 2, 2, 2]
And the function would return:
[([2, 2], [0, 11, 12]), ([5, 5], [2, 8])]
Here's a way to do so -
def group_consec(a, n):
    idx = consec_repeat_starts(a, n)
    b = a[idx]                          # value at the start of each repeat
    sidx = b.argsort()
    c = b[sidx]
    cut_idx = np.flatnonzero(np.r_[True, c[:-1] != c[1:], True])
    idx_s = idx[sidx]
    indices = [idx_s[i:j] for (i, j) in zip(cut_idx[:-1], cut_idx[1:])]
    return c[cut_idx[:-1]], indices
# Perform lookup in another array, b
n = 2
v_a,indices_a = group_consec(a, n)
v_b,indices_b = group_consec(b, n)
idx = np.searchsorted(v_a, v_b)
idx[idx==len(v_a)] = 0
valid_mask = v_a[idx]==v_b
common_indices = [j for (i,j) in zip(valid_mask,indices_b) if i]
common_val = v_b[valid_mask]
Note that for simplicity and ease of usage, the first output arg of group_consec has the unique values per sequence. If you need them in (val, val, ...) format, simply replicate at the end. Similarly for common_val.
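For instance, a minimal sketch of that replication (v_a_full and common_val_full are hypothetical names; n is the repeat length used above):
v_a_full = [np.repeat(v, n) for v in v_a]                # e.g. 2 -> array([2, 2])
common_val_full = [np.repeat(v, n) for v in common_val]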
Suppose I have some numpy array (all elements are unique) that I want to sort in descending order. I need to find out which positions the elements of the initial array will take in the sorted array.
Example.
In1: [1, 2, 3] # Input
Out1: [2, 1, 0] # Expected output
In2: [1, -2, 2] # Input
Out2: [1, 2, 0] # Expected output
I tried this one:
import numpy as np

def find_positions(A):
    A = np.array(A)
    A_sorted = np.sort(A)[::-1]
    return np.argwhere(A[:, None] == A_sorted[None, :])[:, 1]
But it doesn't work when the input array is very large (len > 100000). What did I do wrong, and how can I resolve it?
Approach #1
We could use double argsort -
np.argsort(a)[::-1].argsort() # a is input array/list
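For example, applied to the second example from the question:
import numpy as np

a = [1, -2, 2]
print(np.argsort(a)[::-1].argsort())  # [1 2 0], matching Out2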
Approach #2
We could use one argsort and then array-assignment -
# https://stackoverflow.com/a/41242285/ #Andras Deak
def argsort_unique(idx):
    n = idx.size
    sidx = np.empty(n, dtype=int)
    sidx[idx] = np.arange(n)
    return sidx
out = argsort_unique(np.argsort(a)[::-1])
Take a look at the numpy.argsort(...) function:
Returns the indices that would sort an array.
Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.
Here is the reference from the documentation, and the following is a simple example:
import numpy
arr = numpy.random.rand(100000)
indexes = numpy.argsort(arr)
The indexes array will contain the indices that would sort arr in ascending order.
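Note that argsort alone gives ascending order; to get the positions in the descending sort that the question asks for, a second argsort is still needed. A small sketch building on this, using the question's second example:
import numpy
a = numpy.array([1, -2, 2])
desc = numpy.argsort(a)[::-1]    # indices that would sort a in descending order: [2 0 1]
positions = numpy.argsort(desc)  # position each original element takes: [1 2 0]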
I face the same problem for plain lists, and would like to avoid using numpy. So I propose a possible solution that should also work for an np.array, and which avoids reversal of the result:
def argsort(A, key=None, reverse=False):
    "Indirect sort of list or array A: return indices of elements in order."
    keyfunc = (lambda i: A[i]) if key is None else (lambda i: key(A[i]))
    return sorted(range(len(A)), key=keyfunc, reverse=reverse)
Example of use:
>>> L = [3,1,4,1,5,9,2,6]
>>> argsort( L )
[1, 3, 6, 0, 2, 4, 7, 5]
>>> [L[i] for i in _]
[1, 1, 2, 3, 4, 5, 6, 9]
>>> argsort( L, key=lambda x:(x%2,x) ) # even elements first
[6, 2, 7, 1, 3, 0, 4, 5]
>>> [L[i] for i in _]
[2, 4, 6, 1, 1, 3, 5, 9]
>>> argsort( L, key=lambda x:(x%2,x), reverse = True)
[5, 4, 0, 1, 3, 7, 2, 6]
>>> [L[i] for i in _]
[9, 5, 3, 1, 1, 6, 4, 2]
Feedback would be welcome! (Efficiency compared to previously proposed solutions? Suggestions for improvements?)
I have a list with mixed sequences like
[1,2,3,4,5,2,3,4,1,2]
I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
This is obtained by noting that the sequence decreases at the 2, so we cut the first piece there; another decrease occurs at the 1, so we cut again there.
Another example is with the sequence
[3,2,1]
the output should be
[[3], [2], [1]]
In the event that the given sequence is increasing we return the same sequence. For example
[1,2,3]
returns the same result, i.e.
[[1, 2, 3]]
For a repeating list like
[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
the output should be
[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
What I did to achieve this is define the following function
def splitter(L):
    result = []
    tmp = 0
    initialPoint = 0
    for i in range(len(L)):
        if L[i] < tmp:
            tmpp = L[initialPoint:i]
            result.append(tmpp)
            initialPoint = i
        tmp = L[i]
    result.append(L[initialPoint:])
    return result
The function works 100%, but what I need is to do the same with itertools so that I can improve the efficiency of my code. Is there a way to do this with the itertools package and avoid the explicit looping?
With numpy, you can use numpy.split; it requires the indices of the split positions. Since you want to split where the value decreases, you can use numpy.diff to calculate the differences, check where the difference is smaller than zero, and use numpy.where to retrieve the corresponding indices. An example with the last case in the question:
import numpy as np
lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
# [array([1, 2, 2, 2]),
# array([1, 2, 3, 3]),
# array([1, 1, 1, 2, 3, 4]),
# array([1, 2, 3, 4, 5, 6])]
Psidom already has you covered with a good answer, but another NumPy solution would be to use scipy.signal.argrelmax to acquire the local maxima, then np.split.
from scipy.signal import argrelmax
import numpy as np
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)
Assume your original input array:
a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]
First find the places where the splits shall occur:
p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]
Then create slices for each such split:
print([ a[m:n] for m, n in zip([ 0 ] + p, p + [ None ]) ])
This will print this:
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
I propose using more descriptive names than p, n, m, etc. ;-)
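As a further option, here is one possible sketch that stays within itertools, using itertools.accumulate and itertools.groupby (split_increasing is a hypothetical name):
from itertools import accumulate, groupby
from operator import itemgetter

def split_increasing(lst):
    # group id grows by one at every position where the value decreases
    ids = [0] + list(accumulate(int(b < a) for a, b in zip(lst, lst[1:])))
    return [[v for _, v in grp] for _, grp in groupby(zip(ids, lst), key=itemgetter(0))]

print(split_increasing([1, 2, 3, 4, 5, 2, 3, 4, 1, 2]))
# [[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]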
I am trying to remove duplicate elements from a list: every element that occurs an even number of times should be dropped entirely, and every element that occurs an odd number of times should be kept only once.
For example, for the list [1, 2, 3, 3, 3, 5, 8, 1, 8]: 1 occurs 2 times, 3 occurs 3 times, and 8 occurs 2 times. So 1 and 8 should be out, and instead of three 3s I need to leave only one.
This is what I came up with:
def remove_odd_duplicates(arr):
    h = {}
    for i in arr:
        if i in h:
            h[i] += 1
        else:
            h[i] = 1
    arr = []
    for i in h:
        if h[i] % 2:
            arr.append(i)
    return arr
It returns everything correctly: [2, 3, 5], but I do believe that this can be written in a nicer way. Any ideas?
You can use collections.Counter and a list comprehension, like this:
data = [1, 2, 3, 3, 3, 5, 8, 1, 8]
from collections import Counter
print([item for item, count in Counter(data).items() if count % 2])
# [2, 3, 5]
The Counter gives a dictionary, with every element in the input iterable as the keys and their corresponding counts as the values. So we iterate over that dict, check whether the count is odd, and keep only those items.
Note: The complexity of this solution is still O(N), just like your original program.
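For reference, this is roughly what the Counter holds for the sample data (the exact ordering shown by its repr can vary across Python versions):
>>> from collections import Counter
>>> Counter([1, 2, 3, 3, 3, 5, 8, 1, 8])
Counter({3: 3, 1: 2, 8: 2, 2: 1, 5: 1})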
If order doesn't matter:
>>> a = [1, 2, 3, 3, 3, 5, 8, 1, 8]
>>> list(set([x for x in a if a.count(x)%2 == 1]))
[2, 3, 5]
The list comprehension [x for x in a if a.count(x)%2 == 1] returns only the elements which appear an odd number of times in the list. list(set(...)) is a common way of removing duplicate entries from a list.
You can possibly use scipy.stats.itemfreq:
>>> from scipy.stats import itemfreq
>>> xs = [1, 2, 3, 3, 3, 5, 8, 1, 8]
>>> ifreq = itemfreq(xs)
>>> ifreq
array([[1, 2],
       [2, 1],
       [3, 3],
       [5, 1],
       [8, 2]])
>>> i = ifreq[:, 1] % 2 != 0
>>> ifreq[i, 0]
array([2, 3, 5])
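Note that scipy.stats.itemfreq has since been deprecated (and removed in recent SciPy releases); numpy.unique with return_counts=True gives the same information, so an equivalent sketch would be:
>>> import numpy as np
>>> xs = [1, 2, 3, 3, 3, 5, 8, 1, 8]
>>> vals, counts = np.unique(xs, return_counts=True)
>>> vals[counts % 2 == 1]
array([2, 3, 5])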