Vectorize a for loop based on condition using numpy - python

I have 2 numpy arrays l1 and l2 as follows:
start, jump = 1, 2
L = 8
l1 = np.arange(start, L)
l2 = np.arange(start + jump, L+jump)
This results in:
l1 = [1 2 3 4 5 6 7]
l2 = [3 4 5 6 7 8 9]
Now, I want 2 resultant arrays r1 and r2 such that while appending elements of l1 and l2 one by one in r1 and r2 respectively, it should check if r2 does not contain $i^{th}$ element of l1.
Implementing this using for loop is easy. But I am stuck on how to implement it using only numpy (without using loops) as I am new to it.
This is what I tried and what I am expecting:
r1 = []
r2 = []
for i in range(len(l1)):
    if l1[i] not in r2:
        r1.append(l1[i])
        r2.append(l2[i])
This gives:
r1 = [1, 2, 5, 6]
r2 = [3, 4, 7, 8]
Thanks in advance :)

As suggested by @Chrysoplylaxs in the comments, I made a boolean mask and it worked like a charm!
mask = np.tile([True]*jump + [False]*jump, len(l1)//jump).astype(bool)
r1 = l1[mask[:len(l1)]]
r2 = l2[mask[:len(l2)]]
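The mask works because l1 is a contiguous range and l2 is l1 shifted by jump, so kept and skipped elements alternate in blocks of jump. Below is a minimal sketch that checks this against the original loop. The helper names loop_filter and mask_filter are made up for illustration, and the mask is built from index arithmetic rather than np.tile so that it also covers arrays shorter than jump:

```python
import numpy as np

def loop_filter(l1, l2):
    """Reference implementation: the loop from the question."""
    r1, r2 = [], []
    for x, y in zip(l1, l2):
        if x not in r2:
            r1.append(x)
            r2.append(y)
    return np.array(r1), np.array(r2)

def mask_filter(l1, l2, jump):
    """Keep a block of `jump` elements, skip the next `jump`, and repeat."""
    mask = (np.arange(len(l1)) // jump) % 2 == 0
    return l1[mask], l2[mask]

start, jump, L = 1, 2, 8
l1 = np.arange(start, L)
l2 = np.arange(start + jump, L + jump)

r1, r2 = mask_filter(l1, l2, jump)
print(r1)  # [1 2 5 6]
print(r2)  # [3 4 7 8]
```

This only holds for inputs with the structure shown in the question; for arbitrary arrays the loop and the mask will generally disagree.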

Related

Adding "i"th elements of each of the inner arrays of an array (2 dimensional) in Python and to make a new array with the addition as the "i"th element

# Python code
n = 5
sprints = [2, 4, 1, 3]
s = [[0]*n for i in range(len(sprints)-1)]  # 3 rows of 5 elements
add_arr = [0 for i in range(n)]  # array of 5
for i in range(len(sprints)-1):
    for j in range(n):
        if sprints[i] < sprints[i+1]:
            for k in range(sprints[i]-1, sprints[i+1]):
                s[i][k] = 1
        else:
            for m in range(sprints[i+1]-1, sprints[i]):
                s[i][m] = 1
print(s)
Output -
[[0,1,1,1,0],[1,1,1,1,0],[1,1,1,0,0]]
I want to add each of the "i"th elements of the inner arrays to create a new array such that:
add_arr = [[0+1+1],[1+1+1],[1+1+1],[1+1+0],[0+0+0]] = [2,3,3,2,0]
Please Help!
Use zip and map:
add_arr = list(map(sum, zip(*s)))
print(add_arr)
[2, 3, 3, 2, 0]
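zip(*s) unpacks the rows of s and regroups them column by column, so map(sum, ...) adds up each column. Here is a small sketch of the intermediate step, with a NumPy equivalent for comparison (using NumPy is an assumption here; the question works with plain lists):

```python
import numpy as np

s = [[0, 1, 1, 1, 0],
     [1, 1, 1, 1, 0],
     [1, 1, 1, 0, 0]]

columns = list(zip(*s))            # one tuple per column of s
print(columns[0])                  # (0, 1, 1)

add_arr = list(map(sum, columns))  # sum each column
print(add_arr)                     # [2, 3, 3, 2, 0]

# Same result with NumPy: sum along axis 0 (down the rows)
add_arr_np = np.sum(s, axis=0).tolist()
```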

Getting unique values in python using List Comprehension technique

I want to get the values that appear in one of the lists but not in the other. I even tried using '<>', but it reports invalid syntax. I am trying to do this with a list comprehension.
com_list = []
a1 = [1,2,3,4,5]
b1 = [6,4,2,1]
come_list = [a for a in a1 for b in b1 if a != b ]
Output:
[1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5]
My expected output would be [3, 5, 6]
What you want is called symmetric difference, you can do:
a1 = [1,2,3,4,5]
b1 = [6,4,2,1]
set(a1).symmetric_difference(b1)
# {3, 5, 6}
which you can also write as:
set(a1) ^ set(b1)
If you really want a list in the end, just convert it:
list(set(a1) ^ set(b1))
# [3, 5, 6]
a1 = [1,2,3,4,5]
b1 = [6,4,2,1]
If you really want to do that using list comprehensions, well, here it is, but it's really not the right thing to do here.
A totally inefficient version:
# Don't do that !
sym_diff = [x for x in a1+b1 if x in a1 and x not in b1 or x in b1 and x not in a1]
print(sym_diff)
# [3, 5, 6]
It would be a bit better using sets to test membership efficiently:
# Don't do that either
a1 = set([1,2,3,4,5])
b1 = set([6,4,2,1])
sym_diff = [x for x in a1|b1 if x in a1 and x not in b1 or x in b1 and x not in a1]
print(sym_diff)
# [3, 5, 6]
But if you start using sets, which is the right thing to do here, use them all the way properly and use symmetric_difference.
You can do
come_list =[i for i in list((set(a1) - set(b1))) + list((set(b1) - set(a1)))]
print(come_list)
Output
[3, 5, 6]
This new list contains all unique numbers for both of the lists together.
The problem with the line come_list = [a for a in a1 for b in b1 if a != b ] is that it pairs every item of the first list with every item of the second and emits a once for each b that differs from it, so it does not give the values that are unique to one of the lists.
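To see why the nested comprehension repeats values, it helps to count how often each a is emitted: once for every b that differs from it. The sketch below shows that, followed by a comprehension that keeps the original order and uses sets only for membership tests (the names counts, in_a and in_b are illustrative):

```python
from collections import Counter

a1 = [1, 2, 3, 4, 5]
b1 = [6, 4, 2, 1]

# Each a appears once per b that is different from it
pairs = [a for a in a1 for b in b1 if a != b]
counts = Counter(pairs)
print(counts)  # 1, 2 and 4 appear 3 times; 3 and 5 appear 4 times

# Order-preserving symmetric difference
in_a, in_b = set(a1), set(b1)
sym_diff = [x for x in a1 if x not in in_b] + [x for x in b1 if x not in in_a]
print(sym_diff)  # [3, 5, 6]
```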

Construct a matrix using a for loop

I have calculated 9 matrix elements named sij, with i and j being variables (i,j = [1, 2, 3]). Here, i denotes rows and j columns. Suppose I want a 3x3 matrix that consists of the matrix elements s11, s12, ... s32, s33 (nine elements in total).
s11 = 1
s12 = 2
s13 = 3
(...)
s33 = 9
How can I use for loops to construct a matrix out of these elements? Like this:
matrix = [[s11, s12, s13], [s21, s22, s23], [s31, s32, s33]]
So that I get a matrix that looks like this.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I would consider renaming the sij to s[i][j]. Then using them in loops would be trivial.
s[1][1] = 1
s[1][2] = 2
s[1][3] = 3
(...)
s[3][3] = 9
Then, instead of:
matrix = [[s11, s12, s13], [s21, s22, s23], [s31, s32, s33]]
you can use two nested loops to construct the matrix:
matrix = [[0]*3 for _ in range(3)]
for i in range(1, 4):
    for j in range(1, 4):
        matrix[i-1][j-1] = s[i][j]
BTW, having a 0 based numbering would be more Pythonic.
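One way to make the elements loopable is a dictionary keyed by (i, j). This is a minimal sketch, assuming the nine values are 1 to 9 as in the question; the dictionary s and the way it is filled are illustrative and not part of the original answer:

```python
# Hypothetical storage: s[(i, j)] replaces the separate names s11 ... s33
s = {(i, j): 3 * (i - 1) + j for i in range(1, 4) for j in range(1, 4)}

matrix = []
for i in range(1, 4):          # rows
    row = []
    for j in range(1, 4):      # columns
        row.append(s[(i, j)])
    matrix.append(row)

print(matrix)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```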
You are better off writing an array and reshaping such that you don't need to type out elements to variables but here is a one-liner
>>> np.reshape([eval('s{0}{1}'.format(x,y)) for x in range(1,4) for y in range(1,4)], (3,3))
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
We can use this code for creating our desired matrix with for loop:
n = int(input('n:'))
for i in range(1,n):
    for j in range(1,n):
        if i<j:
            print(1,end = ' ')
        else :
            print('0',end = ' ')
    print()
Output
n:5
0 1 1 1
0 0 1 1
0 0 0 1
0 0 0 0

numpy method to join two meshgrids and their result arrays

Consider two n-dimensional, possibly overlapping, numpy meshgrids, say
m1 = (x1, y1, z1, ...)
m2 = (x2, y2, z2, ...)
Within m1 and m2 there are no duplicate coordinate tuples. Each meshgrid has a result array, which may result from different functions:
r1 = f1(m1)
r2 = f2(m2)
such that f1(m) != f2(m). Now I would like to join those two meshgrids and their result arrays, e.g. m=m1&m2 and r=r1&r2 (where & would denote some kind of union), such that the coordinate tuples in m are still sorted and the values in r still correspond to the original coordinate tuples. Newly created coordinate tuples should be identifiable (for instance with a special value).
To elaborate on what I'm after, I have two examples that kind of do what I want with simple for and if statements. Here's a 1D example:
x1 = [1, 5, 7]
r1 = [i**2 for i in x1]
x2 = [2, 4, 6]
r2 = [i*3 for i in x2]
x,r = list(zip(*sorted([(i,j) for i,j in zip(x1+x2,r1+r2)],key=lambda x: x[0])))
which gives
x = (1, 2, 4, 5, 6, 7)
r = (1, 6, 12, 25, 18, 49)
For 2D it starts getting quite complicated:
import numpy as np
a1 = [1, 5, 7]
b1 = [2, 5, 6]
x1,y1 = np.meshgrid(a1,b1)
r1 = x1*y1
a2 = [2, 4, 6]
b2 = [1, 3, 8]
x2, y2 = np.meshgrid(a2,b2)
r2 = 2*x2
a = [1, 2, 4, 5, 6, 7]
b = [1, 2, 3, 5, 6, 8]
x,y = np.meshgrid(a,b)
r = np.ones(x.shape)*-1
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        if x[i,j] in a1 and y[i,j] in b1:
            r[i,j] = r1[a1.index(x[i,j]),b1.index(y[i,j])]
        elif x[i,j] in a2 and y[i,j] in b2:
            r[i,j] = r2[a2.index(x[i,j]),b2.index(y[i,j])]
This gives the desired result, with new coordinate pairs having the value -1:
x=
[[1 2 4 5 6 7]
[1 2 4 5 6 7]
[1 2 4 5 6 7]
[1 2 4 5 6 7]
[1 2 4 5 6 7]
[1 2 4 5 6 7]]
y=
[[1 1 1 1 1 1]
[2 2 2 2 2 2]
[3 3 3 3 3 3]
[5 5 5 5 5 5]
[6 6 6 6 6 6]
[8 8 8 8 8 8]]
r=
[[ -1. 4. 4. -1. 4. -1.]
[ 2. -1. -1. 5. -1. 6.]
[ -1. 8. 8. -1. 8. -1.]
[ 10. -1. -1. 25. -1. 30.]
[ 14. -1. -1. 35. -1. 42.]
[ -1. 12. 12. -1. 12. -1.]]
but this will also become slow quickly with increasing dimensions and array sizes. So here, finally, is the question: how can this be done using only numpy functions? If it is not possible, what would be the fastest way to implement this in Python? If it is relevant, I prefer using Python 3. Note that the functions I use in the examples are not the actual functions I use.
We can make use of some masking to replace the A in B parts to give us 1D masks. Then, we can use those masks with np.ix_ to extend to desired number of dimensions.
Thus, for a 2D case, it would be something along these lines -
# Initialize o/p array
r_out = np.full([len(b), len(a)],-1)  # rows follow b, columns follow a
# Assign for the IF part
mask_a1 = np.in1d(a,a1)
mask_b1 = np.in1d(b,b1)
r_out[np.ix_(mask_b1, mask_a1)] = r1.T
# Assign for the ELIF part
mask_a2 = np.in1d(a,a2)
mask_b2 = np.in1d(b,b2)
r_out[np.ix_(mask_b2, mask_a2)] = r2.T
a could be created, like so -
a = np.concatenate((a1,a2))
a.sort()
Similarly, for b.
Also, we could make use of indices instead of masks for use with np.ix_. For the same, we could use np.searchsorted. Thus, instead of the mask np.in1d(a,a1), we could get corresponding indices with np.searchsorted(a,a1) and so on for the rest of the masks. This should be considerably faster.
For a 3D case, I would assume that we would have another array, say c. Thus, the initialization part would involve using len(c). There would be one more mask/index-array corresponding to c and hence one more term into np.ix_ and there would be transpose of r1 and r2.
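For reference, here is a self-contained sketch of the 2D case using the data from the question, in both the mask and the index variant. It assumes that a1, b1, a2 and b2 are sorted and of equal length, as in the question, and it uses np.isin, the newer spelling of np.in1d:

```python
import numpy as np

# Data from the question
a1, b1 = [1, 5, 7], [2, 5, 6]
a2, b2 = [2, 4, 6], [1, 3, 8]
x1, y1 = np.meshgrid(a1, b1)
r1 = x1 * y1
x2, y2 = np.meshgrid(a2, b2)
r2 = 2 * x2

# Joined, sorted coordinate axes
a = np.sort(np.concatenate((a1, a2)))
b = np.sort(np.concatenate((b1, b2)))

# Mask version: rows follow b, columns follow a
r_mask = np.full((len(b), len(a)), -1)
r_mask[np.ix_(np.isin(b, b1), np.isin(a, a1))] = r1.T
r_mask[np.ix_(np.isin(b, b2), np.isin(a, a2))] = r2.T

# Index version: searchsorted returns the positions of a1 in a, and so on
r_idx = np.full((len(b), len(a)), -1)
r_idx[np.ix_(np.searchsorted(b, b1), np.searchsorted(a, a1))] = r1.T
r_idx[np.ix_(np.searchsorted(b, b2), np.searchsorted(a, a2))] = r2.T

print(r_mask)
```

Both variants reproduce the r array printed in the question, with -1 marking the newly created coordinate pairs.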
Divakar's answer is exactly what I needed. I wanted, however, to still try out the second suggestion in that answer and on top I did some profiling. I thought the results may be interesting to others. Here is the code I used for profiling:
import numpy as np
import timeit
import random
def for_join_2d(x1,y1,r1, x2,y2,r2):
    """
    The algorithm from the question.
    """
    a = sorted(list(x1[0,:])+list(x2[0,:]))
    b = sorted(list(y1[:,0])+list(y2[:,0]))
    x,y = np.meshgrid(a,b)
    r = np.ones(x.shape)*-1
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            if x[i,j] in a1 and y[i,j] in b1:
                r[i,j] = r1[a1.index(x[i,j]),b1.index(y[i,j])]
            elif x[i,j] in a2 and y[i,j] in b2:
                r[i,j] = r2[a2.index(x[i,j]),b2.index(y[i,j])]
    return x,y,r

def mask_join_2d(x1,y1,r1,x2,y2,r2):
    """
    Divakar's original answer.
    """
    a = np.sort(np.concatenate((x1[0,:],x2[0,:])))
    b = np.sort(np.concatenate((y1[:,0],y2[:,0])))
    # Initialize o/p array
    x,y = np.meshgrid(a,b)
    r_out = np.full([len(a), len(b)],-1)
    # Assign for the IF part
    mask_a1 = np.in1d(a,a1)
    mask_b1 = np.in1d(b,b1)
    r_out[np.ix_(mask_b1, mask_a1)] = r1.T
    # Assign for the ELIF part
    mask_a2 = np.in1d(a,a2)
    mask_b2 = np.in1d(b,b2)
    r_out[np.ix_(mask_b2, mask_a2)] = r2.T
    return x,y,r_out

def searchsort_join_2d(x1,y1,r1,x2,y2,r2):
    """
    Divakar's second suggested solution using searchsort.
    """
    a = np.sort(np.concatenate((x1[0,:],x2[0,:])))
    b = np.sort(np.concatenate((y1[:,0],y2[:,0])))
    # Initialize o/p array
    x,y = np.meshgrid(a,b)
    r_out = np.full([len(a), len(b)],-1)
    # the IF part
    ind_a1 = np.searchsorted(a,a1)
    ind_b1 = np.searchsorted(b,b1)
    r_out[np.ix_(ind_b1,ind_a1)] = r1.T
    # the ELIF part
    ind_a2 = np.searchsorted(a,a2)
    ind_b2 = np.searchsorted(b,b2)
    r_out[np.ix_(ind_b2,ind_a2)] = r2.T
    return x,y,r_out

## the profiling code:
if __name__ == '__main__':
    N1 = 100
    N2 = 100
    coords_a = [i for i in range(N1)]
    coords_b = [i*2 for i in range(N2)]
    a1 = random.sample(coords_a, N1//2)
    b1 = random.sample(coords_b, N2//2)
    a2 = [i for i in coords_a if i not in a1]
    b2 = [i for i in coords_b if i not in b1]
    x1,y1 = np.meshgrid(a1,b1)
    r1 = x1*y1
    x2,y2 = np.meshgrid(a2,b2)
    r2 = 2*x2
    print("original for loop")
    print(min(timeit.Timer(
        'for_join_2d(x1,y1,r1,x2,y2,r2)',
        setup = 'from __main__ import for_join_2d,x1,y1,r1,x2,y2,r2',
    ).repeat(7,1000)))
    print("with masks")
    print(min(timeit.Timer(
        'mask_join_2d(x1,y1,r1,x2,y2,r2)',
        setup = 'from __main__ import mask_join_2d,x1,y1,r1,x2,y2,r2',
    ).repeat(7,1000)))
    print("with searchsort")
    print(min(timeit.Timer(
        'searchsort_join_2d(x1,y1,r1,x2,y2,r2)',
        setup = 'from __main__ import searchsort_join_2d,x1,y1,r1,x2,y2,r2',
    ).repeat(7,1000)))
For each function I used 7 sets of 1000 iterations and picked the fastest set for evaluation. The results for two 10x10 arrays were:
original for loop
0.5114614190533757
with masks
0.21544912096578628
with searchsort
0.12026709201745689
and for two 100x100 arrays it was:
original for loop
247.88183582702186
with masks
0.5245905339252204
with searchsort
0.2439237720100209
For big matrices the use of numpy functionality unsurprisingly makes a huge difference and indeed searchsort and indexing instead of masking about halves the run time.

Updating list values with new values read - Python [duplicate]

I wasn't really sure how to ask this. I have a list of 3 values initially set to zero. Then I read 3 values in at a time from the user and I want to update the 3 values in the list with the new ones I read.
cordlist = [0]*3
Input:
3 4 5
I want list to now look like:
[3, 4, 5]
Input:
2 3 -6
List should now be
[5, 7, -1]
How do I go about accomplishing this? This is what I have:
cordlist += ([int(g) for g in raw_input().split()] for i in xrange(n))
but that just adds a new list, and doesn't really update the values in the previous list
In [17]: import numpy as np
In [18]: lst=np.array([0]*3)
In [19]: lst+=np.array([int(g) for g in raw_input().split()])
3 4 5
In [20]: lst
Out[20]: array([3, 4, 5])
In [21]: lst+=np.array([int(g) for g in raw_input().split()])
2 3 -6
In [22]: lst
Out[22]: array([ 5, 7, -1])
I would do something like this:
cordlist = [0, 0, 0]
for i in xrange(n):
    cordlist = map(sum, zip(cordlist, map(int, raw_input().split())))
Breakdown:
map(int, raw_input().split()) is equivalent to [int(i) for i in raw_input().split()]
zip basically takes a number of lists, and returns a list of tuples containing the elements that are at the same index. See the docs for more information.
map, as I explained earlier, applies a function to each of the elements in an iterable, and returns a list. See the docs for more information.
cordlist = [v1+int(v2) for v1, v2 in zip(cordlist, raw_input().split())]
tested like that:
l1 = [1,2,3]
l2 = [2,3,4]
print [v1+v2 for v1, v2 in zip(l1, l2)]
result: [3, 5, 7]
I would go that way using itertools.zip_longest:
from itertools import zip_longest
def add_lists(l1, l2):
    return [int(i)+int(j) for i, j in zip_longest(l1, l2, fillvalue=0)]
result = []
while True:
    l = input().split()
    result = add_lists(result, l)  # keep the running total
    print('result = ', result)
Output:
>>> 1 2 3
result = [1, 2, 3]
>>> 3 4 5
result = [4, 6, 8]
A more compact version of @namit's numpy solution:
>>> import numpy as np
>>> lst = np.zeros(3, dtype=int)
>>> for i in range(2):
...     lst += np.fromstring(raw_input(), dtype=int, sep=' ')
3 4 5
2 3 -6
>>> lst
array([ 5, 7, -1])
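The answers above use Python 2 names (raw_input, xrange). Here is a Python 3 sketch of the same running element-wise sum. To keep it self-contained, the input lines are taken from a list (lines is a stand-in for what the user would type) rather than read with input():

```python
lines = ["3 4 5", "2 3 -6"]   # stand-in for values typed by the user

cordlist = [0, 0, 0]
for line in lines:
    values = [int(g) for g in line.split()]
    # Add each new value to the running total at the same position
    cordlist = [total + v for total, v in zip(cordlist, values)]
    print(cordlist)
# [3, 4, 5]
# [5, 7, -1]
```

Replacing the loop over lines with repeated calls to input() gives the interactive behaviour described in the question.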
