I want to create a Python program which computes a matrix from a vector of coefficients. So let's say we have the following vector of coefficients c = [c0, c1, c2] = [0, 1, 0]; then I want to compute the lower triangular matrix whose k-th subdiagonal is filled with ck:

A = [[c0,  0,  0],
     [c1, c0,  0],
     [c2, c1, c0]]

i.e. for this particular c:

A = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]
So how do I go from the vector c to creating a lower triangular matrix A? I know how to index it manually, but I need a program that can do it. I was thinking of a for-loop inside another for-loop, but I struggle with how it is done practically. What do you guys think should be done here?
One way (assuming you're using plain arrays and not numpy or anything):
src = [0, 1, 0]
dst = [
    [
        src[i - j] if i >= j else 0
        for j in range(len(src))
    ]
    for i in range(len(src))
]
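For the example coefficients this produces the expected lower triangular matrix:

print(dst)
# [[0, 0, 0], [1, 0, 0], [0, 1, 0]]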
You can try the following:
import numpy as np
c = [1, 2, 3, 4, 5]
n = len(c)
a = np.zeros((n,n))
for i in range(n):
    np.fill_diagonal(a[i:, :], c[i])
print(a)
It gives:
[[1. 0. 0. 0. 0.]
[2. 1. 0. 0. 0.]
[3. 2. 1. 0. 0.]
[4. 3. 2. 1. 0.]
[5. 4. 3. 2. 1.]]
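If SciPy is available, a loop-free alternative is scipy.linalg.toeplitz; this is a sketch using the same coefficients as above (passing a zero first row keeps the matrix lower triangular):

import numpy as np
from scipy.linalg import toeplitz

c = [1, 2, 3, 4, 5]
# first column is c; the first row is c[0] followed by zeros
a = toeplitz(c, np.zeros(len(c)))
print(a)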
I'm trying to create a matrix that reads:
[0,1,2]
[3,4,5]
[6,7,8]
However, my elements keep repeating. How do I fix this?
import numpy as np
n = 3
X = np.empty(shape=[0, n])
for i in range(3):
    for j in range(1, 4):
        for k in range(1, 7):
            X = np.append(X, [[(3*i), ((3*j) - 2), ((3*k) - 1)]], axis=0)
print(X)
Results:
[[ 0. 1. 2.]
[ 0. 1. 5.]
[ 0. 1. 8.]
[ 0. 1. 11.]
[ 0. 1. 14.]
[ 0. 1. 17.]
[ 0. 4. 2.]
[ 0. 4. 5.]
I'm not really sure how you think your code was supposed to work. You are appending a row to X on every iteration of the innermost loop, i.e. 3 * 3 * 6 = 54 times, so you end up with a 54 x 3 matrix.
I think maybe you meant to do:
for i in range(3):
    X = np.append(X, [[3*i, 3*i + 1, 3*i + 2]], axis=0)
Just so you know, appending to arrays is usually discouraged (build a list of lists, then turn it into a numpy array), as shown below.
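A minimal sketch of that list-based approach for the same 3x3 matrix:

import numpy as np

rows = []
for i in range(3):
    rows.append([3*i, 3*i + 1, 3*i + 2])  # build each row in plain Python
X = np.array(rows)  # convert to a numpy array once, at the end
print(X)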
You could also do
>>> np.arange(9).reshape((3,3))
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
I have a numpy array of shape [12, 8, 5, 5]. I want to modify the values of the 3rd and 4th dimensions for each element. For example:
import numpy as np
x = np.zeros((12, 8, 5, 5))
print(x[0,0,:,:])
Output:
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
Modify values:
y = np.ones((5,5))
x[0,0,:,:] = y
print(x[0,0,:,:])
Output:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
I can modify all x[i,j,:,:] using two for loops. But I was wondering if there is any pythonic way to do it without running two loops. Just curious to know :)
UPDATE
Actual use case:
dict_weights = copy.deepcopy(combined_weights)
for i in range(0, len(combined_weights[each_layer][:, 0, 0, 0])):
    for j in range(0, len(combined_weights[each_layer][0, :, 0, 0])):
        # Extract the 5x5 kernel
        trans_weight = combined_weights[each_layer][i, j]
        trans_weight = np.fliplr(np.flipud(trans_weight))
        # Update
        dict_weights[each_layer][i, j] = trans_weight
NOTE: The dimensions i, j of combined_weights can vary. There are around 200 elements in this list with varying i and j dimensions, but the 3rd and 4th dimensions are always the same (i.e. 5x5).
I just want to know if I can update the elements of combined_weights with the flipped 5x5 values without running 2 for loops.
Thanks.
Simply do:
dict_weights[each_layer] = combined_weights[each_layer][...,::-1,::-1]
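As a quick sanity check, here is a small sketch with made-up dimensions showing that the reversed slices reproduce what the fliplr/flipud loop was doing:

import numpy as np

w = np.arange(2 * 3 * 5 * 5).reshape(2, 3, 5, 5)
flipped = w[..., ::-1, ::-1]  # reverse the last two axes in one shot

for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        assert np.array_equal(flipped[i, j], np.fliplr(np.flipud(w[i, j])))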
I'm trying to implement a k-nearest neighbour classifier in Python, and so I want to calculate the Euclidean distance. I have a dataset that I have converted into a big numpy array
[[ 0. 0. 4. ..., 1. 0. 1.]
[ 0. 0. 5. ..., 0. 0. 1.]
[ 0. 0. 14. ..., 16. 9. 1.]
...,
[ 0. 0. 3. ..., 2. 0. 3.]
[ 0. 1. 7. ..., 0. 0. 3.]
[ 0. 2. 10. ..., 0. 0. 3.]]
where the last element of each row indicates the class. So when calculating the Euclidean distance, I obviously don't want to include the last element. I thought I could do the following
for row in dataset:
    distance = euclidean_distance(vector, row[:dataset.shape[1] - 1])
but that still includes the last element
print row
>>> [[ 0. 0. 4. ..., 1. 0. 1.]]
print row[:dataset.shape[1] - 1]
>>> [[ 0. 0. 4. ..., 1. 0. 1.]]
as you can see both are the same.
You can subset the data using numpy slicing. If you find yourself iterating over a numpy array, stop and try to find a method that takes advantage of the vectorized nature of numpy operations.
Assuming your array is called arr:
data_points = arr[:,:-1]
classes = arr[:,-1]
For distance to vector calculations:
To find the distance between a 1d array and all of the rows of a 2d array, you can use the following. It assumes the 1d array is v and the 2d array is arr.
dist = np.sqrt(np.power(arr - v, 2).sum(axis=1))
dist will be a 1d array of distances. (For a k-nearest-neighbour ranking you can skip the np.sqrt and compare squared distances, since the square root doesn't change the ordering.)
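Putting the pieces together, here is a minimal sketch with made-up numbers (the dataset below is hypothetical, standing in for the real array):

import numpy as np

# hypothetical dataset; last column is the class label
dataset = np.array([[0., 0., 4., 1., 0., 1.],
                    [0., 0., 5., 0., 0., 1.],
                    [0., 0., 14., 16., 9., 1.],
                    [0., 0., 3., 2., 0., 3.]])

data_points = dataset[:, :-1]  # everything except the class column
classes = dataset[:, -1]

v = data_points[0]  # query vector
dist = np.sqrt(np.power(data_points - v, 2).sum(axis=1))
print(classes[np.argsort(dist)])  # classes ordered nearest to farthest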
For pairwise calculations:
The following function takes a 2d array of numbers and returns the upper-triangular matrix of pairwise distances using the given L-norm (the Euclidean distance measure is the L=2 metric).
def pairwise_distance(arr, L=2):
    d = arr.shape[0]
    out = np.zeros((d, d))
    for f in range(1, d):
        # fill the f-th superdiagonal: out[:-f].ravel()[f::d+1] is a view
        # onto exactly the d-f entries at positions (0, f), (1, f+1), ...
        out[:-f].ravel()[f::d + 1] = np.power(np.abs(arr[:-f] - arr[f:]), L).sum(axis=1)
    return np.power(out, 1.0 / L)
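For example, with points spaced along 3-4-5 right triangles the entries are easy to check by hand:

pts = np.array([[0., 0.],
                [3., 4.],
                [6., 8.]])
print(pairwise_distance(pts))
# [[ 0.  5. 10.]
#  [ 0.  0.  5.]
#  [ 0.  0.  0.]]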
I have a numpy array D of dimensions 4x4
I want a new numpy array based on an user defined value v
If v=2, the new numpy array should be [D D].
If v=3, the new numpy array should be [D D D]
How do I initialise such a numpy array, as numpy.zeros(v) doesn't allow me to place arrays as elements?
If I understand correctly, you want to take a 2D array and tile it v times in the first dimension? You can use np.repeat:
# a 2D array
D = np.arange(4).reshape(2, 2)
print D
# [[0 1]
# [2 3]]
# tile it 3 times in the first dimension
x = np.repeat(D[None, :], 3, axis=0)
print x.shape
# (3, 2, 2)
print x
# [[[0 1]
# [2 3]]
# [[0 1]
# [2 3]]
# [[0 1]
# [2 3]]]
If you wanted the output to be kept two-dimensional, i.e. (6, 2), you could omit the [None, :] indexing (see the numpy documentation for more info on broadcasting rules).
print np.repeat(D, 3, axis=0)
# [[0 1]
# [0 1]
# [0 1]
# [2 3]
# [2 3]
# [2 3]]
Another alternative is np.tile, which behaves slightly differently in that it will always tile over the last dimension:
print np.tile(D, 3)
# [[0, 1, 0, 1, 0, 1],
# [2, 3, 2, 3, 2, 3]]
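For completeness, np.tile can also produce the stacked 3D form directly if you give it a reps tuple:

print np.tile(D, (3, 1, 1)).shape
# (3, 2, 2)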
You can do that as follows:
import numpy as np
v = 3
x = np.array([np.zeros((4,4)) for _ in range(v)])
>>> print x
[[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]]
Here you go, see if this works for you.
import numpy as np
v = raw_input('Enter: ')
To initialize the numpy array of arrays from user input (obviously it can be whatever shape you're wanting here):
b = np.zeros(shape=(int(v),int(v)))
I know this isn't initializing a numpy array, but since you mentioned wanting an array of [D D] if v were 2, I thought I'd throw this in as another option as well.
new_array = []
for x in range(0, int(v)):
    new_array.append(D)
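If you then want a single numpy array rather than a Python list, convert at the end; a small sketch, assuming D is the 4x4 array from the question:

stacked = np.array(new_array)
print stacked.shape
# (v, 4, 4), e.g. (2, 4, 4) when v is 2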
I have an index to choose elements from one array. But sometimes the index might have repeated entries... in that case I would like to choose the corresponding smaller value. Is it possible?
index = [0,3,5,5]
dist = [1,1,1,3]
arr = np.zeros(6)
arr[index] = dist
print arr
what I get:
[ 1. 0. 0. 1. 0. 3.]
what I would like to get:
[ 1. 0. 0. 1. 0. 1.]
Addendum
Actually I have a third array with the (vector) values to be inserted. So the problem is to insert rows from values into arr at the positions given by index, as in the following; however, I want to choose the row corresponding to the minimum dist when multiple rows have the same index.
index = [0,3,5,5]
dist = [1,1,1,3]
values = np.arange(8).reshape(4,2)
arr = np.zeros((6,2))
arr[index] = values
print arr
I get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 6. 7.]]
I would like to get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 4. 5.]]
Use groupby in pandas:
import numpy as np
import pandas as pd
index = [0,3,5,5]
dist = [1,1,1,3]
s = pd.Series(dist).groupby(index).min()
arr = np.zeros(6)
arr[s.index] = s.values
print arr
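A pure-numpy alternative, which also handles the 2d values case from the addendum, is to sort by dist in descending order and let later assignments overwrite earlier ones. This is a sketch that relies on numpy applying fancy-index assignments in order, so at a duplicate index the last (smallest-dist) write wins:

import numpy as np

index = np.array([0, 3, 5, 5])
dist = np.array([1, 1, 1, 3])
values = np.arange(8).reshape(4, 2)

order = np.argsort(-dist, kind='mergesort')  # largest dist first, stable sort
arr = np.zeros((6, 2))
arr[index[order]] = values[order]
print arr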
If index is sorted, then itertools.groupby could be used to group that list:
import itertools
np.array([(g[0], min(x[1] for x in g[1]))
          for g in itertools.groupby(zip(index, dist), lambda x: x[0])])
produces
array([[0, 1],
[3, 1],
[5, 1]])
This is about 8x slower than the version using np.unique, so for N=1000 it is similar to the Pandas version (I'm guessing, since something is screwy with my Pandas import). For larger N the Pandas version is better. It looks like the Pandas approach has a substantial startup cost, which limits its speed for small N.
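The np.unique version referred to here isn't reproduced above; one common formulation (my reconstruction, so the code actually timed may differ) sorts by (index, dist) with np.lexsort and keeps the first occurrence of each index:

import numpy as np

index = np.array([0, 3, 5, 5])
dist = np.array([1, 1, 1, 3])

order = np.lexsort((dist, index))  # primary key index, secondary key dist
u, first = np.unique(index[order], return_index=True)
arr = np.zeros(6)
arr[u] = dist[order][first]
print arr
# [ 1.  0.  0.  1.  0.  1.]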