I want to write vectorized-style code in Julia, in the context of defining a function that takes more than one vector as arguments, like below.
using PyPlot;
m=[453 21 90;34 1 44;13 553 66]
a = [1,2,3]
b=[1,2,3]
f(x,y) = m[x,y]
f.(a,b)
#= expected result
3×3 Matrix{Int64}:
453 21 90
34 1 44
13 553 66
=#
# real result
3-element Vector{Int64}:
453
1
66
The dot notation pairs up the elements of a and b, so it only picks the diagonal entries m[1,1], m[2,2], m[3,3] and makes a vector with just 3 elements instead of a 3x3 matrix.
How can I write this to get the expected result?
Any information would be appreciated.
One of the two vectors needs to be a row vector so that Julia understands what you want to do. This simple example should help you understand Julia broadcasting:
julia> [1,2,3] .+ [10,20,30] # both have the same dimensions
3-element Vector{Int64}:
11
22
33
julia> [1,2,3]' .+ [10,20,30]
# first has dimensions (1,3) and second (3,1) => result is dimension (3,3)
3×3 Matrix{Int64}:
11 12 13
21 22 23
31 32 33
You're looking for
julia> f.(a, b')
3×3 Matrix{Int64}:
453 21 90
34 1 44
13 553 66
Note the relevant section in the documentation for broadcast (type ?broadcast into a REPL session to access it):
Singleton and missing dimensions are expanded to match the extents of the other arguments by virtually repeating the value.
a is treated as a 3x1 matrix (but has the type Vector{T}), while b' is used as a 1x3 matrix (with the type Adjoint{T, Vector{T}}). These are broadcast to the resulting 3x3 matrix.
When using a and b directly, no expansion of dimensions is necessary, and you end up with a 3-element vector (the diagonal), as in your real result.
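For comparison with the NumPy questions below, the same expansion rule applies to NumPy broadcasting and fancy indexing. This is only a comparison sketch, not part of the Julia answer, and uses 0-based indices:

import numpy as np

m = np.array([[453, 21, 90], [34, 1, 44], [13, 553, 66]])
a = np.array([0, 1, 2])   # 0-based row indices
b = np.array([0, 1, 2])   # 0-based column indices

print(m[a, b])                    # same-shape index arrays pair up: the diagonal [453 1 66]
print(m[a[:, None], b[None, :]])  # (3,1) and (1,3) indices broadcast to the full 3x3 matrix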
I have a Python array that I got using
array = np.arange(2,201,2).reshape(25,4)
which gave me this:
[[  2   4   6   8]
 [ 10  12  14  16]
 [ 18  20  22  24]
 [ 26  28  30  32]
 [ 34  36  38  40]
 [ 42  44  46  48]
 [ 50  52  54  56]
 [ 58  60  62  64]
 [ 66  68  70  72]
 [ 74  76  78  80]
 [ 82  84  86  88]
 [ 90  92  94  96]
 [ 98 100 102 104]
 [106 108 110 112]
 [114 116 118 120]
 [122 124 126 128]
 [130 132 134 136]
 [138 140 142 144]
 [146 148 150 152]
 [154 156 158 160]
 [162 164 166 168]
 [170 172 174 176]
 [178 180 182 184]
 [186 188 190 192]
 [194 196 198 200]]
but now I'm instructed to select only the values below 50 from array, add 5 to these values, and then multiply by 2. The other values should remain unchanged and everything should be saved as array. This is a school assignment so I don't have the expected output, but basically it should be the array in the same 25x4 shape, with the first few rows changed (since those values are under 50) and the other rows/values the same (since they're over 50). I've tried the following code:
for i in array:
    if array < 50:
        print((i+5)*2)
    else:
        print(i)
and I'm getting an error that says -
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Any help would be greatly appreciated, since I can't find any other articles with similar questions.
There are two ways to address this question: a Python way and a NumPy way (NumPy is not Python...).
Python way:
You have a sequence of sequence containers. You can use a double iteration to test the values one at a time and replace the ones that need to be replaced:
for row in array:                   # iterate over the rows
    for i, val in enumerate(row):   # then over the values in the row
        if val < 50:                # test them
            row[i] = (val + 5) * 2  # and replace
This works as long as the outer iteration gives you direct access to the row container. This is true for both Python containers (lists) and NumPy arrays, but may not be guaranteed for every type of container. The super safe way is to keep the indexes and modify array directly:
for i in range(len(array)):
    for j in range(len(array[i])):
        if array[i][j] < 50:
            array[i][j] = (array[i][j] + 5) * 2
Numpy way:
The power of NumPy is to provide high-speed operations on whole arrays; in NumPy wording this is called vectorization. You should first extract the relevant indexes and then change the values in one single vectorized operation:
ix = np.where(array < 50)
array[ix] = (array[ix] + 5) * 2
For large arrays, this second way should be at least an order of magnitude faster than the first one.
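For comparison, an equivalent boolean-mask form works too; this is only a sketch using the same array as in the question:

import numpy as np

array = np.arange(2, 201, 2).reshape(25, 4)

mask = array < 50                    # boolean array, True where the value is below 50
array[mask] = (array[mask] + 5) * 2  # update only the masked entries, in place
print(array[:7])                     # rows with values below 50 are changed, the rest untouched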
For your question, the correct way is the one that matches your current lesson, either Python or numpy...
import numpy as np
array = np.arange(2,201,2).reshape(25,4)
values = [ (element+5)*2 if element < 50 else element for innerList in array for element in innerList ]
print(values)
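Note that this comprehension yields a flat Python list; if the assignment needs the result saved back as a 25x4 array named array, something along these lines (a sketch, not part of the answer above) restores the shape:

import numpy as np

array = np.arange(2, 201, 2).reshape(25, 4)
values = [(element + 5) * 2 if element < 50 else element
          for innerList in array for element in innerList]
array = np.array(values).reshape(array.shape)  # back to shape (25, 4)
print(array.shape)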
One more time I need your help.
To introduce the problem, I have this:
x=[0 1 3 4 5 6 7 8]
y=[9 10 11 12 13 14 15 16]
x=x(:)
y=y(:)
X=[x.^2, x.*y,y.^2,x,y]
a=sum(X)/(X'*X)
X=
0 0 81 0 9
1 10 100 1 10
9 33 121 3 11
16 48 144 4 12
25 65 169 5 13
36 84 196 6 14
49 105 225 7 15
64 128 256 8 16
a =
-0.0139 0.0278 -0.0139 -0.2361 0.2361
Consider the MATLAB code to be absolutely correct,
and I translated this to:
x = np.array([0, 1, 3, 4, 5, 6, 7, 8])
y = np.array([9, 10, 11, 12, 13, 14, 15, 16])
X = np.array([x*x, x*y, y*y, x, y]).T
a = np.sum(X) / np.dot(X.T, X)  # line with the problem
X is the same,
but I get a (5,5) matrix for a.
I think the problem comes from the multiplication between X.T and X. I've tried np.matmul, np.dot, transpose and .T, and I don't know why I can't get a (1,5) or (5,1) vector... Something is wrong in the translation between these two languages in the calculation of a.
Any suggestions?
The division of such two matrices in MATLAB:
s = sum(X)
XX = (X'*X)
a = s / XX
is solving for t the linear system t * XX = s; since XX = X'*X is symmetric, this is equivalent to XX * t = s (with t taken as a column).
To achieve the same in Python/NumPy, just use np.linalg.solve() (making sure to use np.sum() with the correct axis parameter to mimic the same behavior as MATLAB's sum(), as indicated in the comments and #AnderBiguri's answer):
x=np.array([0,1,3,4,5,6,7,8])
y=np.array([9,10,11,12,13,14,15,16])
X=np.array([x*x,x*y,y*y,x,y]).T
s = np.sum(X, 0)
XX = np.dot(X.T, X)
a = np.linalg.solve(XX, s)
print(a)
# [-0.01388889 0.02777778 -0.01388889 -0.23611111 0.23611111]
The issue is sum.
In MATLAB, sum by default sums over the first axis (down the columns). In NumPy, np.sum without an axis argument sums all the values.
a = np.sum(X, axis=0) / np.dot(X.T, X)
Note that / in NumPy is still elementwise division rather than MATLAB's matrix right division, so combine this with np.linalg.solve (as shown above) to reproduce the MATLAB result.
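A quick comparison of the two sums (a sketch using the X from the question) shows the difference:

import numpy as np

x = np.array([0, 1, 3, 4, 5, 6, 7, 8])
y = np.array([9, 10, 11, 12, 13, 14, 15, 16])
X = np.array([x * x, x * y, y * y, x, y]).T

print(np.sum(X))          # one scalar: sums every entry, like MATLAB's sum(X(:))
print(np.sum(X, axis=0))  # per-column sums, like MATLAB's default sum(X)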
This question is a generalized version of a question which I have asked before:
Reshaping a Numpy Array into lexicographical list of cubes of shape (n, n, n)
The question is, given an nd-array of shape (x, y, z) and a query window (p, q), with the restriction that x % p == 0 and y % q == 0, how do I transpose the matrix in such a way that it has shape (p, q, -1) and maintains the ordering proposed in the original question. The idea is that I can quickly take slices of a specific shape instead of having to iterate to the relevant indices.
In the original post, this answer was proposed:
N = 4
a = np.arange(N**3).reshape(N,N,N)
b = a.reshape(2,N//2,2,N//2,N).transpose(1,3,0,2,4).reshape(N//2,N//2,N*4)
with output:
print(b):
[[[ 0  1  2  3  8  9 10 11 32 33 34 35 40 41 42 43]
  [ 4  5  6  7 12 13 14 15 36 37 38 39 44 45 46 47]]

 [[16 17 18 19 24 25 26 27 48 49 50 51 56 57 58 59]
  [20 21 22 23 28 29 30 31 52 53 54 55 60 61 62 63]]]
This would correspond to input shape (4, 4, 4), query shape (2, 2) and output shape (2, 2, -1).
The accepted answer in the original question is close to what I need, but its output shape is dependent on the shape of the nd-array. That is not the behavior that I am looking for as I'd like to use any query shape (p, q) for any input shape (x, y, z).
I am not very proficient in using Numpy transpose to implement these kinds of operations (I have tried to use this answer and generalize it myself without success), so it would be greatly appreciated if the answer could be supplemented with a bit of explanation of the approach taken, or pointers to some resources which could help me out with this!
Hope that makes it clear!
It just takes a simple modification of that answer; think of (p, q) = (2, 2) in this case. So something like this:
a.reshape(p, x//p, q, y//q, -1).transpose(3,1,2,0,4).reshape(p,q,-1)
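A minimal check on the (4, 4, 4) example from the question; this only verifies that the call runs and produces shape (p, q, -1), so the element ordering should still be compared against the expected output above:

import numpy as np

x, y, z = 4, 4, 4
p, q = 2, 2
a = np.arange(x * y * z).reshape(x, y, z)

b = a.reshape(p, x // p, q, y // q, -1).transpose(3, 1, 2, 0, 4).reshape(p, q, -1)
print(b.shape)  # (2, 2, 16)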
I am trying to concatenate 4 numpy matrices along the x axis. Below is the code I have written.
print(dt.shape)
print(condition.shape)
print(uc.shape)
print(rt.shape)
x = np.hstack((dt, condition, uc, rt))
print(x.shape)
I am getting the following output.
(215063, 1)
(215063, 1112)
(215063, 1)
(215063, 1)
I am getting the following error.
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
Final output should be
(215063, 1115)
I recommend using numpy.concatenate. I used it to merge two images into a single image; it gives you the option to concatenate along either of the two axes.
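A sketch of what that looks like here, with small zero-filled stand-ins for the four arrays (the real ones have 215063 rows):

import numpy as np

# hypothetical stand-ins with only 5 rows, shaped like the arrays in the question
dt        = np.zeros((5, 1))
condition = np.zeros((5, 1112))
uc        = np.zeros((5, 1))
rt        = np.zeros((5, 1))

x = np.concatenate((dt, condition, uc, rt), axis=1)  # axis=1 joins columns, like np.hstack
print(x.shape)  # (5, 1115)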
Your code is OK. To confirm it, I performed the following test
on smaller arrays:
dt = np.arange(1,6).reshape(-1,1)
condition = np.arange(11,41).reshape(-1,6)
uc = np.arange(71,76).reshape(-1,1)
rt = np.arange(81,86).reshape(-1,1)
print(dt.shape, condition.shape, uc.shape, rt.shape)
x = np.hstack((dt, condition, uc, rt))
print(x.shape)
print(x)
and got:
(5, 1) (5, 6) (5, 1) (5, 1)
(5, 9)
[[ 1 11 12 13 14 15 16 71 81]
 [ 2 17 18 19 20 21 22 72 82]
 [ 3 23 24 25 26 27 28 73 83]
 [ 4 29 30 31 32 33 34 74 84]
 [ 5 35 36 37 38 39 40 75 85]]
So probably there is something wrong with your data.
Try running np.hstack on your set of arrays, dropping each one of them in turn.
If the execution succeeds in one case (without some array), then the source of the problem is just the array missing in that case.
Then look thoroughly at that array and find out what is wrong with it.
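If one of the four arrays really is 1-D at the point of the hstack call (as the ValueError says for index 1), the usual fix is to make it a column first; a sketch with a hypothetical 1-D rt:

import numpy as np

rt = np.arange(5)        # hypothetical 1-D array, shape (5,)
rt = rt.reshape(-1, 1)   # turn it into a column, shape (5, 1)
# equivalently: rt = rt[:, np.newaxis]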
I have an array which is 1 -> 160. I want to split this into 10 arrays that are split every sixteen numbers. This is what I have so far:
amplitude=[]
for i in range(0,160):
    amplitude.append(i+1)
print(amplitude)
#split arrays up into a line for each sample
traceno=10 #number of traces in file
samplesno=16 #number of samples in each trace. This wont change.
amplitude_split=np.zeros((traceno,samplesno), dtype=np.int)
#fill in the arrays with amplitude/sample numbers
for i in range(len(amplitude)):
    for j in range(traceno):
        for k in range(samplesno):
            amplitude_split[j,k]=amplitude[i]
print(amplitude_split[1,:])
As an output I only get [160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160]
Where I require something along the lines of:
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16]
[17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32]
etc...
You are nesting the loops, so every pass overwrites the whole new array with a single number from the first one, and you end up with the last one, 160, repeated everywhere.
You only need to copy the list into a 1D numpy array, and then reshape it:
amplitude_split=np.array(amplitude, dtype=np.int).reshape((traceno,samplesno))
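A quick check of that one-liner against the rows the question expects (a sketch; it uses plain int, since np.int was removed in recent NumPy versions):

import numpy as np

amplitude = list(range(1, 161))
traceno, samplesno = 10, 16

amplitude_split = np.array(amplitude, dtype=int).reshape((traceno, samplesno))
print(amplitude_split[0, :])  # [ 1  2 ... 16]
print(amplitude_split[1, :])  # [17 18 ... 32]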
Well, if we're using Numpy arrays, we can use Numpy functionality:
amplitude = np.arange(1, 161)
amplitude_split = amplitude.reshape(10, 16)
Otherwise, you've already been linked to how to do it for plain lists, but I'd like to point out that you still don't need a loop to fill amplitude in the first place:
amplitude = list(range(1, 161))
In general, with Python you should try hard not to think in terms of starting with an initially blank "storage" area that you then fill in. Just create the data you want directly - by conversions of the sort above, by list comprehensions etc., or if necessary by .append()-ing - rather than overwriting a dummy value.
See grouper in https://docs.python.org/2/library/itertools.html#recipes
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
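In Python 3 the recipe's izip_longest is itertools.zip_longest; a usage sketch applied to the amplitude list from the question:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

amplitude = list(range(1, 161))
chunks = list(grouper(amplitude, 16))
print(chunks[0])  # (1, 2, ..., 16)
print(chunks[1])  # (17, 18, ..., 32)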