I don't understand this question. Actually just this part;
"Given two vectors of length n that are represented with one-dimensional arrays"
I use two vectors but I don't know what value they have.
For example,
vector can be a = [1,2,3]
but I don't know exactly what are they? What do they have?
Maybe it is a = [3,4,5].
You don't need numpy to do something as simple as this.
Instead just translate the formula into Python code:
import math
a = [1, 2, 3]
b = [3, 4, 5]
n = len(a)
# Compute Euclidean distance between vectors "a" and "b".
# First sum the squares of the difference of each component of vectors.
distance = 0
for i in range(n):
    difference = a[i] - b[i]
    distance += difference * difference
# The answer is square root of those summed differences.
distance = math.sqrt(distance)
print(distance) # -> 3.4641016151377544
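The same loop can be wrapped in a function so that it works for any two lists of equal length. This is only a minimal sketch: the name euclidean_distance and the length check are my own additions, not something the assignment asks for.

```python
import math

def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Sum the squared component-wise differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance([1, 2, 3], [3, 4, 5]))
```

Using zip pairs up the components directly, so no index variable is needed.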
Your task is to write code that computes the value if the vectors a and b are given. Your job is not to write down a number.
You could start with this:
distance = 0
for value in a:
    [your code]
print(distance)
You could use numpy. Your so-called vectors would then correspond to numpy arrays.
import numpy as np
np.sqrt(np.sum(np.power(a-b,2)))
If a and b are plain Python lists, you need to convert them first, since lists do not support element-wise subtraction:
a, b = np.array(a),np.array(b)
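Put together, the numpy version might look like the sketch below. The comparison with np.linalg.norm is my own addition; it computes the same 2-norm of the difference vector, so it can serve as a cross-check.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([3, 4, 5])

# Explicit formula: square root of the summed squared differences.
explicit = np.sqrt(np.sum(np.power(a - b, 2)))

# np.linalg.norm of the difference vector gives the same distance.
via_norm = np.linalg.norm(a - b)

print(explicit, via_norm)
```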
This is an example of what I am trying to do. Suppose the following numpy array:
A = np.array([3, 0, 1, 5, 7]) # in practice, this array is a huge array of float numbers: A.shape[0] >= 1000000
I need the fastest possible way to get the following result:
result = []
for a in A:
    result.append( 1 / np.exp(A - a).sum() )
result = np.array(result)
print(result)
>>> [1.58297157e-02 7.88115138e-04 2.14231906e-03 1.16966657e-01 8.64273193e-01]
Option 1 (faster than previous code):
result = 1 / np.exp(A - A[:,None]).sum(axis=1)
print(result)
>>> [1.58297157e-02 7.88115138e-04 2.14231906e-03 1.16966657e-01 8.64273193e-01]
Is there a faster way to get "result" ?
EDIT: yes, scipy.special.softmax did the trick
Rather than trying to compute each value by normalizing it in place (effectively adding up all the values, repeatedly for each value), instead just get the exponentials and then normalize once at the end. So:
raw = np.exp(A)
result = raw / sum(raw)
(In my testing, the builtin sum is over 2.5x as fast as np.sum for summing a small array. I did not test with larger ones.)
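To check that normalizing once gives the same numbers as the original loop, here is a small sketch. The max-shifted variant at the end is my own addition, on the assumption that overflow could matter for large inputs; the shift cancels in the ratio, so the result is unchanged.

```python
import numpy as np

A = np.array([3.0, 0.0, 1.0, 5.0, 7.0])

# Reference: the original per-element loop from the question.
looped = np.array([1 / np.exp(A - a).sum() for a in A])

# Normalize once: exponentiate, then divide by the total.
raw = np.exp(A)
normalized = raw / raw.sum()

# Assumption: subtracting the maximum first keeps np.exp from overflowing
# on large values; the common factor cancels in the division.
shifted = np.exp(A - A.max())
stable = shifted / shifted.sum()

print(np.allclose(looped, normalized), np.allclose(looped, stable))
```

The identity behind it is 1 / sum_j exp(A_j - a) = exp(a) / sum_j exp(A_j).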
Yes: scipy.special.softmax did the trick
from scipy.special import softmax
result = softmax(A)
Thank you @j1-lee and @Karl Knechtel
I want to sum all the lines of a matrix: if I have an n x 2 matrix, the result should be a 1 x 2 vector with all rows summed. I can do something like that with np.sum(arg, axis=1), but I get an error if I supply a vector as the argument. Is there a more general sum function that doesn't throw an error when a vector is supplied? Note: this was never a problem in MATLAB.
Background: I wrote a function which calculates some stuff and sums over all rows of the matrix. Depending on the number of inputs, the matrix has a different number of rows and the number of rows is >= 1
According to numpy.sum documentation, you cannot specify axis=1 for vectors as you would get a numpy AxisError saying axis 1 is out of bounds for array of dimension 1.
A possible workaround could be, for example, writing a dedicated function that checks the size before performing the sum. Please find below a possible implementation:
import numpy as np
M = np.array([[1, 4],
              [2, 3]])
v = np.array([1, 4])

def sum_over_columns(input_arr):
    if len(input_arr.shape) > 1:
        return input_arr.sum(axis=1)
    return input_arr.sum()

print(sum_over_columns(M))
print(sum_over_columns(v))
In a more pythonic way (not necessarily more readable):
def oneliner_sum(input_arr):
    return input_arr.sum(axis=(1 if len(input_arr.shape) > 1 else None))
You can do
np.sum(np.atleast_2d(x), axis=1)
This will first convert vectors to singleton-dimensional 2D matrices if necessary.
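A short sketch of what that does to the shapes. The helper name row_sums is hypothetical and only there for illustration. Note that for a vector the result is a one-element array rather than a plain scalar.

```python
import numpy as np

def row_sums(x):
    # Hypothetical helper: promote 1-D input to shape (1, n), then sum each row.
    return np.sum(np.atleast_2d(x), axis=1)

M = np.array([[1, 4],
              [2, 3]])
v = np.array([1, 4])

print(row_sums(M).shape, row_sums(v).shape)
```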
The homework problem is written as follows:
Write a function called unitVec that determines a unit vector in the direction of the line that connects two points (A and B) in space. The function should take as input two vectors (lists), each with the coordinates of a point in space. The output should be a vector (list) with the components of the unit vector in the direction from A to B. If points A and B have two coordinates each (i.e., they lie in the x y plane), the output vector should have two elements. If points A and B have three coordinates each (i.e., they lie in general space), the output vector should have three elements.
I have basically the entire code written but cannot for the life of me figure out how to square each element in the list called connects[].
To calculate a unit vector the program will subtract the elements in vector B with the corresponding elements in vector A and create a new list (connects[]) with these values. Then each of these elements needs to be squared and they all need to be added together. Then the square root will be taken of this number and each element in connects[] will be divided by this number and stored in a new list which will be the unit vector.
I'm trying to add the squares of elements in connects[] by using the line
add = add + (connects[i]**2)
but I know this only returns the list twice. The rest of my code is fine I just need help squaring these elements.
from math import *
vecA = []
vecB = []
unitV = []
connects = []
vec = []
elements = int(input("How many elements will your vectors contain?"))
for i in range(0,elements):
    A = float(input("Enter element for vector A:"))
    vecA.append(A)
    B = float(input("Enter element for vector B:"))
    vecB.append(B)
def unitVec(vecA,vecB):
    for i in range(0,elements):
        unit = 0
        add = 0
        connect = vecB[i] - vecA[i]
        connects.append(connect)
        add = add + (connects[i]**2)
        uVec = sqrt(add)
        result = connects[i]/uVec
        unitV.append(result)
    return unitV
print("The unit vector connecting your two vectors is:",unitVec(vecA,vecB))
You need to change your function to the following:
def unitVec(vecA,vecB):
    add = 0
    for i in range(0, elements):
        unit = 0
        connect = vecB[i] - vecA[i]
        connects.append(connect)
        add = add + (connect**2)
    uVec = sqrt(add)
    unitV = [val/uVec for val in connects]
    return unitV
You cannot do everything in a single for loop, since you need to add all the differences before being able to get the square root. Then you can divide the differences by this uVec.
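The two-pass idea (first accumulate the squared differences, then divide) can also be written without any globals. This is only a sketch: the name unit_vec, the local list, and the zero-length check are my own choices, not part of the homework statement.

```python
import math

def unit_vec(vec_a, vec_b):
    """Return the unit vector pointing from point A to point B."""
    # First pass: component-wise differences B - A.
    connects = [b - a for a, b in zip(vec_a, vec_b)]
    # The length needs all squared differences before the square root.
    length = math.sqrt(sum(c ** 2 for c in connects))
    if length == 0:
        raise ValueError("A and B are the same point; the direction is undefined")
    # Second pass: scale each component by the length.
    return [c / length for c in connects]

print(unit_vec([0, 0], [3, 4]))
```

Because the list is created inside the function, calling it twice does not accumulate values from the previous call.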
Python's list is a general-purpose container, and its arithmetic operators do not behave like vector operations. For example, [1,2,3]*2 replicates the list rather than multiplying by a scalar, so the result is [1,2,3,1,2,3] instead of [2,4,6].
I would use a numpy array, which is designed for numerical data and provides vector operations.
import numpy as np
a = [1,2,3]
# convert python list into numpy array
b = np.array(a)
# vector magnitude
magnitude = np.sqrt((b**2).sum()) # sqrt( sum(b_i^2))
# or
magnitude = (b**2).sum()**0.5 # sqrt( sum(b_i^2))
# unit vector calculation
unit_b = b/magnitude
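The difference between the two kinds of multiplication is easy to see side by side. A small sketch; using np.linalg.norm for the magnitude is my own substitution for the explicit sqrt-of-sum formula.

```python
import numpy as np

values = [1, 2, 3]

# On a list, * repeats the contents.
doubled_list = values * 2

# On a numpy array, * scales every element.
arr = np.array(values)
doubled_arr = arr * 2

# Magnitude and unit vector, using the built-in 2-norm.
magnitude = np.linalg.norm(arr)
unit = arr / magnitude

print(doubled_list, doubled_arr, unit)
```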
Suppose I have a Numpy array, such as
rand = np.random.randn(6, 6)
I need the central four values in the array, since it has axes of even length. If it had been odd, such as 5 by 5, then there would only be one central value. What is the simplest/fastest/easiest way of retrieving these four entries? I can obtain them very crudely with indices, but I'm looking for a faster way than calling a bunch of functions and performing a bunch of calculations.
For example, consider the following:
array([[ 0.25659355, -0.75456113,  0.39467396,  0.50805361],
       [-0.77218172,  1.00016061, -0.70389486,  1.67632146],
       [-0.41106158, -0.63757421,  1.70390504, -0.79073362],
       [-0.2016959 ,  0.55316318, -1.55280823,  0.45740193]])
I want the following:
array([[ 1.00016061, -0.70389486],
       [-0.63757421,  1.70390504]])
But not just for a 4 by 4 array - if it is even by even, I want the central four elements, as above.
Is something like this too complicated?
def get_middle(arr):
    n = arr.shape[0]
    half = n // 2
    if n % 2 == 1:
        # odd size: a single central value
        return arr[[half], [half]]
    else:
        # even size: the central 2 x 2 block
        return arr[half - 1:half + 1, half - 1:half + 1]
You can do this with a single slicing operation:
rand = np.random.randn(n,n)
# assuming n is even; integer division keeps the slice bounds as ints
center = rand[n//2-1:n//2+1, n//2-1:n//2+1]
I'm abusing order of operations by leaving out the parens, just to make it a little less messy.
Given array a:
import numpy as np
a = np.array([[ 0.25659355, -0.75456113,  0.39467396,  0.50805361],
              [-0.77218172,  1.00016061, -0.70389486,  1.67632146],
              [-0.41106158, -0.63757421,  1.70390504, -0.79073362],
              [-0.2016959 ,  0.55316318, -1.55280823,  0.45740193]])
The easiest way to get the central 4 values is:
ax, ay = a.shape
a[int(ax/2)-1:int(ax/2)+1, int(ay/2)-1:int(ay/2)+1]
This works if you have even numbers for the dimensions of the array. In case of odd numbers, there won't be a central 4 values.
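If both cases need to be handled in one place, a per-axis slice works for even and odd lengths alike. This is a sketch under my own assumptions: the name center_block is hypothetical, and for an odd axis it keeps a length-1 slice so the result stays 2-D.

```python
import numpy as np

def center_block(arr):
    """Return the central block: 2 wide on even axes, 1 wide on odd axes."""
    slices = []
    for size in arr.shape:
        half = size // 2
        # Even axis: start one before the midpoint; odd axis: start at it.
        start = half - 1 if size % 2 == 0 else half
        slices.append(slice(start, half + 1))
    return arr[tuple(slices)]

even = np.arange(16).reshape(4, 4)
odd = np.arange(25).reshape(5, 5)
print(center_block(even))
print(center_block(odd))
```

Slicing returns a view, so no data is copied and the cost does not depend on the array size.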
Could you just use indexing? Like:
A = np.array([[ 0.25659355, -0.75456113,  0.39467396,  0.50805361],
              [-0.77218172,  1.00016061, -0.70389486,  1.67632146],
              [-0.41106158, -0.63757421,  1.70390504, -0.79073362],
              [-0.2016959 ,  0.55316318, -1.55280823,  0.45740193]])
A[1:3,1:3]
Or if matrix A had odd dimensions, say 5x5 then:
A[2,2]
So I feel like I might have coded myself into a corner -- but here I am.
I have created a dictionary of arrays (well specifically ascii Columns) because I needed to create five arrays performing the same calculation on an array with five different parameters (The calculation involved multiplying arrays and one of five arbitrary constants).
I now want to create an array where each element corresponds to the sum of the equivalent element from all five arrays. I'd rather not use the ugly for loop that I've created (it's also hard to check if i'm getting the right answer with the loop).
Here is a modified snippet for testing!
import numpy as np
from astropy.table import Column
from pylab import *
# The five parameters for the Columns
n1 = [14.18,19.09,33.01,59.73,107.19,172.72] #uJy/beam
n2 = [14.99,19.04,32.90,59.99,106.61,184.06] #uJy/beam
n1 = np.array([x*1e-32 for x in n1]) #W/Hz
n2 = np.array([x*1e-32 for x in n2]) #W/Hz
# an example of the arrays being mathed upon
luminosity=np.array([2.393e+24,1.685e+24,2.264e+23,5.466e+22,3.857e+23,4.721e+23,1.818e+23,3.237e+23])
redshift = np.array([1.58,1.825,0.624,0.369,1.247,0.906,0.422,0.66])
field = np.array([True,True,False,True,False,True,False,False])
DMs = {}
for i in range(len(n1)):
    DMs['C{0}'.format(i)]=0
for SC,SE,level in zip(n1,n2,DMs):
    DMmax = Column([1 for x in redshift], name='DMmax')
    DMmax[field]=(((1+redshift[field])**(-0.25))*(luminosity[field]/(4*pi*5*SE))**0.5)*3.24078e-23
    DMmax[~field]=(((1+redshift[~field])**(-0.25))*(luminosity[~field]/(4*pi*5*SC))**0.5)*3.24078e-23
    DMs[level] = DMmax
Thanks all!
Numpy was built for this! (provided all arrays are of the same shape)
Just add them, and numpy will move element-wise through the arrays. This also has the benefit of being orders of magnitude faster than using a for-loop in the Python layer.
Example:
>>> n1 = np.array([1,2,3])
>>> n2 = np.array([1,2,3])
>>> total = n1 + n2
>>> total
array([2, 4, 6])
>>> mask = np.array([True, False, True])
>>> n1[mask] ** n2[mask]
array([ 1, 27])
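Edit (additional input):
You might be able to do something like this, provided n1, n2, redshift, luminosity, and field all have the same length (in the question's snippet they do not):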
SE_array = (((1+redshift[field]) ** (-0.25)) * (luminosity[field]/(4*pi*5*n1[field])) ** 0.5) * 3.24078e-23
SC_array = (((1+redshift[field]) ** (-0.25)) * (luminosity[field]/(4*pi*5*n2[field])) ** 0.5) * 3.24078e-23
and make the associations by stacking the new arrays:
DM = np.dstack((SE_array, SC_array))
reshaper = DM.shape[1:] # take from shape (1, 6, 2) to (6,2), where 6 is the length of the arrays
DM = DM.reshape(reshaper)
This will give you a 2d array like:
array([[SE_1, SC_1],
[SE_2, SC_2]])
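The stacking step can be checked on small stand-in arrays. In the sketch below the names se and sc and their values are made up for illustration; np.column_stack is my own suggested shortcut, which produces the (N, 2) layout directly without the reshape.

```python
import numpy as np

# Stand-in data; the real arrays would come from the calculation above.
se = np.array([1.0, 2.0, 3.0])
sc = np.array([10.0, 20.0, 30.0])

# dstack promotes each 1-D array to shape (1, N, 1) and joins on the last axis.
stacked = np.dstack((se, sc))
reshaped = stacked.reshape(stacked.shape[1:])

# column_stack places 1-D arrays side by side as columns.
paired = np.column_stack((se, sc))

print(stacked.shape, reshaped.shape, paired.shape)
```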
Hope this is helpful
If you can't just add the numpy arrays you can extract the creation of the composite array into a function.
def get_element(i):
    global n1, n2, luminosity, redshift, field
    return n1[i] + n2[i] + luminosity[i] + redshift[i] + field[i]
L = len(n1)
composite = [get_element(i) for i in range(L)]
The answer was staring me in the face, but thanks to @willnx, @cricket_007, and @andrew-lavq. Your suggestions made me realise how simple the solution is.
Just add them, and numpy will move element-wise through the arrays. -- willnx
You need a loop to sum all values of a collection -- cricket_007
so it really is as simple as
sum(x for x in DMs.values())
I'm not sure if this is the fastest solution, but I think it's the simplest.
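For comparison, here is a sketch of the dictionary sum next to a stacked numpy reduction. Plain numpy arrays stand in for the astropy Columns here, which is an assumption on my part, and the keys and values are made up.

```python
import numpy as np

# Stand-in for the DMs dictionary: equal-length arrays keyed by name.
arrays = {
    "C0": np.array([1.0, 2.0, 3.0]),
    "C1": np.array([10.0, 20.0, 30.0]),
    "C2": np.array([100.0, 200.0, 300.0]),
}

# Built-in sum starts from 0 and adds each array element-wise.
total_builtin = sum(arrays.values())

# Alternative: stack into one 2-D array and reduce along the first axis.
total_stacked = np.sum(np.stack(list(arrays.values())), axis=0)

print(total_builtin, total_stacked)
```

Both give one array whose i-th element is the sum of the i-th elements of all inputs; which one is faster would need to be measured on the real data.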