Checking existence of an array inside an array of arrays python [duplicate] - python

This question already has answers here:
Python: Find number of occurrences of given array within two-dimensional array
(6 answers)
Closed 9 years ago.
I have a numpy array of arrays:
qv=array([[-1.075, -1.075, -3. ],
[-1.05 , -1.075, -3. ],
[-1.025, -1.075, -3. ],
...,
[-0.975, -0.925, -2. ],
[-0.95 , -0.925, -2. ],
[-0.925, -0.925, -2. ]])
And I want to determine if an array is contained in that 2-D array and return its index.
qt=array([-1. , -1.05, -3. ])
I can convert both arrays to lists and use the list.index() function:
qlist=qv.tolist()
ql=qt.tolist()
qindex=qlist.index(ql)
But I would like to avoid doing this because I think it will be a performance hit.

This should do the trick,
import numpy as np
np.where((qv == qt).all(-1))
Or
import numpy as np
tol = 1e-8
diff = (qv - qt)
np.where((abs(diff) < tol).all(-1))
The second method is more appropriate when floating point precision issues come into play. Also, if you have many qt vectors to test against, a spatial index such as scipy.spatial.KDTree might be a better approach.
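As a rough sketch of both checks on a small made-up array (the values below are illustrative, not the asker's full data), with np.isclose swapped in for the manual tolerance test and np.flatnonzero used to get plain row indices:

```python
import numpy as np

# Small illustrative array (assumed values, not the original data)
qv = np.array([[-1.075, -1.075, -3.0],
               [-1.05,  -1.075, -3.0],
               [-1.0,   -1.05,  -3.0],
               [-0.925, -0.925, -2.0]])
qt = np.array([-1.0, -1.05, -3.0])

# Exact match: qt is broadcast against every row, then all columns must agree
exact_idx = np.flatnonzero((qv == qt).all(axis=-1))

# Tolerant match: absolute tolerance only, mirroring abs(diff) < tol
close_idx = np.flatnonzero(np.isclose(qv, qt, rtol=0.0, atol=1e-8).all(axis=-1))

print(exact_idx, close_idx)
```

An empty index array means the row was not found, so checking exact_idx.size is one way to test for existence.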

Related

Filling a numpy zero array with distances between 2D coordinates [duplicate]

This question already has answers here:
Is it possible to speed up this loop in Python?
(6 answers)
Vectorized spatial distance in python using numpy
(1 answer)
Closed 3 years ago.
I think this has been asked before, but I'm trying to implement the following: I've got a list of tuples containing the 2D coordinates of the N particles that I have. I have defined a numpy.zeros((N,N)) array to store the distance between them. How could I do this the fastest?
Thanks in advance for any help! :)
Edited to add: I've already written a function to measure the distance between two tuples, and was wondering how to iterate it!
My distance measuring function:
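import numpy

def calc_distance(p1, p2):
    # Convert to arrays first: plain tuples do not support subtraction
    distance = numpy.linalg.norm(numpy.asarray(p1) - numpy.asarray(p2))
    return distance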
Distance matrix is what you are looking for:
coords = [(0,0), (1,1), (3,2)]
from scipy.spatial import distance_matrix
distance_matrix(coords, coords)
Output:
array([[0. , 1.41421356, 3.60555128],
[1.41421356, 0. , 2.23606798],
[3.60555128, 2.23606798, 0. ]])
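If SciPy is not available, the same matrix can probably be built with NumPy broadcasting alone. This is only a sketch using the three example points above; the names diff and dist are illustrative:

```python
import numpy as np

coords = np.array([(0, 0), (1, 1), (3, 2)], dtype=float)

# diff[i, j] holds coords[i] - coords[j]; its shape is (N, N, 2)
diff = coords[:, None, :] - coords[None, :, :]

# Euclidean norm over the last axis gives the (N, N) distance matrix
dist = np.linalg.norm(diff, axis=-1)
print(dist)
```

Note that the intermediate diff array needs memory proportional to N * N * 2, so for very large N a chunked or SciPy-based approach may be preferable.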

Markov Clustering in Python

As the title says, I'm trying to get a Markov Clustering Algorithm to work in Python, namely Python 3.7
Unfortunately, it's not doing much of anything, and it's driving me up the wall trying to fix it.
EDIT: First, I've made the adjustments to the main code to make each column sum to 100, even if it's not perfectly balanced. I'm going to try to account for that in the final answer.
To be clear, the biggest problem is that the numbers spiral out of control, into such easily-understandable numbers as 5.56268465e-309, and I don't know how to convert that into something understandable.
Here's the code so far:
import numpy as np
import math
## How far you'd like your random-walkers to go (bigger number -> more walking)
EXPANSION_POWER = 2
## How tightly clustered you'd like your final picture to be (bigger number -> more clusters)
INFLATION_POWER = 2
ITERATION_COUNT = 10
def normalize(matrix):
    return matrix/np.sum(matrix, axis=0)
def expand(matrix, power):
    return np.linalg.matrix_power(matrix, power)
def inflate(matrix, power):
    for entry in np.nditer(transition_matrix, op_flags=['readwrite']):
        entry[...] = math.pow(entry, power)
    return matrix
def run(matrix):
    #np.fill_diagonal(matrix, 1)
    #print(matrix)
    matrix = normalize(matrix)
    print(matrix)
    for _ in range(ITERATION_COUNT):
        matrix = normalize(inflate(expand(matrix, EXPANSION_POWER), INFLATION_POWER))
    return matrix
transition_matrix = np.array ([[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0.5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0.34,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0.33,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0.33,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0.34,0,0,0,0,0,0,0,0,0,0,0,0,0.125,0],
[0,0,0,0.33,0,0,0.5,0,0,0,0,0,0,0,0,0,0.125,1],
[0,0,0,0.33,0,0,0.5,1,1,0,0,0,0,0,0,0,0.125,0],
[0,0,0,0,0.166,0,0,0,0,0,0,0,0,0,0,0,0.125,0],
[0,0,0,0,0.166,0,0,0,0,0.2,0,0,0,0,0,0,0.125,0],
[0,0,0,0,0.167,0,0,0,0,0.2,0.25,0,0,0,0,0,0.125,0],
[0,0,0,0,0.167,0,0,0,0,0.2,0.25,0.5,0,0,0,0,0,0],
[0,0,0,0,0.167,0,0,0,0,0.2,0.25,0.5,0,1,0,0,0.125,0],
[0,0,0,0,0.167,0,0,0,0,0.2,0.25,0,1,0,1,0,0.125,0],
[0,0,0,0,0,0.34,0,0,0,0,0,0,0,0,0,0,0,0],
[0,0,0,0,0,0.33,0,0,0,0,0,0,0,0,0,0.5,0,0],
[0,0,0,0,0,0.33,0,0,0,0,0,0,0,0,0,0.5,0,0]])
run(transition_matrix)
print(transition_matrix)
This is part of a uni assignment. I need to do this array both weighted and unweighted (though the weighted part can just wait until I've got the bloody thing working at all). Any tips or suggestions?
Your transition matrix is not valid.
>>> transition_matrix.sum(axis=0)
matrix([[1.  , 1.  , 0.99, 0.99, 0.96, 0.99, 1.  , 1.  , 0.  , 1.  ,
         1.  , 1.  , 1.  , 0.  , 0.  , 1.  , 0.88, 1.  ]])
Not only do some of your columns not sum to 1, some of them sum to 0.
This means when you try to normalize your matrix, you will end up with nan because you are dividing by 0.
Lastly, is there a reason why you are using a NumPy matrix instead of a NumPy array, which is the recommended container for such data? Arrays simplify some of the operations, such as raising each entry to a power (array ** power is element-wise, whereas matrix ** power is a matrix power). There are also other differences between NumPy matrix and NumPy array which can result in subtle bugs.
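To illustrate the points above, here is a minimal sketch with plain arrays. It assumes a small toy adjacency matrix with self-loops (not the asker's data), and the helper name mcl_step is hypothetical. It also makes inflate operate on its own argument rather than on a global variable, which the posted inflate does not do:

```python
import numpy as np

def normalize(matrix):
    """Scale each column to sum to 1, refusing columns that sum to 0."""
    col_sums = matrix.sum(axis=0)
    zero_cols = np.flatnonzero(col_sums == 0)
    if zero_cols.size:
        raise ValueError(f"columns {zero_cols.tolist()} sum to 0")
    return matrix / col_sums

def inflate(matrix, power):
    # On an ndarray this is an element-wise power and returns a new array
    return np.power(matrix, power)

def mcl_step(matrix, expansion=2, inflation=2):
    # One expansion (matrix power) followed by inflation and renormalization
    expanded = np.linalg.matrix_power(matrix, expansion)
    return normalize(inflate(expanded, inflation))

# Toy graph: three nodes in a chain, with self-loops (assumed example)
toy = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 1.0],
                [0.0, 1.0, 1.0]])

m = normalize(toy)
for _ in range(10):
    m = mcl_step(m)
print(m.sum(axis=0))
```

Raising an error on zero-sum columns is a design choice for the sketch; adding self-loops (as the commented-out np.fill_diagonal line hints) is another common way to avoid them.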

Distance with array of different sizes

I have an array with dimensions as such:
pos = np.array([[ 1.72, 2.56],
[ 0.24, 5.67],
[ -1.24, 5.45],
[ -3.17, -0.23],
[ 1.17, -1.23],
[ 1.12, 1.08]])
and I want to find the distance between each row of the array and a reference point, which would be
ref = np.array([1.22, 1.18])
I would thus have an array with 6 elements as an answer (one per row of pos), but I'm really confused about how to approach this with only numpy. I've tried many ways, yet the size of the ref array presents a challenge. Thanks for the help.
The expected answer is an array with 6 elements. The elements are approximately:
[ 1.468, 4.596, 4.928 , 4.611, 2.410, 0.141 ]
Using numpy and assuming Euclidean metric:
import numpy as np
np.linalg.norm(pos - ref, axis=1)
If you need a Python list (instead of numpy array), add .tolist() to the previous line:
np.linalg.norm(pos - ref, axis=1).tolist()
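The differing sizes are not a problem because NumPy broadcasts ref across every row of pos. As a sketch, the norm call above should be equivalent to spelling the Euclidean distance out by hand (the names diff and dist are illustrative):

```python
import numpy as np

pos = np.array([[ 1.72,  2.56],
                [ 0.24,  5.67],
                [-1.24,  5.45],
                [-3.17, -0.23],
                [ 1.17, -1.23],
                [ 1.12,  1.08]])
ref = np.array([1.22, 1.18])

# (6, 2) minus (2,) broadcasts ref over all six rows
diff = pos - ref

# Square, sum over the coordinate axis, take the root: one distance per row
dist = np.sqrt((diff ** 2).sum(axis=1))
print(dist)
```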

How to plot graph involving complex numbers?

I want to plot a graph of the magnitude of 1/(1+(i)(omega)(tau)) against frequency f, where i is the imaginary number, omega=(2)(pi)(f), tau is a constant. The following is the first part of the code:
import pylab as pl
import numpy as np
f=np.logspace(-2,4,10)
tau=1.0
omega=2*np.pi*f
y=np.complex(1,omega*tau)
print y
But I get this TypeError: only length-1 arrays can be converted to Python scalars. What's the problem? Why can't I put f (which is an array right?) to y? By the way, I am using enthought canopy.
One more question: What's the difference between pylab and matplotlib? Different modules? If I'm just plotting graphs, dealing with complex numbers and matrix, which one should I use?
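You can't construct NumPy arrays with np.complex: it was only an alias for Python's built-in complex, which accepts scalars, hence the TypeError (the alias was removed in NumPy 1.24). In Python, putting a j after a number makes it imaginary. Thus, to make a complex array simply do: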
y = 1 + omega * tau * 1j
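From there, the magnitude the question asks for can be sketched as follows. The analytic expression 1/sqrt(1 + (omega*tau)^2) is included only as a sanity check, and the variable names magnitude and expected are illustrative:

```python
import numpy as np

tau = 1.0
f = np.logspace(-2, 4, 10)
omega = 2 * np.pi * f

# Complex array built with the 1j literal
y = 1 + omega * tau * 1j

# Magnitude of 1 / (1 + i*omega*tau), computed element-wise
magnitude = np.abs(1 / y)

# Closed form of the same quantity, used as a cross-check
expected = 1 / np.sqrt(1 + (omega * tau) ** 2)
print(np.allclose(magnitude, expected))
```

The magnitude array can then be plotted against f, for instance on logarithmic axes with something like pl.loglog(f, magnitude).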
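Alternatively, you can use np.vectorize to apply the built-in complex to each element. That is,
import numpy as np

def main():
    f = np.logspace(-2, 4, 10)
    print(f)
    tau = 1.0
    omega = 2 * np.pi * f
    # np.vectorize calls complex(1, x) once per element of omega*tau
    y = np.vectorize(complex)(1, omega * tau)
    print(y)

main()
will first print f: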
[ 1.00000000e-02 4.64158883e-02 2.15443469e-01 1.00000000e+00
4.64158883e+00 2.15443469e+01 1.00000000e+02 4.64158883e+02
2.15443469e+03 1.00000000e+04]
And then print y:
[ 1. +6.28318531e-02j 1. +2.91639628e-01j 1. +1.35367124e+00j
1. +6.28318531e+00j 1. +2.91639628e+01j 1. +1.35367124e+02j
1. +6.28318531e+02j 1. +2.91639628e+03j 1. +1.35367124e+04j
1. +6.28318531e+04j]

Adding 2D matrices as numpy arrays

I want to add two 3x2 matrices, g and temp_g.
Currently g is
[[ 2.77777778e+000 6.58946653e-039]
[ 4.96398713e+173 1.64736663e-039]
[ -1.88888889e+000 -3.29473326e-039]]
And temp_g is:
[[ -5.00000000e-01 -2.77777778e+00]
[ -1.24900090e-16 -4.44444444e-01]
[ 5.00000000e-01 1.88888889e+00]]
But when I do g = g + temp_g, and output g, I get this:
[[ 2.27777778e+000 -2.77777778e+000]
[ 4.96398713e+173 -4.44444444e-001]
[ -1.38888889e+000 1.88888889e+000]]
Maybe I'm having trouble understanding long float numbers... but is this what the result ought to be? I expected that g[0][0] would get added to temp_g[0][0], and g[0][1] to temp_g [0][1] and so on...
Your addition is working fine, but your two arrays have some seriously different orders of magnitude.
Take for example 4.96398713e+173 + (-1.24900090e-16): the first number is 189 orders of magnitude bigger than the second. Floating point numbers don't have this level of accuracy (a 64-bit float carries roughly 15 to 17 significant decimal digits). You're talking about taking a number with ~170 zeros at the end of it and adding a number along the lines of 0.00000000000000001249 to it, so the small term is lost entirely.
I would suggest reading up on the limitations of floating point numbers (in all languages, not just Python).
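A quick way to see this absorption in practice, as a sketch using the two values from the question (np.spacing reports the gap between a float and the next representable one):

```python
import numpy as np

big = 4.96398713e+173
small = -1.24900090e-16

# The sum is indistinguishable from big: small is below the resolution at that scale
print(big + small == big)

# Distance from big to the next representable float64
print(np.spacing(big))

# Relative precision of float64
print(np.finfo(np.float64).eps)
```

Anything much smaller in magnitude than np.spacing(big) simply cannot change big when added to it.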
The Decimal library can be used for handling numbers more accurately than floats.
import numpy as np
import decimal
a = decimal.Decimal(4.96398713e+173)
b = decimal.Decimal(1.24900090e-16)
print(a+b)
# 4.963987129999999822073620193E+173
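# NumPy has no Decimal dtype: np.dtype(decimal.Decimal) is just the object dtype,
# so convert each value to Decimal explicitly and store them in an object array
values = [[ 2.77777778e+000, 6.58946653e-039],
          [ 4.96398713e+173, 1.64736663e-039],
          [ -1.88888889e+000, -3.29473326e-039]]
a = np.array([[decimal.Decimal(v) for v in row] for row in values], dtype=object)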
