How to plot eigenvalues representing symbolic functions in Python?

I need to calculate the eigenvalues of an 8x8 matrix and plot each of them as a function of a symbolic variable occurring in the matrix. For the matrix I'm using I get 8 different eigenvalues, each representing a function of "W", which is my symbolic variable.
Using Python I tried calculating the eigenvalues with SciPy and SymPy, which sort of worked, but the results are stored in a way I don't understand (at least as a newbie who doesn't know much about programming so far), and I didn't find a way to extract just one eigenvalue in order to plot it.
import numpy as np
import sympy as sp
W = sp.Symbol('W')
w0=1/780
wl=1/1064
# This is my 8x8-matrix
A= sp.Matrix([[w0+3*wl, 2*W, 0, 0, 0, np.sqrt(3)*W, 0, 0],
[2*W, 4*wl, 0, 0, 0, 0, 0, 0],
[0, 0, 2*wl+w0, np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W],
[0, 0, np.sqrt(3)*W, 3*wl, 0, 0, 0, 0],
[0, 0, 0, 0, wl+w0, np.sqrt(2)*W, 0, 0],
[np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W, 2*wl, 0, 0],
[0, 0, 0, 0, 0, 0, w0, W],
[0, 0, np.sqrt(2)*W, 0, 0, 0, W, wl]])
# Calculating eigenvalues
eva = A.eigenvals()
evaRR = np.array(list(eva.keys()))
eva1p = evaRR[0] # <- this is my try to refer to the first eigenvalue
In the end I hope to get a plot over "W" where the interesting range is [-0.002, 0.002]. For those interested, it's about atomic physics: W refers to the Rabi frequency, and I'm looking at so-called dressed states.

You're not doing anything incorrectly -- I think you're just getting caught up because your eigenvalues look so jumbled and complicated.
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
W = sp.Symbol('W')
w0=1/780
wl=1/1064
# This is my 8x8-matrix
A= sp.Matrix([[w0+3*wl, 2*W, 0, 0, 0, np.sqrt(3)*W, 0, 0],
[2*W, 4*wl, 0, 0, 0, 0, 0, 0],
[0, 0, 2*wl+w0, np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W],
[0, 0, np.sqrt(3)*W, 3*wl, 0, 0, 0, 0],
[0, 0, 0, 0, wl+w0, np.sqrt(2)*W, 0, 0],
[np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W, 2*wl, 0, 0],
[0, 0, 0, 0, 0, 0, w0, W],
[0, 0, np.sqrt(2)*W, 0, 0, 0, W, wl]])
# Calculating eigenvalues
eva = A.eigenvals()
evaRR = np.array(list(eva.keys()))
# The above is copied from your question
# We have to answer what exactly the eigenvalue is in this case
print(type(evaRR[0])) # >>> Piecewise
# Okay, so it's a piecewise function (link to documentation below).
# In the documentation we see that we can use the .subs method to evaluate
# the piecewise function by substituting a symbol for a value. For instance,
print(evaRR[0].subs(W, 0)) # Will substitute 0 for W
# This prints out something really nasty with tons of fractions..
# We can evaluate this mess with sympy's numerical evaluation method, N
print(sp.N(evaRR[0].subs(W, 0)))
# >>> 0.00222190090611143 - 6.49672880062804e-34*I
# That's looking more like it! Notice the e-34 exponent on the imaginary part...
# I think it's safe to assume we can just trim that off.
# This is done by setting the chop keyword to True when using N:
print(sp.N(evaRR[0].subs(W, 0), chop=True)) # >>> 0.00222190090611143
# Now let's try to plot each of the eigenvalues over your specified range
fig, ax = plt.subplots(3, 3) # 3x3 grid of plots (for our 8 e.vals)
ax = ax.flatten() # This is so we can index the axes easier
plot_range = np.linspace(-0.002, 0.002, 10) # Range from -0.002 to 0.002 with 10 steps
for n in range(8):
    current_eigenval = evaRR[n]
    # There may be a way to vectorize this computation, but I'm not familiar enough with sympy.
    evaluated_array = np.zeros(np.size(plot_range))
    # This will be our Y-axis (the eigenvalue evaluated at each W). It is set to be
    # the same shape as plot_range and is initially filled with all zeros.
    for i in range(np.size(plot_range)):
        evaluated_array[i] = sp.N(current_eigenval.subs(W, plot_range[i]),
                                  chop=True)
        # The above line is evaluating your eigenvalue at a specific point,
        # approximating it numerically, and then chopping off the imaginary part.
    ax[n].plot(plot_range, evaluated_array, "c-")
    ax[n].set_title("Eigenvalue #{}".format(n))
    ax[n].grid()
plt.tight_layout()
plt.show()
And as promised, the Piecewise documentation.
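A side note on the "vectorize this computation" comment above: SymPy's lambdify can usually turn each symbolic eigenvalue into a NumPy-callable function, which replaces the per-point subs/N loop. This is only a sketch that continues from the code above (it assumes np, sp, plt, W and evaRR are already defined); if lambdify chokes on the Piecewise output of eigenvals(), the loop above is the safe fallback. Evaluating on a complex grid and taking the real part plays the role of chop=True:
plot_range = np.linspace(-0.002, 0.002, 200)
fig, ax = plt.subplots(3, 3)
ax = ax.flatten()
for n in range(8):
    f = sp.lambdify(W, evaRR[n], modules="numpy")  # symbolic expression -> numpy callable
    values = f(plot_range.astype(complex))         # complex input so sqrt of negatives stays finite
    ax[n].plot(plot_range, np.real(values), "c-")  # discard the tiny imaginary residue
    ax[n].set_title("Eigenvalue #{}".format(n))
    ax[n].grid()
plt.tight_layout()
plt.show()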

Related

Torch tensor randomly contains huge, impossible values

I'm new to PyTorch and I'm trying to debug my code using IntelliJ PyCharm. I have a line that logs the content of a torch.IntTensor
logger.debug(f"action_tensor = {action_tensor}")
Most of the time this seems to work just fine, but occasionally the print out shows one or several huge values in the tensor, such as:
2021-08-06 09:21:17,737 DEBUG main.py state_tensor = tensor([2089484293, 0, 0, 1, 0, 1,
1, 0, 0, 0, 0, 0,
0, 1, 1, 1, 1, 1,
1, 1, 1, 2, 2, 0,
0, 0, 0, 0, 0, 6,
0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0],
dtype=torch.int32)
The tensor is created by extracting a few values from the state of an object
rolls = int(self.rolls)
allowed = [int(self.scorecard[c]["allowed"] == True) for c in self.scorecard]
scored = [int(self.scorecard[c]["score"]) if self.scorecard[c]["score"] else int(0) for c in self.scorecard]
return torch.cat([torch.IntTensor(rolls),
                  torch.IntTensor(allowed),
                  torch.IntTensor(scored)])
I've checked multiple times, and there is no way any of these values can be as large as the example above (e.g. 2089484293). I've tried just creating a numpy array instead of a tensor, and printing that shows no problems. I suspect there is something I don't know about how torch.IntTensor works.
What is wrong with the way I create my tensor that results in these huge values appearing sometimes?
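(Not part of the original post, but for reference on the constructor semantics involved: when torch.IntTensor is given a plain Python int, that int is treated as a size, so the tensor comes back uninitialized. A minimal sketch, with rolls as a stand-in for the value from the question:)
import torch

rolls = 3

# Legacy constructor: a bare int is a *size*, so this allocates an
# uninitialized 3-element tensor containing whatever happened to be in memory.
t1 = torch.IntTensor(rolls)
print(t1)  # arbitrary (uninitialized) values

# Wrapping the value in a list stores the data itself.
t2 = torch.tensor([rolls], dtype=torch.int32)
print(t2)  # tensor([3], dtype=torch.int32)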

Can one hardcode convolutional filters to detect characters in a CNN?

In Pytorch, you can hardcode your filters to be whatever you like.
At the moment, I'm doing text detection and I need to identify the location of a certain piece of information. This information always starts with the letter 'X'. Could detection performance improve radically if I hardcode an 'X' filter?
Here's what I have so far:
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
kernel = (torch.zeros((9, 9)) +
          torch.eye(9) +
          torch.rot90(torch.eye(9))).type(torch.bool) * 1
print(kernel)
tensor([[1, 0, 0, 0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0, 0, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 1]])
We can visualize it like this:
plt.imshow(kernel)
plt.show()
Then, we can set the filter weights as such:
conv = nn.Conv2d(in_channels=1,
                 out_channels=1,
                 kernel_size=9,  # match the 9x9 kernel built above
                 stride=3,
                 bias=False)
# Conv2d weights have shape (out_channels, in_channels, H, W), so reshape and cast:
conv.weight.data = kernel.float().reshape(1, 1, 9, 9)
No, I do not think this will improve detection performance.
Running your network on new data, where the training labels are unknown, is usually called "inference," and that is where detection performance is measured. Hard-coding the weights will make absolutely no difference to the test performance of the network, as you still need to compute the same convolutions.
We could also ask whether it will improve training performance. Here, too, I expect the answer is no. One of the reasons neural networks achieve the accuracy they do is that they pick up on subtle patterns in the training data. A real 'X' on a real page is very unlikely to align exactly with the pixels you set to 1 in your example; slight rotations, sub-pixel shifts, or even a different aspect ratio of the letter will change what the optimal filter looks like.
Indeed, one of the major changes in machine learning as we move into the deep learning era is that neural networks do a better job of picking low-level features than a human engineer could.
But thank you for the question -- just the code snippet of how to hard-code the value of a layer was useful to me!
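For what it's worth, here is a slightly fuller sketch of hard-coding and freezing such a filter, building on the 'X' kernel from the question. The no_grad/copy_/requires_grad_ pattern is standard PyTorch; the input size is just an example:
import torch
import torch.nn as nn

# The 9x9 "X" kernel from the question, as a float tensor.
kernel = ((torch.eye(9) + torch.rot90(torch.eye(9))) > 0).float()

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=9, stride=3, bias=False)
with torch.no_grad():
    conv.weight.copy_(kernel.reshape(1, 1, 9, 9))  # set the weights in place
conv.weight.requires_grad_(False)  # freeze them so training never updates this filter

# Apply to a dummy 1-channel "page" image to check shapes.
x = torch.rand(1, 1, 90, 90)
print(conv(x).shape)  # torch.Size([1, 1, 28, 28]) with this kernel size and stride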

Crop empty arrays (padding) from a volume

What I want to do is crop a volume to remove all irrelevant data. For example, say I have a 100x100x100 volume filled with zeros, except for a 50x50x50 sub-volume within it that is filled with ones.
How do I obtain the cropped 50x50x50 volume from the original?
Here's the naive method I came up with.
import numpy as np
import tensorflow as tf
test=np.zeros((100,100,100)) # create an empty 100x100x100 volume
rand=np.random.rand(66,25,34) # create a 66x25x34 filled volume
test[10:76, 20:45, 30:64] = rand # partially fill the empty volume
# initialize the cropping coordinates
minx=miny=minz=0
maxx=maxy=maxz=0
maxx,maxy,maxz=np.subtract(test.shape,1)
# compute the optimal cropping coordinates
dimensions=test.shape
while(tf.reduce_max(test[minx,:,:]) == 0): # check for empty slices along the x axis
    minx+=1
while(tf.reduce_max(test[:,miny,:]) == 0): # check for empty slices along the y axis
    miny+=1
while(tf.reduce_max(test[:,:,minz]) == 0): # check for empty slices along the z axis
    minz+=1
while(tf.reduce_max(test[maxx,:,:]) == 0):
    maxx-=1
while(tf.reduce_max(test[:,maxy,:]) == 0):
    maxy-=1
while(tf.reduce_max(test[:,:,maxz]) == 0):
    maxz-=1
maxx,maxy,maxz=np.add((maxx,maxy,maxz),1)
crop = test[minx:maxx,miny:maxy,minz:maxz]
print(minx,miny,minz,maxx,maxy,maxz)
print(rand.shape)
print(crop.shape)
This prints:
10 20 30 76 45 64
(66, 25, 34)
(66, 25, 34)
This is correct. However, it takes too long and is probably suboptimal; I'm looking for a better way to achieve the same thing.
NB:
The subvolume wouldn't necessarily be a cuboid; it could be any shape.
I want to keep gaps within the subvolume and only remove what's "outside" the shape to be cropped.
(Edit)
Oops, I hadn't seen the comment about keeping the so-called "gaps" between elements! This should be the one, finally.
def get_nonzero_sub(arr):
    arr_slices = tuple(np.s_[curr_arr.min():curr_arr.max() + 1] for curr_arr in arr.nonzero())
    return arr[arr_slices]
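A quick check of that helper against the example volume from the question (assuming test is built as in the question's snippet):
import numpy as np

test = np.zeros((100, 100, 100))
test[10:76, 20:45, 30:64] = np.random.rand(66, 25, 34)

crop = get_nonzero_sub(test)   # helper defined above
print(crop.shape)              # expected: (66, 25, 34)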
While you wait for a sensible response (I would guess this is a builtin function in an image processing library somewhere), here's a way
y, x = np.where(np.any(test, 0))
z, _ = np.where(np.any(test, 1))
test[min(z):max(z)+1, min(y):max(y)+1, min(x):max(x)+1]
I think leaving tf out of this should up your performance.
Explanation (based on 2D array)
test = np.array([
[0, 0, 0, 0, 0, ],
[0, 0, 1, 2, 0, ],
[0, 0, 3, 0, 0, ],
[0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, ],
])
We want to crop it to get
[[1, 2]
[3, 0]]
np.any(..., 0) will 'iterate' over axis 0 and return True if any of the elements in the slice are truthy. I show the result of this in the comments here:
np.array([
[0, 0, 0, 0, 0, ], # False
[0, 0, 1, 2, 0, ], # True
[0, 0, 3, 0, 0, ], # True
[0, 0, 0, 0, 0, ], # False
[0, 0, 0, 0, 0, ], # False
])
i.e. it returns np.array([False, True, True, False, False])
np.any(..., 1) does the same as the previous step but over axis 1 instead of axis 0, i.e.
np.array([
[0, 0, 0, 0, 0, ],
[0, 0, 1, 2, 0, ],
[0, 0, 3, 0, 0, ],
[0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, ],
# False False True True False
])
Note that in the case of a 3D array, these steps return 2D arrays
(x,) = np.where(...) returns the index values of the truthy values in an array. So np.where([False, True, True, False, False]) returns (array([1, 2]),). Note that this is a tuple, so for a 1D input we need to write (x,) = ... so that x is just the array array([1, 2]). The syntax is nicer in the 2D case, where we can use tuple-unpacking, i.e. x, y = ...
Note that in the 3D case, np.where can give us the values for two axes at a time. I chose to do x-y in one go and then z-? in the second go. The ? is either x or y; I can't be bothered to work out which, and since we don't need it I throw it away in a variable named _, which by convention is a reasonable place to store junk output you don't actually want. Note that I need to write z, _ = because I want the tuple-unpacking; with just z = , z would become the tuple holding both arrays.
The final slicing step is pretty much the same as what you did at the end of your question, so I assume you understand it: simple slicing in each dimension, from the first index with a value in that dimension to the last. You need the + 1 because slices in Python do not include the index after the :.
Hopefully that's clear?
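As a footnote to the guess above that an image-processing library has this built in: scipy.ndimage gets close. label plus find_objects returns a bounding-box slice per connected nonzero region, which recovers the same crop for the question's example (a sketch only, assuming SciPy is installed; note that disconnected regions get separate boxes rather than one union box):
import numpy as np
from scipy import ndimage

test = np.zeros((100, 100, 100))
test[10:76, 20:45, 30:64] = np.random.rand(66, 25, 34)

labeled, num_regions = ndimage.label(test > 0)   # label connected nonzero regions
slices = ndimage.find_objects(labeled)           # list of bounding-box slice tuples
crop = test[slices[0]]
print(crop.shape)  # (66, 25, 34) for this example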

How do I perform dimensionality reduction on two independent XOR gates?

Take the probability distribution of an XOR gate in which every valid configuration is equally probable (the configurations are given by outcomes_sub; the probability mass function by pmf_xor_sub):
import numpy as np
import itertools as it
outcomes_sub = [list(item) for item in list(it.product([0,1], repeat=3))]
pmf_xor_sub = np.array([1/4, 0, 0, 1/4, 0, 1/4, 1/4, 0])
Now take the probability distribution corresponding to two uncorrelated such XORs:
outcomes = [outcome1 + outcome2 for (outcome1, outcome2)
            in it.product(outcomes_sub, outcomes_sub)]
pmf_xor = [pmf1 * pmf2 for (pmf1, pmf2)
           in it.product(pmf_xor_sub, pmf_xor_sub)]
And create some data based on it:
indices = np.random.choice(len(outcomes), 10000, p=pmf_xor)
data_xor = np.array([outcomes[index] for index in indices])
data_xor looks like this:
array([[1, 1, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
...,
[0, 1, 1, 1, 1, 0],
[1, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
I.e., two independent XORs back to back. What's the right way to perform dimensionality reduction on it? PCA won't work (because the dependence is non-linear, right?):
from sklearn import decomposition
pca_xor = decomposition.PCA()
pca_xor.fit(data_xor)
Now, pca_xor.explained_variance_ratio_ gives:
array([ 0.17145045, 0.17018817, 0.16758773, 0.16575979, 0.16410862,
0.16090524], dtype=float32)
No two components stand out. I understand that a non-linear method such as kernel PCA should work here, but I am struggling to find pointers to ways of applying it to my problem.
To give a bit more context: what I am actually after is ways to bring out the structure in data_xor: two big XOR blobs, each of which is composed of some finer-grained stuff. If I am going about it all wrong, feel free to point that out too.
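(Not from the original post, but since the question asks for pointers on applying kernel PCA: the scikit-learn call is essentially a drop-in replacement for the PCA snippet above. A minimal sketch only; the RBF kernel, gamma, and the subsampling are illustrative choices, and there is no guarantee this separates the two XOR blobs:)
from sklearn import decomposition
import matplotlib.pyplot as plt

# Kernel PCA builds an n_samples x n_samples kernel matrix, so subsample first.
subset = data_xor[:2000]

kpca = decomposition.KernelPCA(n_components=2, kernel="rbf", gamma=1.0)
embedded = kpca.fit_transform(subset)        # shape (2000, 2)

# Colour by the first gate's output bit to see whether the blobs separate.
plt.scatter(embedded[:, 0], embedded[:, 1], c=subset[:, 2], s=5)
plt.show()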

How to optimize parameters for binomial log-likelihood in python/scipy?

I am converting some R code (not mine) for estimating the parameters of a choice model to Python. My Python version does not converge to the same parameters as the R version on some test data, and I am not sure why.
The R code defines a log-likelihood function (L) and then uses the nlm() function to estimate the parameters:
L <- function(p, y1, m, i1, i0)
    -sum(dbinom(y1, m, 1/(1 + i0 %*% p/i1 %*% p), log=TRUE))
out <- nlm(L, s, y1=y1, m=n, i1=idx1, i0=idx0)
For a set of test data this produces parameter estimates:
[1] 0.014302792 0.001703516 0.002347832 0.035365775 0.517465153 0.063503823 0.005776879
In Python I have written (what I believe to be) an equivalent log-likelihood function (it returns the same values as the R version for test parameters) and tried using scipy.optimize.minimize() in place of nlm():
import numpy as np
import scipy.optimize
from scipy import stats

def LL(p, *args):
    y1 = args[0]
    m = args[1]
    i1 = args[2]
    i0 = args[3]
    i0p = np.dot(i0, p)
    i1p = np.dot(i1, p)
    P = 1/(1 + np.divide(i0p, i1p))
    # y1 are the observed successes in the pairwise comparison experiment,
    # m the number of trials, P the probability of success in one trial.
    # I'm fairly sure these inputs are the same in the Python and R versions.
    return -np.sum(stats.binom.logpmf(y1, m, P))

out = scipy.optimize.minimize(LL, s, args=(y1, n, idx1, idx0))
However, on running, minimize() seems to be unsuccessful:
out:
status: 2
success: False
njev: 21
nfev: 201
hess_inv: array([[1, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1]])
fun: -273.75549396685
x: array([ 0.14285714, 0.14285714, 0.14285714, 0.14285714, 0.14285714,
0.14285714, 0.14285714])
message: 'Desired error not necessarily achieved due to precision loss.'
jac: array([ 27.99998093, -552.99998856, -500.49999237, 111.99997711,
671.99995422, 255.49996948, -14.00000381])
Other methods (e.g. 'Powell') report success but parameters are way off those from the example in R.
My questions are:
Elsewhere I've seen that 'Desired error not necessarily achieved due to precision loss.' can be the result of a badly behaved likelihood function - can anyone tell me whether that is the case here? How might I fix it?
Should I try some of the other optimisation methods? They require derivatives to be passed to the minimize() method - how do I define the gradient (and, if necessary, the Hessian) for my LL function? I saw an example using statsmodels' GenericLikelihoodModel but became confused about exog/endog...
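(Not from the original post, but as a pointer on the second question: scipy.optimize.minimize takes an optional jac= callable, and there are also derivative-free methods that need no gradient at all. A minimal sketch of both interfaces, reusing the LL, s, y1, n, idx1, idx0 names from the question:)
import scipy.optimize

# Option 1: a derivative-free method; no gradient required.
out_nm = scipy.optimize.minimize(LL, s, args=(y1, n, idx1, idx0),
                                 method='Nelder-Mead')

# Option 2: a gradient-based method with an explicit jac= callable.
# Here the "gradient" is just a finite-difference approximation, purely to
# illustrate the interface (BFGS would do this internally anyway if jac=None).
def LL_grad(p, *args):
    return scipy.optimize.approx_fprime(p, LL, 1e-8, *args)

out_bfgs = scipy.optimize.minimize(LL, s, args=(y1, n, idx1, idx0),
                                   method='BFGS', jac=LL_grad)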
