running PCA analysis matplotlib results print <matplotlib.mlab.PCA instance at 0xffa4ee6c> - python

I am trying to run a PCA on some data using the matplotlib PCA class. However, when I run the code below and print the results, I do not get the output I expected:
import numpy as np
import os
import matplotlib
from matplotlib.mlab import PCA
x=np.zeros((62,2))
a=np.genfromtxt('1.txt').T[3] #list 62numbers
#print a
x[:,0]=a
print x[:,0]
b=np.genfromtxt('2.txt').T[3] #list 4numbers
x[:,1]=b
#print x
results=PCA(x)
print results
The result that gets printed is <matplotlib.mlab.PCA instance at 0xffa4ee6c>. Why is that?

If you look at the documentation for matplotlib.mlab.PCA you see from the header
class matplotlib.mlab.PCA(a)
that it is actually a class you are dealing with. When you do PCA(x) you are creating an instance of that class, and when you print it in the following line you are told that you have got yourself an instance of matplotlib.mlab.PCA.
You can confirm this by printing the output of dir(results), which lists the attributes available on the object. This is often a helpful way to find out what kind of object you are dealing with.
What you want to do here is to use the attributes of this object you've got. For instance,
print results.Y # print the projection in PCA space
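To illustrate the point without depending on matplotlib, here is a minimal sketch written for Python 3. The class `FakePCA` is a hypothetical stand-in (it is not part of any library, and it does not compute a real PCA); it only shows that printing an instance gives the type and memory address, while the data are reached through attributes.

```python
class FakePCA:
    """Hypothetical stand-in for a results class such as matplotlib.mlab.PCA."""

    def __init__(self, data):
        # Store some derived data as an attribute, the way PCA stores .Y
        # (reversing each row is a placeholder, not a real projection)
        self.Y = [row[::-1] for row in data]


results = FakePCA([[1, 2], [3, 4]])

# No __repr__ is defined, so printing shows the default form,
# something like <...FakePCA object at 0x...>
print(results)

# List the public attributes available on the instance
public_names = [name for name in dir(results) if not name.startswith("_")]
print(public_names)

# The actual data are reached through the attribute
print(results.Y)
```

The same pattern applies to the PCA object in the question: inspect it with dir(), then read the attribute you need.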

Related

wavedec does not return any coefficients in python using pywt library

I used the wavelet decomposition command from the pywt library in Python, but it does not return any coefficients. My code is given below.
import numpy as np
import pywt as pywt
(e,f)=pywt.wavedec(y,'db12' ,level=2)
print("e:"+str(e))
print("f:"+str(f))
I also tried pywt.dwt(y,' db12', level=2), but that does not return any coefficients either.
It returns a null output, where y is a matrix that contains my input.
I tried reproducing your results with a random (discrete) signal like so:
import numpy as np
import pywt
x = np.random.randint(0,100,500)
y = pywt.wavedec(x, 'db12', level=2)
(e,f) = pywt.dwt(x, 'db12')
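I noticed two things. First, for a 1D signal, wavedec returns a list of level + 1 coefficient arrays (one approximation array plus one detail array per level), as mentioned in the docs, so with level=2 you get three arrays and cannot unpack them into just (e,f). Second, the dwt function does not accept the keyword level=; it performs a single-level transform and works just fine when called as above.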
Hope that helps
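As a rough sketch of the two points above (assuming PyWavelets and NumPy are installed; the random signal and the variable names are placeholders, not the asker's data):

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
x = rng.integers(0, 100, 500)  # placeholder 1D signal

# wavedec returns [cA_n, cD_n, ..., cD_1], i.e. level + 1 arrays
coeffs = pywt.wavedec(x, "db12", level=2)
print(len(coeffs))

# With level=2 there are three arrays to unpack, not two
cA2, cD2, cD1 = coeffs

# dwt performs a single-level transform and returns exactly two arrays
cA, cD = pywt.dwt(x, "db12")
print(cA.shape, cD.shape)
```

Unpacking into the right number of names, or keeping the list and indexing into it, avoids the unpacking error that (e,f) would raise.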

Plotting pandas dataframe and multiprocessing in Python

I have a pandas dataframe and I want to plot slices of it, in a function using multiprocessing. Even though the function "process_expression" works when I call it independently, when I use the "multiprocessing" option it is not giving any plots.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy
import seaborn as sns
import sys
from multiprocessing import Pool
import os
os.system("taskset -p 0xff %d" % os.getpid())
pool = Pool()
gn = pool.map(process_expression, gene_ids)
pool.close()
pool.join()
def process_expression(gn_name, df_gn=df_coding):
    df_part = df_gn.loc[df_gn['Gene_id'] == gn_name]
    df_part = df_part.drop('Gene_id', 1)
    df_part = df_part.drop('Transcript_biotype', 1)
    COUNT100 = df_part[df_part >100 ].count()
    COUNT10 = (df_part[df_part >10 ].count()) - COUNT100
    COUNT1 = (df_part[df_part >1].count())- COUNT100 - COUNT10
    COUNT0 = (df_part[df_part >0].count())- COUNT100-COUNT10- COUNT1
    result = pd.concat([COUNT0,COUNT1,COUNT10,COUNT100], axis=1)
    result.columns = [ '0 TO 1', '1 TO 10','10 TO 100', '>100']
    result.plot( kind='bar', figsize=(50, 20), fontsize=7, stacked=True)
    plt.savefig('./expression_levels/all_genes/'+gn_name+'.png')#,bbox_inches='tight')
    plt.close()
the df_coding table is something like (it has more columns, I erased some):
Isoform_name,heart,heart.1,lung.3,Gene_id,Transcript_biotype
ENST00000296782,0.14546900000000001,0.161245,0.09479889999999999,ENSG00000164327,protein_coding
ENST00000357387,6.53902,5.86969,7.057689999999999,ENSG00000164327,protein_coding
ENST00000514735,0.0,0.0,0.0,ENSG00000164327,protein_coding
The input dataframe df_coding is a dataframe with a column Gene_id. In this column I have a list of gn_name. What I want is to take each time only the parts of the dataframe which have the name gn_name[i] in the Gene_id column and plot a barplot based on this dataframe.
For example, if I call process_expression('ENSG00000164327'), which uses a specific gn_name, the output is something like this (plot not shown):
What am I doing wrong? I know that the process stops at the plotting command when I run it with multiprocessing.
The problem lies in the interaction between multiprocessing and matplotlib. With multiprocessing you create a completely new context with each process. The new process does not (and cannot) successfully initialize the plotting context, because it has already been initialized in the parent process.
If you are trying to overcome a performance issue then you may be on the right track. However, plotting back to the correctly initialized context of the parent process will require you to go a lot deeper into the structure of the underlying matplotlib guts. Here is an example of setting a data pipe back to the original application. Really this is only going to help if you are dealing with a lot of processing of the data before it is plotted. It doesn't look like that is what you are doing here.
If you are trying to get a visual effect like stacked or overlaid results, then you probably want to look into repeating the plot function or modifying the data structure to better represent what you want to visualize.
So. What problem are you trying to solve? A performance problem, or a visualization problem? If it is a visualization problem then you do NOT want to use multiprocessing.
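If the goal is performance, one way to read the advice above is to let the workers do only the numeric work and hand plain data back to the parent, which then does the plotting. The following is a minimal sketch of that split using only the standard library. The names `EXPRESSION`, `count_bins`, and `process_gene` and the sample values are hypothetical stand-ins for the asker's dataframe and function, and the plotting step is left as a comment.

```python
from multiprocessing import Pool

# Hypothetical expression values per gene; stands in for slices of df_coding
EXPRESSION = {
    "gene_a": [0.145469, 6.53902, 0.0],
    "gene_b": [12.0, 250.0, 0.5, 3.0],
}

LABELS = ("0 TO 1", "1 TO 10", "10 TO 100", ">100")


def count_bins(values):
    """Count values in the bins (0, 1], (1, 10], (10, 100] and > 100."""
    counts = dict.fromkeys(LABELS, 0)
    for v in values:
        if v > 100:
            counts[">100"] += 1
        elif v > 10:
            counts["10 TO 100"] += 1
        elif v > 1:
            counts["1 TO 10"] += 1
        elif v > 0:
            counts["0 TO 1"] += 1
    return counts


def process_gene(name):
    """Worker: do the numeric work only and return plain data."""
    return name, count_bins(EXPRESSION[name])


if __name__ == "__main__":
    # The worker functions are defined before the pool is created
    with Pool(2) as pool:
        results = dict(pool.map(process_gene, sorted(EXPRESSION)))
    # Back in the parent process: the plotting calls would go here
    for name, counts in results.items():
        print(name, counts)
```

Note that the worker functions are defined before the pool is created, and that only picklable data (strings and dicts) cross the process boundary.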

AttributeError: 'function' object has no attribute 'linreg'

I am new to Python and programming and am working my way through "Machine Learning: An Algorithmic Perspective." I was told to normalise the data, separate it into training and testing data, recover the beta vector, and then compute the sum-of-squares error. I keep getting:
File "/Users/shaune/Dropbox/Shaune PhD/auto-mpg.py", line 34, in <module>
beta=linreg.linreg(trainin,traintgt)
AttributeError: 'function' object has no attribute 'linreg'
when running the following:
import os
import pylab as pl
import numpy as np
from pylab import *
from numpy import *
import linreg
os.chdir('/Users/shaune/Dropbox/Shaune PhD')
auto=np.loadtxt('auto-mpg.data.txt',comments='"')
#normalise the data
auto=(auto-auto.mean(axis=0))/auto.var(axis=0)
#seperate the training and testing data
trainin=auto[::2,:8]
testin=auto[1::2,:8]
traintgt=auto[::2,1:2]
testtgt=auto[1::2,1:2]
#recover the beta vector
def linreg(trainin,traintgt):
    trainin=np.concatenate((trainin,-np.ones((np.shape(trainin)[0],1))),axis=1)
    beta=np.dot(np.dot(np.linalg.inv(np.dot(np.transpose(trainin),trainin)),np.transpose(trainin)),traintgt)
    traintgt=np.dot(trainin, beta)
#sum of squares error to get predicted values on test set (want small values)
beta=linreg.linreg(trainin,traintgt)
testin=concatenate((testin,-np.ones((np.shape(testin)[0],1))),axis=1)
testout=dot(testin,beta)
error=sum((testout-testtgt)**2)
print error
Please help! Thanks.
The definition of this function
def linreg(trainin,traintgt):
is overwriting the name linreg that you imported with
import linreg
Rename the function. The comment says recover the beta vector, so perhaps a better name is recover_beta. That is, change the def statement to
def recover_beta(trainin,traintgt):
You'll probably want to add a return statement to the function while you are at it. Currently it doesn't return anything.
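A minimal sketch of the renamed function with a return statement might look like the following. It is written for Python 3, keeps the normal-equations formula from the question, and uses hypothetical toy data (y = 2*x + 1) in place of the auto-mpg file.

```python
import numpy as np


def recover_beta(trainin, traintgt):
    """Least-squares coefficients via the normal equations, with a -1 bias column."""
    # Append a column of -1 so the last coefficient acts as the bias term
    trainin = np.concatenate((trainin, -np.ones((trainin.shape[0], 1))), axis=1)
    beta = np.dot(
        np.dot(np.linalg.inv(np.dot(trainin.T, trainin)), trainin.T), traintgt
    )
    return beta


# Hypothetical toy data: y = 2*x + 1
x = np.arange(5, dtype=float).reshape(-1, 1)
y = 2 * x + 1

# Call the function directly by its new name; no module prefix is needed
beta = recover_beta(x, y)
print(beta)
```

Because the bias column is -1 rather than +1, the fitted bias coefficient comes out with the opposite sign of the intercept.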

Creating movie from a series of matplotlib plots using matplotlib.animation

I have a script which generates a series of time dependent plots. I'd like to "stitch" these together to make a movie.
I would preferably like to use matplotlib.animation. I have looked at examples from matplotlib documentation but I can't understand how it works.
My script currently makes 20 plots at successive time values and saves these as 00001.png up to 000020.png:
from scipy.integrate import odeint
from numpy import *
from math import cos
import pylab
omega=1.4
delta=0.1
F=0.35
def f(initial,t):
    x,v=initial
    xdot=v
    vdot=x-x**3-delta*v-F*cos(omega*t)
    return array([xdot,vdot])
T=2*pi/omega
nperiods = 100
totalsteps= 1000
small=int((totalsteps)/nperiods)
ntransients= 10
initial=[-1,0]
kArray= linspace(0,1,20)
for g in range (0,20):
    k=kArray[g]
    x,v=initial
    xpc=[]
    vpc=[]
    if k==0:
        x,v=x,v
    else:
        for i in range(1,nperiods):
            x,v=odeint(f,[x,v],linspace(0,k*T,small))[-1]
    for i in range (1,nperiods):
        x,v=odeint(f,[x,v],linspace(k*T,T+k*T,small))[-1]
        xpc.append(x)
        vpc.append(v)
    xpc=xpc[ntransients:]
    vpc=vpc[ntransients:]
    pylab.figure(figsize=(17.8,10))
    pylab.scatter(xpc,vpc,color='red',s=0.2)
    pylab.ylim([-1.5,1.5])
    pylab.xlim([-2,2])
    pylab.savefig('0000{0}.png'.format(g), dpi=200)
I'd appreciate any help. Thank you.
I think matplotlib.animation.FuncAnimation is what you're looking for. Basically, it repeatedly calls a defined function, passing in (optional) arguments as needed. This is exactly what you're already doing in your for g in range(0,20): code. You can also define an init function to get things set up. Check out the base class matplotlib.animation.Animation for more info on formats, saving, the MovieWriter class, etc.
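As a rough sketch of how this could be wired up (assuming matplotlib and NumPy are installed): `points_for_frame` is a hypothetical placeholder for the per-frame computation, standing in for the odeint loop in the question, and the Agg backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

k_values = np.linspace(0, 1, 20)


def points_for_frame(k):
    """Hypothetical placeholder for the per-frame computation (the odeint loop)."""
    theta = np.linspace(0, 2 * np.pi, 50)
    return np.column_stack([np.cos(theta + k), np.sin(theta + k)])


fig, ax = plt.subplots()
ax.set_xlim(-2, 2)
ax.set_ylim(-1.5, 1.5)
scat = ax.scatter([], [], color="red", s=0.2)


def update(frame_index):
    """Called once per frame: replace the scatter points with the new data."""
    scat.set_offsets(points_for_frame(k_values[frame_index]))
    return (scat,)


# One frame per k value, 200 ms between frames
anim = FuncAnimation(fig, update, frames=len(k_values), interval=200)

# Writing a file needs an installed writer, for example:
# anim.save("movie.mp4", writer="ffmpeg")
```

The key design choice is to create the figure and scatter artist once and only update the data in each frame, rather than creating a new figure per plot as the original loop does.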

Use pylab to plot image returned from Scipy

I'm working on migrating from MATLAB to Python in Sage.
I used these commands and got the following message in Sage:
from scipy import misc
l = misc.lena();
import pylab as pl
pl.imshow(l)
The error or message (I don't know which it is) is:
<matplotlib.image.AxesImage object at 0xb80198c>
And it doesn't show any image.
It's not an error; it is just the printed representation of the object that the method returned.
There are two ways to show the figure:
Add pl.show() after calling pl.imshow(l).
Use ipython --pylab to open your Python shell.
That is an object being returned from pylab after using the "imshow" command. That is the location of the Axes image object.
documentation:
http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.imshow
It looks like it says that it displays the image on the current axes. If you haven't already created a plot, I imagine you won't see anything.
A simple Google search suggests this might be what you are looking for:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.lena.html
from scipy import misc
l = misc.lena();
import pylab as pl
pl.imshow(l)
####use this
pl.show()
