Convert vtkPoints to numpy array? - python

I am using Mayavi2 in a Python script to calculate 3d iso-surfaces. As a result I get a vtkPoints object. Now I want to convert this vtkPoints object ('vtkout' in the code sample below) to a simple numpy array with 3 lines containing all x, y and z values.
I get vtkout using a code like this:
import numpy
from enthought.mayavi import mlab
import array
randVol = numpy.random.rand(50,50,50) # fill volume with some random potential
X, Y, Z = numpy.mgrid[0:50, 0:50, 0:50] # grid
surf = mlab.contour3d(X, Y, Z, randVol, contours=[0.5]) # calc contour
vtkout = surf.contour.contour_filter.output.points # get the vtkPoints object
At the moment I use the following code to extract the points into an array:
pointsArray = numpy.zeros((3, vtkout.number_of_points))
for n in range(vtkout.number_of_points):
    pointsArray[0,n] = vtkout[n][0]
    pointsArray[1,n] = vtkout[n][1]
    pointsArray[2,n] = vtkout[n][2]
I wonder if there is no general routine doing such conversions for me in a convenient, fast and safe way?

vtk_points.to_array() did not work for me (to_array() doesn't seem to exist in plain vtk).
What has actually worked in my case is using the numpy_support module:
from vtk.util import numpy_support
as_numpy = numpy_support.vtk_to_numpy(vtk_points.GetData())

As confirmed from comments on the original post, you might try:
vtkout.to_array().T
This is a direct method that does not require looping.
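To see why the transpose gives the layout asked for in the question, here is a minimal sketch in plain numpy. It assumes that to_array() returns an array of shape (N, 3) with one row per point; the small `points` array is a made-up stand-in for that result, not real Mayavi output.

```python
import numpy as np

# Hypothetical stand-in for vtkout.to_array(): N points, one (x, y, z) row each.
points = np.array([[0.0, 1.0, 2.0],
                   [3.0, 4.0, 5.0],
                   [6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0]])

# Loop-based extraction, as in the question.
looped = np.zeros((3, len(points)))
for n in range(len(points)):
    looped[0, n] = points[n][0]
    looped[1, n] = points[n][1]
    looped[2, n] = points[n][2]

# Transposing yields the same (3, N) layout without a Python loop.
transposed = points.T
x, y, z = transposed  # each row holds one coordinate
```

The transpose is a view on the same data, so no copy is made.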

Related

Accessing (the right) data when using holoviews/bokeh

I am having difficulties accessing (the right) data when using holoviews/bokeh, either for connected plots showing a different aspect of the dataset, or just customising a plot with dynamic access to the data as plotted (say a tooltip).
TLDR: How to add a projection plot of my dataset (different set of dimensions and linked to main plot, like a marginal distribution but, you know, not restricted to histogram or distribution) and probably with a similar solution a related question I asked here on SO
Let me exemplify (straight from a ipynb, should be quite reproducible):
import numpy as np
import random, pandas as pd
import bokeh
import datashader as ds
import holoviews as hv
from holoviews import opts
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize
hv.extension('bokeh')
With imports set up, let's create a dataset (N target 10e12 ;) to use with datashader. Beside the key dimensions, I really need some value dimensions (here z and z2).
import numpy as np
import pandas as pd
N = int(10e6)
x_r = (0,100)
y_r = (100,2000)
z_r = (0,10e8)
x = np.random.randint(x_r[0]*1000,x_r[1]*1000,size=(N, 1))
y = np.random.randint(y_r[0]*1000,y_r[1]*1000,size=(N, 1))
z = np.random.randint(z_r[0]*1000,z_r[1]*1000,size=(N, 1))
z2 = np.ones((N,1)).astype(int)
df = pd.DataFrame(np.column_stack([x,y,z,z2]), columns=['x','y','z','z2'])
df[['x','y','z']] = df[['x','y','z']].div(1000, axis=0)
df
Now I plot the data, rasterised, and also activate the tooltip to see the defaults. Sure, x/y is trivial, but as I said, I care about the value dimensions. It shows z2 as x_y z2. I have a question related to tooltips with the same sort of data here on SO for value dimension access for the tooltips.
from matplotlib.cm import get_cmap
palette = get_cmap('viridis')
# palette_inv = palette.reversed()
p=hv.Points(df,['x','y'], ['z','z2'])
P=rasterize(p, aggregator=ds.sum("z2"),x_range=(0,100)).opts(cmap=palette)
P.opts(tools=["hover"]).opts(height=500, width=500,xlim=(0,100),ylim=(100,2000))
Now I can add a histogram or a marginal distribution which is pretty close to what I want, but there are issues with this soon past the trivial defaults. (E.g.: P << hv.Distribution(p, kdims=['y']) or P.hist(dimension='y',weight_dimension='x_y z',num_bins = 2000,normed=True))
Both approaches come close, but neither gives me the other value dimension I'd like to visualise. If I try to access the other value dimension ('x_y z'), this fails. Also, the 'x_y z2' naming seems very clumsy; is there a better way?
When I do something like this, my browser/notebook-extension blows up, of course.
transformed = p.transform(x=hv.dim('z'))
P << hv.Curve(transformed)
So how do I access all my data in the right way?

Making a density plot in Python from imported data file

I have a .dat file with three columns, which I assume to be x, y and z = f(x,y), respectively.
I want to make a density plot out of this data. While looking for some example that could help me out, I came across the following posts:
How to plot a density map in python?
matplotlib plot X Y Z data from csv as pcolormesh
What I have tried so far is the following:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
x, y, z = np.loadtxt('data.dat', unpack=True, delimiter='\t')
N = int(len(z)**.5)
z = z.reshape(N, N)
plt.imshow(z, extent=(np.amin(x), np.amax(x), np.amin(y), np.amax(y)),cmap=plt.cm.hot)
plt.colorbar()
plt.show()
The file data can be accessed here: data.dat.
When I run the script above, it returns me the following error message:
cannot reshape array of size 42485 into shape (206,206)
Can someone help me to understand what I have done wrong and how to fix it?
The reason is that your data does not contain exactly 206*206 = 42436 values: your z has 42485 entries, which is more.
One option is to slice z, but then you discard the extra data points.
And if that is what you want, note that you are no longer using your x, y values.
z = z[:N**2].reshape(N,N)
In the link you posted I saw this statement:
I assume here that your data can be transformed into a 2d array by a simple reshape. If this is not the case than you need to work a bit harder on getting the data in this form.
The assumption does not hold for your data.
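If the file holds values on a regular x/y grid that is simply not square (or has a few cells missing), one way to build the 2D array is to place each z at the position given by its own x and y. This is a minimal sketch under that assumption; the small arrays are made-up stand-ins for the columns loaded from data.dat.

```python
import numpy as np

# Made-up stand-in data: a grid with 3 x-values and 4 y-values, one sample missing.
x = np.array([0., 1., 2., 0., 1., 2., 0., 1., 2., 0., 1.])
y = np.array([0., 0., 0., 1., 1., 1., 2., 2., 2., 3., 3.])
z = x + 10 * y

# Unique coordinates and, for every sample, its column/row index.
xs, ix = np.unique(x, return_inverse=True)
ys, iy = np.unique(y, return_inverse=True)

# Rows follow y, columns follow x; cells without data stay NaN.
grid = np.full((len(ys), len(xs)), np.nan)
grid[iy, ix] = z
```

The resulting grid could then be shown with plt.imshow(grid, extent=(xs[0], xs[-1], ys[0], ys[-1]), origin='lower'). If the points do not lie on a regular grid at all, they would have to be interpolated onto one first.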

Plot using pandas

I have some event times in a list and I would like to plot an exponentially weighted moving average of them. I can do this using the following code.
import numpy as np
import matplotlib.pyplot as plt
print("Code running")
a=0.01
l = [3.0,7.0,10.0,20.0,200.0]
y = np.zeros(1000)
for item in l:
    y[int(item)]=1
s = np.zeros(1000)
x = np.linspace(0,1000,1000)
for i in range(1000):
    s[i] = a*y[i-1]+(1-a)*s[i-1]
plt.plot(x, s)
plt.show()
This is clearly a horrible way to use python however. What's the right way to do this? Is it possible to do it without making all these extra sparse arrays?
The output should look like this.
Pandas comes to mind for this task:
import numpy as np
import pandas as pd
l = [3.0,7.0,10.0,20.0,200.0]
s = pd.Series(np.ones_like(l), index=l)
y = s.reindex(range(1000), fill_value=0)
y.ewm(span=199).mean().plot()  # pd.ewma(y, span=199) in old pandas versions
The span 199 is related to your parameter a = 0.01 by a = 2/(span+1), i.e. span = 2/a - 1. Result:
AFAIK there's not a very good way to do this with numpy or the scipy.sparse module -- the sparse matrices in scipy.sparse are designed to be 2D matrices, and to create one in the first place you'd basically need to use the code you've already written in your first loop (i.e., to set all of the nonzero locations in a sparse matrix), with the additional complexity of always having to specify two index values.
As if that's not bad enough, np.convolve doesn't work with sparse arrays, so you'd still need to write out the computation in your second loop to compute the moving average.
My recommendation, which probably isn't much help if you're looking for a fancy numpy version, is to fall back on Python's excellent support as a general-purpose language:
import numpy as np
import matplotlib.pyplot as plt
a=0.01
l = set([3, 7, 10, 20, 200])
s = np.zeros(1000)
for i in range(len(s)):
    s[i] = a * int(i-1 in l) + (1-a) * s[i-1]
plt.plot(s)
plt.show()
Here, I've stored the event index values in l, just as you did, but I used a set to make lookup times O(1) -- though if len(l) isn't very large, you might even be better off with a plain list or tuple, you'd need to measure it to be sure. Then you can avoid creating the y array and just rely on Iverson's convention to convert the Boolean value x in y into an int. You might not even need the explicit cast, but I find it helpful to be explicit.
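The recursion s[i] = a*y[i-1] + (1-a)*s[i-1] can also be unrolled: each event at index e adds a*(1-a)**(i-e-1) to every later s[i]. The following is a sketch of that closed form in numpy, not taken from the answers here; it needs no dense y array, only an array of shape (1000, number of events).

```python
import numpy as np

a = 0.01
events = np.array([3, 7, 10, 20, 200])
t = np.arange(1000)

# lag[i, k] = number of decay steps applied to event k by the time of s[i];
# negative values mean the event has not happened yet.
lag = t[:, None] - events[None, :] - 1

# Sum the decayed contributions of all events that have already occurred.
contrib = np.where(lag >= 0, (1 - a) ** np.maximum(lag, 0), 0.0)
s = a * contrib.sum(axis=1)
```

For a handful of events this is cheap; with very many events the intermediate array grows, and the plain loop may be the better choice.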
I think you're looking for something like this:
import numpy as np
import matplotlib.pyplot as plt
from scikits.timeseries.lib.moving_funcs import mov_average_expw
l = [ 3.0, 7.0, 10.0, 20.0, 200.0 ]
y = np.zeros(1000)
y[np.array(l, dtype=int)] = 1
emav = mov_average_expw(y, 199)
plt.plot(emav)
plt.show()
This makes use of mov_average_expw from scikits.timeseries. Check that method's documentation to see how I came up with the span parameter based on your code's a variable.

Matplotlib pcolor

I am using Matplotlib to create an image based on some data. All of the data falls in the range of 0 through to 1 and I am trying to color the data based on its value using a colormap and this works perfectly in Matlab, however when converting the code across to Python I simply get a black square as the output. I believe this is because I'm plotting the image wrong and so it is plotting all the data as 0. I have tried searching this problem for several hours and I have tried plt.set_clim([0, 1]) however that didn't seem to do anything. I am new to Python and Matplotlib, although I am not new to programming (Java, javascript, PHP, etc), but I cannot see where I am going wrong. If any body can see anything glaringly incorrect in my code then I would be extremely grateful.
Thank you
from numpy import *
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.colors as myColor
e1cx=[]
e1cy=[]
e1cz=[]
print("Reading files...")
in_file = open("eigenvector_1_component_x.txt", "rt")
for line in in_file.readlines():
    e1cx.append([])
    for i in line.split():
        e1cx[-1].append(float(i))
in_file.close()
in_file = open("eigenvector_1_component_y.txt", "rt")
for line in in_file.readlines():
    e1cy.append([])
    for i in line.split():
        e1cy[-1].append(float(i))
in_file.close()
in_file = open("eigenvector_1_component_z.txt", "rt")
for line in in_file.readlines():
    e1cz.append([])
    for i in line.split():
        e1cz[-1].append(float(i))
in_file.close()
print("...done")
nx = 120
ny = 128
nz = 190
fx = zeros((nz,nx,ny))
fy = zeros((nz,nx,ny))
fz = zeros((nz,nx,ny))
z = 0
while z<nz-1:
    x = 0
    while x<nx:
        y = 0
        while y<ny:
            fx[z][x][y]=e1cx[(z*128)+y][x]
            fy[z][x][y]=e1cy[(z*128)+y][x]
            fz[z][x][y]=e1cz[(z*128)+y][x]
            y += 1
        x += 1
    z+=1
    if((z % 10) == 0):
        plt.figure(num=None)
        plt.axis("off")
        normals = myColor.Normalize(vmin=0,vmax=1)
        plt.pcolor(fx[z][:][:],cmap='spectral', norm=normals)
        filename = 'Imagex_%d' % z
        plt.savefig(filename)
        plt.colorbar(ticks=[0,2,4,6], format='%0.2f')
Although you have resolved your original issue and have code that works, I wanted to point out that both python and numpy provide several tools that make code like this much simpler to write. Here are a few examples:
Loading data
Instead of building up lists by appending to the end of an empty one, it is often easier to generate them from other lists. For example, instead of
e1cx = []
for line in in_file.readlines():
    e1cx.append([])
    for i in line.split():
        e1cx[-1].append(float(i))
you can simply write:
e1cx = [[float(i) for i in line.split()] for line in in_file]
The syntax [x(y) for y in l] is known as a list comprehension and, in addition to being more concise, will usually execute more quickly than the equivalent for loop.
However, for loading tabular data from a text file, it is even simpler to use numpy.loadtxt:
import numpy as np
e1cx = np.loadtxt("eigenvector_1_component_x.txt")
For more information, see its docstring:
print(np.loadtxt.__doc__)
See also its slightly more sophisticated cousin, numpy.genfromtxt.
Reshaping data
Now that we have our data loaded, we need to reshape it. The while loops you use work fine, but numpy provides an easier way. First, if you prefer to use your method of loading the data, then convert your eigenvector arrays into proper numpy arrays using e1cx = array(e1cx), etc.
The array class provides methods for rearranging how the data in an array is indexed without requiring it to be copied. The simplest method is array.reshape, which will do half of what your while loops do:
almost_fx = e1cx.reshape((nz,ny,nx))
Here, almost_fx is a rank-3 array indexed as almost_fx[iz,iy,ix]. One important thing to be aware of is that e1cx and almost_fx share their data. So, if you change e1cx[0,0], you will also change almost_fx[0,0,0].
In your code, you swapped the x and y locations. If this is indeed what you wanted to do, you can accomplish this with array.swapaxes:
fx = almost_fx.swapaxes(1,2)
Of course, you could always combine this into one line
fx = e1cx.reshape((nz,ny,nx)).swapaxes(1,2)
However, if you want the z-slices (fx[z,:,:]) to plot with x horizontal and y vertical, you probably do not want to swap the axes above. Just reshape and plot.
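To check that reshape plus swapaxes really reproduces the nested while loops, here is a small sketch with made-up sizes (3, 4, 2 instead of 120, 128, 190) and a synthetic array standing in for the loaded file.

```python
import numpy as np

nx, ny, nz = 3, 4, 2  # small made-up sizes

# Stand-in for the loaded file: nz*ny rows, nx columns.
e1cx = np.arange(nz * ny * nx, dtype=float).reshape(nz * ny, nx)

# Explicit loops with the same indexing as in the question.
looped = np.zeros((nz, nx, ny))
for z in range(nz):
    for x in range(nx):
        for y in range(ny):
            looped[z, x, y] = e1cx[z * ny + y, x]

# Same result with reshape + swapaxes, without copying the data.
fx = e1cx.reshape((nz, ny, nx)).swapaxes(1, 2)
same_as_loops = np.array_equal(fx, looped)

# fx is a view: writing to e1cx is visible through fx.
e1cx[0, 0] = -1.0
```

After the last line fx[0, 0, 0] is -1.0 as well, while looped keeps its own independent copy.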
Slicing arrays
Finally, rather than looping over the z-index and testing for multiples of 10, you can loop directly over a slice of the array using:
for fx_slice in fx[::10]:
    # plot fx_slice and save it
This indexing syntax is array[start:end:step], where start is included in the result and end is not. Leaving start blank implies 0, while leaving end blank implies the end of the array.
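One detail worth checking is which z-indices the slice actually visits. The sketch below uses a made-up array with 190 entries along the first axis. Note that fx[::10] starts at z = 0, whereas a loop that increments z before testing z % 10 starts at z = 10.

```python
import numpy as np

fx = np.zeros((190, 2, 3))  # made-up array; only the first axis matters here

# Iterating over fx[::10] visits z = 0, 10, 20, ..., 180.
z_indices = [10 * i for i, fx_slice in enumerate(fx[::10])]

# Starting the slice at 10 skips z = 0.
z_from_10 = list(range(10, fx.shape[0], 10))
n_from_10 = len(fx[10::10])
```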
Summary
In summary your complete code (after introducing a few more python idioms like enumerate) could look something like:
import numpy as np
from matplotlib import pyplot as pt
shape = (190,128,120)
fx = np.loadtxt("eigenvector_1_component_x.txt").reshape(shape).swapaxes(1,2)
for i,fx_slice in enumerate(fx[::10]):
    z = i*10
    pt.figure()
    pt.axis("off")
    # 'spectral' was renamed to 'nipy_spectral' in newer matplotlib versions
    pt.pcolor(fx_slice, cmap='nipy_spectral', vmin=0, vmax=1)
    pt.colorbar(ticks=[0,2,4,6], format='%0.2f')
    pt.savefig('Imagex_%d' % z)
Alternatively, if you want one pixel per element, you can replace the body of the for loop with
    z = i*10
    pt.imsave('Imagex_%d' % z, fx_slice, cmap='nipy_spectral', vmin=0, vmax=1)

Can anyone please explain how this python code works line by line?

I am currently working on image processing in Python, using numpy and scipy all the time. I have a piece of code that can enlarge an image, but I am not sure how it works.
Could a scipy/numpy expert please explain it to me line by line? I am always eager to learn.
import numpy as N
import os.path
import scipy.signal
import scipy.interpolate
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def enlarge(img, rowscale, colscale, method='linear'):
    x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
    pts = N.column_stack((x.ravel(), y.ravel()))
    xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
                     0.:float(img.shape[0]):1/float(rowscale)]
    large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
    large[-1,:] = large[-2,:]
    large[:,-1] = large[:,-2]
    return large
Thanks a lot.
First, a grid of coordinates is created with one point per pixel of the original image.
x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
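The (x, y) coordinates of the image pixels are stacked into the variable pts, one row per pixel, which will be needed later. The pixel values themselves are passed separately as img.flatten().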
pts = N.column_stack((x.ravel(), y.ravel()))
After that, it creates a mesh grid with one point per pixel of the enlarged image; if the original image was 200 pixels wide and 400 pixels high, with colscale set to 4 and rowscale set to 2, the mesh grid would have (200*4)x(400*2) or 800x800 points.
xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
0.:float(img.shape[0]):1/float(rowscale)]
Using scipy, the points in pts variable are interpolated into the larger grid. Interpolation is the manner in which missing points are filled or estimated usually when going from a smaller set of points to a larger set of points.
large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
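The last two lines overwrite the last row and the last column with copies of the second-to-last ones. I am not 100% certain of the intent without going back and looking at what the griddata method returns, but the outermost query points lie beyond the last pixel coordinate, where griddata by default returns NaN, so this appears to patch the border of the result.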
large[-1,:] = large[-2,:]
large[:,-1] = large[:,-2]
return large
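The shape of the enlarged grid can be checked without running the interpolation. This is a minimal sketch with a made-up 3x5 image; it also counts how many query coordinates fall beyond the last pixel coordinate, which is where the border handling in the last two lines of enlarge() comes into play.

```python
import numpy as np

img = np.zeros((3, 5))        # made-up image: 3 rows, 5 columns
rowscale, colscale = 2, 4

xx, yy = np.mgrid[0.:float(img.shape[1]):1 / float(colscale),
                  0.:float(img.shape[0]):1 / float(rowscale)]

# The first axis of xx/yy runs over x (columns), hence the .T in enlarge().
out_shape = xx.T.shape        # (rows * rowscale, cols * colscale)

# Query coordinates beyond the last pixel coordinate lie outside the input grid.
outside_x = int((xx[:, 0] > img.shape[1] - 1).sum())
outside_y = int((yy[0, :] > img.shape[0] - 1).sum())
```

Here out_shape is (6, 20). With a scale of 2 only one coordinate per axis lies outside the input grid, but with a scale of 4 there are three, so copying a single row or column would not cover all of them.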
