I'm looking for the best way to create a contour plot using a numpy meshgrid.
I have Excel data in columns, which in simplified form looks like this:
x data values: -3, -2, -1, 0, 1, 2 ,3, -3, -2, -1, 0, 1, 2, 3
y data values: 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2
z data values: 7 , 5, 6, 5, 1, 0, 9, 5, 3, 8, 3, 1, 0, 4
The x and y values define a 2D plane with a length (x-axis) of 7 values and a depth (y-axis) of 2 values. The z values define the colour at the corresponding points (more or less a z-axis).
I've tried:
import matplotlib.pyplot as plt
import numpy as np
x = [-3,-2,-1,0,1,2,3]
y = [1,2]
z = [7,5,6,5,1,0,9,5,3,8,3,1,0,4]
x, y = np.meshgrid(x, y)
A = np.array(z)
B = np.reshape(A, (-1, 2))
fig = plt.figure()
ax1 = plt.contourf(x, y, B)
plt.show()
I'm pretty sure I'm not getting how the meshgrid works. Do I have to use the whole list of x and y values for it to work?
How do I create a rectangular 2D plot with a length (x) of 7 and a depth (y) of 2, with the z values defining the shading/colour at the x and y values?
Thanks in advance guys!
Try
x_, y_ = np.meshgrid(x, y)
z_grid = np.array(z).reshape(2,7)
fig = plt.figure()
ax1 = plt.contourf(x_,y_,z_grid)
plt.show()
Edit: If you would like to smooth the result, as per your comment, you can try something like scipy.ndimage.zoom(), i.e., in your case
from scipy import ndimage
z_grid = np.array(z).reshape(2,7)
z_grid_interp = ndimage.zoom(z_grid, 100)
x_, y_ = np.meshgrid(np.linspace(-3,3,z_grid_interp.shape[1]),np.linspace(1,2,z_grid_interp.shape[0]))
and then plot as before:
fig = plt.figure()
ax1 = plt.contourf(x_,y_,z_grid_interp)
plt.show()
This is one way where you use the shape of the meshgrid (X or Y) to reshape your z array. You can, moreover, add a color bar using plt.colorbar()
import matplotlib.pyplot as plt
import numpy as np
x = [-3,-2,-1,0,1,2,3]
y = [1,2]
z = np.array([7,5,6,5,1,0,9,5,3,8,3,1,0,4])
X, Y = np.meshgrid(x, y)
print (X.shape, Y.shape)
# (2, 7) (2, 7) Both have same shape
Z = z.reshape(X.shape) # Use either X or Y to define shape
fig = plt.figure()
ax1 = plt.contourf(X, Y, Z)
plt.colorbar(ax1)
plt.show()
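On the question of whether the whole list of x and y values is needed: the following is a minimal sketch, assuming the spreadsheet columns are read in at full length (one x, y and z per row). The names x_col, y_col, z_col, x_axis and y_axis are made up for illustration. It uses np.unique to recover the axis values and the grid position of every sample, so it does not rely on the rows being in any particular order.

```python
import numpy as np

# Full-length columns, as they would come out of the spreadsheet.
x_col = np.array([-3, -2, -1, 0, 1, 2, 3, -3, -2, -1, 0, 1, 2, 3])
y_col = np.array([1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
z_col = np.array([7, 5, 6, 5, 1, 0, 9, 5, 3, 8, 3, 1, 0, 4])

# Sorted unique axis values and, for every sample, its index along each axis.
x_axis, col_idx = np.unique(x_col, return_inverse=True)
y_axis, row_idx = np.unique(y_col, return_inverse=True)

# meshgrid returns arrays of shape (len(y), len(x)), so Z must match that.
X, Y = np.meshgrid(x_axis, y_axis)
Z = np.full(X.shape, np.nan)   # NaN marks grid points without data
Z[row_idx, col_idx] = z_col

print(X.shape, Z.shape)
```

X, Y and Z can then be passed to plt.contourf(X, Y, Z) as in the answers above.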
import numpy as np
import matplotlib.pyplot as plt

def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
x = np.linspace(0, 2, 3 )
y = np.linspace(0, 3, 4)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)
plt.contour(X, Y, Z, cmap='RdGy');
I am trying to convert a dictionary into a form which can be plotted as a contour using matplotlib. The keys of the dictionary are tuples of the X,Y coordinates, and the value is the reading at that coordinate. I would like to put these into three numpy arrays: a 1D array of x coordinates, a 1D array of y coordinates, and a 2D array of values. The indices into the x and y arrays should correspond to the position of the matching dictionary value in the 2D array.
An edit to better define the question:
Example Input Data:
Dictionary
(0,0): 1
(1.5,0): 2
(0,1.5): 3
(1.5,1.5): 4
What I would like
x = [0,1.5]
y = [0,1.5]
values = [[1,2],[3,4]]
I have got
X, Y = [], []
for key in corr_data.keys():
    X.append(key[0])
    Y.append(key[1])
X = list(dict.fromkeys(X))
Y = list(dict.fromkeys(Y))
which gets the x and y arrays but the values array eludes me.
Any help is appreciated
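You can simply iterate over your dict to build the lists, and convert those lists to numpy.ndarray afterwards if needed
# Sorted unique coordinates along each axis
x = sorted({i for i, _ in your_dict})
y = sorted({j for _, j in your_dict})
# One row per y value, one column per x value
vals = np.zeros((len(y), len(x)))
for (i, j), v in your_dict.items():
    # Look up grid positions, since the coordinates may be non-integer
    vals[y.index(j), x.index(i)] = v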
Here is a 'self-contained' answer, in the sense that I first generate some input data, convert it into a dictionary, and then convert it back into the original arrays. On the way, I add some random noise to keep the x and y values close to each other but still make them unique. A list of all values that are 'close' to each other can then be found by first rounding the values and then using np.unique.
import numpy as np
##generating some input data:
print('input arrays')
xvals = np.linspace(1,10, 5)
print(xvals)
yvals = np.linspace(0.1, 0.4, 4)
print(yvals)
xvals, yvals = np.meshgrid(xvals, yvals)
##adding some noise to make it more interesting:
xvals += np.random.rand(*xvals.shape)*1e-3
yvals += np.random.rand(*yvals.shape)*1e-5
zvals = np.arange(xvals.size).reshape(*xvals.shape)
print(zvals)
input_dict = {
    (i,j): k for i,j,k in zip(
        list(xvals.flatten()), list(yvals.flatten()), list(zvals.flatten())
    )
}
##print(input_dict)
x,y,z = map(np.array,zip(*((x,y,z) for (x,y),z in input_dict.items())))
##this part will need some tweaking depending on the size of your
##x and y values
xlen = len(np.unique(x.round(decimals=2)))
ylen = len(np.unique(y.round(decimals=3)))
x = x.round(decimals=2).reshape(ylen,xlen)[0,:]
y = y.round(decimals=3).reshape(ylen,xlen)[:,0]
z = z.reshape(ylen,xlen)
print('\n', 'output arrays')
print(x)
print(y)
print(z)
The output looks like this:
input arrays
[ 1. 3.25 5.5 7.75 10. ]
[0.1 0.2 0.3 0.4]
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]
output arrays
[ 1. 3.25 5.5 7.75 10. ]
[0.1 0.2 0.3 0.4]
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]]
Old Answer:
There are a lot of assumptions in this answer, mainly because there is not quite enough information in the question. But, assuming that
the x and y values are as nicely ordered as in the example data
the x and y values are complete
One could go about the problem with a list comprehension and a reshaping of numpy ndarrays:
import numpy as np
input_dict = {
    (0,0): 1,
    (1,0): 2,
    (0,1): 3,
    (1,1): 4,
}
x,y,z = map(np.array,zip(*((x,y,z) for (x,y),z in input_dict.items())))
xlen = len(set(x))
ylen = len(set(y))
# one row per y value, one column per x value
x = x.reshape(ylen,xlen)[0,:]
y = y.reshape(ylen,xlen)[:,0]
z = z.reshape(ylen,xlen)
print(x)
print(y)
print(z)
which gives
[0 1]
[0 1]
[[1 2]
[3 4]]
hope this helps.
PS: If the x and y values are not necessarily in the order suggested by the posted example data, one can still solve the issue with some clever sorting.
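One way the sorting could look is sketched below. It assumes the grid is complete (every x/y combination is present) and uses np.lexsort, which is my choice for illustration rather than something the answer specifies. The names order, x_axis, y_axis and values are likewise made up here.

```python
import numpy as np

# Keys deliberately out of order; the grid is assumed to be complete.
input_dict = {(1.5, 1.5): 4, (0, 0): 1, (0, 1.5): 3, (1.5, 0): 2}

x, y, z = map(np.array, zip(*((x, y, z) for (x, y), z in input_dict.items())))

# lexsort treats the LAST key as the primary one: sort by y first, then by x.
# That puts the points in row-major order, one row per y value.
order = np.lexsort((x, y))
x, y, z = x[order], y[order], z[order]

xlen = len(np.unique(x))
ylen = len(np.unique(y))

x_axis = x.reshape(ylen, xlen)[0, :]
y_axis = y.reshape(ylen, xlen)[:, 0]
values = z.reshape(ylen, xlen)

print(x_axis)
print(y_axis)
print(values)
```

For the example data from the question this reproduces the layout the asker wanted, with x varying along the columns and y along the rows.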
In the REPL
In [9]: d = {(0,0): 1, (1,0): 2, (0,1): 3, (1,1): 4}
In [10]: x = set(); y = set()
In [11]: for xx, yy in d.keys():
    ...:     x.add(xx)
    ...:     y.add(yy)
In [12]: x
Out[12]: {0, 1}
In [13]: x = sorted(x) ; y = sorted(y)
In [14]: x
Out[14]: [0, 1]
In [15]: v = [[d.get((xx,yy)) for yy in y] for xx in x]
In [16]: v
Out[16]: [[1, 3], [2, 4]]
As you can see, my result is different from your example but it's common to have x corresponding to rows and y corresponding to columns. If you want a more geographic convention, swap x and y in the final list comprehension.
As a script we may write
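def extract(d):
    # collect the distinct x and y coordinates from the keys
    x = set(); y = set()
    for xx, yy in d.keys():
        x.add(xx)
        y.add(yy)
    x = sorted(x) ; y = sorted(y)
    # rows follow x, columns follow y; missing points become None
    v = [[d.get((xx,yy)) for yy in y] for xx in x]
    # geographic convention instead (rows follow y):
    # v = [[d.get((xx,yy)) for xx in x] for yy in y]
    return x, y, v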
I have the following three lists of unequal lengths:
a = [2.13, 5.48,-0.58]
b = [4.17, 1.12, 2.13, 3.48,-1.01,-1.17]
c = [6.73, 8, 12]
d = [(2.13,2.13),(5.48,-1.17),(-0.58,4.17)]
e = [(4.17,12),(2.13,6.73)]
I need to create
combination_abc = [(x, y, z) for x in a
                             for y in b
                             for z in c]
such that (x, y) is not in d and (y, z) is not in e.
If I understood you correctly, just add an if clause to your list comprehension:
[(x, y, z) for x in a for y in b for z in c if (x, y) not in d and (y, z) not in e]
Also you can use itertools.product for simplicity:
from itertools import product
[(x, y, z) for x, y, z in product(a, b, c) if (x, y) not in d and (y, z) not in e]
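A complete version with the data from the question might look like the sketch below. Converting d and e to sets is an addition of mine (the names excluded_xy and excluded_yz are made up); it only changes how fast the membership tests run, not the result, and it relies on the floats being written identically in both places.

```python
from itertools import product

a = [2.13, 5.48, -0.58]
b = [4.17, 1.12, 2.13, 3.48, -1.01, -1.17]
c = [6.73, 8, 12]
d = [(2.13, 2.13), (5.48, -1.17), (-0.58, 4.17)]
e = [(4.17, 12), (2.13, 6.73)]

# Sets give constant-time membership tests; lists would work as well.
excluded_xy = set(d)
excluded_yz = set(e)

combination_abc = [
    (x, y, z)
    for x, y, z in product(a, b, c)
    if (x, y) not in excluded_xy and (y, z) not in excluded_yz
]

print(len(combination_abc))
```

Of the 54 possible triples, 13 are filtered out: 9 via d, 6 via e, with 2 triples matching both.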
I'm trying to create a dataframe to represent a topographical expression. So far I've written a pair of for loops that can individually be used to express the x and y axis, specifically in the forms,
a = []
for x in range(1,i,1):
    x1 = some function of x
    x2 = another function of x
    a.append({'a':x, 'b':x1, 'c': x2})
xaxis = pd.DataFrame(a)
for the x axis and,
a = []
for y in range(-j, j, 1):
    y1 = some function of y
    a.append({'a':y,'b':y1})
yaxis = pd.DataFrame(a)
for the y axis.
That's all simple enough and works fine, however...
I want to expand on this such that the y axis loop is repeated with each iteration of the x axis loop and have the y1 function depend on the parameters of the x axis loop. I get this far,
a = []
for x in range(1,i,1):
    x1 = some function of x
    x2 = another function of x
    for y in range(-j, j, 1):
        y1 = some function of y that calls x2
        a.append({
and I'm stumped.
The output I'm after is essentially this,
x    x1       x2       y     y1
     x1(1)    x2(1)    -j    y1(1,-j)
1    x1(1)    x2(1)    0     y1(1,0)
     x1(1)    x2(1)    j     y1(1,j)
     x1(2)    x2(2)    -j    y1(2,-j)
2    x1(2)    x2(2)    0     y1(2,0)
     x1(2)    x2(2)    j     y1(2,j)
....
and so on to x = i.
The end desire is to have data that can be plotted in a 2D histogram
If there's a better way to do this then please do let me know, this is just the only way I can currently think of that may get the result I'm after.
edit: Turns out this can be done quite effectively using numpy arrays. This is a general expression on how I achieved this goal in the end,
y1 = lambda x,y: f(x,y)
np.array( [ [ y1(x,y) for x in range(1,i,1)] for y in range(-j,j,1)] )
You have to find a vectorized version of your functions. This can be achieved (for instance) by using numpy's many vectorized functions or by using numpy.vectorize(). Take a look at this example:
import numpy as np
def f1(x):
    return x**2
def f2(x):
    return np.abs(x)
def f3(x,y):
    return x**2 + y**2
i = 3 ; j = 2
x = np.arange(1,i,1)
y = np.arange(-j,j,1)
# Now build cartesian product of x and y
xy = np.array([np.tile(x, len(y)), np.repeat(y, len(x))])
xy
# array([[ 1,  2,  1,  2,  1,  2,  1,  2],
#        [-2, -2, -1, -1,  0,  0,  1,  1]])
x1 = f1(xy[0,])
x1
# array([1, 4, 1, 4, 1, 4, 1, 4], dtype=int32)
x2 = f2(xy[0,])
x2
# array([1, 2, 1, 2, 1, 2, 1, 2])
y1 = f3(xy[0,], xy[1,])
y1
# array([5, 8, 2, 5, 1, 4, 2, 5], dtype=int32)
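The same grid can also be built without tile/repeat by letting NumPy broadcast a row of x values against a column of y values. This is only a sketch under the same toy assumptions as above: f3 stands in for the asker's y1(x, y), and grid and table are illustrative names.

```python
import numpy as np

def f3(x, y):
    # Stand-in for the asker's y1(x, y); any NumPy-compatible expression works.
    return x**2 + y**2

i, j = 3, 2
x = np.arange(1, i)      # [1, 2]
y = np.arange(-j, j)     # [-2, -1, 0, 1]

# A (1, len(x)) row broadcast against a (len(y), 1) column evaluates f3
# on every (x, y) pair at once; the result has shape (len(y), len(x)).
grid = f3(x[np.newaxis, :], y[:, np.newaxis])

# Long-form table with one row per (x, y) pair: columns are x, y, y1.
X, Y = np.meshgrid(x, y)
table = np.column_stack([X.ravel(), Y.ravel(), grid.ravel()])

print(grid)
print(table)
```

The 2D grid is the form a 2D histogram or image plot expects, while the long-form table is closer to the DataFrame layout sketched in the question.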
I have a file containing 3 columns, where the first two are coordinates (x,y) and the third is a value (z) corresponding to that position. Here's a short example:
x y z
0 1 14
0 2 17
1 0 15
1 1 16
2 1 18
2 2 13
I want to create a 2D array of values from the third column based on their x,y coordinates in the file. I read in each column as an individual array, and I created grids of x values and y values using numpy.meshgrid, like this:
x = [[0 1 2] and y = [[0 0 0]
[0 1 2] [1 1 1]
[0 1 2]] [2 2 2]]
but I'm new to Python and don't know how to produce a third grid of z values that looks like this:
z = [[Nan 15 Nan]
[14 16 18]
[17 Nan 13]]
Replacing Nan with 0 would be fine, too; my main problem is creating the 2D array in the first place. Thanks in advance for your help!
Assuming the x and y values in your file directly correspond to indices (as they do in your example), you can do something similar to this:
import numpy as np
x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 0, 1, 1, 2]
z = [14, 17, 15, 16, 18, 13]
z_array = np.full((3, 3), np.nan)  # start with every cell marked as missing
z_array[y, x] = z                  # rows are indexed by y, columns by x
print(z_array)
Which yields:
[[ nan 15. nan]
[ 14. 16. 18.]
[ 17. nan 13.]]
For large arrays, this will be much faster than the explicit loop over the coordinates.
Dealing with non-uniform x & y input
If you have regularly sampled x & y points, then you can convert them to grid indices by subtracting the "corner" of your grid (i.e. x0 and y0), dividing by the cell spacing, and casting as ints. You can then use the method above or in any of the other answers.
As a general example:
i = ((y - y0) / dy).astype(int)
j = ((x - x0) / dx).astype(int)
grid[i,j] = z
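As a worked instance of the general example, here is a small sketch with made-up coordinates (spacing 0.5 in x and 0.25 in y, one grid point missing). Rounding with np.rint before the integer cast is my own precaution, since a floating-point quotient can land just below a whole number and a bare cast would then truncate it to the wrong cell.

```python
import numpy as np

# Hypothetical regularly sampled points; (x=10.5, y=2.25) is missing.
x = np.array([10.0, 10.5, 11.0, 10.0, 11.0])
y = np.array([2.0, 2.0, 2.0, 2.25, 2.25])
z = np.array([1.0, 2.0, 3.0, 4.0, 6.0])

x0, y0 = x.min(), y.min()   # "corner" of the grid
dx, dy = 0.5, 0.25          # cell spacing, assumed to be known

# Convert coordinates to integer grid indices (row from y, column from x).
i = np.rint((y - y0) / dy).astype(int)
j = np.rint((x - x0) / dx).astype(int)

grid = np.full((i.max() + 1, j.max() + 1), np.nan)
grid[i, j] = z

print(grid)
```

The missing point simply stays NaN in the resulting 2 x 3 array.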
However, there are a couple of tricks you can use if your data is not regularly spaced.
Let's say that we have the following data:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
fig, ax = plt.subplots()
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)
That we want to put into a regular 10x10 grid:
We can actually use/abuse np.histogram2d for this. Instead of counts, we'll have it add the value of each point that falls into a cell. It's easiest to do this by specifying weights=z and density=False (older NumPy versions spelled this option normed=False; that keyword has since been removed).
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
# Bin the data onto a 10x10 grid
# Have to reverse x & y due to row-first indexing
zi, yi, xi = np.histogram2d(y, x, bins=(10,10), weights=z, density=False)
zi = np.ma.masked_equal(zi, 0)
fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi, edgecolors='black')
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)
plt.show()
However, if we have a large number of points, some bins will have more than one point. The weights argument to np.histogram2d simply adds the values. That's probably not what you want in this case. Nonetheless, we can get the mean of the points that fall in each cell by dividing by the counts.
So, for example, let's say we have 50 points:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 50))
# Bin the data onto a 10x10 grid
# Have to reverse x & y due to row-first indexing
zi, yi, xi = np.histogram2d(y, x, bins=(10,10), weights=z, density=False)
counts, _, _ = np.histogram2d(y, x, bins=(10,10))
zi = zi / counts
zi = np.ma.masked_invalid(zi)
fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi, edgecolors='black')
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)
plt.show()
With very large numbers of points, this exact method will become slow (and can be sped up easily), but it's sufficient for anything less than ~1e6 points.
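One possible speed-up, which is not necessarily what the author had in mind, is to compute each point's bin index once and reuse it for both the sums and the counts via np.bincount. The sketch below makes a simplifying assumption that differs from the code above: the bin edges are fixed to the unit square rather than taken from the data range. The names edges, row, col and flat are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1977)
x, y, z = rng.random((3, 50))

nbins = 10
edges = np.linspace(0.0, 1.0, nbins + 1)   # assumes the data lie in [0, 1]

# Bin index of each point along each axis, clipped so that a value of
# exactly 1.0 would still fall into the last bin.
col = np.clip(np.digitize(x, edges) - 1, 0, nbins - 1)
row = np.clip(np.digitize(y, edges) - 1, 0, nbins - 1)

# Flatten (row, col) into a single cell number and accumulate per cell.
flat = row * nbins + col
sums = np.bincount(flat, weights=z, minlength=nbins * nbins)
counts = np.bincount(flat, minlength=nbins * nbins)

# Empty cells give 0/0 = NaN, which is then masked out.
with np.errstate(invalid="ignore"):
    mean = (sums / counts).reshape(nbins, nbins)
zi = np.ma.masked_invalid(mean)

print(zi.count(), "of", zi.size, "cells contain data")
```

The masked array zi can be passed to pcolormesh together with edges for both axes, in the same way as in the examples above.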
Kezzos beat me to it but I had a similar approach,
x = np.array([0,0,1,1,2,2])
y = np.array([1,2,0,1,1,2])
z = np.array([14,17,15,16,18,13])
Z = np.zeros((3,3))
# index as (row, column) = (y, x) so the layout matches the question
for i, coord in enumerate(zip(y, x)):
    Z[coord] = z[i]
Z[np.where(Z==0)] = np.nan
You could try something like:
import numpy as np
x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 0, 1, 1, 2]
z = [14, 17, 15, 16, 18, 13]
arr = np.zeros((3,3))
yx = zip(y,x)
for i, coord in enumerate(yx):
    arr[coord] = z[i]
print(arr)
# [[ 0. 15.  0.]
#  [14. 16. 18.]
#  [17.  0. 13.]]
If you have scipy installed, you could take advantage of its sparse matrix module. Get the values from the text file with genfromtxt, and plug those 'columns' directly into a sparse matrix creator.
In [545]: txt=b"""x y z
0 1 14
0 2 17
1 0 15
1 1 16
2 1 18
2 2 13
"""
In [546]: xyz=np.genfromtxt(txt.splitlines(),names=True,dtype=int)
In [547]: sparse.coo_matrix((xyz['z'],(xyz['y'],xyz['x']))).toarray()
Out[547]:
array([[ 0, 15, 0],
[14, 16, 18],
[17, 0, 13]])
But Joe's z_array=np.zeros((3,3),int); z_array[xyz['y'],xyz['x']]=xyz['z'] is considerably faster.
Nice answers by others. Thought this might be a useful snippet for someone else who might need this.
def make_grid(x, y, z):
    '''
    Takes x, y, z values as lists and returns a 2D numpy array
    '''
    # Convert to arrays so the arithmetic below also works on plain lists
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    # Cell spacing: difference between the two smallest unique coordinates
    dx = abs(np.sort(list(set(x)))[1] - np.sort(list(set(x)))[0])
    dy = abs(np.sort(list(set(y)))[1] - np.sort(list(set(y)))[0])
    i = ((x - min(x)) / dx).astype(int)  # Longitudes
    j = ((y - max(y)) / dy).astype(int)  # Latitudes
    grid = np.nan * np.empty((len(set(j)),len(set(i))))
    grid[-j, i] = z  # if using latitude and longitude (for WGS/West)
    return grid