I have two columns of input data that I want as my x and y axes, and a third column of results relating to those inputs. There are 36 combinations of inputs and therefore 36 results.
I want to achieve something like this plot.
I have tried using a cmap, but I am told that the z data is 1D and needs to be 2D, and I don't understand how to get around this issue.
Also attached another method below
data = excel[['test','A_h','f_h','fore C_T','hind C_T','fore eff','hind eff','hind C_T ratio','hind eff ratio']]
x = data['A_h']
y = data['f_h']
z = data['hind C_T ratio']
X,Y = np.meshgrid(x,y)
Z = z
plt.pcolor(x,y,z)
If you have arrays [1, 2, 3] and [4, 5, 6], then meshgrid (with indexing='ij') will give you two 3x3 arrays: [[1, 1, 1], [2, 2, 2], [3, 3, 3]] and [[4, 5, 6], [4, 5, 6], [4, 5, 6]]. With the default indexing='xy' you get the transposes of these. In your case this already seems to be taken care of, since you have 36 values each of x, y and z, so meshgrid won't be necessary.
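As a quick check of what meshgrid returns, the following sketch prints nothing but builds both variants for the two small arrays mentioned above. The variable names are only illustrative.

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# indexing="ij": the first array varies along rows, the second along columns
X_ij, Y_ij = np.meshgrid(a, b, indexing="ij")

# default indexing="xy": the first array varies along columns instead
X_xy, Y_xy = np.meshgrid(a, b)
```

Either way, each output has shape (3, 3), which is why passing 36-element columns to meshgrid would produce 36x36 arrays rather than the 6x6 grid that is wanted here.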
If your arrays are well defined (already in the 111222333 and 456456456 format above), then you can just reshape them:
# Convert each pandas column to a NumPy array before reshaping to 6x6
x = data['A_h'].to_numpy().reshape(6, 6)
y = data['f_h'].to_numpy().reshape(6, 6)
z = data['hind C_T ratio'].to_numpy().reshape(6, 6)
plt.contourf(x, y, z)
See the contourf documentation for details.
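If you are not sure the rows are in that order, one option is to let pandas build the grid with a pivot instead of relying on reshape. This is only a sketch: the DataFrame below is a made-up stand-in for the spreadsheet (6 x 6 combinations with an invented result z = A_h * f_h), and only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: 6 x 6 input combinations
a_vals = np.linspace(0.1, 0.6, 6)
f_vals = np.linspace(1.0, 6.0, 6)
rows = [(a, f, a * f) for a in a_vals for f in f_vals]
data = pd.DataFrame(rows, columns=["A_h", "f_h", "hind C_T ratio"])
data = data.sample(frac=1.0, random_state=0)  # shuffle: row order no longer matters

# Rows of the grid are f_h (y), columns are A_h (x), cells hold the result
grid = data.pivot(index="f_h", columns="A_h", values="hind C_T ratio")
x = grid.columns.to_numpy()  # shape (6,)
y = grid.index.to_numpy()    # shape (6,)
z = grid.to_numpy()          # shape (6, 6); z[i, j] belongs to (x[j], y[i])
```

With these arrays, plt.contourf(x, y, z) accepts the 1D x and y directly, because z has shape (len(y), len(x)).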
On the other hand, if your data are irregular (the 36 points do not form a grid), then you will have to use griddata, as @obchardon suggested above.
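For the irregular case, a minimal sketch with scipy.interpolate.griddata could look like this. The 36 scattered points and the linear result z = x + 2y are invented purely for illustration, and the 50 x 50 target grid is an arbitrary choice.

```python
import numpy as np
from scipy.interpolate import griddata

# Made-up scattered inputs and results (36 points that do not form a grid)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 36)
y = rng.uniform(0.0, 1.0, 36)
z = x + 2.0 * y

# Regular grid on which the scattered results are interpolated
xi = np.linspace(x.min(), x.max(), 50)
yi = np.linspace(y.min(), y.max(), 50)
X, Y = np.meshgrid(xi, yi)

# Linear interpolation; grid nodes outside the convex hull of the points become NaN
Z = griddata((x, y), z, (X, Y), method="linear")
```

X, Y and Z are then all 2D with the same shape, so they can be passed to plt.contourf(X, Y, Z). The NaN cells are simply left blank in the plot.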
Suppose I have a Numpy array of a bunch of coordinates [x, y].
I want to filter this array.
For all coordinates in the array with a same x-value, I want to keep only one coordinate: The coordinate with the maximum for the y.
What is the most efficient or Pythonic way to do this?
I will explain with an example below.
coord_arr= array([[10,5], [11,6], [12,6], [10,1], [11,0],[12,2]])
[10, 5] and [10,1] have the same x-value: x=10
maximum for y-values: max(5,1) = 5
So I only keep coordinate [10,5]
Same procedure for x=11 and x=12
So I finally end up with:
filtered_coord_arr= array([[10,5],[11,6],[12,6]])
I have a solution by converting to a list and using list comprehension (see below). But I am looking for a more efficient and elegant solution. (The actual arrays are much larger than in this example.)
My solution:
coord_list = coord_arr.tolist()
x_set = set([coord[0] for coord in coord_list])
coord_max_y_list= []
for x in x_set:
    # All coordinates that share this x-value
    compare_list = [coord for coord in coord_list if coord[0] == x]
    # Keep the coordinate with the largest y
    coord_max = max(compare_list, key=lambda coord: coord[1])
    coord_max_y_list.append(coord_max)
filtered_coord_arr= np.array(coord_max_y_list)
If your array is small, you can do it in one line (here coord is your array):
np.array([[x, max(coord[coord[:,0] == x][:,1])] for x in set(coord[:,0])])
However, that rescans the whole array for every distinct x, so it scales poorly. If the array is big and you care about complexity, make a single pass with a dictionary instead:
d = {}
for x, y in coord:
    d[x] = max(d.get(x, float('-Inf')), y)
np.array([[x, y] for x,y in d.items()])
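A fully vectorized variant is also possible. The sketch below is not from the answers above: it sorts the rows by x and then by y with np.lexsort, and keeps the last row of each x-group. It assumes a non-empty 2-column array.

```python
import numpy as np

coord_arr = np.array([[10, 5], [11, 6], [12, 6], [10, 1], [11, 0], [12, 2]])

# Sort by x (primary key, given last) and by y within equal x,
# so the largest y of each x-group ends up last in its group
order = np.lexsort((coord_arr[:, 1], coord_arr[:, 0]))
sorted_arr = coord_arr[order]

# A row is the last of its group when the next row has a different x;
# the final row always closes a group
is_last = np.append(sorted_arr[1:, 0] != sorted_arr[:-1, 0], True)
filtered_coord_arr = sorted_arr[is_last]
```

The cost is dominated by the sort, and the result comes out ordered by x.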
Alternatively, you can use a pandas groupby:
coord_arr= np.array([[10, 5], [11, 6], [12, 6], [13,7], [10,1], [10,7],[12,2], [13,0]])
df = pd.DataFrame(coord_arr,columns=['a','b'])
df = df.groupby(['a']).agg({'b': ['max']})
df.columns = ['b']
df = df.reset_index()
filtered_coord_arr = np.array(df)
filtered_coord_arr
Output:
array([[10, 7],
[11, 6],
[12, 6],
[13, 7]], dtype=int64)
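The same groupby can be written a little more compactly. This is a sketch of an equivalent formulation, using the same example array and column names as above.

```python
import numpy as np
import pandas as pd

coord_arr = np.array([[10, 5], [11, 6], [12, 6], [13, 7],
                      [10, 1], [10, 7], [12, 2], [13, 0]])
df = pd.DataFrame(coord_arr, columns=["a", "b"])

# as_index=False keeps "a" as a regular column, so no reset_index is needed
filtered_coord_arr = df.groupby("a", as_index=False)["b"].max().to_numpy()
```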
I'm trying to make a Python app that shows a graph after the input of the data by the user, but the problem is that the y_array and the x_array do not have the same dimensions. When I run the program, this error is raised:
ValueError: x and y must have same first dimension, but have shapes () and ()
How can I draw a graph with the X and Y axis of different length?
Here is a minimal example that leads to the same error I got:
import matplotlib.pyplot as plt
y = [0, 8, 9, 3, 0]
x = [1, 2, 3, 4, 5, 6, 7]
plt.plot(x, y)
plt.show()
This is virtually a copy/paste of the answer found here, but I'll show what I did to get these to match.
First, we need to decide which array's length to use: the x_array of length 7 or the y_array of length 5. I'll show both, starting with the latter (shrinking x to length 5). Note that I am using numpy arrays, not lists.
Let's load the modules
import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate as interp
and the arrays
y = np.array([0, 8, 9, 3, 0])
x = np.array([1, 2, 3, 4, 5, 6, 7])
In both cases, we use interp.interp1d which is described in detail in the documentation.
For the x_array to be reduced to the length of the y_array:
x_inter = interp.interp1d(np.arange(x.size), x)
x_ = x_inter(np.linspace(0,x.size-1,y.size))
print(len(x_), len(y))
# Prints 5 5
plt.plot(x_,y)
plt.show()
Which gives a line plot of the five (x_, y) points.
and for the y_array to be increased to the length of the x_array:
y_inter = interp.interp1d(np.arange(y.size), y)
y_ = y_inter(np.linspace(0,y.size-1,x.size))
print(len(x), len(y_))
# Prints 7 7
plt.plot(x,y_)
plt.show()
Which gives a line plot of the seven (x, y_) points.
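For plain linear interpolation, the same resampling can also be sketched with NumPy alone. This swaps in np.interp for interp1d, so it is an alternative rather than the method used above. Each array is treated as a function of its own index and evaluated at evenly spaced fractional indices.

```python
import numpy as np

y = np.array([0, 8, 9, 3, 0])
x = np.array([1, 2, 3, 4, 5, 6, 7])

# Stretch y from 5 to 7 samples: evaluate y at 7 evenly spaced
# fractional positions between index 0 and index 4
new_positions = np.linspace(0, y.size - 1, x.size)
y_ = np.interp(new_positions, np.arange(y.size), y)
```

x and y_ now have the same length, so plt.plot(x, y_) no longer raises the ValueError. Keep in mind that the added values are interpolated, not measured.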
Imagine this piece of code where X is the independent variable, and Y is the dependent variable and is equal to X ** 2:
X = [1, 2, 3]
Y = [1, 4, 9]
plt.plot(X, Y)
plt.show()
What if both of my variables were in the same list:
li = [[1, 1], [2, 4], [3, 9]]
The first element of every nested list is X, and the second one is Y; how should I plot this?
I'm pretty sure someone has already asked this question, but I didn't know what to search for and didn't find an answer.
You can transpose the list of lists prior to plotting it:
X, Y = list(zip(*li))
plt.plot(X, Y)
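To see what the transpose does, here is a short sketch using the list from the question. The NumPy variant at the end is an optional alternative and assumes every nested list has exactly two elements.

```python
import numpy as np

li = [[1, 1], [2, 4], [3, 9]]

# zip(*li) pairs up the first elements and the second elements
X, Y = zip(*li)  # two tuples

# Alternative: convert to an array and slice the columns
arr = np.array(li)
X_np, Y_np = arr[:, 0], arr[:, 1]
```

Either pair can be passed to plt.plot.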
In Python, I have a list of tuples, each of them containing two nx1 vectors.
data = [(np.array([0,0,3]), np.array([0,1])),
(np.array([1,0,4]), np.array([1,1])),
(np.array([2,0,5]), np.array([2,1]))]
Now, I want to split this list into two matrices, with the vectors as columns.
So I'd want:
x = np.array([[0,1,2],
[0,0,0],
[3,4,5]])
y = np.array([[0,1,2],
[1,1,1]])
Right now, I have the following:
def split(data):
    x, y = zip(*data)
    # asarray and transpose return new arrays, so the results must be assigned
    x = np.asarray(x).transpose()
    y = np.asarray(y).transpose()
    return (x, y)
This works fine, but I was wondering whether a cleaner method exists, one which doesn't use the zip(*) function and/or doesn't require converting and transposing the x and y matrices.
This is for pure entertainment, since I'd go with the zip solution if I were to do what you're trying to do.
But a way without zipping would be vstack along your axis 1.
a = np.array(data, dtype=object)  # dtype=object because the two vectors in each tuple differ in length
f = lambda axis: np.vstack(a[:, axis]).T
x,y = f(0), f(1)
>>> x
array([[0, 1, 2],
[0, 0, 0],
[3, 4, 5]])
>>> y
array([[0, 1, 2],
[1, 1, 1]])
Comparing the best elements of all previously proposed methods, I think it's best as follows*:
def split(data):
    x, y = zip(*data)    # splits the list into two tuples of 1xn arrays, x and y
    x = np.vstack(x).T   # stacks the arrays in x vertically and transposes the matrix
    y = np.vstack(y).T   # stacks the arrays in y vertically and transposes the matrix
    return (x, y)
* this is a snippet of my code
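If the goal is to avoid both zip(*) and the explicit transpose, np.column_stack is one more option, since it places 1D arrays side by side as columns. A minimal sketch follows. The function name split_columns is made up for illustration, and the data is the example from the question.

```python
import numpy as np

data = [(np.array([0, 0, 3]), np.array([0, 1])),
        (np.array([1, 0, 4]), np.array([1, 1])),
        (np.array([2, 0, 5]), np.array([2, 1]))]

def split_columns(data):
    # Each 1D vector becomes one column of the result, so no transpose is needed
    x = np.column_stack([first for first, _ in data])
    y = np.column_stack([second for _, second in data])
    return x, y

x, y = split_columns(data)
```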
I have a 2D grid of radioactive beta-decay rates. Each value corresponds to the rate at a specific pair of temperature and density (both on a logarithmic scale). Given a temperature and density pair (after taking their logarithms), I would like to find the matching value in the table. I tried using the scipy interpolate interpn function, but I got a little confused, so I would be grateful for help.
What I have so far:
pointsx = np.array([7+0.2*i for i in range(0,16)]) #temperature range
pointsy = np.array([i for i in range(0,11) ]) #rho_el range
data = np.loadtxt(filename) #getting data from file
logT = np.log10(T) #wanted temperature logarithmic
logrho = np.log10(rho) #wanted rho logarithmic
The interpn function has the following arguments: points, values, xi, method='linear', bounds_error=True, fill_value=nan. I figure that points will be the pointsx and pointsy I have, values is obviously the data, and xi will be the (T, rho) pair I'm looking for. But I'm not sure what dimensions they should have. Should points be the same size as the data? That is, do I have to build an array of all corresponding (T, rho) pairs for points, and then pass a single (T, rho) pair as xi?
When you aren't certain about how a function works, it's always a good idea to open up a REPL and test it yourself. In this case, the function works exactly as expected, given your understanding of the documentation.
>>> points = [[1, 2, 3, 4], [1, 2, 3, 4]] # Input values for each grid dimension
>>> values = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]] # The grid itself
>>> xi = (1, 1.5)
>>> scipy.interpolate.interpn(points, values, xi)
array([ 1.5])
>>> xi = [[1, 1.5], [2, 1.5], [2, 2.5], [3, 2.5], [3, 3.5], [4, 3.5]]
>>> scipy.interpolate.interpn(points, values, xi)
array([ 1.5, 2.5, 3.5, 4.5, 5.5, 6.5])
The only thing you missed is that points is documented as a tuple. But as you can see from the above, it works even if points isn't a tuple.
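Applied to the grid in the question, a sketch might look like the following. The table values here are invented (a simple linear function of the two axes, standing in for the file contents), and the query values 8.35 and 4.5 are arbitrary. It assumes the loaded data has shape (16, 11), that is, one row per temperature and one column per density.

```python
import numpy as np
from scipy.interpolate import interpn

pointsx = 7 + 0.2 * np.arange(16)       # log-temperature axis, 16 values
pointsy = np.arange(11, dtype=float)    # rho_el axis, 11 values

# Made-up table standing in for np.loadtxt(filename); shape (16, 11)
TT, RR = np.meshgrid(pointsx, pointsy, indexing="ij")
data = TT + 2.0 * RR

# points is a tuple of the 1D axes, not an array of all (T, rho) pairs
logT, logrho = 8.35, 4.5
rate = interpn((pointsx, pointsy), data, (logT, logrho))

# Several query points at once: one (logT, logrho) pair per row
queries = np.array([[8.35, 4.5], [7.0, 0.0]])
rates = interpn((pointsx, pointsy), data, queries)
```

So points holds one 1D array per grid dimension, values must have shape (len(pointsx), len(pointsy)), and xi is either a single pair or an array with one pair per row.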