Python scatter plot 2 dimensional array - python

I'm trying to do something that I think should be pretty straightforward, but I can't seem to get it to work.
I'm trying to plot 16 byte values measured over time to see how they change. I'm trying to use a scatter plot to do this with:
x axis being the measurement index
y axis being the index of the byte
and the color indicating the value of the byte.
I have the data stored in a numpy array where data[2][14] would give me the value of the 14th byte in the 2nd measurement.
Every time I try to plot this, I'm getting either:
ValueError: x and y must be the same size
IndexError: index 10 is out of bounds for axis 0 with size 10
Here is the sample test I'm using:
import numpy
import numpy.random as nprnd
import matplotlib.pyplot as plt
#generate random measurements
# 10 measurements of 16 byte values
x = numpy.arange(10)
y = numpy.arange(16)
test_data = nprnd.randint(low=0,high=65535, size=(10, 16))
#scatter plot the measurements with
# x - measurement index (0-9 in this case)
# y - byte value index (0-15 in this case)
# c = test_data[x,y]
plt.scatter(x,y,c=test_data[x][y])
plt.show()
I'm sure it is something stupid I'm doing wrong but I can't seem to figure out what.
Thanks for the help.

Try using a meshgrid to define your point locations, and don't forget to index into your NumPy array properly (with [x,y] rather than [x][y]):
x, y = numpy.meshgrid(x,y)
plt.scatter(x,y,c=test_data[x,y])
plt.show()
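To see why the meshgrid version lines up, it can help to check the shapes involved. The sketch below uses only NumPy and the same sizes as the question; the names `xx`, `yy`, and `colors` are mine, not from the original post.

```python
import numpy as np

# 10 measurements of 16 byte values, as in the question
test_data = np.random.randint(low=0, high=65535, size=(10, 16))

x = np.arange(10)   # measurement index
y = np.arange(16)   # byte index

# meshgrid pairs every x with every y; both outputs have shape (16, 10)
xx, yy = np.meshgrid(x, y)

# test_data[xx, yy] looks up one value per (x, y) pair, whereas
# test_data[x][y] first selects rows and then indexes those rows again
colors = test_data[xx, yy]

print(xx.shape, yy.shape, colors.shape)

# Flattened, the three arrays hold one entry per point and could be passed as
# plt.scatter(xx.ravel(), yy.ravel(), c=colors.ravel())
```

Each of the three arrays holds 160 entries, one per (measurement, byte) pair, which is what scatter needs for x, y, and c.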

Related

Plotting per-point alpha values in 3D scatterplot throws ValueError

I have data in form of a 3D array, with "intensities" at every point. Depending on the intensity, I want to plot the point with a higher alpha. There are a lot of low-value outliers, so color coding (with scalar floats) won't work since they eclipse the real data.
What I have tried:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgb,to_rgba
#this generates a 3D array with higher values around the center
a = np.array([0,1,2,3,4,5,4,3,2,1])
aa = np.outer(a,a)
aaa = np.einsum("ij,jk,jl",aa,aa,aa)
x_,y_,z_,v_ = [],[],[],[]
for x in range(aaa.shape[0]):
    for y in range(aaa.shape[1]):
        for z in range(aaa.shape[2]):
            x_.append(x)
            y_.append(y)
            z_.append(z)
            v_.append(aaa[x,y,z])
r,g,b = to_rgb("blue")
color = np.array([[r,g,b,a] for a in v_])
fig = plt.figure()
ax = fig.add_subplot(projection = '3d')
ax.scatter(x_,y_,z_,c =color)
plt.show()
The scatter plot documentation says that the color can be a 2D array of RGBA rows, which is what I pass. However, when I try to run the code, I get the following error:
ValueError: 'c' argument has 4000 elements, which is inconsistent with 'x' and 'y' with size 1000.
I just found my own answer.
The "A 2D array in which the rows are RGB or RGBA." statement in the documentation was a bit confusing - one needs to convert the RGBA rows to RGBA objects first, so that list comprehension should read:
color = np.array([to_rgba([r,g,b,a]) for a in v_])
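One thing worth checking here: Matplotlib expects every RGBA component to lie between 0 and 1, and the intensities in the question go far above 1, so they cannot be used as alpha values directly. The following is a minimal sketch, assuming that dividing by the maximum intensity is an acceptable way to rescale; the name `alpha` is mine.

```python
import numpy as np

# Same intensity cube as in the question, shape (10, 10, 10)
a = np.array([0, 1, 2, 3, 4, 5, 4, 3, 2, 1])
aa = np.outer(a, a)
aaa = np.einsum("ij,jk,jl", aa, aa, aa)

# Coordinates of every voxel, flattened in the same order as aaa.ravel()
x_, y_, z_ = (idx.ravel() for idx in np.indices(aaa.shape))
v_ = aaa.ravel().astype(float)

# Rescale the intensities to 0..1 so they are valid alpha values
alpha = v_ / v_.max()

# One RGBA row per point: pure blue with a per-point alpha
color = np.zeros((v_.size, 4))
color[:, 2] = 1.0      # blue channel
color[:, 3] = alpha    # alpha channel

print(color.shape, color.min(), color.max())

# color can then be passed as ax.scatter(x_, y_, z_, c=color)
```

The array has one row per point and four columns, which matches the "2D array in which the rows are RGB or RGBA" form from the documentation.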

ValueError: x and y must have same first dimension, but have shapes (101,) and (1,) [duplicate]

I am new to coding and to Jupyter Notebook, and I wanted to ask how to graph x (any time t) = (0, 10, 101) against y (acceleration) = -2.2. Those are the values given to us by our professor, but when I try to plot them I get this error: ValueError: x and y must have same first dimension, but have shapes (101,) and (1,). Thank you.
Your description wasn't clear. Next time you post, I highly suggest including the code that causes the problem; have a look at how others frame their questions. Anyway, I will try my best to help you.
We know that:
x = 0.5at^2 +V0t
Where:
x: position
a: acceleration
V0: initial velocity
t: time
In real life time is continuous; however, a truly continuous variable is impossible in programming, so the next best thing is to use a range with a very small step size.
Let's start with assuming that the initial velocity is zero --> x = 0.5at*t
Now that we have simplified the equation let's tackle the problem of time.
import numpy as np
import matplotlib.pyplot as plt
# acceleration is a constant variable
a = -2.2
# get array for the time
t = np.arange(0,10,0.1)
# calculate position at each time and store in array
x = 0.5*a*t*t
plt.plot(t,x)
plt.show()
Above we calculated a value of x for each value in the time array. As you can see, in order to plot position against time, the two arrays need to have the same length. We can check the lengths of the arrays using the len function:
print(f"length of time: {len(t)} ")
print(f"length of position: {len(x)}" )
out:
length of time: 100
length of position: 100
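The question mentions 101 points between 0 and 10, while np.arange(0,10,0.1) produces 100 points and leaves out the end value. A short sketch of the difference, assuming the professor's (0, 10, 101) was meant as arguments to np.linspace:

```python
import numpy as np

a = -2.2  # constant acceleration from the question

t_arange = np.arange(0, 10, 0.1)       # step given, stop value excluded
t_linspace = np.linspace(0, 10, 101)   # number of points given, stop included

# position for zero initial velocity: x = 0.5 * a * t^2
x = 0.5 * a * t_linspace**2

print(len(t_arange), len(t_linspace), len(x))
```

Because x is computed from t_linspace, the two always have the same length and can be plotted against each other.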
Here are some sources to help you get started with learning python:
Great free Course covering all the basics by Microsoft
List Comprehension
Functions in python
Some channels on Youtube that I recommend:
Real Python
Corey Schafer
DataCamp
When you want to plot x versus y data, x and y need to have matching shapes.
So, to plot a horizontal line at y = -2.2 for x from 0 to 10 with 101 points, instead of
y = (-2.2)
you need to use
y = np.full(101, -2.2)
or, better,
y = np.full(x.shape, -2.2)
so that y has shape (101,), matching the shape of x.
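The shapes involved can be checked without plotting anything. In this sketch `np.atleast_1d` is only used to reproduce the (1,) shape from the error message, and `np.full_like` is shown as an alternative that takes both shape and dtype from x.

```python
import numpy as np

x = np.linspace(0, 10, 101)

y_single = np.atleast_1d(-2.2)     # shape (1,), the shape reported in the error
y_full = np.full(x.shape, -2.2)    # shape (101,), matches x
y_like = np.full_like(x, -2.2)     # same values, shape and dtype taken from x

print(y_single.shape, y_full.shape, y_like.shape)
```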
Use this:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,10,101)
y = np.repeat(-2.2,101) # map y constant value
plt.plot(x,y)
plt.show()

Plotting in matplotlib

I wrote the following code,
import numpy as np
from random import gauss
from random import seed
from pandas import Series
import matplotlib.pyplot as plt
import math
###variable declaration
R=0.000001 #radiaus of particle
beta=0.23 # shape factor
Ad=9.2#characteristic nanoscale defect area
gamma=4*10**-2 #surface tension
tau=.0001 #line tension
phi=-(math.pi/2)#spatial perturbation
lamda=((Ad)/(2*3.14*R)) #averge contcat line position
mu=0.001#viscosity of liquid
lamda_m=10**-9# characteristic size of adsorption site
KbT=(1.38**-24)*293 # boltzman constant with tempartaure
nu=0.001#moleculer volume in liquid phase 1
khi=3 #scaling factor
#deltaF=(beta*gamma*Ad)#surface energy perturbation
deltaF=19*KbT
# seed random number generator
seed(0)
# create white noise series
series = [gauss(0.0, 1.0) for i in range(1)]
series = Series(series)
#########################################
Z=0.0000001 #particle position
time=1
dt=1
for time in np.arange(1, 100, dt):
    #####simulation loop#######
    theta=np.arccos(-Z/R) #contact angle
    theta_e=((math.pi*110)/180) #equilibrium contact angle
    Z_e=-R*np.cos(theta_e)#equilibrium position of particle
    C=3.14*gamma*(R-Z_e) #additive constant
    Fsz= (gamma*math.pi*(Z-Z_e)**2)+(tau*2*math.pi*math.sqrt(R**2-Z**2))+C
    Fz=Fsz+(0.5*deltaF*np.sin((2*math.pi/lamda)*(Z-Z_e)-phi))#surface force
    #dFz=(((gamma*Ad)/2)*np.sin(2*math.pi/lamda))+((Z-Z_e)*(2*gamma*math.pi))-((tau*2*math.pi*Z)/(math.sqrt(R**2-Z**2)))
    dFz=(deltaF*np.sin(2*math.pi/lamda))+((Z-Z_e)*(2*gamma*math.pi))-((tau*2*math.pi*Z)/(math.sqrt(R**2-Z**2)))
    w_a=gamma*lamda_m**2*(1-np.cos(theta_e)) #work of adhesion
    epsilon_z=2*math.pi*R*np.sin(theta)*mu*(nu/(lamda_m**3))*np.exp(w_a/KbT)#transitional drag
    epsilon_s=khi*mu*((4*math.pi**2*R**2)/math.sqrt(Ad))*(1-(Z/R)**2)
    epsilon=epsilon_z+epsilon_s
    Ft=math.sqrt(2*KbT*epsilon)*series #thermal force
    v=(dFz+Ft)/epsilon ##new velocity
    Z=Z+v*dt #new position
    print('z=',Z)
    print('v=',v)
    print('Fz=',Fz)
    print('dFz',dFz)
    print('time',time)
    plt.plot(Z,time)
plt.show()
According to my code I should have 99 values for everything (Fz, Z, v, time). When I print them I can see all the values, but when I try to plot them against each other for analysis, I never get a graph. Can anyone tell me what is missing from my code, with an explanation?
@AnttiA's answer is basically correct, but can easily be misunderstood, as can be seen from the OP's comment. Therefore, here is the complete code, altered such that a plot is actually produced. Instead of making Z a list, define another variable as a list, say Z_all = [], and then append the updated Z-values to that list. The same can be done for the time variable, i.e. time_all = np.arange(1,100,dt). Finally, take the plot command out of the loop and plot the entire data series at once.
Note that in your example you don't really have a series of random numbers: you pull one fixed number for one fixed seed, so the plot is not really meaningful (it appears to produce a straight line). Trying to interpret your intentions correctly, you probably want a series of random numbers that is as long as your time series. This is most easily done using np.random.normal.
There are a lot of other ways in which your code could be optimised. For instance, all mathematical functions from the math module are also found in numpy, so you don't need to import math at all. The same goes for pandas. Also, you define some constant values inside the for-loop, which could be computed once before the loop. Lastly, @AnttiA is probably right that you want time on the x axis and Z on the y axis. I therefore generate two plots: on the left time over Z and on the right Z over time. Now finally the altered code:
import numpy as np
#from random import gauss
#from random import seed
#from pandas import Series
import matplotlib.pyplot as plt
#import math
###variable declaration
R=0.000001 #radiaus of particle
beta=0.23 # shape factor
Ad=9.2#characteristic nanoscale defect area
gamma=4*10**-2 #surface tension
tau=.0001 #line tension
phi=-(np.pi/2)#spatial perturbation
lamda=((Ad)/(2*3.14*R)) #averge contcat line position
mu=0.001#viscosity of liquid
lamda_m=10**-9# characteristic size of adsorption site
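KbT=(1.38**-24)*293 # Boltzmann constant times temperature; note 1.38**-24 is 1.38 raised to the power -24, whereas the Boltzmann constant is about 1.38e-23 J/K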
nu=0.001#moleculer volume in liquid phase 1
khi=3 #scaling factor
#deltaF=(beta*gamma*Ad)#surface energy perturbation
deltaF=19*KbT
##quantities moved out of the for-loop:
theta_e=((np.pi*110)/180) #equilibrium contact angle
Z_e=-R*np.cos(theta_e)#equilibrium position of particle
C=3.14*gamma*(R-Z_e) #additive constant
w_a=gamma*lamda_m**2*(1-np.cos(theta_e)) #work of adhesion
#########################################
Z=0.0000001 #particle position
##time=1
dt=1
Z_all = []
time_all = np.arange(1, 100, dt)
# seed random number generator
# seed(0)
np.random.seed(0)
# create white noise series
##series = [gauss(0.0, 1.0) for i in range(1)]
##series = Series(series)
series = np.random.normal(0.0, 1.0, len(time_all))
for time, S in zip(time_all,series):
    #####simulation loop#######
    Z_all.append(Z)
    theta=np.arccos(-Z/R) #contact angle
    Fsz= (gamma*np.pi*(Z-Z_e)**2)+(tau*2*np.pi*np.sqrt(R**2-Z**2))+C
    Fz=Fsz+(0.5*deltaF*np.sin((2*np.pi/lamda)*(Z-Z_e)-phi))#surface force
    #dFz=(((gamma*Ad)/2)*np.sin(2*np.pi/lamda))+((Z-Z_e)*(2*gamma*np.pi))-((tau*2*np.pi*Z)/(np.sqrt(R**2-Z**2)))
    dFz=(deltaF*np.sin(2*np.pi/lamda))+((Z-Z_e)*(2*gamma*np.pi))-((tau*2*np.pi*Z)/(np.sqrt(R**2-Z**2)))
    epsilon_z=2*np.pi*R*np.sin(theta)*mu*(nu/(lamda_m**3))*np.exp(w_a/KbT)#transitional drag
    epsilon_s=khi*mu*((4*np.pi**2*R**2)/np.sqrt(Ad))*(1-(Z/R)**2)
    epsilon=epsilon_z+epsilon_s
    Ft=np.sqrt(2*KbT*epsilon)*S #series #thermal force
    v=(dFz+Ft)/epsilon ##new velocity
    Z=Z+v*dt #new position
    print('z=',Z)
    print('v=',v)
    print('Fz=',Fz)
    print('dFz',dFz)
    print('time',time)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8,4))
axes[0].plot(Z_all,time_all)
axes[0].set_xlabel('Z')
axes[0].set_ylabel('t')
axes[1].plot(time_all, Z_all)
axes[1].set_xlabel('t')
axes[1].set_ylabel('Z')
fig.tight_layout()
plt.show()
The result looks like this (image not preserved): the left panel shows t against Z, the right panel shows Z against t.
I suppose you will get a plot anyway; the y-values probably run from 94 to 104.
Right now you are plotting a line with a single point. Its length is zero, which is why you cannot see it. Try: plt.plot(Z,time,'*').
Now you should get a graph with an asterisk in the middle.
As Thomas suggested, you should use arrays instead of only the last calculated value. If you prefer loops (sometimes they are easier to modify), modify these lines:
Before loop:
Z = [0.0000001] # Initialize Z for time 0
time_vec = np.arange(1, 100, dt)
Inside loop:
Z.append(Z[-1] + v*dt) # new position
After loop:
plt.plot(Z[1:], time_vec)
I have had no time to test it; hopefully it works.
Note that the first argument of the plot command holds the x-axis values and the second the y-axis values; I'd prefer time on the x-axis.
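The pattern both answers describe, recording one value per step and plotting once after the loop, can be sketched on a toy problem. The random-walk update below is a hypothetical stand-in for the physics in the question, and the names `z`, `z_all`, and `rng` are mine.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, so runs are repeatable
dt = 1
time_all = np.arange(1, 100, dt)

z = 0.0        # hypothetical starting position
z_all = []     # one entry per time step

for t in time_all:
    v = rng.normal(0.0, 1.0)   # toy velocity: a fresh random number each step
    z = z + v * dt             # update the scalar state
    z_all.append(z)            # and record it

z_all = np.array(z_all)
print(len(time_all), len(z_all))

# Both arrays now have 99 entries, so a single call after the loop,
# plt.plot(time_all, z_all), draws the whole series.
```

Keeping the scalar state and the recorded history in separate variables avoids having to rewrite every expression that uses the current position.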

BeamDeflection Plot

I'm having trouble with my script not showing a plot.
The plot must show the deflection of the beam as a function of the x-coordinate of the entire beam. I don't know if I can make the statements: "x[i]>a[v]" if x is not given...
import numpy as np #Imports NumPy
import matplotlib.pyplot as plt
def beamPlot(beamLength, loadPositions, loadForces, beamSupport):
    l=beamLength #Scalar
    a=loadPositions #Vector
    W=loadForces #Vector
    x=np.array(range(0,l))
    E=200*10**9 #Constant [N/m^2]
    I=0.001 #Constant [m^4]
    #Makes an empty vector with the same size as x
    y=np.empty_like(x)
    for i in range(np.size(x)): #Continues as long as the vector x
        for v in range(np.size(a)):
            if a[v]==[ ] and W[v]==[ ]:
                return np.zeros(np.size(x))
            elif beamSupport=="both" and x[i]<a[v]:
                y[i]=np.sum(((W[v]*(l-a[v])*x[i])/(6*E*I*l))*(l**2-x[i]**2-(l-a[v])**2))
            elif beamSupport=="both" and x[i]>=a[v]:
                y[i]=np.sum(W[v]*a[v]*(l-x[i])/(6*E*I*l)*(l**2-(l-x[i])**2-a[v]**2))
            elif beamSupport=="cantilever" and x[i]<a[v]:
                y[i]=np.sum((W[v]*x[i]**2)/(6*E*I)*(3*a[v]-x[i]))
            elif beamSupport=="cantilever" and x[i]>=a[v]:
                y[i]=np.sum((W[v]*a[v]**2)/(6*E*I)*(3*x[i]-a[v]))
    deflection=y
    plt.ylim([0,10000])
    plt.xlim([0,l])
    plt.title("Beam deflection")
    plt.plot(x, deflection)
    plt.show()
Your array x is created from the list of integers given by range(0,l), which means that the elements in the array are of type int. You create the y array using np.empty_like(), which means that it also has elements of type int. Unless you are using huge values for the loads, the float values produced by your calculations get truncated to 0 when converted to int, so the plot is a flat line at y=0.
You can fix this by specifying that y should contain float values when it is created by adding dtype=float to:
y=np.empty_like(x, dtype=float)
You should also remove the plt.ylim([0,10000]) call and instead let matplotlib autoscale your y-axis, since the displacements are probably not going to be this large for any reasonable values of the loads (given your stiffness).
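The truncation can be reproduced in a few lines without any beam formulas. In this sketch the value 3.2e-4 is a hypothetical small deflection, chosen only to show the effect.

```python
import numpy as np

x = np.array(range(0, 5))                 # integer dtype, as in the question
y_int = np.empty_like(x)                  # inherits the integer dtype
y_float = np.empty_like(x, dtype=float)   # explicitly float

deflection = 3.2e-4   # hypothetical small deflection value

y_int[:] = deflection     # silently truncated to 0
y_float[:] = deflection   # stored as given

print(y_int.dtype, y_int)
print(y_float.dtype, y_float)
```

NumPy casts on assignment without raising an error, which is why the original script runs but draws a flat line.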

Personalised colourmap plot using set numbers using matplotlib

I have a data which looks like (example)
x y d
0 0 -2
1 0 0
0 1 1
1 1 3
And I want to turn this into a colourmap plot (example images not preserved), where x and y are as in the table and the colour is given by 'd'. However, I want a predetermined colour for each number, for example:
-2 - orange
0 - blue
1 - red
3 - yellow
Not necessarily these colours, but I need to assign a colour to each number, and the numbers are not in order or sequence; they are just a set of five or six arbitrary numbers which repeat themselves across the entire array.
Any ideas? I haven't got any code for this, as I don't know where to start. I have, however, looked at examples on here such as:
Matplotlib python change single color in colormap
However, they only show how to define colours, not how to link those colours to a specific value.
It turns out this is harder than I thought, so maybe someone has an easier way of doing this.
Since we need to create an image of the data, we will store them in a 2D array. We can then map the data to the integers 0 .. number of different data values and assign a color to each of them. The reason is that we want the final colormap to be equally spaced. So
value -2 --> integer 0 --> color orange
value 0 --> integer 1 --> color blue
and so on.
Having nicely spaced integers, we can use a ListedColormap on the image of newly created integer values.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.colors
# define the image as a 2D array
d = np.array([[-2,0],[1,3]])
# create a sorted list of all unique values from d
ticks = np.unique(d.flatten()).tolist()
# create a new array of same shape as d
# we will later use this to store values from 0 to number of unique values
dc = np.zeros(d.shape)
#fill the array dc
for i in range(d.shape[0]):
    for j in range(d.shape[1]):
        dc[i,j] = ticks.index(d[i,j])
# now we need n (= number of unique values) different colors
colors= ["orange", "blue", "red", "yellow"]
# and put them to a listed colormap
colormap = matplotlib.colors.ListedColormap(colors)
plt.figure(figsize=(5,3))
#plot the newly created array, shift the colorlimits,
# such that later the ticks are in the middle
im = plt.imshow(dc, cmap=colormap, interpolation="none", vmin=-0.5, vmax=len(colors)-0.5)
# create a colorbar with n different ticks
cbar = plt.colorbar(im, ticks=range(len(colors)) )
#set the ticklabels to the unique values from d
cbar.ax.set_yticklabels(ticks)
#set nice tickmarks on image
plt.gca().set_xticks(range(d.shape[1]))
plt.gca().set_yticks(range(d.shape[0]))
plt.show()
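The double loop that fills dc can also be written with np.unique and its return_inverse option, which returns the index of each element in the sorted list of unique values. This is only a sketch of that alternative; the `lookup` dictionary is a hypothetical helper added to make the value-to-colour pairing explicit.

```python
import numpy as np

d = np.array([[-2, 0], [1, 3]])

# ticks: sorted unique values; inverse: index of each element within ticks
ticks, inverse = np.unique(d.ravel(), return_inverse=True)
dc = inverse.reshape(d.shape)

# Hypothetical value-to-colour table, in the same order as ticks
colors = ["orange", "blue", "red", "yellow"]
lookup = dict(zip(ticks.tolist(), colors))

print(dc)
print(lookup)
```

Here dc plays the same role as in the loop version, so it can be passed to imshow together with a ListedColormap built from colors.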
As it may not be intuitively clear how to get the array d in the shape needed for plotting with imshow, i.e. as 2D array, here are two ways of converting the input data columns:
import numpy as np
x = np.array([0,1,0,1])
y = np.array([ 0,0,1,1])
d_original = np.array([-2,0,1,3])
#### Method 1 ####
# Intuitive method.
# Assumption:
# * Indexing in x and y starts at 0
# * every index pair occurs exactly once.
# Create a zero-filled array of shape (n+1,m+1)
# where n is the maximum index in y and
# m is the maximum index in x
d = np.zeros((y.max()+1 , x.max()+1), dtype=int)
for k in range(len(d_original)) :
    d[y[k],x[k]] = d_original[k]
print(d)
#### Method 2 ####
# Fast method
# Additional assumption:
# indices in x and y are ordered exactly such
# that y is sorted ascendingly first,
# and for each index in y, x is sorted.
# In this case the original d array can simply be reshaped
d2 = d_original.reshape((y.max()+1 , x.max()+1))
print(d2)
