Making a density plot in Python from imported data file

I have a .dat file with three columns, which I assume are x, y and z = f(x,y), respectively.
I want to make a density plot out of this data. While looking for some example that could help me out, I came across the following posts:
How to plot a density map in python?
matplotlib plot X Y Z data from csv as pcolormesh
What I have tried so far is the following:
from matplotlib import cm
import matplotlib.pyplot as plt
import numpy as np
x, y, z = np.loadtxt('data.dat', unpack=True, delimiter='\t')
N = int(len(z)**.5)
z = z.reshape(N, N)
plt.imshow(z, extent=(np.amin(x), np.amax(x), np.amin(y), np.amax(y)),cmap=cm.hot)
plt.colorbar()
plt.show()
The file data can be accessed here: data.dat.
When I run the script above, it returns me the following error message:
cannot reshape array of size 42485 into shape (206,206)
Can someone help me to understand what I have done wrong and how to fix it?

The reason is that your data does not contain exactly 206*206 values: z has 42485 entries, while a 206x206 grid holds only 42436.
One option is to slice z, but you lose data when you do that.
And if that is acceptable, note that the result no longer corresponds to your x, y values.
z = z[:N**2].reshape(N,N)
In the link you posted I saw this statement:
I assume here that your data can be transformed into a 2d array by a simple reshape. If this is not the case than you need to work a bit harder on getting the data in this form.
The assumption does not hold for your data.
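If the points do sit on a regular x/y grid that simply has some entries missing, one way to get them into 2-D form is to place each z at the position given by its own x and y, leaving NaN where nothing was measured. The following is only a minimal sketch under that assumption (it has not been checked against data.dat), and to_grid is a made-up helper name:

```python
import numpy as np

def to_grid(x, y, z):
    """Place scattered (x, y, z) samples on the grid spanned by the unique x and y values.

    Assumes the samples lie on a regular grid with some points missing;
    grid cells without a sample are left as NaN.
    """
    x, y, z = np.ravel(x), np.ravel(y), np.ravel(z)
    xs, ix = np.unique(x, return_inverse=True)  # sorted unique x values and each sample's column
    ys, iy = np.unique(y, return_inverse=True)  # sorted unique y values and each sample's row
    grid = np.full((ys.size, xs.size), np.nan)
    grid[iy, ix] = z
    return xs, ys, grid

# Tiny illustration: a 2x2 grid with one point missing
xs, ys, grid = to_grid([0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 2.0, 3.0])
```

With the real file, plt.pcolormesh(xs, ys, grid) would then draw the map with gaps where data is missing. If the points turn out to be genuinely scattered rather than gridded, plt.tricontourf(x, y, z) avoids the reshape altogether.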


How can I plot only particular values in xarray?

I am using data from cdasws to plot dynamic spectra. I am following the example found here https://cdaweb.gsfc.nasa.gov/WebServices/REST/jupyter/CdasWsExample.html
This is my code, which I have modified to obtain a dynamic spectrum for STEREO.
from cdasws import CdasWs
from cdasws.datarepresentation import DataRepresentation
import matplotlib.pyplot as plt
cdas = CdasWs()
import numpy as np
datasets = cdas.get_datasets(observatoryGroup='STEREO')
for index, dataset in enumerate(datasets):
    print(dataset['Id'], dataset['Label'])
variables = cdas.get_variables('STEREO_LEVEL2_SWAVES')
for variable_1 in variables:
    print(variable_1['Name'], variable_1['LongDescription'])
data = cdas.get_data('STEREO_LEVEL2_SWAVES', ['avg_intens_ahead'],
                     '2020-07-11T02:00:00Z', '2020-07-11T03:00:00Z',
                     dataRepresentation=DataRepresentation.XARRAY)[1]
print(data)
plt.figure(figsize = (15,7))
# plt.ylim(100,1000)
plt.xticks(fontsize=18)
plt.yticks(fontsize=18)
plt.yscale('log')
sorted_data.transpose().plot()
plt.xlabel("Time",size=18)
plt.ylabel("Frequency (kHz)",size=18)
plt.show()
Using this code gives a plot that looks something like this,
My question is: is there any way of plotting this spectrum only for a particular frequency? For example, I want to plot just the intensity values at 636 kHz. Is there any way I can do that?
Any help is greatly appreciated. I don't understand xarray and have never worked with it before.
Edit -
Using the command,
data_stereo.avg_intens_ahead.loc[:,625].plot()
generates a plot that looks like,
While this is useful, what I need is this:
for the dynamic spectrum, if I choose a particular frequency like 600 kHz, can it display something like this (I have just added white boxes to clarify what I mean) -
If you still want the plot to be 2D, but to include a subset of your data along one of the dimensions, you can provide an array of indices or a slice object. For example:
data_stereo.avg_intens_ahead.sel(
frequency=[625]
).plot()
Or
# include a 10% band on either side
data_stereo.avg_intens_ahead.sel(
frequency=slice(625*0.9, 625*1.1)
).plot()
Alternatively, if you would actually like your plot to show white space outside this selected area, you could mask your data with where:
data_stereo.avg_intens_ahead.where(
data_stereo.frequency==625
).plot()
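Note that an exact match such as frequency==625 only works if 625 is actually one of the coordinate values. A minimal sketch on synthetic data (the array below is a made-up stand-in, and the coordinate name frequency is assumed to match your dataset) showing a nearest-value selection and a band mask:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the spectrogram, with dims (time, frequency)
time = np.arange(5)
frequency = np.array([575.0, 625.0, 675.0, 725.0])
values = np.arange(20, dtype=float).reshape(5, 4)
spec = xr.DataArray(
    values,
    coords={"time": time, "frequency": frequency},
    dims=("time", "frequency"),
)

# 1-D time series at the channel closest to 636 kHz
line = spec.sel(frequency=636, method="nearest")

# Still 2-D, but everything outside 600-700 kHz becomes NaN (drawn as blank)
band = spec.where((spec.frequency >= 600) & (spec.frequency <= 700))
```

Calling line.plot() or band.plot() then gives the line plot or the masked spectrogram, respectively.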

Using `dask.array.map_blocks()` to parallelize line fitting on a 3-D `dask.array`

I have a series of N images that are recorded at different times. I have stacked the images into a 3-D dask array and rechunked them along the time axis. I would now like to perform a linear fit at each pixel position across the image, but I am running into the following error when using da.map_blocks as I try to scale up: TypeError: expected 1D or 2D array for y
I found one other post, applying-a-function-along-an-axis-of-a-dask-array, related to this but it didn't address an issue with specifically setting the chunk size. When using da.apply_along_axis I found an issue similar to the one reported in dask-performance-apply-along-axis wherein only one CPU seems to be utilized during the computation (even for chunked data).
MWE: Works properly
import dask.array as da
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
def f(y, args, axis=None):
    return np.polyfit(args[0], y.squeeze(), args[1])[:, None, None]
deg = 1
nsamp=20*10*10
shape=(20,10,10)
chunk_size=(20,1,1)
a = da.linspace(1, nsamp, nsamp).reshape(shape)
chunked = a.rechunk(chunk_size)
times = da.linspace(1, shape[0], shape[0])
results = chunked.map_blocks(f, chunks=(20,1,1), args=[times, deg], dtype='float').compute()
m_fit = results[0]
b_fit = results[1]
# Plot a few fits to visually examine them
fig, ax = plt.subplots(nrows=1, ncols=1)
for (x,y) in zip([1,9], [1,9]):
    ax.scatter(times, chunked[:,x,y])
    ax.plot(times, np.polyval([m_fit[x, y], b_fit[x,y]], times))
The array, chunked, looks like this:
The resulting plot looks like this,
Which is exactly what I would expect and so all is well! However, the issue arises whenever I try to use a chunksize larger than one.
MWE: Raises TypeError
nsamp=20*10*10
shape=(20,10,10)
chunk_size=(20,5,5) # Chunking the data now
a = da.linspace(1,nsamp, nsamp).reshape(shape)
chunked = a.rechunk(chunk_size)
times = da.linspace(1, shape[0], shape[0])
results = chunked.map_blocks(f, chunks=(20,1,1), args=[times, 1], dtype='float') # error
Does anyone have any ideas as to what is happening here?
It looks like maybe your function expects single-dimensional inputs. I wonder if there is a way that you can write a Python function that wraps your function and handles the unpacking and then repacking of one-dimensional inputs. If you can get that function to work on a single numpy array of shape (20, 2, 2) for example then you can probably use Dask to then apply that function across many similarly sized chunks
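A rough sketch of such a wrapper, tried only on a plain NumPy block: np.polyfit accepts a 2-D y with one data set per column, so the block can be flattened to (time, pixels), fitted in one call, and reshaped back. The name fit_block and its arguments are made up for illustration:

```python
import numpy as np

def fit_block(block, t, deg=1):
    """Fit a degree-`deg` polynomial along axis 0 for every pixel of a (T, ny, nx) block."""
    nt, ny, nx = block.shape
    flat = block.reshape(nt, ny * nx)       # one column per pixel
    coeffs = np.polyfit(t, flat, deg)       # shape (deg + 1, ny * nx)
    return coeffs.reshape(deg + 1, ny, nx)  # back to image layout

# Check on a small block: slope 2 everywhere, intercepts 0..3
t = np.linspace(1, 20, 20)
block = 2.0 * t[:, None, None] + np.arange(4.0).reshape(2, 2)
coeffs = fit_block(block, t)
```

Once this works on a single array, passing it to map_blocks with the times as a NumPy array and chunks describing the output block, for example (2, 5, 5) for a linear fit on (20, 5, 5) chunks, would be the next thing to try. I have not run that part against your data.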

Error when trying to interpolate using SmoothSphereBivariateSpline(): "ValueError: Error code returned by bispev: 10"

I want to interpolate data, which is randomly scattered on the surface of a sphere, onto a regular longitude/latitude grid. I tried to do this with SmoothSphereBivariateSpline() from the scipy.interpolate package (see the code below).
import numpy as np
from scipy.interpolate import SmoothSphereBivariateSpline
#Define the input data and the original sampling points
NSamp = 2000
Theta = np.random.uniform(0,np.pi,NSamp)
Phi = np.random.uniform(0,2*np.pi, NSamp)
Data = np.ones(NSamp)
Interpolator = SmoothSphereBivariateSpline(Theta, Phi, Data, s=3.5)
#Prepare the grid to which the input shall be interpolated
NLon = 64
NLat = 32
GridPosLons = np.arange(NLon)/NLon * 2 * np.pi
GridPosLats = np.arange(NLat)/NLat * np.pi
LatsGrid, LonsGrid = np.meshgrid(GridPosLats, GridPosLons)
Lats = LatsGrid.ravel()
Lons = LonsGrid.ravel()
#Interpolate
Interpolator(Lats, Lons)
However, when I execute this code it gives me the following error:
ValueError: Error code returned by bispev: 10
Does anyone know what the problem is and how to fix it? Is this a bug or am I doing something wrong?
In the documentation of __call__ method of SmoothSphereBivariateSpline, note the grid flag (some other interpolators have it too). If True, it's understood that you are entering one-dimensional arrays from which a grid is to be formed. This is the default value. But you already made a meshgrid from your one-dimensional arrays, so this default behavior doesn't work for your input.
Solution: use either
Interpolator(Lats, Lons, grid=False)
or, which is simpler and better:
Interpolator(GridPosLats, GridPosLons)
The latter will return the data in grid form (2D array), which makes more sense than the flattened data you would get with the first version.

Python - Plotting Fourier transform from text file

I have a text file with several columns of recorded values: the first column is time, and columns 2, 3, and 4 are the x, y, and z positions, respectively. Plotting any of the positions against time shows an oscillation.
I want to take the Fourier transform of this data and plot it with frequency on the x-axis.
I'm having trouble following the examples from other posts, so maybe somebody can point me in the right direction.
Having my text file,
with open('SampleData.txt') as f:
    data = f.read()
data = data.split('\n')
t = [float(row.split()[0]) for row in data]
x1 = [float(row.split()[1]) for row in data]
Now, using NumPy's Fourier transform function, I have no idea where to go from here.
from matplotlib.pyplot import *
import numpy
spectrum =numpy.fft.fft(x1)
spectrum = abs(spectrum[:len(spectrum)//2]) # Just the first half of the spectrum, as the second half mirrors it (negative frequencies)
figure()
plot(spectrum)
show()
I'll edit the answer according to your need, as your question is not very clear.
Fast Fourier Transforms in Numpy are pretty straightforward:
fft = np.fft.fft(x)
See the numpy.fft documentation for more details.
Plotting a simple line is straightforward too:
import matplotlib.pyplot as plt
plt.plot(fft)
See the matplotlib.pyplot documentation for more.
Edit - it may be worth reading your file in a more efficient way: numpy has a text reader, np.loadtxt, which will save you a bit of time and effort.
Essentially:
x = np.loadtxt(fname, dtype=float, delimiter=None)
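To get frequency on the x-axis, the spectrum has to be paired with the frequencies of the FFT bins. A minimal sketch, assuming the time column is uniformly sampled; the signal below is synthetic and stands in for the columns of SampleData.txt:

```python
import numpy as np

# Synthetic stand-in for the file: uniform time steps and a 5 Hz oscillation
dt = 0.01
t = np.arange(1000) * dt
x = np.sin(2 * np.pi * 5.0 * t)

# rfft returns only the non-negative frequencies of a real signal
spectrum = np.abs(np.fft.rfft(x))
# Frequency of each bin, in cycles per unit of t
freqs = np.fft.rfftfreq(len(x), d=dt)

peak = freqs[np.argmax(spectrum)]
```

plt.plot(freqs, spectrum) then shows the magnitude against frequency. With the real file, dt would be taken from the spacing of the time column, for example t[1] - t[0].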

How to plot a graph with text file (2 columns of data) in Python

I have a text file with lots of data that is arranged in 2 columns. I need to use the data in the 2nd column in a formula (which outputs Energy). I need to plot that energy against the time which is all the data in the first column.
So far I have this, and it prints a very weird graph. I know that the energy should be oscillating and decaying exponentially.
import numpy as np
import matplotlib.pyplot as plt
m = 0.090
l = 0.089
g = 9.81
H = np.loadtxt("AngPosition_3p5cmSeparation.txt")
x, y = np.hsplit(H,2)
Ep = m*g*l*(1-np.cos(y))
plt.plot(x, Ep)
plt.show()
I'm struggling to see where I have gone wrong, but then again I am somewhat new to Python. Any help is much appreciated.
I managed to get it to work. My problem was that the angle data had to be converted into radians.
I couldn't do that automatically in Python using math.radians for some reason so I just edited the data in Excel and then back into Notepad.
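For anyone hitting the same problem: math.radians only accepts a single number, which is likely why it failed on the array, whereas np.radians converts a whole array at once. A minimal sketch using the constants from the question; the sample angles are made up and assumed to be in degrees:

```python
import numpy as np

m = 0.090
l = 0.089
g = 9.81

# Made-up sample angles in degrees, standing in for the second column of the file
angle_deg = np.array([0.0, 90.0, 180.0])

# np.radians works element-wise on arrays
angle_rad = np.radians(angle_deg)
Ep = m * g * l * (1 - np.cos(angle_rad))
```

In the original script this amounts to replacing np.cos(y) with np.cos(np.radians(y)), so no editing in Excel is needed.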
