Geographical data plot/map with lines in python and matplotlib - python

I remember seeing in a blog post a nice technique to visualize geographical data. It was just lines representing latitude, with the height of each line showing the variable of interest. I tried to sketch it in the following picture:
Does any of you remember the library, or even the blog post, that explained how to generate these maps?
(I vaguely remember it being matplotlib & python, but I could very well be wrong)

I think this is the kind of thing you want - plotting lines of constant latitude on a 3d axis. I've explained what each section does in comments
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import itertools
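# read in data from csv organised in columns labelled 'lat','lon','elevation'
# (np.recfromcsv was removed in NumPy 2.0; genfromtxt with names=True also works on older versions)
data = np.genfromtxt('elevation-sample.csv', delimiter=',', names=True)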
# create a 3d axis on a figure
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Find unique (i.e. constant) latitude points
id_list = np.unique(data['lat'])
# stride is how many lines to miss. set to 1 to get every line
# higher to miss more
stride = 5
# Extract each line from the dataset and plot it on the axes
for lat_value in id_list[::stride]:
    this_line_data = data[np.where(data['lat'] == lat_value)]
    lat, lon, ele = zip(*this_line_data)
    ax.plot(lon, lat, ele, color='black')
# set the viewpoint so we're looking straight at the longitude (x) axis
ax.view_init(elev=45., azim=90)
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_zlabel('Elevation')
ax.set_zlim([0,1500])
plt.show()
The data set I used to test is not mine, but I found it on github here.
This gives output as follows:
Note - you can swap latitude and longitude if I've misinterpreted the axis labels in your sketch.

Are you thinking a 3D plot similar to this? Possibly you could also do a cascade plot like this? The code for the last type of plot is something like this:
import matplotlib.pyplot as plt
import numpy as np

# Input parameters:
padding = 1  # Relative distance between plots
ax = plt.gca()  # Matplotlib axes to plot in
spectra = np.random.rand(10, 100)  # Series of Y-data (10 lines of 100 points)
x_data = np.arange(len(spectra[0]))  # X-data
# Figure out distance between plots:
max_value = 0
for spectrum in spectra:
    spectrum_yrange = (np.nanmax(spectrum) -
                       np.nanmin(spectrum))
    if spectrum_yrange > max_value:
        max_value = spectrum_yrange
# Plot the individual lines
for i, spectrum in enumerate(spectra):
    # Normalize the data to max_value
    data = (spectrum - spectrum.min()) / float(max_value)
    # Offset the individual lines
    data += i * padding
    ax.plot(x_data, data)
plt.show()
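To tie the cascade idea back to the original question, here is a minimal sketch that draws one line per latitude row of a gridded elevation array and offsets each line by its latitude. The terrain is synthetic, and the names `ridge_lines`, `elevation`, `lats`, `lons` and `scale` are assumptions made for illustration; they do not come from either answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def ridge_lines(elevation, lats, scale=1.0):
    """Return one y-series per latitude row: latitude + scaled elevation."""
    elevation = np.asarray(elevation, dtype=float)
    peak = np.nanmax(elevation)
    if peak <= 0:
        peak = 1.0  # avoid dividing by zero on flat data
    return np.asarray(lats, dtype=float)[:, None] + scale * elevation / peak


# Synthetic terrain: a single Gaussian hill on a 20 x 200 grid (made-up data)
lons = np.linspace(-10, 10, 200)
lats = np.linspace(40, 50, 20)
lon_grid, lat_grid = np.meshgrid(lons, lats)
elevation = 1500 * np.exp(-((lon_grid / 4) ** 2 + ((lat_grid - 45) / 2) ** 2))

# The highest point on the grid rises `scale` latitude units above its row
lines = ridge_lines(elevation, lats, scale=2.0)

fig, ax = plt.subplots()
for y in lines:
    ax.plot(lons, y, color="black", lw=0.8)
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude (offset by scaled elevation)")
```

Add `fig.savefig(...)` or `plt.show()` at the end to see the result. A larger `scale` makes the relief more pronounced but also makes neighbouring lines overlap more.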

Related

Radial heatmap from similarity matrix in Python

Summary
I have a 2880x2880 similarity matrix (about 8.3 million points). My attempt with Holoviews resulted in a 500 MB HTML file which never finishes "opening". So how do I make a round heatmap of the matrix?
Details
I had data from 10 different places, measured over 1 whole year. The hours of each month were turned into arrays, so each month had 24 arrays (one for all 00:00, one for all 01:00 ... 22:00, 23:00).
These were about 28-31 cells long, and each cell had the measurement of the thing I'm trying to analyze. So there are these 24 arrays for each month of 1 whole year, i.e. 24x12 = 288 arrays per place. And there are measurements from 10 places. So a total of 2880 arrays were created and all compared to each other, and saved in a 2880x2880 matrix with similarity coefficients.
I'm trying to turn it into a radial similarity matrix like the one from holoviews, but without the ticks and tags (since the format Place01Jan0800 would be cumbersome to look at for 2880 rows), just the shape and colors and divisions:
I managed to create the HTML file itself, but it ended up being 500 MB big, so it never shows up when I open it up. It's just blank. I've added a minimal example below of what I have, and replaced the loading of the datafile with some randomly generated data.
import sys
sys.setrecursionlimit(10000)
import random
import numpy as np
import pandas as pd
import holoviews as hv
from holoviews import opts
from bokeh.plotting import show
import gc
# Function creating dummy data for this example
def transformer():
    dimension = 2880
    dummy_matrix = ([[ random.random() for i in range(dimension) ] for j in range(dimension)]) #Fake, similar data
    col_vals = [str(i) for i in range(dimension*dimension)] # Placeholder
    row_vals = [str(i) for i in range(dimension*dimension)] # Placeholder
    val_vals = (np.reshape(np.array(dummy_matrix), -1)).tolist() # Turn matrix into an array
    idx_vals = [i for i in range(dimension*dimension)] # Placeholder
    return idx_vals, val_vals, row_vals, col_vals
idx_arr, val_arr, row_arr, col_arr = transformer()
df = pd.DataFrame({"values": val_arr, "x-label": row_arr, "y-label": col_arr}, index=idx_arr)
hv.extension('bokeh')
heatmap = hv.HeatMap(df, ["x-label", "y-label"])
heatmap.opts(opts.HeatMap(cmap="viridis", radial=True))
gc.collect() # Attempt to save memory, because this thing is huge
show(hv.render(heatmap))
I had a look at datashader to see if it would help, but I have no idea how to plug it in (if it's possible for this case) to this radial heatmap, since it seems like the radial heatmap doesn't have that datashade-feature.
So I have no idea how to tackle this. I would be content with a broad overview too, I don't need the details nor the hover-infobox nor ability to zoom or any fancy extra features, I just need the general overview for a presentation. I'm open to any solution really.
I recommend using a regular heatmap instead of a radial heatmap to show the similarity matrix. The reasons are:
The radial heatmap is designed for periodic variables. The time variable (288 hours) can be considered periodic data; however, I think the 288*10 axis (288 hours, 10 places) is no longer periodic because of the "place" dimension.
Near the center of the radial heatmap, the colored points will be too dense for a human to read.
The following simple code shows a heatmap.
import matplotlib.cm
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import numpy as np
n = 2880
m = 2880
dummy_matrix = np.random.rand(m, n)
fig = plt.figure(figsize=(50,50)) # change the figsize to control the resolution
ax = fig.add_subplot(111)
cmap = plt.get_cmap("Blues") # you may use another built-in colormap or define your own (matplotlib.cm.get_cmap was removed in Matplotlib 3.9)
# if your data is not in the range [0,1], use a normalization. Here it is normalized by the min and max values.
norm = Normalize(vmin=np.amin(dummy_matrix), vmax=np.amax(dummy_matrix))
image = ax.imshow(dummy_matrix, cmap=cmap, norm=norm)
plt.colorbar(image)
plt.show()
Which gives:
Another idea that comes to mind is that computing the similarity matrix may be unnecessary: you could plot the original 288 * 10 data using a radial heatmap or just a normal heatmap, and read the similarity directly from the color distribution.
Plain Matplotlib seems to be able to handle it, based on answers from here: How do I create radial heatmap in matplotlib?
import random
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
fig = plt.figure()
ax = Axes3D(fig)
n = 2880
m = 2880
rad = np.linspace(0, 10, m)
a = np.linspace(0, 2 * np.pi, n)
r, th = np.meshgrid(rad, a)
dummy_matrix = ([[ random.random() for i in range(n) ] for j in range(m)])
plt.subplot(projection="polar")
plt.pcolormesh(th, r, dummy_matrix, cmap = 'Blues')
plt.plot(a, r, ls='none', color = 'k')
plt.grid()
plt.colorbar()
plt.savefig("custom_radial_heatmap.png")
plt.show()
And it didn't even take an eternity, took only about 20 seconds max.
You would think it would turn out monstrous like that
But the sheer amount of points drowns out the jaggedness, WOOHOO!
There's some things left to be desired, like tags and ticks, but I think I'll figure that out.
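For the presentation version without ticks and tags that the question asks for, a minimal sketch could look like the following. It uses a small random matrix as a stand-in for the real 2880 x 2880 one, and the variable names (`matrix`, `theta_edges`, `radius_edges`) are illustrative assumptions rather than part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

n = 200  # small stand-in for the 2880 x 2880 matrix
rng = np.random.default_rng(0)
matrix = rng.random((n, n))

# Cell edges: n + 1 edges in each direction for n x n cells
theta_edges = np.linspace(0, 2 * np.pi, n + 1)
radius_edges = np.linspace(0, 10, n + 1)

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
# Rows of `matrix` run along the radius, columns run around the circle
mesh = ax.pcolormesh(theta_edges, radius_edges, matrix, cmap="Blues")
ax.set_xticks([])  # hide angular ticks and labels
ax.set_yticks([])  # hide radial ticks and labels
ax.grid(False)
fig.colorbar(mesh, ax=ax)
```

Saving with `fig.savefig("radial_heatmap.png", dpi=300)` produces a static image, which avoids the large interactive HTML file entirely.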

Color Coding Scatterplot Based on Defined Names

I have some Excel data that I wanted to graph as a circular scatterplot. The data looks like this:
I began writing the script in Python as follows:
# Initializing
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Importing dataframe and defining main variables
tdtData = pd.read_excel(r'C:\Users\Bri-Guy\Desktop\tdtData.xlsx', sheet_name='Sheet1')
tdtData.head()
probe=tdtData.Probe
area=tdtData.Area
light=tdtData.Lightavg
np.random.seed(19680801)
# Compute areas and colors
N = 234
r = area
theta = 2 * np.pi * np.random.rand(N)
size = light
colors = theta
fig = plt.figure()
ax = fig.add_subplot(projection='polar')
c = ax.scatter(theta, r, c=colors, s=size, cmap='hsv', alpha=0.75)
ax.set_thetamin(0)
ax.set_thetamax(310)
Right now I have it such that the colors change based on the angle parameter theta. However, I want to color the scatterplot based on the specific probes associated with each datapoint (S100b+, S100b-, scn10a, fxyd2, fxyd2/scn10a). It would also be nice to implement a legend that goes with these names.
Additionally, the plot generates with labels for the r-axis. I want to see those values because they help to show the range of the area variable. However, they're only visible if I restrict the range of theta. I'm wondering if there's a better way to go about it? Potentially increase the size of the figure, rotate the r-axis labels, or something else entirely?
Thank you for the help.
*Edit: I just realized that setting a defined range on theta actually obscures part of the data plotted for theta in (310, 360). However, increasing the figure size seems to be enough.
fig.set_size_inches(18.5, 10.5)
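One possible approach, sketched here with made-up data because the spreadsheet is not available: call `scatter` once per probe so that each group gets its own color and legend entry. The column names follow the question, while the generated values and the `theta` column are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the Excel sheet; column names follow the question
rng = np.random.default_rng(19680801)
probes = ["S100b+", "S100b-", "scn10a", "fxyd2", "fxyd2/scn10a"]
n_points = 50
tdtData = pd.DataFrame({
    "Probe": rng.choice(probes, size=n_points),
    "Area": rng.uniform(100, 1000, size=n_points),
    "Lightavg": rng.uniform(10, 100, size=n_points),
})
tdtData["theta"] = 2 * np.pi * rng.random(n_points)

fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(projection="polar")
# One scatter call per probe gives each group its own color and legend entry
for probe_name, group in tdtData.groupby("Probe"):
    ax.scatter(group["theta"], group["Area"], s=group["Lightavg"],
               alpha=0.75, label=probe_name)
ax.set_rlabel_position(90)  # move the radial tick labels to the 90 degree spoke
ax.legend(title="Probe", loc="upper left", bbox_to_anchor=(1.05, 1.0))
```

The colors come from Matplotlib's default color cycle; passing `color=...` in each `scatter` call would let you choose them explicitly. `set_rlabel_position` only moves the radial labels, so the full theta range can stay visible.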

2D Color coded scatter plot with user defined color range and static colormap

I have 3 vectors - x,y,vel each having some 8k values. I also have quite a few files containing these 3 vectors. All the files have different x,y,vel. I want to get multiple scatter plots with the following conditions:
Color coded according to the 3rd variable i.e vel.
Once the ranges have been set for the colors (for the data from the 1st file), they should remain constant for all the remaining files. I don't want the color coding to change dynamically with each new file.
Want to plot a colorbar.
I greatly appreciate all your thoughts!!
I have attached the code for a single file.
import numpy as np
import matplotlib.pyplot as plt
# Create Map
cm = plt.cm.get_cmap('RdYlBu')
x,y,vel = np.loadtxt('finaldata_temp.txt', skiprows=0, unpack=True)
vel = [cm(float(i)/(8000)) for i in xrange(8000)] # 8000 is the no. of values in each of x,y,vel vectors.
# 2D Plot
plt.scatter(x, y, s=27, c=vel, marker='o')
plt.axis('equal')
plt.savefig('testfig.png', dpi=300)
plt.show()
quit()
You will have to iterate over all your data files to get the maximum value for vel, I have added a few lines of code (that need to be adjusted to fit your case) that will do that.
Therefore, your colorbar line has been changed to use the max_vel, allowing you to get rid of that code using the fixed value of 8000.
Additionally, I took the liberty to remove the black edges around the points, because I find that they 'obfuscate' the color of the point.
Lastly, I have adjusted your plot code to use an axis object, which is required to have a colorbar.
import numpy as np
import matplotlib.pyplot as plt
# This is needed to iterate over your data files
import glob
# Loop over all your data files to get the maximum value for 'vel'.
# You will have to adjust this for your code: '<your files>' is a
# placeholder for a glob pattern that matches your data files.
max_vel = 0
for filename in glob.glob('<your files>'):
    _, _, file_vel = np.loadtxt(filename, skiprows=0, unpack=True)
    max_vel = max(max_vel, file_vel.max())
# Create Map
cm = plt.get_cmap('RdYlBu')
x,y,vel = np.loadtxt('finaldata_temp.txt', skiprows=0, unpack=True)
# Plot the data
fig=plt.figure()
fig.patch.set_facecolor('white')
# Here we switch to an axis object
# Additionally, you can plot several of your files in the same figure using
# the subplot option.
ax=fig.add_subplot(111)
# vmin/vmax fix the color range based on the maximum observed value, so it
# stays the same for every file; edgecolor='none' removes the black edges.
s = ax.scatter(x,y,c=vel,cmap=cm,vmin=0,vmax=max_vel,edgecolor='none')
# Here we assign the color bar to the axis object. It takes its colormap and
# range from the scatter plot; for a colorbar that is independent of the
# plotted data, see ColorbarBase (second code snippet).
cb = plt.colorbar(mappable=s,ax=ax)
cb.set_label('Value of \'vel\'')
plt.show()
Snippet, demonstrating ColorbarBase
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
cm = plt.get_cmap('RdYlBu')
x = [1,5,10]
y = [2,6,9]
vel = [7,2,1]
# Fixed color range, shared by the points and the colorbar
norm = mpl.colors.Normalize(vmin=0, vmax=10)
# Plot the data
fig=plt.figure()
fig.patch.set_facecolor('white')
ax=fig.add_subplot(111)
s = ax.scatter(x,y,c=vel,cmap=cm,norm=norm,edgecolor='none')
# Stand-alone colorbar in its own axes, driven only by norm and cmap
ax1 = fig.add_axes([0.95, 0.1, 0.01, 0.8])
cb = mpl.colorbar.ColorbarBase(ax1,norm=norm,cmap=cm,orientation='vertical')
cb.set_label('Value of \'vel\'')
plt.show()
This produces the following plot
For more examples of what you can do with the colorbar, specifically the more flexible ColorbarBase, I would suggest that you check the documentation -> http://matplotlib.org/examples/api/colorbar_only.html
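To illustrate the "set the range once, reuse it for every file" requirement from the question, here is a minimal sketch that builds one `Normalize` object from the first dataset and passes it to every scatter plot. The arrays are randomly generated stand-ins for the `x`, `y`, `vel` vectors loaded from each file, and `datasets` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

# Made-up stand-ins for the (x, y, vel) arrays loaded from each file
rng = np.random.default_rng(1)
datasets = [
    (rng.random(100), rng.random(100), rng.uniform(0, 5, 100)),
    (rng.random(100), rng.random(100), rng.uniform(2, 9, 100)),
]

# Fix the color range from the first dataset and reuse it everywhere
first_vel = datasets[0][2]
norm = Normalize(vmin=first_vel.min(), vmax=first_vel.max())

fig, axes = plt.subplots(1, len(datasets), figsize=(10, 4))
for ax, (x, y, vel) in zip(axes, datasets):
    points = ax.scatter(x, y, c=vel, cmap="RdYlBu", norm=norm, s=27)
    ax.set_aspect("equal")
# One colorbar is enough because every panel shares `norm`
fig.colorbar(points, ax=axes, label="vel")
```

Values in later files that fall outside the range of the first file are drawn in the end colors of the colormap. If that is not acceptable, compute the range over all files first, as the answer above suggests.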

Changing axis options for Polar Plots in Matplotlib/Python

I have a problem changing my axis labels in Matplotlib. I want to change the radial axis options in my Polar Plot.
Basically, I'm computing the distortion of a cylinder, which is nothing but how much the radius deviates from the original (perfectly circular) cylinder. Some of the distortion values are negative, while some are positive due to tensile and compressive forces. I'm looking for a way to represent this in cylindrical coordinates graphically, so I thought that a polar plot was my best bet. Excel gives me a 'radar chart' option which is flexible enough to let me specify minimum and maximum radial axis values. I want to replicate this on Python using Matplotlib.
My Python script for plotting on polar coordinates is as follows.
#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-180.0,190.0,10)
theta = (np.pi/180.0 )*x # in radians
offset = 2.0
R1 = [-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358]
fig1 = plt.figure()
ax1 = fig1.add_axes([0.1,0.1,0.8,0.8],polar=True)
ax1.set_rmax(1)
ax1.plot(theta,R1,lw=2.5)
My plot looks as follows:
But this is not how I want to present it. I want to vary my radial axis, so that I can show the data as a deviation from some reference value, say -2. How do I ask Matplotlib in polar coordinates to change the minimum axis label? I can do this VERY easily in Excel. I choose a minimum radial value of -2, to get the following Excel radar chart:
On Python, I can easily offset my input data by a magnitude of 2. My new dataset is called R2, as shown:
#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-180.0,190.0,10)
theta = (np.pi/180.0 )*x # in radians
offset = 2.0
R2 = [1.642,1.517,1.521,1.654,1.879,2.137,2.358,2.483,2.479,2.346,2.121,1.863,\
1.642,1.517,1.521,1.654,1.879,2.137,2.358,2.483,2.479,2.346,2.121,1.863,1.642,\
1.517,1.521,1.654,1.879,2.137,2.358,2.483,2.479,2.346,2.121,1.863,1.642]
fig2 = plt.figure()
ax2 = fig2.add_axes([0.1,0.1,0.8,0.8],polar=True)
ax2.plot(theta,R2,lw=2.5)
ax2.set_rmax(1.5*offset)
plt.show()
The plot is shown below:
Once I get this, I can MANUALLY add axis labels and hard-code it into my script. But this is a really ugly way. Is there any way I can directly get a Matplotlib equivalent of the Excel radar chart and change my axis labels without having to manipulate my input data?
You can just use the normal way of setting axis limits:
#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(-180.0,190.0,10)
theta = (np.pi/180.0 )*x # in radians
offset = 2.0
R1 = [-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358,-0.483,-0.479,-0.346,-0.121,0.137,0.358,0.483,0.479,0.346,0.121,\
-0.137,-0.358]
fig1 = plt.figure()
ax1 = fig1.add_axes([0.1,0.1,0.8,0.8],polar=True)
ax1.set_ylim(-2,2)
ax1.set_yticks(np.arange(-2,2,0.5))
ax1.plot(theta,R1,lw=2.5)

Overlaying a lineCollection on a plot in matplotlib - how to get the two to line up.

I'm trying to do a heat map over a shape file in python. I need to make quite a few of these so don't want to read in the .shp every time.
Instead, I thought I could create a lineCollection instance of the map boundaries and overlay the two images. Problem is - I can't seem to get the two to line up correctly.
Here is the code, where linecol is the lineCollection object.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.contourf(xi,yi,zi)
ax.add_collection(linecol, autolim = False)
plt.show()
Is there an easy way to fix the limits of linecol to match those of the other plot? I've had a play with set_xlim and transforms.Bbox, but can't seem to manage it.
Thank you very much for your help!
Transforms are tricky because of the various coordinate systems involved. See http://matplotlib.sourceforge.net/users/transforms_tutorial.html.
I managed to scale a LineCollection to the appropriate size like this. The key was to realize that I needed to add + ax.transData to the new transform I set on the LineCollection. (When you don't set any transform on an artist object, ax.transData is the default. It converts data coordinates into display coordinates.)
from matplotlib import cm
import matplotlib.pyplot as plt
import matplotlib.collections as mc
import matplotlib.transforms as tx
import numpy as np
fig = plt.figure()
# Heat map spans 1 x 1.
ax = fig.add_subplot(111)
xs = ys = np.arange(0, 1.01, 0.01)
zs = np.random.random((101,101))
ax.contourf(xs, ys, zs, cmap=cm.autumn)
lines = mc.LineCollection([[(5,1), (9,5), (5,9), (1,5), (5,1)]])
# Shape spans 10 x 10. Resize it to 1 x 1 before applying the transform from
# data coords to display coords.
trans = tx.Affine2D().scale(0.1) + ax.transData
lines.set_transform(trans)
ax.add_collection(lines)
plt.show()
(Output here: http://i.stack.imgur.com/hDNN8.png Not enough reputation to post inline.)
It should be easy to modify this if you need the shape translated or scaled unequally on x and y.
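As a rough sketch of that modification, `Affine2D.scale` accepts separate x and y factors and `translate` shifts the result, both applied before `ax.transData`. The scale factors, the offset and the name `shape_to_data` below are arbitrary values chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.collections as mc
import matplotlib.transforms as tx

fig, ax = plt.subplots()
ax.set_xlim(0, 2)
ax.set_ylim(0, 2)

# Shape drawn in its own 10 x 10 coordinate system
shape = [[(5, 1), (9, 5), (5, 9), (1, 5), (5, 1)]]
lines = mc.LineCollection(shape)

# Scale x by 0.1 and y by 0.2, then shift by (0.5, 0.0) in data units
shape_to_data = tx.Affine2D().scale(0.1, 0.2).translate(0.5, 0.0)
lines.set_transform(shape_to_data + ax.transData)
ax.add_collection(lines)

# Where the first vertex (5, 1) lands in data coordinates
mapped = shape_to_data.transform([(5, 1)])[0]
```

The operations are applied in the order they are chained, so the shape is scaled first and translated afterwards. Checking a known vertex with `transform`, as in the last line, is a quick way to confirm the overlay will line up with the heat map.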
