I'm trying to use the Basemap function to create a plot like the one shown here, but using this data.
This is my code:
west, south, east, north = -74.26, 40.50, -73.70, 40.92
fig = plt.figure(figsize=(14,10))
m = Basemap(projection='merc', llcrnrlat=south, urcrnrlat=north,
llcrnrlon=west, urcrnrlon=east, lat_ts=south, resolution='c')
x, y = m(df['pickup_longitude'].values, df['pickup_latitude'].values)
m.hexbin(x, y, gridsize=1900, cmap=cm.YlOrRd_r)
However, the resulting plot looks wrong.
I'm wondering what I'm missing.
Thanks.
It seems the data cover a much larger area than the range shown in the Basemap plot, so most of the hexagons fall outside the map.
You will get the desired plot by using a lot more gridpoints, e.g. gridsize=10000. This will however cost a lot of memory.
A better option would probably be to first select from the dataframe those values that are in the range to be shown in the map.
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from matplotlib import cm
df = pd.read_csv("train.csv")
west, south, east, north = -74.26, 40.50, -73.70, 40.92
df = df[(df['pickup_longitude'] > west) & (df['pickup_longitude'] < east)]
df = df[(df['pickup_latitude'] > south) & (df['pickup_latitude'] < north)]
fig = plt.figure(figsize=(14,8))
m = Basemap(projection='merc', llcrnrlat=south, urcrnrlat=north,
llcrnrlon=west, urcrnrlon=east, lat_ts=south, resolution='c')
x, y = m(df['pickup_longitude'].values, df['pickup_latitude'].values)
m.hexbin(x, y, gridsize=100, bins='log', cmap=cm.YlOrRd_r, lw=0.4)
plt.show()
Using more gridpoints then allows for even finer resolution, e.g. gridsize=1000:
I am trying to plot NASA GISS gridded temperature data but my maps keep showing up blank. Below is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import geopandas as gpd
import xarray as xr
ncin = xr.open_dataset('GriddedAir250.nc')
lons = ncin.variables['lon'][:]
lats = ncin.variables['lat'][:]
air = ncin.air
MeanTmax=air.mean(dim='time')
m=Basemap(projection='merc',
llcrnrlon= -123.416059,
llcrnrlat=18.954443,
urcrnrlon=-61.285950,
urcrnrlat= 47.536340,
resolution='i')
lon, lat = np.meshgrid(lons, lats)
xi, yi = m(lon, lat)
# Add Coastlines, States, and Country Boundaries
m.drawcoastlines()
m.drawstates()
m.drawcountries()
# Plot Data
cs = m.pcolor(xi,yi,np.squeeze(MeanTmax))
# Add Colorbar
cbar = m.colorbar(cs, location='bottom', pad="10%")
cbar.set_label('winter')
# Add Title
plt.title('DJF Maximum Temperature')
plt.show()
All I get is a blank map that looks like this. Why isn't the temperature data showing up?
The longitude grid in the source data is from 0 to 360 rather than -180 to 180. Because of this, it's likely that you've filtered out all of the data in your basemap projection command. I haven't tested because I don't have the deprecated basemap package.
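A minimal sketch of that longitude fix, assuming the grid really does run from 0 to 360 and that longitude is the last axis of the data array. The helper name wrap_longitudes is made up for illustration and uses only NumPy; like the answer above, it has not been tested against Basemap itself.

```python
import numpy as np

def wrap_longitudes(lons, data):
    """Convert 0..360 longitudes to -180..180 and reorder data to match.

    Hypothetical helper: assumes longitude is the last axis of `data`.
    """
    lons = np.asarray(lons, dtype=float)
    # Shift into the -180..180 range.
    wrapped = ((lons + 180.0) % 360.0) - 180.0
    # Sort so the longitudes increase monotonically again.
    order = np.argsort(wrapped)
    return wrapped[order], np.asarray(data)[..., order]

# Tiny illustrative grid: 2 latitudes x 4 longitudes.
lons = np.array([0.0, 90.0, 180.0, 270.0])
data = np.arange(8).reshape(2, 4)
new_lons, new_data = wrap_longitudes(lons, data)
print(new_lons)
print(new_data)
```

The wrapped longitudes and the reordered array could then be passed to np.meshgrid and m.pcolor in place of the originals.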
I'm trying to plot data onto a map. I would like to generate data for specific points on the map (e.g. transit times to one or more prespecified location) for a specific city.
I found data for New York City here: https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm
It looks like they have a shapefile available. I'm wondering if there is a way to sample a latitude-longitude grid within the bounds of the shapefile for each borough (perhaps using Shapely package, etc).
Sorry if this is naive, I'm not very familiar with working with these files--I'm doing this as a fun project to learn about them
I figured out how to do this. Essentially, I just created a full grid of points and then removed those that did not fall within the shape files corresponding to the boroughs. Here is the code:
import geopandas
from geopandas import GeoDataFrame, GeoSeries
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import matplotlib.cm as cm
%matplotlib inline
import seaborn as sns
from shapely.geometry import Point, Polygon
import numpy as np
import googlemaps
from datetime import datetime
plt.rcParams["figure.figsize"] = [8,6]
# Get the shape-file for NYC
boros = GeoDataFrame.from_file('./Borough Boundaries/geo_export_b641af01-6163-4293-8b3b-e17ca659ed08.shp')
boros = boros.set_index('boro_code')
boros = boros.sort_index()
# Plot and color by borough
boros.plot(column = 'boro_name')
# Get rid of areas that you aren't interested in (too far away)
plt.gca().set_xlim([-74.05, -73.85])
plt.gca().set_ylim([40.65, 40.9])
# make a grid of latitude-longitude values
xmin, xmax, ymin, ymax = -74.05, -73.85, 40.65, 40.9
xx, yy = np.meshgrid(np.linspace(xmin,xmax,100), np.linspace(ymin,ymax,100))
xc = xx.flatten()
yc = yy.flatten()
# Now convert these points to geo-data
pts = GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
in_map = np.array([pts.within(geom) for geom in boros.geometry]).sum(axis=0)
pts = GeoSeries([val for pos,val in enumerate(pts) if in_map[pos]])
# Plot to make sure it makes sense:
pts.plot(markersize=1)
# Now get the lat-long coordinates in a dataframe
coords = []
for point in pts:
    # Store each point as a "lat,lon" string (shapely keeps x=lon, y=lat)
    coords.append('{},{}'.format(point.y, point.x))
which results in the following plots:
I also got a matrix of lat-long coordinates I used to make a transport-time map for every point in the city to Columbia Medical Campus. Here is that map:
and a zoomed-up version so you can see how the map is made up of the individual points:
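The grid-then-filter idea from this answer can also be sketched without the shapefile. This is only an illustration under stated assumptions: the square outline is a made-up stand-in for a borough polygon, and matplotlib.path.Path.contains_points is swapped in for the GeoSeries.within call used above.

```python
import numpy as np
from matplotlib.path import Path

# Hypothetical stand-in for one borough outline: a closed square.
outline = Path([(-0.25, -0.25), (1.25, -0.25), (1.25, 1.25),
                (-0.25, 1.25), (-0.25, -0.25)])

# Full regular grid over the bounding area, flattened to an (N, 2) array.
xx, yy = np.meshgrid(np.linspace(-0.5, 1.5, 5), np.linspace(-0.5, 1.5, 5))
grid = np.column_stack([xx.ravel(), yy.ravel()])

# Keep only the grid points that fall inside the outline.
inside = outline.contains_points(grid)
kept = grid[inside]
print(len(grid), "grid points,", len(kept), "inside the outline")
```

With real borough geometries the shapely/geopandas version above is the more natural choice, since it handles multi-part polygons and holes.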
How can I rotate a seaborn.lineplot so that the result is drawn as a function of y rather than a function of x?
For example, this code:
import pandas as pd
import seaborn as sns
df = pd.DataFrame([[0,1],[0,2],[0,1.5],[1,1],[1,5]], columns=['group','val'])
sns.lineplot(x='group',y='val',data=df)
creates this figure:
But is there a way to rotate the figure by 90°, so that "val" is on the x-axis and "group" is on the y-axis, and the std band runs from left to right rather than from bottom to top?
Thanks
EDIT: I've opened a ticket in seaborn to ask for this feature: https://github.com/mwaskom/seaborn/issues/1661
Per the seaborn docs on lineplot, the dataframe passed to data must be
Tidy (“long-form”) dataframe where each column is a variable and each row is an observation.
This seems to imply there is no way to force the axes to switch, even by manipulating the data. If there is a way to do that, I haven't found it. There is probably a more elegant approach, but one option is to do it by hand, so to speak. Something like this would do the trick:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
df = pd.DataFrame([[0,1],[0,2],[0,1.5],[1,1],[1,5]], columns=['group','val'])
group = df['group'].tolist()
val = df['val'].tolist()
yl = list()
yu = list()
avg = list()
ii = 0
while ii < len(group): #Loop through all the groups
    g = group[ii]
    y0 = val[ii]
    y1 = val[ii]
    s = 0
    jj = ii
    while (jj < len(group) and group[jj] == g):
        s += val[jj]
        #This takes the min and max, but could easily take the standard deviation
        if val[jj] > y1:
            y1 = val[jj]
        if val[jj] < y0:
            y0 = val[jj]
        jj += 1
    avg.append(s/(jj - ii))
    ii = jj
    yl.append(y0)
    yu.append(y1)
x = np.linspace(min(group), max(group), len(yl))
plt.ylabel(df.columns[0])
plt.xlabel(df.columns[1])
plt.plot(avg, x, color="#5a9edd", linestyle="-", linewidth=1.5)
plt.fill_betweenx(x, yl, yu, alpha=0.3)
This will give you the following plot:
For brevity this uses the minimum and maximum from each group to give the error band, but that can be easily changed to standard error or standard deviation as needed.
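As a rough sketch of that change, the band limits could be computed with pandas instead of the hand-written loop. This assumes the standard error of the mean is the band you want; the names stats, lower, upper and groups are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [0, 2], [0, 1.5], [1, 1], [1, 5]],
                  columns=['group', 'val'])

# Per-group mean and standard error of the mean (pandas uses ddof=1 by default).
stats = df.groupby('group')['val'].agg(['mean', 'sem'])

# Band limits, one value per group, ordered by the group index.
lower = (stats['mean'] - stats['sem']).to_numpy()
upper = (stats['mean'] + stats['sem']).to_numpy()
groups = stats.index.to_numpy()
print(stats)
```

These arrays can replace yl, yu and x in the plt.fill_betweenx call, with stats['mean'] taking the place of avg.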
Consider what you'd do if not using seaborn. You would calculate the mean and standard deviation and plot those as a function of the group. Now it is quite straightforward to exchange x and y for a plot(x,y): plot(y,x). For the filled region, you can use fill_betweenx instead of fill_between.
Below are the two cases for comparison.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame([[0,1],[0,2],[0,1.5],[1,1],[1,5]], columns=['group','val'])
mean = df.groupby("group").mean()
std = df.groupby("group").std()
fig, (ax, ax2) = plt.subplots(ncols=2)
ax.plot(mean.index, mean["val"].values)
ax.fill_between(mean.index, (mean-std)["val"].values, (mean+std)["val"].values, alpha=.5)
ax.set(xlabel="group", ylabel="val")
ax2.plot(mean["val"].values, mean.index)
ax2.fill_betweenx(mean.index, (mean-std)["val"].values, (mean+std)["val"].values, alpha=.5)
ax2.set(ylabel="group", xlabel="val")
fig.tight_layout()
plt.show()
I am trying to plot 1x1 degree data on a matplotlib.Basemap, and I want to fill the ocean with white. However, in order for the boundaries of the ocean to follow the coastlines drawn by matplotlib, the resolution of the white ocean mask should be much higher than the resolution of my data.
After searching around for a long time I tried the two possible solutions:
(1) maskoceans() and is_land() functions, but since my data is lower resolution than the map drawn by basemap it does not look good on the edges. I do not want to interpolate my data to higher resolution either.
(2) m.drawlsmask(), but since zorder cannot be assigned the pcolormesh plot always overlays the mask.
This code
import numpy as np
import matplotlib.pyplot as plt
import mpl_toolkits.basemap as bm
#Make data
lon = np.arange(0,360,1)
lat = np.arange(-90,91,1)
data = np.random.rand(len(lat),len(lon))
#Draw map
plt.figure()
m = bm.Basemap(resolution='i',projection='laea', width=1500000, height=2900000, lat_ts=60, lat_0=72, lon_0=319)
m.drawcoastlines(linewidth=1, color='white')
data, lon = bm.addcyclic(data,lon)
x,y = m(*np.meshgrid(lon,lat))
plt.pcolormesh(x,y,data)
plt.savefig('1.png',dpi=300)
Produces this image:
Adding m.fillcontinents(color='white') produces the following image, which is what I need, except that it should fill the ocean rather than the land.
Edit:
m.drawmapboundary(fill_color='lightblue') also fills over land and can therefore not be used.
The desired outcome is that the oceans are white, while what I plotted with plt.pcolormesh(x,y,data) shows up over the lands.
I found a much nicer solution to the problem which uses the polygons defined by the coastlines in the map to produce a matplotlib.PathPatch that overlays the ocean areas. This solution has a much better resolution and is much faster:
from matplotlib import pyplot as plt
from mpl_toolkits import basemap as bm
from matplotlib import colors
import numpy as np
import numpy.ma as ma
from matplotlib.path import Path
from matplotlib.patches import PathPatch
fig, ax = plt.subplots()
lon_0 = 319
lat_0 = 72
##some fake data
lons = np.linspace(lon_0-60,lon_0+60,10)
lats = np.linspace(lat_0-15,lat_0+15,5)
lon, lat = np.meshgrid(lons,lats)
TOPO = np.sin(np.pi*lon/180)*np.exp(lat/90)
m = bm.Basemap(resolution='i',projection='laea', width=1500000, height=2900000, lat_ts=60, lat_0=lat_0, lon_0=lon_0, ax = ax)
m.drawcoastlines(linewidth=0.5)
x,y = m(lon,lat)
pcol = ax.pcolormesh(x,y,TOPO)
##getting the limits of the map:
x0,x1 = ax.get_xlim()
y0,y1 = ax.get_ylim()
map_edges = np.array([[x0,y0],[x1,y0],[x1,y1],[x0,y1]])
##getting all polygons used to draw the coastlines of the map
polys = [p.boundary for p in m.landpolygons]
##combining with map edges
polys = [map_edges]+polys[:]
##creating a PathPatch
codes = [
    [Path.MOVETO] + [Path.LINETO for _ in p[1:]]
    for p in polys
]
polys_lin = [v for p in polys for v in p]
codes_lin = [c for cs in codes for c in cs]
path = Path(polys_lin, codes_lin)
patch = PathPatch(path,facecolor='white', lw=0)
##masking the data:
ax.add_patch(patch)
plt.show()
The output looks like this:
Original solution:
You can use an array with greater resolution in basemap.maskoceans, such that the resolution fits the continent outlines. Afterwards, you can just invert the mask and plot the masked array on top of your data.
Somehow I only got basemap.maskoceans to work when I used the full range of the map (e.g. longitudes from -180 to 180 and latitudes from -90 to 90). Given that one needs quite a high resolution to make it look nice, the computation takes a while:
from matplotlib import pyplot as plt
from mpl_toolkits import basemap as bm
from matplotlib import colors
import numpy as np
import numpy.ma as ma
fig, ax = plt.subplots()
lon_0 = 319
lat_0 = 72
##some fake data
lons = np.linspace(lon_0-60,lon_0+60,10)
lats = np.linspace(lat_0-15,lat_0+15,5)
lon, lat = np.meshgrid(lons,lats)
TOPO = np.sin(np.pi*lon/180)*np.exp(lat/90)
m = bm.Basemap(resolution='i',projection='laea', width=1500000, height=2900000, lat_ts=60, lat_0=lat_0, lon_0=lon_0, ax = ax)
m.drawcoastlines(linewidth=0.5)
x,y = m(lon,lat)
pcol = ax.pcolormesh(x,y,TOPO)
##producing a mask -- seems to only work with full coordinate limits
lons2 = np.linspace(-180,180,10000)
lats2 = np.linspace(-90,90,5000)
lon2, lat2 = np.meshgrid(lons2,lats2)
x2,y2 = m(lon2,lat2)
pseudo_data = np.ones_like(lon2)
masked = bm.maskoceans(lon2,lat2,pseudo_data)
masked.mask = ~masked.mask
##plotting the mask
cmap = colors.ListedColormap(['w'])
pcol = ax.pcolormesh(x2,y2,masked, cmap=cmap)
plt.show()
The result looks like this:
I would like to plot a trajectory on a Basemap, and have country labels (names) shown as an overlay.
Here is the current code and the map it produces:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
path = "path\\to\\data"
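# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
animal_data = pd.read_csv(path, header=None, index_col=0, parse_dates=True)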
animal_data.columns = ["date", "time", "gps_lat", "gps_long"]
# data cleaning omitted for clarity
params = {
'projection':'merc',
'lat_0':animal_data.gps_lat.mean(),
'lon_0':animal_data.gps_long.mean(),
'resolution':'h',
'area_thresh':0.1,
'llcrnrlon':animal_data.gps_long.min()-10,
'llcrnrlat':animal_data.gps_lat.min()-10,
'urcrnrlon':animal_data.gps_long.max()+10,
'urcrnrlat':animal_data.gps_lat.max()+10
}
map = Basemap(**params)
map.drawcoastlines()
map.drawcountries()
map.fillcontinents(color = 'coral')
map.drawmapboundary()
x, y = map(animal_data.gps_long.values, animal_data.gps_lat.values)
map.plot(x, y, 'b-', linewidth=1)
plt.show()
This results in the map:
This is a map of the trajectory of a migrating bird. While this is a very nice map (!), I need country-name labels so it is easy to determine the countries the bird is flying through.
Is there a straightforward way of adding the country names?
My solution relies on an external data file that may or may not be available in the future. However, similar data can be found elsewhere, so that should not be too much of a problem.
First, the code for printing the country-name labels:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
class MyBasemap(Basemap):
    def printcountries(self, d=3, max_len=12):
        data = pd.io.parsers.read_csv("http://opengeocode.org/cude/download.php?file=/home/fashions/public_html/opengeocode.org/download/cow.txt",
                                      sep=";", skiprows=28 )
        data = data[(data.latitude > self.llcrnrlat+d) & (data.latitude < self.urcrnrlat-d) & (data.longitude > self.llcrnrlon+d) & (data.longitude < self.urcrnrlon-d)]
        for ix, country in data.iterrows():
            plt.text(*self(country.longitude, country.latitude), s=country.BGN_name[:max_len])
All this does is to download a country-location database from here, then select countries that are currently on the map, and label them.
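The selection step can be tried on its own without downloading anything. This is a small sketch under stated assumptions: the function name countries_in_view and the three rows are made up for illustration (they are not real country locations), and only the column names latitude, longitude and BGN_name are taken from the code above.

```python
import pandas as pd

def countries_in_view(data, llcrnrlon, llcrnrlat, urcrnrlon, urcrnrlat, d=3):
    """Keep rows whose location lies inside the map corners, shrunk by a margin d (degrees)."""
    inside = (
        (data.latitude > llcrnrlat + d) & (data.latitude < urcrnrlat - d)
        & (data.longitude > llcrnrlon + d) & (data.longitude < urcrnrlon - d)
    )
    return data[inside]

# Made-up label positions, for illustration only.
centroids = pd.DataFrame({
    "BGN_name": ["Alpha", "Beta", "Gamma"],
    "latitude": [10.0, 28.5, 50.0],
    "longitude": [5.0, 14.0, 8.0],
})

# Map corners: longitude 0..20, latitude 0..30.
visible = countries_in_view(centroids, llcrnrlon=0, llcrnrlat=0,
                            urcrnrlon=20, urcrnrlat=30)
print(visible.BGN_name.tolist())
```

Here "Beta" lies on the map but within the margin d of the top edge, so it is dropped along with "Gamma", which is off the map entirely. The margin keeps labels from being drawn right at the border.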
The complete code:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
class MyBasemap(Basemap):
    def printcountries(self, d=3, max_len=12):
        data = pd.io.parsers.read_csv("http://opengeocode.org/cude/download.php?file=/home/fashions/public_html/opengeocode.org/download/cow.txt",
                                      sep=";", skiprows=28 )
        data = data[(data.latitude > self.llcrnrlat+d) & (data.latitude < self.urcrnrlat-d) & (data.longitude > self.llcrnrlon+d) & (data.longitude < self.urcrnrlon-d)]
        for ix, country in data.iterrows():
            plt.text(*self(country.longitude, country.latitude), s=country.BGN_name[:max_len])
path = "path\\to\\data"
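# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments matches its defaults
animal_data = pd.read_csv(path, header=None, index_col=0, parse_dates=True)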
animal_data.columns = ["date", "time", "gps_lat", "gps_long"]
params = {
'projection':'merc',
'lat_0':animal_data.gps_lat.mean(),
'lon_0':animal_data.gps_long.mean(),
'resolution':'h',
'area_thresh':0.1,
'llcrnrlon':animal_data.gps_long.min()-10,
'llcrnrlat':animal_data.gps_lat.min()-10,
'urcrnrlon':animal_data.gps_long.max()+10,
'urcrnrlat':animal_data.gps_lat.max()+10
}
plt.figure()
map = MyBasemap(**params)
map.drawcoastlines()
map.fillcontinents(color = 'coral')
map.drawmapboundary()
map.drawcountries()
map.printcountries()
x, y = map(animal_data.gps_long.values, animal_data.gps_lat.values)
map.plot(x, y, 'b-', linewidth=1)
plt.show()
and finally, the result:
Clearly this isn't as carefully labeled as one might hope, and some heuristics regarding country size, name length and map size should be implemented to make this perfect, but this is a good starting point.