Generate grid of latitude-longitude coordinates that fall within polygon - python

I'm trying to plot data onto a map. I would like to generate data for specific points on the map (e.g. transit times to one or more prespecified locations) for a specific city.
I found data for New York City here: https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm
It looks like they have a shapefile available. I'm wondering if there is a way to sample a latitude-longitude grid within the bounds of the shapefile for each borough (perhaps using Shapely package, etc).
Sorry if this is naive, I'm not very familiar with working with these files--I'm doing this as a fun project to learn about them

I figured out how to do this. Essentially, I just created a full grid of points and then removed those that did not fall within the shape files corresponding to the boroughs. Here is the code:
import geopandas
from geopandas import GeoDataFrame, GeoSeries
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import matplotlib.cm as cm
%matplotlib inline
import seaborn as sns
from shapely.geometry import Point, Polygon
import numpy as np
import googlemaps
from datetime import datetime
plt.rcParams["figure.figsize"] = [8,6]
# Get the shape-file for NYC
boros = GeoDataFrame.from_file('./Borough Boundaries/geo_export_b641af01-6163-4293-8b3b-e17ca659ed08.shp')
boros = boros.set_index('boro_code')
boros = boros.sort_index()
# Plot and color by borough
boros.plot(column = 'boro_name')
# Get rid of areas that you aren't interested in (too far away)
plt.gca().set_xlim([-74.05, -73.85])
plt.gca().set_ylim([40.65, 40.9])
# make a grid of latitude-longitude values
xmin, xmax, ymin, ymax = -74.05, -73.85, 40.65, 40.9
xx, yy = np.meshgrid(np.linspace(xmin,xmax,100), np.linspace(ymin,ymax,100))
xc = xx.flatten()
yc = yy.flatten()
# Now convert these points to geo-data
pts = GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
in_map = np.array([pts.within(geom) for geom in boros.geometry]).sum(axis=0)
pts = GeoSeries([val for pos,val in enumerate(pts) if in_map[pos]])
# Plot to make sure it makes sense:
pts.plot(markersize=1)
# Now get the lat-long coordinates in a dataframe
coords = []
for n, point in enumerate(pts):
    # str(point) is WKT such as 'POINT (x y)'; reverse each pair to get 'lat,lon'
    coords += [','.join(__ for __ in _.strip().split(' ')[::-1]) for _ in str(point).split('(')[1].split(')')[0].split(',')]
which results in the following plots:
I also got a matrix of lat-long coordinates I used to make a transport-time map for every point in the city to Columbia Medical Campus. Here is that map:
and a zoomed-up version so you can see how the map is made up of the individual points:
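For anyone who wants to try the "full grid, then filter" idea without downloading the shapefile, here is a minimal sketch. It swaps in a plain NumPy even-odd (ray casting) test in place of GeoSeries.within, and the triangle vertices are made up for illustration, so treat it as a sketch of the idea rather than the code used above.

```python
import numpy as np

def points_in_polygon(x, y, poly):
    """Even-odd (ray casting) test; poly is an (N, 2) array of vertices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Edges that straddle the horizontal line through each point
        straddles = (y1 > y) != (y2 > y)
        with np.errstate(divide="ignore", invalid="ignore"):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Toggle "inside" for every edge crossed to the right of the point
            inside ^= straddles & (x < x_cross)
    return inside

# Hypothetical polygon (a triangle), standing in for a borough boundary
poly = np.array([[0.0, 0.0], [4.2, 0.0], [0.0, 4.2]])

# Full grid over the bounding region, flattened as in the answer above
xx, yy = np.meshgrid(np.linspace(0.5, 3.5, 4), np.linspace(0.5, 3.5, 4))
xc, yc = xx.flatten(), yy.flatten()

# Keep only the grid points that fall inside the polygon
inside = points_in_polygon(xc, yc, poly)
grid_inside = np.column_stack([xc[inside], yc[inside]])
print(grid_inside.shape)
```

Points lying exactly on an edge are not handled specially here, so for real boundaries a geometry library is the safer choice.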

Related

How to get the intersection of 2 lines in a plot?

I would like to determine the intersection of two Matplotlib plots.
The input data for the first plot is stored in a CSV file that looks like this:
Time;Channel A;Channel B;Channel C;Channel D
(s);(mV);(mV);(mV);(mV)
0,00000000;-16,28006000;2,31961900;13,29508000;-0,98889020
0,00010000;-16,28006000;1,37345900;12,59309000;-1,34293700
0,00020000;-16,16408000;1,49554400;12,47711000;-1,92894600
0,00030000;-17,10414000;1,25747800;28,77549000;-1,57489900
0,00040000;-16,98205000;1,72750600;6,73299900;0,54327920
0,00050000;-16,28006000;2,31961900;12,47711000;-0,51886220
0,00060000;-16,39604000;2,31961900;12,47711000;0,54327920
0,00070000;-16,39604000;2,19753400;12,00708000;-0,04883409
0,00080000;-17,33610000;7,74020200;16,57917000;-0,28079600
0,00090000;-16,98205000;2,31961900;9,66304500;1,48333500
This is a shortened version of the CSV file; the original has a lot more data.
I got this code so far to get the FFT of Channel D:
import matplotlib.pyplot as plt
import pandas as pd
from numpy.fft import rfft, rfftfreq
a=pd.read_csv('20210629-0007.csv', sep = ';', skiprows=[1,2],usecols = [4],dtype=float, decimal=',')
dt = 1/10000
#print(a.head())
n=len(a)
#time increment in each data
acc=a.values.flatten() #to convert DataFrame to 1D array
#acc value must be in numpy array format for half way mirror calculation
fft=rfft(acc)*dt
freq=rfftfreq(n,d=dt)
FFT=abs(fft)
plt.plot(freq,FFT)
plt.axvline(x=150, color = 'red')
plt.show()
Does anybody know how to get the intersection of those 2 plots ( red line and blue line at the same frequency ) ?
I would be very grateful for any help!
manually
This is not really a programming question, rather basic mathematics.
Here is your plot:
Let's call (x1,y1) and (x2,y2) the first two points of your blue line and (x,y) the coordinates of the intersection.
You have this relationship between the points: (x-x1)/(x2-x1) = (y-y1)/(y2-y1)
Thus: y=y1+(x-x1)*(y2-y1)/(x2-x1)
With x1 = freq[0] = 0 and x = 150 (provided 150 lies between freq[0] and freq[1]), this gives FFT[0]+(150-0)*(FFT[1]-FFT[0])/(freq[1]-freq[0])
Coordinates of the intersection are (150, 0.000189)
programmatically
You can use the pd.Series.interpolate method
import numpy as np
import pandas as pd
np.random.seed(0)
s = pd.Series(np.random.randint(0,100,20),
index=sorted(np.random.choice(range(100), 20))).sort_index()
ax = s.plot()
ax.axvline(35, color='r')
s.loc[35] = np.nan  # np.NaN was removed in NumPy 2.0
ax.plot(35, s.sort_index().interpolate(method='index').loc[35], marker='o')
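If the data are already in two arrays, as freq and FFT are in the question, the same linear interpolation can be done with np.interp. This is a minimal sketch; the sample values below are made up and only stand in for the real spectrum.

```python
import numpy as np

# Hypothetical spectrum samples standing in for `freq` and `FFT`
freq = np.array([0.0, 100.0, 200.0, 300.0])
amplitude = np.array([1.0, 3.0, 2.0, 0.5])

x_line = 150.0  # position of the vertical line

# np.interp expects the x samples (freq) to be increasing,
# which is the case for the output of rfftfreq
y_at_line = float(np.interp(x_line, freq, amplitude))

# (x_line, y_at_line) is where the vertical line meets the curve
print((x_line, y_at_line))
```

Unlike the manual formula, this picks the correct pair of neighbouring samples automatically, so it also works when the line is not between the first two points.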

Plotting coordinates with Matplotlib is distorting the base-map

I am trying to show a spatial distribution of shops on a map using Geopandas and Matplotlib.
Problem:
When I plot the pins, the base map gets distorted. Here is a sample before plotting the pins and after.
Question:
What is the source of this distortion? How can I prevent it?
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
# Creating the simplified polygon
latitude = [60.41125, 59.99236, 59.99236]
longitude = [24.66917, 24.66917, 25.36972]
geometry = Polygon(zip(longitude, latitude))
polygon = gpd.GeoDataFrame(index=[0], crs = 'epsg:4326', geometry=[geometry])
# plotting the basemap
ax = polygon.plot(color="#3791CB")
# Dict of sample coordinates
coordinates = {"latitude": ["60.193141", "60.292777", "60.175053", "60.163187", "60.245272", "60.154392", "60.182906"],
"longitude": ["24.934214", "24.969730", "24.831068", "24.739044", "24.860983", "24.884773", "24.959175"]}
# Creating a dataframe from coordinates
df = pd.DataFrame(coordinates)
# Creating the GeoDataFrame
shops = gpd.GeoDataFrame(coordinates, geometry=gpd.points_from_xy(df.longitude, df.latitude))
# Plotting office coordinates
shops.plot(ax=ax, color="red", markersize = 20, zorder=2)
# adding grid
plt.grid(linestyle=":", color='grey')
plt.show()
Thank you!
Your map and pins have different reference systems.
When you create your first GeoDataFrame you specify its Coordinate Reference System (crs = 'epsg:4326'). When you create the GeoDataFrame for the shop coordinates you don't. This is where the distortion is coming from.
This should fix it:
shops = gpd.GeoDataFrame(
    coordinates,
    geometry=gpd.points_from_xy(df.longitude, df.latitude),
    crs="EPSG:4326",
)
Cheers!
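A quick way to catch this kind of mismatch is to look at the crs attribute of each layer before plotting. The sketch below uses two of the sample coordinates from the question and assumes geopandas (with pyproj) is installed; set_crs only labels the data, it does not reproject it.

```python
import geopandas as gpd
import pandas as pd

# Two of the shop coordinates from the question, as floats
df = pd.DataFrame({
    "latitude": [60.193141, 60.292777],
    "longitude": [24.934214, 24.969730],
})

# Built without a crs argument, so the layer has no reference system yet
shops = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.longitude, df.latitude))
crs_before = shops.crs

# Declare that the coordinates are WGS 84 longitude/latitude
shops = shops.set_crs("EPSG:4326")
epsg_after = shops.crs.to_epsg()

print(crs_before, epsg_after)
```

If both layers report the same CRS, they can be drawn on the same axes without one changing the aspect of the other.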

Matplotlib cannot plot points on basemap from CSV, but plots correctly from JSON

Problem
I'm trying to plot a set of points on a base-map. Below is my code. However, the points are not displayed where they are supposed to be on the map. I have added below a Dropbox link to the csv file I am using.
Dropbox link to the csv file
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
%matplotlib inline
#read data from CSV
building = pd.read_csv('masteronlyfive.csv')
# convert coords to float type
building = building.astype({"lat": float, "long": float})
# convert to geodata series
building = geopandas.GeoDataFrame(towers, geometry=geopandas.points_from_xy(building.lat,building.long))
# set CRS
building.crs = {'init' :'epsg:4326'}
building.head()
# read basemap file and set CRS
world = geopandas.read_file("South_Africa_Polygon.shp")
world.crs = {'init' :'epsg:4326'}
# Plot basemap
ax = world.plot(color='white', edgecolor='black')
# plot points
building.plot(ax=ax, color='red')
plt.show()
What I have tried
I have taken the co-ordinates and re-coded them in a JSON format instead of CSV, so I'm reading the data from a JSON array rather than doing a CSV import, as shown below. They work completely fine, which is totally shocking to me.
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
%matplotlib inline
#reading from json array
df = pd.DataFrame(
{'Country': ['building', 'building', 'building', 'building', 'building'],
'Latitude': [-28.506806, -27.463611, -29.192053, -28.871950, -27.242444],
'Longitude': [28.613972, 28.040001, 26.235583,27.873739, 28.838861]})
#creating geopandas points from the coordinates
gdf = geopandas.GeoDataFrame(
df, geometry=geopandas.points_from_xy(df.Longitude, df.Latitude))
#reading the basemap file
world = geopandas.read_file("South_Africa_Polygon.shp")
# plotting the basemap
ax = world.plot(color='white', edgecolor='black')
# plotting the geodata points
gdf.plot(ax=ax, color='red')
plt.show()
What could I possibly be doing wrong that the exact same co-ords work fine from JSON but not from CSV?
When you create a geodataframe, change long/lat order, like below:
# convert to geodata series
building = geopandas.GeoDataFrame(
geometry=geopandas.points_from_xy(building.long, building.lat)
)
Your longitude is x, not y. Your latitude is y. Hence when dealing with the points_from_xy() function, longitude (which is x) comes first. This is a very common error and you can spot it in the plot – the polygon boundaries are diagonally opposite to your points cluster, so it is most often the order of lat/lon!
P.S. I am not sure why your original code references towers variable in that code snippet, so I removed it.
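Note that a simple range check will not catch the swap here, because both latitude and longitude are around 28 in absolute value. One option is to compare the points against the basemap's bounding box. The sketch below is plain Python; the bounding box is a rough assumed one for South Africa, and in practice you would take it from the basemap itself (e.g. world.total_bounds).

```python
def fraction_inside(points, bounds):
    """Share of (x, y) points inside bounds = (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    hits = sum(minx <= x <= maxx and miny <= y <= maxy for x, y in points)
    return hits / len(points)

# Rough, assumed bounding box for South Africa in lon/lat degrees
bounds = (16.0, -35.0, 33.0, -22.0)

# Coordinates from the question
lats = [-28.506806, -27.463611, -29.192053, -28.871950, -27.242444]
lons = [28.613972, 28.040001, 26.235583, 27.873739, 28.838861]

as_lon_lat = list(zip(lons, lats))  # x = longitude, y = latitude
as_lat_lon = list(zip(lats, lons))  # swapped order, as in the CSV version

share_correct = fraction_inside(as_lon_lat, bounds)
share_swapped = fraction_inside(as_lat_lon, bounds)
print(share_correct, share_swapped)
```

If almost no points fall inside the basemap's bounds but the swapped ones do, the argument order is the likely culprit.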

3D graph in yt module

Could you help me with this code, please? I am trying to integrate the force line through a given point. I don't know where the mistake is: no streamline appears in the plot.
Data (dipole magnetic field) are here
I tried this example, changing the data and the number of streamlines.
import numpy as np
import matplotlib.pyplot as plt
from numpy import array
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D # 3d graph
from mpl_toolkits.mplot3d import proj3d # 3d graph
import math
from matplotlib import patches
import code
import yt
from yt import YTArray # arrays in yt module
from yt.visualization.api import Streamlines # force lines
import matplotlib.pylab as pl
# Choose point in field
X_point = 0.007089085922957821
Y_point = 0.038439192046320805
Z_point = 0
# Load data (dictionary)
try:
    import cPickle as pickle
except ImportError:  # python 3.x
    import pickle
with open('data.p', 'rb') as fp:
    data = pickle.load(fp)
Bx_d = data["Bx"]
By_d = data["By"]
Bz_d = data["Bz"]
# 3d array of dipole magnetic field
print(type(data))
bbox = np.array([[-0.15, 0.15], [0, 0.2], [-0.1, 0.1]])  # box, border
ds = yt.load_uniform_grid(data, Bx_d.shape, length_unit="Mpc", bbox=bbox, nprocs=100)  # data, dimension
c = YTArray([X_point, Y_point, Z_point], 'm')  # Define c: the center of the box, chosen point
c1 = ds.domain_center
print('c1', c1)
print(type(c1))
print('center',c)
N = 1  # N: the number of streamlines
scale = ds.domain_width[0]  # scale: the spatial scale of the streamlines relative to the boxsize,
pos = c
# Create streamlines of the 3D vector velocity and integrate them through
# the box defined above
streamlines = Streamlines(ds, pos, 'Bx', 'By', 'Bz', length=None)  # length of integration
streamlines.integrate_through_volume()
# Create a 3D plot, trace the streamlines through the 3D volume of the plot
fig=pl.figure()
ax = Axes3D(fig)
ax.scatter(X_point, Y_point, Z_point, marker = 'o', s=40, c='green')
print('tisk', streamlines.streamlines)
for stream in streamlines.streamlines:
    stream = stream[np.all(stream != 0.0, axis=1)]
    ax.plot3D(stream[:,0], stream[:,1], stream[:,2], alpha=0.1)
# Save the plot to disk.
pl.savefig('streamlines.png')
plt.show()
Output:
Without knowing more about the data, as well as what the output of the print call is, it's not entirely clear what the error is. If the streamlines have meaningful values (i.e., the values of stream[:,0] etc. are within the bounds of your Axes3D), it should produce results.
Options for debugging would start with examining the individual values, then proceeding to plotting them in 2D (using pairs of components of each stream -- (0,1), (1,2) and (0,2)), and then examining what happens if you allow Axes3D to autoscale the xyz axes. You may also experiment with the alpha value, to see if the lines are simply too light to see.
An example image of what this produces would also help, since it would clarify a few things about the properties matplotlib assigns to the Axes3D object.
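As a concrete starting point for examining the values, here is a minimal NumPy sketch. The stream array is made up and assumes that unused rows are left as zeros, which is what the filter in the question appears to be for. One thing worth checking is the filter itself: np.all(stream != 0.0, axis=1) drops every row that has any coordinate equal to zero, and the seed point has Z_point = 0. This may or may not be the cause here.

```python
import numpy as np

# Hypothetical streamline: three real points in the z = 0 plane, then zero padding
stream = np.array([
    [0.0071, 0.0384, 0.0],
    [0.0080, 0.0400, 0.0],
    [0.0092, 0.0415, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])

# Mask from the question: keeps a row only if every coordinate is non-zero
kept_all = stream[np.all(stream != 0.0, axis=1)]

# Alternative: drop a row only if all three coordinates are zero
kept_any = stream[np.any(stream != 0.0, axis=1)]

# Check that the remaining points lie inside the domain box from the question
bbox = np.array([[-0.15, 0.15], [0, 0.2], [-0.1, 0.1]])
in_box = np.all((kept_any >= bbox[:, 0]) & (kept_any <= bbox[:, 1]), axis=1)

print(len(kept_all), len(kept_any), in_box.all())
```

If the first count is zero while the second is not, the plot is empty because of the filter rather than the integration.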

Highlight specific points in matplotlib scatterplot

I have a CSV with 12 columns of data. I'm focusing on these 4 columns
Right now I've plotted "Pass def" and "Rush def". I want to be able to highlight specific points on the scatter plot. For example, I want to highlight 1995 DAL point on the plot and change that point to a color of yellow.
I've started with a for loop but I'm not sure where to go. Any help would be great.
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import csv
import random
df = pd.read_csv('teamdef.csv')
x = df["Pass Def."]
y = df["Rush Def."]
z = df["Season"]
points = []
for point in df["Season"]:
    if point == 2015.0:
        print(point)
plt.figure(figsize=(19,10))
plt.scatter(x,y,facecolors='black',alpha=.55, s=100)
plt.xlim(-.6,.55)
plt.ylim(-.4,.25)
plt.xlabel("Pass DVOA")
plt.ylabel("Rush DVOA")
plt.title("Pass v. Rush DVOA")
plt.show()
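You can layer multiple scatters, so the easiest way is probably to draw all the points first and then draw the points you want to highlight on top, selected with a boolean mask:
plt.scatter(x,y,facecolors='black',alpha=.55, s=100)
highlight = z == 2015.0
plt.scatter(x[highlight], y[highlight], color="yellow", s=100)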
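To pick out a single point such as 1995 DAL, the selection can combine two conditions. The sketch below builds a per-point color array with NumPy; the team array is an assumption, since the question does not show what the team column is called, and the values are made up.

```python
import numpy as np

# Made-up values standing in for the "Season" column and an assumed team column
season = np.array([1995, 1995, 2015, 2015])
team = np.array(["DAL", "SF", "DAL", "DEN"])

# True only for the point(s) to highlight
mask = (season == 1995) & (team == "DAL")

# One color per point: yellow where the mask is True, black elsewhere
colors = np.where(mask, "yellow", "black")
print(colors.tolist())
```

The resulting list can be passed to a single scatter call, e.g. plt.scatter(x, y, c=colors.tolist(), s=100), instead of layering two scatters.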
