GeoPandas: Plot two Geo DataFrames over each other on a map - python

I am new to GeoPandas and to plotting maps from a GeoDataFrame. I have two GeoDataFrames that cover the same city but come from different sources. One contains the geometry data for houses and the other for census tracts. I want to plot the houses' boundaries on top of the tract boundaries.
Below is the first row from each data set. I am also not sure why the Geometry Polygon values are on such a different scale in each of these datasets.
Houses Data Set
[Image: House Data]
Tract Data Set
[Image: Tract Data]
I tried the following code in a Jupyter Notebook, but nothing shows up.
f, ax = plt.subplots()
tract_data.plot(ax=ax)
house_data.plot(ax=ax)
But an empty plot shows up.
This is my first post. Please let me know what else I can provide.

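You probably need to bring both GeoDataFrames into the same coordinate reference system (CRS). The very different coordinate scales you noticed suggest that the two datasets currently use different ones.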
An easy fix might be
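import matplotlib.pyplot as plt

f, ax = plt.subplots()
# Reproject the tracts into the houses' CRS so both layers share one coordinate system
tract_data.to_crs(house_data.crs).plot(ax=ax)
house_data.plot(ax=ax)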
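To illustrate why the reprojection matters, here is a minimal, self-contained sketch. The two tiny GeoDataFrames are hypothetical stand-ins for the real data, and the EPSG codes are assumptions chosen only to reproduce the "different scale" symptom. Note that to_crs only works when both frames already have a CRS declared; if .crs is None, it has to be set first with set_crs.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-ins for the real data (assumed CRSs):
# one tract in lon/lat degrees, one house in Web Mercator metres.
tract_data = gpd.GeoDataFrame(
    {"name": ["tract"]}, geometry=[box(-0.01, -0.01, 0.01, 0.01)], crs="EPSG:4326"
)
house_data = gpd.GeoDataFrame(
    {"name": ["house"]}, geometry=[box(-100, -100, 100, 100)], crs="EPSG:3857"
)

# Reproject the tracts into the houses' CRS before plotting.
tract_in_house_crs = tract_data.to_crs(house_data.crs)

f, ax = plt.subplots()
tract_in_house_crs.plot(ax=ax, color="lightgrey", edgecolor="black")  # bottom layer
house_data.plot(ax=ax, color="red")  # drawn on top of the tracts
```

Without the to_crs call, the tract would span a few hundredths of a unit while the house spans hundreds, so the two layers would not line up on the same axes.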

Related

How to plot specific rows of qualitative data using matplotlib on python?

I have a large spreadsheet of data that I cannot show for privacy reasons, but there is a column called 'origin' that holds company names, with hundreds of rows for each one. For example, 500 rows of information have been entered for 500 people working at "Sony". I want to make graphs of the information gathered for each institution, but I am having trouble plotting only specific rows. The goal is to make a dashboard for each institution.
A way of putting this would be:
fig = px.scatter(df, x='gender'['female], y='race',
color='origin'['Sony'])
fig.update_traces(mode='markers+lines')
fig.show()
I want to focus on particular categories when plotting.
Any help is appreciated!
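One way to focus on particular categories is to filter the rows with a boolean mask before plotting. The following is a minimal sketch, not taken from the original post: the sample data and the helper name rows_for are hypothetical, and only the column names 'origin', 'gender' and 'race' come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "origin": ["Sony", "Sony", "Acme", "Sony"],
    "gender": ["female", "male", "female", "female"],
    "race": ["A", "B", "A", "C"],
})

def rows_for(frame, origin, gender=None):
    """Return the rows for one institution, optionally one gender (hypothetical helper)."""
    mask = frame["origin"] == origin
    if gender is not None:
        mask &= frame["gender"] == gender
    return frame[mask]

sony_women = rows_for(df, "Sony", gender="female")
print(sony_women)
```

The filtered frame can then be passed to the plotting call in place of df, for example px.scatter(sony_women, x='gender', y='race', color='origin').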

Is there a simple way to plot multiple series on one pandas scatter plot?

I come across this issue constantly, and my current solution is to create additional dataframes. I feel like there must be an easier solution.
Here is an example of data where I have multiple countries with multiple attributes:
If I wanted to plot Population vs. Depression (%) I would write:
ax = df.plot.scatter(x='Population', y='Depression (%)')
This isn't super helpful, as there are clearly lines linked to specific Countries (df['Country']). Is there a simple way to plot a scatter plot with different series (colors/shapes/etc) as different Countries?
Right now I use groupby to separate out individual Countries and plot them on the same axes (ax = ax).
Any thoughts or input would be greatly appreciated! Thank you!
Try c="Country". If you want some nice colors, you can also pass colormap='viridis', as in this example from the documentation:
ax2 = df.plot.scatter(x='length',
                      y='width',
                      c='species',
                      colormap='viridis')
Since your Country column contains strings, we can't use this approach directly; the values first need to be converted to numbers. This can be done by writing:
c=df['Country'].astype("category").cat.codes
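Put together, the conversion might look like the sketch below. The sample values are made up for illustration and the helper column name country_code is an assumption; only the column names 'Country', 'Population' and 'Depression (%)' come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C"],
    "Population": [10, 12, 50, 55, 30, 33],
    "Depression (%)": [4.1, 4.3, 3.2, 3.0, 5.5, 5.6],
})

# Map each country name to an integer code (categories are sorted: A=0, B=1, C=2).
df["country_code"] = df["Country"].astype("category").cat.codes

# Colour each point by its country code using the viridis colormap.
ax = df.plot.scatter(x="Population", y="Depression (%)",
                     c="country_code", colormap="viridis")
```

This colors the points per country but labels the colorbar with the numeric codes rather than the names, so the groupby approach from the question remains useful when a named legend is needed.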

How to use Plotly to save/extract data information selected in a graph

I used plotly to create a scatter plot from a csv.
Is it possible to select one or more points in this graph and save or extract the information related to those points from the CSV?

Matplotlib legend plotting name data title multiple times

I am plotting some data from a CSV. I recently edited the date range of the CSV file, but the values are the same. Before, the data was plotted simultaneously with two other data sets, and the legend had only one entry for this data set. After editing the CSV file, the legend displays the label three times, although the data itself is graphed correctly. I have tried removing the other two data sets from the plot, using numpoints=1, and making sure nothing is in a for loop (this code does not use one). I also made sure there weren't three versions of the data saved in the same directory. Any suggestions on why this is happening and how to fix it? I'm including my plotting code in case something in it is wrong.
plt.plot(date_range,ice_extent1,color='red',label='MASIE')
plt.xlabel("Date (yyyy/mm)")
plt.ylabel("Sea Ice Extent (10^6 km^2)")
plt.title("Sea Ice Extent")
plt.legend()

Skip weekends on stock charts with matplotlib

This is not a duplicate, because the existing answers to similar questions don't describe exactly what I need.
Matplotlib has great formatters inside and I love to use them:
ax.xaxis.set_major_locator(matplotlib.dates.MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
They let me plot such stock market charts:
This is what I need, but it has one issue: weekends. They are present on the x axis and make my chart a little ugly.
Other questions about this issue advise creating a custom formatter and show examples of such formatters. But none of them produces the pretty formatting that matplotlib does:
May19, Jun19, Jul19...
I mean this line of code:
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
My question is: please help me format the x axis the way matplotlib does (May19, Jun19, Jul19...) without creating weekends when the stock market is closed.
What you could almost always do is something similar to what Nic Wanavit suggested.
Manually set your labels, depending on what you need on your axis.
In this case the plot looks a bit ugly because your data has gaps (the weekends) for which there are no values, so pyplot simply connects the neighbouring points across the corresponding stretch of the x-axis.
What you can do instead is plot your data points equally spaced, which is correct if the data is daily; otherwise consider interpolating it, e.g. with pandas' built-in interpolation.
To keep pyplot from automatically using the date index, I had to do this:
df['plotidx'] = list(range(len(df['close'])))
Here all the closing values for the stock are stored in a column named 'close'.
You plot this correspondingly.
Then you can obtain all the ticks created via
labels = [item.get_text() for item in ax.get_xticklabels()]
Adjust them as desired with
labels[i] = string_for_the_label_no_i
Then get them back on the graph using
ax.xaxis.set_ticklabels(labels)
You then need to redraw ("update") the plot. Also keep in mind that resizing the figure can leave the labels in strange locations, as the documentation also warns.
It is a workaround, but it worked fine for me, because it feels more natural to plot the data points equally spaced next to each other than to make up data for the weekends.
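The steps above can be sketched end to end as follows. This is one possible reading of the workaround rather than the answerer's exact code: the sample data is hypothetical, and placing one tick on the first trading day of each month is an assumption made to imitate the DateFormatter('%b%y') labels from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily closing prices on business days only (no weekend rows).
dates = pd.bdate_range("2019-05-01", "2019-07-31")
df = pd.DataFrame({"dates": dates, "close": list(range(len(dates)))})

# Evenly spaced integer positions replace the dates on the x axis.
df["plotidx"] = list(range(len(df)))

fig, ax = plt.subplots()
ax.plot(df["plotidx"], df["close"])

# One tick on the first trading day of each month, labelled like '%b%y'.
first_days = df.groupby(df["dates"].dt.to_period("M")).head(1)
ax.set_xticks(first_days["plotidx"].tolist())
ax.set_xticklabels(first_days["dates"].dt.strftime("%b%y").tolist())
```

Because the x positions are plain integers, no space is reserved for days without data, and the labels are fixed strings that do not depend on matplotlib's date locators.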
Greets
To set the x ticks,
assuming that you have the dates in the dataframe column df['dates']:
ax.xaxis.set_ticks(df['dates'])
