plot df on map - two plots instead of one - python

I want to plot data from a DataFrame on a map. However, I can't get the values to show up on the map. Instead I get two plots: one with an empty map and one with the values but no map.
What am I doing wrong?
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import cartopy
import cartopy.crs as ccrs
import cartopy.feature as cf
ax1 = plt.axes(projection=ccrs.Mercator())
#fig, ax=plt.subplots(figsize=(8,6))
ax1.add_feature(cf.BORDERS)
ax1.add_feature(cf.RIVERS)
ax1.set_extent([7.4, 8.8, 47.5, 49.1])
ax1.set_title('Niederschlag', fontsize=13)
ax1.grid(True, alpha=0.5)
df.plot(x="longitude", y="latitude", kind="scatter",
c='RR',colormap="YlOrRd")
plt.show()
Thanks in advance

pandas.DataFrame.plot creates a new figure by default, which is why you are getting two plots: first you create one with ax1=plt.axes(projection=ccrs.Mercator()), then pandas creates another one.
To fix this you can use the ax argument on pandas.DataFrame.plot:
df.plot(x="longitude", y="latitude", kind="scatter", c='RR',colormap="YlOrRd", ax=ax1)
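As a quick check of the ax argument, here is a minimal sketch that uses a plain Matplotlib axes instead of a cartopy one (an assumption made only to keep it self-contained) and the three sample points from the answer. It confirms that pandas draws into the axes it is given rather than opening a second figure:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

plt.close("all")  # start from a clean state

df = pd.DataFrame(
    {
        "longitude": [7.99, 8.42, 7.78],
        "latitude": [48.98, 48.74, 47.59],
        "RR": [5, 15, 25],
    }
)

# Create the axes first, then hand it to pandas via ax=
fig, ax1 = plt.subplots()
returned = df.plot(x="longitude", y="latitude", kind="scatter",
                   c="RR", colormap="YlOrRd", ax=ax1)

# pandas returns the very axes it was given, and no extra figure was opened
n_figures = len(plt.get_fignums())
print(returned is ax1, n_figures)
```

Leaving out ax=ax1 in the df.plot call is what produces the second, map-less figure described in the question.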
Edit
There is also a problem with the projection. You set the axes projection to Mercator, whose native coordinates are metres rather than longitude/latitude degrees. When you set the extent without a crs argument, cartopy treats the bounds as geographic coordinates and converts them automatically. Your points are also given in longitude/latitude, but no automatic conversion is performed when they are plotted, so they end up far away from the visible extent.
To fix this I suggest using a different projection to start with (e.g.: PlateCarree). If this is not possible then you would need to convert the coordinates yourself.
This code produces a map with points:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import cartopy
import cartopy.crs as ccrs
import cartopy.feature as cf
df = pd.DataFrame(
{
"longitude": [7.99, 8.42, 7.78],
"latitude": [48.98, 48.74, 47.59],
"RR": [5, 15, 25],
}
)
ax1=plt.axes(projection=ccrs.PlateCarree())
ax1.add_feature(cf.BORDERS)
ax1.add_feature(cf.RIVERS)
ax1.set_extent([7.4, 8.8, 47.5, 49.1], crs=ccrs.PlateCarree())
ax1.set_title('Niederschlag', fontsize=13);
ax1.grid(True, alpha=0.5)  # the b= keyword was renamed to visible= in newer Matplotlib
df.plot(x="longitude", y="latitude", kind="scatter", c="RR", colormap="YlOrRd", ax=ax1)
plt.show()

Try this: plt.scatter(x=df['longitude'], y=df['latitude'])
Or try using GeoPandas like so (adjust the column names to match your file):
import pandas as pd
from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame
df = pd.read_csv("Long_Lats.csv", delimiter=',', skiprows=0, low_memory=False)
geometry = [Point(xy) for xy in zip(df['Longitude'], df['Latitude'])]
gdf = GeoDataFrame(df, geometry=geometry)
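# This is a simple world map that shipped with geopandas before version 1.0.
# geopandas.datasets was removed in 1.0; there, pass the path of a downloaded
# Natural Earth shapefile to gpd.read_file instead.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))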
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='red', markersize=15);

Related

How to print the heatmap in a square shape using seaborn?

When I run the code below, the heatmap does not come out square even though I passed square=True. Any idea how I can draw the heatmap in a square format? Thank you!
The code:
from datetime import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import seaborn as sns
temp_hourly_A5_A7_AX_ASHRAE=pd.read_csv('C:\\Users\\cvaa4\\Desktop\\projects\\s\\temp_hourly_A5_A7_AX_ASHRAE.csv',index_col=0, parse_dates=True, dayfirst=True, skiprows=2)
sns.heatmap(temp_hourly_A5_A7_AX_ASHRAE,cmap="YlGnBu", vmin=18, vmax=27, square=True, cbar=False, linewidth=0.0001);
The result:
square=True should give you square cells. Below is a working example:
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.tile([0,1], 15*15).reshape(-1,15))
sns.heatmap(df, square=True)
If you want a square shape of the plot however, you can use set_aspect and the shape of the data:
ax = sns.heatmap(df)
ax.set_aspect(df.shape[1]/df.shape[0]) # here 0.5 Y/X ratio
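To make the difference between the two options concrete, here is a small sketch that draws both side by side on the same 30 x 15 data as above. The axes names ax_cells and ax_box are made up for this illustration. With set_aspect(ncols / nrows), each cell is half as tall as it is wide, so 30 rows take up the same height as 15 columns take width and the whole plot area becomes square.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame(np.tile([0, 1], 15 * 15).reshape(-1, 15))  # 30 rows x 15 columns

fig, (ax_cells, ax_box) = plt.subplots(1, 2, figsize=(10, 5))

# Option 1: square cells, so the plot area is as tall/wide as the data shape dictates
sns.heatmap(df, square=True, cbar=False, ax=ax_cells)

# Option 2: square plot area, cells stretched to fit
sns.heatmap(df, cbar=False, ax=ax_box)
ratio = df.shape[1] / df.shape[0]  # columns / rows = 0.5 here
ax_box.set_aspect(ratio)

fig.savefig("heatmap_shapes.png")
```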
You can use matplotlib and set a figsize before plotting heatmap.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
rnd = np.random.default_rng(12345)
data = rnd.uniform(-100, 100, [100, 50])
plt.figure(figsize=(6, 5))
sns.heatmap(data, cmap='viridis');
Note that I used figsize=(6, 5) rather than a square figsize=(5, 5). This is because seaborn also has to fit the colorbar inside the given figure size, which squeezes the heatmap horizontally. You might want to adjust the figsize depending on what you need.

Is there any way to show mean in box plot using Python?

I'm just starting using Matplotlib, and I'm trying to learn how to draw a box plot in Python using Colab.
My problem is: I'm not able to put the median on the graph. The graph just showed the quartiles, mean, and outliers. Can someone help me?
My code is the following.
from google.colab import auth
auth.authenticate_user()
import gspread
import numpy as np
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as pl
sns.set_theme(style="ticks", color_codes=True)
wb = gc.open_by_url('URL_JUST_FOR_EXAMPLE')
boxplot = wb.worksheet('control-Scale10to100')
boxplotData = boxplot.get_all_values()
df = pd.DataFrame(boxplotData[1:], columns=boxplotData[0])
df.drop(df.columns[0], axis=1, inplace=True)  # axis must be passed by keyword in pandas 2.x
df = df.apply(pd.to_numeric, errors='ignore')
df.dtypes
df.describe()
dfBoxPlotData = df.iloc[:,4:15]
dfBoxPlotData.apply(pd.to_numeric)
dfBoxPlotData.head()
props = dict(whiskers="Black", medians="Black", caps="Black")
ax = df.plot.box(rot=90, fontsize=14, figsize=(15, 8), color=props, patch_artist=True, grid=False, meanline=True, showmeans=True, meanprops=dict(color='red'))
I tried running your code with a sample data set where the mean and median are distinct. As @tdy showed, as long as the parameters showmeans=True and meanline=True are passed to the df.plot.box method, the mean and median should both show up. Is it possible that in your data set the mean and median are so close together that they're hard to distinguish?
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as pl
mu, sigma = 50., 10. # mean and standard deviation
np.random.seed(42)
s = np.random.normal(mu, sigma, 30)
df = pd.DataFrame({'values':s})
props = dict(whiskers="Black", medians="Black", caps="Black")
ax = df.plot.box(rot=90, fontsize=14, figsize=(15, 8), color=props, patch_artist=True, grid=False, meanline=True, showmeans=True, meanprops=dict(color='red'))
pl.show()
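One way to check whether the two lines simply overlap is to read their positions back from the plot. The sketch below calls Matplotlib's Axes.boxplot directly (the function pandas forwards these keyword arguments to) and uses skewed random data, an assumption chosen so that mean and median are clearly apart:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
s = rng.exponential(scale=10.0, size=200)  # skewed sample: mean and median differ

fig, ax = plt.subplots()
bp = ax.boxplot(
    s,
    showmeans=True,
    meanline=True,  # draw the mean as a line instead of a marker
    meanprops=dict(color="red", linestyle="--"),
    medianprops=dict(color="black"),
)

# boxplot returns a dict of the artists it drew; read the line heights back
mean_y = bp["means"][0].get_ydata()[0]
median_y = bp["medians"][0].get_ydata()[0]
print(mean_y, median_y)
```

If the two printed values are nearly identical for your own data, the red mean line is being drawn on top of the median line.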

How to make horizontal linechart with categorical variables and timeseries?

I want to replicate plots from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5000555/pdf/nihms774453.pdf I'm particularly interested in plot on page 16, right panel. I tried to do this in matplotlib but it seems to me that there is no way to access lines in linecollection.
I don't know how to change the color along each line according to the value at every index. I'd eventually like to get something like this: https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html but for every line, according to the data.
this is what I tried:
the data in numpy array: https://pastebin.com/B1wJu9Nd
import pandas as pd, numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib import colors as mcolors
%matplotlib inline
base_range = np.arange(qq.index.max()+1)
fig, ax = plt.subplots(figsize=(12,8))
ax.set_xlim(qq.index.min(), qq.index.max())
# ax.set_ylim(qq.columns[0], qq.columns[-1])
ax.set_ylim(-5, len(qq.columns) +5)
line_segments = LineCollection([np.column_stack([base_range, [y]*len(qq.index)]) for y in range(len(qq.columns))],
cmap='viridis',
linewidths=(5),
linestyles='solid',
)
line_segments.set_array(base_range)
ax.add_collection(line_segments)
axcb = fig.colorbar(line_segments)
plt.show()
my result:
what I want to achieve:
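One possible approach, sketched here under assumptions: build one LineCollection per row, split it into short segments as in the multicolored_line gallery example, and pass that row's values to set_array. The random array below is a hypothetical stand-in for qq.to_numpy() (time steps along the first axis, categories along the second). A shared Normalize keeps the colors comparable across rows.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
values = rng.uniform(0, 1, size=(50, 4))  # stand-in data: 50 time steps, 4 categories
n_steps, n_rows = values.shape
x = np.arange(n_steps)

fig, ax = plt.subplots(figsize=(12, 4))
norm = plt.Normalize(values.min(), values.max())  # one color scale for all rows

for row in range(n_rows):
    # Horizontal line at height `row`, cut into n_steps - 1 two-point segments
    points = np.column_stack([x, np.full(n_steps, row)]).reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    lc = LineCollection(segments, cmap="viridis", norm=norm, linewidths=5)
    lc.set_array(values[:-1, row])  # one color value per segment
    ax.add_collection(lc)

ax.set_xlim(0, n_steps - 1)
ax.set_ylim(-1, n_rows)
fig.colorbar(lc, ax=ax)
fig.savefig("colored_rows.png")
```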

Setting Image background for a line plot in matplotlib

I am trying to set a background image behind a line plot that I made in matplotlib. Even though I import the image and use the zorder argument, I get two separate images instead of a single combined one. Please suggest a way out. My code is:
import quandl
import pandas as pd
import sys, os
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import itertools
def flip(items, ncol):
    return itertools.chain(*[items[i::ncol] for i in range(ncol)])
df = pd.read_pickle('neer.pickle')
rows = list(df.index)
countries = ['USA','CHN','JPN','DEU','GBR','FRA','IND','ITA','BRA','CAN','RUS']
x = range(len(rows))
df = df.pct_change()
fig, ax = plt.subplots(1)
for country in countries:
    ax.plot(x, df[country], label=country)
plt.xticks(x, rows, size='small', rotation=75)
#legend = ax.legend(loc='upper left', shadow=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.show(1)
plt.figure(2)
im = plt.imread('world.png')
ax1 = plt.imshow(im, zorder=1)
ax1 = df.iloc[:,:].plot(zorder=2)
handles, labels = ax1.get_legend_handles_labels()
plt.legend(flip(handles, 2), flip(labels, 2), loc=9, ncol=12)
plt.show()
The problem is in figure(2): there I get two separate plots.
To put a background image behind a plot, use Matplotlib's imshow with its extent parameter, which sets the data coordinates the image should cover.
Here is a condensed version of your code. I didn't have time to clean it up much.
First, sample data is created for the 11 countries listed in your code. It is then pickled and saved to a file (since your pickle file isn't available).
import quandl
import pandas as pd
import sys, os
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import itertools
from matplotlib.pyplot import imread  # scipy.misc.imread was removed in SciPy 1.2
countries = ['USA','CHN','JPN','DEU','GBR','FRA','IND','ITA','BRA','CAN','RUS']
df_sample = pd.DataFrame(np.random.randn(10, 11), columns=list(countries))
df_sample.to_pickle('c:\\temp\\neer.pickle')
Next the pickle file is read and we create a bar plot directly from pandas.
df = pd.read_pickle('c:\\temp\\neer.pickle')
my_plot = df.plot(kind='bar',stacked=True,title="Plot Over Image")
my_plot.set_xlabel("countries")
my_plot.set_ylabel("some_number")
Next we use imread to read image into plot.
img = imread("c:\\temp\\world.png")
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
plt.imshow(img,zorder=0, extent=[0.1, 10.0, -10.0, 10.0])
plt.show()
Here is an output plot with image as background.
As stated this is crude and can be improved further.
You're creating two separate figures in your code. The first one with fig, ax = plt.subplots(1) and the second with plt.figure(2)
If you delete that second figure, you should be getting closer to your goal
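Putting both answers together, a minimal sketch could look like this: one figure, one axes, the lines drawn by pandas with ax=, and the image stretched over the current axis limits via extent. The random image array is a hypothetical stand-in for plt.imread('world.png'), and only three of the countries are used to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

plt.close("all")  # start from a clean state

rng = np.random.default_rng(0)
countries = ["USA", "CHN", "JPN"]
df = pd.DataFrame(rng.normal(size=(10, 3)), columns=countries)
img = rng.uniform(size=(20, 40, 3))  # stand-in for plt.imread('world.png')

fig, ax = plt.subplots()      # the only figure that gets created
df.plot(ax=ax, zorder=2)      # lines on top

# Stretch the image over the area the lines occupy, behind them
x0, x1 = ax.get_xlim()
y0, y1 = ax.get_ylim()
ax.imshow(img, extent=[x0, x1, y0, y1], aspect="auto", zorder=1)

fig.savefig("lines_over_image.png")
```

aspect="auto" keeps imshow from forcing square pixels, which would otherwise reshape the axes to the image's proportions.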

How to change color of certain squares in a seaborn heatmap?

I'm trying to create a heatmap in seaborn (python) with certain squares colored with a different color, (these squares contain insignificant data - in my case it will be squares with values less than 1.3, which is -log of p-values >0.05). I couldn't find such function. Masking these squares also didn't work.
Here is my code:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import seaborn as sns; sns.set()
data = [[1.3531363408, 3.339479161, 0.0760855365], [5.1167382617, 3.2890920405, 2.4764601828], [0.0025058257, 2.3165128345, 1.6532714962], [0.2600549869, 5.8427407219, 6.6627226609], [3.0828581725, 16.3825494439, 12.6722666929], [2.3386307357, 13.7275065772, 12.5760972276], [1.224683813, 2.2213656372, 0.6300876451], [0.4163788387, 1.8128374089, 0.0013106046], [0.0277592882, 2.9286203949, 0.810978992], [0.0086613622, 0.6181261247, 1.8287878837], [1.0174519889, 0.2621290291, 0.1922637697], [3.4687429571, 4.0061981716, 0.5507951444], [7.4201304939, 3.881457516, 0.1294141768], [2.5227546319, 6.0526491816, 0.3814362442], [8.147538027, 14.0975727815, 7.9755706939]]
cmap2 = mpl.colors.ListedColormap(sns.cubehelix_palette(n_colors=20, start=0, rot=0.4, gamma=1, hue=0.8, light=0.85, dark=0.15, reverse=False))
ax = sns.heatmap(data, cmap=cmap2, vmin=0)
plt.show()
I want to add that I'm not a very advanced programmer.
OK, so I can answer my question myself now :) Here is the code that solved the problem:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import seaborn as sns; sns.set()
data = np.array([[1.3531363408, 3.339479161, 0.0760855365],
[5.1167382617, 3.2890920405, 2.4764601828],
[0.0025058257, 2.3165128345, 1.6532714962],
[0.2600549869, 5.8427407219, 6.6627226609],
[3.0828581725, 16.3825494439, 12.6722666929],
[2.3386307357, 13.7275065772, 12.5760972276],
[1.224683813, 2.2213656372, 0.6300876451],
[0.4163788387, 1.8128374089, 0.0013106046],
[0.0277592882, 2.9286203949, 0.810978992],
[0.0086613622, 0.6181261247, 1.8287878837],
[1.0174519889, 0.2621290291, 0.1922637697],
[3.4687429571, 4.0061981716, 0.5507951444],
[7.4201304939, 3.881457516, 0.1294141768],
[2.5227546319, 6.0526491816, 0.3814362442],
[8.147538027, 14.0975727815, 7.9755706939]])
cmap1 = mpl.colors.ListedColormap(['c'])
fig, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(data, ax=ax)
sns.heatmap(data, mask=data > 1.3, cmap=cmap1, cbar=False, ax=ax)
plt.show()
So the problem with masking, which didn't work before, was that the comparison data > 1.3 needs a NumPy array. With a plain list of lists it raises an error.
The other part of the solution is plotting the heatmap twice, the second time with the mask.
At first it looked to me as if the mask was inverted: I want to recolor values below 1.3, yet I had to write mask=data > 1.3. The reason is that seaborn hides the cells where the mask is True, so the second heatmap draws only the remaining cells (values of 1.3 or less) in the new color on top of the first one.
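The mask direction can be checked on a small example. This sketch uses a shortened, rounded 3 x 3 excerpt of the data above (an assumption for brevity) and lists which values remain visible in the second, single-color heatmap:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.colors import ListedColormap

data = np.array([[1.35, 3.34, 0.08],
                 [5.12, 3.29, 2.48],
                 [0.003, 2.32, 1.65]])

mask = data > 1.3  # True means "do not draw this cell"

fig, ax = plt.subplots()
sns.heatmap(data, ax=ax)  # full heatmap underneath
sns.heatmap(data, mask=mask, cmap=ListedColormap(["c"]), cbar=False, ax=ax)

# Cells that are NOT masked are the ones painted cyan by the second call
recolored = data[~mask]
print(recolored)
```

Only the insignificant values (1.3 or less) survive the mask, so only those cells are painted over.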
