Animated heatmap from dictionary of Pandas DataFrames - python

I would like to plot an animated heatmap from a group of DataFrames (for example saved in a dictionary), either as gif or a movie.
For example, say I have the following collection of DFs. I can display all of them one after the other. But I would like to have them all being shown in the same figure in the same way as a GIF is shown (a loop of the heatmaps).
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
dataframe_collection = {}
for i in range(5):
    dataframe_collection[i] = pd.DataFrame(np.random.random((5,5)))
    # Here within the same loop just for brevity
    sns.heatmap(dataframe_collection[i])
    plt.show()

The simplest way is to first create separate PNG images, and then use a tool such as ImageMagick to convert them to an animated GIF.
Example to create the PNGs:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
dataframe_collection = {}
for i in range(5):
    dataframe_collection[i] = pd.DataFrame(np.random.random((5,5)))
    plt.figure()  # start from a fresh figure so heatmaps and colorbars do not pile up
    #plt.pcolor(dataframe_collection[i])
    sns.heatmap(dataframe_collection[i])
    plt.gca().set_ylim(0, len(dataframe_collection[i])) #avoiding problem with axes
    plt.axis('off')
    plt.tight_layout()
    plt.savefig(f'dataframe_{i}.png')
    plt.close()
After installing ImageMagick the following shell command creates a gif. If the defaults are not satisfying, use the docs to explore the many options.
convert.exe -delay 20 -loop 0 dataframe_*.png dataframes.gif
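If ImageMagick is not available, the same assembly step can probably be done from Python with Pillow. The sketch below is only an illustration under assumptions: it writes five random grayscale frames into a temporary directory as stand-ins for the saved heatmaps (the names frame_dir and gif_path are made up here), then joins them with Image.save. A duration of 200 ms per frame roughly matches -delay 20, and loop=0 repeats forever.

```python
import glob
import os
import tempfile

import numpy as np
from PIL import Image

# Stand-in frames: random grayscale images instead of the real heatmap PNGs.
frame_dir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
for i in range(5):
    pixels = (rng.random((50, 50)) * 255).astype(np.uint8)
    Image.fromarray(pixels).save(os.path.join(frame_dir, f"dataframe_{i}.png"))

# Lexicographic sorting is enough for single-digit indices;
# zero-pad the file names if there are ten or more frames.
paths = sorted(glob.glob(os.path.join(frame_dir, "dataframe_*.png")))
frames = [Image.open(path) for path in paths]

gif_path = os.path.join(frame_dir, "dataframes.gif")
frames[0].save(
    gif_path,
    save_all=True,             # write every frame, not just the first one
    append_images=frames[1:],  # remaining frames in order
    duration=200,              # milliseconds per frame
    loop=0,                    # 0 means loop forever
)
```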
See also this post about creating animations and an animated gif inside matplotlib.
Note that Seaborn's heatmap also has some features such as sns.heatmap(dataframe_collection[i], annot=True).
If you're unable to use ImageMagick, you could show a video by quickly displaying single png files, simulating a video.
This and this post contain more explanations and example code. Especially the second part of this answer looks promising.
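For the "inside matplotlib" route mentioned above, a minimal sketch could look like the following. It is not the code from the linked posts: it swaps sns.heatmap for plain ax.imshow so that a single image artist can be updated per frame, assumes a fixed colour scale of 0 to 1, and uses the Agg backend only so that it runs without a display. The names update and output_path are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to view interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.animation import FuncAnimation, PillowWriter

rng = np.random.default_rng(0)
dataframe_collection = {i: pd.DataFrame(rng.random((5, 5))) for i in range(5)}

fig, ax = plt.subplots()
# Fixed colour limits keep the frames comparable with each other.
image = ax.imshow(dataframe_collection[0].to_numpy(), vmin=0, vmax=1, cmap="viridis")
fig.colorbar(image, ax=ax)


def update(key):
    """Show the DataFrame stored under `key` in the existing image artist."""
    image.set_data(dataframe_collection[key].to_numpy())
    ax.set_title(f"frame {key}")
    return (image,)


animation = FuncAnimation(fig, update, frames=list(dataframe_collection), interval=200)

output_path = "dataframes.gif"
animation.save(output_path, writer=PillowWriter(fps=5))
```

Updating one artist with set_data avoids redrawing the colorbar for every frame, which is what tends to go wrong when sns.heatmap is called repeatedly on the same axes.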

Related

Using missing_kwds with geopandas changes the shape of the displayed map

I'm using Geopandas (0.11.1) to plot data on maps. I'm facing an issue with missing_kwds. As some of my values are undefined, I want them to be colored in a specific way. I do that using the missing_kwds option of the plot method.
However, when using it, the shape of the map changes slightly, which is jarring when switching quickly from one to the other.
Here is an example.
A map without using missing_kwds :
import geopandas
import matplotlib.pyplot as plt
df = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
df.plot()
plt.savefig('world1.png')
A map using missing_kwds :
import geopandas
import matplotlib.pyplot as plt
import numpy as np
df = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
df.loc[df.name=="China", 'pop_est'] = np.nan
df.plot(column="pop_est", missing_kwds=dict(color="lightgray"))
plt.savefig('world2.png')
Those are the two resulting maps.
world1.png:
world2.png:
In case the difference isn't clear, here is a GIF that illustrates the shape changes.
Does anyone have an idea how I could solve this issue?
Add plt.gca().set_aspect('equal') after df.plot().

display images inside a loop by overwriting the existing plot/figure in python

I have hundreds of thousands of images that I have to fetch from URLs, look at, tag, and then save under their respective category, spam or non-spam. On top of that, I'll be working with Google, which makes it impossible. Instead of opening, analysing, renaming and saving each image by hand, my idea is to fetch the image from its URL, display it inside a loop, type a single word, and have my function save the image to the matching directory based on that input.
I tried doing
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Image
fg = plt.figure()
for i in range(5):
    plt.imshow(np.random.rand(50,50))
    plt.show()
    x = input()
    print(x)
but instead of overwriting the existing frame, it plots a new figure each time. I have even used a 1,1 subplot inside the loop, but that does not work either. IPython's display method does not show anything inside a loop. Could somebody please help me with this problem?
You can make use of matplotlib's interactive mode by invoking plt.ion(). An example:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
fig, ax = plt.subplots()
plt.ion()
plt.show()
for i in range(5):
    ax.imshow(np.random.rand(50,50)) # plot the figure
    plt.gcf().canvas.draw()
    yorn = input("Press 1 to continue, 0 to break")
    if yorn == "0":  # input() returns a string, so compare against "0"
        break
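A variation on the loop above, sketched under assumptions: instead of calling imshow on every pass (which stacks a new image on the axes each time), it creates one image artist and overwrites its pixels with set_data. The function name review_images and the ask parameter are hypothetical; ask defaults to input and is only there so the prompt can be scripted. The Agg backend is selected just to make the sketch runnable without a display; in a notebook an interactive backend would be used instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use an interactive backend in practice
import matplotlib.pyplot as plt
import numpy as np


def review_images(images, ask=input):
    """Show every image in the same Axes and return the label typed for each."""
    fig, ax = plt.subplots()
    artist = ax.imshow(images[0], vmin=0, vmax=1)
    labels = []
    for index, pixels in enumerate(images):
        artist.set_data(pixels)  # overwrite the pixels instead of adding a new image
        ax.set_title(f"image {index}")
        fig.canvas.draw()
        fig.canvas.flush_events()
        label = ask("Label (empty to stop): ")
        if label == "":
            break
        labels.append(label)
    plt.close(fig)
    return labels


# Scripted answers stand in for typing at the prompt.
images = [np.random.rand(50, 50) for _ in range(3)]
scripted = iter(["spam", "ham", "spam"])
labels = review_images(images, ask=lambda prompt: next(scripted))
print(labels)
```

The returned labels could then drive whatever saving logic is needed, for example choosing a target directory per label.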

box plot not appearing in Google Colab

Trying to create a simple Box Plot using Google Colab for my Intro Python class. It is not appearing as I would like it. You can see my code and output below. I read in a file on NBA statistics, and my box plot would be based on a variable called "SHOT_CLOCK".
So far what I have:
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv('file path')
plt.boxplot(df['SHOT_CLOCK'], vert=False)
plt.title('Box Plot for SHOT_CLOCK')
plt.xlabel('Shot Clock')
plt.show()
Output:
Edit
In your example you are passing a Series object. Try it this way:
plt.figure()
plt.title('Box Plot for SHOT_CLOCK')
plt.xlabel('Shot Clock')
df.boxplot(column='SHOT_CLOCK')
Once you add the following lines to your code, it will work:
import matplotlib.pyplot as plt
plt.style.use('classic')
%matplotlib inline
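One possible cause that the question does not confirm is missing values in the column: plt.boxplot does not drop NaN, so a column containing NaN yields an empty box, whereas DataFrame.boxplot drops them first. The sketch below assumes that situation, using made-up numbers in place of the NBA file, and removes the NaN values before plotting. The Agg backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Colab
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up values standing in for the real SHOT_CLOCK column.
df = pd.DataFrame({"SHOT_CLOCK": [1.0, 2.0, 3.0, np.nan, 4.0, 5.0]})

clean = df["SHOT_CLOCK"].dropna()  # boxplot statistics are NaN if any value is NaN

fig, ax = plt.subplots()
result = ax.boxplot(clean.to_numpy(), vert=False)
ax.set_title("Box Plot for SHOT_CLOCK")
ax.set_xlabel("Shot Clock")
```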

how to change each histograph item color in python?

The Korean text in the pictures is not important; sorry for showing non-English characters.
Environment: Jupyter notebook
For this DataFrame (read from CSV files), I want to make a bar graph with a specific color for each item.
So I wrote code like this:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import font_manager, rc
font_name =font_manager.FontProperties(fname="c:/Windows/Fonts/malgun.ttf").get_name()
rc('font', family=font_name)
from matplotlib import colors as mcolors
colors=dict(mcolors.BASE_COLORS,**mcolors.CSS4_COLORS)
data = pd.read_csv('subway.csv')
subwayPassengerPerLine.plot.bar(color=['tab:blue','tab:green','tab:orange','tab:cyan','tab:purple','tab:brown','tab:green','tab:pink','tab:gold','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black','tab:black'])
I want it to look like this one.
But my code above doesn't change the colors.
How can I change the colors in a bar graph like in the second image? Thanks.
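I believe you don't need the tab: prefix (tab:black, etc.). Note that tab:gold and tab:black are not among the ten colors in matplotlib's Tableau palette, so they are not valid color names.
Just use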
subwayPassengerPerLine.plot.bar(y = 'sum',color=['blue','green','orange','cyan','purple','brown','green','pink','gold','black','black','black','black','black','black','black','black','black','black','black','black','black','black','black','black'])
This can also help if you want to automate your plot color.
How to pick a new color for each plotted line within a figure in matplotlib?
Doc reference
https://python-graph-gallery.com/3-control-color-of-barplots/
Edited:
Missed the y = 'sum' field.
If you want to remove the useless legend, add this line too:
plt.gca().get_legend().remove()  # the legend belongs to the Axes, not to the DataFrame
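As a rough, self-contained illustration of per-bar colors, the sketch below calls matplotlib's ax.bar directly rather than DataFrame.plot.bar, which sidesteps any question of how pandas maps a color list onto columns. The DataFrame contents, the name passengers and the fallback rule are assumptions made for the example; is_color_like is used to spot names that matplotlib does not recognise.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in Jupyter
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import is_color_like

# Placeholder data standing in for the subway CSV.
passengers = pd.DataFrame(
    {"sum": [120, 95, 60, 45, 30]},
    index=["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"],
)

requested = ["tab:blue", "tab:green", "tab:orange", "tab:gold", "tab:black"]
invalid = [name for name in requested if not is_color_like(name)]

# Fall back to the plain name when the "tab:" form does not exist.
bar_colors = [
    name if is_color_like(name) else name.split(":", 1)[1] for name in requested
]

fig, ax = plt.subplots()
bars = ax.bar(passengers.index, passengers["sum"], color=bar_colors)  # one color per bar
ax.set_ylabel("passengers")
```

Since ax.bar adds no legend unless asked, there is nothing to remove afterwards.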

Detailed date in cursor pos on pyplot charts

Let's say there's a time series that I want to plot in matplotlib:
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
s = pd.Series(np.random.rand(1, len(dates))[0], index=dates)
The GUI backends in matplotlib have this nice feature that they show the cursor coordinates in the window. When I plot pandas series using its plot() method like this:
fig = plt.figure()
s.plot()
fig.show()
the cursor's x coords are shown in full yyyy-mm-dd at the bottom of the window as you can see on pic 1.
However when I plot the same series s with pyplot:
fig = plt.figure()
plt.plot(s.index, s.values)
fig.show()
full dates are only shown when I zoom in and in the default view I can only see Mon-yyyy (see pic 2) and I would see just the year if the series were longer.
In my project there are functions for drawing complex, multi-series graphs from time series data using plt.plot(), so when I view the results in GUI I only see the full dates in the close-ups. I'm using ipython3 v. 4.0 and I'm mostly working with the MacOSX backend, but I tried TK, Qt and GTK backends on Linux with no difference in the behavior.
So far I've got 2 ideas on how to get the full dates displayed in GUI at any zoom level:
rewrite plt.plot() to pd.Series.plot()
use canvas event handler to get the x-coord from the cursor pos and print it somewhere
However before I attempt any of the above I need to know for sure if there is a better quicker way to get the full dates printed in the graph window. I guess there is, because pandas is using it, but I couldn't find it in pyplot docs or examples or elsewhere online and it's none of these 2 calls:
ax.xaxis_date()
fig.autofmt_xdate()
Somebody please advise.
Hooks for formatting the info are Axes.format_coord or Axes.fmt_xdata. Standard formatters are defined in matplotlib.dates (plus some additions from pandas). A basic solution could be:
import matplotlib.dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
dates = pd.date_range(start='2011-01-01', end='2012-01-01')
series = pd.Series(np.random.rand(len(dates)), index=dates)
plt.plot(series.index, series.values)
plt.gca().fmt_xdata = matplotlib.dates.DateFormatter('%Y-%m-%d')
plt.show()
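The other hook mentioned above, Axes.format_coord, controls the whole status-bar string rather than only the x part. A minimal sketch, assuming the same random series and a hypothetical helper name format_coord, could be the following. The Agg backend is selected only so the example runs without a window; with a GUI backend the text shows up in the toolbar.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use a GUI backend to see the cursor text
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range(start="2011-01-01", end="2012-01-01")
series = pd.Series(np.random.rand(len(dates)), index=dates)

fig, ax = plt.subplots()
ax.plot(series.index, series.values)


def format_coord(x, y):
    """Build the cursor text from data coordinates; x is a matplotlib date number."""
    return f"{mdates.num2date(x):%Y-%m-%d}  y={y:.3f}"


ax.format_coord = format_coord
```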
