Creating heatmap of US locations changing over time with python - python

The data I have is a set of occurrences of a general event in various US states for a given set of years. All I care about illustrating is 1) the year of the occurrence(s), and 2) the location (state) of the occurrence(s).
I am using state capital latitude and longitudes as the location of each point plotted, and want each point on the map to have a color corresponding to the number of occurrences in that state, in that year. I would like to create a map subplot; essentially create one of these maps for each year of available data and plot all those maps together to visualize the change in location of events over time.
So far, the closest thing I have found I can do is create a scattergeo plot using python and plotly (i.e. the figure at the bottom of this: https://plot.ly/python/map-subplots-and-small-multiples/ ), which gives me the desired map subplot over time, but I cannot figure out how to make each of the submaps into a heat map, rather than a simple scatterplot. Any ideas would be greatly appreciated! As you can probably tell I'm a python noob, but I hope I was able to make my problem clear!
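One possible direction is a choropleth, where each state is shaded by its count, instead of coloured points at the state capitals. The following is only a minimal sketch under assumptions: the `events` list is made-up sample data, states are given as two-letter abbreviations rather than capital coordinates, and `make_figure` assumes pandas and plotly are installed. The helper names are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: one (year, state abbreviation) tuple per event.
events = [
    (2015, "CA"), (2015, "CA"), (2015, "TX"),
    (2016, "CA"), (2016, "NY"), (2016, "NY"), (2016, "NY"),
    (2017, "TX"),
]

def count_by_year_and_state(events):
    """Return sorted rows of {'year', 'state', 'count'} for plotting."""
    counts = Counter(events)
    return [
        {"year": year, "state": state, "count": n}
        for (year, state), n in sorted(counts.items())
    ]

def make_figure(rows):
    """Build one US choropleth panel per year (needs pandas and plotly)."""
    import pandas as pd
    import plotly.express as px

    df = pd.DataFrame(rows)
    return px.choropleth(
        df,
        locations="state",
        locationmode="USA-states",  # treat locations as state abbreviations
        color="count",              # shade each state by its event count
        facet_col="year",           # one map per year
        facet_col_wrap=3,
        scope="usa",
    )

rows = count_by_year_and_state(events)
# To display: fig = make_figure(rows); fig.show()
```

Because all panels come from one call, they share a single colour scale, which makes counts comparable across years.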

Related

Identifying Plot Name or Visualization Implementation

I'm working on a dataset of SMS records [datetime_entry, sms_sent] and I was looking to copy a really effective trend visual from a well-cited electricity demand study. Does anyone know the name of this plot, or the implementation of something similar in Python (as I'm not sure this was done in Python)?
I know how to subplot the 4 charts after splitting the data by quarter, I'm just stumped on the plot type and stylization.
This is what matplotlib calls an eventplot.
Essentially each vertical line represents an occurrence of a MWh demand during that specific hour. So each row in the plot should have as many vertical lines as there are days in that quarter.
While it works in this plot for these data, relying on the combination of alpha level and data density can be unreliable as the data change, because the number of overlapping points is not readily visible. So you can also create a similar visualization using hist2d, where you manually specify your bins.
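The two approaches from this answer could look roughly like the sketch below. The `records` data is synthetic, the function names and bin edges are assumptions chosen for illustration, and `plot_demand` assumes matplotlib is installed.

```python
import random

# Hypothetical sample: (hour_of_day, demand_in_MWh) for 90 days of one quarter.
random.seed(0)
records = [
    (hour, 50 + 20 * (8 <= hour <= 20) + random.gauss(0, 5))
    for _day in range(90)
    for hour in range(24)
]

def group_by_hour(records):
    """Return 24 lists; list h holds every demand value seen at hour h."""
    rows = [[] for _ in range(24)]
    for hour, value in records:
        rows[hour].append(value)
    return rows

def plot_demand(records):
    """Draw an eventplot and a hist2d of the same data (needs matplotlib)."""
    import matplotlib.pyplot as plt

    rows = group_by_hour(records)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

    # One row of thin vertical ticks per hour; alpha hints at overlap.
    ax1.eventplot(rows, lineoffsets=list(range(24)), linelengths=0.8, alpha=0.3)
    ax1.set_xlabel("Demand (MWh)")
    ax1.set_ylabel("Hour of day")

    # Same data as explicit counts per (demand bin, hour) cell.
    hours = [h for h, _ in records]
    values = [v for _, v in records]
    demand_bins = list(range(30, 101, 5))     # manually chosen bin edges
    hour_bins = [h - 0.5 for h in range(25)]  # one bin centred on each hour
    ax2.hist2d(values, hours, bins=[demand_bins, hour_bins])
    ax2.set_xlabel("Demand (MWh)")
    return fig

rows = group_by_hour(records)
# To display: plot_demand(records) and then matplotlib.pyplot.show()
```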

How can I plot multiple y variables stacked in a bubble plot?

I am trying to visualise some data for a construction project. Each week, for a few years, different vehicles will be accessing the site. I need to produce a graph showing when each vehicle will be accessing the site. The way I'd like to do this is to have a single graph that shows all vehicles, so that weeks without any vehicles are empty and can be clearly identified, whereas busy weeks will have lots of data visible at the same time.
I want to have the y axis be vehicle type (let's say cars, vans and trucks for now) and x axis be time (weeks of the year). I want to use bubbles to display the number of each vehicle, so if there is a dot at the coordinates (Van, Week 1) you will know that vans will be used during week 1, and the bubble size will tell you how many.
My question is essentially - what is this graph called? I want it to be called something like a "discrete stacked bubble plot" or something but that doesn't exist. Please see my example below. Any ideas on how to do this? Or am I approaching the problem the wrong way?
Example of what I want it to look like
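A layout like the one described can probably be drawn as an ordinary scatter plot with a categorical y axis and marker sizes driven by the counts. The sketch below is illustrative only: the `schedule` data, the helper names and the `scale` factor are all made up, and `plot_bubbles` assumes matplotlib is installed.

```python
# Hypothetical schedule: vehicle type -> {week number: vehicle count}.
schedule = {
    "Cars":   {1: 3, 2: 5, 5: 1},
    "Vans":   {1: 2, 3: 4},
    "Trucks": {2: 1, 3: 6, 5: 2},
}

def to_bubbles(schedule):
    """Flatten the schedule into parallel lists, skipping zero counts."""
    weeks, kinds, counts = [], [], []
    for kind, per_week in schedule.items():
        for week, n in sorted(per_week.items()):
            if n > 0:
                weeks.append(week)
                kinds.append(kind)
                counts.append(n)
    return weeks, kinds, counts

def plot_bubbles(schedule, scale=60):
    """Scatter plot with bubble area proportional to count (needs matplotlib)."""
    import matplotlib.pyplot as plt

    weeks, kinds, counts = to_bubbles(schedule)
    fig, ax = plt.subplots()
    # String y values are treated as categories by matplotlib.
    ax.scatter(weeks, kinds, s=[n * scale for n in counts], alpha=0.6)
    ax.set_xlabel("Week")
    ax.set_ylabel("Vehicle type")
    ax.margins(0.2)  # leave room so large bubbles are not clipped
    return fig

weeks, kinds, counts = to_bubbles(schedule)
# To display: plot_bubbles(schedule) and then matplotlib.pyplot.show()
```

Weeks with no vehicles simply get no marker, so empty weeks stay visibly empty.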

How to offset (or unstack) data points within the same date in a Waterfall plotly chart?

After extensive research, the closest I found to my issue was this question.
I'm building a waterfall chart to illustrate the cashflow of a certain company. That being said, I have several cashflow entries on the same date.
Currently I'm using waterfallmode='overlay' but I tried group as well. I have trouble reading the chart because the positive and negative entries overlap and it all gets a bit confusing.
What I want exactly is: to unstack (or offset) the entries within a day, so that they sit side by side, parallel to each other, as opposed to overlapped or stacked.
The closest settings I found to deal with it, and why they don't work are:
waterfallgap and waterfallgroupgap (Creating a FigureWidget object)
offset and offsetgroup
None of them work because they offset the entire group (as opposed to each entry).
I guess one solution would be to force a minute's difference in each of the Date entries. But I'm sure there is a cleaner solution
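The workaround mentioned in the last sentence, nudging entries that share a date apart in time, could be sketched as follows. This is not a built-in plotly option; `spread_same_day`, the two-hour `step` and the sample entries are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def spread_same_day(dates, step=timedelta(hours=2)):
    """Shift the k-th entry on each calendar day by k * step from midnight."""
    seen = defaultdict(int)  # entries already placed on each day
    shifted = []
    for d in dates:
        day = d.date()
        midnight = datetime.combine(day, datetime.min.time())
        shifted.append(midnight + seen[day] * step)
        seen[day] += 1
    return shifted

# Hypothetical cashflow entries: three on 1 March, one on 2 March.
dates = [datetime(2020, 3, 1)] * 3 + [datetime(2020, 3, 2)]
amounts = [1000, -250, -400, 300]
x = spread_same_day(dates)
# x can then be used as the x values of the waterfall trace,
# e.g. go.Waterfall(x=x, y=amounts) with plotly.graph_objects.
```

A larger `step` spreads the bars further apart within the day; it should stay small enough that the last entry does not spill into the next date.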

Skip weekends on stock charts with matplotlib

This is not a duplicate, because existing answers to similar questions don't describe exactly what I need.
Matplotlib has great formatters inside and I love to use them:
ax.xaxis.set_major_locator(matplotlib.dates.MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
They let me plot such stock market charts:
This is what I need, but it has 1 issue: weekends. They are present on x axis and make my chart a little ugly.
Other questions about this issue advise creating a custom formatter and show examples of such formatters. But none of them produce pretty formatting like matplotlib does:
May19, Jun19, Jul19...
I mean this line of code:
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
My question is: please help me format the x axis like matplotlib does (May19, Jun19, Jul19...) without showing weekends, when the stock market is closed.
What you could almost always do is something similar to what Nic Wanavit suggested.
Manually set your labels, depending on what you need on your axis.
Especially in this case the plot looks a bit ugly because your data has timespans with no actual data (the weekends in this case), so pyplot simply connects the surrounding points across the corresponding length of the x-axis.
What you can do then is plot your data equally spaced, which is correct if the data is daily; otherwise consider interpolating it, e.g. with pandas' built-in interpolation.
To keep pyplot from automatically using the index I had to do this:
df['plotidx'] = range(len(df['close']))
Here all the closing values for the stock are stored in a column named 'close', obviously.
You plot this correspondingly.
Then you can obtain all the ticks created via
labels = [item.get_text() for item in ax.get_xticklabels()]
Adjust them as desired with
labels[i] = string_for_the_label_no_i
Then get them back on the graph using
ax.xaxis.set_ticklabels(labels)
You need to somewhat "update" the plot then. Also keep in mind that, as the documentation also says, resizing a lot could leave the labels in strange locations.
It is a kind of workaround, but it worked fine for me because it feels natural to plot the data equally spaced next to each other rather than making up some data for the weekends.
Greets
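The idea in this answer, plotting against an integer position and then labelling the ticks by hand, might be sketched as below. The date range and helper names are assumptions, holidays are ignored, and `plot_close` assumes matplotlib is installed.

```python
from datetime import date, timedelta

def trading_days(start, end):
    """All weekdays from start to end inclusive (holidays are ignored here)."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days

def month_ticks(days):
    """Positions and '%b%y' labels for the first trading day of each month."""
    positions, labels = [], []
    for i, d in enumerate(days):
        if i == 0 or d.month != days[i - 1].month:
            positions.append(i)
            labels.append(d.strftime("%b%y"))
    return positions, labels

def plot_close(days, close):
    """Plot prices at equal spacing, labelled by month (needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(range(len(days)), close)  # integer x axis, so no weekend gaps
    positions, labels = month_ticks(days)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    return fig

days = trading_days(date(2019, 5, 1), date(2019, 7, 31))
positions, labels = month_ticks(days)
```

Since the labels are computed from the data rather than read back from the axis, they do not depend on the figure having been drawn first.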
To set the x ticks, assuming that you have the dates in the dataframe column df['dates']:
ax.xaxis.set_ticks(df['dates'])

3D plot in python

I have worked out a table somewhat like the one in the link. The ultimate goal for plotting is to find out if there is a seasonal change pattern for certain products in a state. I have tried to figure out a 3-D plot in python, with x-axis being product name, y-axis being month and z-axis being YR2012 and YR2013 respectively.
And another small question related to this is how could I make python know that the SALESMONTH column contains month type of data rather than plain integers.
Thanks!
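As a rough starting point, the integer months can be mapped to month names for the axis labels, and the per-product values drawn as rows of bars on 3D axes. Everything in this sketch is an assumption: the sample `rows`, the helper names, and the choice to put months along x with one row of bars per product. Only YR2012 is drawn to keep it short, and `plot_3d` assumes matplotlib is installed.

```python
import calendar

# Hypothetical rows: (product, SALESMONTH as integer, YR2012, YR2013).
rows = [
    ("Widget", 1, 120, 130),
    ("Widget", 2, 90, 110),
    ("Gadget", 1, 60, 75),
    ("Gadget", 2, 80, 70),
]

def month_label(month_number):
    """Turn 1..12 into 'Jan'..'Dec' so the axis reads as months."""
    return calendar.month_abbr[month_number]

def plot_3d(rows):
    """One row of YR2012 bars per product, months along x (needs matplotlib)."""
    import matplotlib.pyplot as plt

    products = sorted({r[0] for r in rows})
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    for depth, product in enumerate(products):
        months = [r[1] for r in rows if r[0] == product]
        sales_2012 = [r[2] for r in rows if r[0] == product]
        # zdir="y" places each product's bars at its own depth.
        ax.bar(months, sales_2012, zs=depth, zdir="y", alpha=0.7)
    ax.set_xticks(range(1, 13))
    ax.set_xticklabels([month_label(m) for m in range(1, 13)])
    ax.set_yticks(range(len(products)))
    ax.set_yticklabels(products)
    ax.set_zlabel("YR2012")
    return fig

labels = [month_label(r[1]) for r in rows]
# To display: plot_3d(rows) and then matplotlib.pyplot.show()
```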
