Seaborn Heatmap - Remove Excess Repeated Xticks - python

I am working with my gene expression dataset and created this heatmap:
On the X axis of this heatmap, I have my different cell types.
I want to have only one xtick in the center of each cell type and remove all of the other excess xticks.
Additionally, I want the last line of this heatmap to have different distinct colors for each cell type.
I searched for days for possible solutions and tried several pandas and seaborn functions, but couldn't solve this problem.
It is supposed to look more like this:
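One possible approach, sketched under assumptions: the columns are already sorted so that each cell type forms one contiguous block, and the data, the cell type names and the helper `group_centers` below are all made up for illustration. seaborn draws column i of a heatmap between x = i and x = i + 1, so the center of a block can be computed from where it starts and how wide it is. The distinct colors for the last row are not covered here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
from itertools import groupby

import numpy as np
import pandas as pd
import seaborn as sns


def group_centers(labels):
    """Return (centers, names) for each run of identical consecutive labels.

    A run starting at column `start` and spanning `n` columns is centered
    at start + n / 2 in heatmap coordinates.
    """
    centers, names = [], []
    start = 0
    for name, run in groupby(labels):
        n = len(list(run))
        centers.append(start + n / 2)
        names.append(name)
        start += n
    return centers, names


# Hypothetical data: 5 genes x 9 cells, columns sorted by cell type.
cell_types = ["B cell"] * 3 + ["T cell"] * 4 + ["NK cell"] * 2
genes = [f"gene_{i}" for i in range(5)]
rng = np.random.default_rng(0)
expr = pd.DataFrame(rng.normal(size=(5, len(cell_types))), index=genes)

# Switch off the per-column labels, then place one tick per cell type.
ax = sns.heatmap(expr.to_numpy(), xticklabels=False, yticklabels=genes)
centers, names = group_centers(cell_types)
ax.set_xticks(centers)
ax.set_xticklabels(names, rotation=0)
```

If the columns are not sorted by cell type, sort them first; otherwise every separate run of a cell type gets its own tick.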

Related

Python: make legend figure (legend only, no plot), based on a pandas dataframe

I have a pandas Data Frame with HEX colour codes in one column, and qualitative variables in the other column. It is like this:
Colour Phyla
#db5f57 Firmicutes_A
#dba157 Bacteroidota
#d3db57 Verrucomicrobiota
#57db5f Proteobacteria
#5791db Cyanobacteria
#a157db Synergistota
I am not interested in creating a plot with this colour map (no barplots, scatterplots or the like). I simply want to produce an image (PNG or JPEG) that looks like a legend: the names of the variable next to coloured squares, where each square has the colour given by the HEX code in the DataFrame. Something similar to the following:
I have experience with producing images with Seaborn and Matplotlib, which only return legends when we plot something.
How can I produce the legend from the dataframe?
Thanks in advance!
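A minimal sketch of one way to do this, assuming the DataFrame has exactly the two columns shown above: matplotlib legends accept hand-made "proxy" handles, so nothing has to be plotted. The output file name `legend.png` and the figure size are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

df = pd.DataFrame({
    "Colour": ["#db5f57", "#dba157", "#d3db57", "#57db5f", "#5791db", "#a157db"],
    "Phyla": ["Firmicutes_A", "Bacteroidota", "Verrucomicrobiota",
              "Proteobacteria", "Cyanobacteria", "Synergistota"],
})

# One coloured square per row; the Patch objects are never drawn on the axes.
handles = [Patch(facecolor=colour, edgecolor="none", label=name)
           for colour, name in zip(df["Colour"], df["Phyla"])]

fig, ax = plt.subplots(figsize=(3, 2))
ax.axis("off")  # hide the empty axes so only the legend remains
legend = ax.legend(handles=handles, loc="center", frameon=False)

# bbox_inches="tight" crops the image to the legend itself.
fig.savefig("legend.png", dpi=200, bbox_inches="tight")
```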

How do I share labels across collections in matplotlib? Or how do I produce the legend cleanly?

I am trying to use matplotlib to show some data in a clear way. My current goal is to label the data using two methods: color and shape. The color will be used to represent the data set these specific points come from, while the shape is used to represent whether that example is in category one or two. To visualize this, here is a simple example I drew in PowerPoint:
The reason for doing this instead of simply creating a legend with each specific data set and category stated is I am plotting upwards of 10 data sets, so the legend would remain significantly cleaner and easier to read if color was used for the data sets and shape used for general category (thus the legend would show 10 colors and two shapes, as opposed to 20 different color-shape combinations).
I am currently able to use matplotlib to set the label of the individual data sets by iterating through them and plotting each individually as follows:
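import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # plt.figure() returns a Figure, which has no scatter()
for data in datasets:
    # x, y and label are the keys used in each data set
    ax.scatter(data[x], data[y], label=data[label])
ax.legend()
plt.show()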
However, when I attempt to plot the individual shapes and colors and assign them the same label, I am left with plots that do not recognize the two scatter collections as having the same label.
Any suggestions or hints would be greatly appreciated. Thank you.
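One common way to get such a legend is to leave the scatter calls unlabeled and build the legend from proxy handles: one per color (data set) and one per shape (category). The sketch below assumes each data set carries a 0/1 category per point; the data, the names and the marker choices are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.lines import Line2D

rng = np.random.default_rng(0)
# Hypothetical data: name -> (x, y, category), category being 0 or 1 per point.
datasets = {
    f"set {i}": (rng.random(20), rng.random(20), rng.integers(0, 2, 20))
    for i in range(3)
}
markers = {0: "o", 1: "^"}
category_names = {0: "category one", 1: "category two"}

fig, ax = plt.subplots()
color_handles = []
for i, (name, (x, y, cat)) in enumerate(datasets.items()):
    color = f"C{i}"  # i-th color of the default color cycle
    for c, marker in markers.items():
        sel = cat == c
        ax.scatter(x[sel], y[sel], color=color, marker=marker)  # no label here
    # Proxy handle: a colored square that stands for the whole data set.
    color_handles.append(
        Line2D([], [], linestyle="none", marker="s", color=color, label=name))

# Proxy handles for the shapes, drawn in a neutral color.
shape_handles = [
    Line2D([], [], linestyle="none", marker=m, color="gray", label=category_names[c])
    for c, m in markers.items()
]
legend = ax.legend(handles=color_handles + shape_handles)
```

With 10 data sets this gives 12 legend entries (10 colors and 2 shapes) instead of 20 color-shape combinations.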

Python Seaborn FacetGrid different y-axis

I'm using sns.FacetGrid to plot 10 subplots. I'd like to flex the y-axis to be different for each subplot.
At the moment it automatically uses the same for all subplots. Would it be possible to customize it to make it more specific for each subplot?
See the documentation for FacetGrid:
share{x,y} : bool, 'col', or 'row', optional. If True, the facets will share y axes across columns and/or x axes across rows.
Be advised that this also breaks alignment across columns and will most likely not produce the results you intended. One Y axis will be displayed, which will be only valid for the leftmost plot.

Skip weekends on stock charts with matplotlib

This is not a duplicate, because the existing answers to similar questions don't describe exactly what I need.
Matplotlib has great formatters inside and I love to use them:
ax.xaxis.set_major_locator(matplotlib.dates.MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
They let me plot such stock market charts:
This is what I need, but it has one issue: weekends. They are present on the x axis and make my chart a little ugly.
Other questions about this issue advise creating a custom formatter and show examples of such formatters. But none of them produces the pretty formatting that matplotlib does:
May19, Jun19, Jul19...
I mean this line of code:
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
My question is: please help me format the x axis the way matplotlib does (May19, Jun19, Jul19...) without showing the weekends, when the stock market is closed.
What you could almost always do is something similar to what Nic Wanavit suggested.
Manually set your labels, depending on what you need on your axis.
In this case the plot looks a bit ugly because your data contains time spans without actual data (the weekends), so pyplot simply connects the surrounding points across the corresponding gap on the x-axis.
What you can do instead is plot your data at equal distances, which is correct if the data is daily. Otherwise, consider interpolating it, e.g. with pandas' built-in interpolation.
To keep pyplot from automatically using the date index, I had to do this:
df['plotidx'] = range(len(df['close']))
Here all the closing values for the stock are stored in a column named 'close'.
You plot this correspondingly.
Then you can obtain all the ticks created via
labels = [item.get_text() for item in ax.get_xticklabels()]
Adjust them as desired with
labels[i] = string_for_the_label_no_i
Then get them back on the graph using
ax.xaxis.set_ticklabels(labels)
You then need to redraw the plot for the change to show. Also keep in mind that, as the documentation also notes, the labels may end up in strange locations when the plot is resized a lot.
It is a bit of a workaround, but it worked fine for me, because it feels more natural to plot the data points at equal distances next to each other than to make up data for the weekends.
Greets
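The workaround described above could be sketched as follows. This is only an illustration under assumptions: the prices are random, the index contains business days only, and the ticks are placed by hand on the first trading day of each month rather than by matplotlib's date locators. The data is plotted against integer positions, so weekends take up no space, and the labels are formatted with '%b%y' as in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: business days only, so weekends are absent from the index.
dates = pd.bdate_range("2019-05-01", "2019-08-30")
rng = np.random.default_rng(0)
df = pd.DataFrame({"close": 100 + rng.normal(size=len(dates)).cumsum()},
                  index=dates)

# Plot against 0, 1, 2, ... instead of the dates: equally spaced points.
positions = np.arange(len(df))
fig, ax = plt.subplots()
ax.plot(positions, df["close"].to_numpy())

# A tick goes wherever the month changes (plus position 0 for the first month).
month_ids = np.asarray(df.index.year * 12 + df.index.month)
tick_pos = np.concatenate(([0], np.flatnonzero(np.diff(month_ids)) + 1))
tick_labels = [df.index[i].strftime("%b%y") for i in tick_pos]

# Fix the tick positions first, then attach the date labels to them.
ax.set_xticks(tick_pos)
ax.set_xticklabels(tick_labels)
```

Since the tick positions are fixed together with the labels, the labels stay attached to the right data points when the figure is resized.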
To set the x ticks, assuming the dates are stored in the DataFrame column df['dates']:
ax.xaxis.set_ticks(df['dates'])

Python Pandas Dataframe Plot: setting axis and legend font size and xtick spacing

In Python 2.7, I am attempting to plot (line plot) the columns of a pandas dataframe as subplots in one figure, with the columns representing a time series of prices for different products. This can be easily done using the df.plot command as shown below. However, I am finding it impossible to find a way to control axis and legend font size and xtick spacing. Is there a good alternative to get this done?
df.plot(subplots=True, layout=(numrows,numcols),sharex=True)
plt.show()
The key here is that the script takes in a dataframe that on each run will have a different number of columns, and so it isn't practical to attempt to specify these settings line by line for each plot using subplots, as the number of columns will be different on each run of the script.
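One option that does not depend on the number of columns: `df.plot(subplots=True, ...)` returns the axes it created, so the settings can be applied in a loop. The sketch below assumes Python 3 and a recent pandas/matplotlib; the data, the column names and the font sizes are placeholders. The `fontsize` argument of `df.plot` sets the tick label size, and `MaxNLocator` limits how many x ticks are drawn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import math

import numpy as np
import pandas as pd
from matplotlib.ticker import MaxNLocator

rng = np.random.default_rng(0)
# Hypothetical price series; the number of columns may differ on each run.
df = pd.DataFrame(rng.normal(size=(100, 5)).cumsum(axis=0),
                  columns=[f"product_{i}" for i in range(5)])

# Derive the grid from the number of columns.
numcols = 2
numrows = math.ceil(len(df.columns) / numcols)

axes = df.plot(subplots=True, layout=(numrows, numcols), sharex=True,
               fontsize=8, figsize=(8, 6))

for ax in np.ravel(axes):
    if not ax.lines:  # unused grid cell (5 series in a 3 x 2 grid)
        continue
    ax.legend(fontsize=7, loc="upper left")           # legend font size
    ax.xaxis.set_major_locator(MaxNLocator(nbins=5))  # xtick spacing
```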
