Group Boxplots with multiple dataframes - python

I am trying to create grouped boxplots with a scatter overlay, like the ones in the following links:
matplotlib: Group boxplots
https://cmdlinetips.com/2019/03/how-to-make-grouped-boxplots-in-python-with-seaborn/
how to make a grouped boxplot graph in matplotlib
However, the data I want to use comes in the following format:
5y_spreads
7y_spreads
10y_spreads
(each of the images above comes from a different worksheet in the same workbook)
I need to reshape the data in Python to make it ready for seaborn, and that is the part I find difficult.
It is not structured like the examples in the links. I understand this requires working with dataframes, which I am still learning.
I also need to show the latest value to see where the bonds are trading now, compared to the range.
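The reshaping described above can be sketched roughly as follows. Since the worksheets are only shown as images, the layout is an assumption: each sheet is taken to have a `date` column plus one column per bond, and the names `bond_a`, `bond_b`, `to_long`, `latest_values` and `plot_grouped` are hypothetical. With a real workbook, `pd.read_excel(path, sheet_name=None)` returns a dict of DataFrames keyed by sheet name, which could replace the hand-built `sheets` dict here.

```python
import pandas as pd

# Stand-in for the three worksheets (assumed layout: a date column + one column per bond).
dates = pd.date_range("2024-01-01", periods=4, freq="D")
sheets = {
    "5y_spreads": pd.DataFrame({"date": dates, "bond_a": [50, 52, 55, 53], "bond_b": [80, 78, 79, 82]}),
    "7y_spreads": pd.DataFrame({"date": dates, "bond_a": [60, 61, 66, 64], "bond_b": [90, 88, 91, 95]}),
    "10y_spreads": pd.DataFrame({"date": dates, "bond_a": [70, 73, 75, 72], "bond_b": [99, 97, 98, 101]}),
}


def to_long(sheets):
    """Stack {sheet name: wide DataFrame} into one long DataFrame for seaborn."""
    frames = []
    for tenor, wide in sheets.items():
        long = wide.melt(id_vars="date", var_name="bond", value_name="spread")
        long["tenor"] = tenor  # remember which worksheet each row came from
        frames.append(long)
    return pd.concat(frames, ignore_index=True)


def latest_values(long):
    """Most recent observation for every (bond, tenor) pair."""
    return long.sort_values("date").groupby(["bond", "tenor"], as_index=False).last()


def plot_grouped(long, latest):
    """Grouped boxplot of the history with the latest value drawn on top."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    bonds = sorted(long["bond"].unique())
    tenors = list(long["tenor"].unique())
    ax = sns.boxplot(data=long, x="bond", y="spread", hue="tenor",
                     order=bonds, hue_order=tenors)
    # One black diamond per box, dodged the same way as the boxes.
    sns.stripplot(data=latest, x="bond", y="spread", hue="tenor",
                  order=bonds, hue_order=tenors, dodge=True, jitter=False,
                  palette=["black"] * len(tenors), marker="D", size=7, ax=ax)
    # Keep only the boxplot entries in the legend.
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(handles[:len(tenors)], labels[:len(tenors)], title="tenor")
    plt.show()


long_df = to_long(sheets)
latest = latest_values(long_df)
```

Calling `plot_grouped(long_df, latest)` should then draw one group of boxes per bond, one box per tenor, with a marker at the most recent spread.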

Related

Tibco SPOTFIRE - How to create lists of unique values from data table for creating a new data table

I'm fairly new to Spotfire and I have to build some bar and line charts like these, which I made with Python and matplotlib:
bar chart from python
line chart from python
To build those graphs, I created a set of unique values for the x axis, containing the different sprints of the stories (see Jira and the agile method for more information), and I created 3 lists (begin, planned and ends) gathering the business values for each sprint occurrence. Then I created a pandas dataframe from my 4 new lists and used its columns with matplotlib to draw the graphs (the second graph shows the cumulative sum of the begin and end business values per sprint).
My question is: is it possible to create a list of unique values for the x axis in Spotfire, and how can I create a data table from another data table, as I did for the Python graphs? All I can get so far with Spotfire is this:
bar and line charts from spotfire
I already tried to merge the graphs of the same category in order to get the same result; however, the two x axes (begin and ends) do not have the same number of values and I get errors when I try. If anyone has a solution to this problem, that would be great.
PS: I can't share any data files because I work for a company and some of the data could be confidential. Sorry for any language errors; I'm French.
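For readers trying to picture the Python side of this, the preparation described above might look roughly like the sketch below. It is only an illustration: the column names (`sprint`, `begin`, `planned`, `end`) and the numbers are made up, since the real data was not shared. The point is that one aggregated table per sprint gives both series the same x axis.

```python
import pandas as pd

# Made-up story-level data: one row per story, with its sprint and business values.
stories = pd.DataFrame({
    "sprint": ["S1", "S1", "S2", "S2", "S3"],
    "begin": [5, 3, 8, 0, 2],
    "planned": [5, 3, 8, 1, 2],
    "end": [4, 3, 6, 1, 2],
})

# One row per unique sprint, summing the business values (data for the bar chart).
per_sprint = stories.groupby("sprint", sort=True)[["begin", "planned", "end"]].sum()

# Running totals of begin and end per sprint (data for the line chart).
cumulative = per_sprint[["begin", "end"]].cumsum()

print(per_sprint)
print(cumulative)
```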

Plotting a grouped stacked bar chart

I am trying to create a grouped, stacked bar chart. I was able to do it in Excel, and this image shows what I am trying to create, but I want to do it in Python. I have all the data in a pandas data frame and can create separate stacked bar charts from it, but I cannot get the grouping seen in Excel.
Excel Formatting:
If you can do it easily in Excel, then I strongly suggest doing it in Excel, unless you have other requirements.
There are many libraries you can use to create this type of plot: matplotlib, seaborn, or plotly. The one I use most is plotly. You can see plotly's sample figures here: https://plotly.com/python/
Or you can join the plotly community, where there are many experienced users who might help with the figure. I find there are fewer plotting experts on Stack Overflow than in the plotly community: https://community.plotly.com/
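If matplotlib is an option, one way to get the grouping is to shift each group's bars sideways and stack within each group using `bottom`. The sketch below is a minimal illustration under assumptions: the data, the labels and the helper names `bar_offsets`, `draw_grouped_stacked` and `demo` are all hypothetical, and the input is a plain nested dict rather than the asker's data frame.

```python
# Hypothetical data: {group: {stack segment: [one value per category]}}
categories = ["North", "South", "West"]
data = {
    "2022": {"online": [3, 4, 5], "retail": [2, 2, 1]},
    "2023": {"online": [4, 5, 6], "retail": [1, 3, 2]},
}


def bar_offsets(n_groups, bar_width):
    """Horizontal shifts that centre n_groups adjacent bars on each tick."""
    return [(i - (n_groups - 1) / 2) * bar_width for i in range(n_groups)]


def draw_grouped_stacked(ax, data, categories, bar_width=0.35):
    """Draw one stacked bar per group at every category position."""
    offsets = bar_offsets(len(data), bar_width)
    for offset, (group, stacks) in zip(offsets, data.items()):
        xs = [i + offset for i in range(len(categories))]
        bottoms = [0.0] * len(categories)
        for segment, heights in stacks.items():
            ax.bar(xs, heights, width=bar_width, bottom=bottoms,
                   label=f"{group} / {segment}")
            # The next segment starts where this one ended.
            bottoms = [b + h for b, h in zip(bottoms, heights)]
    ax.set_xticks(list(range(len(categories))))
    ax.set_xticklabels(categories)
    ax.legend()


def demo():
    """Render the example; needs matplotlib installed."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    draw_grouped_stacked(ax, data, categories)
    plt.show()
```

Calling `demo()` draws two stacked bars side by side for each category. With a pandas data frame, the same nested dict could be built from the columns first.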

plotting very large data in python

I am trying to plot a large set of values against time. My dataset spans 46 days and includes data for every second of the day. Since the plots are incomprehensible when plotted directly, I tried to group the data. The groupby function in pandas works fine as long as one needs aggregates or summary statistics. I tried the following command, but it just gives a blob on the plot and does not do what I want.
df1 = df.groupby(pd.Grouper(key='time', freq='7D'))['values']
Is there a way to group the data according to a column and then add it in a new column?
I also tried plots after making time the index, but that also does not help.
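One likely cause is that the command above stops at the groupby object and never aggregates it. A minimal sketch of the two usual follow-ups, on synthetic data: `agg` gives one summary row per bin (small enough to plot), and `transform` writes the group statistic back as a new column on the original rows. The three-day span, the `1D` bin and the column name `daily_mean` are arbitrary choices for illustration; the question's `7D` would work the same way.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: one reading per second for three days.
n = 3 * 24 * 3600
df = pd.DataFrame({
    "time": pd.date_range("2024-01-01", periods=n, freq=pd.Timedelta(seconds=1)),
    "values": np.arange(n, dtype=float),
})

grouped = df.groupby(pd.Grouper(key="time", freq="1D"))["values"]

# One row per day: far fewer points, so a plot of this is readable.
daily = grouped.agg(["mean", "min", "max"])

# Same length as df: each row receives the mean of the day it belongs to.
df["daily_mean"] = grouped.transform("mean")

print(daily)
```

`daily["mean"].plot()` would then draw one point per bin instead of one per second.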

How can I arrange two faceted side-by-side charts horizontally in Altair?

Altair offers a lovely feature for faceting charts with the facet method. For example, the following dataset visualizes nicely:
print(df[['Year', 'Profile', 'Saison', 'Pos']].to_csv())
,Year,Profile,Saison,Pos
0,2017,6.0,Sommer,VL
1,2017,6.0,Winter,VL
13,2017,6.0,Winter,HL
12,2017,6.0,Sommer,HL
18,2017,6.0,Sommer,HR
6,2017,6.0,Sommer,VR
7,2017,6.0,Winter,VR
19,2017,6.0,Winter,HR
14,2018,5.5,Winter,HL
8,2018,5.5,Winter,VR
15,2018,5.5,Sommer,HL
20,2018,4.3,Winter,HR
21,2018,5.0,Sommer,HR
3,2018,5.5,Sommer,VL
2,2018,6.2,Winter,VL
9,2018,4.5,Sommer,VR
17,2019,4.5,Sommer,HL
11,2019,4.2,Sommer,VR
22,2019,3.5,Winter,HR
10,2019,5.28,Winter,VR
5,2019,4.6,Sommer,VL
4,2019,4.9,Winter,VL
16,2019,4.0,Winter,HL
23,2019,4.5,Sommer,HR
with the following command:
alt.Chart(df).mark_bar().encode(x='Year:O', y='Profile:Q').facet(row='Saison:N', column='Pos:N')
But, as you can see, I still have a lot of horizontal space and would like to use it by placing the Winter plot right next to the Sommer plot:
I understand that I already used the column grid to facet over the attribute Pos, but visually, for me, the Winter and Sommer plots are two separate plots (just like here), which I'd like to place side by side.
I tried to create two different charts in the same cell and emit them side by side using HTML, but in the Jupyter environment there is a limit of one Altair/Vega plot per cell.
Is there any method I can use to arrange these charts horizontally?
In Altair, there is no good way to do this, because faceted charts cannot be nested according to the Vega-Lite schema. However, the Vega-Lite renderer actually does handle this in some cases, despite it technically being disallowed by the schema.
So you can hack it by doing something like this:
chart = alt.Chart(df).mark_bar().encode(
    x='Year:O',
    y='Profile:Q'
).facet('Saison:N')
spec = alt.FacetChart(
    data=df,
    spec=chart,
    facet=alt.Facet('Pos:N')
).to_json(validate=False)
print(spec)
The resulting spec can be pasted by hand into http://vega.github.io/editor to reveal this (vega editor link):
You'll even notice that the vega editor flags parts of the spec as invalid. This is admittedly not the most satisfying answer, but it sort of works.
Hopefully in the future the Vega-Lite schema will add actual support for nested facets, so they can be used more directly from Altair.

Assigning color on Creating Stacked Column chart with xlsxwriter Pandas Python

I was able to generate stacked column charts in a newly created Excel sheet from a pandas dataframe using xlsxwriter, but I can't figure out how to assign colors.
Here is the picture.
This is from the Pandas xlsxwriter documentation. In the given picture, each "Metric" has a different color. For example, Metric 8 is pink and Metric 1 is blue. I want to assign a specific color to each metric in this example. Each metric corresponds to one row of the data in question.
I understand I can do this in Excel individually. But I am writing Python code to generate several dozen stacked column charts and put them in Excel using xlsxwriter, so it is not practical to do this by hand.
Any help is appreciated !
You need to set the fill color for the series.
See the following Pandas-XlsxWriter stacked charts with colors example. The example uses brew colors, but you can replace those with any HTML-style RGB color (for example #RRGGBB).
See also the XlsxWriter Chart documentation and Working with Charts which explain the API and give examples of setting the colors of different properties.
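As a rough sketch of what setting the fill color per series can look like, assuming one series per row of the data frame: the data, the color values, the sheet name and the helper name `write_chart` below are made up for illustration and are not taken from the linked example.

```python
import io
import pandas as pd

# Made-up data: one row per metric, one column per category.
df = pd.DataFrame(
    {"Q1": [10, 20, 30], "Q2": [15, 25, 35]},
    index=["Metric 1", "Metric 2", "Metric 3"],
)

# The color wanted for each metric (arbitrary example values).
colors = {"Metric 1": "#1F77B4", "Metric 2": "#FF7F0E", "Metric 3": "#E377C2"}


def write_chart(target):
    """Write df and a stacked column chart to target (a path or a binary buffer)."""
    with pd.ExcelWriter(target, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name="Sheet1")
        workbook = writer.book
        worksheet = writer.sheets["Sheet1"]
        chart = workbook.add_chart({"type": "column", "subtype": "stacked"})
        last_col = len(df.columns)
        # Row 0 holds the header, so the data rows start at 1.
        for row, metric in enumerate(df.index, start=1):
            chart.add_series({
                "name": ["Sheet1", row, 0],
                "categories": ["Sheet1", 0, 1, 0, last_col],
                "values": ["Sheet1", row, 1, row, last_col],
                "fill": {"color": colors[metric]},  # the per-series color
            })
        worksheet.insert_chart("F2", chart)


buffer = io.BytesIO()
write_chart(buffer)
```

Passing a file name such as `"charts.xlsx"` instead of the in-memory buffer writes the workbook to disk. When generating dozens of charts, the `colors` dict can be shared so that each metric keeps the same color everywhere.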
