Cross-reference a python code chunk in Rmarkdown with reticulate - python

I have an R Markdown document containing some Python code chunks that use the reticulate library. The code executes and produces its output perfectly; however, how would I cross-reference the generated plots in the text using the chunk label? I am using bookdown::pdf_document2 etc., and have no issue with inline references to R chunks using the standard \@ref(fig:my-plot).
An MWE would be:
```{python, my-plot}
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
y = np.arange(10)
plt.figure()
plt.plot(x, y)
plt.show()
```
In Figure \@ref(fig:my-plot)....
I have also tried without prepending the reference type (fig:, etc.).
I understand I can save the image to file and include it with knitr as a subsequent R chunk, but it would be preferable to do it through the python chunk alone.

I must have made an error when I previously tried fig.cap within the chunk, as Daniel correctly suggests in the question comments. Trying it again, I can cross-reference the Python chunk perfectly across HTML, PDF and Word outputs. The updated MWE is:
```{python, my-plot, fig.cap="My Caption"}
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
y = np.arange(10)
plt.figure()
plt.plot(x, y)
plt.show()
```
In Figure \@ref(fig:my-plot)...
Each output format in the YAML header needs its bookdown counterpart: bookdown::pdf_document2 for PDF, bookdown::word_document2 for Word and, for rmdformats in my case, the option use_bookdown: true.

Related

xgboost.plot_tree shows - Empty characters/boxes/blocks as labels

SITUATION
When I plot with xgboost.plot_tree, the graph shows only a bunch of empty characters/boxes/blocks instead of the titles, labels and numbers. I use more than 400 features, so that may be a contributing factor.
CODE 1
fig, ax = plt.subplots(figsize=(170, 170))
plot_tree(xgbmodel, ax=ax)
plt.savefig("temp.pdf")
plt.show()
CODE 2
plot_tree(xgbmodel, num_trees=2)
fig = plt.gcf()
fig.set_size_inches(150, 100)
fig.savefig('tree.png')
ERROR
Both code 1 and code 2 result in the same image.
This is just a crop of the whole tree, because the full image is much bigger and I would not be able to upload it here, but the tree shape looks perfect.
SOLUTIONS I have Tried
This one has a problem with plotting itself, whereas I can plot without any problem - Plot a Single XGBoost Decision Tree
This one covers a different issue - xgboost.plot_tree: binary feature interpretation
I have run the code that @jared_mamrot gave me and it produced the same error. I restarted and cleaned my environment and ran it first and on its own, in the same notebook.
GitHub recommendation: model.get_booster().get_dump(dump_format='text') printed out a bit more than 200,000 characters (about 63 A4 pages in 11 pt Calibri), and the output looks perfectly correct, e.g.: 0.0268656723\n\t\t\t\t\t34:[f0<6.5] yes=53,no=54,missing=53\n\t\t\t\t\t\. Is it possible that I have this issue because it cannot display so much text in a graph of such a normal size?
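To judge whether the sheer size of the tree is a plausible factor, it may help to count nodes in the text dump rather than read 200,000 characters by eye. The following is only a rough sketch: the helper name summarize_tree_dump and the three-node example string are hypothetical, and it assumes the dump uses one tab per tree level, as in the excerpt above.

```python
import re

# Matches one node line of the text dump, e.g.
#   "34:[f0<6.5] yes=53,no=54,missing=53"  (split node)
#   "53:leaf=0.0268656723"                 (leaf node)
NODE_LINE = re.compile(r"^\t*(\d+):(\[|leaf=)")


def summarize_tree_dump(dump_text):
    """Count split nodes, leaves and the maximum depth in one tree's text dump."""
    splits = leaves = max_depth = 0
    for line in dump_text.splitlines():
        match = NODE_LINE.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        depth = len(line) - len(line.lstrip("\t"))  # assumes one tab per level
        max_depth = max(max_depth, depth)
        if match.group(2) == "[":
            splits += 1
        else:
            leaves += 1
    return {"splits": splits, "leaves": leaves, "max_depth": max_depth}


# Hypothetical three-node tree written in the same format as the excerpt.
example_dump = "0:[f0<6.5] yes=1,no=2,missing=1\n\t1:leaf=0.0268656723\n\t2:leaf=-0.0125\n"
print(summarize_tree_dump(example_dump))
```

With a real model, get_dump returns a list with one string per tree, so the helper would be applied to a single element of that list, for example the tree index passed as num_trees.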
I wasn't able to reproduce your error. Can you please add more details to your question and confirm that this code works? link to pima-indians-diabetes.csv
#!/usr/bin/env python3
# plot decision tree
from numpy import loadtxt
from xgboost import XGBClassifier
from xgboost import plot_tree
import matplotlib.pyplot as plt
import graphviz
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
y = dataset[:,8]
# fit model on training data
model = XGBClassifier()
model.fit(X, y)
# plot/save fig
fig, ax = plt.subplots(figsize=(170, 170))
plot_tree(model, ax=ax)
plt.savefig("test.pdf")
Edit per comment:
I can't reproduce this issue/error. No matter which package version / char encoding / line endings / etc my notebook always renders the text correctly. The only thing I can suggest is installing a new virtual environment (e.g. miniconda) with current versions of the required packages (conda install notebook numpy matplotlib xgboost graphviz python-graphviz) and testing it again.
Also, make sure you don't have windows line endings (see: Matplotlib plotting some characters as blank square / https://github.com/jupyterlab/jupyterlab/issues/1104 / https://github.com/jupyterlab/jupyterlab/issues/3718 / https://github.com/jupyterlab/jupyterlab/pull/3882 ) and specify the font you are using (e.g. How to change fonts in matplotlib (python)?):
# plot decision tree
from numpy import loadtxt
from xgboost import XGBClassifier
from xgboost import plot_tree
from matplotlib.font_manager import FontProperties
import matplotlib.pyplot as plt
import graphviz
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
y = dataset[:,8]
# fit model on training data
model = XGBClassifier()
model.fit(X, y)
# plot/save fig
prop = FontProperties()
prop.set_file('Arial.ttf')
fig, ax = plt.subplots(figsize=(170, 170))
plot_tree(model, ax=ax, fontproperties=prop)
plt.savefig("test.png")
fig.show()
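For the line-ending check mentioned above, a small helper along these lines could be used on a plain-text file such as a .py script or a CSV. This is a sketch only: both function names are hypothetical, and it inspects raw bytes, so it would not detect line endings stored as escaped text inside a notebook's JSON.

```python
from pathlib import Path


def has_windows_line_endings(path):
    """Return True if the file contains at least one CRLF sequence."""
    return b"\r\n" in Path(path).read_bytes()


def convert_to_unix_line_endings(path):
    """Rewrite the file in place with LF endings; return True if it changed."""
    data = Path(path).read_bytes()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed != data:
        Path(path).write_bytes(fixed)
        return True
    return False
```

Calling has_windows_line_endings first leaves the file untouched, which is the safer option when the file is under version control.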
I moved my whole environment from an AWS EC2 instance to a local machine, and then it ran perfectly. The AWS EC2 instance also did some other weird things, like not allowing extensions in Jupyter Lab. Both machines run Ubuntu 20.04 LTS.

display images inside a loop by overwriting the existing plot/figure in python

I have hundreds of thousands of images which I have to fetch from URLs, look at, tag, and then save in their respective category as spam or non-spam. On top of that, I'll be working with Google, which makes it impossible. My idea is that, instead of opening, analysing, renaming and saving each image in a directory by hand, I fetch the image from its URL, view it inside a loop, and type a single word; based on that input, my function saves it in the respective directory.
I tried doing
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Image
fg = plt.figure()
for i in range(5):
    plt.imshow(np.random.rand(50,50))
    plt.show()
    x = input()
    print(x)
but instead of overwriting the existing frame, it plots a new figure each time. I have even used a 1,1 subplot inside the loop, but that does not work either. IPython's display method does not show anything inside a loop either. Could somebody please help me with this problem?
You can make use of matplotlib's interactive mode by invoking plt.ion(). An example:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
fig, ax = plt.subplots()
plt.ion()
plt.show()
for i in range(5):
    ax.imshow(np.random.rand(50,50)) # plot the figure
    plt.gcf().canvas.draw()
    yorn = input("Press 1 to continue, 0 to break")
    if yorn == "0": # input() returns a string, so compare with "0"
        break
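The saving step described in the question (type one word, file the image under the matching category) could be sketched as follows. The tag-to-folder mapping, the function name save_tagged_image and the default root folder are all hypothetical choices for illustration; fetching the image bytes from the URL is left out.

```python
from pathlib import Path

# Hypothetical mapping from the word typed at the prompt to a folder name.
TAG_DIRS = {"s": "spam", "n": "non_spam"}


def save_tagged_image(image_bytes, filename, tag, root="tagged"):
    """Write image_bytes to <root>/<category>/<filename>.

    Returns the written path, or None if the tag is not recognized.
    """
    category = TAG_DIRS.get(tag.strip().lower())
    if category is None:
        return None
    target_dir = Path(root) / category
    target_dir.mkdir(parents=True, exist_ok=True)  # create folders on first use
    target = target_dir / filename
    target.write_bytes(image_bytes)
    return target
```

Inside the loop above, the value returned by input() would be passed as tag, and a None result could be used to ask again instead of silently skipping the image.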

Printing image in r-markdown using matplotlib and python code

I am trying to run Python code from an R Markdown file (R Markdown to PDF).
What I have achieved so far -
1- I am able to configure my Python engine using the knitr and reticulate libraries.
2- I am able to execute my Python code.
What I tried -
1- I tried all the methods discussed in this forum, but nothing works.
2- I also tried to save the image (as one of the posts here suggests), but that does not work either.
My problem -
1- When I try to plot a graph using matplotlib with plt.imshow() and plt.show(), the image is not printed in the output. Instead, it is shown in a separate window. You can see my results in the attached image.
Result_of_my_code
Here is my code
```{r setup, include=FALSE}
library(knitr)
library(reticulate)
knitr::knit_engines$set(python = reticulate::eng_python)
```
```{python}
import numpy as np
import os
import torch
import torchvision.datasets as dsets
import matplotlib.pyplot as plt
print(os.getcwd())
os.chdir('D:\\1st year\\Python codes\\CIFR Analysis\\self contained analysis')
print(os.getcwd())
train_mnist = dsets.MNIST("../data", train=True)
test_mnist = dsets.MNIST("../data", train= False)
print(len(train_mnist))
#print(train_mnist[0][0])
plt.imshow(train_mnist[0][0], cmap="gray")
#plt.savefig("trainzero.png")
plt.show()
```
Kindly help me fix this issue, as I want to compile my Python code using the R Markdown file.
Thanks
So with R Markdown, you have to do some things a little differently. In the following, I have a DataFrame created by concatenating two series. The original plotting code in the Jupyter Notebook was as follows and simply displayed the series.
# make a plot of model fit
train.plot(figsize=(16,8), legend=True)
backtest.plot(legend=True);
However, it does not work this way with R Markdown. When plotting there, you always have to assign the result of each plot call to a variable; with the code below, you get the same plot.
dfreg = pd.concat([reg, backtest], axis = 1)
ax = dfreg.plot(figsize=(16,8), legend = True)
ax1 = predictions.plot(legend=True)
plt.show()
This is common with other plotting functions like plot_acf() too.

matplotlib fails to output EPS figure with usetex = True

I am trying to output (savefig) matplotlib figures as EPS; however, it seems there is a conflict when using the LaTeX rendering AND saving EPS figures. For example, the following code produces a good EPS figure:
import matplotlib.pyplot as plt
import numpy as np
plt.figure()
plt.plot(np.random.rand(100))
plt.savefig('plot.eps')
whereas this code produces an EPS figure that can not be viewed; my document viewer (Ubuntu's Evince) continuously says "Loading..."
import matplotlib.pyplot as plt
import numpy as np
plt.rc('text', usetex = True)
plt.figure()
plt.plot(np.random.rand(100))
plt.savefig('plot.eps')
Is there a known issue when combining these two options? Is there any kind of work around (aside from saving as PDF or saving as PDF then converting to EPS)?
The only solution I could find was to update matplotlib from 1.2.1 to 1.3.1. Now it works without problems.
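To check whether an environment is still on an affected release, the installed version can be compared against 1.3.1, the version the answer reports as working. This is a sketch under that assumption; the helper names version_tuple and is_at_least are hypothetical, and the parsing only looks at the leading integers of each version component.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(version_string):
    """Turn '1.3.1' or '3.8.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break  # stop at the first component without a leading number
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed, minimum="1.3.1"):
    """Return True if the installed version is not older than the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed = version("matplotlib")
    status = "at least 1.3.1" if is_at_least(installed) else "older than 1.3.1"
    print(f"matplotlib {installed} is {status}")
except PackageNotFoundError:
    print("matplotlib is not installed in this environment")
```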

How do I write a Latex formula in the legend of a plot using Matplotlib inside a .py file?

I am writing a script in Python (.py file) and I am using Matplotlib to plot an array.
I want to add a legend with a formula to the plot, but I haven't been able to do it.
I have done this before in IPython or the terminal. In this case, writing something like this:
legend(ur'$The_formula$')
worked perfectly. However, this doesn't work when I call my .py script from the terminal/IPython.
The easiest way is to assign the label when you plot the data,
e.g.:
import matplotlib.pyplot as plt
ax = plt.gca() # or any other way to get an axis object
ax.plot(x, y, label=r'$\sin (x)$')
ax.legend()
When writing code for labels it is:
import pylab
# code here
pylab.plot(x, y, 'r:', label=r'$\sin(x)$')  # 'r:' = red dotted line; label= sets the legend text
So perhaps pylab.legend([r'$latex here$'])
Edit:
The u is for unicode strings, try just r'$\latex$'
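The difference the r prefix makes can be seen without any plotting at all. The snippet below is a small illustration using a hypothetical label, \theta, chosen because it starts with \t. It is also worth noting that Python 3 no longer accepts the combined ur prefix, so ur'...' is a syntax error there.

```python
plain = '$\theta$'   # "\t" is read as a TAB character, so "\theta" is lost
raw = r'$\theta$'    # raw string: the backslash survives for mathtext/LaTeX

print("\t" in plain)        # the plain string contains a tab
print("\\" in raw)          # the raw string contains a real backslash
print(len(plain), len(raw))  # the raw string is one character longer
print(u'$\\theta$' == raw)   # a u-prefixed string needs the backslash doubled
```

Labels without backslash sequences that Python interprets, such as '$x^2$', behave the same with or without the prefix, which may be why the problem only shows up for some formulas.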
