Alternative ways of logging cell output in JupyterLab - python

I have successfully replicated the logging methodology by #Mercury in this post: Reconnecting remote Jupyter Notebook and get current cell output
Namely, adding this code chunk to my notebook:
import sys
import logging
nblog = open("nb.log", "a+")
sys.stdout.echo = nblog
sys.stderr.echo = nblog
get_ipython().log.handlers[0].stream = nblog
get_ipython().log.setLevel(logging.INFO)
My main edit to that code is replacing a+ with w+ because I want to overwrite the log file every time I rerun my notebook.
However, I would like my logger to include information from cell outputs that aren't explicitly printed. For example, if I do head(df) in a cell instead of print(head(df)). Is that possible?
Thanks!
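One possible approach, sketched here as an assumption rather than a confirmed answer, is to register an IPython post_run_cell callback and write the repr of each cell's result to the same log file. The helper name make_result_logger is hypothetical; the callback relies on IPython passing an execution result object whose result attribute holds the value of the cell's last expression (None when there is none).

```python
import io


def make_result_logger(stream):
    """Return a callback that writes the repr of a cell's result to stream."""
    def log_result(result):
        # IPython passes an execution result object; .result is the value of
        # the cell's last expression, or None if the cell produced no value.
        value = getattr(result, "result", None)
        if value is not None:
            stream.write(repr(value) + "\n")
            stream.flush()
    return log_result


# Register the callback only when running inside IPython/Jupyter.
try:
    ip = get_ipython()  # injected by IPython at runtime; undefined elsewhere
except NameError:
    ip = None

if ip is not None:
    nblog = open("nb.log", "w+")  # same file name as in the question
    ip.events.register("post_run_cell", make_result_logger(nblog))
```

This only logs the plain-text repr, so a DataFrame shows up as its text table rather than the HTML rendering.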

Related

Saving `h2o_model.accuracy` printed output to a file

h2o_model.accuracy prints model validation data when executed in a Jupyter Notebook cell (which is desirable, despite the function name). How to save this whole validation output (entire notebook cell contents) to a file? Please test before suggesting redirections.
I'd be careful using %%capture; it doesn't capture HTML content (tables) in stdout.
redirect_stdout works flawlessly when used from the Python CLI or a script. IPython/Jupyter might cause issues with tables, as they are displayed, not printed. Note that you should not use .readlines() to get the results from StringIO; use .getvalue().
You can use h2o_model.save_model_details(path) to persist information about the model to a JSON file (which might serve you better in the long run, but it's not really human readable).
If you really want output that looks like what you would get from a Jupyter notebook, you can use the following hack:
Create a template Jupyter notebook that contains:
import os
import h2o
h2o.connect(verbose=False)
h2o.get_model(os.environ["H2O_MODEL"])
and in your original notebook add
!H2O_MODEL={h2o_model.key} jupyter nbconvert --to html --execute template.ipynb --output={h2o_model.key}_results.html
You can also create a template for the nbconvert to hide the code cells.
You should call h2o_model.accuracy() (note the parentheses). The reason the whole model gets printed is a non-idiomatic implementation of __repr__ in h2o models, which prints rather than returning a string (there's a JIRA to fix that).
If you encounter some other situation where you would like to save printed output of some command, you can use redirect_stdout[1] to capture it (assuming you have python 3.4+).
[1] https://docs.python.org/3.9/library/contextlib.html#contextlib.redirect_stdout
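As a minimal sketch of the redirect_stdout approach with .getvalue(), assuming the command writes to stdout with print: the helper name capture_printed is hypothetical, and print stands in for whatever call produces the output.

```python
import io
from contextlib import redirect_stdout


def capture_printed(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    # getvalue() returns the whole buffer; readlines() would return nothing
    # here because the stream position is already at the end after writing.
    return buffer.getvalue()


# print is only a stand-in for a call that prints its report.
text = capture_printed(print, "accuracy: 0.93")
```

The returned string can then be written to a file with a normal open(...) and write(...).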
Ok, so only the h2o_model.accuracy output cannot be captured, while xgb_model.cross_validation_metrics_summary or even h2o_model alone can - e.g. like that:
%%capture captured_output
# print model validation
# data to `captured_output`
xgb_model
In another notebook cell:
# print(captured_output.stdout.replace("\n\n","\n"))
with open(filename, 'w') as f:
    f.write(captured_output.stdout.replace("\n\n", "\n"))

load code from a code cell from one jupyter notebook into another jupyter notebook

I want to load (i.e., copy the code as with %load) the code from a code cell in one jupyter notebook into another jupyter notebook (Jupyter running Python, but not sure if that matters). I would really like to enter something like
%load cell[5] notebookname.ipynb
The command would copy all the code in cell 5 of notebookname.ipynb into the code cell of the notebook I am working on. Does anybody know a trick to do that?
Adapting some code found here at Jupyter Notebook, the following will display the code of a specific cell in the specified notebook:
import io
from nbformat import read
def print_cell_code(fname, cellno):
    with io.open(fname, 'r', encoding='utf-8') as f:
        nb = read(f, 4)
    cell = nb.cells[cellno]
    print(cell.source)
print_cell_code("Untitled.ipynb",2)
Not sure what you want to do once the code is there, but maybe this can be adapted to suit your needs. Try print(nb.cells) to see what read brings in.
You'll probably want to use or write your own nbconvert preprocessor to extract a cell from one notebook and insert it into another. It takes a fair amount of reading in the nbconvert docs to understand how to write a preprocessor, but this is the preferred way.
The quick-fix option is that the nbformat specification is based on JSON, which means that if you read in an ipynb file with pure Python (i.e., with open and read), you can call json.loads on it to turn the entire file into a dict. From there, you can access cells in the cells entry (which is a list of cells). So, something like this:
import json
with open("nb1.ipynb", "r") as nb1, open("nb2.ipynb", "r") as nb2:
    nb1, nb2 = json.loads(nb1.read()), json.loads(nb2.read())
nb2["cells"].append(nb1["cells"][0])  # adds nb1's first cell to end of nb2
This assumes (as does your question) there is no metadata conflict between the notebooks.
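The snippet above changes nb2 only in memory. A slightly fuller sketch that also writes the result back to disk might look like the following. The helper names copy_cell and minimal_notebook are hypothetical, and the two notebooks are tiny stand-ins created in a temporary directory so the example can run on its own.

```python
import json
import os
import tempfile


def copy_cell(src_path, dst_path, cell_index):
    """Append one cell from the notebook at src_path to the one at dst_path."""
    with open(src_path, "r", encoding="utf-8") as f:
        src = json.load(f)
    with open(dst_path, "r", encoding="utf-8") as f:
        dst = json.load(f)
    dst["cells"].append(src["cells"][cell_index])
    # Write the modified notebook back so the new cell is actually saved.
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(dst, f, indent=1)
    return dst


def minimal_notebook(sources):
    """Build a bare nbformat-4 style dict with one code cell per source string."""
    cells = [{"cell_type": "code", "execution_count": None, "metadata": {},
              "outputs": [], "source": s} for s in sources]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Demo on throwaway files; real notebooks would be used the same way.
with tempfile.TemporaryDirectory() as tmp:
    nb1_path = os.path.join(tmp, "nb1.ipynb")
    nb2_path = os.path.join(tmp, "nb2.ipynb")
    for path, sources in [(nb1_path, ["x = 1", "print(x)"]), (nb2_path, ["y = 2"])]:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(minimal_notebook(sources), f)
    copy_cell(nb1_path, nb2_path, 1)  # copy nb1's second cell into nb2
    with open(nb2_path, "r", encoding="utf-8") as f:
        merged = json.load(f)
```

Cell indices are zero-based here, so "cell 5" as counted in the notebook UI would usually be index 4.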

How to suppress "Update Links" Alert with xlwings

I am interfacing with Excel files in Python using the xlwings API. Some of the Excel files I am interacting with have old links, which cause a prompt to appear when the file is opened, asking if the user would like to update the links. This causes the code to hang indefinitely on the line that opened the book until the prompt is closed by a user. Is there a way to modify the settings of the Excel file so that this prompt will not appear, or will be automatically dismissed, without opening the actual file?
I have tried using the xlwings method:
xlwings.App.display_alerts = False
to suppress the prompt, but as far as I can tell this can only be run for an instance of Excel after it has been opened. There are some Excel APIs that do not require a file to be open in order to read data, like xlrd, but they are not very convenient for reading and copying large amounts of data (multiple or entire sheets of data).
The following code demonstrates the issue:
import xlwings as xw
wb = xw.Book(r'C:\Path\To\File\Filename')
print('Done')
On a regular Excel file the code proceeds through and prints "Done" without any user intervention, but on an Excel file where the "update links" prompt comes up, it will not proceed to the print statement until the prompt is dismissed by a user.
Expanding on your first attempt -- you're not handling an App instance; rather, you're trying to assign to the xlwings.App class.
However, it seems that display_alerts doesn't successfully suppress this alert in xlwings. Try this:
import xlwings as xw
app = xw.App(add_book=False)
app.display_alerts = False
wb = app.books.api.Open(fullpath, UpdateLinks=False)
I believe there is an implementation in xlwings to avoid update links messages now. I was able to bypass these alerts by adding the following:
app.books.open(fname, update_links=False, read_only=True, ignore_read_only_recommended=True)
You can see these arguments in the documentation for xlwings.Book.open(...).
I presently have 20+ source workbooks that I loop through to extract some rows of data. It was intolerable to respond to the update links prompt of each opened workbook. I tried the other solutions here but none worked for me. After reviewing the cited xlwings docs, this is the solution that worked for me:
for fname in workbook_list:
    wb = xw.books.open(fname, update_links=False)
    # Extract some data...
    wb.close()
My environment is Win10Pro / Python 3.8.1 / pywin32 version: 303 / Excel 365 Subscription / xlwings 0.26.2
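A minimal sketch of the loop above, with the workbook closed even if the extraction step raises. The function name extract_from_workbooks is hypothetical, and the small Fake classes are stand-ins (not part of xlwings) so the sketch can be run without Excel; with real xlwings the app argument would be an xw.App instance.

```python
def extract_from_workbooks(app, paths, extract):
    """Open each workbook without the update-links prompt and collect results."""
    results = []
    for path in paths:
        # update_links=False and read_only=True are the arguments used above.
        wb = app.books.open(path, update_links=False, read_only=True)
        try:
            results.append(extract(wb))
        finally:
            wb.close()  # always close, even if extract() raises
    return results


# --- Stand-ins for a dry run without Excel (assumptions, not xlwings API) ---
class FakeBook:
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class FakeBooks:
    def __init__(self):
        self.calls = []
        self.opened = []

    def open(self, fullname, **kwargs):
        self.calls.append((fullname, kwargs))
        book = FakeBook(fullname)
        self.opened.append(book)
        return book


class FakeApp:
    def __init__(self):
        self.books = FakeBooks()


fake_app = FakeApp()
names = extract_from_workbooks(fake_app, ["a.xlsx", "b.xlsx"], lambda wb: wb.name)
```

With xlwings installed, the same function could be called with app = xw.App(visible=False, add_book=False), followed by app.quit() once the loop is done.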

Connecting Excel with Python

Using the code below, I can get the data to print.
How would I switch the code to xlrd?
How would I modify this code to use an xls file that is already open and visible?
So the file is opened manually first, then the script runs.
The data gets updated,
and then gets pushed into MySQL.
import os
from win32com.client import constants, Dispatch
import numpy as np
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = "C:\\xtest\\example.xls"
ROW_SPAN = (1, 16)
COL_SPAN = (1, 6)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
xldata = [[ws.Cells(row, col).Value
           for col in xrange(COL_SPAN[0], COL_SPAN[1])]
          for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#print xldata
a = np.asarray(list(xldata), dtype='object')
print a
If you mean that you want to modify the current file, I'm 99% sure that is not possible and 100% sure that it is a bad idea. In order to alter a file, you need to have write permissions. Excel creates a file lock to prevent asynchronous and simultaneous editing. If a file is open in Excel, then the only thing which should be modifying that file is... Excel.
If you mean that you want to read the file currently in the editor, then that is possible -- you can often get read access to a file in use, but it is similarly unwise -- if the user hasn't saved, then the user will see one set of data, and you'll have another set of data on disk.
While I'm not a fan of VB, that is a far better bet for this application -- use a macro to insert the data into MySQL directly from Excel. Personally, I would create a user with insert privileges only, and then I would try this tutorial.
If you want to manipulate an already open file, why not use COM?
http://snippets.dzone.com/posts/show/2036
http://oreilly.com/catalog/pythonwin32/chapter/ch12.html
