How to merge changes in Jupyter notebooks

Collaboration with a coworker on a Jupyter notebook is driving me nuts. We're working on different versions (I would say "branches", but that's probably too fancy for what we're doing) of the same notebook. I try to merge (some of) the changes he introduces, into my version. Since diffing JSON files is a nightmare, I convert the two notebooks to .py files (Download as\Python (.py file) from the File menu of the notebooks) and then compare the .py files in PyCharm. This works nicely, also because all output is removed when exporting to .py.
The problem now is to import the changed .py file back into Jupyter. Is this possible? The one thing that gives me hope is that the exported .py files contain comments such as # In[4]:, which the Jupyter interface could perhaps use to work out how the code is divided into cells. Or is it just impossible to go back? If so, do you have any other suggestions for merging some of the changes between two different versions of a Jupyter notebook?
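On the first question, the # In[n]: comments can at least be used by hand to rebuild the cells. What follows is only a minimal sketch using the standard library; the function name py_to_notebook and the file names in the usage comment are made up for illustration. It assumes the .py file came from "Download as > Python (.py)", that every cell should become a code cell (markdown cells that were exported as comments simply stay comments), and that a bare nbformat 4.4 structure is acceptable.

```python
import json
import re

# Cell markers written by the .py export, e.g. "# In[4]:" or "# In[ ]:".
CELL_MARKER = re.compile(r"^# In\[[ \d]*\]:\s*$")


def py_to_notebook(source):
    """Split exported .py text on '# In[n]:' markers and build a notebook dict."""
    chunks, current = [], []
    for line in source.splitlines():
        if CELL_MARKER.match(line):
            chunks.append(current)
            current = []
        else:
            current.append(line)
    chunks.append(current)

    # Text before the first marker is the export header (shebang, coding line).
    if len(chunks) > 1:
        chunks = chunks[1:]

    cells = []
    for chunk in chunks:
        text = "\n".join(chunk).strip("\n")
        if not text:
            continue  # skip empty cells
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": text.splitlines(keepends=True),
        })
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Usage (hypothetical file names):
#   with open("merged.py", encoding="utf-8") as f:
#       nb = py_to_notebook(f.read())
#   with open("merged.ipynb", "w", encoding="utf-8") as f:
#       json.dump(nb, f, indent=1)
```

The result has no outputs, so the notebook needs to be re-run after opening it.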

To answer the second question:
(This question also looks related.)
When we had that problem, using jq as described in this post worked okay. (The part starting from "Enter jq".)
To use this, you would have a second, "stripped" version of the notebook which is the one you add to git, in addition to your development notebook (which is not added to git, otherwise you get merge conflicts with the development notebooks of your teammates).
You always need an additional step,
nbstrip_jq mynotebook.ipynb > mynotebook_stripped.ipynb
before doing git add mynotebook_stripped.ipynb, git commit etc. But if everyone in your team does that, the changes are more or less nicely manageable with git. For bigger projects, you could try automating it like described further below in the same post.
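If jq is not available, the stripping step can also be approximated in plain Python. This is a rough sketch of the same idea, not the filter from the post: the function name strip_notebook is made up, and it only clears outputs and execution counts, which is an assumption about what needs removing for clean diffs.

```python
import json


def strip_notebook(nb):
    """Return a copy of a notebook dict without outputs and execution counts."""
    stripped = json.loads(json.dumps(nb))  # deep copy of plain JSON data
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Usage (same file names as in the command above):
#   with open("mynotebook.ipynb", encoding="utf-8") as f:
#       nb = json.load(f)
#   with open("mynotebook_stripped.ipynb", "w", encoding="utf-8") as f:
#       json.dump(strip_notebook(nb), f, indent=1)
```

The development notebook is left untouched; only the stripped copy is written and committed.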

Related

How to redirect a python.exe within a pre-existing venv's .env file

Recently I uninstalled Python and then reinstalled it to see what effects it would have on my code and to make sure I knew how to fix any path or directory issues. I mostly succeeded but am having trouble within my venvs. When testing a basic Django app I made, any python manage.py ... command returns the error No Python at '"C:\Python311\python.exe'.
After doing some research, it appears that the two files "python.exe" and "pythonw.exe" within .env\scripts are causing the problem, because they are trying to access a python.exe that no longer exists. (Note: this is my understanding based on Python's venv documentation, and I could be incorrect.)
My question is what is the best way to deal with this issue. I see two options but am not sure how to accomplish either.
Fully reset the .env folder so it points to the correct python.
Just change those two files so they point to the correct instance of python.
So far the only things I have tested are running code outside of venv's entirely and creating new ones to run the code from. Both work totally fine so it is definitely something from the historic venvs.
I have also looked at this question and I believe the two to be related but could be mistaken. Unfortunately it does not seem an answer was given there.
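One way to check which base interpreter an old environment expects: a venv created with the standard venv module keeps a pyvenv.cfg file in its root, with a home entry naming the directory of the Python it was created from. The sketch below only reads that entry and checks whether the directory still exists. The helper names are made up, and it assumes .env was created by the standard venv module.

```python
from pathlib import Path


def read_venv_home(venv_dir):
    """Return the 'home' entry from a venv's pyvenv.cfg, or None if absent."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "home":
            return value.strip()
    return None


def venv_home_exists(venv_dir):
    """True if the base Python directory recorded in pyvenv.cfg still exists."""
    home = read_venv_home(venv_dir)
    return home is not None and Path(home).is_dir()


# Usage (hypothetical path):
#   print(read_venv_home(".env"), venv_home_exists(".env"))
```

If the recorded directory is gone, that would be consistent with the "No Python at ..." error, and it suggests the first option (recreating the environment with the new interpreter) is the more straightforward of the two.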

How to prevent the repetition of object name for Python using autocompletion in VSCode?

I am using VSCode for writing Python code in a Jupyter Notebook. The relevant extensions installed are Python, Pylance and Jupyter. The problem occurs when I try to use tab to autocomplete method names for any object. For example, if the suggestion box looks like this:
and I press Tab to accept the suggestion, the object name dataset is repeated, i.e. the code looks like dataset.dataset.as_numpy_iterator instead of dataset.as_numpy_iterator. How can I remove this object name duplication? Thanks!

After wasting a lot of time searching for a fix, I tried the latest Insiders build of VSCode (instead of the stable build) and, surprise surprise, it did not have this issue. Moreover, even in the stable build, it occurs only in Jupyter Notebooks and not in standalone .py files. I am posting this as an answer so that other people don't have to waste more time on this!

It looks like the suggestion is provided by some extension you have installed, such as Tabnine AI or Kite, but I can't reproduce it with either of them.
I can't tell which extension provides it, because that part seems to be cut off in your picture. But it is not provided by the Python extension.

I bumped into the same issue. Simply disabling and re-enabling the Jupyter Keymap extension solved my problem.
Judging from the lack of related search results, this issue seems to occur only under some rare circumstance...

Call and run a jupyter notebook from Excel using xlwings?

In the company I work at, we have just started experimenting with the migration of several computation-heavy projects from Excel to Python. In this process, we have discovered xlwings and the power it can bring to the integration of Excel and Python.
Several of our projects include reading in input data from Excel worksheets, doing some background calculations, and then outputting the results to different sheets in the same workbook. From the example projects on the xlwings website, I know it's possible to replace VBA macros (which we used so far) with Python, while keeping the Excel front-end.
However, my co-workers, who are primarily financial experts and not programmers, really like the interactivity of jupyter notebooks and would appreciate it if they could work in them during the modeling phase of the projects (rather than switching to PyCharm all of a sudden).
All in all, the ideal workflow would look like this for us:
Inputting from Excel sheets, doing some modeling and model calibration in Python through jupyter notebooks, running some tests, and then, once we're at a final stage, outputting to Excel. A key constraint is that the end-users of our models are used to the VBA-like functionality (e.g. Run buttons in Excel).
So my question is the following:
Is it possible to call and run a jupyter notebook from Excel as if it were a .py file (i.e. through the RunPython function)? This way, I assume that we could avoid the intermediate step of "converting" the models from .ipynb to .py, not to mention having two code versions of the same model.
Thank you for any suggestions!

Look here: https://github.com/luozhijian/jupyterexcel with using 'def'. That creates functions callable in Jupyter.

Thanks for the replies!
We started experimenting with nbconvert as @Wayne advised, and basically wrote a small wrapper .py script that can be called in the Excel macro via RunPython and that runs the specified notebook as-is. In the notebook, interaction between Excel and jupyter (e.g. reading parameter data in from Excel and outputting values from jupyter to Excel) is in turn handled by xlwings.
I'm not sure it is the optimal solution in terms of execution speed. However, in our case it works just fine and so far we haven't experienced any additional overhead that would hinder user experience.
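For anyone wanting to try the same approach, a wrapper of this kind could look roughly like the sketch below. This is not our actual code: the function names are made up, and it assumes the jupyter command is on the PATH of the Python environment that RunPython uses. It relies only on nbconvert's documented command-line options --to notebook, --execute, --inplace and --output.

```python
import subprocess


def build_nbconvert_command(notebook_path, output_name=None):
    """Build the command line that executes a notebook with nbconvert."""
    cmd = ["jupyter", "nbconvert", "--to", "notebook", "--execute",
           str(notebook_path)]
    if output_name is None:
        cmd.append("--inplace")  # overwrite the notebook with the executed one
    else:
        cmd.extend(["--output", output_name])  # write a separate executed copy
    return cmd


def run_notebook(notebook_path, output_name=None):
    """Run the notebook top to bottom; raises CalledProcessError on failure."""
    subprocess.run(build_nbconvert_command(notebook_path, output_name),
                   check=True)


# Usage (hypothetical notebook name), e.g. from a function called via RunPython:
#   run_notebook("model.ipynb")
```

Starting a separate process for every run adds some start-up time, which is one place the speed concern mentioned above would show up.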

This is possible with xlOil (disclaimer: I wrote it). The relevant docs are here.
The summarised procedure is: install xloil
pip install xloil
xloil install
Then edit %APPDATA%\xloil\lxloil.ini to add a reference to xloil.jupyter. Then in an Excel cell type
=xloJpyConnect("MyNotebook.ipynb")
Where you have "MyNotebook.ipynb" loaded in a local jupyter kernel. Then you can execute code on the kernel with:
=xloJpyRun(<result of connect func>, "{} + {} * {}", A1, B1, C1)
Or watch a global variable in the kernel with:
=xloJpyWatch(<result of connect func>, "variable_to_watch")
Or you can add a decorator @xloil.func to functions defined in the kernel to make them appear as Excel worksheet functions.

It sounds like this is what you want https://towardsdatascience.com/python-jupyter-notebooks-in-excel-5ab34fc6439
PyXLL and the pyxll-jupyter package allow you to run a Jupyter notebook inside of Excel and seamlessly interact with Excel from Python. You can also write Python functions in the notebook and call them from Excel as a user defined function or macro.
From the article:
"""
It used to be an “either/or” choice between Excel and Python Jupyter Notebooks. With the introduction of the PyXLL-Jupyter package now you can use both together, side by side.
In this article I’ll show you how to set up Jupyter Notebooks running inside Excel. Share data between the two and even call Python functions written in your Jupyter notebook from your Excel workbook!
"""
If the above link doesn't work you can also find it here https://www.pyxll.com/blog/python-jupyter-notebooks-in-excel/

Jupyterlab for Flask or Django web development

I recently found out about Jupyterlab. I like the improvement over plain Notebooks.
I was hoping we could actually use Jupyterlab as an online IDE for web development of Django, Flask or other projects. I don't like developing in a local environment. However I cannot find anything about using Jupyter for web development. Not in their Github repo or searching on Google.
When opening a normal .py file, the Tab completion that lists all functions, classes, etc. doesn't work. It also doesn't work when importing a .py file in a .ipynb file.
Using nbconvert and P2J to convert all files back and forth between .py and .ipynb isn't really efficient.
Besides this, another issue with this approach is that if you import nb 2 in nb 1 and you change something in nb 2, you have to restart the entire kernel of nb 1 for the changes to take effect. Simply re-running the import or importlib.reload(nb2) doesn't work.
Is there a good approach to this?

How do I document the Jupyter Notebook Profile startup?

I've modified the ipython_config.py in my IPython profile so that numpy is automatically loaded as np when I start up the Jupyter Notebook:
c.InteractiveShellApp.exec_lines = [
'import numpy as np',
]
This works great. When I start up a Notebook, in the first cell I can immediately call all of the numpy library via np.. However, if I'm sharing this Notebook via a gist or some other method, these imports are not explicitly shown. This is suboptimal as it makes clear reproducibility impossible.
My question: Is there a way that I could automatically populate the first cell of a new Notebook with the code that I'm importing? (Or some other similar way to document the imports that are occurring for the Notebook).
I'd be OK with removing the exec_lines option and pre-populating the code that I have to run myself or some other solution that gets at the main idea: clear reproducibility of the code that I'm initially importing in the Notebook.
Edit
A deleted answer that might be helpful to people landing here: I found jupyter_boilerplate which as an installable Notebook extension "Adds a customizable menu item to Jupyter (IPython) notebooks to insert boilerplate snippets of code" -- would allow one to easily create a starting code snippet that could be filled in.
Sidenote to MLavoie because "comments disabled on deleted / locked posts / reviews"
Yes, you are right that:
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review – MLavoie Jul 8 '16 at 17:27
But, you'll notice, that this is a widget to be installed, so there isn't relevant code to paste here. It was unhelpful to delete the above answer.

Almost automatically:
%load startup.py
Put import/config code in a version controlled file on your PYTHONPATH and %load it into the first cell.
This has the advantage of allowing you to use different startup code without tweaking your startup config, and notebooks remain portable, i.e. send the notebook and startup file to other users and they can run it without tweaking their startup config.

Create a notebook that contains the preparations you want and use that as a template. That is, copy it to a new file and open it.
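The template can also be generated rather than copied by hand. Below is a minimal sketch using only the standard library: the function name and file name are made up, and it assumes a bare nbformat 4.4 notebook with a single code cell is enough as a starting point.

```python
import json


def new_notebook_with_startup(startup_lines):
    """Build a notebook dict whose first cell contains the given code lines."""
    source = "\n".join(startup_lines)
    first_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source.splitlines(keepends=True),
    }
    return {"cells": [first_cell], "metadata": {},
            "nbformat": 4, "nbformat_minor": 4}


# Usage (hypothetical file name):
#   nb = new_notebook_with_startup(["import numpy as np"])
#   with open("new_analysis.ipynb", "w", encoding="utf-8") as f:
#       json.dump(nb, f, indent=1)
```

Because the imports end up in a visible first cell, they travel with the notebook when it is shared.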
