In the company I work at, we have just started experimenting with the migration of several computation-heavy projects from Excel to Python. In this process, we have discovered xlwings and the power it can bring to the integration of Excel and Python.
Several of our projects include reading in input data from Excel worksheets, doing some background calculations, and then outputting the results to different sheets in the same workbook. From the
example projects on the xlwings website, I know it's possible to replace VBA macros (which we have used so far) with Python, while keeping the Excel front-end.
However, my co-workers, who are primarily financial experts rather than programmers, really like the interactivity of Jupyter notebooks and would appreciate being able to work in them during the modeling phase of a project (rather than suddenly switching to PyCharm).
All in all, the ideal workflow would look like this for us:
Reading inputs from Excel sheets, doing the modeling and model calibration in Python through Jupyter notebooks, running some tests, and then, once we're at a final stage, outputting the results to Excel. A key constraint is that the end users of our models are used to VBA-like functionality (e.g., Run buttons in Excel).
So my question is the following:
Is it possible to call and run a Jupyter notebook from Excel as if it were a .py file (i.e., through the RunPython function)? This way, I assume we could avoid the intermediate step of "converting" the models from .ipynb to .py, not to mention maintaining two code versions of the same model.
Thank you for any suggestions!
Take a look at https://github.com/luozhijian/jupyterexcel, which uses plain 'def' definitions to create callable functions in Jupyter.
Thanks for the replies!
We started experimenting with nbconvert as @Wayne advised, and basically wrote a small wrapper .py script that can be called from the Excel macro via RunPython and that runs the specified notebook as-is. Within the notebook, interaction between Excel and Jupyter (e.g., reading parameter data in from Excel and writing values back from Jupyter to Excel) is in turn handled by xlwings.
I'm not sure it is optimal in terms of execution speed, but in our case it works just fine, and so far we haven't experienced any overhead that hinders the user experience.
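For illustration, here is a minimal sketch of what such a wrapper might look like, assuming nbformat and nbconvert are installed (the file name, function name, and notebook path are all illustrative):

# run_notebook.py -- called from the Excel macro via
# RunPython("import run_notebook; run_notebook.main()")
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

def main(path="model.ipynb"):
    # Load the notebook as-is...
    nb = nbformat.read(path, as_version=4)
    # ...and execute every cell in order; the xlwings calls inside
    # the notebook handle the Excel reads and writes themselves.
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    ep.preprocess(nb, {"metadata": {"path": "."}})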
This is possible with xlOil (disclaimer: I wrote it). The relevant docs are here.
The summarised procedure is as follows. First, install xlOil:
pip install xloil
xloil install
Then edit %APPDATA%\xloil\xloil.ini to add a reference to xloil.jupyter. After that, in an Excel cell, type:
=xloJpyConnect("MyNotebook.ipynb")
Where "MyNotebook.ipynb" is loaded in a local Jupyter kernel. You can then execute code on the kernel with:
=xloJpyRun(<result of connect func>, "{} + {} * {}", A1, B1, C1)
Or watch a global variable in the kernel with:
=xloJpyWatch(<result of connect func>, "variable_to_watch")
Or you can add the @xloil.func decorator to functions defined in the kernel to make them appear as Excel worksheet functions.
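For example, a kernel-side function along these lines (the function itself is just an illustration) would then show up as a worksheet function once the notebook is connected:

import xloil

@xloil.func
def add_margin(price: float, margin: float) -> float:
    # Once defined in the connected kernel, this becomes
    # callable from an Excel cell.
    return price * (1 + margin)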
It sounds like this is what you want: https://towardsdatascience.com/python-jupyter-notebooks-in-excel-5ab34fc6439
PyXLL and the pyxll-jupyter package allow you to run a Jupyter notebook inside of Excel and seamlessly interact with Excel from Python. You can also write Python functions in the notebook and call them from Excel as user-defined functions or macros.
From the article:
"""
It used to be an “either/or” choice between Excel and Python Jupyter Notebooks. With the introduction of the PyXLL-Jupyter package now you can use both together, side by side.
In this article I’ll show you how to set up Jupyter Notebooks running inside Excel. Share data between the two and even call Python functions written in your Jupyter notebook from your Excel workbook!
"""
If the above link doesn't work you can also find it here https://www.pyxll.com/blog/python-jupyter-notebooks-in-excel/
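To give a flavour, a function defined in a notebook cell with PyXLL's @xl_func decorator becomes callable as a worksheet function; the function below is purely illustrative:

from pyxll import xl_func

@xl_func
def py_discount(cashflow: float, rate: float, years: int) -> float:
    # After running this cell in the in-Excel notebook, you can
    # call =py_discount(...) directly from a worksheet cell.
    return cashflow / (1 + rate) ** years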
My use case is to visualize staff peer feedback in order to find interesting facts and inferences.
For example: reading data from a .csv file and creating visualizations of the feedback, like a word cloud, bar charts, spider charts, etc.
The expected end-user experience is:
a) User clicks on the executable
b) User is asked to select a file
c) User sees all the visualizations
Also, in the future, I want to give users the option to apply filters and search by categorical variables and staff numbers.
PS: I want to keep tools like Power BI, MicroStrategy, etc. out of scope for this PoC.
There does not seem to be a direct way to convert a Jupyter notebook to an executable file.
However, the standard way to tackle your problem seems to be a two-step process (see the example commands after this list):
convert the notebook into a regular Python script. You can download your notebook as a Python script from the Jupyter GUI or use nbconvert; this thread is related.
turn the script into an executable. There are several tools available for that, such as cx_Freeze or PyInstaller.
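For reference, with nbconvert and PyInstaller installed, the two steps might look like this (the notebook name is illustrative):

jupyter nbconvert --to script feedback_viz.ipynb
pyinstaller --onefile feedback_viz.py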
Is it possible to visualize data in PyCharm the same way you can in Jupyter?
For instance, in Jupyter you can run a single line at a time and see the output, which can be helpful when working with data sets.
Of course, you can just use a function like head() or show() to see what is going on, but you have to run the whole file (as opposed to one line at a time in Jupyter), which can make working with data a bit harder to understand.
Does anyone have any recommendations in terms of PyCharm, as that is what I am most familiar with, or do you think it is worth learning something like Jupyter?
Why don't you use Atom and install the Hydrogen package? It offers you the same possibilities as Jupyter while working in a script (not a notebook).
You can execute the code line by line, as in Jupyter, by pressing Ctrl+Enter, or run it as a script. Here's the documentation of the package.
Atom is a lightweight IDE, so it combines the power of both Jupyter and PyCharm. I have used it and it is great; it has many packages like Hydrogen, pep8 (which helps you write code that conforms to PEP 8), code beautifiers (for Python, R, JSON, etc.), and a lot of other great features.
I recommend learning Jupyter. Although PyCharm is an excellent IDE, running Jupyter notebooks inside it looks clunky in comparison.
Collaboration with a coworker on a Jupyter notebook is driving me nuts. We're working on different versions (I would say "branches", but that's probably too fancy for what we're doing) of the same notebook. I try to merge (some of) the changes he introduces into my version. Since diffing JSON files is a nightmare, I convert the two notebooks to .py files (File menu > Download as > Python (.py) in the notebook) and then compare the .py files in PyCharm. This works nicely, also because all output is removed when exporting to .py.
The problem now is to import the changed .py file back into Jupyter. Is this possible? The one thing that gives me hope of an affirmative answer is that the exported .py files contain some # In[4]: comments, which the Jupyter interface may use to understand how the code is divided into cells. Or is it just impossible to go back? If so, do you have any other suggestions for merging some of the changes between two different versions of a Jupyter notebook?
To answer the second question:
(And this question looks related.)
When we had that problem, using jq as described in this post worked okay. (The part starting from "Enter jq".)
To use this, you would have a second, "stripped" version of the notebook, which is the one you add to git, in addition to your development notebook (which is not added to git; otherwise you get merge conflicts with the development notebooks of your teammates).
You always need an additional step,
nbstrip_jq mynotebook.ipynb > mynotebook_stripped.ipynb
before doing git add mynotebook_stripped.ipynb, git commit, etc. But if everyone on your team does that, the changes are more or less manageable with git. For bigger projects, you could try automating this as described further down in the same post.
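For reference, the jq filter behind such an nbstrip_jq alias might look roughly like this (a sketch of the idea, not the exact filter from the post); it empties every cell's outputs and resets the execution counters:

jq --indent 1 \
  '(.cells[] | select(has("outputs")) | .outputs) = []
   | (.cells[] | select(has("execution_count")) | .execution_count) = null' \
  mynotebook.ipynb > mynotebook_stripped.ipynb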
I'm learning some data science, and for that I'm using Python with Jupyter Notebook, which I think is great for data analysis, mainly because it's super easy to run code step by step. You can see everything that is happening.
On the other hand, for more complex projects, like a web crawler or an object-oriented program to extract information from an API, I'm using Sublime Text 3. IMO it's simple, clean, light… perfect. I also think that .py is better than .ipynb for that (I don't even know if it's possible to do OO with Jupyter).
My problem now is integrating these two tools. The best I can do at the moment is dump the dictionaries to a .csv file and read it manually in the Jupyter notebook. Obviously that doesn't sound very smart, and it is more of a temporary solution just for experimentation.
This is the first time I'm dealing with a project in which I need to integrate more than one environment, rather than just working in one language with all the files in the same folder, so I'm not very familiar with how to approach it.
If someone could explain the right way of integrating these two tools, how to make the whole process more 'automatic', and whether it would be better to use some database and then extract from it with SQL or something like that, I'd appreciate it very much.
PS: Also, if you have any material on how a Python data science project should be organized, that would be awesome. Thanks!
I use IPython magic commands to help me switch between a text editor and an IPython notebook.
Specifically, I like to experiment with code in the notebook for the reasons you mentioned, and then, when I'm ready to integrate it as a class in a bigger system, I use the %%writefile filename.py magic, which exports that cell into a .py file.
You can also use %load filename.py and %run myfile.py to bring .py files into the notebook.
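For example, a notebook cell like the one below (the file and class names are illustrative) is written out as a module instead of being executed:

%%writefile model_utils.py
# The entire contents of this cell are saved to model_utils.py.
class Model:
    def __init__(self, params):
        self.params = params

Later, %load model_utils.py pastes the file's contents back into a cell, while %run model_utils.py executes it in the notebook's namespace.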
I am new to Python and have a very basic question. Is there a way I can execute part of a Python program, i.e., something similar to MATLAB, where after running the code once I can execute parts of the program?
If you want something similar to MATLAB, check out IPython; it combines the goodness of Python with a MATLAB-like workflow.
IPython has the concept of notebooks, which are composed of cells. These cells can be executed individually, giving you the behavior you expect.
What you are looking for is called "cell" execution in MATLAB.
The Spyder Python editor is, in general, a good approximation of a MATLAB-style IDE for Python. It supports executing the full script, the selected lines, or a "cell" defined by a portion of code starting with a comment like
# %%
or
###
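A minimal sketch of a script divided into such cells (the contents are illustrative): each # %% comment starts a new cell that Spyder can execute on its own:

# %% Load the data
import numpy as np
data = np.random.rand(100)

# %% Summarise it in a separate cell
print(data.mean(), data.std())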
To get Spyder, I suggest installing a scientific Python distribution such as Anaconda or WinPython.
Alternatively, as pointed out by vikramls, you can embrace a more modern paradigm, convert your script to an IPython notebook, and get "cell" execution for free.
PS: The IPython notebook is a fantastic environment that lets you mix rich text, code, and plots in a single document, which is great for some workflows. On the other hand, Spyder provides some unique features, such as a graphical variable inspector (à la MATLAB), integrated HTML documentation, and code error analysis, that are not available in the notebook.