I'm working on a Python project that needs PyLucene (a Python wrapper for Lucene, a Java library for building search engines).
I've created a Dockerfile that automatically downloads and compiles PyLucene and then installs the other required pip dependencies. Building this Dockerfile gives me a Docker image with all the dependencies (both PyLucene and the packages installed via pip).
By setting this image as a remote Python interpreter in PyCharm I can run my code, but now I need to release my software in a way that allows it to be executed without PyCharm or any other IDE that supports remote interpreters.
I thought about creating another Dockerfile that starts from the dependency image and copies my source into it, producing an image where the code can be executed.
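Roughly, what I have in mind is something like this (the image name and entry point are just placeholders):

    # Second Dockerfile: start from the image that already contains
    # PyLucene and the other pip dependencies (placeholder image name)
    FROM my-pylucene-deps:latest
    WORKDIR /app
    # Copy the project source into the image
    COPY . /app
    # Placeholder entry point for the application
    ENTRYPOINT ["python", "main.py"]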
I don't like this solution much because the goal of my project is processing large offline datasets, so the user of this image would always have to specify bind mounts between the container and the host filesystem.
Are there better options? Maybe creating an archive that contains my source, PyLucene and the pip dependencies?
Windows 10 64-bit, Python 3.8.2, latest PyLucene (8.3.0)
I'm planning to run a machine learning script consistently on one of my Google Cloud VMs.
When I configured the remote interpreter, unfortunately all the imported libraries were no longer recognized (presumably because they are not installed in the virtual environment in the cloud). I tried to install the missing modules (for example yfinance) through the PyCharm terminal within my remote host connection over SSH and SFTP. So I basically selected the 188.283.xxx.xxx #username session in the PyCharm terminal and used pip3 install to install the missing modules. Unfortunately my server (due to limited resources) collapses during the build process.
Is there a way to automatically install the needed libraries when connecting the script to the remote interpreter?
Shouldn't that be the standard procedure? And if not: does my approach make sense?
Thank you all in advance
Peter
You could check for missing modules at runtime with something like importlib and install them programmatically, but I'd just opt for creating a requirements file using pip freeze > requirements.txt, which you can then use on the server to get all your dependencies in one go (pip install -r requirements.txt) before your first run.
If it fails for any reason (or when you've updated the requirements file) you can run it again and it will only install whatever wasn't installed before.
This way it's clear which modules (and which version of each module) you've installed. In my experience with machine learning, using the right version or combination of versions can be important, so it makes sense to pin those rather than always pulling the latest version. This especially helps when trying to run an older project.
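A pinned requirements file produced this way looks roughly like the following (the package names and version numbers are only examples):

    yfinance==0.1.63
    pandas==1.2.4
    numpy==1.20.3

On the server, pip install -r requirements.txt then installs exactly these versions.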
I am learning and developing an application using Windows 10, Eclipse + PyDev (Python 3.4.3).
The application uses two additional Python libraries downloaded from the PyPI repository.
Now my target system is a Linux environment with no internet connectivity.
I would like to install my code inside a virtual environment on that system.
I have learnt how to install the Python libraries from tar.gz files obtained from different sources.
But I am not sure how to package my code for distribution to Linux from Windows; I don't see any proper options.
Below are my requests:
1. Steps for packaging a Linux distribution of my Python code from a Windows machine.
2. Correct steps for creating and activating a virtual environment, and for freezing dependencies in case I upgrade my code later (I don't have clear steps for this).
You don't need to do anything special to distribute your code for Linux (unless you're using some platform-specific features).
You need to package your code properly, with a setup.py, as detailed in the packaging projects tutorial.
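For reference, a minimal setup.py might look something like this (the project name and the dependency names are placeholders for your actual project and its two PyPI libraries):

    from setuptools import setup, find_packages

    setup(
        name="myproject",            # placeholder project name
        version="0.1.0",
        packages=find_packages(),
        install_requires=[
            "first-pypi-lib",        # placeholders for the two PyPI dependencies
            "second-pypi-lib",
        ],
    )

Running python setup.py sdist on the Windows machine produces a source archive under dist/ that you can copy to the Linux system and install into the activated virtual environment with pip, alongside the tar.gz files of the two PyPI libraries.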
PyCharm Professional has a Remote Deployment feature that allows for editing, running and debugging code remotely. This is a powerful feature when writing short scripts and top-level applications that make use of standard or third-party library packages. You can even create a virtualenv on the remote, with all dependency packages installed, and use that to execute the remote program.
However when writing applications that make use of multiple packages that are also developed alongside the application, it becomes necessary to edit packages. Without PyCharm the usual way to do this is with pip install -e . or python setup.py develop, which integrates the source directory with Python's package system, making it possible to edit a number of packages alongside the application.
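For example, without PyCharm I would do something like this on the remote host (the paths are only illustrative):

    # inside the remote virtualenv, install the application and its
    # sibling packages in editable mode so edits take effect immediately
    pip install -e /home/me/src/my_app
    pip install -e /home/me/src/my_lib_a
    pip install -e /home/me/src/my_lib_b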
With a single package, I've found that PyCharm will deploy the package code into its remote workspace, which works OK for debugging if I'm running a script or entry point from within this same package.
The problem I'm having with PyCharm is that it's not clear how to remotely edit and debug multiple packages. Let's say I have a PyCharm project open for one of these packages. When finding references or debugging into code that is in another (yet still developed-by-me) package, PyCharm shows a cached version of the second package (on my local machine). This is fine until I edit the second package on the remote host - after which the cached version is now out of sync and doesn't automatically update. This leads to a mismatch between execution result and debugger/editor state.
There are other quirks too, such as the edited package not actually being installed into the remote's virtualenv.
I haven't been able to find a proper guide to this workflow in PyCharm's documentation, and I'm starting to wonder if I'm either going about this the entirely wrong way, or maybe PyCharm just doesn't support this kind of app+multiple-packages development?
I thought this was an easy question, but I've spent a lot of time googling for an answer with no luck. I hope you can help me.
My company has a large software system on Windows which is portable: copy some folders, add a folder to the Windows PATH, and you are ready to go.
No registry entries, no DLLs in the system directory, no shortcuts, nothing!
I want to start using Python 3.x in our system under the same paradigm. I also want the ability to add third-party pip/conda packages to this distribution from time to time.
I don't want to install python msi on all the systems.
I don't want to pack it into a standalone executable with tools like py2exe or PyInstaller, or use a special Python distribution like PyWin32.
Somehow, I couldn't find a formal official solution for that.
The closest thing I found was here, but pip is not supported, Python is minimal, and the system isolation is only "almost".
3.8. Embedded Distribution

New in version 3.5.

The embedded distribution is a ZIP file containing a minimal Python environment. It is intended for acting as part of another application, rather than being directly accessed by end-users.

When extracted, the embedded distribution is (almost) fully isolated from the user’s system, including environment variables, system registry settings, and installed packages. The standard library is included as pre-compiled and optimized .pyc files in a ZIP, and python3.dll, python36.dll, python.exe and pythonw.exe are all provided. Tcl/tk (including all dependants, such as Idle), pip and the Python documentation are not included.

Note: The embedded distribution does not include the Microsoft C Runtime and it is the responsibility of the application installer to provide this. The runtime may have already been installed on a user’s system previously or automatically via Windows Update, and can be detected by finding ucrtbase.dll in the system directory. Third-party packages should be installed by the application installer alongside the embedded distribution. Using pip to manage dependencies as for a regular Python installation is not supported with this distribution, though with some care it may be possible to include and use pip for automatic updates. In general, third-party packages should be treated as part of the application (“vendoring”) so that the developer can ensure compatibility with newer versions before providing updates to users.
Any ideas?
Thanks.
How about installing Python on one machine and replicating that installation on other computers?
Usually, I install Python in a Windows VirtualBox machine (Microsoft usually gives these away for free for trial purposes or for testing old Internet Explorer versions).
Then I copy the Python directory to my Windows machine (the real host) and it usually works. This also makes it possible to use various Python versions side by side.
Did you try to complete the Python embedded distribution? It doesn't come with Tkinter, but once I was able to copy the missing files into the distribution in a way that worked. Try that too.
You can install pip with get-pip.py.
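From memory, the rough steps on Windows are something like this (the version number, folder and example package are only placeholders):

    rem 1. Unpack python-3.8.2-embed-amd64.zip (from python.org) into a folder, e.g. C:\portable-python
    rem 2. In python38._pth inside that folder, uncomment the "import site" line so that
    rem    packages installed under Lib\site-packages are actually picked up
    rem 3. Download get-pip.py from https://bootstrap.pypa.io/get-pip.py and bootstrap pip
    cd C:\portable-python
    python.exe get-pip.py
    rem 4. Install third-party packages into the portable folder (requests is only an example)
    python.exe -m pip install requests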
I developed my first webserver app in Python.
It's a bit unusual, because it depends not only on Python modules (like tornado) but also on some proprietary C++ libs wrapped using SWIG.
And now it's time to deliver it (to a Linux platform).
Due to the dependency on the C++ lib, just sending the sources with a requirements.txt does not seem enough. The only workaround would be to have the exact same Linux installation to ensure binary compatibility of the lib. But in this case there will be problems with LD_LIBRARY_PATH etc.
Another option is to write a setup.py to create an sdist and then deploy it with pip install.
Unfortunately that would mean I have to kill all instances of the server before installing my package. The workaround would be to use a separate virtualenv for each instance, though.
But maybe I'm missing something much simpler?
If you need the package to be installed by some user, the easiest way will be to write a setup.py - but not just with a simple setup() call like most installers. If you look at some packages, they have very elaborate setup.py scripts which build many things, including C extensions, with installation scripts for many external dependencies.
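As a rough illustration (the module, file and library names below are all made up), a setup.py that builds a SWIG-wrapped C++ extension alongside the Python code could look like this:

    from setuptools import setup, find_packages, Extension

    # Hypothetical SWIG-wrapped C++ extension; names and paths are placeholders
    native = Extension(
        "_mylib",
        sources=["src/mylib.i", "src/mylib_extra.cpp"],
        swig_opts=["-c++"],
        libraries=["proprietarylib"],
        library_dirs=["/opt/vendor/lib"],
        include_dirs=["/opt/vendor/include"],
    )

    setup(
        name="mywebserver",
        version="0.1.0",
        packages=find_packages(),
        ext_modules=[native],
        install_requires=["tornado"],
    )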
You can solve the LD_LIBRARY_PATH problem like this: if your application has an entry point, i.e. some script which you save in the Python bin directory (or the system /usr/bin), you override LD_LIBRARY_PATH there, e.g. export LD_LIBRARY_PATH="/my/path:$LD_LIBRARY_PATH".
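For example, a small wrapper script could look like this (the library path and module name are placeholders):

    #!/bin/sh
    # Entry-point wrapper: point the dynamic linker at the bundled C++ libs,
    # then start the server; /opt/myapp/lib and myapp are placeholders
    export LD_LIBRARY_PATH="/opt/myapp/lib:$LD_LIBRARY_PATH"
    exec python3 -m myapp "$@"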
If your package is a system service, like some servers or daemons, you can build a system package, for example a Debian package or an RPM. Debian has a lot of scripts and mechanisms for declaring dependencies between packages.
So, if you need some system libraries, you list them in the package source and Debian will install them when your package is installed. For example, if your package declares dependencies on SWIG and other -dev packages, your C extension will be built properly.
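As a very rough sketch (every name here is a placeholder), those dependencies end up in the debian/control file, something like:

    Source: mywebserver
    Maintainer: Your Name <you@example.com>
    Build-Depends: debhelper (>= 9), python3-dev, swig

    Package: mywebserver
    Architecture: any
    Depends: ${misc:Depends}, ${shlibs:Depends}, python3
    Description: Example web server packaged as a .deb
     The Build-Depends and Depends fields are where the system libraries
     and -dev packages needed by the C extension are declared.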