I'm planning to run a machine learning script consistently on one of my Google Cloud VMs.
When I configured the remote interpreter, unfortunately all the imported libraries were no longer recognized (presumably because they are not installed in the virtual environment on the cloud VM). I tried to install the missing modules (for example yfinance) through the PyCharm terminal within my remote host connection over SSH and SFTP. So I basically selected 188.283.xxx.xxx #username in the PyCharm terminal and used pip3 install to install the missing modules. Unfortunately my server (due to limited resources) collapses during the build process.
Is there a way to automatically install the needed libraries when connecting the script to the remote interpreter?
Shouldn't that be the standard procedure? And if not: does my approach make sense?
Thank you all in advance
Peter
You could install your modules at runtime (for example by invoking pip from the script and then importing them with importlib), but I'd just opt for creating a requirements file with pip freeze > requirements.txt, which you can then use on the server to get all your dependencies in one go (pip install -r requirements.txt) before your first run.
If it fails for any reason (or when you've updated the requirements file) you can run it again and it will only install whatever wasn't installed before.
This way it's clear which modules (and which version of each module) you've installed. In my experience with machine learning, using the right version or combination of versions can be important, so it makes sense to pin those instead of always pulling the latest version. This especially helps when trying to run an older project.
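A minimal sketch of that workflow, assuming your project lives in ~/project locally and the VM host name below is a placeholder for the one you already reach over SSH/SFTP:

pip freeze > requirements.txt                        # on your local machine, inside the project's environment
scp requirements.txt username@your-vm-host:~/project/
pip3 install -r ~/project/requirements.txt           # on the VM, inside the remote virtual environment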
I currently work on a server at my university, where I am using Python with TensorFlow and CUDA. I built some machine learning models on my local machine and copied them onto the server, where they run fine with the basic Python installation.
However, I want to install a few additional packages and I also want to make sure that I do not have to be worried about updates of the software installed on the server that could cause problems with my code.
I do not have and cannot get admin permissions to install any programs or modules outside my personal directory, since I am only a regular user and a student on top of that.
Now I wondered if there is an easy way to just clone the main Python installation into a virtual environment in my home folder. I could basically create a new virtual environment and install TensorFlow etc., but I found it very frustrating when I tried to set that up with CUDA on my private computer. Moreover, I find it rather complicated to access the server (log-in credentials, VPN, two-factor authentication), so I want to minimize the trial-and-error time that I often need for custom installations of modules.
Approach 1:
use "pip freeze > requirement.txt" to generate the requirements file and install all requirements using requirements file using "pip install -r requirement.txt" command.
Approach 2:
use the virtualenv-clone package to make a clone of the existing environment:
https://pypi.org/project/virtualenv-clone/
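A rough sketch of both approaches (the environment paths are placeholders, and note that virtualenv-clone copies an existing virtualenv rather than the base system installation):

pip freeze > requirements.txt            # Approach 1: capture what the existing installation has
python3 -m venv ~/envs/mlenv             # or: virtualenv ~/envs/mlenv
source ~/envs/mlenv/bin/activate
pip install -r requirements.txt

pip install --user virtualenv-clone      # Approach 2: clone an existing virtualenv
virtualenv-clone /path/to/existing/env ~/envs/mlenv-copy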
First, my reasons to do this - I know it's a bad idea but I am out of ideas.
I want to install a package which requires an ld version higher than the one in the repos of my CentOS 6.5. So I should either set everything up in Docker and run it in production - something I lack experience with and don't feel comfortable doing for a serious project - or upgrade ld manually by building it from source, which, I read, could wreck my CentOS. So the last option I am left with is to install the package on another machine and manually copy it to site-packages.
I have successfully installed the package on my home laptop under Debian.
I encountered advice everywhere to copy the whole site-packages directory. That is something I don't want to do, as I have different packages on the two machines and I want to avoid messing up other stuff.
I copied the package's .so build and its .egg-info. Then, on the target machine, pip freeze indeed showed the transferred package. However, Python can't find it when I try to import and use it.
Am I missing something else?
Not any of that.
Don't mess with the system Python's site-packages dir; it belongs to the system Python environment only. You should only add or remove code there through your OS package manager (that's yum on CentOS). This is especially true on Linux, where many OS services rely on the system Python.
So what to do instead? Use a virtualenv and/or pipx to isolate the package you want to install, and its dependencies, from the system versions.
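A minimal sketch of the virtualenv route; the package name and paths below are placeholders for the one you built:

virtualenv ~/venvs/mypkg                 # create an isolated environment under your home directory
source ~/venvs/mypkg/bin/activate
pip install packagename                  # or: pip install /path/to/a/wheel/built/elsewhere.whl
python -c "import packagename"           # quick check that the import works inside the env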
How to package Python itself into virtualenv? Is this even possible?
I'm trying to run Python on a machine it is not installed on, and I thought virtualenv made this possible. It activates, but can't run any Python.
When setting up the virtualenv (this can also be done if it is already set up), simply do:
python -m virtualenv -p python env
And Python will be added to the virtualenv and become its default python.
A specific Python version can also be passed to -p; plain python picks the first version found on the PATH.
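For example, a quick sketch (assuming python3.8 is on the PATH of the machine where you create the environment):

python -m virtualenv -p python3.8 env
source env/bin/activate
python --version    # reports the interpreter the env was created from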
virtualenv makes it convenient to use multiple Python versions in different projects on the same machine and to isolate the libraries pip installs for each project. It doesn't install or manage Python itself: Python must already be installed on the machine before you can install or configure the virtualenv tool or switch into a virtual environment.
Side note, consider using virtualenvwrapper — great helper for virtualenv.
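In case it helps, a rough virtualenvwrapper setup sketch (the script location varies with how and where it was installed, so treat that path as a placeholder):

pip install --user virtualenvwrapper
export WORKON_HOME=~/.virtualenvs
source ~/.local/bin/virtualenvwrapper.sh   # path varies; sometimes /usr/local/bin/virtualenvwrapper.sh
mkvirtualenv myproject                     # create and activate an env
workon myproject                           # switch back to it later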
You haven't specified the Operating System you are using.
In case you're using Windows, you don't use virtualenv for this. Instead you:
Download the Python embeddable package
Unpack it
Uncomment import site in the python37._pth file (only if you want to add additional packages; see the example after this list)
Manually copy your additional packages (the ones you usually install with pip) to Lib\site-packages (you need to create that directory first, of course)
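For reference, the ._pth file in the embeddable package is just a plain text list of paths; after uncommenting, it looks roughly like this (the version number and comment text may differ between releases):

python37.zip
.
# Uncomment to run site.main() automatically
import site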
A Python installation set up this way is relocatable: it can be moved to and run from any location.
You only have to ensure the Microsoft C Runtime is installed on the system (but it almost always already is). See the documentation note:
Note The embedded distribution does not include the Microsoft C Runtime and it is the responsibility of the application installer to provide this. The runtime may have already been installed on a user’s system previously or automatically via Windows Update, and can be detected by finding ucrtbase.dll in the system directory.
You might need to install Python in a location where you have the permissions to do so.
PyCharm Professional has a Remote Deployment feature that allows for editing, running and debugging code remotely. This is a powerful feature when writing short scripts and top-level applications that make use of standard or third-party library packages. You can even create a virtualenv on the remote, with all dependency packages installed, and use that to execute the remote program.
However, when writing applications that make use of multiple packages that are developed alongside the application, it becomes necessary to edit those packages too. Without PyCharm, the usual way to do this is pip install -e . or python setup.py develop, which integrates the source directory with Python's package system, making it possible to edit a number of packages alongside the application.
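For context, a sketch of that usual non-PyCharm workflow, with hypothetical sibling package directories, run inside the target virtualenv:

pip install -e ~/work/mylib
pip install -e ~/work/myotherlib
pip install -e ~/work/myapp       # edits in any of these take effect without reinstalling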
With a single package, I've found that PyCharm will deploy the package code into its remote workspace, which works OK for debugging if I'm running a script or entry point from within that same package.
The problem I'm having with PyCharm is that it's not clear how to remotely edit and debug multiple packages. Let's say I have a PyCharm project open for one of these packages. When finding references or debugging into code that is in another (yet still developed-by-me) package, PyCharm shows a cached version of the second package (on my local machine). This is fine until I edit the second package on the remote host - after which the cached version is now out of sync and doesn't automatically update. This leads to a mismatch between execution result and debugger/editor state.
There are other quirks too, such as the edited package not actually being installed into the remote's virtualenv.
I haven't been able to find a proper guide to this workflow in PyCharm's documentation, and I'm starting to wonder if I'm either going about this the entirely wrong way, or maybe PyCharm just doesn't support this kind of app+multiple-packages development?
First let me explain the current situation:
We have several Python applications which depend on custom (not publicly released) packages as well as generally known ones. These dependencies are all installed into the system Python installation. The application is distributed as source via git. All these computers are hidden inside a corporate network and don't have internet access.
This approach is a bit of a pain since it has the following downsides:
Libs have to be installed manually on each computer :(
How can we deploy the application better? I recently came across virtualenv, which seems to be the solution, but I don't quite see how yet.
virtualenv creates a clean Python instance for my application. How exactly should I deploy this so that users of the software can easily start it?
Should there be a startup script inside the application which creates the virtualenv during start?
The next problem is that the computers don't have internet access. I know that I can specify a custom location for packages (network share?) but is that the right approach? Or should I deploy the zipped packages too?
Would another approach be to ship the whole Python instance, so the user doesn't have to start up the virtualenv? In this Python instance all necessary packages would be pre-installed.
Since our apps are growing fast, we have a short release cycle (2 weeks). Deploying via git was very easy: users could pull from a stable branch via an update script to get the latest release. Would that still be possible, or are there better approaches?
I know that these are a lot of questions. Hopefully someone can answer me or give me some advice.
You can use pip to install directly from git:
pip install -e git+http://192.168.1.1/git/packagename#egg=packagename
This applies whether you use virtualenv (which you should) or not.
You can also create a requirements.txt file containing all the stuff you want installed:
-e git+http://192.168.1.1/git/packagename#egg=packagename
-e git+http://192.168.1.1/git/packagename2#egg=packagename2
And then you just do this:
pip install -r requirements.txt
So the deployment procedure would consist of getting the requirements.txt file and then executing the above command. Adding virtualenv would make it cleaner, not easier; without virtualenv you would pollute the system-wide Python installation. virtualenv is meant to let many apps each run in its own distinct virtual Python environment; it doesn't have much to do with how you actually install stuff into that environment.
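To tie it to the two-week release cycle mentioned in the question, the per-machine update script could stay as simple as this sketch (the paths, branch name and internal git server are placeholders, and it assumes a virtualenv was created once during the initial install):

#!/bin/sh
set -e
cd /opt/myapp                         # hypothetical checkout location
git pull origin stable                # fetch the latest stable release from the internal git server
. /opt/myapp/venv/bin/activate        # virtualenv created at install time
pip install -r requirements.txt       # picks up any new or changed dependencies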