Proper way to customize third-party python package in project [duplicate] - python

I installed some package via pip install something. I want to edit the source code for the package something. Where is it (on ubuntu 12.04) and how do I make it reload each time I edit the source code and run it?
Currently I am editing the source code, and then running python setup.py again and again, which turns out to be quite a hassle.

You should never edit an installed package. Instead, install a forked version of package.
If you need to edit the code frequently, DO NOT install the package via pip install something and edit the code in '.../site_packages/...'
Instead, put the source code under a development directory, and install it with
$ python setup.py develop
or
$ pip install -e path/to/SomePackage
Or use a vcs at the first place
$ pip install -e git+https://github.com/lakshmivyas/hyde.git#egg=hyde
Put your changes in a version control system, and tell pip to install it explicitly.
Reference:
Edit mode

You can edit the files installed in /usr/local/lib/python2.7/dist-packages/. Do note that you will have to use sudo or become root.
The better option would be to use virtual environment for your development. Then you can edit the files installed with your permissions inside your virtual environment and only affect the current project.
In this case the files are in ./venv/lib/pythonX.Y/site-packages
The path could be dist-packages or site-packages, you can read more in the answer to this question
Note that, as the rest of the people have mentioned, this should only be used sparingly, for small tests or debug, and being sure to revert your changes to prevent issues when upgrading the package.
To properly apply a change to the package (a fix or a new feature) go for the options described in other answers to contribute to the repo or fork it.

I too needed to change some things inside a package. Taking inspiration from the previous answers, You can do the following.
Fork the package/repo to your GitHub
clone your forked version and create a new branch of your choice
make changes and push code to the new branch on your repository
you can easily use pip install -e git+repositoryurl#branchname
There are certain things to consider if its a private repository

If you are doing the custom module that you want hot loading, you can put your running code also inside the module. Then you can use python -m package.your_running_code. In this way, you can change the module in the package and reflect the result of your running code immediately.

Related

How can I edit a GitHub repository (for a Python package) locally and run the package with my changes?

I have cloned a GitHub repository that contains the code for a Python package to my local computer (it's actually on a high performance cluster). I have also installed the package with pip install 'package_name'. If I now run a script that uses the package, it of course uses the installed package and not the cloned repository, so if I want to make changes to the code, I cannot run those. Is there a way to do this, potentially with pip install -e (but I read that was deprecated) or a fork? How could I then get novel updates in the package to my local version, as it is frequently updated?
If you run an IDE like PyCharm, you can mark a folder in your project as Sources Root. It will then import any packages from that folder instead of the standard environment packages.
In the end I indeed did use pip install -e, and it is working for now. I will figure it out once the owner of the package releases another update!

Do others need to have the same modules installed to run my code in Python?

If I use a module such as tkinter, would somebody need to have that module installed as well in order for my code to run on their machine?
Definitely. You can use virtual environments or containers to deliver required packages or have a requrements.txt or similar to install the dependencies.
python comes with a number of standard modules pre-installed, if the other person is running python (the same version of you) then he/she won't need to install anything, it will just work, that's the case of tkinter. But if you use external packages that you installed to run your code, for example celery, then he/she will need to do the same thing.
If you gave your code to someone to run, they would need to download the same modules, unless you also sent the environment too. The only way I know around this is to freeze your code where you would create an executable. I've used cx_Freeze and pyInstaller and haven't had any issues but it also depends on your needs. You can find some more information through here:
https://docs.python-guide.org/shipping/freezing/
Hope this helps!
In your running environment do a, this file you add to your repo
pip freeze > requirements.txt
When people clone your repo, they only have to do a:
pip install -r requirements.txt
and they will install exactly the same pypi modules you have.
With virtualenv you can isolate a python environment to each project, with pyenv you can use different pythonversions withing the various environment also.

setup.py + virtualenv = chicken and egg issue?

I'm a Java/Scala dev transitioning to Python for a work project. To dust off the cobwebs on the Python side of my brain, I wrote a webapp that acts as a front-end for Docker when doing local Docker work. I'm now working on packaging it up and, as such, am learning about setup.py and virtualenv. Coming from the JVM world, where dependencies aren't "installed" so much as downloaded to a repository and referenced when needed, the way pip handles things is a bit foreign. It seems like best practice for production Python work is to first create a virtual environment for your project, do your coding work, then package it up with setup.py.
My question is, what happens on the other end when someone needs to install what I've written? They too will have to create a virtual environment for the package but won't know how to set it up without inspecting the setup.py file to figure out what version of Python to use, etc. Is there a way for me to create a setup.py file that also creates the appropriate virtual environment as part of the install process? If not — or if that's considered a "no" as this respondent stated to this SO post — what is considered "best practice" in this situation?
You can think of virtualenv as an isolation for every package you install using pip. It is a simple way to handle different versions of python and packages. For instance you have two projects which use same packages but different versions of them. So, by using virtualenv you can isolate those two projects and install different version of packages separately, not on your working system.
Now, let's say, you want work on a project with your friend. In order to have the same packages installed you have to share somehow what versions and which packages your project depends on. If you are delivering a reusable package (a library) then you need to distribute it and here where setup.py helps. You can learn more in Quick Start
However, if you work on a web site, all you need is to put libraries versions into a separate file. Best practice is to create separate requirements for tests, development and production. In order to see the format of the file - write pip freeze. You will be presented with a list of packages installed on the system (or in the virtualenv) right now. Put it into the file and you can install it later on another pc, with completely clear virtualenv using pip install -r development.txt
And one more thing, please do not put strict versions of packages like pip freeze shows, most of time you want >= at least X.X version. And good news here is that pip handles dependencies by its own. It means you do not have to put dependent packages there, pip will sort it out.
Talking about deploy, you may want to check tox, a tool for managing virtualenvs. It helps a lot with deploy.
Python default package path always point to system environment, that need Administrator access to install. Virtualenv able to localised the installation to an isolated environment.
For deployment/distribution of package, you can choose to
Distribute by source code. User need to run python setup.py --install, or
Pack your python package and upload to Pypi or custom Devpi. So the user can simply use pip install <yourpackage>
However, as you notice the issue on top : without virtualenv, they user need administrator access to install any python package.
In addition, the Pypi package worlds contains a certain amount of badly tested package that doesn't work out of the box.
Note : virtualenv itself is actually a hack to achieve isolation.

What are the Python equivalents to Ruby's bundler / Perl's carton?

I know about virtualenv and pip. But these are a bit different from bundler/carton.
For instance:
pip writes the absolute path to shebang or activate script
pip doesn't have the exec sub command (bundle exec bar)
virtualenv copies the Python interpreter to a local directory
Does every Python developer use virtualenv/pip? Are there other package management tools for Python?
From what i've read about bundler — pip without virtualenv should work just fine for you. You can think of it as something between regular gem command and bundler. Common things that you can do with pip:
Installing packages (gem install)
pip install mypackage
Dependencies and bulk-install (gemfile)
Probably the easiest way is to use pip's requirements.txt files. Basically it's just a plain list of required packages with possible version constraints. It might look something like:
nose==1.1.2
django<1.3
PIL
Later when you'd want to install those dependencies you would do:
$ pip install -r requirements.txt
A simple way to see all your current packages in requirements-file syntax is to do:
$ pip freeze
You can read more about it here.
Execution (bundler exec)
All python packages that come with executable files are usually directly available after install (unless you have custom setup or it's a special package). For example:
$ pip install gunicorn
$ gunicorn -h
Package gems for install from cache (bundler package)
There is pip bundle and pip zip/unzip. But i'm not sure if many people use it.
p.s. If you do care about environment isolation you can also use virtualenv together with pip (they are close friends and work perfectly together). By default pip installs packages system-wide which might require admin rights.
You can use pipenv, which has similar interface with bundler.
$ pip install pipenv
Pipenv creates virtualenv automatically and installs dependencies from Pipfile or Pipfile.lock.
$ pipenv --three # Create virtualenv with Python3
$ pipenv install # Install dependencies from Pipfile
$ pipenv install requests # Install `requests` and update Pipfile
$ pipenv lock # Generate `Pipfile.lock`
$ pipenv shell # Run shell with virtualenv activated
You can run command with virtualenv scope like bundle exec.
$ pipenv run python3 -c "print('hello!')"
There is a clone pbundler.
The version that is currently in pip simply reads the requirements.txt file you already have, but is much out of date. It's also not totally equivalent: it insists on making a virtualenv. Bundler, I notice, only installs what packages are missing, and gives you the option of giving your sudo password to install into your system dirs or of restarting, which doesn't seem to be a feature of pbundler.
However, the version on git is an almost complete rewrite to be much closer to Bundler's behaviour... including having a "Cheesefile" and now not supporting requirements.txt. This is unfortunate, since requirements.txt is the de facto standard in pythonland, and there's even Offical BDFL-stamped work to standardize it. When that comes into force, you can be sure that something like pbundler will become the de facto standard. Alas, nothing quite stable yet that I know of (but I would love to be proven wrong).
I wrote one — https://github.com/Deepwalker/pundler .
On PIP its pundle, name was already taken.
It uses requirements(_\w+)?.txt files as your desired dependencies and creates frozen(_\w+)?.txt files with frozen versions.
About (_\w+)? thing — this is envs. You can create requirements_test.txt and then use PUNDLEENV=test to use this deps in your run with requirements.txt ones alongside.
And about virtualenv – you need not one, its what pundle takes from bundler in first head.
Python Poetry is the closest to Ruby bundler as of 2020 (and already since 2018). It's already more than two years old, still very active, has great documentation. One might complain about curl-pipe-python-style being the recommended way of installing, but there are alternatives, e.g. homebrew on macOS.
Primary website: https://python-poetry.org/
Github: https://github.com/python-poetry/poetry
Documentation: https://python-poetry.org/docs/
It uses virtualenvs behind the scenes (in contrast to bundler), but it provides and uses a lock-file, takes care of sub dependencies, adheres to specified version constraints and allows automatically updating outdated packages. There's even autocompletion for your favorite shell.
With its use of a pyproject.toml file, it's also going a bit further than bundler (closer to a gemspec. It's also comparable to JavaScript's and TypeScript's npm and yarn).
Poetrify (a complementing project) helps converting projects from requirements.txt to pyproject.toml for Poetry.
The lock file can be exported to requirements.txt by poetry export -f requirements.txt > requirements.txt, if you need that for other tooling (or the unlikely case want to go back).
I'd say Shovel is worth a look. It was developed specifically to the Pythonish version of Rake. There's not a ton of commit activity on the project, but seems stable and useful.
You can use pipx to install and run Python Applications in Isolated Environments automatically.
You can use pipenv to create and manage a virtualenv for your projects automatically.
Both wraps pip with virtual environment tools and aiming for different use cases.
All of these are one of the most stared project listed in the github PyPA repository.
FYI: Debian bullseye/testing currently lacks pipx. But package from sid should work fine. (2021-06-19)
No, no all the developers use virtualenv and/or pip, but many developers use/prefer these tools
And now, for package development tools and diferent environments that is your real question. Exist any other tools like Buildout (http://www.buildout.org/en/latest/) for the same purpose, isolate your environment Python build system for every project that you manage. For some time I use this, but not now.
Independent environments per project, in Python are a little different that the same situation in Ruby. In my case i use pyenv (https://github.com/yyuu/pyenv) that is something like rbenv but, for Python. diferent versions of python and virtualenvs per project, and, in this isolated environments, i can use pip or easy-install (if is needed).

Best practice for installing python modules from an arbitrary VCS repository

I'm newish to the python ecosystem, and have a question about module editing.
I use a bunch of third-party modules, distributed on PyPi. Coming from a C and Java background, I love the ease of easy_install <whatever>. This is a new, wonderful world, but the model breaks down when I want to edit the newly installed module for two reasons:
The egg files may be stored in a folder or archive somewhere crazy on the file system.
Using an egg seems to preclude using the version control system of the originating project, just as using a debian package precludes development from an originating VCS repository.
What is the best practice for installing modules from an arbitrary VCS repository? I want to be able to continue to import foomodule in other scripts. And if I modify the module's source code, will I need to perform any additional commands?
Pip lets you install files gives a URL to the Subversion, git, Mercurial or bzr repository.
pip install -e svn+http://path_to_some_svn/repo#egg=package_name
Example:
pip install -e hg+https://rwilcox#bitbucket.org/ianb/cmdutils#egg=cmdutils
If I wanted to download the latest version of cmdutils. (Random package I decided to pull).
I installed this into a virtualenv (using the -E parameter), and pip installed cmdutls into a src folder at the top level of my virtualenv folder.
pip install -E thisIsATest -e hg+https://rwilcox#bitbucket.org/ianb/cmdutils#egg=cmdutils
$ ls thisIsATest/src
cmdutils
Are you wanting to do development but have the developed version be handled as an egg by the system (for instance to get entry-points)? If so then you should check out the source and use Development Mode by doing:
python setup.py develop
If the project happens to not be a setuptools based project, which is required for the above, a quick work-around is this command:
python -c "import setuptools; execfile('setup.py')" develop
Almost everything you ever wanted to know about setuptools (the basis of easy_install) is available from the the setuptools docs. Also there are docs for easy_install.
Development mode adds the project to your import path in the same way that easy_install does. An changes you make will be available to your apps the next time they import the module.
As others mentioned, you can also directly use version control URLs if you just want to get the latest version as it is now without the ability to edit, but that will only take a snapshot, and indeed creates a normal egg as part of the process. I know for sure it does Subversion and I thought it did others but I can't find the docs on that.
You can use the PYTHONPATH environment variable or symlink your code to somewhere in site-packages.
Packages installed by easy_install tend to come from snapshots of the developer's version control, generally made when the developer releases an official version. You're therefore going to have to choose between convenient automatic downloads via easy_install and up-to-the-minute code updates via version control. If you pick the latter, you can build and install most packages seen in the python package index directly from a version control checkout by running python setup.py install.
If you don't like the default installation directory, you can install to a custom location instead, and export a PYTHONPATH environment variable whose value is the path of the installed package's parent folder.

Categories