Where do you clone Python module git repositories?

I'm interested in contributing to a GitHub Python module repo, but I'm not entirely sure where to clone it. This is a simple module, just an __init__.py and some .py files. No other files need to be installed or changed outside of the module's folder.
I would like to be able to clone the repository directly into my site-packages folder. When I want to use the library as is, I would switch to the master branch. If I want to develop a new feature, I can branch off of devel. If I want to try out a feature someone else implemented, I can switch to that particular branch. I can even stay on the development branch to get the latest, albeit possibly unstable, features. All this without having to change the import statement to point to a different location in any of my scripts. This option, even though it seems to do everything I want, feels a bit wrong for some reason. Also, I'm not sure what this would do to pip when calling python -m pip list --outdated. I have a feeling it won't know what the current version is.
Another option would be to clone it to some other folder and keep only the pip-installed variant in the site-packages folder. That way I would have a properly installed library in site-packages and I could try out new features by creating a script inside the repo folder. This doesn't seem nearly as flexible as the option above, but it doesn't mess with the site-packages folder.
Which is the best way to go about this? How do you clone repositories when you both want to work on them and use them with the latest features?

I think this is more a question about packaging and open source than Python itself, but I'll try to help you out.
If you want to host your package on PyPI, you should go here, and there you'll see how to upload and tag your package appropriately for usage.
If you want to add some functionality to an open source library, you could submit a Pull Request to that library, so everybody can use it. Rules for PRs are specific to each project, so you should ask the maintainer.
If your modification doesn't get merged into master, but you still want to use it without changing import statements, you can fork that repo and publish your own modifications on, for instance, GitHub.
In that case, you could install your modifications like this:
pip install git+https://github.com/username/amazing-project.git
So in that way, your library will come from your own repo.
If you're going for the third option, I strongly recommend using virtualenv, where you can create different virtual environments with different packages, dependencies and so on, without messing up your Python installation. A nice guide is available here.
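A minimal sketch of that workflow (the environment name and fork URL are placeholders):
virtualenv venv                    # create an isolated environment
source venv/bin/activate           # activate it (Linux/macOS)
pip install git+https://github.com/username/amazing-project.git
Inside the activated environment the import statements stay the same, but the code comes from your fork.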


Enable module modification in virtualenv

I have 10 Django projects that use over 50 Django apps. Each app is separated into its own project, published to PyPI, and used by a few projects. Everything is fine, except that every time I work on a project and want to change some code in one of my modules (which happens a lot), I have to open the module project, make my changes, test, and publish to PyPI, then come back to my project, update the requirements.txt file, and get the updated module from pip.
I'm looking for a way to edit a module right away from any of my projects. For example, instead of getting it from PyPI, I want to get it from git and be able to commit to the git repository in my venv folder!
I know it seems a little bit crazy, but I could save a lot of time! The publisher and user of all of the modules is me, so I don't mind the user being able to change them as well.
Any thought or suggestion will be appreciated. Any non-pip solution is fine as well, like writing a custom shell script.
I don't know about editing in your venv folder, which I think is not a good practice, but you can install from GitHub with pip:
pip install git+https://github.com/urltoproject/repository.git
Fill in the necessary details yourself, of course. This also works with other systems like GitLab. You could keep a separate development requirements file and a production requirements file to separate the two environments, or you can install on the command line directly with pip.
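If you really do want a checkout you can edit and commit from, pip's editable mode may be closer to what you're after (a sketch; the URL and egg name are placeholders):
pip install -e git+https://github.com/urltoproject/repository.git#egg=repository
This typically clones the repository into a src/ directory (inside the venv or the current directory, depending on your pip version), where you can edit, commit, and push as usual while the venv imports that working copy.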

How to specify another tox project folder as a dependency for a tox project

We have a tox-enabled project (let's call it "main" project) which has a dependency on another tox project (let's call it "library" project) - all united in one repository since it is all part of a large overarching project.
How the project works for the regular user
For a regular install as an end-user you would simply install "library" first and then "main", right from the repository or whatever sources, and then run it.
What our issue is with tox
However, as a developer the situation is different, because "tox" should work and you might want to have multiple versions around at the same time.
You usually check out the large overarching repository and then the filesystem layout is this:
overarchingproject/main/
overarchingproject/main/src/
overarchingproject/main/tox.ini
overarchingproject/main/setup.py
...
overarchingproject/library/
overarchingproject/library/src/
overarchingproject/library/tox.ini
overarchingproject/library/setup.py
...
Now if I go into main/ and type "tox", this happens:
Current behavior: It will try to build the "main" project with the dependency on "library" - which will obviously result in an attempt to get "library" from PyPI. However, the project is not released yet (therefore not on PyPI), so it won't work - even though the lib is right there in the same repo.
Result: it doesn't work.
Workaround 1: We could set up our own package index, or ask users to do that. However, asking everyone contributing to the project to set up DevPI or similar just to be able to run the unit tests doesn't seem like a good idea, so we'd need to do it centrally.
But even if we provided a package index at some central place, or a pip package for "library", people couldn't easily run the tests of "main" against a modified version of "library" they created themselves:
"library" is in the same repository after all, so people might as well modify it at some point.
That typing "tox" inside the "main" project folder will not pick up the neighbouring "library" version that is right there, but only a prepackaged online one, isn't exactly intuitive.
Workaround 2: We tried sitepackages=True and installing "library" in the system - however, sitepackages=True has caused us a notable amount of trouble and doesn't seem like a good idea in general.
Desired behavior: We want tox to use the local version of "library" in that folder, right in the same overarching repository, which people will usually get in one checkout:
That version might be newer or even locally modified, so it is clearly what the dev user wants to use. And it exists, which cannot be said of the PyPI package right now.
Why do we want the overarching repository anyway with subprojects ("main", "library", ...) and not just one single project?
We develop a large multi-daemon project, with many daemons for various purposes and shared code in some libs, forming a university course management system (which deals with forums, course management with the possibility to hand things in, attached code versioning systems for student projects, etc.).
It is possible to just use a subset of daemons, so it makes sense that they are separate projects, but still they are connected enough that most people would want to have most of them - therefore all of them in one repository.
The library itself is also suitable to be used for entirely different projects, but it is usually used for ours to start with - so that is where it is stuffed into the repository. So that means it is always around in the given relative path, but it has its separate tox.ini and unit tests.
TL;DR / Summary
So how can we make tox look for a specific dependency in another toxable project folder instead of just pip when installing a project?
Of course "main"'s regular setup.py install process shouldn't mess around with tox or searching the local disk: it should just check for one specific relative path, and then give up if that one is not present (and fall back to pip or whatever).
So best would be if the relative path could be somehow stored in tox.ini.
Or is this all just a pretty bad idea? Should we solve this differently to make our "main" project easily toxable with the latest local dev version of "library" as present in the local repository?
You can use pip's --editable option in your main project, like the following:
[testenv]
deps =
    --editable=file:///{toxinidir}/../library
    -r{toxinidir}/requirements.txt
P.S. Don't use the short style -e file:///{toxinidir}/../library, because tox passes the whole string as a single argument, which pip's argument parser cannot handle.
As suggested in the comments on diabloneo's answer, it is possible to supply an install_command in the tox.ini file:
I used this to make a bash script that takes all the usual pip arguments, but first runs pip install --editable="file://`pwd`/../path/to/neighbour/repo", and only then runs the regular pip install "$@" with the arguments passed to the script (as tox would have passed them to pip directly). I then used this script as install_command instead of the regular default pip command.
With this two-step procedure it works fine :-)
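Put together, that looks roughly like this (a sketch; the script name and relative path are assumptions):
# tox.ini
[testenv]
install_command = {toxinidir}/install-with-local-lib.sh {opts} {packages}

# install-with-local-lib.sh
#!/bin/bash
set -e
# first install the neighbouring library in editable mode ...
pip install --editable="file://$(pwd)/../library"
# ... then run the regular pip install with the arguments tox passed us
pip install "$@"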
I realize it's been a long time since the question was asked, but this might be useful to someone: I tried to do the same, and it turns out it can be done using the distshare directory of tox: https://tox.wiki/en/latest/example/general.html#access-package-artifacts-between-multiple-tox-runs
# example two/tox.ini
[testenv]
# install latest package from "one" project
deps = {distshare}/one-*.zip
You should try a separate tool for that, like Ansible.
Example:
Ansible creates and runs your virtualenv.
Ansible installs local dependencies, such as your subprojects.
In every subproject, use tox to run tests, lint and docs.
That was a more convenient way for us to run tox on subprojects.
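A rough illustration of that idea (a sketch only; the paths and virtualenv location are assumptions about your layout):
# playbook.yml
- hosts: localhost
  tasks:
    - name: install the local library into the project virtualenv
      pip:
        name: file:///path/to/overarchingproject/library
        virtualenv: /path/to/overarchingproject/main/.venv
    - name: run the test suite via tox
      command: tox
      args:
        chdir: /path/to/overarchingproject/main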

How to "build" a python script with its dependencies

I have a simple Python shell script (no GUI) that uses a couple of dependencies (requests and BeautifulSoup4).
I would like to share this simple script across multiple computers. Each computer already has Python installed, and they all run Linux.
At this moment, on my development environment, the application runs inside a virtualenv with all its dependencies.
Is there any way to share this application with all its dependencies, without needing to install them with pip?
I would like to just run python myapp.py to run it.
You will need to either create a single-file executable using something like bbfreeze or pyinstaller, or bundle your dependencies (assuming they're pure Python) into a .zip file and then put it on your PYTHONPATH (ex: PYTHONPATH=deps.zip python myapp.py).
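The zip approach can look like this (a sketch, assuming pure-Python dependencies listed in a requirements.txt):
pip install --target=deps -r requirements.txt   # install the deps into ./deps
(cd deps && zip -r ../deps.zip .)               # bundle them into a zip
PYTHONPATH=deps.zip python myapp.py             # run with the zip on the path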
The much better solution would be to create a setup.py file and use pip. Your setup.py file can create dependency links to files or repos if you don't want those machines to have access to the outside world. See this related issue.
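A minimal sketch of such a setup.py (the project name and version are placeholders):
from setuptools import setup, find_packages

setup(
    name="myapp",          # placeholder project name
    version="0.1",
    packages=find_packages(),
    install_requires=[
        "requests",
        "beautifulsoup4",
    ],
)
Each machine can then run pip install . from a checkout (or install a built sdist) and have the dependencies pulled in automatically.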
As long as you make the virtualenv relocatable (use the --relocatable option on it in its original place), you can literally just copy the whole virtualenv over. If you create it with --copy-only (you'll need to patch the bug in virtualenv), then you shouldn't even need to have python installed elsewhere on the target machines.
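Roughly (a sketch; the environment name and script are illustrative):
virtualenv --relocatable myenv      # run this where the env was created
tar czf myenv.tar.gz myenv          # ship the archive to the other machines
# on the target machine:
tar xzf myenv.tar.gz
myenv/bin/python myapp.py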
Alternatively, look at http://guide.python-distribute.org/ and learn how to create an egg or wheel. An egg can then be run directly by python.
I haven't tested your particular case, but you can find the source code (either mirrored or original) on a site like GitHub.
For example, for BeautifulSoup, you can find the code here.
You can put the code into the same folder as your script (a rename is probably a good idea, so it doesn't shadow an already-installed package). Just note that you won't get any updates.

Generating patches for open source, source code in virtualenv site-packages

I have an open source Python library sitting in my virtualenv site-packages, and I noticed a bug in that library and would like to contribute my patches back to the open source project.
The problem is that my virtualenv site-packages is not version-controlled by git (obviously, since it was installed via pip), and it's a pain to manually rename a specific string that is causing the bug (spread across 10+ files) and then use diff to generate the patches.
A simpler way - since the project is hosted on GitHub - would be to place that library under git control and then make a "pull request" on GitHub. But I am not sure whether it makes sense to directly manage a git repository inside my virtualenv's site-packages directory. (Will that cause problems for pip?)
How would you manage your personal workflow to contribute back to open source projects efficiently in such a scenario?
Fork the project on github, clone it to a directory separate from your virtualenv, make the pull request, and install your own fork into the virtualenv by pointing pip at your fork in github.
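Concretely, something like this (a sketch; the fork URL, paths and branch name are placeholders):
git clone https://github.com/you/library.git ~/src/library   # your fork
cd ~/src/library
git checkout -b fix-the-bug
# edit, commit, push, then open the pull request on GitHub
pip install -e ~/src/library   # the virtualenv now imports your working copy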

Is site-packages appropriate for applications or just libraries?

I'm in a bit of a discussion with some other developers on an open source project. I'm new to python but it seems to me that site-packages is meant for libraries and not end user applications. Is that true or is site-packages an appropriate place to install an application meant to be run by an end user?
Once you get to the point where your application is ready for distribution, package it up for your favorite distributions/OSes in a way that puts your library code in site-packages and executable scripts on the system path.
Until then (i.e. for all development work), don't do any of the above: save yourself major headaches and use zc.buildout or virtualenv to keep your development code (and, if you like, its dependencies as well) isolated from the rest of the system.
We do it like this.
Most stuff we download is in site-packages. It comes from PyPI or SourceForge or some other external source; it is easy to rebuild, highly reused, and doesn't change much.
Most stuff we write is in other locations (usually under /opt, or c:\opt) AND is included in the PYTHONPATH.
There's no great reason for keeping our stuff out of site-packages. However, our feeble excuse is that our stuff changes a lot. Pretty much constantly. To reinstall in site-packages every time we think we have something better is a bit of a pain.
Since we're testing out of our working directories or SVN checkout directories, our test environments make heavy use of PYTHONPATH.
The development use of PYTHONPATH bled over into production. We use a setup.py for production installs, but install to an alternate home under /opt and set the PYTHONPATH to include /opt/ourapp-1.1.
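In practice that looks something like this (a sketch; the package name and version are illustrative):
python setup.py install --home=/opt/ourapp-1.1
export PYTHONPATH=/opt/ourapp-1.1/lib/python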
The program run by the end user is usually somewhere in their path, with most of the code in the module directory, which is often in site-packages.
Many python programs will have a small script located in the path, which imports the module, and calls a "main" method to run the program. This allows the programmer to do some upfront checks, and possibly modify sys.path if needed to find the needed module. This can also speed up load time on larger programs, because only files that are imported will be run from bytecode.
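A typical launcher looks roughly like this (a sketch; the package and main-function names are placeholders):
#!/usr/bin/env python
# thin launcher script installed on the user's PATH, e.g. /usr/local/bin/myapp
import sys
# sys.path could be adjusted here if the package lives outside site-packages
from myapp import main   # assumed package exposing a main() entry point

if __name__ == "__main__":
    sys.exit(main())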
Site-packages is for libraries, definitely.
A hybrid approach might work: you can install the libraries required by your application in site-packages and then install the main module elsewhere.
If you can turn part of the application into a library and provide an API, then site-packages is a good place for it. This is actually how many Python applications do it.
But from a user's or administrator's point of view, that isn't actually the problem. The problem is how we can manage the installed stuff: after I have installed it, how can I upgrade and uninstall it?
I use Fedora. If I use the Python that came with it, I don't like installing things to site-packages outside the RPM system. In some cases I have built an RPM myself to install things.
If I build my own Python outside RPM, then I naturally want to use Python's mechanisms to manage it.
A third way is to use something like easy_install to install things, for example, as a user into the home directory.
So:
Allow packaging to distributions.
Allow selecting the python to use.
Allow using python installed by distribution where you don't have permissions to site-packages.
Allow using python installed outside distribution where you can use site-packages.
