Releasing a python package - should you include doc and tests?


So, I've released a small library on pypi, more as an exercise (to "see how it's done") than anything else.
I've uploaded the documentation on readthedocs, and I have a test suite in my git repo.
Since I figure anyone who might be interested in running the test will probably just clone the repo, and the doc is already available online, I decided not to include the doc and test directories in the released package, and I was just wondering if that was the "right" thing to do.
I know answers to this question will be rather subjective, but I felt it was a good place to ask in order to get a sense of what the community considers to be the best practice.

It is not required, but it is recommended, to include documentation as well as unit tests in the package.
Regarding documentation:
Traditional source releases of open source software contain documentation; this is a (de facto?) standard (have a look at GNU software, for example). Documentation is part of the code and should be part of the release, simply because once you have downloaded the source release you are self-sufficient. Ever been on a train somewhere without internet access, needing a quick look at the documentation of module X? And then realized with relief that the docs were already there, locally?
Another important point in this regard is that the documentation you bundle with the code is guaranteed to match that version of the code. Code and docs are in sync.
One more thing especially regarding Python: you can write your docs using Sphinx and then build beautiful HTML output based on the documentation source in the process of installing the package. I have seen various Python packages doing exactly this.
Regarding tests:
Imagine the tests are bundled in the source release and are easy for the user to run (you should document how to do this). Then, if the user observes a problem with your code that is not easy to track down, they can simply run the unit tests in their own environment and see whether at least those pass. If they do not, you have probably made a wrong assumption when specifying the behavior of your code, which is good to know about. What I want to say is: it can be very good for you as a developer if you make it very simple for the user to execute the unit tests.

Related

Why has the syntax changed from flask.ext.* to flask_*?

It looks like there was a deprecation. How was that decided? Is there a difference between Python 3 and Python 2?

The old flask.ext was deprecated in issue #1135, which was created back in 2014. The actual deprecation notice was turned on in 2016. The reasoning behind the deprecation is:
Some introductory information for new contributors:
Flask used to have flaskext as a namespace for extensions, so they were importable as flaskext.foo. This didn't work well, so the new form flask_foo was introduced. flask.ext.foo is a compatibility layer that will try to import both variants. See http://flask.pocoo.org/docs/0.10/extensions/
flask.ext.foo is hard to maintain, and since now all extensions have switched to the new package naming scheme, it is no longer worth it. We want to deprecate it for 1.0, so we need some sort of tool which can help users to rewrite all their old imports in their apps.
One could write a Python script similar to this beast. This will get the job done, but as its docstring says, it's a terrible hack.
lib2to3 proved useful for writing larger migration tools, but it's nontrivial to use it.
https://github.com/mitsuhiko/python-modernize/ is one based on it, and it seems to me that's the easiest project one could rip off from.
I wasn't able to find complete tutorials that are useful for this. Most seem to be focused on porting to Python 3, which would imply running the default 2to3 fixers on the user's codebase (which we definitely don't want).
One will have to read the source code of 2to3 and lib2to3 to understand, I think. This is doable by entering libraryname hg.python.org into Google, where the libraryname is either 2to3 or lib2to3.
The current state of source code manipulation in Python sucks, and I'd like to see a library which wraps lib2to3 and provides a more concise API.
The old .ext was a compatibility layer to support the old flaskext module while waiting for flask_ to standardize.
This separates the flask. namespace from each module's namespace, as the module now lives completely in its own module (flask_module) instead of being loaded into a general namespace for all extensions in Flask. It's also clearer that the module is not bundled as a part of Flask.
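For a feel of what the rename means in practice, here is a deliberately naive, regex-based sketch of rewriting old imports. It is not the script or tool mentioned in the issue; the function name rewrite_flask_ext_imports is hypothetical, and it only handles the two "from ... import" forms shown in the comments. It ignores plain "import flask.ext.foo" statements, the older flaskext namespace, and imports inside strings or comments.

```python
import re

# from flask.ext.foo import Bar  ->  from flask_foo import Bar
_FROM_SUBMODULE = re.compile(r"^([ \t]*)from[ \t]+flask\.ext\.(\w+)", re.MULTILINE)

# from flask.ext import foo  ->  import flask_foo as foo
_FROM_EXT = re.compile(
    r"^([ \t]*)from[ \t]+flask\.ext[ \t]+import[ \t]+(\w+)[ \t]*$", re.MULTILINE
)


def rewrite_flask_ext_imports(source):
    """Return source with the two simple flask.ext import forms rewritten."""
    source = _FROM_SUBMODULE.sub(r"\1from flask_\2", source)
    source = _FROM_EXT.sub(r"\1import flask_\2 as \2", source)
    return source
```

A regex cannot understand Python syntax, which is why the issue points at lib2to3 for anything beyond a quick one-off cleanup.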

Use code from repository that has no setup.py

Is there a best practice for how to use code from a python github repository that is missing its setup.py file?
Since I cannot reference it through my own requirements.txt, may I just copy the code into one of my own files?
Specifically, I want to use the function tile_raster_images(...) from https://github.com/lisa-lab/DeepLearningTutorials/blob/master/code/utils.py

You could just paste the code into your own codebase, yes. As long as it was written for the same Python version and you're aware of the needed dependencies (both on third-party libraries and within the same project), you should be golden.
However, the project does not have a license. Quoting from the Github Help:
Generally speaking, the absence of a license means that the default copyright laws apply. This means that you retain all rights to your source code and that nobody else may reproduce, distribute, or create derivative works from your work.
Since you don't know for certain which copyright laws apply to the project, you should be cautious about using the code, as you might be committing a copyright violation. Especially if you're planning to use the code in a commercial product, you're on thin ice.
You should contact the owner of the code and ask them to add a license, so that future users can use the code without worries.

I don't see the downside of just copying the code.
The code you want to use is from a tutorial. I think it actually aims to be directly used instead of serving as a python package.
So just copy it somewhere in your project. Be aware of the license of the source code if you intend to republish it.

What to include in a PyPI package?

I'm packaging my new Python library for PyPI. The repository contains:
Sphinx documentation sources
Supplemental JavaScript library
Examples
Is it a good idea to include such things into a python egg?
What's the convention?
You can see the guts of the library at https://github.com/peterhudec/authomatic

You should not put everything into the Python egg; in any case, it's up to python setup.py bdist_egg to choose what to include or not. But in the source package you upload to PyPI, yes, include everything that can't be generated by setup.py. You can upload the documentation separately, so it gets published as well.
But generally, what needs to be included in the egg is whatever is necessary for the egg to run as-is. Everything else can be included, but can also be distributed in other ways; that's up to you.

There are packages on PyPI that are entirely (or almost entirely) written in bash (virtualenvwrapper.sh is one).
If there is a supplemental JavaScript library that you can package, that wouldn't be a bad thing. This prevents the case where the user might not have npm installed, so it makes your library easier to use and your users happier.
Documentation doesn't NEED to be included but if you want to, then by all means do it. Libraries both include and don't include documentation. github3.py now includes it while requests does not. It's up to your preference.
I personally always have the examples in the documentation, so they're included in my packages that include the documentation. I can't think of any packages off the top of my head that include a separate package of examples, but if you feel it's necessary, then go ahead. I might, however, make that a sub-directory of the library itself. It will make the name-spacing better when it is installed.
But basically, there are no set conventions beyond having the code to perform the task you say the package will perform.
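Whatever you decide to ship, it is worth checking what actually ended up in the archive before uploading it. Here is a small sketch that lists the top-level entries of a source distribution. It assumes a .tar.gz sdist whose contents sit under a single "name-version" root folder; the function name sdist_top_level_entries is made up for this example.

```python
import tarfile


def sdist_top_level_entries(sdist_path):
    """Return the sorted names found directly under the sdist's root folder."""
    entries = set()
    with tarfile.open(sdist_path, "r:gz") as archive:
        for name in archive.getnames():
            parts = name.split("/")
            # parts[0] is the archive's root folder, e.g. "mypkg-1.0"
            if len(parts) > 1 and parts[1]:
                entries.add(parts[1])
    return sorted(entries)
```

If docs, examples, or the JavaScript sources are missing from the returned list, they were not picked up when the sdist was built.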

What I can tell you about PyQt4:
it includes docs, examples, plugins, ...
I do not know about your JavaScript library but I think it is no problem to include that as well.
This is an example - I do not know the convention. I would put in everything that could be important to the user of your library.

Packaging Jython with Modules for Easy Installation

I've just built a Jython project that uses both some Python module imports and some Java jars. On my own computer, since I just wanted the thing done, I've gotten things to work in a very hacky way by hardcoding sys.path and installing every module and jar I wanted separately. This is definitely not something I want to keep for a release version. I've read about being able to package everything up into one standalone Jython jar, and that sounds pretty good to me. Is there any reason I shouldn't do this? If not, is there a guide on the best way to do this someone can point me to? I'm running the whole thing through PIG, so having a callable Jython jar would be ideal.
I know some similar questions to this already exist on SO, but the answers to those seem pretty old, and the documentation given (for MavenJython, for example) is pretty poor. I've already looked at MavenJython and jip, but I can't really decide between the two, and I'm not really finding sufficient information for either. An ideal answer to this question would say what I should use, explain why I should use it, and give a brief demo of how one would use it. A link to any of those would also be awesome.
Thanks!

Please elaborate on how the MavenJython documentation can be improved. Things have not changed since 2011 (you probably have seen this answer), which is not that long ago. There is the website, but as a programmer you probably just want to read the source code of the jython-compile-maven-plugin-test project, which is only 200 lines of code. It is a good idea to use this package as a starting point for your own project.
Distribution philosophies
The way Java software distribution (and Windows software distribution) usually works is that you package everything you need. So yes, a standalone Jython jar would be appropriate. The drawback is that every piece of software may use a different Jython version; the benefit is that this might be exactly what you want (updates may break things). This is the MavenJython approach too: packaging everything in the correct versions.
Python and Linux software distribution just installs packages, checking compatibility at install time. This is the jip approach, which assumes you already have Jython, and whoever installs the software will resolve compatibility issues by installing the correct versions.
Differences
I cannot say much about jip though, as I have not used it. From what I see in the demos, jip is meant to give Python packages access to Maven Java libraries. It also seems to be capable of producing Maven packages from Jython code. So you can probably achieve your goal using either MavenJython or jip. Just try.
The deliverables built using MavenJython distribute Jython, while jip does not.
If your audience is programmers who already use Jython but are unfamiliar with Maven, and you want to tell them how to fetch your Jython library package, jip might be the way to go.
If you want to write Jython libraries for programmers and distribute them, you can use either MavenJython or jip.
If you have a software package that is going to end up as a deliverable to customers, which happens to also use Jython code and Jython packages, perhaps also providing in-program scripting to the user, go with MavenJython. It allows you to create a standalone executable.
Pig use case
For running Jython through Pig it is enough to install Jython and put the Jython sources in your path -- see the embed python section of the Pig manual. jip can be appropriate for installing Jython packages locally, but is not necessary if you only want to run your code. If, however, you want to distribute software which uses Pig, and install Pig, Jython and your code on a client's computer, MavenJython can do that for you.

How can I ensure good test-coverage of my big Python project

I have a very large python project with a very large test suite. Recently we have decided to quantify the quality of our test-coverage.
I'm looking for a tool to automate the test coverage report generation. Ideally I'd like to have attractive, easy to read reports but I'd settle for less attractive reports if I could make it work quickly.
I've tried Nose, which is not good enough: It is incompatible with distribute / setuptools' namespace package feature. Unfortunately nose coverage will never work for us since we make abundant use of this feature. That's a real shame because Nose seems to work really nicely in Hudson (mostly)
As an alternative, I've heard that there's a way to do a Python coverage analysis in Eclipse, but I've not quite locked-down the perfect technique.
Any suggestions welcome!
FYI we use Python 2.4.4 on Windows XP 32bit

Have you tried using coverage.py? It underlies "nose coverage", but can be run perfectly well outside of nose if you need to.
If you run your tests with (hypothetically) python run_my_tests.py, then you can measure coverage with coverage run run_my_tests.py, then get HTML reports with coverage html.
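If you want to drive those two steps from a script (for example from a CI job), here is a rough sketch. It assumes coverage.py is installed for the interpreter that runs the script and that your Python version supports invoking it as python -m coverage. The function names coverage_commands and measure_coverage are invented for this example, and run_my_tests.py is the same hypothetical script as above.

```python
import subprocess
import sys


def coverage_commands(test_script, html_dir="htmlcov"):
    """Build the two coverage.py invocations: measure first, then render HTML."""
    base = [sys.executable, "-m", "coverage"]
    return [
        base + ["run", test_script],      # same as: coverage run run_my_tests.py
        base + ["html", "-d", html_dir],  # same as: coverage html -d htmlcov
    ]


def measure_coverage(test_script, html_dir="htmlcov"):
    """Run each command in turn; raises CalledProcessError on the first failure."""
    for command in coverage_commands(test_script, html_dir):
        subprocess.check_call(command)
```

Note that with check_call the HTML report is skipped when the test script exits with a non-zero status; use subprocess.call instead if you want a report even for failing runs.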
From your description, I'm not sure what problem you had with nose, especially whether it was a nose issue, or a coverage.py issue. Provide some more details, and I'm sure we can work through them.

Ned has already mentioned his excellent coverage.py module.
If the problem you're having is something nose specific, you might want to consider using another test runner. I've used py.test along with the pytest_coverage plugin that lets you generate coverage statistics. It also has a pytest_nose plugin to help you migrate.
However, I don't understand exactly what the problem you're facing is. Can you elaborate a little on the "distribute / setuptools' namespace package feature" you mentioned? I'm curious to know what the problem is.
