I'm wondering how to handle multiple major versions of a dependency library.
I have an open source library, Foo, at an early release stage. The library is a wrapper around another open source library, Bar. Bar has just launched a new major version. Foo currently only supports the previous version. As I'm guessing that a lot of people will be very slow to convert from the previous major version of Bar to the new major version, I'm reluctant to switch to the new version myself.
How is this best handled? As I see it, I have these options:
Switch to the new major version, potentially denying people on the old version.
Keep going with the old version, potentially denying people on the new version.
Have two different branches, updating both branches for all new features. Not sure how this works with PyPI. Wouldn't I have to release at different version numbers each time?
Separate the repository into two parts. Don't really want to do this.
The ideal solution for me would be to have the same code base, with some sort of C/C++ macro-like thing: if the Bar version is new, use new_bar_function, else use old_bar_function. When installing the library from PyPI, the major version of Bar that is already installed would dictate which code path is used. If no version is installed, install the newest.
Would much appreciate some pointers.
Normally the package version information is available after import as package.__version__. You could parse that information from Bar and decide based on it what to do (choose the appropriate function calls, halt the program, raise an error, ...).
You might also gain some insight from https://www.python.org/dev/peps/pep-0518/ for ways to control dependency installation.
It seems that if someone already has Bar installed, installing Foo only updates Bar if Foo explicitly requires the new version. See https://github.com/pypa/pip/pull/4500 and this answer
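For instance, a minimal sketch of that dispatch inside Foo, assuming Bar exposes bar.__version__ and reusing the hypothetical new_bar_function/old_bar_function names from the question:

    # foo/_compat.py -- illustrative only; bar.__version__ and the function
    # names are assumptions based on the question, not a real API.
    import bar

    # Only the major version matters here; a full parser such as
    # packaging.version is more robust if Bar ever uses odd version strings.
    _BAR_MAJOR = int(bar.__version__.split(".")[0])

    if _BAR_MAJOR >= 2:
        def do_something(data):
            # New Bar major version: call the new API.
            return bar.new_bar_function(data)
    else:
        def do_something(data):
            # Old Bar major version: fall back to the old API.
            return bar.old_bar_function(data)

Foo's public code then calls do_something without caring which Bar is installed.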
Have two different branches, updating both branches for all new features. Not sure how this works with PyPI. Wouldn't I have to release at different version numbers each time?
Yes, you could have a 1.x release (that supports the old version) and a 2.x release (that supports the new version) and release both simultaneously. This is a common pattern for packages that want to introduce a breaking change, but still want to continue maintaining the previous release as well.
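As a hedged sketch, each release line would simply pin the Bar major version it supports; the names and bounds below are illustrative, not taken from the question:

    # setup.py on the Foo 1.x branch (tracks the old Bar) -- illustrative only
    from setuptools import setup

    setup(
        name="Foo",
        version="1.3.0",
        install_requires=["Bar>=1,<2"],  # the 2.x branch would declare Bar>=2,<3
    )

Users who need to stay on the old Bar can then pin Foo<2 in their own requirements, while fresh installs pick up the 2.x line.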
Related
I've read in a few places that, generally, Python doesn't provide backward compatibility, which means that any newer version of Python may break code that worked fine on earlier versions. If so, how can I, as a developer, know which versions of Python can execute my code successfully? Is there any set of rules/guarantees regarding this? Or should I just tell my users: just run this with Python 3.8 (for example), no more, no less?
99% of the time, if it works on Python 3.x, it'll work on 3.y where y >= x. Enabling warnings when running your code on the older version should pop DeprecationWarnings when you use a feature that's deprecated (and therefore likely to change/be removed in later Python versions). Aside from that, you can read the What's New docs for each version between the known good version and the later versions, in particular the Deprecated and Removed sections of each.
Beyond that, the only solution is good unit and component tests (you are using those, right? 😉) that you rerun on newer releases to verify stuff still works & behavior doesn't change.
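For example, one way (an assumption about your test setup, not the only option) to make deprecated usage fail loudly while testing:

    # conftest.py (or any test bootstrap) -- escalate DeprecationWarnings to errors
    import warnings

    warnings.simplefilter("error", DeprecationWarning)

    # Command-line equivalent:
    #   python -W error::DeprecationWarning -m pytest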
According to PEP 387, section "Making Incompatible Changes", before incompatible changes are made, a deprecation warning should appear in at least two minor Python versions of the same major version, or one minor version in an older major version. After that it's fair game, in principle. This makes me cringe a bit with regard to safety: who knows whether people run airplanes on Python, and whether they always read the python-dev list. So if you have something that passes unit tests with 100% coverage and without deprecation warnings, your code should be safe for the next two minor releases.
You can avoid this issue and many others by containerizing your deployments.
tox is great for running unit tests against multiple Python versions; a minimal configuration is sketched below. That's useful for at least 2 major cases:
You want to ensure compatibility for a certain set of Python versions, say 3.7+, and to be told if you make any breaking changes.
You don’t really know what versions your code supports, but want to establish a baseline of supported versions for future work.
I don't use it for internal projects where I can control the environment my code will be running in. It's lovely for people publishing apps or libraries to PyPI, though.
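A minimal tox.ini along those lines might look like this (the interpreter list and pytest are just an example setup):

    # tox.ini -- run the same test suite against several interpreters
    [tox]
    envlist = py37, py38, py39, py310

    [testenv]
    deps = pytest
    commands = pytest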
While it is possible to simply use pip freeze to get the current environment, that is not suitable, as it would require an environment as bleeding edge as the one I am used to.
Moreover, some developer tooling is only available in recent versions of packages (think type annotations), but it is not needed by users.
My target users may want to use my package on slowly upgrading machines, and I want to get my requirements as low as possible.
For example, I cannot require better than Python 3.6 (and even then I think some users may be unable to use the package).
Similarly, I want to avoid requiring the latest NumPy or Matplotlib versions.
Is there a (semi-)automatic way of determining the oldest compatible version of each dependency?
Alternatively, I can manually try to build a conda environment with old packages, but I would have to try pretty randomly.
Unfortunately, I inherited a medium-sized codebase (~10KLoC) with no automated tests yet (I plan on adding some, but it takes time, and it sadly cannot be my priority).
The requirements were not properly defined either, so I don't know what it was run with two years ago.
Because semantic versioning is not always honored (and because it may be difficult, from a developer's standpoint, to determine exactly what counts as a minor or major change for each possible user), and because only a human can parse release notes to understand what has changed, there is no simple solution.
My technical approach would be to create a virtual environment with a known working combination of Python and library versions. From there, downgrade one version at a time, one library at a time, verifying that everything still works (which may be difficult if the check is manual and/or slow).
My social solution would be to timebox the technical approach to take no more than a few hours. Then settle for what you have reached. Indicate in the README that lib requirements may be overblown and that help is welcome.
Without fast automated tests in which you are confident, there is no way to automate the exploration of the N-dimensional space (each library is a dimension) to find the minimum working versions.
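That said, the per-library downgrade loop itself can be scripted once you have even a crude smoke test; here is a rough sketch, where the package name, candidate versions, and smoke-test command are all placeholders:

    # try_versions.py -- walk one dependency downward until the smoke test breaks.
    # Run inside a throwaway virtual environment; everything here is illustrative.
    import subprocess
    import sys

    PACKAGE = "numpy"                                        # placeholder dependency
    CANDIDATES = ["1.21.0", "1.19.0", "1.17.0", "1.15.0"]    # newest to oldest
    SMOKE_TEST = [sys.executable, "-c", "import mypackage"]  # replace with a real check

    oldest_working = None
    for version in CANDIDATES:
        install = subprocess.run(
            [sys.executable, "-m", "pip", "install", f"{PACKAGE}=={version}"],
            capture_output=True,
        )
        if install.returncode != 0:
            break  # not installable here (e.g. no wheel for this Python)
        if subprocess.run(SMOKE_TEST).returncode != 0:
            break  # smoke test failed; the previously recorded version is the floor
        oldest_working = version

    print(f"Oldest {PACKAGE} version that still passed: {oldest_working}")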
I have a library on PyPI called foobar and it's currently at version 1.2.0 (using semantic versioning).
The next version doesn't preserve API compatibility with versions 1.x, so I'll release it as 2.0.0.
What is the best practice to publish this new version to PyPI, so that clients which are using the 1.x versions don't accidentally upgrade to 2.0.0 and break their code? (I'm assuming that there are people who didn't enforce a version dependency like >=1.0.0, <2.0.0 in their code).
Would it be better to create a completely new package called foobar2 on PyPI and push the new version there? How do other projects handle this?
I'd say API changes normally fall into two categories:
1. A new API B replaces an old API A, but A can be implemented entirely in terms of B. It is therefore feasible to maintain the old API alongside the new one.
This could be as simple as an API being moved to tidy up or rationalize your module, or something more complex such as converting args to kwargs, or whatever.
2. The new API replaces the old API but cannot be used to implement it, for whatever technical reason.
Here are your options for those categories, IMO. Which one you take will depend a lot on what changes you are making, how much you care about your users, and how much you are in contact with them and can talk to them (i.e. you can get away with a few unannounced breakages if it's just a few people on your team whom you can subsequently help fix their issues).
1. Provide both old and new in your new version.
Implement the old API using the new one but mark it as deprecated using the warnings module. You give people notice to convert and you can remove the old API at some point in the future.
This is best practice for API changes of the first type. It keeps everyone on the same stream and allows you to tidy up at some point in the future.
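A minimal sketch of that pattern, with placeholder function names:

    import warnings

    def new_api(data, *, strict=True):
        # Illustrative body; the real implementation lives here.
        items = list(data)
        if strict and any(item is None for item in items):
            raise ValueError("None values are not allowed in strict mode")
        return items

    def old_api(data):
        # Old entry point, kept working by delegating to new_api.
        warnings.warn(
            "old_api() is deprecated and will be removed in a future release; "
            "use new_api() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_api(data, strict=False)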
2. Warn, then introduce new API.
If you are in situation 2, or in situation 1 but can't justify the resources to implement the old API using the new one, then you can easily release a version 1.2.1 that uses the warnings module to warn users that you are about to release a new version that will break their code, and that they should quickly peg the version in their requirements.txt.
Say when you're going to release version 2.0, and then you've warned them.
But this is only really fair if it's not too much effort to migrate from 1.2.0 to 2.0 for your users.
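For example, the kind of pin you'd be asking them to add (version numbers are illustrative):

    # requirements.txt -- stay on the 1.x line until ready to migrate
    foobar>=1.2,<2.0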
3. Add a completely new package.
If there are profound differences, and it would be a right pain for your users to update their code to the point that they would essentially need to rewrite it, then you shouldn't be afraid of just using a completely new package. We all make mistakes, and everyone in the Python community is aware of that, given the differences between Python 2 and Python 3 :-). unittest2 is also one such example from a while back.
What will people expect?
Personally, if I had automatic upgrades occurring on a system I cared about and if I didn't peg the versions to upgrade only maintenance releases, I would consider it my fault if someone released a major upgrade that my system automatically took but then stopped working because of it.
That gives you the moral high ground, IMO, but that isn't much consolation for your lazy users (or worse, customers) who didn't check and got burnt.
What other people do
paramiko kept API backward compatibility from 1.x to 2.0
beautifulsoup changed its name on PyPI (from BeautifulSoup to bs4)
django tends to deprecate features and remove them in later feature releases, and in general I would never upgrade a django install I cared about from 1.X to 1.(X+1) without testing it first. (I consider this to be the best practice, like a lot of things the django folk do.)
So the summary is: there is a mix, and it really is up to you. However the only completely safe ways to avoid user self-inflicted issues is to keep back-compatibility entirely or create a new package, as BeautifulSoup did.
I am somewhat confused about the right way of declaring requirements for Python packages.
New builds that are not officially released yet have pre-release names like 0.2.3.dev20160513165655.
pip is smart enough to install pre-releases when we add the --pre option, and we do use it when building the develop branch. The master branch does not use it.
I discovered that if I put foobar>=0.2.3 in a requirements file, the development version will not be picked up even if I specify the --pre parameter.
The pip documentation is not much help here because it doesn't say anything about pre-releases.
I used the approach of putting foobar>0.2.2, which, in conjunction with --pre, would install the pre-release.
Still, even this is a bit flawed, because if we release a hotfix like 0.2.2.1, it would be picked up instead.
So, what's the best approach to deal with this?
Side note: it would be highly desirable not to have to patch the requirements file when we make a release (a pull request from develop to master). Please remember that the develop branch always uses --pre and master doesn't.
For anyone else coming across this question, the answer is in the same documentation:
If a Requirement specifier includes a pre-release or development version (e.g. >=0.0.dev0) then pip will allow pre-release and development versions for that requirement. This does not include the != flag.
Hence, specifying >=0.2.3.dev0 or similar should pick the "newest" prerelease.
Note that if you already have 0.2.3 released, it will always sort "newer" than prereleases such as 0.2.3.dev20160513165655. PEP 440 says the following:
The developmental release segment consists of the string .dev, followed by a non-negative integer value. Developmental releases are ordered by their numerical component, immediately before the corresponding release (and before any pre-releases with the same release segment), and following any previous release (including any post-releases).
It also says:
... publishing developmental releases of pre-releases to general purpose public index servers is strongly discouraged, as it makes the version identifier difficult to parse for human readers. If such a release needs to be published, it is substantially clearer to instead create a new pre-release by incrementing the numeric component.
Developmental releases of post-releases are also strongly discouraged ...
So ideally you would not use a datestamp, but something like dev1, dev2, dev3. I think the PEP is actually saying you should use 0.2.3.dev1, 0.2.4.dev1, 0.2.5.dev1, but either is equally readable. It really depends on how many builds you are producing.
In your case, if 0.2.3 is already released, all the subsequent development releases need to be 0.2.4.dev20160513165655 so that pip will see it as newer.
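A small sketch tying those pieces together (version numbers are illustrative):

    # requirements.txt
    foobar>=0.2.4.dev0

    # Per the pip documentation quoted above, a specifier containing a dev
    # segment already allows pre-releases for that requirement, so this picks
    # up 0.2.4.devNNN builds:
    #   pip install -r requirements.txt
    # Once 0.2.4 final is published it sorts newer than any 0.2.4.devNNN build
    # (PEP 440), so the same line then resolves to the stable release.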
I have created a medium-sized project in Python 2.7.3 containing around 100 modules. I wish to find out which previous versions of Python (e.g. 2.6.x, 2.7.x) my code is compatible with, before releasing my project into the public domain. What is the easiest way to find this out?
Solutions I know of:
Install multiple versions of Python and test in every version. But I don't have test cases defined yet, so I'd need to define those first.
Read and compare the changelogs of the various Python versions I wish to check compatibility with, and work it out from those.
Kindly provide better solutions.
I don't really know of a way to get around doing this without some test cases. Even if your code could run on an older version of Python, there is no guarantee that it works correctly without a suite of test cases that sufficiently exercise your code.
No, what you named is pretty much how it's done, though the What's New pages and the documentation proper may be more useful than the full changelog. Compatibility to such a huge, moving target is infeasible to automate even partially. It's just not as much work as it sounds like, because:
Some people do have test suites ;-)
You don't (usually) need to consider bugfix releases (such as 2.7.x for various x). It's possible that your code requires a bug fix, but generally the .0 releases are quite reliable and code compatible with x.y.0 can run on any x.y.z version.
Thanks to the backwards compatibility policy, it is enough to establish a minimum supported version, all later releases (of the same major version) will stay compatible. This doesn't help in your case as 2.7 is the last 2.x release ever, but if you target, say, 2.5 then you usually don't have to check for 2.6 or 2.7 compatibility.
If you keep your eyes open while coding, and have a bit of experience as well as a good memory, you'll know you used some functionality that was introduced in a recent version. Even if you don't know what version specifically, you can look it up quickly in the documentation.
Some people embark with the intent to support a specific version, and always keep that in mind when developing. Even if it happens to work on other versions, they'd consider it unsupported and don't claim compatibility.
So, you could either limit yourself to 2.7 (it's been out for three years) and state that minimum explicitly (see the sketch after this list), or perform tests on older releases. If you just want to determine whether it's compatible, not which incompatibilities there are and how they can be fixed, you can:
Search the What's New pages for new features, most importantly new syntax, which you used.
Check the version constraints of third party libraries you used.
Search the documentation of standard library modules you use for newly added functionality.
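If you do settle on a minimum version, here is a small sketch of making that explicit (the package name and bounds are placeholders; python_requires is only enforced by reasonably recent pip/setuptools):

    # setup.py -- declare the minimum interpreter you actually test against
    from setuptools import setup

    setup(
        name="mypackage",
        version="1.0.0",
        python_requires=">=2.7",  # pip refuses to install on older interpreters
    )

    # mypackage/__init__.py -- a runtime check for tools that ignore the metadata
    import sys
    if sys.version_info < (2, 7):
        raise RuntimeError("mypackage requires Python 2.7 or newer")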
It's a lot easier with some test cases, but manual testing can give you a reasonable idea.
Take the furthest-back version that you would hope to support (I would suggest 2.5.x, but go further back if you must). Manually test with that version, keeping notes of what you did and especially where it fails, if anywhere. If it does fail, either address the issue or do a binary search to see at which version the failure point(s) disappear. This could work even better if you start from a version that you are quite sure will fail, maybe 2.0.
1) If you're going to maintain compatibility with previous versions, testing is the way to go. Even if your code happens to be compatible now, it can stop being so at any moment in the future if you don't pay attention.
2) If backwards compatibility is not an objective but just a "nice side-feature for those lucky enough", an easy way for OSS is to let users try it out, noting that "it was tested in <version> but may work in previous ones as well". If there's anyone in your user base interested in running your code in an earlier version (and maintain compatibility with it), they'll probably give you feedback. If there isn't, why bother?