Does PyPI have simple urls for package downloads? - python

Does PyPI support simple download URLs? The reason I ask is that I have a PC with curl installed, but not pip. With pip I would be able to install the package with:
pip install ppci
But since pip is not available, what I want to do is download this package with curl and untar it.
Now I can do this:
curl https://pypi.python.org/packages/4c/e8/fd7241885330ace50d2f7598a2652d4e80c1d922faece7bba88529cf6cfe/ppci-0.5.4.tar.gz
tar xfz ppci-0.5.4.tar.gz
But what I want is a cleaner url, like this:
curl https://pypi.python.org/packages/ppci/0.5.4/ppci-0.5.4.tar.gz
So, that in future I can easily upgrade the version to this:
curl https://pypi.python.org/packages/ppci/0.5.5/ppci-0.5.5.tar.gz
Does this URL, or something like it, exist, so that I can easily bump the version number and get the newer version without the long hash code in it?

The right URL is:
https://pypi.io/packages/source/p/ppci/ppci-0.5.4.tar.gz
Note that this URL will redirect, but curl can handle it with the -L option.
The URL format is:
https://pypi.io/packages/source/{ package_name_first_letter }/{ package_name }/{ package_name }-{ package_version }.tar.gz
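As a sketch, that template can be turned into a small helper. The pypi.io host and path scheme are taken straight from the answer above; they are a convenience, not a guaranteed-stable contract:

```python
def sdist_url(name, version):
    """Build the predictable source-tarball URL from the template above."""
    return (f"https://pypi.io/packages/source/"
            f"{name[0]}/{name}/{name}-{version}.tar.gz")

# e.g. sdist_url("ppci", "0.5.4")
# -> "https://pypi.io/packages/source/p/ppci/ppci-0.5.4.tar.gz"
```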

These all appear to work as of 2019-10-30, and redirect one to the next:
https://pypi.io/packages/source/p/pip/pip-19.3.1.tar.gz
https://pypi.org/packages/source/p/pip/pip-19.3.1.tar.gz
https://files.pythonhosted.org/packages/source/p/pip/pip-19.3.1.tar.gz
https://files.pythonhosted.org/packages/ce/ea/9b445176a65ae4ba22dce1d93e4b5fe182f953df71a145f557cffaffc1bf/pip-19.3.1.tar.gz
This answer describes a way to fetch wheels using a similar index built by Debian: https://stackoverflow.com/a/53176862/881629
PyPI documentation actively discourages using the conveyor service as above, as it's mostly for legacy support, and we "should generally query the index for package URLs rather than guessing". https://warehouse.readthedocs.io/api-reference/integration-guide.html#querying-pypi-for-package-urls
(Thanks to Wolfgang Kuehn for the pointer to Warehouse documentation, but note that to get a correct wheel we need to select the appropriate entry for the target platform from the urls field in the API response. We can't grab a static element from the list, as order appears to vary between packages.)

The URL for wheels, using invoke as an example, is:
https://files.pythonhosted.org/packages/py3/i/invoke/invoke-1.6.0-py3-none-any.whl
or in general
file_name := {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
first_letter := first letter of distribution
https://files.pythonhosted.org/packages/{python tag}/{first_letter}/{distribution}/{file_name}
I don't know if this is an official contract of PyPI Warehouse.
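If it helps, the pattern above can be expressed as a small helper. This is only a sketch of the observed layout; as just noted, it is not an official contract of PyPI Warehouse:

```python
def wheel_url(distribution, version, python_tag,
              abi_tag="none", platform_tag="any", build_tag=None):
    """Build the guessed wheel URL from the template above."""
    parts = [distribution, version]
    if build_tag:
        parts.append(build_tag)  # build tag slots in after the version
    parts += [python_tag, abi_tag, platform_tag]
    file_name = "-".join(parts) + ".whl"
    return (f"https://files.pythonhosted.org/packages/"
            f"{python_tag}/{distribution[0]}/{distribution}/{file_name}")

# e.g. wheel_url("invoke", "1.6.0", "py3")
# -> "https://files.pythonhosted.org/packages/py3/i/invoke/invoke-1.6.0-py3-none-any.whl"
```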
You can always query its JSON API in a RESTful manner, like so:
https://pypi.org/pypi/invoke/1.6.0/json
The download URL is then at document path /urls[1]/url (though, as noted above, the order of entries is not guaranteed).
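Because the order of the urls list varies between packages, it is safer to pick the entry by its packagetype field rather than by a fixed index. A minimal sketch against a JSON API response document (pick_download_url is a hypothetical helper name):

```python
import json
from urllib.request import urlopen

def pick_download_url(release, packagetype="sdist"):
    """Return the URL of the first file of the given packagetype
    ("sdist" or "bdist_wheel") from a PyPI JSON API release document."""
    for entry in release["urls"]:
        if entry["packagetype"] == packagetype:
            return entry["url"]
    raise LookupError(f"no {packagetype} file in this release")

# Usage against the live API (requires network access):
# with urlopen("https://pypi.org/pypi/invoke/1.6.0/json") as resp:
#     print(pick_download_url(json.load(resp)))
```

For wheels you would still need to filter further by the platform/ABI tags in the filename; this only selects by file type.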

Related

Google image download with python cannot download images

I'm using the google_images_download library to download the top 20 images for a keyword. It worked perfectly when I used it in recent days. The code is as follows.
from google_images_download import google_images_download
response = google_images_download.googleimagesdownload()
arguments = {"keywords":keyword,"limit":10,"print_urls":True}
paths = response.download(arguments)
Now it gives following error.
Evaluating...
Starting Download...
Unfortunately all 10 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!
Errors: 0
How can I solve this error?
There have been some changes on Google's end (in how they respond to the request) which result in this issue. Joeclinton1 on GitHub has made some modifications to the original repo which provide a temporary fix.
You can find the updated repo here: https://github.com/Joeclinton1/google-images-download.git . The solution is in patch-1 branch if I'm not mistaken.
First uninstall the current version of google_images_download.
Then manually install Joeclinton1's repo by:
git clone https://github.com/Joeclinton1/google-images-download.git
cd google-images-download && sudo python setup.py install #no need for 'sudo' on windows Anaconda environment
or to install it with pip
pip install git+https://github.com/Joeclinton1/google-images-download.git
This should solve the problem. Note that currently this repo only supports up to 100 images.
I faced the same issue with google-image-download, which used to work perfectly earlier!
I have an alternative that I would like to suggest, which should solve the problem.
Solution: Instead of using google-image-download for Python, use bing-image-downloader, which downloads from the Bing search engine.
Steps:
Step 1:
Install the library by using: pip install bing-image-downloader
Step 2:
from bing_image_downloader import downloader
downloader.download(query_string, limit=100, output_dir='dataset',
                    adult_filter_off=True, force_replace=False, timeout=60)
That's it! All you would need to do is to add your image topic to the query_string.
Note:
Parameters that you can further tweak:
query_string : String to be searched.
limit : (optional, default is 100) Number of images to download.
output_dir : (optional, default is 'dataset') Name of output dir.
adult_filter_off : (optional, default is True) Enable or disable adult filtering.
force_replace : (optional, default is False) Delete folder if present and start a fresh download.
timeout : (optional, default is 60) timeout for connection in seconds.
Further Reference: https://pypi.org/project/bing-image-downloader/
If you want to download fewer than 100 images per query string, google-images-download will work better than bing-images-downloader. It handles errors better and Google Images gives noticeably better results than the Bing equivalent.
However, if you're trying to download more than 100 images, google-images-downloader will give you a lot of headaches. As mentioned in this answer, Google changed their end, and because of this the repo is having a lot of failures (more info on the situation status here).
So, if you want to download thousands of images, use bing-image-downloader:
Install package from pip
pip install bing-image-downloader
Run query.
NOTE: The documentation seems to be incorrect, as it returns a "No module found" error when importing the package as from bing_image_downloader import downloader (as mentioned in this answer). Import it and use it like this:
from bing_image_downloader.downloader import download
query_string = 'muscle cars'
download(query_string, limit=1000, output_dir='dataset', adult_filter_off=True, force_replace=False, timeout=60, verbose=True)
Another easy way to download any number of images:
pip install simple_image_download

from simple_image_download import simple_image_download as simp
response = simp.simple_image_download
response().download(keywords, limit)

where keywords is a string with the subject you want to download, and limit is the number of images you want to download.

how to get the download url from dnf or rpmdb in python

I need to get the download URL of an RPM package on Fedora, using Python.
For example with dnf I just type:
# dnf download --url xterm
rsync://fedora.tu-chemnitz.de/ftp/pub/linux/fedora/linux/releases/27/Everything/x86_64/os/Packages/x/xterm-330-3.fc27.x86_64.rpm
I need the same thing but with python.
I have tried with "import dnf" and "import rpm" but without success.
With DNF, it would look something like this (untested because writing quickly):
import dnf

base = dnf.Base()
base.read_all_repos()
base.fill_sack()
for pkg in base.sack.query().filter(name='xterm'):
    print(pkg.remote_location())
You probably want to do a little more processing, like only using one of the locations if multiple exist, and maybe some error handling.
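That "little more processing" might look like the sketch below. The pick_url helper and its preference for https:// mirrors are my own assumptions for illustration, not part of the DNF API:

```python
def pick_url(locations):
    """Choose one download URL from the candidates collected via
    pkg.remote_location(), or raise if the package was not found."""
    urls = [u for u in locations if u]  # drop None/empty entries
    if not urls:
        raise LookupError("package not found in any enabled repo")
    # Prefer plain https:// mirrors over rsync:// ones when both exist
    https = [u for u in urls if u.startswith("https://")]
    return (https or urls)[0]
```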

How to ensure that README.rst is valid?

There are two version of my little tool:
https://pypi.python.org/pypi/tbzuploader/2017.11.0
https://pypi.python.org/pypi/tbzuploader/2017.12.0 Bug: The pypi page looks ugly.
In the last update a change in README.rst causes a warning:
user#host> rst2html.py README.rst > /tmp/foo.html
README.rst:18: (WARNING/2) Inline emphasis start-string without end-string.
README.rst:18: (WARNING/2) Inline emphasis start-string without end-string.
Now the pypi page looks ugly :-(
I use this recipe to do CI, bumpversion, upload to pypi: https://github.com/guettli/github-travis-bumpversion-pypi
How can I ensure that no broken README.rst gets released any more? In other words, I want to avoid the PyPI page looking ugly.
Dear detail lovers: please don't look into the current particular error in the README.rst. That is not the question :-)
Update
As of Sep 21, 2018, the Python Packaging Authority recommends an alternative command, twine check. To install and run twine:
pip install twine
twine check dist/*
Note that twine requires readme_renderer. You could still use readme_renderer, and you only need to install twine if you want its other features, which is a good idea anyway if you are releasing to PyPI.
From the official Python packaging docs, Uploading your Project to PyPI:
Tip: The reStructuredText parser used on PyPI is not Sphinx! Furthermore, to ensure safety of all users, certain kinds of URLs and directives are forbidden or stripped out (e.g., the .. raw:: directive). Before trying to upload your distribution, you should check to see if your brief / long descriptions provided in setup.py are valid. You can do this by following the instructions for the pypa/readme_renderer tool.
And from that tool's README.rst:
To check your long description locally, simply install the readme_renderer library using:
$ pip install readme_renderer
$ python setup.py check -r -s
Preamble
I had a readme which would not render on PyPi, other than the first element on the page (an image). I ran the file against multiple validators, and tested it against other renders. It worked perfectly fine everywhere else! So, after a long, nasty fight with it, and numerous version bumps so I could test a PyPi revision, I tried reducing the file to a bare minimum, from which I'd build it back up. It turned out that the first line was always processed, and then nothing else was...
Solution
Discovering this clue regarding the first line, I then had an epiphany... All I had to do was change the line endings in the file! I was editing the file in Windows, with Windows line endings being tacked on implicitly. I changed that to Unix style and (poof!) PyPi fully rendered the doc!
Rant...
I've encountered such things in the past, but I took it for granted that PyPi would handle cross platform issues like this. I mean one of the key features of Python is being cross platform! Am I the first person working in Windows to encounter this?! I don't appreciate the hours of time this wasted.
You could check whether rstcheck catches the type of error in your readme. If it does, run it after pytest in your script section (and add it to your requirements, of course).

Artifactory PyPi repo layout with build promotion

Q1:
I have an Artifactory PyPI-enabled repo my-pypi-repo where I can publish my packages. When uploading via python setup.py sdist, I get a structure like this:
my-pypi-repo/
    my_package/
        x.y.z/
            my_package-x.y.z.tar.gz
The problem is this structure will not match any "allowed" repo layout in Artifactory, since [org] or [orgPath] are mandatory:
Pattern '[module]/[baseRev]/[module]-[baseRev].[ext]' must at-least
contain the tokens 'module', 'baseRev' and 'org' or 'orgPath'.
I managed to publish to a path by 'hacking' the package name to myorg/my_package, but then pip cannot find it, so it's pretty useless.
Q2:
Has anyone tried the "ci-repo" and "releases-repo" with promotion for Python using Artifactory?
What I would like to achieve:
CI repo:
my_package-1.2.3+build90.tar.gz (when this artifact gets promoted, the build metadata gets dropped)
Releases repo:
my_package-1.2.3.tar.gz
I can achieve this via repo layouts (providing I resolve Q1). The problem is how to deal with the "embedded" version inside my Python script, hardcoded in setup.py.
I'd rather not rebuild the package again, for best practices.
I am running into the same issue in regards to your first question/problem. When configuring my system to publish to artifactory using pip, it uses the format you described.
As you mentioned, the [org] or [orgPath] is mandatory and this basically breaks all the REST API functionality, like searching for latest version, etc. I'm currently using this as my Artifact Path Pattern:
[org]/[module]/[baseRev].([fileItegRev])/[module]-[baseRev].([fileItegRev]).[ext]
The problem is that pip doesn't understand the concept of [org] in this case. I'm temporarily using a python script to publish my packages to Artifactory to get around this. Hopefully this is something that can be addressed by the jFrog team.
The python script simply uses Artifactory's REST API to publish to my local pypi repository, tacking on a few properties so that some of the REST API functions work properly, like Artifact Latest Version Search Based on Properties.
I need to be able to use that call because we're using Chef in-house and we use that method to get the latest version. The pypi.version property that gets added when publishing via python setup.py sdist upload -r local doesn't work with the REST API so I have to manually add the version property. Painful to be honest since we can't add properties when using the upload option with setup.py. Ideally I'd like to be able to do everything using pip, but at the moment this isn't possible.
I'm using the requests package and the upload method in the Artifactory documentation here. Here is the function I'm using to publish adding a few properties (feel free to add more if you need):
import logging

import requests

logger = logging.getLogger(__name__)

def _publish_artifact(name, version, path, summary):
    base_url = 'http://server:8081/artifactory/{0}'.format(PYPI_REPOSITORY)
    properties = ';version={0};pypi.name={1};pypi.version={0};pypi.summary={2}'\
        .format(version, name, summary)
    url_path = '/Company/{0}/{1}/{0}-{1}.zip'.format(name, version)
    url = '{0}{1}{2}'.format(base_url, properties, url_path)
    dist_file = r'{0}\dist\{1}-{2}.zip'.format(path, name, version)
    files = {'upload_file': open(dist_file, 'rb')}
    s = requests.Session()
    s.auth = ('username', 'password')
    reply = s.put(url, files=files)
    logger.info('HTTP reply: {0}'.format(reply))
A1: Artifactory layouts aren't enforced; you can deploy any file under any path to any repo. Some layout-related features, like snapshot cleanup, won't work then, but I don't think you need them anyway.
A2: The best solution will be to code your promotion in a promotion user plugin. Renaming artifacts on the fly during their promotion to another repo is one of the most popular scenarios of this kind of plugin.

Using pip within a python script

I am writing a utility in Python that needs to check for (and if necessary, install or even upgrade) various other modules within a target project/virtualenv, based on user-supplied flags and/or input. I am currently trying to use pip directly/programmatically (because of its existing support for the various repo types I will need to access), but I am having difficulty finding examples or documentation on using it this way.
This seemed like the direction to go:
import pip
vcs = pip.vcs.VersionControl(url="http://path/to/repo/")
...but it gives no joy.
I apparently need help with some of the basics, like how to use pip to pull/export a copy of an SVN repo into a given local directory. Ultimately, I will also need to use it for Git and Mercurial checkouts as well as standard PyPI installs. Any links, docs or pointers would be much appreciated.
Pip uses a particular format for VCS URLs. The format is
vcsname+url#rev
The #rev part is optional; you can use it to reference a specific commit or tag.
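As a quick illustration of that format, here is a small parser (a hypothetical helper for clarity, not part of pip itself):

```python
def parse_vcs_url(req_url):
    """Split 'vcsname+url#rev' into (vcsname, url, rev);
    rev is None when no #rev fragment is present."""
    vcs_name, _, rest = req_url.partition("+")
    url, _, rev = rest.partition("#")
    return vcs_name, url, rev or None

# parse_vcs_url("git+git://url/repo#v1.0")
# -> ("git", "git://url/repo", "v1.0")
```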
To use pip to retrieve a repository from a generic VCS into a local directory, you can do this:
from pip.vcs import VcsSupport

req_url = 'git+git://url/repo'
dest_path = '/this/is/the/destination'

vcs = VcsSupport()
vc_type, url = req_url.split('+', 1)

backend = vcs.get_backend(vc_type)
if backend:
    vcs_backend = backend(req_url)
    vcs_backend.obtain(dest_path)
else:
    print('Not a repository')
Check https://pip.pypa.io/en/stable/reference/pip_install/#id8 to know which vcs are supported
