Why is Twine 1.9.1 still uploading to legacy PyPI?

I want to upload packages to pypi.org as mentioned in the Migrating to PyPI.org documentation, but Twine uploads to https://upload.pypi.org/legacy/.
It's available on pypi.python.org/pypi/mypolr, but is not found on pypi.org.
I've tried reading several other questions, tutorials, and guides.
My pip.ini file (I'm on Windows 10) looks like this:
[distutils]
index-servers =
    pypi
[pypi]
I don't have my username or password stored, so the [pypi] section is empty (as mentioned in migration docs).
I've put the .ini file in my user folder, and confirmed (per this answer) that it's actually the one being used (via the environment variable PIP_CONFIG_FILE).
Afraid that I had gotten something wrong, I also tried without a pip.ini file so that Twine would fall back on its defaults.
I'm using Python 3.6.3 (from Anaconda), and my tools' versions are:
Twine 1.9.1 (the migration docs say it should be 1.8+)
setuptools 38.2.3 (the migration docs say it should be 27+)
Whether or not it's relevant, here is some more info:
Link to my setup.py
setup is imported from setuptools and not distutils.core
README.rst is used as the long description, but on the PyPI page only the first 8 asterisks of the header are shown. (Compare this with this)
The package I upload is version 0.2.1 (at the time of posting this)
setuptools_scm is used to fetch versions from git tags
build is made with python setup.py sdist bdist_wheel
Please let me know if there is any other information that could be useful to figure this out.

You appear to be doing everything correctly. Twine is not uploading via legacy PyPI (https://pypi.python.org). It is uploading to the new PyPI (https://pypi.org, a.k.a. "Warehouse") via the original (and so far only) PyPI API, and this API just happens to be named "legacy".
Also, your package is present on Warehouse at https://pypi.org/project/mypolr/; Warehouse search is apparently not production-ready.

The docs for Warehouse explain this confusing nomenclature. Quotes below are from the front page and from the page about the Legacy API:
Warehouse is a web application that implements the canonical Python package index (repository); its production deployment is PyPI. It replaces an older code base that powered pypi.python.org.
Legacy API
The “Legacy API” provides feature parity with pypi-legacy, hence the term “legacy”.
...
Upload API
The API endpoint served at upload.pypi.org/legacy/ is Warehouse’s emulation of the legacy PyPI upload API. This is the endpoint that tools such as twine and distutils use to upload distributions to PyPI.
In other words, as I understand it:
PyPI was once a web application hosted at pypi.python.org. That old application, which no longer runs, is now referred to by the name pypi-legacy.
PyPI is now a web application hosted at pypi.org. This new application is named Warehouse. The old pypi.python.org is now just a redirect to pypi.org.
In addition to some new endpoints, Warehouse still exposes a couple of API endpoints that pypi-legacy used to have. Because these endpoints were copied across from pypi-legacy, they are together known as the "Legacy API".
In addition to that, the upload endpoint within Warehouse's Legacy API is served from the URL path /legacy, a naming choice which again reflects the fact that it is a (partial) reimplementation of the endpoint used for uploads in pypi-legacy.
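To make the endpoint concrete, here is a minimal sketch of an explicit upload command (assuming a Twine recent enough to have the --repository-url flag); on these versions this URL is already the default, so passing the flag is normally unnecessary:

# uploads go to Warehouse, via the API whose path just happens to be /legacy/
twine upload --repository-url https://upload.pypi.org/legacy/ dist/*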
This all seems more confusing than it needs to be, but it is what it is.

In case anyone else comes here from Google, mystified about why their uploads are failing: don't forget to check https://status.python.org/ to make sure there isn't an outage. Sometimes you just gotta wait :p
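If you'd rather script that check, status.python.org appears to be a standard Statuspage site, so a sketch like the following can do it (the /api/v2/status.json path is the Statuspage convention, an assumption here rather than anything documented by PyPI):

# sketch: assumes status.python.org exposes the conventional Statuspage v2 API
import requests

resp = requests.get("https://status.python.org/api/v2/status.json", timeout=10)
resp.raise_for_status()
# prints e.g. "All Systems Operational" or a description of the current incident
print(resp.json()["status"]["description"])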

Related

Hostnames/IP addresses of packages from pypi, CRAN, maven

We have a server behind a proxy and we want this server to be able to run commands such as:
python: pip install module
R: install.packages("fortunes")
...
Simply to install packages from these sources. Since we are behind a proxy, we cannot install these unless the proxy has them whitelisted (otherwise the proxy prohibits the connection between the server and wherever the package resides).
My question is: what should we whitelist to be able to run these commands?
I am not sure how the package websites actually work (whether they store the packages themselves, or whether they are just an index and the actual packages reside on other domains/hostnames/...). I believe pypi is quite friendly here (packages are actually found there), but I don't know about CRAN or Maven. We are running Spark servers, so our primary concerns are Python, R, Java, or Scala libraries/packages.
Maven: actually stores the packages. Regarding mirroring, see this answer. It also contains the URL of the central repository.
PyPI: from the documentation on how to upload a package to the index, it seems like it also physically stores the packages.
CRAN: also hosts the packages. There are several mirrors; you will need to whitelist the one you want to use.
You might want to consider setting up an internal mirror where you put your dependencies once, and then you don't need to go out to the internet at all.
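For pip specifically, once the hosts are whitelisted you can route it through the proxy via pip's config file. A sketch, with a placeholder proxy hostname (note that while pypi.org serves the index pages, the actual archives are downloaded from files.pythonhosted.org, so both hosts need whitelisting):

# ~/.pip/pip.conf (pip.ini on Windows); the proxy hostname is a placeholder
[global]
proxy = http://proxy.corp.example:3128
index-url = https://pypi.org/simple/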

How do I ensure pip gets package from internal pypi?

I have an application with a requirements.txt which includes a number of third party libraries along with one internal package which must be downloaded from a private pypi instance. Something like:
boto3
flask
flask-restplus
gunicorn
an_internal_package
The problem is that an_internal_package is named something quite common and occludes a package already available on the global pypi. For example, let's call it twisted. The problem I've run into is that setting --extra-index-url within requirements.txt seems to still grab twisted from the global pypi.
--extra-index-url=https://some.internal.pypi.corp.lan
boto3
flask
flask-restplus
gunicorn
twisted # actually an internal package
How can I indicate that twisted should be loaded exclusively from the private pypi and not from the global one?
You could link directly to the package on your internal index instead:
boto3
flask
flask-restplus
gunicorn
https://some.internal.pypi.corp.lan/simple/twisted/Twisted-19.2.0.tar.bz2
This has the effect of pinning the dependency, but pinning is generally considered best practice anyway.
You can refer to index for a solution; it's a little tricky, since you have to handle both the private PyPI and the main PyPI. Instead of using --extra-index-url you should be using --index-url. However, I recommend you read through the given link.
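A sketch of what that looks like in requirements.txt, reusing the internal hostname from the question; note this only works if the internal index also mirrors (or proxies) the public packages, because pip will no longer consult pypi.org at all:

--index-url=https://some.internal.pypi.corp.lan/simple/
boto3
flask
flask-restplus
gunicorn
twisted  # now resolved from the internal index only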

How do I connect to an external Oracle database using the Python cx_Oracle package on Google App Engine Flex?

My Python App Engine Flex application needs to connect to an external Oracle database. Currently I'm using the cx_Oracle Python package which requires me to install the Oracle Instant Client.
I have successfully run this locally (on macOS) by following the Instant Client installation steps. The steps required me to do the following:
Make a directory called /opt/oracle
Create a symlink from /opt/oracle/instantclient_12_2/libclntsh.dylib.12.1 to ~/lib/
However, I am confused about how to do the same thing in App Engine Flex (instructions). Specifically, here's what I'm confused about:
The instructions say I should run sudo yum install libaio to install the libaio package. How do I do this on GAE Flex? Or is this package already available?
I think I can add the Instant Client files to GAE (a whopping ~100MB!), then set the LD_LIBRARY_PATH environment variable in app.yaml to export LD_LIBRARY_PATH=/opt/oracle/instantclient_12_2:$LD_LIBRARY_PATH. Will this work?
Is this even feasible without using custom Docker containers on App Engine Flex?
Overall I'm not sure if I'm on the right track. Would love to hear from someone who has managed this before :)
If any of your dependencies is not available in the base GAE flex images provided by Google and cannot be installed via pip (because it's not a Python package, it's not available on PyPI, or for whatever other reason), then you can't use the requirements.txt file to get it installed in your GAE flex app.
The proper way to satisfy such dependencies would be to build your own custom runtime. From About Custom Runtimes:
Custom runtimes allow you to define new runtime environments, which
might include additional components like language interpreters or
application servers.
Yes, that means providing a custom Dockerfile. In your particular case you'd be installing the Instant Client and libaio inside this Dockerfile. See also Building Custom Runtimes.
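A rough sketch of such a Dockerfile, assuming the GAE flex Python base image (the Instant Client directory mirrors the layout from the question, and on the Debian-based image the yum package libaio is called libaio1):

# sketch only: paths and versions are assumptions, not a tested recipe
FROM gcr.io/google-appengine/python
RUN apt-get update && apt-get install -y libaio1
# ship the Instant Client files (the ~100MB mentioned above) inside the image
COPY instantclient_12_2/ /opt/oracle/instantclient_12_2/
ENV LD_LIBRARY_PATH=/opt/oracle/instantclient_12_2:$LD_LIBRARY_PATH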
Answering your first question, I think the instructions on the Oracle website just show that you have to install said library for your application to work.
In the case of App Engine flex, the way to ensure that libraries are present in the deployment is with the requirements.txt text file. There is a documentation page which explains how to do so.
On the other hand, I will assume that the "Instant Client files" are not libraries, but data your app needs in order to run. You could serve them from Google Cloud Storage, or any other storage alternative within Google Cloud.
I believe that, if this is all you need for your app to work, pushing your own custom container should not be necessary.

Difference between devpi and pypi server

A quick question here: I am used to devpi, and was wondering what the difference is between devpi and pypi server?
Is one better than the other? Which one scales better?
PyPI (Python Package Index) is the official repository for third-party Python software packages. Every time you use e.g. pip to install a package that is not in the standard library, it will get downloaded from the PyPI server.
All of the packages that are on PyPI are publicly visible. So if you upload your own package, then anybody can start using it. And obviously you need internet access in order to use it.
devpi (not sure what the name stands for) is a self-hosted private Python package server. Additionally, you can use it for testing and releasing your own packages.
Being self hosted it's ideal for proprietary work that maybe you wouldn't want (or can't) share with the rest of the world.
So other features that devpi offers:
PyPI mirror - locally caches any packages that you download from PyPI. This is excellent for CI systems: you don't have to worry if a package or server goes missing, and you can even keep using it if you don't have internet access.
multiple indexes - unlike PyPI (which has only one index), in devpi you can create multiple indexes. For example, a main index for packages that are rock solid, and a development index where you can release packages that are still under development. You have to be careful with this, though, because a large number of indexes can make things hard to track.
The server has a simple web interface where you can browse and search for packages.
You can integrate it with pip so that you can use your local devpi server as if you were using PyPI.
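A minimal sketch of that integration (the URL assumes devpi-server's default port 3141 and its standard root/pypi mirror index):

# ~/.pip/pip.conf (pip.ini on Windows)
[global]
index-url = http://localhost:3141/root/pypi/+simple/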
So, answering your questions:
Is one better than the other? - well, these are really two different tools. No clear answer here; it depends on what your needs are.
Which scales better? - definitely devpi.
The official website is very useful with good examples: http://doc.devpi.net/latest/

python: PyPI public modules: How to determine if secure and safe?

I have completed my Python 3 application, and it is using multiple public modules from PyPI.
However, before I deploy it to run within my company's enterprise, where it will be handling our customers' credentials and accessing 3rd-party APIs, I need to do due diligence that these modules are both secure and safe.
What steps must I perform:
Validate that the PyPI modules are secure and safe to use, keeping in mind that the target Python 3 app will be handling credentials?
What is the most recommended way to validate a PyPI module's signature?
Can PyPI module signatures be trusted?
By the way, the Python 3 application will be running within a Docker container.
Thank you
These are 3 separate questions, so:
You'll have to audit the package (or get someone else to do that) to know if it's secure. No easy way around it.
All PyPI packages have an md5 checksum attached (the link in parentheses after the file). Some of them also attach a PGP signature, which shows up in the same place, but it's up to the author whether these are published or not (https://pypi.python.org/pypi/rpc4django, for example, includes both md5 and PGP). Md5 verifies integrity. PGP verifies integrity and origin, so it's a better choice when available.
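As a sketch, verifying a downloaded sdist against its detached PGP signature (the filenames are illustrative, and you need the author's public key in your keyring first):

# download both the package and its .asc signature, then:
gpg --verify package-1.0.tar.gz.asc package-1.0.tar.gz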
Just as much as any other signature.
If you're worried about dependencies to that level, I think you should look at maintaining your own internal PyPI repository. It gives you better verification (just sign the packages yourself after the initial download, and only accept your own signature). It gives you better reliability and speed (you can still build the software if PyPI goes down). And it avoids issues with replaced/updated packages which you haven't audited/approved yet.
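Relatedly, pip's hash-checking mode gives you a way to pin each audited package to an exact artifact. A requirements.txt sketch (the digest below is a placeholder, not a real hash):

# pip refuses to install anything whose archive hash does not match
flask==1.0.2 \
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000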
