Artifactory PyPi repo layout with build promotion - python

Q1:
I have an Artifactory PyPI-enabled repo my-pypi-repo where I can publish my packages. When uploading via python setup.py sdist upload, I get a structure like this:
my-pypi-repo/
    my_package/
        x.y.z/
            my_package-x.y.z.tar.gz
The problem is this structure will not match any "allowed" repo layout in Artifactory, since [org] or [orgPath] are mandatory:
Pattern '[module]/[baseRev]/[module]-[baseRev].[ext]' must at-least
contain the tokens 'module', 'baseRev' and 'org' or 'orgPath'.
I managed to publish to a path by 'hacking' the package name to myorg/my_package, but then pip cannot find it, so it's pretty useless.
Q2:
Has anyone tried the "ci-repo" and "releases-repo" with promotion for Python using Artifactory?
What I would like to achieve:
CI repo:
my_package-1.2.3+build90.tar.gz
When this artifact gets promoted, the build metadata gets dropped.
Releases repo:
my_package-1.2.3.tar.gz
I can achieve this via repo layouts (provided I resolve Q1). The problem is how to deal with the "embedded" version inside my Python script, hardcoded in setup.py.
I'd rather not rebuild the package, as a matter of best practice.

I am running into the same issue with regard to your first question/problem. When configuring my system to publish to Artifactory using pip, it uses the format you described.
As you mentioned, the [org] or [orgPath] is mandatory, and this basically breaks all the REST API functionality, like searching for the latest version, etc. I'm currently using this as my Artifact Path Pattern:
[org]/[module]/[baseRev].([fileItegRev])/[module]-[baseRev].([fileItegRev]).[ext]
The problem is that pip doesn't understand the concept of [org] in this case. I'm temporarily using a python script to publish my packages to Artifactory to get around this. Hopefully this is something that can be addressed by the jFrog team.
The python script simply uses Artifactory's REST API to publish to my local pypi repository, tacking on a few properties so that some of the REST API functions work properly, like Artifact Latest Version Search Based on Properties.
I need to be able to use that call because we're using Chef in-house and we use that method to get the latest version. The pypi.version property that gets added when publishing via python setup.py sdist upload -r local doesn't work with the REST API so I have to manually add the version property. Painful to be honest since we can't add properties when using the upload option with setup.py. Ideally I'd like to be able to do everything using pip, but at the moment this isn't possible.
I'm using the requests package and the upload method in the Artifactory documentation here. Here is the function I'm using to publish adding a few properties (feel free to add more if you need):
import logging
import requests

logger = logging.getLogger(__name__)
PYPI_REPOSITORY = 'my-pypi-repo'  # local PyPI repository key (placeholder)

def _publish_artifact(name, version, path, summary):
    # Build the target URL with matrix properties so that property-based
    # REST searches (e.g. latest version) work on the uploaded artifact.
    base_url = 'http://server:8081/artifactory/{0}'.format(PYPI_REPOSITORY)
    properties = ';version={0};pypi.name={1};pypi.version={0};pypi.summary={2}'\
        .format(version, name, summary)
    url_path = '/Company/{0}/{1}/{0}-{1}.zip'.format(name, version)
    url = '{0}{1}{2}'.format(base_url, properties, url_path)
    dist_file = r'{0}\dist\{1}-{2}.zip'.format(path, name, version)
    files = {'upload_file': open(dist_file, 'rb')}
    s = requests.Session()
    s.auth = ('username', 'password')
    reply = s.put(url, files=files)
    logger.info('HTTP reply: {0}'.format(reply))

A1: Artifactory layouts aren't enforced; you can deploy any file under any path to any repo. Some layout-related features, like snapshot cleanup, won't work then, but I don't think you need them anyway.
A2: The best solution will be to code your promotion in a promotion user plugin. Renaming artifacts on the fly during their promotion to another repo is one of the most popular scenarios for this kind of plugin.
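If coding a user plugin is not an option, a similar rename-on-promotion can be approximated from outside Artifactory with its copy REST API. Below is a hedged Python sketch; the server URL, repository names (my-pypi-ci, my-pypi-releases), credentials and path layout are illustrative placeholders, and this only renames the file, it does not change the version embedded in setup.py:

import requests

ARTIFACTORY = 'http://server:8081/artifactory'
AUTH = ('username', 'password')

def promote(name, version, build):
    # Copy the CI artifact into the releases repo, dropping the "+buildNN"
    # metadata from both the path and the file name (placeholder repo keys).
    src = 'my-pypi-ci/{0}/{1}+{2}/{0}-{1}+{2}.tar.gz'.format(name, version, build)
    dst = 'my-pypi-releases/{0}/{1}/{0}-{1}.tar.gz'.format(name, version)
    url = '{0}/api/copy/{1}?to=/{2}'.format(ARTIFACTORY, src, dst)
    reply = requests.post(url, auth=AUTH)
    reply.raise_for_status()

promote('my_package', '1.2.3', 'build90')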

Related

Does PyPI have simple urls for package downloads?

Does PyPI support simple download URLs? The reason I ask is that I have a PC with curl installed, but not pip. Normally I would install the package with:
pip install ppci
But since pip is not available, what I want to do is download this package with curl and untar it.
Now I can do this:
curl https://pypi.python.org/packages/4c/e8/fd7241885330ace50d2f7598a2652d4e80c1d922faece7bba88529cf6cfe/ppci-0.5.4.tar.gz
tar xfz ppci-0.5.4.tar.gz
But what I want is a cleaner url, like this:
curl https://pypi.python.org/packages/ppci/0.5.4/ppci-0.5.4.tar.gz
So, that in future I can easily upgrade the version to this:
curl https://pypi.python.org/packages/ppci/0.5.5/ppci-0.5.5.tar.gz
Does this url, or something alike exist, such that I can easily increase the version number and get the newer version without the long hashcode in it?
The right url is:
https://pypi.io/packages/source/p/ppci/ppci-0.5.4.tar.gz
Note that this url will redirect, but curl can handle it with the -L option.
The general URL format is:
https://pypi.io/packages/source/{ package_name_first_letter }/{ package_name }/{ package_name }-{ package_version }.tar.gz
These all appear to work as of 2019-10-30, and redirect one to the next:
https://pypi.io/packages/source/p/pip/pip-19.3.1.tar.gz
https://pypi.org/packages/source/p/pip/pip-19.3.1.tar.gz
https://files.pythonhosted.org/packages/source/p/pip/pip-19.3.1.tar.gz
https://files.pythonhosted.org/packages/ce/ea/9b445176a65ae4ba22dce1d93e4b5fe182f953df71a145f557cffaffc1bf/pip-19.3.1.tar.gz
This answer describes a way to fetch wheels using a similar index built by Debian: https://stackoverflow.com/a/53176862/881629
PyPI documentation actively discourages using the conveyor service as above, as it's mostly for legacy support, and we "should generally query the index for package URLs rather than guessing". https://warehouse.readthedocs.io/api-reference/integration-guide.html#querying-pypi-for-package-urls
(Thanks to Wolfgang Kuehn for the pointer to Warehouse documentation, but note that to get a correct wheel we need to select the appropriate entry for the target platform from the urls field in the API response. We can't grab a static element from the list, as order appears to vary between packages.)
The URL for wheels, using invoke as an example, is:
https://files.pythonhosted.org/packages/py3/i/invoke/invoke-1.6.0-py3-none-any.whl
or in general
file_name := {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
first_letter := first letter of distribution
https://files.pythonhosted.org/packages/{python tag}/{first_letter}/{distribution}/{file_name}
I don't know if this is an official contract of PyPI Warehouse.
You can always query its JSON API in a RESTful manner, like so:
https://pypi.org/pypi/invoke/1.6.0/json
The download url is then at document path /urls[1]/url
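A short, hedged example of querying that JSON API with requests and picking a download URL from the urls field (as noted above, select the entry matching the file type or platform you need rather than relying on list order):

import requests

resp = requests.get('https://pypi.org/pypi/invoke/1.6.0/json')
resp.raise_for_status()
data = resp.json()

# Pick the sdist entry; wheels appear with packagetype 'bdist_wheel'.
sdist = next(u for u in data['urls'] if u['packagetype'] == 'sdist')
print(sdist['url'])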

List all the Mercurial projects in Mercurial repo with python api

I want to list all the projects using Mercurial with Python.
I downloaded the hglib package (import hglib), but I did not find anything in the documentation, the functions, etc. that could help me. Does anybody know how this can be done?
PS: I found some information at these links:
- https://www.mercurial-scm.org/wiki/MercurialApi
- http://pythonhosted.org/hgapi/index.html#hgapi.hgapi.Repo.command
but it wasn't what I was looking for...
So, what I understood from your question is that you probably want to list the nested repositories inside a Mercurial repo. In simpler terms, you must be working with a Mercurial forest and want to list the sub-repos deep under it, and it seems there is a Python package named hgnested written to deal with Mercurial forests.
I took a patch of code and played with it to meet what we want, and this is what I came up with.
from mercurial import hg, ui
import hgnested

def getNestedRepos(ui, source, **opts):
    origsource = ui.expandpath(source)
    remotesource, remotebranch = hg.parseurl(origsource, opts.get('branch'))
    if hasattr(hg, 'peer'):
        remoterepo = hg.peer(ui, opts, remotesource)
        localrepo = remoterepo.local()
        if localrepo:
            remoterepo = localrepo
    else:
        remoterepo = hg.repository(hg.remoteui(ui, opts), remotesource)
    return remoterepo.nested

print getNestedRepos(ui.ui(), <path to mercurial forest>)
But there is another, scrappier way, sketched below: visit all the subdirectories recursively and look for the presence of a .hg directory.
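A minimal sketch of that scrappy approach, assuming a local forest path (the root path below is a placeholder):

import os

def find_hg_repos(root):
    repos = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if '.hg' in dirnames:
            repos.append(dirpath)
            dirnames.remove('.hg')  # don't descend into repository metadata
    return repos

print(find_hg_repos('/path/to/mercurial/forest'))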
Note: If you want to list repos from a remote repository, make sure the hgnested package is installed on that remote server and that the repo path you are passing is a Mercurial forest head.

How do I add python libraries to an AWS lambda function for Alexa?

I was following the tutorial to create an Alexa app using Python:
Python Alexa Tutorial
I was able to successfully follow all the steps and get the app to work. I now want to modify the Python code and use external libraries such as import requests
or any other libraries that I install using pip. How would I set up my lambda function to include any pip packages that I install locally on my machine?
As described in the official Amazon documentation here, it is as simple as creating a zip of all the folder contents after installing the required packages in the folder where you have your Python lambda code.
As Vineeth pointed out above in his comment, the very first step in moving from an inline code editor to a zip file upload approach is to change your lambda function handler name under configuration settings to include the Python script file name that holds the lambda handler:
lambda_handler => {your-python-script-file-name}.lambda_handler.
Other solutions like python-lambda and lambda-uploader help with simplifying the process of uploading and the most importantly LOCAL TESTING. These will save a lot of time in development.
The official documentation is pretty good. In a nutshell, you need to create a zip file of a directory containing both the code of your lambda function and all external libraries you use at the top level.
You can simulate that by deactivating your virtualenv, copying all your required libraries into the working directory (which is always in sys.path if you invoke a script on the command line), and checking whether your script still works.
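As a rough sketch of that packaging step, assuming the handler lives in lambda_function.py and dependencies are listed in requirements.txt (both names are placeholders, not from the tutorial):

import os
import shutil
import subprocess
import sys
import zipfile

BUILD_DIR = 'build'

# Install dependencies flat into the build directory.
subprocess.check_call([
    sys.executable, '-m', 'pip', 'install',
    '-r', 'requirements.txt', '--target', BUILD_DIR,
])

# Put the function code next to the libraries, at the top level of the zip.
shutil.copy('lambda_function.py', BUILD_DIR)

# Zip the directory contents, not the directory itself.
with zipfile.ZipFile('deployment.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    for root, _dirs, files in os.walk(BUILD_DIR):
        for name in files:
            full = os.path.join(root, name)
            zf.write(full, os.path.relpath(full, BUILD_DIR))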
You may want to look into using frameworks such as zappa which will handle packaging up and deploying the lambda function for you.
You can use that in conjunction with flask-ask to have an easier time making Alexa skills. There's even a video tutorial of this (from the zappa readme) here
To solve this particular problem we're using a library called juniper. In a nutshell, all you need to do is create a very simple manifest file that looks like:
functions:
  # Name the zip file you want juni to create
  router:
    # Where are your dependencies located?
    requirements: ./src/requirements.txt
    # Your source code.
    include:
      - ./src/lambda_function.py
From this manifest file, calling juni build will create the zip file artifact for you. The file will include all the dependencies you specify in the requirements.txt.
The command will create the file ./dist/router.zip. We're using that file in conjunction with a SAM template. However, you can also take that zip and upload it through the console or via the AWS CLI.
Echoing @d3ming's answer, a framework is a good way to go at this point. Creating the deployment package manually isn't impossible, but you'll need to upload your packages' compiled code, and if you're compiling that code on a non-Linux system, the chance of running into issues with differences between your system and the Lambda function's deployed environment is high.
You can work around that by compiling your code on a Linux machine or in a Docker container, but given all that complexity you can get much more from adopting a framework.
Serverless is well adopted and has support for custom python packages. It even integrates with Docker to compile your python dependencies and build the deployment package for you.
If you're looking for a full tutorial on this, I wrote one for Python Lambda functions here.
Amazon created a repository that deals with your situation:
https://github.com/awsdocs/aws-lambda-developer-guide/tree/master/sample-apps/blank-python
The blank app is an example of how to push a lambda function that depends on requirements, with the bonus of being made by Amazon.
All you need to do is follow the steps and update the repository based on your needs.
For some Lambda POCs and fast Lambda prototyping you can include and use the following function _install_packages. You can place a call to it before the lambda handler function (for package installation at Lambda init time, if your dependencies need less than 10 seconds to install), or place the call at the beginning of the lambda handler (this will call the function exactly once, on the first Lambda event). Given the pip install options included, packages to be installed must provide binary installable versions for manylinux.
_installed = False

def _install_packages(*packages):
    global _installed
    if not _installed:
        import os
        import sys
        import time
        _started = time.time()
        os.system("mkdir -p /tmp/packages")
        _packages = " ".join(f"'{p}'" for p in packages)
        print("INSTALLED:")
        os.system(
            f"{sys.executable} -m pip freeze --no-cache-dir")
        print("INSTALLING:")
        os.system(
            f"{sys.executable} -m pip install "
            f"--no-cache-dir --target /tmp/packages "
            f"--only-binary :all: --no-color "
            f"--no-warn-script-location {_packages}")
        sys.path.insert(0, "/tmp/packages")
        _installed = True
        _ended = time.time()
        print(f"package installation took: {_ended - _started:.2f} sec")

# usage example before lambda handler
_install_packages("pymssql", "requests", "pillow")

def lambda_handler(event, context):
    pass  # lambda code

# usage example from within the lambda handler
def lambda_handler(event, context):
    _install_packages("pymssql", "requests", "pillow")
    pass  # lambda code
The examples above install the Python packages pymssql, requests and pillow.
An example lambda that installs requests and then calls ifconfig.me to obtain its egress IP address:
import json

# _installed flag and _install_packages() defined exactly as above

_install_packages("requests")

def lambda_handler(event, context):
    import requests
    return {
        'statusCode': 200,
        'body': json.dumps(requests.get('http://ifconfig.me').content.decode())
    }
Since single-quote escaping is handled when building pip's command line, you can also specify a version in a package spec, for example pillow<9, which will install the most recent 8.x.x version of pillow.
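For example, a pinned call reusing the helper above might look like:

_install_packages("requests", "pillow<9")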
I too struggled with this for a while. After deep-diving into AWS resources I learned that the Lambda function runs on Linux, and it's very important to use the Python package build that matches that Linux environment.
You may find more information on this on :
https://aws.amazon.com/lambda/faqs/
Follow these steps to get the right version:
1. Find the .whl file of the package on PyPI and download it locally.
2. Zip the packages and add them as a layer in AWS Lambda (a scripted sketch of this step follows below).
3. Add the layer to the lambda function.
Note: Please make sure that the version of the Python package you're trying to install matches the Linux OS on which AWS Lambda performs its compute tasks.
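If you prefer to script steps 2 and 3, a hedged boto3 sketch might look like the following; the layer name, zip path, function name and runtime are placeholders, and note that for a Python layer the packages must live under a python/ directory inside the zip:

import boto3

client = boto3.client('lambda')

# Step 2: publish the zipped packages as a layer version.
with open('my-layer.zip', 'rb') as f:
    layer = client.publish_layer_version(
        LayerName='my-dependencies',        # placeholder layer name
        Content={'ZipFile': f.read()},
        CompatibleRuntimes=['python3.9'],
    )

# Step 3: attach the layer to the lambda function.
client.update_function_configuration(
    FunctionName='my-alexa-skill',          # placeholder function name
    Layers=[layer['LayerVersionArn']],
)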
References :
https://pypi.org/project/Pandas3/#files
A lot of Python libraries can be imported via layers: https://github.com/keithrozario/Klayers, or you can use a framework like Serverless that has plugins to package dependencies directly into your artifact.

What is the cleanest way to add a directory of third-party packages to the beginning of the Python path?

My context is appengine_config.py, but this is really a general Python question.
Given that we've cloned a repo of an app that has an empty directory lib in it, and that we populate lib with packages by using the command pip install -r requirements.txt --target lib, then:
dirname = 'lib'
dirpath = os.path.join(os.path.dirname(__file__), dirname)
For importing purposes, we can add such a filesystem path to the beginning of the Python path in the following way (we use index 1 because the first position should remain '.', the current directory):
sys.path.insert(1, dirpath)
However, that won't work if any of the packages in that directory are namespace packages.
To support namespace packages we can instead use:
site.addsitedir(dirpath)
But that appends the new directory to the end of the path, which we don't want in case we need to override a platform-supplied package (such as WebOb) with a newer version.
The solution I have so far is this bit of code which I'd really like to simplify:
sys.path, remainder = sys.path[:1], sys.path[1:]
site.addsitedir(dirpath)
sys.path.extend(remainder)
Is there a cleaner or more Pythonic way of accomplishing this?
For this answer I assume you know how to use setuptools and setup.py.
Assuming you would like to use the standard setuptools workflow for development, I recommend using this code snippet in your appengine_config.py:
import os
import sys
if os.environ.get('CURRENT_VERSION_ID') == 'testbed-version':
    # If we are unittesting, fake the non-existence of appengine_config.
    # The error message of the import error is handled by gae and must
    # exactly match the proper string.
    raise ImportError('No module named appengine_config')

# Imports are done relative because Google app engine prohibits
# absolute imports.
lib_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'libs')

# Add every library to sys.path.
if os.path.isdir(lib_dir):
    for lib in os.listdir(lib_dir):
        if lib.endswith('.egg'):
            lib = os.path.join(lib_dir, lib)
            # Insert to override default libraries such as webob 1.1.1.
            sys.path.insert(0, lib)
And this piece of code in setup.cfg:
[develop]
install-dir = libs
always-copy = true
If you type python setup.py develop, the libraries are downloaded as eggs into the libs directory, and appengine_config inserts them into your path.
We use this at work to include webob==1.3.1 and internal packages which are all namespaced using our company namespace.
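For completeness, a minimal setup.py that this workflow assumes might look like the following (the project name and dependency pins are illustrative placeholders, not from the answer above):

from setuptools import setup, find_packages

setup(
    name='my-gae-app',          # placeholder project name
    version='0.1',
    packages=find_packages(),
    install_requires=[
        'webob==1.3.1',         # example: override the GAE-bundled WebOb
    ],
)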
You may want to have a look at the answers in the Stack Overflow thread, "How do I manage third-party Python libraries with Google App Engine? (virtualenv? pip?)," but for your particular predicament with namespace packages, you're running up against a long-standing issue I filed against site.addsitedir's behavior of appending to sys.path instead of inserting after the first element. Please feel free to add to that discussion with a link to this use case.
I do want to address something else that you said that I think is misleading:
My context is appengine_config.py, but this is really a general Python
question.
The question actually arises from the limitations of Google App Engine and the inability to install third-party packages normally, and hence seeks a workaround of manually adjusting sys.path and using site.addsitedir. In general Python development, if your code uses these, you're Doing It Wrong.
The Python Packaging Authority (PyPA) describes the best practices to put third party libraries on your path, which I outline below:
Create a virtualenv
Mark out your dependencies in your setup.py and/or requirements files (see PyPA's "Concepts and Analyses")
Install your dependencies into the virtualenv with pip
Install your project, itself, into the virtualenv with pip and the -e/--editable flag.
Unfortunately, Google App Engine is incompatible with virtualenv and with pip. GAE chose to block this toolset in an attempt to sandbox the environment. Hence, one must use hacks to work around the limitations of GAE to use additional or newer third-party libraries.
If you dislike this limitation and want to use standard Python tooling for managing third-party package dependencies, other Platform as a Service providers out there eagerly await your business.

Using pip within a python script

I am writing a utility in Python that needs to check for (and if necessary, install and even upgrade) various other modules within a target project/virtualenv, based on user-supplied flags and/or input. I am currently trying to use pip directly/programmatically (because of its existing support for the various repo types I will need to access), but I am having difficulty finding examples or documentation on using it this way.
This seemed like the direction to go:
import pip
vcs = pip.vcs.VersionControl(url="http://path/to/repo/")
...but it gives no joy.
I need help with some of the basics, apparently - like how can I use pip to pull/export a copy of an SVN repo into a given local directory? Ultimately, I will also need to use it for git and mercurial checkouts as well as standard PyPI installs. Any links, docs or pointers would be much appreciated.
Pip uses a particular format for VCS URLs. The format is:
vcsname+url#rev
#rev is optional; you can use it to reference a specific commit/tag.
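A couple of illustrative URLs in that format (the repository locations are placeholders, not real projects):

git+https://example.com/project/repo.git#v1.2.0
hg+https://example.com/project/repo#stable
svn+http://example.com/project/repo/trunk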
To use pip to retrieve a repository from a generic VCS into a local directory, you may do this:
# Note: this uses pip's internal API, which has since moved under
# pip._internal and may differ between pip versions.
from pip.vcs import VcsSupport

req_url = 'git+git://url/repo'
dest_path = '/this/is/the/destination'

vcs = VcsSupport()
vc_type, url = req_url.split('+', 1)
backend = vcs.get_backend(vc_type)
if backend:
    vcs_backend = backend(req_url)
    vcs_backend.obtain(dest_path)
else:
    print('Not a repository')
Check https://pip.pypa.io/en/stable/reference/pip_install/#id8 to see which VCSs are supported.
