How to contact a specific PyPI server using Chef and python_pip

We're running our own PyPI server. Now we're starting to use Chef to handle deployments. I'm trying to figure out the best approach to pip install from within a Chef recipe, contacting our custom server and passing credentials.
Typically we would install packages like this:
pip install -i http://<server address:portno>/simple extremely_cool_package
The server prompts for a username and password. It speaks basic access authentication since it's behind our firewall.
Can python_pip do all this, and if so how? If not, what's the best practice?

The following isn't optimal, but it gets the job done:
python_pip "extremely_cool_package" do
action :install
options "--index-url=http://username:password#server address:portno/simple"
end

I was facing the same problem when trying to install from a pip repository.
The easiest way is to put your credentials in ~/.pip/pip.conf in the form:
[global]
index-url = http://user:password@server_address:portno/simple
You can also put pip.conf in the root of your virtualenv folder and use pip install inside the virtualenv session.
Edit:
To avoid replacing your default index with the new one, you should use:
[global]
extra-index-url = http://user:password@server_address:portno/simple
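If you want to keep this per-project rather than global, here is a minimal sketch of the virtualenv variant mentioned above (the ./venv path is an assumption; on Linux/macOS pip also reads $VIRTUAL_ENV/pip.conf):
# Write a pip.conf into the virtualenv root so the credentials only apply there.
cat > venv/pip.conf <<'EOF'
[global]
extra-index-url = http://user:password@server_address:portno/simple
EOF
source venv/bin/activate
pip install extremely_cool_package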

Related

Hostnames/IP addresses of packages from pypi, CRAN, maven,

We have a server behind a proxy and we want this server to be able to run commands such as:
python: pip install module
R: install.packages("fortunes")
...
Simply to install packages from these sources. Since we are behind a proxy, we cannot install these unless the proxy has the relevant hosts whitelisted (otherwise the proxy prohibits the connection between the server and wherever the package resides).
My question is: what should we whitelist to be able to run these commands?
I am not sure how the package websites actually work (whether they store the packages themselves or are just indexes, with the actual packages residing on other domains/hostnames/...). I believe pypi is quite friendly here (packages are actually hosted there), but for CRAN or Maven I don't know. We are running Spark servers, so our primary concerns are Python, R, Java or Scala libraries/packages.
Maven: actually stores the packages. Regarding mirroring, see this answer; it also contains the URL of the central repository.
PyPI: from the documentation on how to upload a package to the index, it seems it also physically stores the packages.
CRAN: also hosts the packages. There are several mirrors; you will need to whitelist the one you want to use.
You might want to consider setting up an internal mirror where you put your dependencies once, and then don't need to go to the outside internet.
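If you do go the internal-mirror route, a rough sketch of the client side (the mirror hostname is hypothetical, and the mirror software itself, e.g. devpi or Artifactory, is your choice):
# Python: point pip at the internal mirror instead of pypi.org.
pip install --index-url http://pypi.mirror.internal/simple module
# R: point install.packages at an internal CRAN mirror in the same spirit:
#   install.packages("fortunes", repos = "http://cran.mirror.internal")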

Installing a Git-hosted module with pip on OpenShift

I have a project with a requirements.txt resembling this:
-e git+https://some.gitlab.com/some_group/some_repo#egg=repo
selenium
pywinauto
I made a source secret on OpenShift with my username and password and started the build. Cloning the project goes through, but cloning some_repo fails with an error: "Can't find Username".
I'm a bit confused because the main project was successfully cloned with the credentials provided in the secret, but it doesn't seem like pip is reusing those.
What's more confusing is that OpenShift seems to store the credentials in a .gitconfig file that should be known to pip:
I0107 15:35:14.756570 1 password.go:84] Adding username/password credentials to git config:
# credential git config
[credential]
helper = store --file=/tmp/gitcredentials.324456941
Any ideas?
P.S. I wanted to try with an SSH key, but for some reason the admins don't want to enable this option on the company's GitLab. And I don't want to put credentials in the URL inside the requirements.txt.
Edit: I have no problem with this on my workstation.
pip expects you to add the username and password as part of the URL if you are not using SSH keys. You could set the secrets as environment variables and refer to them in your pip.conf.
[global]
index-url = https://$username:$password@some.gitlab.com/some_group/some_repo
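As a hedged alternative (assuming the build exposes the secret as environment variables): pip itself can expand ${VAR} placeholders in a requirements file, which keeps the credentials out of both requirements.txt and pip.conf:
# requirements.txt entry (GIT_USERNAME/GIT_PASSWORD are placeholder variable names):
#   -e git+https://${GIT_USERNAME}:${GIT_PASSWORD}@some.gitlab.com/some_group/some_repo#egg=repo
export GIT_USERNAME=myuser
export GIT_PASSWORD=mysecret
pip install -r requirements.txt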

Python flask saml throwing saml2.sigver.SigverError Error Message

Has anyone successfully implemented flask-saml using Windows as a dev environment, Python 3.6 and Flask 1.0.2?
I was given the link to the SAML metadata XML file by our organisation and configured it in my Flask app.
app.config.update({
    'SECRET_KEY': 'changethiskeylaterthisisoursecretkey',
    'SAML_METADATA_URL': 'https://<url>/FederationMetadata.xml',
})
flask_saml.FlaskSAML(app)
According to the documentation this extension will set up the following routes:
/saml/logout/: Log out from the application. This is where users go if they click on a “Logout” button.
/saml/sso/: Log in through SAML.
/saml/acs/: After /saml/sso/ has sent you to your IdP, it sends you back to this path. Also, your IdP might provide direct login without needing the /saml/sso/ route.
When I go to one of the routes http://localhost:5000/saml/sso/ I get the error below
saml2.sigver.SigverError: Cannot find ['xmlsec.exe', 'xmlsec1.exe']
I then went to this site https://github.com/mehcode/python-xmlsec/releases/tag/1.3.5 to get xmlsec and install it. However, I'm still getting the same issue.
Here is a screenshot of how I installed xmlsec
The where command does not seem to find xmlsec.exe.
The documentation asks you to have xmlsec1 pre-installed. What you installed is a Python binding to xmlsec1.
Get a Windows build of xmlsec1 from here or build it from source, and make it available in the PATH.
xmlsec won't work properly on Windows; it's better to use a Linux environment.
Run the command below before running pip install xmlsec:
sudo apt-get install xmlsec1

Checking out private repository on pip install

I've been working on an update to a server that requires a private GitHub repository. I can download the repo on my machine, because I'm able to enter a password when prompted. When I try to do this with my server, which is running on an Amazon EC2 instance, I don't have these prompts, so the module from the GitHub repository is not installed. Is there a way for me to provide the username and password in the installing file that I'm using for pip, so I can have the private repo module install successfully?
I'm using the -e git+<url>#egg=<name> format in my requirements.txt.
You can use SSH links instead of HTTPS links, like git@github.com:username/projectname.git instead of https://github.com/username/projectname.git, and use authentication keys instead of a password.
Step by step, you have to:
Change the URL in requirements.txt to git@....
Create a key pair for your deployment machine and store it in the ~/.ssh/ directory.
Add the key to your GitHub account.
Read the GitHub help pages for more detailed instructions.
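A minimal sketch of those steps (the key filename and egg name are placeholders); note that in requirements.txt the SSH form is usually written with a git+ssh:// prefix:
# Generate a dedicated key pair for the EC2 instance (empty passphrase for unattended installs).
ssh-keygen -t rsa -b 4096 -f ~/.ssh/deploy_key -N ""
# After adding ~/.ssh/deploy_key.pub to the GitHub account, the requirements.txt entry becomes:
#   -e git+ssh://git@github.com/username/projectname.git#egg=projectname
pip install -r requirements.txt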

Heroku: Python dependencies in private repos without storing my password

The Problem
My problem is exactly like How do I install in-house requirements for Python Heroku projects? and How to customize pip's requirements.txt in Heroku on deployment?. Namely, I have a private repo from which I need a Python dependency installed into my Heroku app. The canonical answer, given by Heroku's own Kenneth Reitz, is to put something like
-e git+https://username:password@github.com/kennethreitz/requests.git@v0.10.0#egg=requests
in your requirements.txt file.
My security needs prevent me from storing my password in a repo. (I also do not want to put the dependency inside my app's repo; they're separate pieces of software and need to be in separate repos.) The only place I can give my password (or, preferably, a GitHub OAuth token or deployment key) to Heroku is in an environment variable like
heroku config:add GITHUB_OAUTH_TOKEN=12312312312313
Attempted Solutions
I could use a custom .profile in my app's repo, but then I'd be downloading and installing my dependency each time a process (web, worker, etc) restarts.
This leaves using a custom buildpack and the Heroku Labs addon that exposes my heroku config environment before the buildpack compiles. I tried building one on top of Buildpack Multi. The idea is Buildpack Multi is the primary buildpack, and using the .buildpacks file in my app's repo, it first downloads the normal Heroku Python buildpack, then my custom one.
The trouble is that even after Buildpack Multi successfully runs the Python buildpack, the Python binary and pip package are not visible to my own buildpack when it runs. So the custom buildpack just fails outright. (In my tests, the GITHUB_OAUTH_TOKEN environment variable was correctly exposed to the buildpacks.)
The only other thing I can think to try is to make my own fork of the Python buildpack that installs my dependency when it installs everything from requirements.txt, or even rewrites requirements.txt directly. Both of these seem like really heavy solutions to what I would think is a very common problem.
Update: Current Workaround
My custom buildpack (linked above) now downloads and saves my closed-source dependency ("foo") into the vendor directory that the geos buildpack uses. I committed foo's own dependencies into my app's requirements.txt. Thus pip installs foo's dependencies through my app's requirements.txt, and the buildpack adds the vendored copy of foo to my app's environment's PYTHONPATH (so foo's setup.py install never runs).
The biggest problem with this approach is coupling my (admittedly badly written) buildpack with my app. The second problem is that my app's requirements.txt should just list foo as a dependency and leave foo's dependencies to foo to determine. Lastly, there isn't a good way to give my future self a useful error message six months from now, when I've forgotten how all this works, if I forget to set my GITHUB_OAUTH_TOKEN environment variable (and the feedback would be even less useful if the token expires while the environment variable still exists but is no longer valid).
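For concreteness, a rough sketch of the workaround's buildpack step (paths, filenames and the "foo" placeholder are assumptions, not the actual buildpack code):
# bin/compile: vendor foo and put it on PYTHONPATH so its setup.py install never runs.
# Heroku passes the app's build directory as the first argument.
BUILD_DIR="$1"
mkdir -p "$BUILD_DIR/vendor"
# ... download foo's source tree into $BUILD_DIR/vendor/foo using GITHUB_OAUTH_TOKEN ...
mkdir -p "$BUILD_DIR/.profile.d"
echo 'export PYTHONPATH="$HOME/vendor/foo:$PYTHONPATH"' > "$BUILD_DIR/.profile.d/foo_path.sh"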
Cry for Help
What (likely obvious) thing am I missing? How have you solved this problem in your apps? Any suggestions on getting my buildpack to work, or hopefully an even simpler solution?
I created a buildpack to solve this problem using a custom ssh key stored as an environment variable. As the buildpack is technology agnostic, it can be used to download dependencies using any tool like composer for php, bundler for ruby, npm for javascript, etc: https://github.com/simon0191/custom-ssh-key-buildpack
Add the buildpack to your app:
$ heroku buildpacks:add --index 1 https://github.com/simon0191/custom-ssh-key-buildpack
Generate a new SSH key (let's say you named it deploy_key)
Add the public key to your private repository account. For example:
Github: https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/
Bitbucket: https://confluence.atlassian.com/bitbucket/add-an-ssh-key-to-an-account-302811853.html
Encode the private key as a base64 string and add it as the CUSTOM_SSH_KEY environment variable of the heroku app.
Make a comma separated list of the hosts for which the ssh key should be used and add it as the CUSTOM_SSH_KEY_HOSTS environment variable of the heroku app.
# MacOS
$ heroku config:set CUSTOM_SSH_KEY=$(base64 --input ~/.ssh/deploy_key) CUSTOM_SSH_KEY_HOSTS=bitbucket.org,github.com
# Ubuntu
$ heroku config:set CUSTOM_SSH_KEY=$(base64 ~/.ssh/deploy_key) CUSTOM_SSH_KEY_HOSTS=bitbucket.org,github.com
Deploy your app and enjoy :)
I faced the same problem. Like you, I am amazed how difficult it is to find good documentation on how to install a private dependency (whatever the language and the service used).
Because this is not a main concern of service providers, I now try a systematic approach relying as little as possible on idiosyncratic features. I try to find the easiest solution for each of these steps:
Pass the credentials to the build environment using a secure channel. For Python, use an environment variable containing an SSH key as a base64 string. For JS, the same with the npm token.
Configure the build process to use these credentials. In the best case it involves configuring ssh to use a deploy key. Otherwise it can be as basic as cloning the dependency for later use. For your specific case with Python and Heroku, you can use the 'pre_compile' hook.
I detailed the process for my future self here: https://gist.github.com/michelbl/a6163522d95540cf0c8b6667bd35d5f5
I need to give access to a private dependency. This can happen for continuous integration or deployment.
Here we use Python and GitHub, with the services CircleCI and Heroku. However, the principles apply everywhere.
What is a deploy key?
See https://developer.github.com/v3/guides/managing-deploy-keys/
There are 4 ways of granting access to a private dependency, but deploy keys are a good compromise in terms of security and ease of use for projects that do not require too many dependencies (in that case, prefer a machine user). In any case, do not use the username/password of a developer account or an OAuth token, as they do not provide privilege limitation.
Create a deploy key:
ssh-keygen -t rsa -b 4096 -C "myself@my_company.com"
Give the public part to GitHub.
Give the private part to the service needing access. See below.
General strategy
Whatever the service or technology I use, the goal is to access the git repo over ssh, using the deploy key.
Obviously, I do not want to put the deploy key in the repo. But most services (CI, deployment) provide a way to set protected environment variables that can be used at build time. The key can be encoded using base64:
cat deploy-key | base64
cat deploy-key.pub | base64
Most services also provide a way to tailor the build procedure. This is needed to configure ssh to use the deploy key.
CircleCI
Set the deploy key using env variables, encoded with base64.
In config.yml, add a step:
echo $DEPLOY_KEY_PRIVATE | base64 --decode > ~/.ssh/deploy-key
chmod 400 ~/.ssh/deploy-key
echo $DEPLOY_KEY_PUBLIC | base64 --decode > ~/.ssh/deploy-key.pub
ssh-add ~/.ssh/deploy-key
# Run this to check which private key is used. If the checkout key is used,
# github replies "Hi my_org/my_package". If the deploy key is used as wished,
# github replies "Hi my_org/my_dependency".
#ssh -i ~/.ssh/deploy-key -T git@github.com || true
# Now pip connects to git+ssh using the deploy key
export GIT_SSH_COMMAND="ssh -i ~/.ssh/deploy-key"
pip install -r requirements.txt
requirements.txt can be something like:
# The purpose of this file is to install the private dependency *before*
# setup.py is run.
# Be sure ssh is configured to use a ssh key with read permission to the repo.
git+ssh://git@github.com/my_org/my_dependency@1.0.10
# Run setup.py. The private dependency is already installed with the good
# version so pip doesn't try to fetch it from PyPI.
--editable .
and setup.py does not care about the dependency being private:
from distutils.core import setup

setup(
    name='my_package',
    version='1.0',
    packages=[
        'my_package',
    ],
    install_requires=[
        # Beware, the following package is a private dependency.
        # Python provides several ways to install private dependencies, none
        # are really satisfactory.
        # 1. Use dependency_links / --process-dependency-links. Good luck with
        #    that!
        # 2. Maintain a private package repository. Good luck with that!
        # 3. Install the private dependency separately before setup.py is run.
        #    This is now the preferred way. Be sure that ssh is properly
        #    configured to use an ssh key with read permission to the GitHub repo
        #    of the private dependency, then run:
        #    `pip install -r requirements.txt`
        'my_dependency==1.0.10',
        ...  # my normal dependencies
        'unidecode==1.0.22',
        'uwsgi==2.0.15',
        'nose==1.3.7',  # tests
        'flake8==3.5.0',  # style
    ],
)
Heroku
For Python, there is no need to write a custom buildpack. First, set the deploy key using env variables, encoded with base64.
Then add the hook bin/pre_compile:
# This script configures ssh on Heroku to use the deploy key.
# This is needed to install private dependencies.
#
# Note that this does not work with Heroku review apps. Indeed review apps can
# inherit env variables from their parents, but they access their values after
# the build. You would need a way to pass the ssh key to this script another
# way.
#
# See also
# * https://stackoverflow.com/questions/21297755/heroku-python-dependencies-in-private-repos-without-storing-my-password#
# * https://github.com/bjeanes/ssh-private-key-buildpack
# Ensure we have an ssh folder
if [ ! -d ~/.ssh ]; then
    mkdir -p ~/.ssh
    chmod 700 ~/.ssh
fi
# Create the key files
cat $ENV_DIR/DEPLOY_KEY_PRIVATE | base64 --decode > ~/.ssh/deploy-key
chmod 400 ~/.ssh/deploy-key
cat $ENV_DIR/DEPLOY_KEY_PUBLIC | base64 --decode > ~/.ssh/deploy-key.pub
#ssh-add ~/.ssh/deploy-key
# If you want to disable host verification, you could use that.
#ssh -oStrictHostKeyChecking=no -T git@github.com 2>&1
# Run that if you want to check that ssh uses the correct key.
#ssh -i ~/.ssh/deploy-key -T git@github.com || true
# Configure ssh to use the correct deploy key when connecting to github.
# Disables host verification.
echo -e "Host github.com\n"\
" IdentityFile ~/.ssh/deploy-key\n"\
" IdentitiesOnly yes\n"\
" UserKnownHostsFile=/dev/null\n"\
" StrictHostKeyChecking no"\
>> ~/.ssh/config
# Unfortunately this does not seem to work.
#export GIT_SSH_COMMAND="ssh -i ~/.ssh/deploy-key"
# The vanilla python buildpack can now install all the dependencies in
# requirements.txt
Create a private PyPI server
If you create your own PyPI server, you can simply list your packages in your requirements.txt file and then store the url for your server (including username and password) in the config variable, PIP_EXTRA_INDEX_URL.
For example:
heroku config:set PIP_EXTRA_INDEX_URL='https://username:password@privateserveraddress.com/simple'
Note that this is the same as using the pip install command line option, --extra-index-url. (See https://pip.pypa.io/en/stable/user_guide/#environment-variables)
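For comparison, the equivalent one-off command line (credentials and server address are placeholders):
pip install --extra-index-url 'https://username:password@privateserveraddress.com/simple' -r requirements.txt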
The primary index url will still be the default (https://pypi.org/simple). This means that pip will first attempt to resolve package names in your requirements file at the default PyPI server, and then try your private server second.
If you need packages in your private server that have the same name as packages in PyPI, then you need the primary index URL to be your server and the --extra-index-url option to be the default server's URL. You would need to do this if you want to host your own version of an existing package without changing the package name. I haven't tried this, but it currently looks like you would need to create a fork of Heroku's official Python buildpack and make a small change to the bin/steps/pip-install file.
The reason pip has access to the PIP_EXTRA_INDEX_URL is because of this block in that file:
# Set Pip env vars
# This reads certain environment variables set on the Heroku app config
# and makes them accessible to the pip install process.
#
# PIP_EXTRA_INDEX_URL allows for an alternate pypi URL to be used.
if [[ -r "$ENV_DIR/PIP_EXTRA_INDEX_URL" ]]; then
PIP_EXTRA_INDEX_URL="$(cat "$ENV_DIR/PIP_EXTRA_INDEX_URL")"
export PIP_EXTRA_INDEX_URL
mcount "buildvar.PIP_EXTRA_INDEX_URL"
fi
Code like this is necessary to read config variables in buildpacks (see https://devcenter.heroku.com/articles/buildpack-api#buildpack-api), but you should be able to simply duplicate this code block, replacing PIP_EXTRA_INDEX_URL with PIP_INDEX_URL. Then set PIP_INDEX_URL to your private server's URL and PIP_EXTRA_INDEX_URL to the default PyPI URL.
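A sketch of what that duplicated block could look like in the forked buildpack (this is an adaptation, not code from the official buildpack):
# PIP_INDEX_URL overrides the primary package index used by pip.
if [[ -r "$ENV_DIR/PIP_INDEX_URL" ]]; then
    PIP_INDEX_URL="$(cat "$ENV_DIR/PIP_INDEX_URL")"
    export PIP_INDEX_URL
fi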
If you are using another source instead of a private PyPI server, such as github, and simply need a way to avoid hardcoding a username and password in your requirements.txt file, then also note that you can use environment variables in requirements.txt (see https://pip.pypa.io/en/stable/reference/pip_install/#using-environment-variables). You would just have to export them in bin/steps/pip-install as you would for PIP_INDEX_URL.
You could use a pre-compile step as described here to run something like M4 to do substitutions on your requirements.txt to fill in the password from the environment variable.
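A rough sketch of that idea as a bin/pre_compile hook (the template filename requirements.in and the config variable name are assumptions):
# bin/pre_compile: render requirements.txt from a template, filling in the password
# from a Heroku config var ($ENV_DIR is available to the hook, as in the script above).
m4 -DPASSWORD="$(cat "$ENV_DIR/GITHUB_PASSWORD")" requirements.in > requirements.txt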
