Using git behind a proxy in python scripts - python

I work with a proxy that doesn't like git. In most cases, I can work around it with export http_proxy and git config --global url."http://".insteadOf git://.
But when I use Yocto's Python scripts, this workaround no longer works. I'm systematically stopped at Getting branches from remote repo git://git.yoctoproject.org/linux-yocto-3.14.git.... I suspect these lines are responsible:
gitcmd = "git ls-remote %s *heads* 2>&1" % (giturl)
tmp = subprocess.Popen(gitcmd, shell=True, stdout=subprocess.PIPE).stdout.read()
I think that after these lines, others will try to connect to the git URL. The script I use (yocto-bsp) calls other scripts, which call further scripts, so it's difficult to tell.
I have tried adding os.system('git config --global url."http://".insteadOf git://') just before, but it achieves nothing.
Of course, I could modify all the URLs manually (or with a parsing script) to replace git:// with http://, but that solution is... hideous. I'd like the modification(s) to be as small as possible and easily reproducible. But most of all, I'd like a working script.
EDIT: according to this page, the git URL is git://git.yoctoproject.org/linux-yocto-3.14, but the corresponding http URL is http://git.yoctoproject.org/git/linux-yocto-3.14, so I can't just parse and replace git:// with http://. Definitely not cool.

Well, rewriting the git URL does indeed work, also when using the Yocto Project.
However, your rewriting scheme doesn't work that well... You're just replacing the git:// part of the URL with http://, but if you look at e.g. linux-yocto-3.14, you'll see that this repo is available through the following two URLs:
git://git.yoctoproject.org/linux-yocto-3.14
http://git.yoctoproject.org/git/linux-yocto-3.14
That is, you need to rewrite git://git.yoctoproject.org to http://git.yoctoproject.org/git. Thus, you'll need to do this instead:
git config --global url."http://git.yoctoproject.org/git".insteadOf git://git.yoctoproject.org
Which means that you'll have to repeat this exercise for all repositories that are accessed through the git protocol.
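From Python (instead of the os.system() attempt in the question), the same rule could be applied with subprocess; the helper name is made up, and the rewrite pair is the yoctoproject.org one from above:

```python
import subprocess

def insteadof_argv(http_base, git_base):
    """Build the `git config --global` call for one URL rewrite rule
    (hypothetical helper)."""
    return ['git', 'config', '--global',
            f'url.{http_base}.insteadOf', git_base]

argv = insteadof_argv('http://git.yoctoproject.org/git',
                      'git://git.yoctoproject.org')
# subprocess.run(argv, check=True)  # uncomment to actually apply the rule
```

Passing an argument list also sidesteps the shell-quoting problem that the os.system() call in the question ran into.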

Related

option --decorate-refs is ignored when calling git-log from python subprocess

I am stuck at this error. I have tried searching for a bunch of things, and I tried following the call using a debugger. I am none the wiser.
My problem:
I run this command from command line
git log --format=format:%D --simplify-by-decoration --decorate-refs=*platVer*
and I get the expected list of tags:
tag: platVer/222.3.4123, tag: myplatVer-222.3.4123
tag: platVer-20.07.000
tag: platVer-20.06.000
tag: platVer-20.05.000
If I run this from Python on the command line, I also get the expected list:
>>> from subprocess import call, Popen, PIPE
>>> pp = Popen(['git', 'log', '--decorate-refs=*platVer*', '--format=format:%D', '--simplify-by-decoration'])
tag: platVer/222.3.4123, tag: myplatVer-222.3.4123
tag: platVer-20.07.000
tag: platVer-20.06.000
tag: platVer-20.05.000
Running this line in IDLE or in a script, the output is not captured (as expected); to enable capture of stdout, Popen needs its stdout parameter set to PIPE.
But if I run with stdout=PIPE, it appears to ignore the '--decorate-refs=*platVer*' and just lists the entire set of refs:
>>> pp = Popen(['git', 'log', '--decorate-refs=*platVer*', '--format=format:%D', '--simplify-by-decoration'], stdout=PIPE)
>>> pp.stdout.read()
b'HEAD -> feature/ps2python, origin/feature/ps2python\ntag: platVer/222.3.4123, tag: myplatVer-222.3.4123, tag: mao_test ....
I get the same result when I run this from a script or in IDLE.
from subprocess import Popen, PIPE
pp = Popen(['git', 'log', '--decorate-refs=*platVer*', '--format=format:%D', '--simplify-by-decoration'], stdout=PIPE)
print(pp.stdout.read().decode('ascii'))
gives me this
HEAD -> feature/ps2python, origin/feature/ps2python
tag: platVer/222.3.4123, tag: myplatVer-222.3.4123, tag: mao_test
show-current, develop
tag: platVer-20.07.000,
... (cut the remaining many many lines of refs)
I am running on Windows 10 (Version 10.0.18363.778)
git version 2.29.2.windows.2
python version 3.8.5
I tried with shell=True/False and universal_newlines=True/False.
I tried it in WSL (ubuntu)
All gave the same result.
Then I tried in a virtual Ubuntu 18.04 LTS with git version 2.17. And here I got the weird results, where '--decorate-refs=*platVer*' is ignored, from the command line.
I then updated git to a newer version (2.29.2) on this Ubuntu, and now the command works exactly as expected...
I then tried the same commands from Python, with the same result as on the Win10 machine.
Please help. I can't figure out how setting stdout=PIPE can change the behaviour of the git command.
Edit:
I did check that the same version of git is called with and without PIPE.
Edit2:
I marked torek's answer as accepted as it solves my question perfectly.
However, I should have stated the goal of my use of git-log to allow for broader answers.
My goal is to find the tag that is the first tag found when traveling
back in history (topological or graph ordering) and that matches a
regular expression.
I was previously using rev-list, but found no documentation that it would deliver tags in the order I wanted; maybe I missed something.
The reason I use a simple glob pattern in my command, when I state at the same time that I need a regex match, is that I assume the globbing to be faster, and therefore use it as a prefilter to shorten the list that has to be parsed by the regular expression in Python. I expect the list of tags, in a few years, to contain 1000+ tags and growing, where the tags with the word 'platVer' will be around 1% of that list.
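That regex prefilter over the glob-filtered git log output can be sketched as follows; the helper name and the sample lines (taken from the output above) are illustrative:

```python
import re

def first_matching_tag(decoration_lines, pattern):
    """Return the first tag name matching `pattern` from git log
    --format=%D output lines, newest first (illustrative helper)."""
    tag_re = re.compile(pattern)
    for line in decoration_lines:
        for ref in line.split(', '):
            if ref.startswith('tag: '):
                name = ref[len('tag: '):]
                if tag_re.search(name):
                    return name
    return None

lines = ['HEAD -> feature/ps2python, origin/feature/ps2python',
         'tag: platVer/222.3.4123, tag: myplatVer-222.3.4123',
         'tag: platVer-20.07.000']
print(first_matching_tag(lines, r'^platVer'))  # platVer/222.3.4123
```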
Add --decorate=full or --decorate=short to your git log arguments. You can also use --decorate=true or --decorate=1, but full and short are the documented values these days. full includes the full name (e.g., refs/heads/somebranch) while short shortens this to the branch or tag name.
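Applied to the call from the question, the fix is a single extra argument (a sketch; everything else unchanged):

```python
from subprocess import Popen, PIPE

# --decorate=short forces decorations on even though stdout is a pipe,
# where log.decorate=auto would resolve to "no".
cmd = ['git', 'log', '--decorate=short', '--decorate-refs=*platVer*',
       '--format=format:%D', '--simplify-by-decoration']
# pp = Popen(cmd, stdout=PIPE)  # now keeps only the platVer decorations
```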
Long (but optional) useful background info
The default log.decorate setting is auto (since Git 2.9, anyway; before that it was no/0/false, and at various points various bugs were introduced and then fixed in later versions; it's been stable since Git 2.13). The auto setting means short if a human is reading the output, no if a program is reading the output.1
The decorations themselves are required (i.e., must be turned on) for --simplify-by-decoration --decorate-refs=... to work. Probably either of these options should imply --decorate=short if it's currently still auto from being unset in the Git configuration.2
This all points to a more general problem with using git log programmatically, e.g., from Python with subprocess: git log is what Git calls a porcelain command, which means it obeys user configurations. If the user has a log.decorate setting, that overrides any defaults. Now that you know about log.decorate and the --decorate= argument, you can force correct behavior in your program using the --decorate= argument (which overrides any user configuration). But what other user-configurable items exist in git log that could break your program? What about future versions of Git, where git log might acquire new configuration items? There is nothing you can do about this more-general problem today, unfortunately, but since some things that git log does cannot be done by any of the so-called plumbing commands—these are commands that don't change behavior based on user configuration, and hence are useful from other programs as they have a known, fixed output format—git log needs an option to make it behave well. (The git status command has the --porcelain option for this; git log just needs its own version of that.)
1Git doesn't actually know if a human is reading the output. Instead, it approximates this by examining the standard output stream: if the standard output (file descriptor 1) responds with a true value for the isatty C library call, or git log output is being fed to a pager, it assumes a human is reading the output. Use of pipes in subprocess means that stdout is not a tty, which by default disables the pager too. However, there's a user configuration setting that forces the pager to be used: see the "more general problem" paragraph.
2In general, the way Git configurations work is this:
First, the program sets any automatic defaults, such as log.decorate=auto (this is typically just open-coded, rather than using the configuration mechanism).
Next, Git reads the system configuration file. If this has a setting such as log.decorate=short in it, that setting applies, overriding the automatic default. (This usually works through callbacks, from the configuration mechanism to the program.)
Next, Git reads your personal global configuration file. If this has a setting such as log.decorate=auto in it, that setting applies. If the previous configuration had a setting, this overwrites that previous setting.
Next, Git reads the configuration file for this particular Git repository. If this has a setting such as log.decorate=full, that setting applies, overwriting any previous setting as before.
Last, Git applies command-line argument settings. These therefore override any settings picked up in any of the previous steps.
This is how, for instance, you can arrange your user.name and/or user.email to be different for one particular Git repository. You set these in your global config, which Git reads before it reads the per-repository config; then you set them to the different value in the per-repository config, and that overrides the global config.
In relatively recent versions of Git, you can also set up a per-worktree configuration: git config --worktree. This is read after the per-repository config file, but used before command line arguments, so it has the second-highest priority. For the per-worktree setting to take effect, you must enable extensions.worktreeConfig. Be careful here as there were some bugs with this extension for a little while.

From Python (3.6), copy file(s) with a given user's permission to a root controlled space

I have a problem and a solution, but frankly I'm not very happy with my solution and think there might be something better.
What I want to achieve:
I start as root (this will be executed from cron eventually). I want to copy a file with a given path, which belongs to a user over to a space that root controls (which implies a change in permissions). This file may be large. I also want to obey the file access permissions of the particular given user while I read the file, as the script will be acting on behalf of the user (and not a different user). I'd also like to do some sensible debugging should some part of this copy fail.
The rest of my code is Python, so ideally I'd like a pure Python way to do this.
In BASH, I can do it like this:
sudo -u <user> dd if=<in_file> | dd of=<out_file>
I have omitted some other flags to simplify things. sudo loses scope after the pipe, so the copied file is written out as root, which is what I want.
After that command, I can query ${PIPESTATUS[*]} to see if the first part or the second part failed, without having to try and parse the error messages.
What I have done to Pythonize it
templateDD = "\
sudo -u {user} \
dd if='{inFile}' bs={blockSize} status=none | \
dd of='{outFile}' status=none ; \
echo ${{PIPESTATUS[*]}}\
"
subprocess.run(
    templateDD.format(**fileCopyD),
    shell=True,
    executable='/usr/bin/bash',
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)
I haven't included fileCopyD, but I'm sure imagination can fill that gap.
Returned is a CompletedProcess object with a single bytestring containing the two pipe exit statuses as stdout.
This is a bit ugly; in fact, having to make system calls at all is unpleasant, as one would rather control everything within Python. Secondly, this command is very dependent on BASH, as the PIPESTATUS array is specific to it. In general it is good for code to be more portable, even if BASH is found most places.
My thoughts on better solutions
The first part of this that I suspect can be improved is that I could probably get rid of the second dd and the querying of PIPESTATUS. The first dd would then write to stdout, which could probably be captured somehow, read into a buffer, and written out to a file within the scope of the Python code. I don't know how to do that, though, as even the subprocess.run command I have used is at the edge of my experience.
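A rough sketch of that direction, with hypothetical helper names: let Python, still running as root, open the destination file and hand it to the subprocess as stdout, so the second dd and the PIPESTATUS query disappear. The sudo/dd part is untested here:

```python
import subprocess

def dd_argv(user, in_file, block_size='1M'):
    """argv that reads `in_file` as `user` (illustrative helper)."""
    return ['sudo', '-u', user, 'dd',
            f'if={in_file}', f'bs={block_size}', 'status=none']

def copy_as_user(user, in_file, out_file, block_size='1M'):
    # Python (as root) opens the destination, so it is written with
    # root's permissions; dd reads with the user's permissions.
    with open(out_file, 'wb') as dst:
        proc = subprocess.run(dd_argv(user, in_file, block_size),
                              stdout=dst, stderr=subprocess.PIPE)
    return proc.returncode, proc.stderr
```

A nonzero return code plus the captured stderr then replaces the PIPESTATUS parsing.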
I wonder if the sudo -u <user> could also be replaced by something in Python, though? I had a hint that I might be able to use os.setuid and os.setgid. My perceived problems with this are:
1. I need to look up the UID of the user, but the UIDs are not in passwd; they come from SSSD.
2. I need to make sure that all group affiliations of the user are taken into account (also from SSSD).
3. I need to be confident that root has properly become the user with regard to all matters related to read access. I may not properly understand all the environment variables and such which must be set.
If problem 2 were solved, I guess that would solve my BASH problem, but I need to make sure that the copied file is written with root permission.
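One common pattern for the os.setuid/os.setgid idea (a sketch, untested against SSSD) is to drop privileges only in the child process via preexec_fn; since pwd.getpwnam and os.getgrouplist resolve users and groups through NSS, SSSD-provided accounts should be found the same way sudo finds them:

```python
import os
import pwd
import subprocess

def demote(username):
    """Return a preexec_fn that switches the child to `username`,
    including supplementary groups (sketch; the child must start as root)."""
    pw = pwd.getpwnam(username)
    groups = os.getgrouplist(pw.pw_name, pw.pw_gid)
    def _set_ids():
        os.setgroups(groups)   # supplementary groups first
        os.setgid(pw.pw_gid)   # then primary group
        os.setuid(pw.pw_uid)   # drop uid last
    return _set_ids

# with open(out_file, 'wb') as dst:
#     subprocess.run(['cat', in_file], stdout=dst,
#                    preexec_fn=demote('someuser'))
```

The parent keeps running as root, so the destination file it opened is still written with root's permissions.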
Many thanks in advance for help on this.

ServiceNow GlideRecord sysparm_query Python

Has anyone used the GlideRecord library for Python? I can't seem to get it to perform some fairly basic functionality. I want to add a few sysparm_query parameters. This is just a code snippet; I had to edit it manually for security purposes. Hopefully I didn't introduce any typos.
for i in glide1, glide2:
    i.set_credentials('xxxx', 'xxxx')
    i.set_server("https://<instance>.service-now.com/")
    i.addQuery("active", "true")

def getIncidents(glide1):
    group = "mygroup"
    glide1.addQuery('assignment_group', group)
    print glide1.query_data['sysparm_query'] + '\n'
    print glide1.getQuery()[50:]  # just to avoid too much output
gives me the output:
active=true^assignment_group=mygroup
displayvalue=true&JSONv2&sysparm_record_count=100&sysparm_action=getRecords&sysparm_query=
I cannot get the query data to append. Perhaps I should look at doing the queries manually? Here is a link to the GlideRecord git:
https://github.com/bazizi/ServiceNow_GlideRecord_API/blob/master/GlideRecord/init.py
Cheers, Arthur
I just realized that the getQuery() member function I had defined only returned the base query URL (not including the query itself). I had initially added this function for testing purposes, and mistakenly added it to the documentation.
I have just fixed this issue and committed to the GitHub repository. Please pull from the git repository again, or, if you installed using pip, run the following commands to re-install it from scratch:
pip uninstall GlideRecord
pip install GlideRecord
In terms of setting the assignment group by name, however, I still need to find out how ServiceNow hashes the assignment_group, or whether there is another way this query can be added; that is, I have no fix for now.
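One possible workaround, assuming ServiceNow's encoded-query syntax rather than anything specific to this library: reference fields can be dot-walked in a query, so the group could be matched by name (assignment_group.name=mygroup) instead of by sys_id. Against the JSONv2 interface visible in the question's output, a hand-built request URL might look like this (instance and table names are placeholders):

```python
import urllib.parse

def records_url(instance, table, encoded_query):
    """Build a JSONv2 getRecords URL (illustrative helper)."""
    params = {'JSONv2': '', 'sysparm_action': 'getRecords',
              'sysparm_query': encoded_query}
    return (f'https://{instance}.service-now.com/{table}.do?'
            + urllib.parse.urlencode(params))

url = records_url('myinstance', 'incident',
                  'active=true^assignment_group.name=mygroup')
```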
Thanks
Behnam

How to setup environment variables with behave (Python BDD framework)?

So our test environments dynamically change depending on the release that we are working on.
For example:
for the abc release, the URL for the test environment would be feature-abc.mycompany.com; for the xyz release, it would be feature-xyz.mycompany.com; and so on and so forth.
The same goes for staging: release-abc.mycompany.com, release-xyz.mycompany.com, etc.
Production is just a static URL: platform.mycompany.com
With this being said, I need to specify on which URL I would like my tests to be executed using behave BDD framework for Python.
To be specific, I'm looking for the equivalent of the functionality Cucumber has for Ruby, where the features/support/env.rb file defines multiple URLs (qa, staging, production, etc.), so that on the command line (terminal) I would just say xyz (with qa = feature-xyz.mycompany.com).
Something like: How can I test different environments (e.g. development|test|production) in Cucumber?
OK, so for this there is a pull request (PR #243) in behave's GitHub repo to allow exactly this.
In the meantime, as a workaround, they suggested I use os.getenv('variable_name', 'default_value'), and then on the command line I would just say export variable_name='another_value' ; behave.
Please see more detailed on this on our short thread:
https://github.com/behave/behave/issues/250
behave-1.2.5 introduced the userdata concept.
behave -D BUILD_STAGE=develop …
Load the corresponding configuration for this stage in the before_all() hook.
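A minimal sketch of that hook, using the stage names and URLs from the question (the mapping itself is illustrative):

```python
# environment.py
URLS = {
    'develop': 'feature-abc.mycompany.com',
    'staging': 'release-abc.mycompany.com',
    'production': 'platform.mycompany.com',
}

def base_url(userdata):
    """Pick the URL for the stage given in behave's -D userdata."""
    return URLS[userdata.get('BUILD_STAGE', 'develop')]

def before_all(context):  # behave hook, runs once before all features
    context.base_url = base_url(context.config.userdata)
```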

How do I apply a patch from gist to the Django source?

I want to try out a patch on gist that modifies the source code of Django:
gist: 550436
How do I do it? I have never used git so a step by step instruction would be greatly appreciated.
You can use patch to apply diffs. Make sure you're in your Django source directory (or wherever you want to apply the patch), and run something like patch -p1 < downloaded-patch.diff.
You may want to experiment with the -p argument if it fails; -p tells patch how many leading directory components to strip from each file path in the diff (look at the first lines of the diff).
