Gittle. How to get a true list of modified files? - python

I try to get a list of all modified files like this:
repo_path = '.'
repo_url = 'git#github.com:username/myfolder.git'
repo = Gittle(repo_path, origin_uri = repo_url)
repo.modified_files
But the list of files I get is rather huge and contains garbage. In fact, it has nothing to do with the list that I get if I run in console:
$ git status
So, my question is how to get a true list of modified files with Gittle library. I want this list in order to perform addition and commits.

Sounds like you are executing from the wrong path.
import os
path = '/path/to/repo' # or use input from command line
savedPath = os.getcwd()
# do init stuff
os.chdir(path)
# do repo stuff
repo_path = '.'
repo_url = 'git#github.com:username/myfolder.git'
repo = Gittle(repo_path, origin_uri=repo_url)
print repo.modified_files
# finish repo stuff
os.chdir(savedPath)
# do other stuff

Related

GitPython with remotes

I'm trying to solve an easy task with git in python. Actually I just want something like this
git clome remote-repo
git pull
edit file
git commit file
git push
rm -rf remote-repo
So just change a file and commit it and push it right away. But I have my problems get my head arround the documentation http://gitpython.readthedocs.org/en/stable/tutorial.html
This is what I got so far
import git
import shutil
import os
repo_url = 'git#git.internal.server:users/me/mytest'
repo_dir = '/tmp/test/repo'
work_file_name = 'testFile'
work_file = os.path.join(repo_dir, work_file_name)
if os.path.isdir(repo_dir):
shutil.rmtree(repo_dir)
repo = git.Repo.clone_from(repo_url, repo_dir)
git = repo.git
git.pull()
new_file_path = os.path.join(repo.working_tree_dir, work_file_name)
f = open(new_file_path, 'w')
f.write("just some test")
f.close()
repo.index.commit("test commit")
git.push()
shutil.rmtree(repo_dir)
It actually creates a commit and pushes it to the server, but it contains no changes, so the file I tried to write is empty. Any idea what I'm missing?
You have to stage the file/change before the commit:
repo.index.add([new_file_path])
As a side note: The following line overrides your import and might lead to errors if you ever expand your script:
git = repo.git

git push through python subprocess.Popen() with authentication

I am trying to use git with python subprocess.Popen()
So far this is my code
import subprocess
gitPath = 'C:/path/to/git/cmd.exe'
repoPath = 'C:/path/to/my/repo'
repoUrl = 'https://www.github.com/login/repo'
#list to set directory and working tree
dirList = ['--git-dir='+repoPath+'/.git','--work-tree='+repoPath]
#init git
subprocess.Popen([gitPath] + ['init',repoPath],cwd=repoPath)
#add remote
subprocess.Popen([gitPath] + dirList + ['remote','add','origin',repoUrl],cwd=repoPath)
#Check status, returns files to be committed etc, so a working repo exists there
subprocess.Popen([gitPath] + dirList + ['status'],cwd=repoPath)
#Adds all files in folder
subprocess.Popen([gitPath] + dirList + ['add','.'],cwd=repoPath)
#Push, gives error:
subprocess.Popen([gitPath] + dirList + ['push','origin','master],cwd=repoPath)
This works, except for the last command. That's where I get this error:
bash.exe: warning: could not find /tmp, please create!
fatal: 'git' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Of course I wouldn't expect it to work, since I did not put my login details anywhere in the code. I do not have any idea how I can add it though. I have a folder /.ssh in the C:/users/myUser directory. I tried changing the last line of my code to this:
env = {'HOME' : 'C:/users/myUser'}
subprocess.Popen([gitPath] + dirList + ['push','origin','master'],cwd=repoPath,env=env)
in the hope of git finding the /.ssh folder, but without any luck. I also tried without 'dirList', but it didn't matter. I also tried changing the name 'origin' into an url, but that also didn't work.
I do not mind if I am using the .ssh keys I already created, or if I have to use a method with login/password. I am not looking to use a git library though.
There might be a race condition in this script. Earlier subprocess
might not finish while a next one errors out because a git command depends on the previous ones. Instead it is possible to start a subprocess and wait until it is finished with
subprocess.run() or subprocess.check_ouput(). Might still need adjustment depending on layout
import subprocess
import shutil
# get a location of git
gitPath = shutil.which('git')
repoPath = 'C:/path/to/my/repo'
repoUrl = 'https://www.github.com/login/repo'
# list to set directory and working tree
dirList = ['--git-dir='+repoPath+'/.git', '--work-tree='+repoPath]
# init git
subprocess.run([gitPath] + ['init', repoPath], cwd=repoPath)
# add remote
subprocess.run([gitPath] + dirList + ['remote', 'add', 'origin', repoUrl], cwd=repoPath)
# check status, returns files to be committed etc, so a working repo exists there
subprocess.run([gitPath] + dirList + ['status'], cwd=repoPath)
# adds all files in folder
subprocess.run([gitPath] + dirList + ['add', '.'], cwd=repoPath)
# push
subprocess.run([gitPath] + dirList + ['push', 'origin', 'main'], cwd=repoPath)
For credentials could use Windows Credential Manager, cache, manager, --global, etc. - depends on what's needed.
git config --global credential.helper cache
# or in python
subprocess.run([gitPath] + ['config', 'credential.helper', 'cache', repoPath], cwd=repoPath)

How to create symlinks in windows using Python?

I am trying to create symlinks using Python on Windows 8. I found This Post and this is part of my script.
import os
link_dst = unicode(os.path.join(style_path, album_path))
link_src = unicode(album_path)
kdll = ctypes.windll.LoadLibrary("kernel32.dll")
kdll.CreateSymbolicLinkW(link_dst, link_src, 1)
Firstly, It can create symlinks only when it is executed through administrator cmd. Why is that happening?
Secondly, When I am trying to open those symlinks from windows explorer I get This Error:
...Directory is not accessible. The Name Of The File Cannot Be Resolved By The System.
Is there a better way of creating symlinks using Python? If not, How can I solve this?
EDIT
This is the for loop in album_linker:
def album_Linker(album_path, album_Genre, album_Style):
genre_basedir = "E:\Music\#02.Genre"
artist_basedir = "E:\Music\#03.Artist"
release_data_basedir = "E:\Music\#04.ReleaseDate"
for genre in os.listdir(genre_basedir):
genre_path = os.path.join(genre_basedir, "_" + album_Genre)
if not os.path.isdir(genre_path):
os.mkdir(genre_path)
album_Style_list = album_Style.split(', ')
print album_Style_list
for style in album_Style_list:
style_path = os.path.join(genre_path, "_" + style)
if not os.path.isdir(style_path):
os.mkdir(style_path)
album_path_list = album_path.split("_")
print album_path_list
#link_dst = unicode(os.path.join(style_path, album_path_list[2] + "_" + album_path_list[1] + "_" + album_path_list[0]))
link_dst = unicode(os.path.join(style_path, album_path))
link_src = unicode(album_path)
kdll = ctypes.windll.LoadLibrary("kernel32.dll")
kdll.CreateSymbolicLinkW(link_dst, link_src, 1)
It takes album_Genre and album_Style And then It creates directories under E:\Music\#02.Genre . It also takes album_path from the main body of the script. This album_path is the path of directory which i want to create the symlink under E:\Music\#02.Genre\Genre\Style . So album_path is a variable taken from another for loop in the main body of the script
for label in os.listdir(basedir):
label_path = os.path.join(basedir, label)
for album in os.listdir(label_path):
album_path = os.path.join(label_path, album)
if not os.path.isdir(album_path):
# Not A Directory
continue
else:
# Is A Directory
os.mkdir(os.path.join(album_path + ".copy"))
# Let Us Count
j = 1
z = 0
# Change Directory
os.chdir(album_path)
Firstly, It can create symlinks only when it is executed through administrator cmd.
Users need "Create symbolic links" rights to create a symlink. By default, normal users don't have it but administrator does. One way to change that is with the security policy editor. Open a command prompt as administrator, run secpol.msc and then go to Security Settings\Local Policies\User Rights Assignment\Create symbolic links to make the change.
Secondly, When I am trying to open those symlinks from windows explorer I get This Error:
You aren't escaping the backslashes in the file name. Just by adding an "r" to the front for a raw string, the file name changes. You are setting a non-existant file name and so explorer can't find it.
>>> link_dst1 = "E:\Music\#02.Genre_Electronic_Bass Music\1-800Dinosaur-1-800-001_[JamesBlake-Voyeur(Dub)AndHolyGhost]_2013-05-00"
>>> link_dst2 = r"E:\Music\#02.Genre_Electronic_Bass Music\1-800Dinosaur-1-800-001_[JamesBlake-Voyeur(Dub)AndHolyGhost]_2013-05-00"
>>> link_dst1 == link_dst2
False
>>> print link_dst1
E:\Music\#02.Genre_Electronic_Bass Music☺-800Dinosaur-1-800-001_[JamesBlake-Voyeur(Dub)AndHolyGhost]_2013-05-00
os.symlink works out of the box since python 3.8 on windows, as long as Developer Mode is turned on.
If you're just trying to create a link to a directory, you could also create a "Junction", no admin privileges required:
import os
import _winapi
src_dir = "C:/Users/joe/Desktop/my_existing_folder"
dst_dir = "C:/Users/joe/Desktop/generated_link"
src_dir = os.path.normpath(os.path.realpath(src_dir))
dst_dir = os.path.normpath(os.path.realpath(dst_dir))
if not os.path.exists(dst_dir):
os.makedirs(os.path.dirname(dst_dir), exist_ok=True)
_winapi.CreateJunction(src_dir, dst_dir)

Get the current git hash in a Python script

I would like to include the current git hash in the output of a Python script (as a the version number of the code that generated that output).
How can I access the current git hash in my Python script?
No need to hack around getting data from the git command yourself. GitPython is a very nice way to do this and a lot of other git stuff. It even has "best effort" support for Windows.
After pip install gitpython you can do
import git
repo = git.Repo(search_parent_directories=True)
sha = repo.head.object.hexsha
Something to consider when using this library. The following is taken from gitpython.readthedocs.io
Leakage of System Resources
GitPython is not suited for long-running processes (like daemons) as it tends to leak system resources. It was written in a time where destructors (as implemented in the __del__ method) still ran deterministically.
In case you still want to use it in such a context, you will want to search the codebase for __del__ implementations and call these yourself when you see fit.
Another way assure proper cleanup of resources is to factor out GitPython into a separate process which can be dropped periodically
This post contains the command, Greg's answer contains the subprocess command.
import subprocess
def get_git_revision_hash() -> str:
return subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()
def get_git_revision_short_hash() -> str:
return subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip()
when running
print(get_git_revision_hash())
print(get_git_revision_short_hash())
you get output:
fd1cd173fc834f62fa7db3034efc5b8e0f3b43fe
fd1cd17
The git describe command is a good way of creating a human-presentable "version number" of the code. From the examples in the documentation:
With something like git.git current tree, I get:
[torvalds#g5 git]$ git describe parent
v1.0.4-14-g2414721
i.e. the current head of my "parent" branch is based on v1.0.4, but since it has a few commits on top of that, describe has added the number of additional commits ("14") and an abbreviated object name for the commit itself ("2414721") at the end.
From within Python, you can do something like the following:
import subprocess
label = subprocess.check_output(["git", "describe"]).strip()
Here's a more complete version of Greg's answer:
import subprocess
print(subprocess.check_output(["git", "describe", "--always"]).strip().decode())
Or, if the script is being called from outside the repo:
import subprocess, os
print(subprocess.check_output(["git", "describe", "--always"], cwd=os.path.dirname(os.path.abspath(__file__))).strip().decode())
Or, if the script is being called from outside the repo and you like pathlib:
import subprocess
from pathlib import Path
print(subprocess.check_output(["git", "describe", "--always"], cwd=Path(__file__).resolve().parent).strip().decode())
numpy has a nice looking multi-platform routine in its setup.py:
import os
import subprocess
# Return the git revision as a string
def git_version():
def _minimal_ext_cmd(cmd):
# construct minimal environment
env = {}
for k in ['SYSTEMROOT', 'PATH']:
v = os.environ.get(k)
if v is not None:
env[k] = v
# LANGUAGE is used on win32
env['LANGUAGE'] = 'C'
env['LANG'] = 'C'
env['LC_ALL'] = 'C'
out = subprocess.Popen(cmd, stdout = subprocess.PIPE, env=env).communicate()[0]
return out
try:
out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD'])
GIT_REVISION = out.strip().decode('ascii')
except OSError:
GIT_REVISION = "Unknown"
return GIT_REVISION
If subprocess isn't portable and you don't want to install a package to do something this simple you can also do this.
import pathlib
def get_git_revision(base_path):
git_dir = pathlib.Path(base_path) / '.git'
with (git_dir / 'HEAD').open('r') as head:
ref = head.readline().split(' ')[-1].strip()
with (git_dir / ref).open('r') as git_hash:
return git_hash.readline().strip()
I've only tested this on my repos but it seems to work pretty consistantly.
This is an improvement of Yuji 'Tomita' Tomita answer.
import subprocess
def get_git_revision_hash():
full_hash = subprocess.check_output(['git', 'rev-parse', 'HEAD'])
full_hash = str(full_hash, "utf-8").strip()
return full_hash
def get_git_revision_short_hash():
short_hash = subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD'])
short_hash = str(short_hash, "utf-8").strip()
return short_hash
print(get_git_revision_hash())
print(get_git_revision_short_hash())
if you want a bit more data than the hash, you can use git-log:
import subprocess
def get_git_hash():
return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%H']).strip()
def get_git_short_hash():
return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%h']).strip()
def get_git_short_hash_and_commit_date():
return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%h-%ad', '--date=short']).strip()
for full list of formating options - check out git log --help
I ran across this problem and solved it by implementing this function.
https://gist.github.com/NaelsonDouglas/9bc3bfa26deec7827cb87816cad88d59
from pathlib import Path
def get_commit(repo_path):
git_folder = Path(repo_path,'.git')
head_name = Path(git_folder, 'HEAD').read_text().split('\n')[0].split(' ')[-1]
head_ref = Path(git_folder,head_name)
commit = head_ref.read_text().replace('\n','')
return commit
r = get_commit('PATH OF YOUR CLONED REPOSITORY')
print(r)
I had a problem similar to the OP, but in my case I'm delivering the source code to my client as a zip file and, although I know they will have python installed, I cannot assume they will have git. Since the OP didn't specify his operating system and if he has git installed, I think I can contribute here.
To get only the hash of the commit, Naelson Douglas's answer was perfect, but to have the tag name, I'm using the dulwich python package. It's a simplified git client in python.
After installing the package with pip install dulwich --global-option="--pure" one can do:
from dulwich import porcelain
def get_git_revision(base_path):
return porcelain.describe(base_path)
r = get_git_revision("PATH OF YOUR REPOSITORY's ROOT FOLDER")
print(r)
I've just run this code in one repository here and it showed the output v0.1.2-1-gfb41223, similar to what is returned by git describe, meaning that I'm 1 commit after the tag v0.1.2 and the 7-digit hash of the commit is fb41223.
It has some limitations: currently it doesn't have an option to show if a repository is dirty and it always shows a 7-digit hash, but there's no need to have git installed, so one can choose the trade-off.
Edit: in case of errors in the command pip install due to the option --pure (the issue is explained here), pick one of the two possible solutions:
Install Dulwich package's dependencies first:
pip install urllib3 certifi && pip install dulwich --global-option="--pure"
Install without the option pure: pip install dulwich. This will install some platform dependent files in your system, but it will improve the package's performance.
If you don't have Git available for some reason, but you have the git repo (.git folder is found), you can fetch the commit hash from .git/fetch/heads/[branch].
For example, I've used a following quick-and-dirty Python snippet run at the repository root to get the commit id:
git_head = '.git\\HEAD'
# Open .git\HEAD file:
with open(git_head, 'r') as git_head_file:
# Contains e.g. ref: ref/heads/master if on "master"
git_head_data = str(git_head_file.read())
# Open the correct file in .git\ref\heads\[branch]
git_head_ref = '.git\\%s' % git_head_data.split(' ')[1].replace('/', '\\').strip()
# Get the commit hash ([:7] used to get "--short")
with open(git_head_ref, 'r') as git_head_ref_file:
commit_id = git_head_ref_file.read().strip()[:7]
If you are like me :
Multiplatform so subprocess may crash one day
Using Python 2.7 so GitPython not available
Don't want to use Numpy just for that
Already using Sentry (old depreciated version : raven)
Then (this will not work on shell because shell doesn't detect current file path, replace BASE_DIR by your current file path) :
import os
import raven
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
print(raven.fetch_git_sha(BASE_DIR))
That's it.
I was looking for another solution because I wanted to migrate to sentry_sdk and leave raven but maybe some of you want to continue using raven for a while.
Here was the discussion that get me into this stackoverflow issue
So using the code of raven without raven is also possible (see discussion) :
from __future__ import absolute_import
import os.path
__all__ = 'fetch_git_sha'
def fetch_git_sha(path, head=None):
"""
>>> fetch_git_sha(os.path.dirname(__file__))
"""
if not head:
head_path = os.path.join(path, '.git', 'HEAD')
with open(head_path, 'r') as fp:
head = fp.read().strip()
if head.startswith('ref: '):
head = head[5:]
revision_file = os.path.join(
path, '.git', *head.split('/')
)
else:
return head
else:
revision_file = os.path.join(path, '.git', 'refs', 'heads', head)
if not os.path.exists(revision_file):
# Check for Raven .git/packed-refs' file since a `git gc` may have run
# https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery
packed_file = os.path.join(path, '.git', 'packed-refs')
if os.path.exists(packed_file):
with open(packed_file) as fh:
for line in fh:
line = line.rstrip()
if line and line[:1] not in ('#', '^'):
try:
revision, ref = line.split(' ', 1)
except ValueError:
continue
if ref == head:
return revision
with open(revision_file) as fh:
return fh.read().strip()
I named this file versioning.py and I import "fetch_git_sha" where I need it passing file path as argument.
Hope it will help some of you ;)

How do you define the host for Fabric to push to GitHub?

Original Question
I've got some python scripts which have been using Amazon S3 to upload screenshots taken following Selenium tests within the script.
Now we're moving from S3 to use GitHub so I've found GitPython but can't see how you use it to actually commit to the local repo and push to the server.
My script builds a directory structure similar to \images\228M\View_Use_Case\1.png in the workspace and when uploading to S3 it was a simple process;
for root, dirs, files in os.walk(imagesPath):
for name in files:
filename = os.path.join(root, name)
k = bucket.new_key('{0}/{1}/{2}'.format(revisionNumber, images_process, name)) # returns a new key object
k.set_contents_from_filename(filename, policy='public-read') # opens local file buffers to key on S3
k.set_metadata('Content-Type', 'image/png')
Is there something similar for this or is there something as simple as a bash type git add images command in GitPython that I've completely missed?
Updated with Fabric
So I've installed Fabric on kracekumar's recommendation but I can't find docs on how to define the (GitHub) hosts.
My script is pretty simple to just try and get the upload to work;
from __future__ import with_statement
from fabric.api import *
from fabric.contrib.console import confirm
import os
def git_server():
env.hosts = ['github.com']
env.user = 'git'
env.passowrd = 'password'
def test():
process = 'View Employee'
os.chdir('\Work\BPTRTI\main\employer_toolkit')
with cd('\Work\BPTRTI\main\employer_toolkit'):
result = local('ant viewEmployee_git')
if result.failed and not confirm("Tests failed. Continue anyway?"):
abort("Aborting at user request.")
def deploy():
process = "View Employee"
os.chdir('\Documents and Settings\markw\GitTest')
with cd('\Documents and Settings\markw\GitTest'):
local('git add images')
local('git commit -m "Latest Selenium screenshots for %s"' % (process))
local('git push -u origin master')
def viewEmployee():
#test()
deploy()
It Works \o/ Hurrah.
You should look into Fabric. http://docs.fabfile.org/en/1.4.1/index.html. Automated server deployment tool. I have been using this quite some time, it works pretty fine.
Here is my one of the application which uses it, https://github.com/kracekumar/sachintweets/blob/master/fabfile.py
It looks like you can do this:
index = repo.index
index.add(['images'])
new_commit = index.commit("my commit message")
and then, assuming you have origin as the default remote:
origin = repo.remotes.origin
origin.push()

Categories