How to clone a private repository from GitHub using Python?
I found some good information about git and Python, but I only started learning Python a few days ago.
Just run the git command with subprocess.check_call:
import subprocess
subprocess.check_call(["git", "clone", ...])
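The `...` above stands for the repository URL and, optionally, a target directory. A minimal sketch of how that call could be wrapped, assuming hypothetical helper names `build_clone_command` and `clone` (neither comes from the answer above):

```python
import subprocess


def build_clone_command(repo_url, dest_dir, depth=None):
    """Return the argument list for `git clone` (hypothetical helper)."""
    cmd = ["git", "clone"]
    if depth is not None:
        # --depth N asks git for a shallow clone with only the last N commits
        cmd += ["--depth", str(depth)]
    # "--" keeps a URL or path starting with "-" from being parsed as an option
    cmd += ["--", repo_url, dest_dir]
    return cmd


def clone(repo_url, dest_dir, depth=None):
    """Run git clone; raises subprocess.CalledProcessError if git fails."""
    subprocess.check_call(build_clone_command(repo_url, dest_dir, depth))
```

Passing the arguments as a list avoids shell quoting problems, and `check_call` raises an exception when git exits with a non-zero status, so a failed clone does not go unnoticed.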
There is a library, libgit2, which allows git to be used as a shared library. More helpful to your cause are its Python bindings, pygit2.
To answer your question, here is how to clone a repo with pygit2:
>>> from pygit2 import clone_repository
>>> repo_url = 'git://github.com/libgit2/pygit2.git'
>>> repo_path = '/path/to/create/repository'
>>> repo = clone_repository(repo_url, repo_path) # Clones a non-bare repository
>>> repo = clone_repository(repo_url, repo_path, bare=True) # Clones a bare repository
You can view the repository based docs here
Here are my two cents since there's no answer yet to the repo being private. The way I usually do it is I create a special SSH key pair for the script and upload the public one to GitHub (or whatever hosting you're using).
You can have the script use the private key by running:
GIT_SSH_COMMAND='ssh -i private_key_file' git clone git@github.com:user/repo.git
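Since the question asks for Python, the same idea can be sketched with subprocess by passing a modified environment to git. The helper names `ssh_key_env` and `clone_with_key` are made up for this illustration, and the key path and repository URL are placeholders:

```python
import os
import shlex
import subprocess


def ssh_key_env(private_key_file, base_env=None):
    """Return a copy of the environment with GIT_SSH_COMMAND set."""
    env = dict(os.environ if base_env is None else base_env)
    # shlex.quote protects key paths that contain spaces
    env["GIT_SSH_COMMAND"] = "ssh -i " + shlex.quote(private_key_file)
    return env


def clone_with_key(repo_url, dest_dir, private_key_file):
    """Clone over SSH using the given private key (hypothetical helper)."""
    subprocess.check_call(
        ["git", "clone", repo_url, dest_dir],
        env=ssh_key_env(private_key_file),
    )
```

Copying the environment instead of changing `os.environ` keeps the key setting local to this one git call.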
import pygit2
repo_url = 'https://github.com/libgit2/pygit2.git'  # token authentication needs an HTTPS URL; git:// has no authentication
repo_path = '/path/to/create/repository'
callbacks = pygit2.RemoteCallbacks(pygit2.UserPass("<your-personal-token>", 'x-oauth-basic'))
repo = pygit2.clone_repository(repo_url, repo_path, callbacks=callbacks)
Related
I have a python script.
I need to access the name of the git branch from which I'm running the python script, through python, during runtime.
Is there a way to do this?
Edit:
os.system("git rev-parse --abbrev-ref HEAD") outputs to the cli, I don't see how I would get access to it from python...
I would like to have something like git_branch = <python commands>
You could use GitPython, something like the following:
>>> import git
>>> import os
>>> git_branch = git.Repo(os.getcwd()).active_branch.name
>>> git_branch
'master'
Otherwise, as Yevhen Kuzmovych already pointed out in the comments, you could use pygit2.
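If you would rather not install a library, the command from the question can also be captured with subprocess instead of os.system. A minimal sketch; the function name `current_branch` and the injectable `runner` parameter are assumptions added here so the function can be tried without a real repository:

```python
import subprocess


def current_branch(repo_dir=".", runner=subprocess.check_output):
    """Return the checked-out branch name, or "HEAD" when detached."""
    out = runner(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir,
        text=True,  # decode the output to str instead of bytes
    )
    return out.strip()
```

With that in place, `git_branch = current_branch()` gives the name as a plain string.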
I have a private GitHub repository which has 3 .json files in its root directory.
Suppose the three json files are:
1.json
2.json
3.json
I am trying to write a function to which I can pass any one of the .json files along with its contents, and which then makes a commit and pushes the change.
I tried the solution from this question, but it seems outdated or unsupported: Python update files on Github remote repo without local working directory
The function should look like this:
def update_file_to_repo(file_name, file_content):
    # Do the push..
file_name holds 1.json or any other file name as a string, and file_content holds the contents as a string, produced with json.dumps() in my main function.
I didn't find any way to do it without cloning the repo to a local directory.
However, here's a way to do it using a local directory:
Function for cloning the private repo to a local directory. You will also need to create a Personal Access Token (you can create one here) and make sure to give it repo permissions for cloning and making changes to the private repo.
import os

def initialize_repo():
    os.system("git config --global user.name \"your.username\"")
    os.system("git config --global user.email \"your_github_email_here\"")
    os.system(r"git clone https://username:token@github.com/username/repo.git folder-name")
Function for pulling the repo:
def pull_repo():
    os.system(r"cd /app/folder-name/ && git pull origin master")
    return
Function for pushing the repo:
from git import Repo

def push(pull="no"):
    PATH_OF_GIT_REPO = r'/app/folder-name/.git'  # make sure .git folder is properly configured
    COMMIT_MESSAGE = 'commit done by python script'
    # try:
    repo = Repo(PATH_OF_GIT_REPO)
    if pull == "yes":
        pull_repo()
    repo.git.add(update=True)  # stages changes to already-tracked files only
    repo.index.commit(COMMIT_MESSAGE)
    origin = repo.remote(name='origin')
    origin.push()
    # except:
    #     print('Some error occurred while pushing the code')
    return
Now using these functions, just make a pull, then make the changes to the json or any other file you want and then do the push :)
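For the original goal of updating a file without any local clone, one thing that may be worth trying is GitHub's REST "contents" endpoint (PUT /repos/{owner}/{repo}/contents/{path}). The following is only a sketch under assumptions: the extra parameters `owner`, `repo` and `token`, the helper names `build_payload` and `_request`, and the fixed commit message are all made up for this example and are not part of the question's signature.

```python
import base64
import json
import urllib.error
import urllib.request

API = "https://api.github.com"


def build_payload(file_content, message, sha=None):
    """Build the JSON body for the contents endpoint."""
    payload = {
        "message": message,
        # the API expects the file body base64-encoded
        "content": base64.b64encode(file_content.encode("utf-8")).decode("ascii"),
    }
    if sha is not None:
        # blob SHA of the current version, needed when replacing a file
        payload["sha"] = sha
    return payload


def _request(method, url, token, body=None):
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", "Bearer " + token)
    req.add_header("Accept", "application/vnd.github+json")
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def update_file_to_repo(owner, repo, token, file_name, file_content):
    url = f"{API}/repos/{owner}/{repo}/contents/{file_name}"
    try:
        sha = _request("GET", url, token)["sha"]
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise
        sha = None  # file does not exist yet, so it will be created
    body = build_payload(file_content, "update " + file_name, sha)
    return _request("PUT", url, token, body)
```

Each call creates one commit on the default branch, so this suits small, occasional updates such as the three .json files better than bulk changes.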
I want to download a single file from my git repository using Python.
Currently I am using the GitPython library. Cloning works fine with the code below, but I don't want to download the entire repository.
import os
from git import Repo
git_url = 'stack@127.0.1.7:/home2/git/stack.git'
repo_dir = '/root/gitrepo/'
if __name__ == "__main__":
    Repo.clone_from(git_url, repo_dir, branch='master', bare=True)
    print("OK")
Don't think of a Git repo as a collection of files, but a collection of snapshots. Git doesn't allow you to select what files you download, but allows you to select how many snapshots you download:
git clone stack@127.0.1.7:/home2/git/stack.git
will download all snapshots for all files, while
git clone --depth 1 stack@127.0.1.7:/home2/git/stack.git
will only download the latest snapshot of all files. You will still download all files, but at least leave out all of their history.
Of these files you can simply select the one you want, and delete the rest:
import os
import git
import shutil
import tempfile
# Create temporary dir
t = tempfile.mkdtemp()
# Clone into temporary dir
git.Repo.clone_from('stack@127.0.1.7:/home2/git/stack.git', t, branch='master', depth=1)
# Copy desired file from temporary dir
shutil.move(os.path.join(t, 'setup.py'), '.')
# Remove temporary dir
shutil.rmtree(t)
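A variation on the steps above, as a sketch: tempfile.TemporaryDirectory removes the temporary clone even if copying fails halfway. The function names `shallow_clone` and `fetch_single_file` and the injectable `clone` parameter are assumptions made for this example, and the clone is done with the git command line rather than GitPython.

```python
import os
import shutil
import subprocess
import tempfile


def shallow_clone(repo_url, target_dir):
    """Clone only the latest snapshot into target_dir."""
    subprocess.check_call(["git", "clone", "--depth", "1", repo_url, target_dir])


def fetch_single_file(repo_url, file_path, dest_dir=".", clone=shallow_clone):
    """Shallow-clone into a temporary directory and keep only one file."""
    with tempfile.TemporaryDirectory() as tmp:
        clone(repo_url, tmp)
        # shutil.copy returns the path of the newly created file
        return shutil.copy(os.path.join(tmp, file_path), dest_dir)
```

This still transfers the whole latest snapshot over the network; it only tidies up what is left on disk afterwards.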
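You can also use subprocess in Python:
import subprocess
args = ['git', 'clone', '--depth=1', 'stack@127.0.1.7:/home2/git/stack.git']
# capture both streams; git clone writes its progress messages to stderr even on success
res = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, _error = res.communicate()
# the exit status, not the presence of stderr output, tells whether the clone worked
if res.returncode == 0:
    print(output)
else:
    print(_error)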
However, your main problem remains.
Git does not support downloading parts of the repository. You have to download all of it. But you should be able to do this with GitHub. Reference
You need to request the raw version of the file! You can get it from raw.githubusercontent.com (formerly raw.github.com).
I don't want to flag this as a direct duplicate, since it does not fully reflect the scope of this question, but part of what Lucifer said in his answer seems the way to go, according to this SO post. In short, git does not allow for a partial download, but certain providers (like GitHub) do, via raw content.
That being said, Python does provide quite a number of different libraries for downloading files, with the best-known being urllib.request.
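As a rough sketch of the raw-content route with urllib.request: the helper names `raw_file_url` and `download_file` are made up here, and sending a personal access token in the Authorization header for private repositories is an assumption worth verifying against GitHub's documentation. This only applies to repositories hosted on GitHub, not to a self-hosted git server like the one in the question.

```python
import urllib.parse
import urllib.request


def raw_file_url(owner, repo, ref, path):
    """Build a raw.githubusercontent.com URL for one file at one ref."""
    quoted = urllib.parse.quote(path)  # keeps "/" but escapes spaces etc.
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{quoted}"


def download_file(url, dest_path, token=None):
    """Download url to dest_path, optionally sending a token."""
    req = urllib.request.Request(url)
    if token:
        # assumption: a personal access token is accepted for private repos
        req.add_header("Authorization", "token " + token)
    with urllib.request.urlopen(req) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
```

Here `ref` can be a branch name, a tag, or a commit SHA.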
I am working on Windows and attempting to run a git diff command in the pre-commit script (Python) of a repository. My Python call looks like this:
from subprocess import Popen, PIPE

repo_dir = 'D:/git/current_uic/src/gtc/resource'
cmd = ['diff', '--name-only']
print(Popen(['git', '--git-dir={}'.format(repo_dir + '/.git'),
             '--work-tree={}'.format(repo_dir)] + cmd,
            stdin=PIPE, stdout=PIPE).communicate())
Whenever I go to commit in the "D:/git/current_uic/src/gtc" repo, I get the following:
fatal: unable to read 6ff96bd371691b9e93520e133ebc4d84c74cd0f6
Note that this is a pre-commit hook for the 'D:/git/current_uic/src/gtc' repository and that 'D:/git/current_uic/src/gtc/resource' is a submodule of 'D:/git/current_uic/src/gtc'. Also note that if I pop open Git bash and run the following:
git --git-dir=D:/git/current_uic/src/gtc/resource/.git --work-tree=D:/git/current_uic/src/gtc/resource diff --name-only
or if I just run the script straight from Git bash I get exactly what I want, regardless of working directory.
Any ideas as to what is going on here?
The Problem:
Upon running a hook, Git sets some environment variables that are accessible by the hook script. The problem is that Git itself uses these environment variables, and the normal way in which Git sets/uses them seems to be overridden by the values set when the hook gets fired off. In this particular instance, the environment variable GIT_INDEX_FILE has been set to the path to the index file corresponding to the repository which had called the hook (D:/git/current_uic/src/.git/modules/gtc/index), causing a mismatch between the (incorrect) index and the (correct) change tree.
The Fix:
In the hook script, set the environment variable GIT_INDEX_FILE to the correct value before making any git calls. In this case, you could do the following:
set GIT_INDEX_FILE=D:/git/current_uic/src/.git/modules/gtc/modules/resource/index
git --git-dir=D:/git/current_uic/src/gtc/resource/.git --work-tree=D:/git/current_uic/src/gtc/resource diff --name-only
Additional Info
More information about these Git environment variables and which hooks set them can be found here.
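Since the hook in the question is a Python script, another option might be to hand the child git process an environment with the hook variables removed, so that git works out the submodule's index by itself. This is a sketch under that assumption and has not been tested on the setup above; the names `HOOK_VARS`, `env_without_hook_vars` and `changed_files` are made up for illustration.

```python
import os
import subprocess

# Variables git may export to hooks; dropping them lets a child git
# process rediscover the repository and its index on its own.
HOOK_VARS = ("GIT_INDEX_FILE", "GIT_DIR", "GIT_WORK_TREE")


def env_without_hook_vars(environ=None):
    """Return a copy of the environment without the hook variables."""
    source = os.environ if environ is None else environ
    return {k: v for k, v in source.items() if k not in HOOK_VARS}


def changed_files(repo_dir):
    """List files reported by `git diff --name-only` inside repo_dir."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only"],
        cwd=repo_dir,  # run inside the submodule so git finds its own .git
        env=env_without_hook_vars(),
        text=True,
    )
    return out.splitlines()
```

Running git with `cwd` set to the submodule directory replaces the explicit --git-dir and --work-tree arguments.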
I had exactly the same issue, but using GitPython.
I solved it like this:
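import os
import git
repo = git.Repo()
for submodule in repo.submodules:
    back_index = os.getenv('GIT_INDEX_FILE')
    # point git at the submodule's own index while diffing
    os.environ['GIT_INDEX_FILE'] = submodule.module().index.path
    commit = submodule.module().head.commit
    print([item.a_path for item in commit.diff(None)])
    if back_index is None:  # variable was not set before, so remove it again
        del os.environ['GIT_INDEX_FILE']
    else:
        os.environ['GIT_INDEX_FILE'] = back_index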
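The save-and-restore pattern above can also be wrapped in a small context manager, so the old value comes back even if the diff raises an exception. A minimal sketch; `temporary_env` is a hypothetical helper name, not part of GitPython:

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_env(name, value):
    """Set os.environ[name] inside the with-block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # the variable did not exist before, so remove it again
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

Inside the loop this would be used as `with temporary_env('GIT_INDEX_FILE', submodule.module().index.path):` around the diff call.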
Hello, is there a good utility or package that handles downloading a git folder?
Example:
vendors = {
    'htmlpurifier' : 'http://repo.or.cz/w/htmlpurifier.git'
}
for key in vendors:
    # someutility.get(http://repo.or.cz/w/htmlpurifier.git,htmlpurifier)
    someutility.get(vendors[key], key)
    # get http://repo.or.cz/w/htmlpurifier folder to /htmlpurifier on localstorage ?
Is there anything similar?
I prefer to use git commands directly and wrap them using the subprocess module.
However, if you are looking for modules to interact with Git, I can think of:
dulwich : http://www.samba.org/~jelmer/dulwich/docs/index.html
git-python: http://gitorious.org/projects/git-python/
For git-python, particularly, please look at class : Repo. It has a function:
fork_bare(path, **kwargs)
Fork a bare git repository from this repo
path is the full path of the new repo (traditionally ends with name.git)
options is any additional options to the git clone command
Returns git.Repo (the newly forked repo)
You can also check out: http://packages.python.org/GitPython/0.3.2/tutorial.html#using-git-directly
git = repo.git
git.checkout('head', b="my_new_branch")
GitPython is a python library used to interact with git repositories
-- GitPython docs
If by "git folder download" you mean cloning the Git repository, this should do it:
from git import Repo
repo_url = "http://repo.or.cz/w/htmlpurifier.git"
local_dir = "/Users/user1/gitprojects/"
Repo.clone_from(repo_url, local_dir)
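To cover the loop over a dictionary of vendors from the question, something along these lines might work. It is a sketch using the git command line through subprocess; `plan_clones` and `fetch_vendors` are hypothetical names, and skipping folders that already exist is a design choice made here, not something the question asked for.

```python
import os
import subprocess


def plan_clones(vendors, base_dir):
    """Return (url, target_dir) pairs for vendors not cloned yet."""
    plan = []
    for name, url in sorted(vendors.items()):
        target = os.path.join(base_dir, name)
        if not os.path.isdir(target):
            plan.append((url, target))
    return plan


def fetch_vendors(vendors, base_dir="."):
    """Clone every missing vendor repository into base_dir/<name>."""
    for url, target in plan_clones(vendors, base_dir):
        subprocess.check_call(["git", "clone", url, target])
```

Calling `fetch_vendors({'htmlpurifier': 'http://repo.or.cz/w/htmlpurifier.git'})` would then clone into ./htmlpurifier, and a second run would skip it.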