How to clone a GitHub repo branch from an existing branch with a Python script

My organization is migrating from GitLab to GitHub. We have existing Python scripts that check commit differences and create multiple release branches in one go, each branched from the previous release branch. I know how to do this for GitLab but cannot find an equivalent for GitHub. How can I do the same in GitHub?
This is the code we currently use for GitLab:
def createBranch(projectName, existingBranch, newBranchName):
    projectId=projMap[projectName]
    gl = gitlab.Gitlab('github URL', private_token='git token', ssl_verify=False)
    project = gl.projects.get(projectId)
    try:
        project.branches.get(newBranchName)
        log.info(" %s %s already exist", projectName, newBranchName)
        return 0
    except:
        log.info("createBranch %s from:%s to:%s", project.name, existingBranch, newBranchName)
        try:
            project.branches.create({"branch": newBranchName,
                                     "ref": existingBranch})
        except:
            raise Exception(project.name, " error creating " + newBranchName + " from " + existingBranch)
projMap is loaded from a text file that maps each project name to its project ID.
I have tried several Stack Overflow threads, but none of them helped.

PyGithub has the functionality you're looking for.
Connecting to the GitHub API:
gh = Github(params)
Getting a repo:
repo = gh.get_repo(id)
Getting a branch:
branch = repo.get_branch(branch)
Making a branch:
repo.create_git_ref(ref='refs/heads/' + 'new_branch_name', sha=branch.commit.sha)
All of this is copied from the PyGithub reference page.
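Putting those calls together, a rough PyGithub counterpart of the GitLab createBranch function might look like the sketch below. The name create_branch and its parameters are illustrative and not part of PyGithub; repo is assumed to be the object returned by gh.get_repo(...). Only get_branch and create_git_ref, the two calls quoted above, are used.

```python
def create_branch(repo, existing_branch, new_branch_name):
    """Create new_branch_name at the tip commit of existing_branch.

    `repo` is assumed to be a PyGithub Repository, e.g. from gh.get_repo(...).
    """
    source = repo.get_branch(existing_branch)      # look up the source branch
    return repo.create_git_ref(
        ref="refs/heads/" + new_branch_name,       # the full ref name is required
        sha=source.commit.sha,                     # start at the source branch tip
    )


# Hypothetical usage (needs `pip install PyGithub` and a real token):
#   from github import Github
#   gh = Github("your_access_token")
#   repo = gh.get_repo("your_org/your_repo")
#   for name in ["release/1.1", "release/1.2"]:
#       create_branch(repo, "release/1.0", name)
```

If the new branch already exists, GitHub rejects the request and PyGithub should raise an exception, so a caller that wants the "already exists" behaviour of the GitLab script would need to catch it.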

Related

Where is the repository information (like URL, working branch) stored in the .git folder?

I am trying to write a GitHub automation script, but I don't know where the repository information is stored in the .git folder. Searching the files manually, I found that the .git\config file has some information, but it is inconsistent: sometimes the information is not there. Which file should I look in for repository information (i.e. URL, working branch)?
This is my code to get the URL and branch from the .git\config file:
import os
mypath = os.getcwd()
infofile = mypath + '/.git/config'
def takeInfo():
    print('No Existing repo info found\n')
    url = str(input('Enter the Github Repo URL: '))
    branch = str(input('Enter the branch: '))
    info = ['n' , url , branch]
    return info
def checkinfoInDir():
    if (os.path.exists(infofile)):
        print('Repo Info Found:-')
        with open(infofile, "r") as f:
            info = f.readlines()
        # print(info)
        for ele in info:
            if('url' in ele):
                url = info[info.index(ele)].split()[2]
            if('branch' in ele):
                branch = info[info.index(ele)].split()[1].split('"')[1]
        info = [url , branch]
    else:
        info = takeInfo()
    return info

To read the current branch, use git symbolic-ref or git rev-parse:
git symbolic-ref works in all cases where HEAD is a symbolic ref, including the case where it is a symbolic name for a branch that does not yet exist.[1] It fails when HEAD is detached, i.e., points directly to a commit.
git rev-parse --symbolic-full-name HEAD works in all cases where HEAD identifies some commit. It fails when HEAD is a symbolic name for a branch that does not yet exist, and when HEAD is detached, it comes up with the name HEAD.
To find a URL for some remote, use git config --get remote.<remote>.url, where <remote> is the remote's name (for example, origin). Note that if Git is configured for a triangular workflow, this is the fetch URL, not the push URL, so check the result of git config --get remote.<remote>.pushurl to see if there is a separate push URL.
Note that there can be any number of remotes, including none.
[1] Git calls this an orphan branch or an unborn branch, depending on which part of Git is doing the calling. To get into this state, create a new empty repository (you'll be on whatever branch name you choose as your initial name, but it won't exist, as there are no commits and no branch names can exist until there are commits) or use git checkout --orphan or git switch --orphan.
To get into detached HEAD state, use git checkout with any valid commit identifier that is not a branch name, or use git checkout --detach or git switch --detach with or without any valid commit identifier.
To get out of detached HEAD state, use git checkout or git switch with a valid branch name (including the various cases where these commands create a new branch name, then attach HEAD to it).
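From a Python automation script, these commands can be called through subprocess instead of reading files under .git by hand. The following is only a sketch: the helper names (_git, current_branch, remote_url) are made up for this example, and a git executable on PATH is assumed. Each helper returns None when git reports a failure, for example when HEAD is detached or the remote is not configured.

```python
import subprocess


def _git(repo_dir, *args):
    """Run git in repo_dir; return stripped stdout, or None on any failure."""
    try:
        result = subprocess.run(
            ["git", "-C", str(repo_dir), *args],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:      # git itself is not installed
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def current_branch(repo_dir):
    """Branch name HEAD is attached to, or None if detached or not a repo."""
    return _git(repo_dir, "symbolic-ref", "--quiet", "--short", "HEAD")


def remote_url(repo_dir, remote="origin"):
    """Fetch URL of the given remote, or None if it is not configured."""
    return _git(repo_dir, "config", "--get", f"remote.{remote}.url")
```

Because git symbolic-ref is used, current_branch also works in a freshly initialized repository whose branch has no commits yet.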

This is an example config file
[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
    symlinks = false
    ignorecase = true
[submodule]
    active = .
[remote "origin"]
    url = https://gitlab.com/sample/sample.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
It uses an INI-like format, so you can parse it with a simple INI library such as Python's configparser.
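As a minimal sketch of that suggestion, the standard-library configparser module can read a config like the example above. The function name parse_git_config and the shortened SAMPLE text are assumptions for illustration. strict=False is passed because real Git configs may repeat a key (for example several fetch lines), which configparser would otherwise reject.

```python
import configparser

# Shortened copy of the example config above.
SAMPLE = '''\
[core]
    bare = false
[remote "origin"]
    url = https://gitlab.com/sample/sample.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
'''


def parse_git_config(text):
    """Return ({remote name: url}, {branch name: tracked remote})."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(text)
    remotes, branches = {}, {}
    for section in parser.sections():
        # Section headers look like: remote "origin" or branch "master"
        kind, _, name = section.partition(" ")
        name = name.strip('"')
        if kind == "remote":
            remotes[name] = parser[section].get("url")
        elif kind == "branch":
            branches[name] = parser[section].get("remote")
    return remotes, branches


remotes, branches = parse_git_config(SAMPLE)
print(remotes)
print(branches)
```

Note that the checked-out branch is not recorded in this file; it is stored in .git/HEAD, and a [branch "..."] section only appears for branches that have tracking configuration. That may explain why the branch information looks inconsistent when read from config alone.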

Pushing local repository to remote repository using python Github

The code should do the following, in order:
1. Download/clone the public GitHub repository locally.
2. Remove all the git history (and branches).
3. Use the GitHub API to create a new GitHub repository initialized with the content of the input repository that was downloaded locally. The new repository should be named using the supplied name.
I am able to do steps 1 and 3, but the script asks me to log in twice, and I am not able to initialize the new remote repo from the local repo (how do I express local_repo = repo1?).
How do I remove the git history? Where can I find the git history in the cloned repo?
import git,os,tempfile,fnmatch,sys
from github import Github
username = sys.argv[1]
password = sys.argv[2]
input_repo_url = sys.argv[3]
output_repo_name = sys.argv[4]
tempdir=tempfile.mkdtemp(prefix="",suffix="")
predictable_filename = "myfile"
saved_umask = os.umask(0o077)  # octal mask: new files are accessible to the owner only
path = os.path.join(tempdir,predictable_filename)
print("Cloning the repository at "+path)
local_repo = git.Repo.clone_from(input_repo_url,path, branch="master")
print("Clone successful!")
g = Github(username,password)
user = g.get_user()
repo1 = user.create_repo(output_repo_name)
print("New repository created at "+username+" account ")
print(repo1)
target_url = "https://github.com/"+username+"/"+output_repo_name+".git"
print(target_url)
print("Pushing cloned repo to target repo")
local_repo.create_remote("new",url=target_url)
local_repo.git.push("new")
print("Success!!")
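On the history question: in a normal (non-bare) clone, all history, branches and remote settings live in the .git directory at the top of the working tree. One possible approach, sketched below under that assumption, is to delete that directory and then create a fresh repository from the remaining files. The helper name strip_git_history is hypothetical, and the follow-up GitPython calls are shown only as comments.

```python
import shutil
from pathlib import Path


def strip_git_history(clone_dir):
    """Delete the .git directory so only the working-tree files remain."""
    git_dir = Path(clone_dir) / ".git"
    if git_dir.is_dir():
        shutil.rmtree(git_dir)
    return Path(clone_dir)


# Hypothetical follow-up with GitPython, using `path` and `target_url`
# from the script above:
#   strip_git_history(path)
#   fresh = git.Repo.init(path)                  # brand-new, empty history
#   fresh.git.add("--all")
#   fresh.index.commit("Initial commit")
#   fresh.create_remote("new", url=target_url)
#   fresh.git.push("new", "HEAD")
```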

PyGitHub: Unable to access private repositories of my team

I want to access a private repository of a team that I am part of. However, I am not able to access it. It throws an exception as follows:
UnknownObjectException: 404 {u'documentation_url': u'https://developer.github.com/v3/repos/#list-teams', u'message': u'Not Found'}
My code:
from github import Github
import pandas as pd
git = Github("token")
org = git.get_organization('org')
org.get_repo('repo_name')
It throws an error at the last statement above.
I want to access this repository and count the teams that have access to it.
Can someone help me fix this?

For future readers who are security-minded like me and want a read-only Personal Access Token, to read your private repos, you will need this enabled (and the OP will have to generate a new token).

For Github Enterprise:
from github import Github
g = Github(base_url="https://your_host_name/api/v3", login_or_token="your_access_token")
org = g.get_organization("your_org")
repo = org.get_repo(repo_name) # getting the repo
print(repo)
For GitHub:
from github import Github
g = Github(username,password)
repo = g.get_repo(repo_name) # getting the repo
print(repo)

Which repo_name is used?
Example: team_X/repo_1
If calling get_repo() on the Github object directly, use the full name: repo = Github("token").get_repo("team_X/repo_1")
If calling get_repo() on the org object, use the short name: repo = org.get_repo("repo_1")
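For the original goal of counting the teams with access, a small sketch along the same lines follows. The function name count_teams_with_access is made up for illustration, and org is assumed to be the object returned by get_organization(...). It relies on PyGithub's Repository.get_teams(), which returns a paginated list that can be iterated.

```python
def count_teams_with_access(org, repo_short_name):
    """Count the teams that have access to one repository of an organization.

    `org` is assumed to be a PyGithub Organization (from g.get_organization).
    """
    repo = org.get_repo(repo_short_name)      # short name, without the "org/" prefix
    return sum(1 for _ in repo.get_teams())   # iterate the paginated result


# Hypothetical usage:
#   from github import Github
#   g = Github("your_access_token")
#   print(count_teams_with_access(g.get_organization("your_org"), "repo_1"))
```

A 404 from these calls often means the token is not allowed to see the private repository rather than that the repository does not exist, so the token's permissions are worth checking first.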

Push new local branch to remote using GitPython

I looked at a few references but I am still having problems:
I want to clone a remote repo, create a new branch, and push the new branch back to remote using GitPython.
This seems to work:
import git
import subprocess
nm_brnch = 'new_branch'
# Clone
repo_url = r'my_remote.git'
repo = git.Repo.clone_from(repo_url, dnm_wrk, branch=r'some_branch')
# Create new branch
git = repo.git
git.checkout('HEAD', b=nm_brnch)
# Push new branch to remote
subprocess.call(f'git push -u origin {nm_brnch}')
But it's ugly, since it uses subprocess, instead of using GitPython.
I tried using GitPython, but without success:
repo.head.set_reference(nm_brnch)
repo.git.push("origin", nm_brnch)
I have consulted the following references:
Pushing local branch to remote branch
Use GitPython to Checkout a new branch and push to remote
Related GitHub issue/question
Tutorial from official docs

I'm using gitpython==2.1.11 with Python 3.7. Below is my push function in which I first try a high-level push, and then a low-level push as necessary. Note how I check the return value of either command. I also log the push actions, and this explains what's happening at every step.
class GitCommandError(Exception):
    pass
class Git:
    def _commit_and_push_repo(self) -> None:
        repo = self._repo
        remote = repo.remote()
        remote_name = remote.name
        branch_name = repo.active_branch.name
        # Note: repo.index.entries was observed to also include unpushed files in addition to uncommitted files.
        log.debug('Committing repository index in active branch "%s".', branch_name)
        self._repo.index.commit('')
        log.info('Committed repository index in active branch "%s".', branch_name)
        def _is_pushed(push_info: git.remote.PushInfo) -> bool:
            valid_flags = {push_info.FAST_FORWARD, push_info.NEW_HEAD}  # UP_TO_DATE flag is intentionally skipped.
            return push_info.flags in valid_flags  # This check can require the use of & instead.
        push_desc = f'active branch "{branch_name}" to repository remote "{remote_name}"'
        log.debug('Pushing %s.', push_desc)
        try:
            push_info = remote.push()[0]
        except git.exc.GitCommandError:  # Could be due to no upstream branch.
            log.warning('Failed to push %s. This could be due to no matching upstream branch.', push_desc)
            log.info('Reattempting to push %s using a lower-level command which also sets upstream branch.', push_desc)
            push_output = repo.git.push('--set-upstream', remote_name, branch_name)
            log.info('Push output was: %s', push_output)
            expected_msg = f"Branch '{branch_name}' set up to track remote branch '{branch_name}' from '{remote_name}'."
            if push_output != expected_msg:
                raise RepoPushError(f'Failed to push {push_desc}.')
        else:
            is_pushed = _is_pushed(push_info)
            logger = log.debug if is_pushed else log.warning
            logger('Push flags were %s and message was "%s".', push_info.flags, push_info.summary.strip())
            if not is_pushed:
                log.warning('Failed first attempt at pushing %s. A pull will be performed.', push_desc)
                self._pull_repo()
                log.info('Reattempting to push %s.', push_desc)
                push_info = remote.push()[0]
                is_pushed = _is_pushed(push_info)
                logger = log.debug if is_pushed else log.error
                logger('Push flags were %s and message was "%s".', push_info.flags, push_info.summary.strip())
                if not is_pushed:
                    raise RepoPushError(f'Failed to push {push_desc} despite a pull.')
        log.info('Pushed %s.', push_desc)

You have to define a remote repo, then push to it, e.g.:
origin = repo.remote(name='origin')
origin.push()
See the Handling Remotes documentation for more examples of push/pull.

Expanding on @Fraser's answer, here is the full code I used to successfully create a new branch:
from pathlib import Path
import git
import plumbum
from plumbum.cmd import touch
branch_name = "new_branch"  # placeholder: name of the branch to create
# initialize repo and remote origin
repo_path = Path("~/git/sandboxes/git-sandbox").expanduser()
repo = git.Repo(repo_path)
origin = repo.remote(name="origin")
# create new head and get it tracked in the origin
repo.head.reference = repo.create_head(branch_name)
repo.head.reference.set_tracking_branch(origin.refs.master).checkout()
# create a file for the purposes of this example
touch[f"{repo_path}/tmp1.txt"] & plumbum.FG
# stage the changed file and commit it
repo.index.add("tmp1.txt")
repo.index.commit("mod tmp1.txt")
# push the staged commits
push_res = origin.push(branch_name)[0]
print(push_res.summary)

Assuming it's the push that this is failing on in GitPython (as it was for me), just using GitPython I was able to solve this problem like this:
import git
branch_name = '<your branch name>'
repo = git.Repo('<your repo path>')
repo.git.checkout('HEAD', b=branch_name)
# -u fixed it for me
repo.git.push('origin', '-u', branch_name)
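The answers above share one pattern: create the branch locally, check it out, then push it with an upstream set. A compact sketch of that pattern follows. The function name push_new_branch is illustrative, and repo is assumed to be a git.Repo, for instance the result of git.Repo.clone_from(...).

```python
def push_new_branch(repo, branch_name, remote_name="origin"):
    """Create branch_name from the current HEAD, check it out, and push it.

    `repo` is assumed to be a GitPython git.Repo instance.
    """
    new_head = repo.create_head(branch_name)   # new local branch at HEAD
    new_head.checkout()                        # switch the working tree to it
    # Equivalent to: git push --set-upstream <remote> <branch>
    repo.git.push("--set-upstream", remote_name, branch_name)
    return new_head


# Hypothetical usage (needs `pip install GitPython`):
#   import git
#   repo = git.Repo.clone_from("my_remote.git", "work_dir", branch="some_branch")
#   push_new_branch(repo, "new_branch")
```

This keeps everything inside GitPython, so no subprocess call is needed.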

Python & gdata within Django app: "POST method does not support concurrency"

I am trying to use gdata within a Django app to create a directory in my google drive account. This is the code written within my Django view:
def root(request):
    from req_info import email, password
    from gdata.docs.service import DocsService
    print "Creating folder........"
    folder_name = '2015-Q1'
    service_client = DocsService(source='spreadsheet create')
    service_client.ClientLogin(email, password)
    folder = service_client.CreateFolder(folder_name)
Authentication occurs without issue, but that last line of code triggers the following error:
Request Method: GET
Request URL: http://127.0.0.1:8000/
Django Version: 1.7.7
Exception Type: RequestError
Exception Value: {'status': 501, 'body': 'POST method does not support concurrency', 'reason': 'Not Implemented'}
I am using the following software:
Python 2.7.8
Django 1.7.7
PyCharm 4.0.5
gdata 2.0.18
google-api-python-client 1.4.0 (not sure if relevant)
[many other packages that I'm not sure are relevant]
What's frustrating is that the exact same code (see below) functions perfectly when I run it in its own, standalone file (not within a Django view).
from req_info import email, password
from gdata.docs.service import DocsService
print "Creating folder........"
folder_name = '2015-Q1'
service_client = DocsService(source='spreadsheet create')
service_client.ClientLogin(email, password)
folder = service_client.CreateFolder(folder_name)
I run this working code in the same virtual environment and the same PyCharm project as the code that produced the error. I have tried putting the code within a function in a separate file, and then having the Django view call that function, but the error persists.
I would like to get this code working within my Django app.

I don't recall if I got this to work within a Django view, but because Google has since required the use of Oauth 2.0, I had to rework this code anyways. I think the error had something to do with my simultaneous use of two different packages/clients to access Google Drive.
Here is how I ended up creating the folder using the google-api-python-client package:
from google_api import get_drive_service_obj, get_file_key_if_exists, insert_folder
def create_ss():
    folder_name = 'foldername'
    drive_client, credentials = get_drive_service_obj()
    # creating folder if it does not exist
    folder = get_file_key_if_exists(drive_client, folder_name)
    if folder: # if folder exists
        print 'Folder "' + folder_name + '" already exists.'
    else: # if folder doesn't exist
        print 'Creating folder........"' + folder_name + '".'
        folder = insert_folder(drive_client, folder_name)
After this code, I used a forked version (currently beta) of sheetsync to copy my template spreadsheet and populate the new file with my data. I then had to import sheetsync after the code above to avoid the "concurrency" error. (I could post the code involving sheetsync here too if folks want, but for now, I don't want to get too far off topic.)
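The helper functions imported from google_api are not shown in the answer. As a guess at what a "find the folder or create it" helper could look like with google-api-python-client and the Drive v3 API, here is a Python 3 sketch. The function name get_or_create_folder is hypothetical and this is not the answer author's code; drive_service is assumed to be a Drive v3 client built with googleapiclient.discovery.build("drive", "v3", credentials=...).

```python
FOLDER_MIME = "application/vnd.google-apps.folder"


def get_or_create_folder(drive_service, folder_name):
    """Return the ID of a Drive folder with this name, creating it if needed.

    `drive_service` is assumed to be a Drive v3 client object.
    """
    # Escape backslashes and single quotes for the Drive query string.
    escaped = folder_name.replace("\\", "\\\\").replace("'", "\\'")
    query = f"name = '{escaped}' and mimeType = '{FOLDER_MIME}' and trashed = false"
    found = drive_service.files().list(q=query, fields="files(id, name)").execute()
    files = found.get("files", [])
    if files:
        return files[0]["id"]                  # folder already exists
    body = {"name": folder_name, "mimeType": FOLDER_MIME}
    created = drive_service.files().create(body=body, fields="id").execute()
    return created["id"]
```

Drive allows several folders with the same name, so this sketch simply returns the first match.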
