Script/utility to rewrite all svn:externals in repository trunk - python

Say that one wishes to convert all absolute svn:externals URLs to relative URLs throughout a repository.
Alternatively, if heeding the tip in the svn:externals docs ("You should seriously consider using explicit revision numbers..."), one might find oneself needing to periodically pull new revisions for externals in many places throughout the repository.
What's the best way to programmatically update a large number of svn:externals properties?
My solution is posted below.

Here's my class to extract parts from a single line of an svn:externals property:
from urlparse import urlparse
import re

class SvnExternalsLine:
    '''Consult https://subversion.apache.org/docs/release-notes/1.5.html#externals for the parsing algorithm.

    The old svn:externals format consists of:
        <local directory> [revision] <absolute remote URL>
    The NEW svn:externals format consists of:
        [revision] <absolute or relative remote URL> <local directory>
    Therefore, "relative" remote paths always come *after* the local path.

    One complication is the possibility of local paths with spaces.
    We just assume that the remote path cannot have spaces, and treat all other
    tokens (except the revision specifier) as part of the local path.
    '''

    REVISION_ARGUMENT_REGEXP = re.compile(r"-r(\d+)")

    def __init__(self, original_line):
        self.original_line = original_line
        self.pinned_revision_number = None
        self.repo_url = None
        self.local_pathname_components = []

        for token in self.original_line.split():
            revision_match = self.REVISION_ARGUMENT_REGEXP.match(token)
            if revision_match:
                self.pinned_revision_number = int(revision_match.group(1))
            elif urlparse(token).scheme or any(token.startswith(p) for p in ("^", "//", "/", "../")):
                self.repo_url = token
            else:
                self.local_pathname_components.append(token)

    # ---------------------------------------------------------------------
    def constructLine(self):
        '''Reconstruct the externals line in the Subversion 1.5+ format.'''
        if self.repo_url is None:
            raise Exception("Found a bad externals property; original definition: %s" % repr(self.original_line))

        tokens = []
        # Preserve the revision specifier if one existed
        if self.pinned_revision_number is not None:
            tokens.append("-r%d" % self.pinned_revision_number)
        tokens.append(self.repo_url)
        tokens.extend(self.local_pathname_components)
        return " ".join(tokens)
I use the pysvn library to iterate recursively through all of the directories possessing the svn:externals property, then split that property value by newlines, and act upon each line according to the parsed SvnExternalsLine.
The process must be performed on a local checkout of the repository. Here's how pysvn (propget) can be used to retrieve the externals:
client.propget( "svn:externals", base_checkout_path, recurse=True)
Iterate through the return value of this function and, after modifying the property for each directory, write it back with:
client.propset("svn:externals", new_externals_property, path)
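Putting the pieces together, here is a minimal sketch of the whole loop. It assumes a local checkout at base_checkout_path and leaves the actual per-line modification as a placeholder comment; adapt it to whatever rewrite (relativizing URLs, bumping pinned revisions) you need:

import pysvn

client = pysvn.Client()

# propget with recurse=True returns a dict mapping each directory that
# carries the property to its svn:externals value.
externals = client.propget("svn:externals", base_checkout_path, recurse=True)

for path, prop_value in externals.items():
    new_lines = []
    for line in prop_value.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parsed = SvnExternalsLine(line)
        # ... modify parsed.repo_url or parsed.pinned_revision_number here ...
        new_lines.append(parsed.constructLine())
    client.propset("svn:externals", "\n".join(new_lines), path)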

Related

Where is the repository information (like URL, working branch) stored in the .git folder?

I am trying to write a GitHub automation script, but there is an issue I am facing: I don't know where the repository information is stored in the .git folder. By searching the files manually I found that the .git\config file has some information, but it is really inconsistent, as sometimes the information is not there. Can someone tell me which file I should look into for the repository information (i.e. URL, working branch)?
This is my code to get the URL and branch from the .git\config file:
import os

mypath = os.getcwd()
infofile = mypath + '/.git/config'

def takeInfo():
    print('No Existing repo info found\n')
    url = str(input('Enter the Github Repo URL: '))
    branch = str(input('Enter the branch: '))
    info = ['n', url, branch]
    return info

def checkinfoInDir():
    if os.path.exists(infofile):
        print('Repo Info Found:-')
        with open(infofile, "r") as f:
            info = f.readlines()
        # print(info)
        for ele in info:
            if 'url' in ele:
                url = info[info.index(ele)].split()[2]
            if 'branch' in ele:
                branch = info[info.index(ele)].split()[1].split('"')[1]
        info = [url, branch]
    else:
        info = takeInfo()
    return info
To read the current branch, use git symbolic-ref or git rev-parse:
git symbolic-ref works in all cases where HEAD is a symbolic ref, including the case where it is a symbolic name for a branch that does not yet exist.¹ It fails when HEAD is detached, i.e., points directly to a commit.
git rev-parse --symbolic-full-name HEAD works in all cases where HEAD identifies some commit. It fails when HEAD is a symbolic name for a branch that does not yet exist, and when HEAD is detached, it comes up with the name HEAD.
To find a URL for some remote, use git config --get remote.<name>.url. Note that if Git is configured for a triangular workflow, this is the fetch URL, not the push URL; check the result of git config --get remote.<name>.pushurl to see if there is a separate push URL.
Note that there can be any number of remotes, including none.
¹ Git calls this an orphan branch or an unborn branch, depending on which part of Git is doing the calling. To get into this state, create a new empty repository (you'll be on whatever branch name you chose as your initial name, but it won't exist yet, since no branch name can exist until there are commits), or use git checkout --orphan or git switch --orphan.
To get into detached HEAD state, use git checkout with any valid commit identifier that is not a branch name, or use git checkout --detach or git switch --detach with or without any valid commit identifier.
To get out of detached HEAD state, use git checkout or git switch with a valid branch name (including the various cases where these commands create a new branch name, then attach HEAD to it).
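For a script like the one in the question, a hedged sketch of shelling out to those commands instead of parsing .git/config by hand (the function names are mine; text=True requires Python 3.7+):

import subprocess

def current_branch():
    # `git symbolic-ref --short HEAD` prints the branch name; it raises
    # CalledProcessError when HEAD is detached.
    return subprocess.check_output(
        ['git', 'symbolic-ref', '--short', 'HEAD'], text=True).strip()

def remote_url(name='origin'):
    # Fetch URL for the given remote; also check remote.<name>.pushurl if a
    # triangular workflow may be configured. Raises if the remote is absent.
    return subprocess.check_output(
        ['git', 'config', '--get', 'remote.%s.url' % name], text=True).strip()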
This is an example config file
[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
    symlinks = false
    ignorecase = true
[submodule]
    active = .
[remote "origin"]
    url = https://gitlab.com/sample/sample.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
It is in INI format, so you can use a simple INI parser, such as Python's built-in configparser, to read it.
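As a sketch (assuming the layout above; git writes INI-like files that configparser can usually handle, though exotic configs may need a dedicated parser):

import configparser

config = configparser.ConfigParser()
config.read('.git/config')

# Section names keep their quoted parts, e.g. 'remote "origin"'.
url = config.get('remote "origin"', 'url')

# Collect branch names from sections like [branch "master"].
branches = [s.split('"')[1] for s in config.sections() if s.startswith('branch ')]

print(url, branches)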

Toolchain not downloading the tool

Hi, I'm trying to set up a toolchain for the Fn project. The approach is to set up a toolchain per binary available on GitHub and then, in theory, use it in a rule.
I have a common package which has the available binaries:
default_version = "0.5.44"

os_list = [
    "linux",
    "mac",
    "windows",
]

def get_bin_name(os):
    return "fn_cli_%s_bin" % os
The download part looks like this:
load(":common.bzl", "get_bin_name", "os_list", "default_version")
_url = "https://github.com/fnproject/cli/releases/download/{version}/{file}"
_os_to_file = {
"linux" : "fn_linux",
"mac" : "fn_mac",
"windows" : "fn.exe",
}
def _fn_binary(os):
name = get_bin_name(os)
file = _os_to_file.get(os)
url = _url.format(
file = file,
version = default_version
)
native.http_file(
name = name,
urls = [url],
executable = True
)
def fn_binaries():
"""
Installs the hermetic binary for Fn.
"""
for os in os_list:
_fn_binary(os)
Then I set up the toolchain like this:
load(":common.bzl", "get_bin_name", "os_list")
_toolchain_type = "toolchain_type"
FnInfo = provider(
doc = "Information about the Fn Framework CLI.",
fields = {
"bin" : "The Fn Framework binary."
}
)
def _fn_cli_toolchain(ctx):
toolchain_info = platform_common.ToolchainInfo(
fn_info = FnInfo(
bin = ctx.attr.bin
)
)
return [toolchain_info]
fn_toolchain = rule(
implementation = _fn_cli_toolchain,
attrs = {
"bin" : attr.label(mandatory = True)
}
)
def _add_toolchain(os):
toolchain_name = "fn_cli_%s" % os
native_toolchain_name = "fn_cli_%s_toolchain" % os
bin_name = get_bin_name(os)
compatibility = ["#bazel_tools//platforms:%s" % os]
fn_toolchain(
name = toolchain_name,
bin = ":%s" % bin_name,
visibility = ["//visibility:public"]
)
native.toolchain(
name = native_toolchain_name,
toolchain = ":%s" % toolchain_name,
toolchain_type = ":%s" % _toolchain_type,
target_compatible_with = compatibility
)
def setup_toolchains():
"""
Macro te set up the toolchains for the different platforms
"""
native.toolchain_type(name = _toolchain_type)
for os in os_list:
_add_toolchain(os)
def fn_register():
"""
Registers the Fn toolchains.
"""
path = "//tools/bazel_rules/fn/internal/cli:fn_cli_%s_toolchain"
for os in os_list:
native.register_toolchains(path % os)
In my BUILD file I call setup_toolchains:
load(":toolchain.bzl", "setup_toolchains")
setup_toolchains()
With this set up I have a rule which looks like this:
_toolchain = "//tools/bazel_rules/fn/cli:toolchain_type"

def _fn(ctx):
    print("HEY")
    bin = ctx.toolchains[_toolchain].fn_info.bin
    print(bin)

# TEST RULE
fn = rule(
    implementation = _fn,
    toolchains = [_toolchain],
)
Workspace:
workspace(name = "basicwindow")
load("//tools/bazel_rules/fn:defs.bzl", "fn_binaries", "fn_register")
fn_binaries()
fn_register()
When I query for the different binaries with bazel query //tools/bazel_rules/fn/internal/cli:fn_cli_linux_bin they are there, but calling bazel build //... fails with:
ERROR: /Users/marcguilera/Code/Marc/basicwindow/tools/bazel_rules/fn/internal/cli/BUILD.bazel:2:1: in bin attribute of fn_toolchain rule //tools/bazel_rules/fn/internal/cli:fn_cli_windows: rule '//tools/bazel_rules/fn/internal/cli:fn_cli_windows_bin' does not exist. Since this rule was created by the macro 'setup_toolchains', the error might have been caused by the macro implementation in /Users/marcguilera/Code/Marc/basicwindow/tools/bazel_rules/fn/internal/cli/toolchain.bzl:35:15
ERROR: Analysis of target '//tools/bazel_rules/fn/internal/cli:fn_cli_windows' failed; build aborted: Analysis of target '//tools/bazel_rules/fn/internal/cli:fn_cli_windows' failed; build aborted
INFO: Elapsed time: 0.079s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 0 targets configured)
I tried to follow the toolchain tutorial in the documentation, but I can't get it to work. Another interesting thing is that I'm actually using a Mac, so the toolchain compatibility also seems to be wrong.
I'm using this toolchain in a larger repo, so the paths vary, but here's a repo containing only the Fn stuff for ease of reading.
Two things:
One, I suspect this is your actual issue: https://github.com/bazelbuild/bazel/issues/6828
The core of the problem is that, if the toolchain_type target is in an external repository, it always needs to be referred to by its fully-qualified name, never by a locally-qualified name.
The second is a little more fundamental: you have a lot of Starlark macros here that are generating other targets, and it's very hard to read. It would actually be a lot simpler to remove most of the macros, such as _fn_binary, fn_binaries, and _add_toolchain. Just have setup_toolchains directly create the needed native.toolchain targets, and have a repository macro that calls http_archive three times to declare the three different sets of binaries. This will make the code much easier to read and thus easier to debug.
For debugging toolchains, I follow two steps: first, I verify that the tool repositories exist and can be accessed directly, and then I check the toolchain registration and resolution.
After going several levels deep, it looks like you're calling http_archive, naming the new repository @linux, and downloading a specific binary file. This isn't how http_archive works: it expects to fetch a zip (or tar.gz) file, extract it, and find a WORKSPACE and at least one BUILD file inside.
My suggestions: simplify your macros, get the external repositories clearly defined, and then explore using toolchain resolution to choose the right one.
I'm happy to help answer further questions as needed.

How to use pyfig module in python?

This is the part of the mailer.py script:
config = pyfig.Pyfig(config_file)
svnlook = config.general.svnlook #svnlook path
sendmail = config.general.sendmail #sendmail path
From = config.general.from_email #from email address
To = config.general.to_email #to email address
What does this config variable contain? Is there a way to get its value without pyfig?
In this case, config is a pyfig.Pyfig object initialised with the contents of the file named by the string config_file.
To find out what that object does and contains, you can look at the documentation and/or the source code, or you can print it out after the initialisation, e.g.:
config = pyfig.Pyfig(config_file)
print "Config Contains:\n\t", '\n\t'.join(dir(config))
if hasattr(config, "keys"):
    print "Config Keys:\n\t", '\n\t'.join(config.keys())
or if you are using Python 3,
config = pyfig.Pyfig(config_file)
print("Config Contains:\n\t", '\n\t'.join(dir(config)))
if hasattr(config, "keys"):
    print("Config Keys:\n\t", '\n\t'.join(config.keys()))
To get the same data without pyfig, you would need to read and parse the content of the file referenced by config_file in your own code.
N.B.: pyfig seems to be more or less abandoned (no updates in over 5 years, the web site no longer exists, etc.), so I would strongly recommend converting the code to use a JSON configuration file instead.
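For example, a minimal sketch of the JSON route; the file layout here is hypothetical, chosen to mirror the attributes used by mailer.py above:

import json

# Hypothetical config.json:
# {"general": {"svnlook": "/usr/bin/svnlook",
#              "sendmail": "/usr/sbin/sendmail",
#              "from_email": "from@example.com",
#              "to_email": "to@example.com"}}
with open(config_file) as f:
    config = json.load(f)

svnlook = config["general"]["svnlook"]
sendmail = config["general"]["sendmail"]
From = config["general"]["from_email"]
To = config["general"]["to_email"]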

How to get a list of tags and create new tags with python and dulwich in git?

I am having problems retrieving the following information from a git repo using Python:
I would like to get a list of all tags of this repository.
I would like to checkout another branch and also create a new branch for staging.
I would like to tag a commit with an annotated tag.
I have looked into dulwich's documentation, and the way it works seems very bare-bones. Are there alternatives that are easier to use?
The simplest way to get all tags using Dulwich is:
from dulwich.repo import Repo
r = Repo("/path/to/repo")
tags = r.refs.as_dict("refs/tags")
tags is now a dictionary mapping tags to commit SHA1s.
Checking out another branch:
r.refs.set_symbolic_ref("HEAD", "refs/heads/foo")
r.reset_index()
Creating a branch:
r.refs["refs/heads/foo"] = head_sha1_of_new_branch
Now you can also get an alphabetically sorted list of tag labels.
from dulwich.repo import Repo
from dulwich.porcelain import tag_list
repo = Repo('.')
tag_labels = tag_list(repo)
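The question also asks about annotated tags. Recent dulwich releases expose this through the porcelain layer too; a minimal sketch (the tag name and message are placeholders):

from dulwich.porcelain import tag_create
from dulwich.repo import Repo

repo = Repo('.')
# annotated=True writes a full tag object rather than a lightweight ref;
# the tagger identity defaults to the repository's configured author.
tag_create(repo, b"v1.0", message=b"Release 1.0", annotated=True)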
Call git via subprocess. From one of my own programs:
import os
import subprocess

def gitcmd(cmds, output=False):
    """Run the specified git command.

    Arguments:
        cmds -- command string or list of strings of command and arguments
        output -- whether the output should be captured and returned, or
                  just the return value
    """
    if isinstance(cmds, str):
        if ' ' in cmds:
            raise ValueError('No spaces in single command allowed.')
        cmds = [cmds]  # make it into a list.
    # at this point we'll assume cmds was a list.
    cmds = ['git'] + cmds  # prepend with git
    if output:  # should the output be captured?
        rv = subprocess.check_output(cmds, stderr=subprocess.STDOUT).decode()
    else:
        with open(os.devnull, 'w') as bb:
            rv = subprocess.call(cmds, stdout=bb, stderr=bb)
    return rv
Some examples:
rv = gitcmd(['gc', '--auto', '--quiet',])
outp = gitcmd('status', True)
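Applied to this question, listing tags and creating an annotated tag with that helper might look like this (the tag name and message are placeholders):

# One tag label per line.
tags = gitcmd(['tag', '--list'], output=True).splitlines()

# Create an annotated tag on the current HEAD.
gitcmd(['tag', '-a', 'v1.0', '-m', 'Release 1.0'])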

Substitutions inside links in reST / Sphinx

I am using Sphinx to document a webservice that will be deployed in different servers. The documentation is full of URL examples for the user to click and they should just work. My problem is that the host, port and deployment root will vary and the documentation will have to be re-generated for every deployment.
I tried defining substitutions like this:
|base_url|/path
.. |base_url| replace:: http://localhost:8080
But the generated HTML is not what I want (doesn't include "/path" in the generated link):
http://localhost:8080/path
Does anybody know how to work around this?
New in Sphinx v1.0:
sphinx.ext.extlinks – Markup to shorten external links
https://www.sphinx-doc.org/en/master/usage/extensions/extlinks.html
The extension adds one new config value:
extlinks
This config value must be a dictionary of external sites, mapping unique short alias names to a base URL and a prefix. For example, to create an alias for the above mentioned issues, you would add
extlinks = {'issue':
    ('http://bitbucket.org/birkenfeld/sphinx/issue/%s', 'issue ')}
Now, you can use the alias name as a new role, e.g. :issue:`123`. This then inserts a link to http://bitbucket.org/birkenfeld/sphinx/issue/123. As you can see, the target given in the role is substituted in the base URL in the place of %s.
The link caption depends on the second item in the tuple, the prefix:
If the prefix is None, the link caption is the full URL.
If the prefix is the empty string, the link caption is the partial URL given in the role content (123 in this case.)
If the prefix is a non-empty string, the link caption is the partial URL, prepended by the prefix – in the above example, the link caption would be issue 123.
You can also use the usual “explicit title” syntax supported by other roles that generate links, i.e. :issue:`this issue <123>`. In this case, the prefix is not relevant.
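Applied to the question's base-URL problem, a minimal conf.py sketch (the role name 'service' and the URL are illustrative; note that newer Sphinx versions require the caption to contain %s or be None):

extensions = ['sphinx.ext.extlinks']

# :service:`path` renders as a link to http://localhost:8080/path;
# with a caption of None, the full URL is used as the link text.
extlinks = {
    'service': ('http://localhost:8080/%s', None),
}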
I had a similar problem where I also needed to substitute URLs in image targets.
The extlinks roles do not expand when used as the value of an image's :target: attribute.
Eventually I wrote a custom Sphinx transform that rewrites URLs starting with a given prefix, in my case http://mybase/. Here is the relevant code for conf.py:
from sphinx.transforms import SphinxTransform

class ReplaceMyBase(SphinxTransform):
    default_priority = 750
    prefix = 'http://mybase/'

    def apply(self):
        from docutils.nodes import reference, Text
        baseref = lambda o: (
            isinstance(o, reference) and
            o.get('refuri', '').startswith(self.prefix))
        basetext = lambda o: (
            isinstance(o, Text) and o.startswith(self.prefix))
        base = self.config.mybase.rstrip('/') + '/'
        for node in self.document.traverse(baseref):
            target = node['refuri'].replace(self.prefix, base, 1)
            node.replace_attr('refuri', target)
            for t in node.traverse(basetext):
                t1 = Text(t.replace(self.prefix, base, 1), t.rawsource)
                t.parent.replace(t, t1)
        return
# end of class

def setup(app):
    app.add_config_value('mybase', 'https://en.wikipedia.org/wiki', 'env')
    app.add_transform(ReplaceMyBase)
    return
This expands the following rst source to point to the English Wikipedia. When conf.py sets mybase="https://es.wikipedia.org/wiki", the links point to the Spanish wiki instead.
* inline link http://mybase/Helianthus
* `link with text <http://mybase/Helianthus>`_
* `link with separate definition`_
* image link |flowerimage|

.. _link with separate definition: http://mybase/Helianthus

.. |flowerimage| image:: https://upload.wikimedia.org/wikipedia/commons/f/f1/Tournesol.png
   :target: http://mybase/Helianthus
Ok, here's how I did it. First, apilinks.py (the Sphinx extension):
from docutils import nodes, utils

def setup(app):
    def api_link_role(role, rawtext, text, lineno, inliner, options={},
                      content=[]):
        ref = app.config.apilinks_base + text
        node = nodes.reference(rawtext, utils.unescape(ref), refuri=ref,
                               **options)
        return [node], []

    app.add_config_value('apilinks_base', 'http://localhost/', False)
    app.add_role('apilink', api_link_role)
Now, in conf.py, add 'apilinks' to the extensions list and set an appropriate value for 'apilinks_base' (otherwise, it will default to 'http://localhost/'). My file looks like this:
extensions = ['sphinx.ext.autodoc', 'apilinks']
# lots of other stuff
apilinks_base = 'http://host:88/base/'
Usage:
:apilink:`path`
Output:
http://host:88/base/path
You can write a Sphinx extension that creates a role like
:apilink:`path`
and generates the link from that. I never did this, so I can't help more than giving this pointer, sorry. You should try to look at how the various roles are implemented. Many are very similar to what you need, I think.
