I'm automating file changes in GitHub using pygit2. Sometimes the files change in GitHub while I am processing a repo, so I want to pull() before I push().
Since this is automated, I would like to avoid conflicts by having either my local changes always override the remote, or vice versa. This seems like a very simple scenario, but after hours of scouring the internet for examples I have found zero examples of someone doing this. The pygit2 source itself has some examples that get close, but the "handle conflicts" portion is just a "TODO" comment.
It looks like pygit2 should support it, but none of the APIs seem to do this.
For example,
Repository.merge_commits(ours, theirs, favor='normal', flags={}, file_flags={})
When I set favor="theirs" or favor="ours" and purposely force a conflict I still get conflicts.
I tried this:
ancestor_id = repo.merge_base(repo.head.target, remote_master_id)
repo.merge_trees(ancestor_id, repo.head, remote_master_id, favor="theirs")
No conflict now, but I somehow end up with the repo in a state where both sets of changes (ours and theirs) appear in the commit history, yet the file itself contains neither change.
I'm just guessing here, since I have no clue what merge_trees does (beyond "merging trees"), and I've been experimenting with values of ancestor_id.
Is there a way to get pygit2 to do what I want?
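From reading the API docs, my working theory is that merge_trees only gives back an in-memory index, and that I am supposed to write that index out as a tree and commit it with both parents myself, which would explain the odd history I am seeing. A sketch of what I mean (untested; remote_master_id is assumed to come from an earlier fetch of origin/master):

import pygit2

remote_master_id = repo.lookup_reference('refs/remotes/origin/master').target
ancestor_id = repo.merge_base(repo.head.target, remote_master_id)

# merge in memory, letting "theirs" win on conflicting hunks
merged_index = repo.merge_trees(ancestor_id, repo.head.target,
                                remote_master_id, favor='theirs')

# write the merged index out as a tree and commit it with both parents,
# so the branch actually contains the merged content
tree_id = merged_index.write_tree(repo)
sig = repo.default_signature
repo.create_commit('refs/heads/master', sig, sig,
                   'Merge origin/master, favoring their changes',
                   tree_id, [repo.head.target, remote_master_id])

# force the work tree to match the new commit before pushing
repo.checkout('refs/heads/master', strategy=pygit2.GIT_CHECKOUT_FORCE)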
Related
Similar questions have been raised many times, but I was not able to find a solution to my specific problem.
I was playing around with setuptools_scm recently and at first thought it was exactly what I needed. I have it configured like this:
pyproject.toml
[build-system]
requires = ["setuptools_scm"]
build-backend = "setuptools.build_meta"
[project]
...
dynamic = ["version"]
[tool.setuptools_scm]
write_to = "src/hello_python/_version.py"
version_scheme = "python-simplified-semver"
and my __init__.py
from ._version import __version__
from ._version import __version_tuple__
Relevant features it covers for me:
I can use semantic versioning
it is able to use *.*.*.devN version strings
it increments minor version in case of feature-branches
it increments patch/micro version in case of fix-branches
This is all cool. As long as I am on my feature-branch I am able to get the correct version strings.
What I like particularly is, that the dev version string contains the commit hash and is thus unique across multiple branches.
My workflow now looks like this:
create feature or fix branch
commit, (push, ) publish
merge PR to develop-branch
As soon as I am on my feature-branch, I am able to run python -m build, which generates a new _version.py with the correct version string according to the latest git tag found. If I add new commits, that is fine, as the devN part of the version string changes due to the commit hash. I could even run python -m twine upload dist/* at this point: my package is built with the correct version, so I simply publish it. This works perfectly fine locally and on CI, for both fix and feature branches alike.
The problem I am facing now is that I need slightly different behavior for my merged pull requests.
As soon as I merge, e.g. 0.0.1.dev####, I want to run my Jenkins job not on the feature-branch anymore, but on the develop-branch instead. And the important part now is, I want to:
get develop-branch (done by CI)
update version string to same as on branch but without devN, so: 0.0.1
build and publish
In fact, setuptools_scm now changes the version to 0.0.2.dev###, whereas I would like to have 0.0.1.
I was tinkering a bit with creating git tags before running setuptools_scm or build, but I was not able to derive the correct version string to put into the tag. This is where I am stuck.
Is anyone aware of a solution that would give me:
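The closest I got was asking setuptools_scm itself for the next version and then stripping the devN suffix to build the tag name; a rough sketch (assuming setuptools_scm's get_version helper):

from setuptools_scm import get_version

# e.g. "0.0.1.dev4+g1a2b3c4" on the feature-branch
raw = get_version(version_scheme="python-simplified-semver")

# drop everything from ".dev" onwards to get a plain release version
release = raw.split(".dev")[0]  # -> "0.0.1"
print(release)  # intended to become the tag, e.g. v0.0.1

But on the develop-branch this already yields the incremented version (0.0.2 rather than 0.0.1), which is exactly the behavior I am trying to get rid of.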
minor increment on feature-branches + add .devN
patch/micro increment on fix-branches + add .devN
no increment on develop-branch, with the version string containing only the major.minor.patch of the merged branch
TL;DR: disabling writing of the version number to a file every time setuptools_scm runs could solve your problem; alternatively, add the version file to .gitignore.
Explanation:
I also just started using setuptools_scm, so I am not very confident with it yet.
But as far as I understand it, the version number is derived from the state of your repository (the detailed logic is documented here: https://github.com/pypa/setuptools_scm/#default-versioning-scheme).
If I am not mistaken, the tool does exactly what it is expected to: it does NOT derive the version from the tag alone, but also adds a devN suffix, because in your case the tag you set does not reference the most recent commit on the develop branch head.
I also had the problem that letting setuptools_scm generate a version while also configuring it to write that version to a file changes the repository state again after the last commit, which in turn produces a dev version number.
To get a "clean" version number (e.g. v0.0.1), I had to do the tagging after merging (with a merge commit), since the merge commit is also taken into account by the version numbering logic.
Still, my setup is currently less complex than yours: just feature and fix branches and a main branch without develop, so fewer merge commits (I chose to use merge commits, so no linear history). Now, after merging with a commit, I create a tag manually and choose its name myself.
And this only works for me if I opt out of writing the version number into a file, which I have done by inserting the following into pyproject.toml:
[tool.setuptools_scm]
# intentionally left empty: the write_to option leads to an unclean
# workspace during build, which in turn makes setuptools_scm produce
# wheels with unclean (dev) version numbers
# write_to = "version.txt"
Since setuptools_scm runs during build, a new version file is generated then as well, which pollutes your worktree. Since your worktree is never clean this way, you always get a dev version number. To still have a version file yet have it ignored during the build, add the file to your .gitignore.
My approach is not perfect and involves some manual steps, but for now it works for me.
It is certainly not 100% applicable in your CI scenario, but maybe you could change the order in which you do merges and tags. I hope this helps.
I know there are several similar questions, but none of them seem to fit what is happening with me. When I build on readthedocs, the build is successful, but certain methods do not show in the documentation even though they show locally. In addition, there are instances where neither the class nor the method I want is shown.
I don't understand what is going on or how to go about fixing it. I've made commits trying to fix it, and I don't want to continue making unnecessary commits.
Links:
Docs example 1 (class shows, method doesn't)
Docs example 2 (class and method don't show, but it's set up the same as above?)
GitHub Project
Local build screenshot (what I should be seeing with the first example link):
Check your build log on RTD for "warning" or "error".
Although I suggested that the issue could be a typo of "pint" versus "point", it turned out to be a case of needing to add point to requirements.txt.
I have a script which does the following:
Create campaign
Create AdSet (requires campaign_id)
Create AdCreative (requires adset_id)
Create Ad (requires creative_id and adset_id)
I am trying to lump all of them into a batch request. However, I realized that none of these gets created except for my campaign (step 1) when I use remote_create(batch=my_batch). This is probably due to the dependencies on the ids needed by each of the subsequent steps.
I read the documentation, and it mentions that one can specify "dependencies between operations in the request" (https://developers.facebook.com/docs/graph-api/making-multiple-requests) between calls via {result=(parent operation name):(JSONPath expression)}.
Is this possible with the python API?
Can this be achieved with the way I am using remote_creates?
Unfortunately, the Python SDK doesn't currently support this. There is a GitHub issue for it: https://github.com/facebook/facebook-python-ads-sdk/issues/256.
I have also encountered this issue and have described my workaround in the comments on the issue:
"I found a decent workaround for getting this behaviour without too much trouble. Basically, I set the id fields that have dependencies to values like "{result=(name):$.id}", and prior to calling execute() on the batch object I iterate over ._batch and add the matching 'name' entries. When I run execute, sure enough, it works perfectly. Obviously this solution has its limitations, such as when you make multiple calls to the same endpoint that need to be fed into other endpoints: you would have duplicated resource names and would need to customize the names further to string them together.
Anyways, hope this helps someone!"
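To make that concrete, here is a rough sketch of the workaround (the account id, credentials, and the exact set of required fields are placeholders; ._batch is an internal attribute and may change between SDK versions):

from facebookads.api import FacebookAdsApi
from facebookads.adobjects.campaign import Campaign
from facebookads.adobjects.adset import AdSet

FacebookAdsApi.init('<APP_ID>', '<APP_SECRET>', '<ACCESS_TOKEN>')
batch = FacebookAdsApi.get_default_api().new_batch()

campaign = Campaign(parent_id='act_<AD_ACCOUNT_ID>')
campaign[Campaign.Field.name] = 'My campaign'
campaign.remote_create(batch=batch)

adset = AdSet(parent_id='act_<AD_ACCOUNT_ID>')
adset[AdSet.Field.name] = 'My ad set'
# reference the campaign operation's id via the batch name set below
adset[AdSet.Field.campaign_id] = '{result=create_campaign:$.id}'
adset.remote_create(batch=batch)

# the workaround itself: name each queued operation so the JSONPath
# reference above can resolve (._batch holds the raw batch entries)
batch._batch[0]['name'] = 'create_campaign'

batch.execute()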
I often need to create two versions of an ipython notebook: One contains tasks to be carried out (usually including some python code and output), the other contains the same text plus solutions. Let's call them the assignment and the solution.
It is easy to generate the solution document first, then strip the answers to generate the assignment (or vice versa). But if I subsequently need to make changes (and I always do), I need to repeat the stripping process. Is there a reasonable workflow that will allow changes in the assignment to be propagated to the solutions document?
Partial self-answer: I have experimented with leveraging mercurial's hg copy, which lets two files with different names share history. But I can only get this to work if assignment and solution are in different directories, in two linked hg repositories, and I would much prefer a simpler set-up. I've also noticed that diff gets very confused when one JSON file has more sections than another, making a VCS-based solution even less attractive. (To be clear: ordinary use of a VCS with notebooks is fine; it's the parallel versions that cause trouble.)
This question covers similar ground, but does not solve my problem. In fact an answer to my question would solve the OP's second remaining problem, "pulling changes" (see the Update section).
It sounds like you are maintaining an assignment and an answer key of some kind and want to be able to distribute the assignments (without solutions) to students, and still have the answers for yourself or a TA.
For something like this, I would create two branches, "unsolved" and "solved". First write the questions on the "unsolved" branch, then create the "solved" branch from there and add the solutions. If you ever need to update a question, switch back to the "unsolved" branch, make the update, merge the change into "solved", and fix the solution.
You could try going the other way, but my hunch is that going "backwards" from solved to unsolved might be strange to maintain.
After some experimentation I concluded that it is best to tackle this by processing the notebook's JSON code. Version control systems are not the right approach, for the following reasons:
JSON doesn't diff very well when adding or deleting cells. A minimal change leads to mis-matched braces and a very messy diff.
In my use case, the superset version of the file (containing both the assignments and their solutions) must be the source document. This is because the assignment includes example code and output that depends on earlier parts, to be written by the students. This model does not play well with version control, as pointed out by @ChrisPhillips in his answer.
I ended up filtering the JSON structure for the notebook and stripping out the solution cells; they may be recognized via special metadata (which can be set interactively using the metadata button in the interface), or by pattern-matching on the cell contents. The following snippet shows how to filter out cells whose first line starts with # SOLUTION:
import json
import re

def stripcell(cell, pattern):
    """Return True if the first line of the cell's content matches `pattern`."""
    if cell["cell_type"] == "code":
        # nbformat v3 stores code cell text under "input"
        content = cell["input"]
    else:
        content = cell["source"]
    return len(content) > 0 and re.search(pattern, content[0])

pattern = r"^# SOLUTION:"
with open("input.ipynb") as f:
    struct = json.load(f)
cells = struct["worksheets"][0]["cells"]
struct["worksheets"][0]["cells"] = [c for c in cells if not stripcell(c, pattern)]
with open("output.ipynb", "w") as f:  # "w", not "wb": json.dump writes str
    json.dump(struct, f, indent=1)
I used the generic json library rather than the notebook API. If there's a better way to go about it, please let me know.
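For newer notebooks, one alternative is the nbformat package, which loads and saves the structure directly instead of hand-parsing JSON; a sketch, assuming v4 notebooks (where a cell's source is a single string) and the same # SOLUTION: convention:

import nbformat

nb = nbformat.read("input.ipynb", as_version=4)
# keep only the cells whose source does not start with the solution marker
nb.cells = [c for c in nb.cells if not c.source.startswith("# SOLUTION:")]
nbformat.write(nb, "output.ipynb")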
I understand that this question has, in essence, already been asked, but that question did not have an unequivocal answer, so please bear with me.
Background: In my company, we use Perforce submission numbers as part of our versioning. Regardless of whether this is a correct method or not, that is how things are. Currently, many developers do separate submissions for code and documentation: first the code and then the documentation to update the client-facing docs with what the new version numbers should be. I would like to streamline this process.
My thoughts are as follows: create a Perforce trigger (which runs on the server side) which scans the submitted documentation files (such as .txt) for a unique term (such as #####PERFORCE##CHANGELIST##NUMBER###ROFL###LOL###WHATEVER#####) and then replaces it with the value of what the change list would be when submitted. I already know how to determine this value. What I cannot figure out, is how or where to update the files.
I have already determined that using the change-content trigger (whether possible or not), which
"fire[s] after changelist creation and file transfer, but prior to committing the submit to the database",
is the way to go. At this point the files need to exist somewhere on the server. How do I determine the (temporary?) location of these files from within, say, a Python script, so that I can edit them or run sed to replace the placeholder value with the intended value? The online documentation for Perforce that I have found so far has not been very explicit about whether this is possible or how the mechanics of a submission work at this stage.
EDIT
Basically, what I am looking for is RCS-like functionality, but without the unsightly special character sequences that accompany it. After more digging, what I am asking is the same as this question. However, I believe this must be possible, because the trigger runs on the server side and the files have already been transferred to the server; they must therefore be accessible to the script.
EXAMPLE
Consider the following snippet from a release notes document:
[#####PERFORCE##CHANGELIST##NUMBER###ROFL###LOL###WHATEVER#####] Added a cool new feature. Early retirement is in sight.
[52702] Fixed a really annoying bug. Many lives saved.
[52686] Fixed an annoying bug.
This is what the user submits. I then want the trigger to intercept this file during the submission process (as mentioned, at the change-content stage) and alter it so that what is eventually stored within Perforce looks like this:
[52738] Added a cool new feature. Early retirement is in sight.
[52702] Fixed a really annoying bug. Many lives saved.
[52686] Fixed an annoying bug.
Where 52738 is the final changelist number of what the user submitted. (As mentioned, I can already determine this number, so please do not dwell on this point.) I.e., what the user sees on the Perforce client console is:
Changelist 52733 renamed 52738.
Submitted change 52738.
Are you trying to replace the content of pending changelist files that were edited on a different client workspace (and different user)?
What type of information are you trying to replace in the documentation files? For example, is it a date or username, as with RCS keyword expansion? http://www.perforce.com/perforce/doc.current/manuals/p4guide/appendix.filetypes.html#DB5-18921
I want to get better clarification on what you are trying to accomplish in case there is another way to do what you want.
Depending on what you are trying to do, you may want to consider shelving ( http://www.perforce.com/perforce/doc.current/manuals/p4guide/chapter.files.html#d0e5537 )
Also, there is an existing Perforce enhancement request I can add your information to, regarding client-side triggers to modify files on the client side prior to submit. If it becomes implemented, you will be notified by email.
99w,
I have also added you to an existing enhancement request for customizable RCS keywords, along with the example you provided.
Short of using a post-command trigger to edit the archive content directly and then update the checksum in the database, there is currently no way to update the file content with the custom-edited final changelist number.
One of the things I learned very early on in programming was to stay out of interrupt level as much as possible, and especially not to do things at interrupt level that require resources that can hang the system. I completely understand that you want to resolve the internal labeling in sequence, but a better way may be to set up the edit during the trigger so that a post-trigger tool can perform the file modification.
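For the detection half, a change-content trigger can at least read the staged content with p4 print and the @=change revision specifier (read-only); a rough sketch, assuming p4 is on the PATH, the trigger definition passes %change%, and the depot path is known:

import subprocess
import sys

change = sys.argv[1]  # %change% from the trigger definition

# read the file as staged in the in-flight changelist (read-only view)
content = subprocess.check_output(
    ["p4", "print", "-q", "//depot/docs/relnotes.txt@={}".format(change)],
    universal_newlines=True)

if "#####PERFORCE##CHANGELIST##NUMBER" in content:
    # hand off to a post-submit tool; the content itself cannot be
    # rewritten safely at this stage
    print("placeholder found in change {}".format(change))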
Correct me if I'm looking at this wrong, but there seems to be a bit of irony, or perhaps recursion, in trying to make a file change during the course of submitting a file change. It might be better to have a second changelist reserved for the log; you always know where that file is in your local file space. That said, ktext files and $ fields may be able to help.