How to automatically rebase all children branches onto master after squashing and merging the parent branch? - git-merge

Building on this question, I have a workflow where I'm constantly making PRs on top of PRs to make it easier for others to review my work. Goal is to have smaller PR sizes. So I often end up with situations like the following:
                  G--H--I   <-- branch3
                 /
          D--E--F   <-- branch2
         /
  A--B--C   <-- branch1
 /
M   <-- master
And so on for N branches after branch3. The problem is, after I squash and merge branch1, I have to manually rebase branches 2, 3...N:
                  G--H--I   <-- branch3
                 /
          D--E--F   <-- branch2
         /
  A--B--C
 /
M--S   <-- master, origin/master (branch1 changes are squashed in S)
In the above case, I have to run:
git checkout branch2
git rebase --onto master (SHA-1 of C)
git checkout branch3
git rebase --onto branch2 (SHA-1 of F)
And so on...
Is there a way to automate this process by rebasing all branches automatically with a script? What I can't figure out is a way to automatically detect the correct SHA-1 to pass as parameter for each rebase.

There are a couple of fundamental problems, or maybe one fundamental problem, depending on how you look at it. That is:
branches do not have parent/child relationships, and/or
branches, in the sense you mean the word, don't exist. All that we have are branch names. The branches themselves are mirages, or something. (This doesn't really seem like the right way to look at it, but it helps shake one loose from the more rigid view of branches that most non-Git systems take.)
Let's start with a question that seems straightforward, but because Git is Git, is actually a trick question: which branch holds commits A-B-C?
Is there a way to automate this process by rebasing all branches automatically with a script? What I can't figure out is a way to automatically detect the correct SHA-1 to pass as parameter for each rebase.
There isn't a general solution to this problem. If you have exactly the situation you have drawn, however, there is a specific solution to your specific situation—but you'll have to write it yourself.
The answer to the trick question is that commits A-B-C are on every branch except master. A branch name like branch3 just identifies one particular commit, in this case commit I. That commit identifies another commit, in this case, commit H. Each commit always identifies some previous commit—or, in the case of a merge commit, two or more previous commits—and Git simply works backwards from the end. "The end" is precisely that commit whose hash ID is stored in the branch name.
Branch names lack parent/child relationships because every branch name can be moved or destroyed at any time without changing the hash ID stored in each other branch. New names can be created at any time too: the only constraint on creating a new name is that you must pick some existing commit for that name to point-to.
The commits have parent/child relationships, but the names do not. This leads to the solution to this specific situation, though. If commit Y is a descendant of commit X, that means there's some backwards path where we start at Y and can work our way back to X. This relationship is ordered (mathematically speaking, it forms a partial order over the set of commits), so that if X ≺ Y (X precedes Y, i.e., X is an ancestor of Y), then Y ≻ X (Y succeeds X: Y is a descendant of X).
So we take our set of names, translate each name to a commit hash ID, and perform these is-ancestor tests. Git's "is-ancestor" operator actually tests for ≼ (precedes or is equal to), and the is-equal case occurs with:
...--X <-- name1, name2
where both names select the same commit. If that could occur we would have to analyze what our code might do with that case. It turns out that this usually doesn't require any special work at all (though I won't bother proving this).
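A minimal sketch of that ≼ test, using git merge-base --is-ancestor in a throwaway repository. The names name1, name2 and tip and the helper is_ancestor are made up for illustration:

```shell
#!/bin/sh
# Throwaway repository: X is the parent of Y; name1 and name2 both point at X.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m X
git branch name1
git branch name2
git commit -q --allow-empty -m Y
git branch tip

# is_ancestor A B (hypothetical helper): succeeds when A precedes or equals B.
is_ancestor() { git merge-base --is-ancestor "$1" "$2"; }

is_ancestor name1 tip   && strict=yes  || strict=no    # X precedes Y
is_ancestor tip name1   && reverse=yes || reverse=no   # Y does not precede X
is_ancestor name1 name2 && equal=yes   || equal=no     # same commit still passes
echo "name1<=tip: $strict, tip<=name1: $reverse, name1<=name2: $equal"
```

The third check is the is-equal case: two names on the same commit each count as an "ancestor" of the other.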
Having found the "last" commit (the one that every other commit in question comes "before"), we now need to do our rebase operation. We have:
                  G--H--I   <-- branch3
                 /
          D--E--F   <-- branch2
         /
  A--B--C
 /
M--S   <-- master, origin/master (branch1 changes are squashed in S)
just as you showed, and we know that S represents the A-B-C sequence because we picked commit C (via the name branch1) when we made S. Since the last commit is commit I, we want to copy—as rebase does—every commit from D through I, with the copies landing after S. It might be best if Git didn't move any of these branch names at all, during the copying operation, and we can get that to happen using Git's detached HEAD mode:
git checkout --detach branch3 # i.e., commit `I`
or:
git checkout <hash-of-I> # detach and get to commit `I`
or:
git switch --detach ... # `git switch` always requires the --detach
which gets us:
                  G--H--I   <-- branch3, HEAD
                 /
          D--E--F   <-- branch2
         /
  A--B--C
 /
M--S   <-- master, origin/master
We now run git rebase --onto master branch1 if the name branch1 is still available, or git rebase --onto master <hash-of-C> if not. This copies everything as desired:
                  G--H--I   <-- branch3
                 /
          D--E--F   <-- branch2
         /
  A--B--C
 /
M--S   <-- master, origin/master
    \
     D'-E'-F'
             \
              G'-H'-I'   <-- HEAD
Now all (?) we need to do is go back through those same sets of branch names and count how far they are along the chain of original commits. Because of the way Git works—backwards—we'll do this starting from wherever they end and working backwards to commit C. For this particular drawing, that's 3 for branch2 and 6 for branch3. We count how many commits we copied as well, which is also of course 6. So we subtract 3 from 6 for branch2, and 6 from 6 for branch3. That tells us where we should move those branch names now: zero steps back from I' for branch3, and three steps back from I' for branch2. So now we make one last loop through each name and re-set each name as appropriate.
(Then we probably should pick some name to git checkout or git switch to.)
There are some challenges here:
Where did we get this set of names? The names are branch1, branch2, branch3, and so on, but in reality they won't be so obviously related: why do we move branch fred but not branch barney?
How did we know that branch1 is the one that we shouldn't use here, but should use as the "don't copy this commit" argument to our git rebase-with-detached-HEAD?
How exactly do we do this is-ancestor / is-descendant test?
This question actually has an answer: git merge-base --is-ancestor is the test. You give it two commit hash IDs and it reports whether the left-hand one is an ancestor of the right-hand one: git merge-base --is-ancestor X Y tests X ≼ Y. Its result is its exit status, suitable for use in shell scripts with the if built-in.
How do we count commits?
This question also has an answer: git rev-list --count stop..start starts at the start commit and works backwards. It stops working backwards when it reaches stop or any of its ancestors. It then reports a count of the number of commits visited.
How do we move a branch name? How do we figure out which commit to land on?
This one is easy: git branch -f will let us move an existing branch name, as long as we do not have that name currently checked-out. As we are on a detached HEAD after the copying process, we have no name checked-out, so all names can be moved. Git itself can do the counting-back, using the tilde and numeric suffix syntax: HEAD~0 is commit I', HEAD~1 is commit H', HEAD~2 is commit G', HEAD~3 is commit F', and so on. Given a number $n we just write HEAD~$n, so git branch -f $name HEAD~$n does the job.
You still have to solve the first two questions. The solution to that will be specific to your particular situation.
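Putting those pieces together, here is a rough sketch, not a general tool. It builds a throwaway repository shaped like the drawing and assumes the two open questions are already answered by hand: names (the branch names to move) and old_base (the "don't copy this commit" argument) are set explicitly, and mk is a made-up helper. It also assumes the names form one linear chain and that the rebase copies every commit without conflicts or drops.

```shell
#!/bin/sh
# Sketch only: builds a throwaway repo shaped like the drawing, then rebases.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: commit one new file named after the commit letter.
mk() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

mk M
git branch -M master
git checkout -q -b branch1; mk A; mk B; mk C
git checkout -q -b branch2; mk D; mk E; mk F
git checkout -q -b branch3; mk G; mk H; mk I
git checkout -q master
git merge -q --squash branch1
git commit -q -m S

old_base=branch1          # assumed known: the commit that S replaces
names="branch2 branch3"   # assumed known: the names to move

# Find the last commit: keep whichever name the current candidate precedes.
tip=
for n in $names; do
  if [ -z "$tip" ] || git merge-base --is-ancestor "$tip" "$n"; then tip=$n; fi
done

# Record how far each name sits from the old base before anything moves.
total=$(git rev-list --count "$old_base..$tip")
counts=
for n in $names; do
  counts="$counts $n:$(git rev-list --count "$old_base..$n")"
done

# Copy the commits on a detached HEAD so no branch name moves during the rebase.
git checkout -q --detach "$tip"
git rebase -q --onto master "$old_base"

# Re-set each name: (total - count) steps back from the new tip.
for pair in $counts; do
  n=${pair%%:*}; c=${pair#*:}
  git branch -f "$n" "HEAD~$((total - c))"
done
git checkout -q branch3
```

If the rebase stops on a conflict or drops a commit, the counts no longer line up, so a real script would have to check for that before moving any name.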
Worth pointing out, and probably the reason no one has written a proper solution for this—I wrote my own approximate solution many years ago but abandoned it many years ago as well—is that this whole process breaks down if you don't have this very specific situation. Suppose that instead of:
                  G--H--I   <-- branch3
                 /
          D--E--F   <-- branch2
         /
  A--B--C   <-- branch1
 /
M   <-- master
you begin with:
               G--H--I   <-- branch3
              /
          D--E--F   <-- branch2
         /
  A--B--C   <-- branch1
 /
M   <-- master
This time, ending at commit I and copying all commits that reach back through, but do not include, commit C fails to copy commit F. There is no F' to allow you to move branch name branch2 after copying D-E-G-H-I to D'-E'-G'-H'-I'.
This problem was pretty major, back in the twenty-aughts and twenty-teens. But git rebase has been smartened up a bunch, with the newfangled -r (--rebase-merges) interactive rebase mode. It now has almost all the machinery for a multi-branch rebase to Just Work. There are a few missing pieces that are still kind of hard here, but if we can solve the first two problems—how do we know which branch names to multi-rebase in the first place—we could write a git multirebase command that would do the whole job.

Related

GitLab API - get the overall # of lines of code

I'm able to get the stats (additions, deletions, total) for each commit, however how can I get the overall #?
For example, if one MR has 30 commits, I need the net # of lines of code added\deleted which you can see in the top corner.
This # IS NOT the sum of all #'s per commit.
So, I would need an API that returns the net # of lines of code added\removed at MR level (no matter how many commits are).
For example, if I have 2 commits: 1st one adds 10 lines, and the 2nd one removes the exact same 10 lines, then the net # is 0.
Here is the scenario:
I have an MR with 30 commits.
GitLab API provides support to get the stats (lines of code added\deleted) per Commit (individually).
If I go in GitLab UI, go to the MR \ Changes, I see the # of lines added\deleted that is not the SUM of all the Commits stats that I'm getting thru API.
That's my issue.
A simpler example: let's say I have 2 commits, one adds 10 lines of code, while the 2nd commit removes the exact same 10 lines of code. Using the API, I'm getting the sum, which is 20 LOCs added. However, if I go in the GitLab UI \ Changes, it's showing me 0 (zero), which is correct; that's the net # of chgs overall. This is the inconsistency I noticed.
To do this for an MR, you would use the MR changes API and count the occurrences of lines starting with + and - in the changes[].diff fields to get the additions and deletions respectively.
Using bash with gitlab-org/gitlab-runner!3195 as an example:
GITLAB_HOST="https://gitlab.com"
PROJECT_ID="250833"
MR_ID="3195"
URL="${GITLAB_HOST}/api/v4/projects/${PROJECT_ID}/merge_requests/${MR_ID}/changes"
DIFF=$(curl -s "${URL}" | jq -r ".changes[].diff")
# grep -c prints 0 when nothing matches; wc -l on an empty here-string would print 1
NUM_ADDITIONS=$(grep -c "^+" <<< "$DIFF")
NUM_DELETIONS=$(grep -c "^-" <<< "$DIFF")
echo "${MR_ID} has ${NUM_ADDITIONS} additions and ${NUM_DELETIONS} deletions"
The output is
3195 has 9 additions and 2 deletions
This matches the UI, which also shows 9 additions and 2 deletions
As you can see, this is a representative example of the scenario you describe: the combined total of the individual commits in this MR is 13 additions and 6 deletions.
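To see the counting step on its own, here is a small offline sketch. The diff text is made up and stands in for the joined changes[].diff fields, and count_lines is a hypothetical helper. It also skips "+++ " and "--- " file header lines in case the diff text carries them (an assumption; hunk-only diffs are unaffected). One caveat: a removed line whose own content starts with "-- " would be skipped too.

```shell
#!/bin/sh
# Sample diff text standing in for the joined .changes[].diff fields (made up).
DIFF='--- a/app.txt
+++ b/app.txt
@@ -1,3 +1,4 @@
 unchanged
-old line
+new line
+another new line
 unchanged'

# count_lines SIGN: count diff lines starting with SIGN, skipping file headers.
count_lines() {
  printf '%s\n' "$DIFF" | grep -v -e '^+++ ' -e '^--- ' | grep -c "^$1"
}

NUM_ADDITIONS=$(count_lines '+')
NUM_DELETIONS=$(count_lines '-')
echo "${NUM_ADDITIONS} additions and ${NUM_DELETIONS} deletions"
```

grep -c reports 0 when nothing matches, which keeps the totals right for an MR that only adds or only removes lines.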

Interactive rebase in main repo that affects submodule hashes

I have one main repo that contains multiple submodules that each have active development.
main_proj/
- sub1/
- sub2/
- sub3/
It is not uncommon for development in the main project to be performed in lock-step with one or more of the submodules.
1234abc main_proj commit 1
2345bcd main_proj commit 2
fedc987 submodule update1 for sub1 <-- only updates submodule hash
3456cde main_proj commit 3
edcb876 submodule update2 for sub1 <-- only updates submodule hash
aaaa001 submodule update1 for sub2 <-- updates main_proj AND submodule hash at same time
4567def main_proj commit 4
5678ef0 main_proj commit 5
cccc999 submodule update for sub1, sub2 and sub3 <-- updates submodule hashes
6789f01 main_proj commit 6
I want to run an interactive rebase on main_proj and rearrange the last two or three commits. Some of those commits, however, update the submodule hashes and therefore cause merge conflicts:
Auto-merging sub1
CONFLICT (submodule): Merge conflict in sub1
error: could not apply 8792e483... Take 1
Resolve all conflicts manually, mark them as resolved with "git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 8792e483... Take 1
Since the interactive rebase was intended to only work on files under main_proj, I will just re-apply the "submodule hash" commits once the work is done, so my ultimate commit will be pointing at all the correct submodule HEADs.
Is there a command during interactive rebase that I need to use in order to allow the commit to continue with "dirty" changes to the submodules?

SLURM releasing resources using scontrol update results in unknown endtime

I have a program that will dynamically release resources during job execution, using the command:
scontrol update JobId=$SLURM_JOB_ID NodeList=${remaininghosts}
However, this sometimes results in some very weird behavior, where the job is re-queued. Below is the output of sacct:
sacct -j 1448590
JobID      NNodes  State      Start     End       NodeList
1448590    4       RESIZING   20:47:28  01:04:22  [0812,0827],[0663-0664]
1448590.0  4       COMPLETED  20:47:30  20:47:30  [0812,0827],[0663-0664]
1448590.1  4       RESIZING   20:47:30  01:04:22  [0812,0827],[0663-0664]
1448590    3       RESIZING   01:04:22  01:06:42  [0812,0827],0663
1448590    2       RESIZING   01:06:42  1:12:42   0827,tnxt-0663
1448590    4       COMPLETED  05:33:15  Unknown   0805-0807,0809]
The first lines show everything works fine, nodes are getting released but in the last line, it shows a completely different set of nodes with an unknown end time. The slurm logs show the job got requeued:
requeue JobID=1448590 State=0x8000 NodeCnt=1 due to node failure.
I suspect this might happen because the head node is killed, but the slurm documentation doesn't say anything about that.
Does anybody have an idea or suggestion?
Thanks
In this post there was a discussion about resizing jobs.
In your particular case, for shrinking I would use the following, assuming that j1 has been submitted with:
$ salloc -N4 bash
Update j1 to the new size:
$ scontrol update jobid=$SLURM_JOBID NumNodes=2
$ scontrol update jobid=$SLURM_JOBID NumNodes=ALL
And update the environmental variables of j1 (the script is created by the previous commands):
$ ./slurm_job_${SLURM_JOBID}_resize.sh
Now, j1 has 2 nodes.
In your example, your "remaininghost" list, as you say, may exclude the head node that is needed by Slurm to shrink the job. If you provide a quantity instead of a list, the resize should work.

How to get information on latest successful pod deployment in OpenShift 3.6

I am currently working on a CI/CD script to deploy a complex environment into another environment. We have multiple technologies involved, and I currently want to optimize this script because it's taking too much time to fetch information on each environment.
In the OpenShift 3.6 section, I need to get the last successful deployment for each application in a specific project. I tried to find a quick way to do so, but so far I have only found this solution:
oc rollout history dc -n <Project_name>
This will give me the following output
deploymentconfigs "<Application_name>"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "<Application_name2>"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change
....
I then take this output and parse each line to find the latest revision that has the status "Complete".
In the above example, I would get this list :
<Application_name> : 2
<Application_name2> : 20
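A minimal sketch of that parsing step in awk, run against the sample output above rather than a live cluster. The names app1 and app2 stand in for the placeholders, and the column layout is assumed to match the sample exactly:

```shell
#!/bin/sh
# Sample `oc rollout history dc` output, copied from above with placeholder names.
history_output='deploymentconfigs "app1"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "app2"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change'

# Remember the current application name, then keep its highest Complete revision.
latest=$(printf '%s\n' "$history_output" | awk '
  $1 == "deploymentconfigs" { gsub(/"/, "", $2); app = $2; next }
  $2 == "Complete" && $1 + 0 > best[app] { best[app] = $1 + 0 }
  END { for (a in best) print a " : " best[a] }
' | sort)
echo "$latest"
```

This still needs one oc call for the history itself, so it only tidies the parsing; it does not remove the per-application follow-up calls.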
Then for each application and each revision I do :
oc rollout history dc/<Application_name> -n <Project_name> --revision=<Latest_Revision>
In the above example the Latest_Revision for Application_name is 2 which is the latest complete revision not building and not failed.
This will give me the output with the information I need, which is the version of the ear and the version of the configuration that was used in the creation of the image used for this successful deployment.
But since I have multiple applications, this process can take up to 2 minutes per environment.
Would anybody have a better way of fetching the information I require?
Unless I am mistaken, it looks like there is no "one liner" that can get the information on the currently running and accessible application.
Thanks
Assuming that the currently active deployment is the latest successful one, you may try the following:
oc get dc -a --no-headers | awk '{print "oc rollout history dc "$1" --revision="$2}' | . /dev/stdin
It gets a list of deployments, feeds it to awk to extract the name $1 and revision $2, then compiles your command to extract the details, finally sends it to standard input to execute. It may be frowned upon for not using xargs or the like, but I found it easier for debugging (just drop the last part and see the commands printed out).
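The generate, inspect, then execute pattern can be tried without a cluster. In this sketch oc is replaced by a stub function that only echoes its arguments, and the dc_list rows are made up (name first, revision second, as the awk script above assumes):

```shell
#!/bin/sh
# Stand-in for `oc` so the sketch runs anywhere; it only echoes its arguments.
oc() { echo "oc $*"; }

# Made-up `oc get dc --no-headers` rows: name, revision, then other columns.
dc_list='app1 2 1 1 config
app2 20 1 1 config'

# Step 1: build the commands and print them, to check them before running.
commands=$(printf '%s\n' "$dc_list" |
  awk '{print "oc rollout history dc " $1 " --revision=" $2}')
echo "$commands"

# Step 2: feed the same text to the shell on standard input to execute it.
results=$(printf '%s\n' "$commands" | . /dev/stdin)
echo "$results"
```

Because the stub just echoes, step 2 prints the same lines as step 1; with the real oc it would print the rollout details instead.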
UPDATE:
On second thoughts, you might actually like this one better:
oc get dc -a -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.spec.template.spec.containers[0].env}{"\n\t"}{.spec.template.spec.containers[0].image}{"\n-------\n"}{end}'
The example output:
daily-checks
[map[name:SQL_QUERIES_DIR value:daily-checks/]]
docker-registry.default.svc:5000/ptrk-testing/daily-checks@sha256:b299434622b5f9e9958ae753b7211f1928318e57848e992bbf33a6e9ee0f6d94
-------
jboss-webserver31-tomcat
registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat7-openshift@sha256:b5fac47d43939b82ce1e7ef864a7c2ee79db7920df5764b631f2783c4b73f044
-------
jtask
172.30.31.183:5000/ptrk-testing/app-txeq:build
-------
lifebicycle
docker-registry.default.svc:5000/ptrk-testing/lifebicycle@sha256:a93cfaf9efd9b806b0d4d3f0c087b369a9963ea05404c2c7445cc01f07344a35
You get the idea, with expressions like .spec.template.spec.containers[0].env you can reach for specific variables, labels, etc. Unfortunately the jsonpath output is not available with oc rollout history.
UPDATE 2:
You could also use post-deployment hooks to collect the data, if you can set up a listener for the hooks. Hopefully the information you need is inherited by the PODs. More info here: https://docs.openshift.com/container-platform/3.10/dev_guide/deployments/deployment_strategies.html#lifecycle-hooks

Re-structuring CVS repositories and retaining history

Currently, we have the following in our CVS Repository :
Module1
|
|
+-----A
|
+-----B
We want to restructure this module so that the subdirectories A and B appear as top-level modules. What I could do is check Module1 out, pull A and B out, and then do a fresh cvs add for A and B individually, thus making them new CVS modules. But I am sure that if I do this I will lose the history, and I would also have to remove all the internal CVS folders under A and B.
Q1: So is there a way to restructure this and retain the history?
What I essentially am trying to do is to filter out access between A and B.
So -
Q2: Is there a way to set up security so that certain users can check out Module1/A only and not Module1/B ? and vice-versa?
Q1: So is there a way to restructure this and retain the history?
Like you wrote in your comment, if you have sys privs you can mv modules around the repository and keep the history of all the files below A and B but in doing so, you lose the history that /A used to be Module1/A and /B used to be in Module1/B (not to mention build scripts probably break now). Subversion resolves this for you by offering the move (or rename) command which remembers the move/rename history of a module.
Q2: Is there a way to set up security so that certain users can check out Module1/A only and not Module1/B ? and vice-versa?
There sure is: use group permissions. From this page,
http://www.linux.ie/articles/tutorials/managingaccesswithcvs.php
Here's the snip I'm referring to in case that page ever goes away
To every module its group
We have seen earlier how creating a cvsusers group helped with the coordination of the work of several developers. We can extend this approach to permit directory level check-in restrictions.
In our example, let's say that the module "cujo" is to be r/w for jack and john, and the module "carrie" is r/w for john and jill. We will create two groups, g_cujo and g_carrie, and add the appropriate users to each - in /etc/group we add
g_cujo:x:3200:john,jack
g_carrie:x:3201:john,jill
Now in the repository (as root), run
find $CVSROOT/cujo -exec chgrp g_cujo {} \;
find $CVSROOT/carrie -exec chgrp g_carrie {} \;
ensuring, as before, that all directories have the gid bit set.
Now if we have a look in the repository...
john@bolsh:~/cvs$ ls -l
total 16
drwxrwsr-x 3 john cvsadmin 4096 Dec 28 19:42 CVSROOT
drwxrwsr-x 2 john g_carrie 4096 Dec 28 19:35 carrie
drwxrwsr-x 2 john g_cujo 4096 Dec 28 19:40 cujo
and if Jack tries to commit a change to carrie...
jack@bolsh:~/carrie$ cvs update
cvs server: Updating .
M test
jack@bolsh:~/carrie$ cvs commit -m "test"
cvs commit: Examining .
Checking in test;
/home/john/cvs/carrie/test,v <-- test
new revision: 1.2; previous revision: 1.1
cvs [server aborted]: could not open lock file
`/home/john/cvs/carrie/,test,': Permission denied
jack@bolsh:~/carrie$
But in cujo, there is no problem.
jack@bolsh:~/cujo$ cvs update
cvs server: Updating .
M test
jack@bolsh:~/cujo$ cvs commit -m "Updating test"
cvs commit: Examining .
Checking in test;
/home/john/cvs/cujo/test,v <-- test
new revision: 1.2; previous revision: 1.1
done
jack@bolsh:~/cujo$
The procedure for adding a user is now a little more complicated than it might be. To create a new CVS user, we have to create a system user, add them to the groups corresponding to the modules they may write to, and (if you're using a pserver method) generate a password for them, and add an entry to CVSROOT/passwd.
To add a project, we need to create a group, import the sources, change the groups on all the files in the repository and make sure the set gid on execution bit is set on all directories inside the module, and add the relevant users to the group.
There is undoubtedly more administration needed to do all this than when we jab people with a pointy stick. In that method, we never have to add a system user or a group or change the groups on directories - all that is taken care of once we set up the repository. This means that an unprivileged user can be the CVS admin without ever having root privileges on the machine.
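The snippet mentions the "gid bit" on directories without showing the command. A minimal sketch of that step follows, run on a throwaway directory rather than a real repository. CVSROOT_DEMO and the cujo/src layout are made up, and the demo reuses the caller's own group instead of g_cujo so it runs without root:

```shell
#!/bin/sh
# Throwaway stand-in for $CVSROOT with one module directory tree (paths made up).
CVSROOT_DEMO=$(mktemp -d)
mkdir -p "$CVSROOT_DEMO/cujo/src"

# Hand the module to a group. The tutorial uses g_cujo; this demo reuses the
# caller's own primary group (numeric gid) so it needs no root access.
group=$(id -g)
chgrp -R "$group" "$CVSROOT_DEMO/cujo"

# Set the setgid bit on every directory so new files inherit the module's group.
find "$CVSROOT_DEMO/cujo" -type d -exec chmod g+s {} +

ls -ld "$CVSROOT_DEMO/cujo" "$CVSROOT_DEMO/cujo/src"
```

In the ls output the group execute column should show "s", as in the drwxrwsr-x listings quoted above.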