GitLab API: pipeline not returning all jobs - gitlab-ci

I'm using the GitLab API to list the jobs in a pipeline. It has always been fine in the past, but I've added a couple of extra items to the flow and now it doesn't return all of the jobs:
$ curl --globoff -sSH "$CURL_HEADER" https://.../api/v4/projects/$CI_PROJECT_ID/pipelines/$PIPEID/jobs?scope[]=success | jq --raw-output '.[] | "\(.id)"' | wc -l
20
The jobs that are missing aren't retries (as noted here).
I can see the missing jobids in the web interface.
Is there a maximum of 20 jobs via this method?

So it turns out this API response is paginated; there's no indication of this in the docs for this endpoint.
There is a general item describing pagination here, but it doesn't give a list of the routes it applies to. If it did, it would probably show up in a search far more easily.
All I needed to do was append &per_page=100 (quoting the URL for the & in my use case). Alternatively, you can check the response headers for the X-Next-Page value and then append &page=X to fetch the subsequent pages...
The related pagination headers are:
x-next-page: 2
x-page: 1
x-per-page: 20
x-prev-page:
x-total: 23
x-total-pages: 2
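For reference, a minimal sketch of walking all of the pages using the x-next-page header (assuming the same $CURL_HEADER, $CI_PROJECT_ID and $PIPEID variables as above; -D dumps the response headers to a file so the next page number can be read back):
page=1
while [ -n "$page" ]; do
  curl --globoff -sS -D headers.txt -H "$CURL_HEADER" \
    "https://.../api/v4/projects/$CI_PROJECT_ID/pipelines/$PIPEID/jobs?scope[]=success&per_page=100&page=$page" \
    | jq --raw-output '.[] | "\(.id)"'
  # x-next-page is empty on the last page, which ends the loop
  page=$(grep -i '^x-next-page:' headers.txt | tr -d ' \r' | cut -d: -f2)
done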

Related

Gitlab API curl request

When I'm trying to use a GitLab API GET request to get all commits from a specific date range and a specific branch, I only receive commits starting from the day AFTER the since date.
I mean, if I define the range, for example, from 2022-12-01T12:17:30.000+02:00 until 2022-12-15T15:01:36.000+01:00, the commits from the curl request only start from 2 Dec 2022.
How do I include the initial date in the response?
curl -s --header "PRIVATE-TOKEN: <token>" "https://gitlab.example.com/api/v4/projects/ID/repository/commits?ref_name=${branch}&since=${since_date}&until=${until_date}" | jq -r '.[] | .committed_date + "\t" + .title'
The response I receive:
2022-12-15T15:01:36.000+01:00
2022-12-15T14:39:44.000+02:00
2022-12-14T08:26:43.000+02:00
2022-12-13T20:55:03.000+02:00
2022-12-13T15:51:34.000+01:00
2022-12-13T15:43:26.000+01:00
2022-12-12T16:50:49.000+01:00
2022-12-07T16:38:26.000+01:00
2022-12-05T22:41:04.000+01:00
2022-12-02T09:23:58.000+01:00
By the way, I also tried adding ?first_parent=true, but it didn't help.
It could be a pagination problem. Try adding &per_page=100 to see if there are more commits. I think you can't get more than 100 items in one API call.
See https://docs.gitlab.com/ee/api/#pagination for more info
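As a sketch, with the request from the question that would just be the same call with the extra parameter appended (same placeholder token, project ID, and variables as above):
curl -s --header "PRIVATE-TOKEN: <token>" "https://gitlab.example.com/api/v4/projects/ID/repository/commits?ref_name=${branch}&since=${since_date}&until=${until_date}&per_page=100" | jq -r '.[] | .committed_date + "\t" + .title'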

GitLab API - get the overall # of lines of code

I'm able to get the stats (additions, deletions, total) for each commit; however, how can I get the overall #?
For example, if one MR has 30 commits, I need the net # of lines of code added/deleted, which you can see in the top corner.
This # IS NOT the sum of all #'s per commit.
So, I would need an API that returns the net # of lines of code added/removed at the MR level (no matter how many commits there are).
For example, if I have 2 commits: the 1st one adds 10 lines, and the 2nd one removes the exact same 10 lines, then the net # is 0.
Here is the scenario:
I have an MR with 30 commits.
The GitLab API provides support to get the stats (lines of code added/deleted) per commit (individually).
If I go into the GitLab UI, to the MR's Changes tab, I see a # of lines added/deleted that is not the SUM of all the commit stats that I'm getting through the API.
That's my issue.
A simpler example: let's say I have 2 commits, one adds 10 lines of code, while the 2nd commit removes the exact same 10 lines of code. Using the API, I'm getting the sum, which is 20 LOC. However, if I go to the GitLab UI Changes tab, it shows me 0 (zero), which is correct; that's the net # of changes overall. This is the inconsistency I noticed.
To do this for an MR, you would use the MR changes API and count the occurrences of lines starting with + and - in the changes[].diff fields to get the additions and deletions respectively.
Using bash with gitlab-org/gitlab-runner!3195 as an example:
# The merge request "changes" endpoint; this is a public project, so no token is needed here
GITLAB_HOST="https://gitlab.com"
PROJECT_ID="250833"
MR_ID="3195"
URL="${GITLAB_HOST}/api/v4/projects/${PROJECT_ID}/merge_requests/${MR_ID}/changes"
# Concatenate the diff of every changed file in the MR
DIFF=$(curl -s "${URL}" | jq -r ".changes[].diff")
# Count the lines added (+) and removed (-) across the whole MR
NUM_ADDITIONS=$(grep -cE "^\+" <<< "$DIFF")
NUM_DELETIONS=$(grep -cE "^-" <<< "$DIFF")
echo "${MR_ID} has ${NUM_ADDITIONS} additions and ${NUM_DELETIONS} deletions"
The output is
3195 has 9 additions and 2 deletions
This matches the UI, which also shows 9 additions and 2 deletions
This, as you can see, is a representative example of your described scenario, since the combined totals of the individual commits in this MR are 13 additions and 6 deletions.

Is there a way to get a nice error report summary when running many jobs on DRMAA cluster?

I need to run a snakemake pipeline on a DRMAA cluster with a total of >2000 jobs. When some of the jobs have failed, I would like to receive, at the end, an easily readable summary report that lists only the failed jobs instead of the whole job summary given in the log.
Is there a way to achieve this without parsing the log file myself?
These are the (incomplete) cluster options:
jobs: 200
latency-wait: 5
keep-going: True
rerun-incomplete: True
restart-times: 2
I am not sure if there is another way than parsing the log file yourself, but I've done it several times with grep and I am happy with the results:
cat .snakemake/log/[TIME].snakemake.log | grep -B 3 -A 3 error
Of course, you should replace the [TIME] placeholder with whichever run you want to check.
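If you only want the failures themselves rather than the surrounding context, a small variation along these lines may work; it assumes the log contains the usual "Error in rule ..." blocks that Snakemake prints for failed jobs:
# list each failing rule and how often it failed
grep "Error in rule" .snakemake/log/[TIME].snakemake.log | sort | uniq -c
# or keep a few lines of context around each failure
grep -A 5 "Error in rule" .snakemake/log/[TIME].snakemake.log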

Splunk - Disabling alerts during maintenance window

I have a simple csv file loaded into Splunk called StandardMaintenance.csv which looks like this...
UnderMaintenance
NO
We currently get bombarded with alerts during our maintenance window. At the start of maintenance, I want to be able to change this to YES to stop the alerts (I have an easy way to do this). I am looking for something standard to add to all alert queries that checks this csv for the status (a lookup, as I understand it) and makes the query return nothing if UnderMaintenance = YES, thus not generating a match for the alert.
It is basically a binary, ON or OFF. I would appreciate any help you could provide.
NOTE:
You cannot disable an alert by executing a Splunk query, because the REST API requires a POST action.
Step 1: Maintain a csv file of all your saved searches with their owners by using the query below. You can schedule the query as is convenient for you. For example, the search below creates maintenance.csv and replaces all of its contents whenever it is executed.
| rest /servicesNS/-/search/saved/searches | table title eai:acl.owner | outputlookup maintenance.csv
This file would get created in $SPLUNK_HOME/etc/apps/<app name>/lookups
Step 2: Write a script that reads the data from the maintenance.csv file and executes the command below to disable each search. (Run it before the maintenance window.)
curl -X POST -k -u admin:pass https://<splunk server>:8089/servicesNS/<owner>/search/saved/searches/<search title>/disable
Step 3: Do the same thing to enable all searches; just change the command to the one below. (Run it after the maintenance window.)
curl -X POST -k -u admin:pass https://<splunk server>:8089/servicesNS/<owner>/search/saved/searches/<search title>/enable
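Going back to Step 2, a minimal sketch of such a script, assuming maintenance.csv has the two columns title and eai:acl.owner produced by the outputlookup above, and that the search titles contain no commas, quotes, or characters that need URL encoding:
MAINT_CSV="$SPLUNK_HOME/etc/apps/<app name>/lookups/maintenance.csv"
SPLUNK_API="https://<splunk server>:8089"
# skip the CSV header row, then disable each saved search under its owner's namespace
tail -n +2 "$MAINT_CSV" | while IFS=, read -r title owner; do
  curl -X POST -k -u admin:pass "${SPLUNK_API}/servicesNS/${owner}/search/saved/searches/${title}/disable"
done
Swapping disable for enable at the end of the URL gives the Step 3 version.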
EDIT 1:
Create StandardMaintenance.csv file under $SPLUNK_HOME/etc/apps/search/lookups.
The StandardMaintenance.csv file contains :
UnderMaintenance
"No"
Use the search query below to get results of existing saved searches only if UnderMaintenance = No:
| rest /servicesNS/-/search/saved/searches
| eval UnderMaintenance = "No"
| table title eai:acl.owner UnderMaintenance
| join UnderMaintenance
[| inputlookup StandardMaintenance.csv ]
| table title eai:acl.owner
Hope this helps !
Before each query create a variable (say it's called foo) that you set to true if maintenance is NO and that you do not set otherwise, as below:
... | eval foo=case(maintenance=="NO","true")
Then you put the below at the end of your query:
| eval foo=$foo$
This will make your query execute only if maintenance is NO

How to get information on latest successful pod deployment in OpenShift 3.6

I am currently working on a CI/CD script to deploy a complex environment into another environment. We have multiple technologies involved, and I currently want to optimize this script because it takes too much time to fetch information on each environment.
In the OpenShift 3.6 section, I need to get the last successful deployment for each application in a specific project. I've tried to find a quick way to do so, but so far the only solution I've found is this:
oc rollout history dc -n <Project_name>
This will give me the following output
deploymentconfigs "<Application_name>"
REVISION STATUS CAUSE
1 Complete config change
2 Complete config change
3 Failed manual change
4 Running config change
deploymentconfigs "<Application_name2>"
REVISION STATUS CAUSE
18 Complete config change
19 Complete config change
20 Complete manual change
21 Failed config change
....
I then take this output and parse each line to find the latest revision that has the status "Complete".
In the above example, I would get this list:
<Application_name> : 2
<Application_name2> : 20
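For reference, that parsing step can be written as a small awk sketch like the one below (assuming the output format shown above, with revisions listed in ascending order):
# remember the current DC name, keep the highest "Complete" revision seen for it
oc rollout history dc -n <Project_name> | awk '
  /^deploymentconfigs/ { app = $2 }
  $2 == "Complete"     { latest[app] = $1 }
  END { for (a in latest) print a " : " latest[a] }'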
Then for each application and each revision I do :
oc rollout history dc/<Application_name> -n <Project_name> --revision=<Latest_Revision>
In the above example, the Latest_Revision for Application_name is 2, which is the latest complete revision that is neither building nor failed.
This will give me the output with the information I need which is the version of the ear and the version of the configuration that was used in the creation of the image use for this successful deployment.
But since I have multiple applications, this process can take up to 2 minutes per environment.
Would anybody have a better way of fetching the information I need?
Unless I am mistaken, it looks like there is no one-liner that can get this information for the currently running and accessible application.
Thanks
Assuming that the currently active deployment is the latest successful one, you may try the following:
oc get dc -a --no-headers | awk '{print "oc rollout history dc "$1" --revision="$2}' | . /dev/stdin
It gets a list of deployments, feeds it to awk to extract the name ($1) and revision ($2), then compiles your command to extract the details, and finally sends it to standard input to execute. It may be frowned upon for not using xargs or the like, but I found it easier for debugging (just drop the last part to see the commands printed out).
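If you do prefer the xargs form, an equivalent sketch (assuming the same two-column name/revision output from oc get dc) would be:
oc get dc -a --no-headers | awk '{print $1, "--revision=" $2}' | xargs -L1 oc rollout history dc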
UPDATE:
On second thoughts, you might actually like this one better:
oc get dc -a -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.spec.template.spec.containers[0].env}{"\n\t"}{.spec.template.spec.containers[0].image}{"\n-------\n"}{end}'
The example output:
daily-checks
[map[name:SQL_QUERIES_DIR value:daily-checks/]]
docker-registry.default.svc:5000/ptrk-testing/daily-checks#sha256:b299434622b5f9e9958ae753b7211f1928318e57848e992bbf33a6e9ee0f6d94
-------
jboss-webserver31-tomcat
registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat7-openshift#sha256:b5fac47d43939b82ce1e7ef864a7c2ee79db7920df5764b631f2783c4b73f044
-------
jtask
172.30.31.183:5000/ptrk-testing/app-txeq:build
-------
lifebicycle
docker-registry.default.svc:5000/ptrk-testing/lifebicycle#sha256:a93cfaf9efd9b806b0d4d3f0c087b369a9963ea05404c2c7445cc01f07344a35
You get the idea: with expressions like .spec.template.spec.containers[0].env you can reach specific variables, labels, etc. Unfortunately, the jsonpath output is not available with oc rollout history.
UPDATE 2:
You could also use post-deployment hooks to collect the data, if you can set up a listener for the hooks. Hopefully the information you need is inherited by the pods. More info here: https://docs.openshift.com/container-platform/3.10/dev_guide/deployments/deployment_strategies.html#lifecycle-hooks