Access Console of completed/failed tasks in Tekton

I am able to log into a taskrun pod as long as the task is still executing, by running:
kubectl exec -it $POD_NAME -- /bin/bash
However, if a task has failed or completed, I am unable to log in with kubectl exec, since it states that you cannot log into a completed task.
If I need to debug a failed task, is there any way to attach to the console of a failed/completed task in Tekton?
I am running in a minikube environment.

Tekton Tasks run as Pods. When they complete or fail, that pod's containers exit, which leaves you unable to exec in.
For troubleshooting, you may edit your Task to catch the error and start some "sleep" command, which might help you figure it out.
Or, to avoid impacting other jobs, I would usually prefer to re-create the Pod corresponding to my failed task:
$ kubectl get pods -n <ci-namespace> | grep <taskrun-name>
NAME
<taskrun>-xxx-yyy
$ kubectl get pod -n <ci-namespace> <taskrun>-xxx-yyy -o yaml > check.yaml
Then, edit that yaml file. Remove all metadata except name/namespace. Change metadata.name, making sure your pod has its own name. Remove the status block. Catch failure where it's needed and add your 'sleep'. Then kubectl create that file and exec into your pod.
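For example, a minimal sketch of that "catch failure and sleep" edit, assuming the failing step runs a hypothetical build.sh (adapt to your actual command): change the container's command in check.yaml to something like
sh -c './build.sh || { echo "step failed, keeping container alive for debugging"; sleep 3600; }'
Then kubectl create -f check.yaml and kubectl exec -it <your-debug-pod> -- /bin/sh while it sleeps.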
Depending on what you're troubleshooting, it may be easier to add a PVC-backed workspace to your task, making sure your working directories, logs, built assets, ... end up in a volume that you could mount from a separate container, should you need to troubleshoot it.
Or: if you're fast enough, just re-run your pipeline/task, enter its container while it starts, and try troubleshooting it before it fails.
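A hedged sketch of that approach (Tekton step containers are typically named step-<step-name>; the names here are placeholders):
$ kubectl get pods -n <ci-namespace> -w | grep <taskrun-name>
$ kubectl exec -it -n <ci-namespace> <taskrun-pod> -c step-<step-name> -- /bin/sh
Run the exec as soon as the pod reports Running, and poke around before the step fails.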

How to use podman's ssh build flag?

I have been using the docker build --ssh flag to give builds access to my keys from ssh-agent.
When I try the same thing with podman it does not work. I am working on macOS Monterey 12.0.1. Intel chip. I have also reproduced this on Ubuntu and WSL2.
❯ podman --version
podman version 3.4.4
This is an example Dockerfile:
FROM python:3.10
RUN mkdir -p -m 0600 ~/.ssh \
&& ssh-keyscan github.com >> ~/.ssh/known_hosts
RUN --mount=type=ssh git clone git@github.com:ruarfff/a-private-repo-of-mine.git
When I run DOCKER_BUILDKIT=1 docker build --ssh default . it works, i.e. the build succeeds, the repo is cloned, and the SSH key is not baked into the image.
When I run podman build --ssh default . the build fails with:
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Error: error building at STEP "RUN --mount=type=ssh git clone git@github.com:ruarfff/a-private-repo-of-mine.git": error while running runtime: exit status 128
I have just begun playing around with podman. Looking at the docs, that flag does appear to be supported. I have tried playing around with the format a little, specifying the id directly for example, but no variation of specifying the flag or the mount has worked so far. Is there something about how podman works that I may be missing that explains this?
Adding this line as suggested in the comments:
RUN --mount=type=ssh ssh-add -l
Results in this error:
STEP 4/5: RUN --mount=type=ssh ssh-add -l
Could not open a connection to your authentication agent.
Error: error building at STEP "RUN --mount=type=ssh ssh-add -l": error while running runtime: exit status 2
Edit:
I believe this may have something to do with this issue in buildah. A fix has been merged but has not been released yet, as far as I can see.
The error while running runtime: exit status 2 does not appear to me to be necessarily related to SSH or --ssh for podman build. It's hard to say, really, and I've successfully used --ssh the way you are trying to, with some minor differences that I can't relate to the error.
I am also not sure that running ssh-add as part of building the container is what you really meant to do. If you want it to talk to an agent, you need two environment variables exported in the environment where you run ssh-add; they define where to find the agent to talk to and are as follows:
SSH_AUTH_SOCK, specifying the path to a socket file that a program uses to communicate with the agent
SSH_AGENT_PID, specifying the PID of the agent
Again, without these two variables present in the set of exported environment variables, the agent is not discoverable and might as well not exist at all, so ssh-add will fail.
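For instance, on the host you can check that the agent is discoverable before invoking the build (the output below is only illustrative):
$ echo "$SSH_AUTH_SOCK"
/tmp/ssh-XXXXXXXX/agent.12345
$ ssh-add -l
256 SHA256:... you@your-host (ED25519)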
Since your agent is probably running as part of the same set of processes that your podman build belongs to, at a minimum the PID denoted by SSH_AGENT_PID should be valid in that namespace (meaning it's normally invalid in the set of processes that container building is isolated to, so defining the variable as part of building the container would be a mistake). It's a similar story with SSH_AUTH_SOCK: the path to the socket file created by starting the agent would not normally refer to a file that exists in the mount namespace of the container being built.
Now, you can run both the agent and ssh-add as part of building a container, but ssh-add reads keys from ~/.ssh, and if you had key files there as part of the container image being built you wouldn't need --ssh in the first place, would you?
The value of --ssh lies in allowing you to transfer your authority to talk to remote services, defined through your keys on the host, to the otherwise very isolated container-building procedure, using nothing but an SSH agent designed for this very purpose. That removes the need to do things like copying key files into the container. Keys should also normally not be part of the built container, especially if they were only to be used during building. The agent, on the other hand, runs on the host, securely encapsulates the keys you add to it, and since the host is where you have your keys, that's where you're supposed to run ssh-add to add them to the agent.
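As a hedged sketch, the host-side workflow this implies (the key path is just an example):
$ eval "$(ssh-agent -s)"        # starts the agent and exports SSH_AUTH_SOCK / SSH_AGENT_PID
$ ssh-add ~/.ssh/id_ed25519     # add your key to the agent on the host
$ podman build --ssh default .  # forward the agent into RUN --mount=type=ssh steps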

How to Completely Uninstall RUNDECK

I need a step-by-step procedure to uninstall Rundeck. I am facing a stack overflow issue which I wasn't able to resolve, so I want to uninstall and install it from scratch.
Stack error:
[2020-06-05 18:48:44.098] ERROR StackTrace --- [tp1284944245-71] Full Stack Trace:
org.grails.taglib.GrailsTagException: [views/layouts/base.gsp:184] Error executing tag <g:render>: [views/common/_sidebar.gsp:128] Error executing tag <g:ifMenuItems>: Method 'java.util.Set com.dtolabs.rundeck.core.authorization.providers.EnvironmentalContext.forProject(java.lang.String)' must be InterfaceMethodref constant
at org.grails.gsp.GroovyPage.throwRootCause(GroovyPage.java:473)
at org.grails.gsp.GroovyPage.invokeTag(GroovyPage.java:415)
at jdk.internal.reflect.GeneratedMethodAccessor217.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:98)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.in...
WAR-based instance:
Make sure that the Rundeck process is down: identify the process with ps aux | grep -i rundeck and use kill -9 <PID> to shut it down.
Wipe the instance: you can delete the whole directory (and its content) defined by RDECK_BASE. All configurations and files are inside this directory. If your system has an init script that launches Rundeck, ensure that script no longer references it. (A consolidated sketch of these steps follows below.)
Re-install following the installation documentation.
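A hedged sketch of the shutdown-and-wipe steps above, assuming $RDECK_BASE points at your launcher install directory (consider a backup first):
ps aux | grep -i rundeck      # find the Rundeck process and note its PID
kill -9 <PID>                 # stop it
rm -rf "$RDECK_BASE"          # wipe the instance directory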
RPM-based (CentOS, RHEL, Fedora) instance:
Shut down the Rundeck service: # systemctl stop rundeckd.
Make sure that the process is down: # systemctl status rundeckd.
Remove the package: # yum remove rundeck.
Some files remain on the system; check and wipe the following paths:
/etc/rundeck, /var/lib/rundeck and /var/log/rundeck.
Re-install following the installation documentation.
DEB-based (Debian, Ubuntu, Mint) instance:
Shut down the Rundeck service: # systemctl stop rundeckd.
Make sure that the process is down: # systemctl status rundeckd.
Remove the package: # apt-get purge rundeck.
Some files remain on the system; check and wipe the following paths:
/etc/rundeck, /var/lib/rundeck and /var/log/rundeck.
Re-install following the installation documentation.
In any case, I recommend making a backup of your instance/configuration before wiping it.
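For example, a quick way to do that for a package-based install, using the paths listed above:
tar czf rundeck-backup-$(date +%F).tar.gz /etc/rundeck /var/lib/rundeck /var/log/rundeck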
For testing, the best option is to run the Rundeck Docker image; it saves a lot of time.
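A minimal sketch, assuming the rundeck/rundeck image from Docker Hub and Rundeck's default port 4440:
docker run -d --name rundeck -p 4440:4440 rundeck/rundeck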
About the error: check your Rundeck version; you may be hitting a known issue.

Is it possible to debug a Gitlab CI build interactively?

I have a build in Gitlab CI that takes a long time (10mins+) to run, and it's very annoying to wait for the entire process every time I need to experiment / make changes. It seems like surely there's a way to access some sort of shell during the build process and run commands interactively instead of placing them all in a deploy script.
I know it's possible to run Gitlab CI tests locally, but I can't seem to find a way to access the deploy while it's running, even after scouring the docs.
Am I out of luck or is there a way to manually control this lengthy build?
I have not found a clean way yet, but here is how I do it (a consolidated sketch follows these steps):
I start the build locally: gitlab-runner exec docker your_build_name
I kill gitlab-runner with Ctrl+C right after the Docker image is built. You can also add sleep 1m as the first script line, just to have enough time to kill gitlab-runner.
Note: gitlab-runner will create a Docker container and then delete it once the job is done; killing the runner ensures the container is still there. I know of no other alternative for now.
Manually log into the container: docker exec -i -t <container-id> bash
Run your script commands manually.
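Putting that workaround together, a hedged sketch (your_build_name is whatever job name you use in .gitlab-ci.yml):
gitlab-runner exec docker your_build_name    # run the job locally; add 'sleep 1m' as its first script line for extra time
# press Ctrl+C once the job's container is up, then:
docker ps                                    # find the container the runner left behind
docker exec -i -t <container-id> bash        # open an interactive shell and run the remaining commands by hand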

Docker container exit(0) when using docker run command, but works with docker start command

I am attempting to dockerize a GUI app and have had some success. If I build the Dockerfile into an image and then run docker run --name testcontainer testimage, it appears that the process begins but then abruptly stops. I then check with docker ps to confirm no containers are running. Then I check docker ps -a and can see that it exited with status code exit(0). Then, if I run docker start testcontainer, it appears to start the ENTRYPOINT command again, but this time it is able to continue and the GUI pops up.
My best guess is that when I run docker run, the process begins but gets forked into a background process, causing the container to exit since the foreground process has ended. That could be way off, though, because you would think docker start would result in the same outcome. I was thinking of trying to force the process to stay in the foreground, but I do not know how to do that. Any suggestions?
UPDATE: I edited my Dockerfile to use supervisord to manage starting the GUI app. Now my docker run command starts supervisor, which starts my GUI app, and it works. Some things to note: supervisor shows:
INFO spawned: myguiapp with pid 7
INFO success: myguiapp entered RUNNING state
INFO: exited: myguiapp (exit status 0; expected)
Supervisor and the container are still running at this point, which seems to indicate that the main process kicks off a child process. Since supervisor is still running, my container stays up and the GUI app does show up and I can use it. When I close the GUI, supervisor reports:
CRIT reaped unknown pid 93
Supervisor remains running, causing the container NOT to close, so I have to Ctrl-C to kill supervisor. I'd rather not use supervisor, but if I need to, I would like supervisor to close itself gracefully when that child process ends. If I could figure out how to get my container or supervisor to track child processes of the main process, then I think this would be solved.
The first issue is probably because your application requires a tty and you are not allocating a pseudo tty. Try running your container like this:
docker run -t --name testcontainer testimage
When you do a docker start the second time around, it somehow allocates the pseudo-tty and the process keeps running. I tried it myself, though I couldn't find this info anywhere in the Docker docs.
Also, if your UI is interactive you would want:
docker run -t -i --name testcontainer testimage

How to fail gitlab CI build?

I am trying to fail a build in gitlab CI and get email notification about it.
My build script is this:
echo "Listing files!"
ls -la
echo "##########################Preparing build##########################"
mkdir build
cd build
echo "Generating make files"
cmake -G "Unix Makefiles" -D CMAKE_BUILD_TYPE=Release -D CMAKE_VERBOSE_MAKEFILE=on ..
echo "##########################Building##########################"
make
I have committed code that breaks the build. However, instead of finishing, the build seems to be stuck in the "running" state after make exits. The last line is:
make: *** [all] Error 2
I also get no notifications.
How can I diagnose what is happening?
Update: in the runner, the following is repeated in the log:
Submitting build <..> to coordinator...response error: 500
In production.log and sidekiq.log of gitlab_ci, the following is written:
ERROR: Error connecting to Redis on localhost:6379 (ECONNREFUSED)
The full message with stack trace is here: pastebin.
I have the same problem. I can help you with a workaround, but I'm still trying to fully fix it.
1 - Most of the time it hangs, but the job keeps going and actually finishes; you can see the processes inside the machine. Example: in my case it compiles and at the end uses docker to publish the build, so the docker process doesn't exist until it reaches that phase.
2 - To work around this issue you have to make the data persistent and "retry" the download over and over again until it downloads everything it needs.
PS: stating what kind of OS you are using always helps.