gitlab-runner's git clone fails with "Problem with the SSL CA cert (path? access rights?)"

For several months now I've had issues with gitlab-runner which is randomly failing with the following log:
Running with gitlab-runner 13.7.0 (943fc252)
on <gitlab-runner-name> <gitlab-runner-id>
Preparing the "shell" executor
00:00
Using Shell executor...
Preparing environment
00:00
Running on <hostname>...
Getting source from Git repository
00:00
Fetching changes...
Reinitialized existing Git repository in /var/gitlab-runner/builds/<gitlab-runner-id>/0/<gitlab-group>/<gitlab-project>/.git/
fatal: unable to access 'https://gitlab-ci-token:[MASKED]@<hostname>/<gitlab-group>/<gitlab-project>.git/': Problem with the SSL CA cert (path? access rights?)
ERROR: Job failed: exit status 1
This line is the crucial one:
fatal: unable to access 'https://gitlab-ci-token:[MASKED]@<hostname>/<gitlab-group>/<gitlab-project>.git/': Problem with the SSL CA cert (path? access rights?)
I tried unregistering the runner and registering a new one. It also failed with the same error after a while (the first run usually worked well).
Furthermore, runners on other machines are working correctly and never fail with the error message above.
I believe the issue is caused by the missing CI_SERVER_TLS_CA_FILE file in:
/var/gitlab-runner/builds/<gitlab-runner-id>/0/<gitlab-group>/<gitlab-project>.tmp/CI_SERVER_TLS_CA_FILE
I tried doing a git pull in the faulty directory and I got the same message. After I copied this missing file from another directory which had it, I got the following:
remote: HTTP Basic: Access denied
fatal: Authentication failed for 'https://gitlab-ci-token:<gitlab-runner-token>@gitlab.lab.sk.alcatel-lucent.com/<gitlab-group>/<gitlab-project>.git/'
As far as I know, these tokens are generated for one-time use and are discarded after the job finishes, so that second error is expected. This leads me to believe the missing file is the real issue.
Where is this file copied from? Why is it missing? What can I do to fix this issue?
I've been looking through the GitLab issues without luck.
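For reference, this is roughly how I inspected things in the faulty build directory (a sketch; whether the runner records the CA path in the repository's git config may vary by runner version, so verify on your own installation):
cd /var/gitlab-runner/builds/<gitlab-runner-id>/0/<gitlab-group>/<gitlab-project>
git config --get http.sslCAInfo                       # CA path git is using, if one is set
ls -l ../<gitlab-project>.tmp/CI_SERVER_TLS_CA_FILE   # check whether the file actually exists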

It sounds like one or more of your runners doesn't trust the certificate on your GitLab host. You'll have to track down the root and intermediate certs used to sign your TLS cert, and add them to your runners' hosts.
For my runners on CentOS, I follow this guide (the commands are the same for newer versions): https://manuals.gfi.com/en/kerio/connect/content/server-configuration/ssl-certificates/adding-trusted-root-certificates-to-the-server-1605.html.
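On CentOS/RHEL that boils down to something like this (a sketch; the certificate filename is a placeholder for however your CA chain is exported):
# copy the root (and any intermediate) certificates into the system anchors
sudo cp gitlab-root-ca.crt /etc/pki/ca-trust/source/anchors/
# rebuild the consolidated system trust store
sudo update-ca-trust extract
Alternatively, if you'd rather not touch the system store, gitlab-runner can be pointed at the CA explicitly via the tls-ca-file setting in the [[runners]] section of config.toml.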

Related

Digital Ocean CyberPanel (on Ubuntu 18.04): ACME certificates blocked forbidden - 283 Failed to obtain SSL for domain. [issueSSLForDomain]

I installed a brand new DigitalOcean droplet using a marketplace base (so on paper everything should be OK out of the box).
When trying to issue certificates, I am getting this error:
[11.13.2019_04-48-28] /root/.acme.sh/acme.sh --issue -d thehouseinkorazim.co.il -d www.thehouseinkorazim.co.il --cert-file /etc/letsencrypt/live/thehouseinkorazim.co.il/cert.pem --key-file /etc/letsencrypt/live/thehouseinkorazim.co.il/privkey.pem --fullchain-file /etc/letsencrypt/live/thehouseinkorazim.co.il/fullchain.pem -w /home/thehouseinkorazim.co.il/public_html --force
[11.13.2019_04-48-28] [Errno 2] No such file or directory [Failed to obtain SSL. [obtainSSLForADomain]]
[11.13.2019_04-48-28] 283 Failed to obtain SSL for domain. [issueSSLForDomain]
[11.13.2019_04-48-34] Trying to obtain SSL for: thehouseinkorazim.co.il and: www.thehouseinkorazim.co.il
I checked and UFW is not installed.
I do have a network firewall but it is the same one as another droplet that does allow for certificates (same rules) so I think it is not the cause.
I searched all the answers online and no luck.
I even installed certbot to manually issue a certificate, but got the same error (I did it because I know you need to register initially to get certificates, and I hadn't, so I thought that was the cause).
Any ideas? Thanks!
Update: I set up a clean droplet again; this is the error I get without anything I did manually:
Cannot issue SSL. Error message: ln: failed to create symbolic link '/usr/local/lsws/admin/conf/cert/admin.crt': No such file or directory ln: failed to create symbolic link '/usr/local/lsws/admin/conf/cert/admin.key': No such file or directory 0,283 Failed to obtain SSL for domain. [issueSSLForDomain]
I checked and there is no folder "cert" under "conf" in the path written above.
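As a stopgap I was tempted to create the missing directory by hand so the symlinks can succeed (an untested sketch based purely on the error above):
sudo mkdir -p /usr/local/lsws/admin/conf/cert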
There's a known SSL issue in the recent version due to some environment/code changes. We are already aware of it and have submitted a new version with that issue fixed. Please give it a day or two and you should be able to launch the new version from the marketplace, which comes with CyberPanel v1.9.2.
Best

Chef Server - How to deal with self signed certificate?

I am installing Chef Server version 12.8.0-1 on Debian 8.5.
By downloading the .deb package files directly from the chef.io website, I have successfully got the chef-server and chef-manage modules installed, configured, and running.
I have got stuck trying to install the push jobs server. I used the command below...
chef-server-ctl install opscode-push-jobs-server
when the command runs I get the following errors...
Chef Client failed. 0 resources updated in 06 seconds
[2016-07-12T12:02:23+01:00] FATAL: Stacktrace dumped to /var/opt/opscode/local-mode-cache/chef-stacktrace.out
[2016-07-12T12:02:23+01:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-07-12T12:02:24+01:00] FATAL: OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=error: certificate verify failed
I believe the cause of the problem is a self signed certificate used on our corporate firewall to allow the security team to decode SSL traffic.
What I need to know is how to either get Chef to accept this certificate or get it to ignore self signed certs.
I know I could manually download and install the module but this issue will affect other things like installing cookbooks from the Chef supermarket so I'd rather find a solution that lets me use the Chef tools as intended.
Can anyone advise please?
Tensibai gave you the path for fixing Chef Server; you'll probably need to do it for the client too, which is fortunately easier: just drop the extra root cert in /etc/chef/trusted_certs.
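Concretely, something like this on the client (the certificate filename is an assumption; use whatever your security team hands you):
# copy the firewall's root CA to where chef-client looks for extra trusted certs
sudo cp corporate-root-ca.crt /etc/chef/trusted_certs/
# from a configured workstation you can then verify the chain
knife ssl check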

Invalid SSL certificate when building a crate with cargo

While trying an example from the tutorial (the guessing game), after defining a dependency (rand="0.3.0"), I got this:
$ cargo build --verbose
Updating registry `https://github.com/rust-lang/crates.io-index`
Unable to update registry https://github.com/rust-lang/crates.io-index
Caused by:
failed to fetch `https://github.com/rust-lang/crates.io-index`
Caused by:
[16] The SSL certificate is invalid
I added this to the Cargo registry's git repo config, but without success:
[http]
sslVerify = false
Where to dig?
I ran into the same problem today and found that my $HOME/.gitconfig had this:
[url "git#github.com:"]
insteadOf = https://github.com/
I had added this to make go get work over SSH for private repos. Commenting it out fixed the error.
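If you'd rather delete the rewrite rule than comment it out, the equivalent command is (assuming the exact URL key shown above):
git config --global --unset url."git@github.com:".insteadOf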
As said in the comments, this may be someone between you and GitHub modifying your communication (MITM) or a misconfiguration on your system (like missing certificates). (A problem on GitHub's side is not likely.)
To debug, first try with plain git: git clone https://github.com/rust-lang/crates.io-index.git
To get the details on what exactly failed use openssl s_client -debug -showcerts -connect github.com:443 and if it doesn't exit on its own (because connecting worked) press CTRL-C to exit. The output contains information on what certificates were presented by the remote and how it was verified or failed to verify.
If it is someone modifying your communication please publish the output of this and of a traceroute github.com or something equivalent so others can avoid that provider.

Error: Could not run: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3 read finished A

I am trying to copy a current Puppet Master server on one domain and move it to another. I'm finding it very hard to change all the configuration remnants. Is there an easy way to do this, or a step-by-step best practice? I have grepped for most occurrences of the old FQDN and changed them to the new one, yet when I delete all certs and re-issue new ones on the master, it still wants to pull a cert for the old FQDN.
Edit 1: I have resolved many of the issues I was previously getting. However, I cannot get past this SSL issue for the life of me.
[root@puppet lib]# puppet resource service apache2 ensure=running
Error: Could not run: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [unable to get local issuer certificate for /CN=puppet.foundry.test]
I have attempted to completely purge all certs from the master, using this link, and then regenerate all. But I still keep getting the same errors:
Error: Could not run: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3 read finished A
Now Im not sure if I am having puppet SSL issues, or SSL issues in general.
Most likely you're connecting to the wrong server (the default is the hostname puppet).
Check your agent's config; you're mostly interested in the server variable:
puppet config print --section agent | grep "server = "
It's also good to know where the puppet agent looks for its config:
$ puppet config print --section agent | grep "^config = "
config = /etc/puppetlabs/puppet/puppet.conf
Edit your config and set the correct puppet master:
[agent]
server=puppet4.example.com
Just to be sure, you can clean your certificate (on the agent):
find /etc/puppetlabs/puppet/ssl -name $(hostname -f).pem -delete
and on the puppet server:
puppet cert clean {broken hostname}
And finally run puppet agent -t
You can use this link: http://bitcube.co.uk/content/puppet-errors-explained
Did you try to change the puppet master DNS?
Check whether the puppet master's cert matches what you set as server on the node.
If not, you can always use dns_alt_names = puppet_hostname.your_domain with all the names you want for the puppet master & CA.
Then restart the puppet master service, clean the slave's certname from the master, remove the whole /var/lib/puppet/ssl/ folder from the slave, and run puppet again, as sketched below.
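A sketch of what that can look like on the master (Puppet 3-style paths; the hostnames are placeholders):
# /etc/puppet/puppet.conf
[main]
dns_alt_names = puppet,puppet4.example.com

# then regenerate the master's certificate so it picks up the alt names
puppet cert clean $(hostname -f)
puppet cert generate $(hostname -f) --dns_alt_names=puppet,puppet4.example.com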
What puppet isn't telling you is that there is a cert mismatch. The master disconnects as soon as it determines that the cert is invalid or a mismatch. Because the disconnect is so sudden, the client isn't told why it happened.
When this happens puppet could, for example, change that error message to, "Hey! Here's a list of things you might check," and then suggest things like verifying the cert expiration date, checking for a cert mismatch, etc. However, why would anyone do that?
Here's one way you can get into this situation: Set up two puppet client machines with the same name by mistake. The second machine to use that name will work, but the first machine will no longer work.
How might someone get into that situation? Two machines can't have the same name! Of course not. But we have seen situations like this:
Machine A, B, C, D, E are all Puppet clients.
Machine C gets wiped and reloaded. The technician accidentally calls it "B". To get it working with Puppet, they run "puppet cert clean B".
The technician realizes their mistake and reconfigures machine C with the proper name, performs "puppet cert clean C", and machine C now works fine.
A week later someone notices that machine B hasn't been able to talk to the master. It gets this error message. After hours of debugging they see that the client cert has one serial number but the master expects that client to have a very different serial number. Machine B's cert is cleaned, regenerated, etc. and everything continues.
Should Puppet Labs update the error message to hint that this may be the problem? They could, but then I wouldn't get rep points for writing this awesome answer. Besides, technicians should never make such a mistake, so why handle a case that obviously should never happen... except when it does.
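For what it's worth, one quick way to spot this kind of mismatch is to compare fingerprints on both sides (a sketch; the paths assume a Puppet 3-style agent layout, and agent-b.example.com stands in for the client's certname):
# on the agent
openssl x509 -noout -fingerprint -sha256 -in /var/lib/puppet/ssl/certs/$(hostname -f).pem
# on the master (use the same digest so the values are comparable)
puppet cert fingerprint agent-b.example.com --digest sha256
If the two fingerprints differ, you've found your mismatch.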
Make sure that you are running puppet as root, or with sudo. I have received this exact error when I was logged in as my normal user and ran "puppet agent -t" without elevating my privileges.

TeamCity and Mercurial https

I am trying to connect from TeamCity to a Mercurial repository over HTTPS.
But I can't, because this error appears:
stderr: abort: error: _ssl.c:577: error: 14090086:SSL
routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed.
How can I disable certificate verification in TeamCity?
Or how can I work around this?
I have tried downloading the certificate with IE and pasting it into Mercurial's .cer file, but that did not resolve my issue.
I resolved my issue only after putting mercurial.ini in the folder C:\Windows\System32\config\systemprofile.
Editing .hgrc did not take effect. Only putting mercurial.ini in C:\Windows\System32\config\systemprofile and adding the downloaded certificate to cacert.pem solved my issue.
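For reference, the relevant part of that mercurial.ini looks roughly like this (the bundle path is an assumption; point it at wherever your cacert.pem lives):
; C:\Windows\System32\config\systemprofile\mercurial.ini
[web]
cacerts = C:\Mercurial\cacert.pem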
Better than disabling certificate verification (where possible) is to let Mercurial know that you trust the certificate. (This is a Windows-specific answer).
The thing I missed for ages is that even if you import the certificate into the Trusted Root Certification Authorities store for your user, this doesn't affect the Local System account, which TeamCity runs under if you have set it up to run as a service.
The full steps to get the Local System account to trust a certificate are in this answer, but I'll reproduce them in brief here:
First, get a copy of the certificate. You can export this to a file from all the main browsers.
Then, run mmc.exe from the start menu. Add the Certificates snap-in. If TeamCity is running as the local system account you want to manage "Local computer" certificates. If TeamCity is running as an ordinary user you want to manage user certificates.
Navigate to "Trusted Root Certification Authorities". Then click "Action > All Tasks > Import" and import the certificate file.
A final note: you can use psexec.exe from PsTools to run PowerShell as Local System and test that things are working before going back to TeamCity:
psexec -i -s Powershell.exe
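Inside that Local System PowerShell you can then confirm the certificate is trusted before involving TeamCity (the URL is a placeholder for your Mercurial server):
Invoke-WebRequest https://your-mercurial-host/ -UseBasicParsing
If the import worked, this returns a response instead of a trust error.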