Public key pinning vs Certificate Pinning in mobile apps

Scenario :
I have pinned public key pin SHAs of 3 certificates : Root CA , Intermediate CA and Leaf CA in my android application.
What I have understood ( Please correct me if I'm wrong anyway here ) :
Public key pinning is used so that we can check if the public key of the cert that our server is issuing is changed or not. source
A certificate is valid if its public key SHA is the one which we have "pinned" in our application. To check the public key , first it will decrypt the signature using the public key and makes sure that the same public key is in the data of that signature also.
When the Leaf cert has expired but is corresponding to the valid "pinned" public key SHA, the chain of certificates is checked to see if they are valid and if one of them is valid , the certificate is accepted and the connection is established.
When the Leaf cert I got is having an invalid public key but is not expired , then that means I got a wrong certificate from someone which may be an attacker.
Question :
Does public key pinning in any way help in security , if an attacker compromises a client and installs his own trusted CA and then does an MITM on the client to intercept all communication by presenting his own forged certificate signed by the CA he has installed on the client device.
How does direct certificate pinning VS public key pinning make a difference here in any way ?
What is the implication of using a self signed certificate in the above questions.
Please help me understand this with as much detail as possible...

When the Leaf cert has expired but is corresponding to the valid "pinned" public key SHA, the chain of certificates is checked to see if they are valid and if one of them is valid , the certificate is accepted and the connection is established.
No. An expired certificate is not accepted. Pinning does not override that basic principle of TLS but enhances it to reduce the number of certificates accepted.
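As a rough sketch of that rule (this is not any real TLS library's API; the `Cert` type, the `accept` function and the pin value are hypothetical), pinning acts as an extra filter on top of normal validation, so an expired certificate is rejected even when its key matches a pin. Real validation also checks the chain and the hostname, which are left out here:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical, simplified view of a certificate."""
    spki_sha256: str       # hash of the public key (the value that gets pinned)
    not_before: datetime
    not_after: datetime


def accept(cert: Cert, pins: set, now: datetime) -> bool:
    """Accept only if the cert is inside its validity period AND is pinned."""
    within_validity = cert.not_before <= now <= cert.not_after
    pinned = cert.spki_sha256 in pins
    return within_validity and pinned


pins = {"pin-of-my-server-key"}  # placeholder pin value
now = datetime(2024, 6, 1, tzinfo=timezone.utc)

expired = Cert("pin-of-my-server-key",
               datetime(2023, 1, 1, tzinfo=timezone.utc),
               datetime(2024, 1, 1, tzinfo=timezone.utc))
current = Cert("pin-of-my-server-key",
               datetime(2024, 1, 1, tzinfo=timezone.utc),
               datetime(2025, 1, 1, tzinfo=timezone.utc))

print(accept(expired, pins, now))  # False: the pin matches but the cert has expired
print(accept(current, pins, now))  # True: valid and pinned
```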
Does public key pinning in any way help in security , if an attacker compromises a client and installs his own trusted CA and then does an MITM on the client to intercept all communication by presenting his own forged certificate signed by the CA he has installed on the client device.
For browsers, manually installed trusted CAs are exempt from pinning requirements. To me this is a fundamental flaw in pinning, though to be honest once someone has access to install root certs on a machine it’s pretty much game over. Anyway, this exception is necessary to allow virus scanners, corporate proxies and other intercepting proxies to work; otherwise any pinned site could not be accessed from behind one of these proxies. It does weaken HPKP (HTTP Public Key Pinning) in my mind, though.
For apps (your use case) pinning can be useful to prevent MITM attacks.
How does direct certificate pinning VS public key pinning make a difference here in any way ?
Don’t understand the question? When you pin a certificate directly you basically pin the public key of that certificate (well, actually a SHA hash of the public key, which corresponds to the private key that cert is linked to).
This means you can reissue the certificate from the same private key (bad practice IMHO) and not have to update pins.
You can also pin from the intermediate or even root public key. This means you can get your CA to reissue a cert and again not have to update the pin. That of course ties you into that CA but at least doesn’t allow some random CA to issue a cert for your site.
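A small sketch of how the pin level changes what survives a reissue. It assumes the usual rule that a connection passes if any key in the presented chain is pinned; the hash strings and the `chain_matches` helper are made up for illustration:

```python
def chain_matches(chain_key_hashes: list, pins: set) -> bool:
    """A connection passes pinning if ANY key in the presented chain is pinned."""
    return any(key_hash in pins for key_hash in chain_key_hashes)


# Placeholder key hashes, ordered leaf -> intermediate -> root.
reissued_same_ca = ["leaf-key-2", "intermediate-A", "root-A"]  # new leaf key, same CA
other_ca = ["leaf-key-3", "intermediate-B", "root-B"]          # issued by a different CA

leaf_pin = {"leaf-key-1"}              # pin taken from the original leaf key
intermediate_pin = {"intermediate-A"}  # pin taken from the CA's intermediate key

print(chain_matches(reissued_same_ca, leaf_pin))          # False: leaf key changed
print(chain_matches(reissued_same_ca, intermediate_pin))  # True: same CA reissued
print(chain_matches(other_ca, intermediate_pin))          # False: some other CA
```

The last line is the point of pinning higher up: you are tied to that CA, but a cert from a random other CA is still rejected.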
What is the implication of using a self-signed certificate in the above questions.
For browsers, pinning basically can’t be used with a self-signed cert, because either the cert is not recognised by the browser (so pinning won’t work) or it is trusted because the issuer was manually installed, at which point pinning is ignored as per the point above.
For apps (again your use case), I understand self-signed certificates can be pinned. Though it depends on which HTTP library you use and how that can be configured.
One of the downsides of pinning the certificate itself (which might be the only way to do it, if it's a single leaf self-signed certificate), is that reissuing the certificate will invalidate the old pins (unless you reuse the same private key, but this may not be possible if the reason for the re-issue is key compromise). So if your app makes an HTTP call home to check if there is a new version or suchlike, then that call might fail if the certificate is re-issued and the new version of the app has not been downloaded yet.
Nearly all browsers have deprecated HPKP as it was massively high risk compared to the benefits and there were numerous cases of breakages due to pinning. See Wikipedia: https://en.m.wikipedia.org/wiki/HTTP_Public_Key_Pinning. Monitoring for mis-issued certificates through Certificate Transparency is seen as a safer option.
Pinning still seems somewhat popular in mobile app space because you have greater control over an app and can re-release a new version in case of issues. But it is still complicated and risky.

My Answer Context
Scenario :
I have pinned public key pin SHAs of 3 certificates : Root CA , Intermediate CA and Leaf CA in my android application.
My answer will be in the context of pinning in a mobile app, since modern browsers no longer support pinning anyway.
Pinning and Security
Does public key pinning in any way help in security
It helps a lot, because your mobile app only communicates with a server that presents a certificate with a matching pin. For example, if you do public key pinning and you rotate the certificates in your backend using a different private/public key pair, then your mobile app will refuse to connect to your own server until you release a new version of the mobile app with the new pins.
MitM attack and Pinning
if an attacker compromises a client and installs his own trusted CA and then does an MITM on the client to intercept all communication by presenting his own forged certificate signed by the CA he has installed on the client device.
When you are pinning the connection the attacker will not succeed, because this is what pinning was designed for: to protect against manipulation of the certificate trust store on the device in order to carry out a MitM attack. From Android API 24, user provided certificates are not trusted by default, unless the developer opts in to trusting them via the network security config file:
<base-config cleartextTrafficPermitted="false">
<trust-anchors>
<!-- THE DEFAULT BEHAVIOUR -->
<certificates src="system" />
<!-- DEVELOPER ENABLES TRUST IN USER PROVIDED CERTIFICATES -->
<certificates src="user" />
</trust-anchors>
</base-config>
You can read the article I wrote to see pinning in action and not allowing for a MitM attack to succeed:
Securing HTTPS with Certificate Pinning:
In order to demonstrate how to use certificate pinning for protecting the https traffic between your mobile app and your API server, we will use the same Currency Converter Demo mobile app that I used in the previous article.
In this article we will learn what certificate pinning is, when to use it, how to implement it in an Android app, and how it can prevent a MitM attack.
In the article I go into detail how to implement pinning, but nowadays I recommend instead the use of the Mobile Certificate Pinning Generator online tool, that will generate the correct network security config file to add to your Android app. For more details on how to use this tool I recommend you to read the section Preventing MitM Attacks in this answer I gave to another question where you will learn how to implement static certificate pinning and how to bypass it:
The easiest and quickest way to go about implementing static certificate pinning in a mobile app is by using the Mobile Certificate Pinning Generator, which accepts a list of domains you want to pin against and generates for you the correct certificate pinning configurations to use on Android and iOS.
Give it a list of domains to pin:
And the tool generates for you the Android configuration:
The tool even has instructions on how to go about adding the configurations to your mobile app, which you can find below the certificate pinning configuration box. They also provide a hands-on example Pin Test App for Android and for iOS, each of which is a step-by-step tutorial.
This approach will not require a release of a new mobile app each time the certificate is renewed with the same private/public key pair.
Certificate Pinning vs Public Key Pinning
How does direct certificate pinning VS public key pinning make a difference here in any way ?
When using certificate pinning a new mobile app needs to be released, and users forced to update, each time the server certificate is rotated, while with public key pinning there is no need for this unless the private/public key pair of the certificate changes. For example, if your server uses LetsEncrypt for the certificates you don't need to release a new mobile app version each time they are renewed, as long as the renewal keeps the same key pair.
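To make the difference concrete, here is a minimal sketch. The byte strings are stand-ins I made up, not real DER data, and `spki_of` is a toy extractor; a real app would hash the DER-encoded certificate or the SubjectPublicKeyInfo taken from it:

```python
import base64
import hashlib


def pin(data: bytes) -> str:
    """Base64 of the SHA-256 digest, the encoding commonly used for pins."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def spki_of(cert: bytes) -> bytes:
    """Toy extractor: in this sketch the key is simply the last ';' field."""
    return cert.rsplit(b";", 1)[-1]


public_key = b"stand-in for the server's SubjectPublicKeyInfo bytes"
cert_before = b"serial=1;valid=2023;" + public_key
cert_renewed = b"serial=2;valid=2024;" + public_key  # renewed with the same key pair

# Certificate pinning hashes the whole certificate, so renewal changes the pin.
certificate_pins_match = pin(cert_before) == pin(cert_renewed)

# Public key pinning hashes only the key, so renewal leaves the pin unchanged.
public_key_pins_match = pin(spki_of(cert_before)) == pin(spki_of(cert_renewed))

print(certificate_pins_match)  # False
print(public_key_pins_match)   # True
```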
Self Signed Certificates
What is the implication of using a self signed certificate in the above questions.
You will need to opt in via the network security config file for the Android OS to trust user provided certificates, and instruct the user to add the certificate to their mobile device. This will make an attacker's life easier if pinning is not being used. I would recommend you stay away from using self signed certificates.

Related

Why I need a SSL certificate?

I have a short question: why do I need an SSL certificate (I mean only the certificate, not the SSL connection)?
In my case Google Chrome detected that the connection is encrypted and secure, but everything is red because I created the certificate myself.
Why do I need an SSL certificate, if the connection is secure?
Just because traffic to 192.168.xxx.xxx doesn't leave the boundary of your network doesn't mean that it's safe.
Especially if you have BYODs attached to the network (and even if not, you don't want to be a hard shell with a juicy interior), someone can bring a compromised laptop or phone, attach it to the network, and a virus can intercept everything going on the network (see firesheep).
So you have to assume that the network is malicious - treat your LAN as if it were the internet.
So now the question goes back - why can't I rely on a self-signed certificate (both on a local network as well as the internet)?
Well, what are you protecting against? TLS (SSL) protects against two things:
Interception - even if I MITM you (I become your router), I can't read what you're sending and receiving (so I can't read your Credit Card numbers or password)
Spoofing - I can't inject code between you and the server.
So how does it work?
I connect to the server and get a certificate signed by a CA. This CA is considered trusted by the browser (they have to go through all kinds of audits to get that trust, and they get evicted if they break it). They verify that you control the server and then sign your public key.
So when the client gets the signed public key from the server, he knows he's going to encrypt a message that only the destination server can decrypt, as the MITM wouldn't be able to substitute his own public key for the server's (his public key wouldn't be signed by a CA).
Now you can communicate securely with the server.
What would happen if the browser would accept any SSL cert (self signed)?
Remember how the browser can tell the official cert from a fake MITM cert? By being signed by a CA. If there's no CA, there's literally no way for the browser to know if it's talking to the official server or a MITM.
So self-signed certs are a big no-no.
What you can do, though, is you can generate a cert and make it a "root" cert (practically, start your own CA for your internal computers). You can then load it into your browsers CA store and you'll be able to communicate through SSL without having to go through something like letsencrypt (which, by the way, is how enterprise network monitoring tools work).
In cryptography, a certificate authority or certification authority
(CA) is an entity that issues digital certificates. A digital
certificate certifies the ownership of a public key by the named
subject of the certificate. This allows others (relying parties) to
rely upon signatures or on assertions made about the private key that
corresponds to the certified public key. A CA acts as a trusted third
party—trusted both by the subject (owner) of the certificate and by
the party relying upon the certificate. The format of these
certificates is specified by the X.509 standard.
(from https://en.wikipedia.org/wiki/Certificate_authority)
You are not a trusted CA. Basically, if you sign your own certificate then there is no one that is able to vouch that the server is truly what it is. If you had a valid, trusted third party vouch for you then the certificate would be "valid."
Having a self-signed certificate doesn't necessarily mean that the website is dangerous, it's just that the identity of the server can't be verified and thus it is more risky for the visitor.
Self-created or self-signed certificates are not trusted by any browser by default. As we know, these days all browsers are stricter about security. Let’s be clear about something right up front: the browsers do not trust you. Period.
It may seem harsh but it’s just a fact: browsers’ jobs are to surf the internet while protecting their users, and that requires them to be skeptical of everyone and everything.
The browsers do, however, trust a small set of recognized Certificate Authorities. This is because those CAs follow certain guidelines, make certain information available and are regular partners with the browsers. There’s even a forum, called the CA/B forum, where the CAs and browsers meet to discuss baseline requirements and new rules that all CAs must abide by to continue being recognized.
It’s highly regulated.
And you are not a part of the CA/B forum.
The better option is to obtain an SSL Certificate from a trusted certificate authority.
Here's what you need to know about a Self Signed SSL Certificate

What is the difference between SSL pinning (embedded in host) and normal certificates (presented by server)

I'm not quite understanding the necessity of certificate pinning in SSL connection establishment (to avoid Man in the Middle attacks).
SSL cert pinning requires embedding original server certificate in the host to verify with the one presented by server. what is the difference between the server certificate embedded in the host and the one presented by server to be validated by client?
What is that I am missing here?
what is the difference between the server certificate embedded in the host and the one presented by server to be validated by client?
There should be none and that's exactly the point of certificate pinning.
Without certificate pinning an application commonly accepts any certificate which matches the requested hostname and is issued by a locally trusted CA (certificate authority). Given that there are usually more than 100 CAs in the local trust store, it is sufficient that one of these gets successfully attacked, as in the case of DigiNotar in 2011. Thus it makes sense to limit the certificates you accept to a specific one, i.e. pinning.
Besides the certificate pinning by comparing the certificate received with a locally stored certificate there are other ways of pinning: for example one might just check against a fingerprint (hash) and not the full certificate. In case the certificate can expire it might be more useful to check only the public key and not the whole certificate because the public key is often kept on certificate renewal. Or one might pin to a specific CA which one considers trusted to issue certificates for this domain.
Note that to understand pinning you might need to understand how the authentication of the server works. One part of this is that the server certificate is validated (hostname, expiration, trust chain ...). But this is not enough since the certificate itself is public, i.e. everybody can get it and could send it inside the TLS handshake. Thus the other major part of the authentication is that the server proves that it is the owner of the certificate. This is done by signing some data using the private key matching the certificate. Since only the owner of the certificate should have the private key this proves ownership. Because of this anybody could embed the server's certificate for pinning but only the server itself can prove ownership of the certificate.
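The proof-of-ownership step can be illustrated with textbook RSA. This is a toy sketch with tiny primes, completely insecure and not how TLS actually encodes its signatures; all names and values are assumptions chosen for illustration:

```python
import hashlib

# Toy RSA key pair with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                           # public modulus
e = 17                              # public exponent (published in the certificate)
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, known only to the server


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the toy modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Needs the private exponent d, so only the server can do this."""
    return pow(digest(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Needs only the public values (e, n), so anybody can do this."""
    return pow(signature, e, n) == digest(message)


handshake_data = b"fresh data from this particular handshake"
signature = sign(handshake_data)

print(verify(handshake_data, signature))  # True: the signer holds the private key
```

Somebody who merely copied the public certificate has `e` and `n` but not `d`, so they cannot produce a signature over fresh handshake data that verifies (with real key sizes; the toy modulus here is small enough to brute force).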
What is SSL pinning
Applications are configured to trust a select few certificates or certificate authority (CA), instead of the default behaviour: to trust all CAs that are pre-configured on the device/ machine. SSL pinning is not required.
Why use SSL Pinning (Why not to)
In many cases, the certificate returned by a server could be forged if any root (or intermediate) CA was compromised (this happens very rarely). Threat actors could use this compromised CA to generate a certificate for your website, and show visitors their website instead. This is bad. SSL pinning was designed to prevent this in some cases, but there are better ways (IMHO).
Having said that, I don't know any website which uses SSL pinning, so SSL pinning seems primarily discussed for mobile apps. It seems like SSL pinning only works when you can trust the source of the application (e.g. App Store, Play Store). Why? Because if you have to visit a website to get the cert, by then it's too late (you might have already used a dodgy cert and accessed the fake website, or been MITM'd). Therefore, it seems like the benefits Steffen mentioned are not so compelling, especially when there are better solutions already:
Better solution
I'm not sure if any-CA-compromise is a threat vector, even for banks. Instead, banks and other security conscious organisations will pick their CA wisely, and also configure a CAA record.
By using a CAA DNS record, they can restrict which CAs are permitted to issue certificates for their domain. Note that CAA is checked by the CA at issuance time, not by clients (e.g. browsers, mobile apps) when they connect.
They pick the CA and create a cert only from this CA
They will have a backup plan for if a CA is compromised. Don't want to go into that here, but the backup plan for CAA records is IMHO much better than that of SSL pinning.
For example, Monzo.com (I used whatsmydns to find this) has a CAA record set which restricts certificate issuance to only 5 CAs (amazon, buypass, comodoca, digicert, letsencrypt):
0 iodef "mailto:security#monzo.com"
0 issue "amazon.com"
0 issue "buypass.com"
0 issue "comodoca.com"
0 issue "digicert.com"
0 issue "letsencrypt.org"
0 issuewild "amazon.com"
0 issuewild "comodoca.com"
0 issuewild "digicert.com"
0 issuewild "letsencrypt.org"
These are popular CAs which people trust, we hope they don't let us down. If they do, the whole internet would be a free for all. The only way to prevent this is to be your own CA/ use self-signed certificates.
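As a sketch of how such records are read, the snippet below parses the record set quoted above and answers "may this CA issue for the domain?". It is a simplified reading of the CAA rules (for instance it ignores flags and parameters), and `parse_caa` and `may_issue` are hypothetical helper names, not part of any DNS library:

```python
import shlex

# The records quoted above, in "flags tag value" form.
CAA_RECORDS = """\
0 iodef "mailto:security#monzo.com"
0 issue "amazon.com"
0 issue "buypass.com"
0 issue "comodoca.com"
0 issue "digicert.com"
0 issue "letsencrypt.org"
0 issuewild "amazon.com"
0 issuewild "comodoca.com"
0 issuewild "digicert.com"
0 issuewild "letsencrypt.org"
"""


def parse_caa(text: str) -> list:
    """Turn each non-empty line into a (flags, tag, value) tuple."""
    records = []
    for line in text.splitlines():
        if line.strip():
            flags, tag, value = shlex.split(line)
            records.append((int(flags), tag, value))
    return records


def may_issue(records: list, ca_domain: str, wildcard: bool = False) -> bool:
    """Simplified check: 'issuewild', when present, governs wildcard names."""
    tag = "issue"
    if wildcard and any(t == "issuewild" for _, t, _ in records):
        tag = "issuewild"
    allowed = {value for _, t, value in records if t == tag}
    # No matching records at all means issuance is not restricted.
    return not allowed or ca_domain in allowed


records = parse_caa(CAA_RECORDS)
print(may_issue(records, "letsencrypt.org"))             # True
print(may_issue(records, "buypass.com", wildcard=True))  # False: not in issuewild
print(may_issue(records, "some-other-ca.example"))       # False
```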
Summary
I don't see how SSL pinning will become ubiquitous, especially since it adds more overhead (maintenance regarding ssl expiry, or trusting one CA anyway - SPoF, or emulating what a CAA record does but with additional code/ maintenance burden). It also only supports your pre-installed applications, not websites.

How to prevent clients from retrieving my server's certificate

I have a secure API for mobile clients. I would like to perform certificate pinning and I achieved it. The problem is that if I run the command openssl s_client -connect xxx.xxxxxxxxx.com:443 then I can see my certificate. I believe whoever has the URL can also see the certificate and connect to my APIs.
How I can prevent access to my certificate, so that only my mobile can access but not public?
Anyone who connects to an SSL / TLS server can view the server's certificate because it is public. This is normal behavior.
But that does not mean they can connect to your API. Normally an authentication mechanism is added where the one that connects has to present credentials, for example user/password.
With SSL/TLS it is also possible to require a client certificate to establish the secure channel. This is called two-way (mutual) authentication. But it is not usually used from mobile devices because of the difficulty of distributing the client certificates.
I suggest adding authentication to your api if you have not already done so
Public key cryptography works by having one part (the certificate) freely available publicly. The corresponding private key is needed to decrypt and it should be kept secret.
Therefore there is no problem with openssl having access to the certificate - that's exactly how it should work! A web browser will also be able to grab the certificate for a website it has not been to.
Pinning adds a further layer of security on top of this by limiting the certificates that a website can use to those certificates that are "pinned" to the site. As discussed, without the private key someone cannot decrypt the traffic. However, there are certain reasonably sophisticated attacks that involve intercepting traffic and replacing the certificate with another, using the attacker's own certificate/private key combination so they can read the traffic. Pinning prevents this by explicitly stating which certificate(s) should be allowed for this site.
Pinning does not stop the need for the key to be public, nor does it limit connections from your mobile app only - there are other solutions for that but pinning is not it. It merely is used to address one type of attack against the server.
Pinning is an advanced topic and it is easy to accidentally cut off access to your site by pinning a certificate and then not updating the pins when renewing, or otherwise changing, the certificate. Due to that risk, you should ensure you have a much greater understanding of how all this works before implementing pinning. At the moment you seem to have a misunderstanding of the basics so would advise against advanced topics like pinning.

SSL Pinning and certificate expiry

This question relates to the use of SSL Pinning in a client app against a web api and certificate
expiry.
Scenario:
I own example.com and have a subdomain where an api is hosted, as such: api.example.com
I wish to use the api over SSL, so an SSL Certificate is created for the subdomain.
After the certificate has been acquired, I have:
A Public Certificate
An Intermediate Certificate
A Private Key
It's my understanding that I install these certificates on my webserver.
I then wish for my client app to connect to the api. To mitigate against man-in-the-middle style
attacks, I wish to use SSL Pinning, so that the client will only communicate with my api, not
someone spoofing it.
In order to pin in the client app, I have two choices, either pin against the public or intermediate
certificate.
Let's say I implement this.
What happens when the certificate on api.example.com expires?
It's my understanding that the client app would no longer work.
Do I need to regenerate a complete set of public/intermediate/private items again? and then
put a new public or intermediate certificate in the app?
Question:
I would still like the client app to work until the certificate on api.example.com was updated.
Of course, a new certificate can be put in the client app, but things like roll-out take time.
How can I handle this?
I've read that Google updates their certificate every month, but somehow manages to keep the public key the same: How to pin the Public key of a certificate on iOS
If that's possible, then the solution is to simply extract the public key from the server and check it against the locally stored public key...but how do Google do it?
Thanks
Chris
Note: I'm more familiar with browser to server pinning (HTTP Public Key Pinning - HPKP) rather than app to server pinning, but I presume the principle is the same. In HPKP the pinning policy is provided by the server as an HTTP header, but I understand that for apps it is often built into the app rather than read from the HTTP response. So read the answer below with all that in mind:
Pinning is usually against the key, not the cert, and can be at multiple levels. So you've several choices:
Reuse the same key/crt to generate a new cert. Some (rightly in my opinion!) recommend generating a new key each time you renew your cert but this is complicated when you use pinning. So does pinning encourage poor security habits like key reuse?
Have several backup keys in your pinning policy and rotate them around on cert renewal, discarding your oldest and adding a new one with plenty of time and updates to never be caught short. Personally I prefer to generate the key at cert renewal time rather than have some backups around which may or may not have been compromised, so I'm not a particular fan of this either. And how many backups should you have? E.g. what if you need to reissue a cert because of compromise around renewal and also mess it up? So 2? 3? 100?
Pin further up. Say the first intermediate or the root CA cert. So any newly issued cert is still trusted (providing it's issued by the same cert path). The downside of this is fourfold: i) You still leave yourself open to mis-issued certs issued by that pinned cert (not a massive deal IMHO as you've still massively reduced your attack surface, but still a concern to some people). ii) You cannot guarantee the client will use that intermediate cert as there are sometimes multiple valid paths. This second one is a much bigger deal. You'd think that providing the intermediate cert would guarantee this would be used but that's not the case (plenty of SHA-1 examples of this). iii) There's no guarantee a new cert will be issued by the same intermediate or root (especially when technologies change, like the introduction of SHA-2), so to me this whole option is a non-starter. iv) It ties you in to using the same cert provider (perhaps not a big deal but I like the freedom to move). Not sure if apps support this feature natively anyway but browsers certainly do.
Renew in advance and do not use the new key until policy cache expires. For example if you have one year certs and a 30 day pinning policy then you can renew after 11 months, add the new key to the policy, then wait 30 days so you can be sure everyone will have picked up new policy or at least the old policy will have expired, then switch keys and certs. Depends on a short policy and potentially wastes a portion of that though (at least 30 days in this example), unless cert provider provides cert in advance starting on day after old policy expires. For an app, if pinning policy is hard coded into it, then this might involve the length of time it takes to push out an update.
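The timing in that last option is simple date arithmetic. A rough sketch using the one year cert and 30 day policy from the example (the concrete dates and variable names are assumptions picked for illustration):

```python
from datetime import date, timedelta

cert_issued = date(2023, 1, 1)          # assumed issue date of the current cert
cert_lifetime = timedelta(days=365)     # "one year certs"
policy_max_age = timedelta(days=30)     # "30 day pinning policy"

cert_expires = cert_issued + cert_lifetime

# Latest day the new key can be added to the policy while still leaving a
# full policy lifetime before the old cert expires.
latest_policy_update = cert_expires - policy_max_age


def earliest_switch(new_key_published: date) -> date:
    """Day by which every cached copy of the old policy has expired."""
    return new_key_published + policy_max_age


print(cert_expires)                           # 2024-01-01
print(latest_policy_update)                   # 2023-12-02
print(earliest_switch(latest_policy_update))  # 2024-01-01
```

Publishing the new key any later than `latest_policy_update` means some clients may still hold the old policy when the old cert expires.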
Ultimately, because certs do require renewing, I'm not a big fan of pinning. I don't think making something that is subject to periodic renewal semi-permanent is the right answer. And there is even some talk of pre-loading pinning policies in browsers, which just makes me shudder.
Pinning provides assurance that a rogue CA is not issuing certs for your domain but how likely is that really compared to the hassle of pinning? Something like Certificate Transparency - or even report only pinning may be a better answer to that problem even if they don't actually stop that attack.
Finally locally installed roots (e.g. for antivirus scanners or corporate proxies), bypass pinning checks (on the browser at least) which again reduces its effectiveness in my eyes.
So think carefully before using pinning and make sure you understand all the consequences.
The Mozilla developer site recommends pinning the certificate of the intermediate CA that signed the server certificate.
"it is recommended to place the pin on the intermediate certificate of the CA that issued the server certificate, to ease certificates renewals and rotations."
For more information on implementing and testing public key pinning you can refer to Implementing and Testing HTTP Public Key Pinning (HPKP)
Your application can store multiple certificates in its pin list. The procedure for changing the cert would then be:
Some time before the certificate expires, release a new version of your app with a replacement cert in the pin list, as well as the original cert
When the old certificate expires, replace it on the server - the app should then still work as the new cert will already be in the pin list
Some time after the cert expires, release a new version of your app removing the old cert
Remember your users have to update the app before the old cert expires
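The procedure above can be sketched as a small simulation. The version labels and cert names are placeholders; the point is only to show which app versions keep working before and after the server switches certs:

```python
# Pin list shipped with each (hypothetical) app version.
APP_PINS = {
    "v1": {"old-cert"},              # original release
    "v2": {"old-cert", "new-cert"},  # released some time before expiry
    "v3": {"new-cert"},              # released some time after the switch
}


def can_connect(app_version: str, server_cert: str) -> bool:
    """The app connects only if the server's cert is in its pin list."""
    return server_cert in APP_PINS[app_version]


for server_cert in ("old-cert", "new-cert"):
    working = [v for v in APP_PINS if can_connect(v, server_cert)]
    print(server_cert, working)
# old-cert ['v1', 'v2']
# new-cert ['v2', 'v3']
```

Users still on v1 when the server switches are cut off, which is why the update carrying both certs has to reach them first.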

How does SSL really work?

How does SSL work?
Where is the certificate installed on the client (or browser?) and the server (or web server?)?
How does the trust/encryption/authentication process start when you enter the URL into the browser and get the page from the server?
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
Note: I wrote my original answer very hastily, but since then, this has turned into a fairly popular question/answer, so I have expanded it a bit and made it more precise.
TLS Capabilities
"SSL" is the name that is most often used to refer to this protocol, but SSL specifically refers to the proprietary protocol designed by Netscape in the mid 90's. "TLS" is an IETF standard that is based on SSL, so I will use TLS in my answer. These days, the odds are that nearly all of your secure connections on the web are really using TLS, not SSL.
TLS has several capabilities:
Encrypt your application layer data. (In your case, the application layer protocol is HTTP.)
Authenticate the server to the client.
Authenticate the client to the server.
#1 and #2 are very common. #3 is less common. You seem to be focusing on #2, so I'll explain that part.
Authentication
A server authenticates itself to a client using a certificate. A certificate is a blob of data[1] that contains information about a website:
Domain name
Public key
The company that owns it
When it was issued
When it expires
Who issued it
Etc.
You can achieve confidentiality (#1 above) by using the public key included in the certificate to encrypt messages that can only be decrypted by the corresponding private key, which should be stored safely on that server.[2] Let's call this key pair KP1, so that we won't get confused later on. You can also verify that the domain name on the certificate matches the site you're visiting (#2 above).
But what if an adversary could modify packets sent to and from the server, and what if that adversary modified the certificate you were presented with and inserted their own public key or changed any other important details? If that happened, the adversary could intercept and modify any messages that you thought were securely encrypted.
To prevent this very attack, the certificate is cryptographically signed by somebody else's private key in such a way that the signature can be verified by anybody who has the corresponding public key. Let's call this key pair KP2, to make it clear that these are not the same keys that the server is using.
Certificate Authorities
So who created KP2? Who signed the certificate?
Oversimplifying a bit, a certificate authority creates KP2, and they sell the service of using their private key to sign certificates for other organizations. For example, I create a certificate and I pay a company like Verisign to sign it with their private key.[3] Since nobody but Verisign has access to this private key, none of us can forge this signature.
And how would I personally get ahold of the public key in KP2 in order to verify that signature?
Well we've already seen that a certificate can hold a public key — and computer scientists love recursion — so why not put the KP2 public key into a certificate and distribute it that way? This sounds a little crazy at first, but in fact that's exactly how it works. Continuing with the Verisign example, Verisign produces a certificate that includes information about who they are, what types of things they are allowed to sign (other certificates), and their public key.
Now if I have a copy of that Verisign certificate, I can use that to validate the signature on the server certificate for the website I want to visit. Easy, right?!
Well, not so fast. I had to get the Verisign certificate from somewhere. What if somebody spoofs the Verisign certificate and puts their own public key in there? Then they can forge the signature on the server's certificate, and we're right back where we started: a man-in-the-middle attack.
Certificate Chains
Continuing to think recursively, we could of course introduce a third certificate and a third key pair (KP3) and use that to sign the Verisign certificate. We call this a certificate chain: each certificate in the chain is used to verify the next certificate. Hopefully you can already see that this recursive approach is just turtles/certificates all the way down. Where does it stop?
Since we can't create an infinite number of certificates, the certificate chain obviously has to stop somewhere, and that's done by including a certificate in the chain that is self-signed.
I'll pause for a moment while you pick up the pieces of brain matter from your head exploding. Self-signed?!
Yes, at the end of the certificate chain (a.k.a. the "root"), there will be a certificate that uses its own key pair to sign itself. This eliminates the infinite recursion problem, but it doesn't fix the authentication problem. Anybody can create a self-signed certificate that says anything on it, just like I can create a fake Princeton diploma that says I triple majored in politics, theoretical physics, and applied butt-kicking and then sign my own name at the bottom.
The [somewhat unexciting] solution to this problem is just to pick some set of self-signed certificates that you explicitly trust. For example, I might say, "I trust this Verisign self-signed certificate."
With that explicit trust in place, now I can validate the entire certificate chain. No matter how many certificates there are in the chain, I can validate each signature all the way down to the root. When I get to the root, I can check whether that root certificate is one that I explicitly trust. If so, then I can trust the entire chain.
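Here is a minimal sketch of that chain walk, under heavy assumptions: certificates are reduced to four fields, signatures are toy RSA with tiny well-known primes, and every name (Cert, issue, validate_chain, "Toy Root CA", "Rogue Root CA") is hypothetical. Real validation also checks expiry, revocation, name constraints, and key usage.

```python
import hashlib
from dataclasses import dataclass, replace

E = 65537  # public exponent shared by every toy key


@dataclass(frozen=True)
class KeyPair:
    n: int  # public modulus
    d: int  # private exponent


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    public_n: int   # the subject's public key
    signature: int  # produced with the issuer's private key


def make_keypair(p: int, q: int) -> KeyPair:
    return KeyPair(n=p * q, d=pow(E, -1, (p - 1) * (q - 1)))


def digest(subject: str, issuer: str, public_n: int, modulus: int) -> int:
    data = f"{subject}|{issuer}|{public_n}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def issue(subject: str, public_n: int, issuer: str, issuer_key: KeyPair) -> Cert:
    h = digest(subject, issuer, public_n, issuer_key.n)
    return Cert(subject, issuer, public_n, pow(h, issuer_key.d, issuer_key.n))


def signed_by(cert: Cert, issuer_n: int) -> bool:
    expected = digest(cert.subject, cert.issuer, cert.public_n, issuer_n)
    return pow(cert.signature, E, issuer_n) == expected


def validate_chain(chain: list, trusted_roots: dict) -> bool:
    """chain is ordered leaf first, root last."""
    if not chain:
        return False
    # Each certificate must be signed by the next one in the chain.
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject or not signed_by(cert, parent.public_n):
            return False
    # The root must be self-signed AND be one we explicitly trust.
    root = chain[-1]
    self_signed = root.issuer == root.subject and signed_by(root, root.public_n)
    return self_signed and trusted_roots.get(root.subject) == root.public_n


root_key = make_keypair(2**107 - 1, 2**127 - 1)
ca_key = make_keypair(2**61 - 1, 2**89 - 1)
site_key = make_keypair(2**13 - 1, 2**31 - 1)
rogue_key = make_keypair(2**17 - 1, 2**19 - 1)

root = issue("Toy Root CA", root_key.n, "Toy Root CA", root_key)
ca = issue("Toy Intermediate CA", ca_key.n, "Toy Root CA", root_key)
site = issue("payment.com", site_key.n, "Toy Intermediate CA", ca_key)
trusted = {"Toy Root CA": root_key.n}

# A self-signed root that anybody could create, plus a forged leaf under it.
rogue_root = issue("Rogue Root CA", rogue_key.n, "Rogue Root CA", rogue_key)
rogue_site = issue("payment.com", rogue_key.n, "Rogue Root CA", rogue_key)

print(validate_chain([site, ca, root], trusted))                  # True
print(validate_chain([replace(site, subject="evil.example"), ca, root], trusted))  # False
print(validate_chain([rogue_site, rogue_root], trusted))          # False
# If the rogue root is added to the trust store, the forged chain validates.
print(validate_chain([rogue_site, rogue_root],
                     {**trusted, "Rogue Root CA": rogue_key.n}))  # True
```

The last line is the interesting one: the forged chain is cryptographically consistent, so the only thing that stops it is that its root is not in the set of explicitly trusted roots.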
Conferred Trust
Authentication in TLS uses a system of conferred trust. If I want to hire an auto mechanic, I may not trust any random mechanic that I find. But maybe my friend vouches for a particular mechanic. Since I trust my friend, then I can trust that mechanic.
When you buy a computer or download a browser, it comes with a few hundred root certificates that it explicitly trusts.[4] The companies that own and operate those certificates can confer that trust to other organizations by signing their certificates.
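As a rough illustration of that built-in trust, Python's standard ssl module exposes the platform's trusted roots through a default context. The helper name fetch_peer_subject below is made up for this sketch; the number of roots loaded depends on the operating system and may even be reported as zero until certificates are loaded on demand.

```python
import socket
import ssl

# create_default_context() loads the system's trusted root certificates and
# enables both chain validation and host name checking.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # chain must validate
print(context.check_hostname)                    # name on the cert must match
print(context.cert_store_stats())                # counts vary by platform


def fetch_peer_subject(host: str, port: int = 443):
    """Connect, let the ssl module validate the chain, return the subject.

    Raises ssl.SSLCertVerificationError if the chain does not end in a
    trusted root or the host name does not match.
    """
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

Calling fetch_peer_subject("example.com") would need network access, so it is left uncalled here.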
This is far from a perfect system. Sometimes a CA may issue a certificate erroneously. In those cases, the certificate may need to be revoked. Revocation is tricky since the issued certificate will always be cryptographically correct; an out-of-band protocol is necessary to find out which previously valid certificates have been revoked. In practice, some of these protocols aren't very secure, and many browsers don't check them anyway.
Sometimes an entire CA is compromised. For example, if you were to break into Verisign and steal their root signing key, then you could spoof any certificate in the world. Notice that this doesn't just affect Verisign customers: even if my certificate is signed by Thawte (a competitor to Verisign), that doesn't matter. My certificate can still be forged using the compromised signing key from Verisign.
This isn't just theoretical. It has happened in the wild. DigiNotar was famously hacked and subsequently went bankrupt. Comodo was also hacked, but inexplicably they remain in business to this day.
Even when CAs aren't directly compromised, there are other threats in this system. For example, a government may use legal coercion to compel a CA to sign a forged certificate. Your employer may install their own CA certificate on your work computer. In these various cases, traffic that you expect to be "secure" is actually completely visible/modifiable to the organization that controls that certificate.
Some replacements have been suggested, including Convergence, TACK, and DANE.
Endnotes
[1] TLS certificate data is formatted according to the X.509 standard. X.509 is defined in ASN.1 ("Abstract Syntax Notation One"), which is an abstract notation rather than a concrete data format. Therefore, X.509 data must be encoded before it can be stored or transmitted. DER (a binary encoding) and PEM (Base64-encoded DER wrapped in BEGIN/END lines) are the two most common encodings that I know of.
[2] In practice, the protocol actually switches over to a symmetric cipher, but that's a detail that's not relevant to your question.
[3] Presumably, the CA actually validates who you are before signing your certificate. If they didn't do that, then I could just create a certificate for google.com and ask a CA to sign it. With that certificate, I could man-in-the-middle any "secure" connection to google.com. Therefore, the validation step is a very important factor in the operation of a CA. Unfortunately, it's not very clear how rigorous this validation process is at the hundreds of CAs around the world.
[4] See Mozilla's list of trusted CAs.
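As a small aside on endnote [1], the relationship between the two encodings can be seen with Python's standard ssl module. The bytes below are a placeholder, not a real certificate; the helpers only handle the Base64 wrapping and do not parse the ASN.1 content.

```python
import ssl

# Placeholder bytes standing in for a DER-encoded certificate (NOT a real one).
der_bytes = bytes(range(48))

# PEM is the same data, Base64-encoded and wrapped in BEGIN/END lines.
pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)
print(pem_text)

# Converting back yields the original binary data unchanged.
round_trip = ssl.PEM_cert_to_DER_cert(pem_text)
print(round_trip == der_bytes)  # True
```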
HTTPS is a combination of HTTP and SSL (Secure Sockets Layer) that provides encrypted communication between the client (browser) and the web server (where the application is hosted).
Why is it needed?
HTTPS encrypts data that is transmitted from browser to server over the network. So, no one can sniff the data during transmission.
How is an HTTPS connection established between the browser and the web server?
The browser tries to connect to https://payment.com.
The payment.com server sends a certificate to the browser. This certificate includes the payment.com server's public key, and some evidence that this public key actually belongs to payment.com.
The browser verifies the certificate to confirm that it has the proper public key for payment.com.
The browser chooses a random new symmetric key K to use for its connection to the payment.com server. It encrypts K under payment.com's public key.
payment.com decrypts K using its private key. Now both the browser and the payment server know K, but no one else does.
Anytime the browser wants to send something to payment.com, it encrypts it under K; the payment.com server decrypts it upon receipt. Anytime the payment.com server wants to send something to your browser, it encrypts it under K.
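The key exchange steps above can be sketched roughly as follows. Everything here is a toy built on assumptions: the RSA key uses two well-known Mersenne primes, the "cipher" is a simple hash-based XOR stream invented for the example, and the names are hypothetical. Real TLS uses vetted ciphers such as AES-GCM, and TLS 1.3 replaces this RSA key transport with a Diffie-Hellman exchange.

```python
import hashlib
import secrets

# Toy RSA key pair for the payment.com server (NOT secure, illustration only).
E = 65537
P, Q = 2**61 - 1, 2**89 - 1
SERVER_N = P * Q                              # public, sent in the certificate
SERVER_D = pow(E, -1, (P - 1) * (Q - 1))      # private, never leaves the server


def session_key(k: int) -> bytes:
    """Turn the shared secret number K into 32 bytes of key material."""
    return hashlib.sha256(k.to_bytes(32, "big")).digest()


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 based key stream.

    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Browser: choose a random K and encrypt it under the server's public key.
k_browser = secrets.randbelow(SERVER_N)
wrapped_k = pow(k_browser, E, SERVER_N)

# Server: recover K with the private key.
k_server = pow(wrapped_k, SERVER_D, SERVER_N)

# Both sides now share K and can exchange messages encrypted under it.
message = b"POST /pay amount=42"
ciphertext = xor_stream(session_key(k_browser), message)
plaintext = xor_stream(session_key(k_server), ciphertext)

print(k_browser == k_server)   # True
print(plaintext == message)    # True
```

An eavesdropper sees only wrapped_k and ciphertext; without the server's private exponent they cannot recover K.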
This flow can be represented by the following diagram:
I have written a small blog post which discusses the process briefly. Please feel free to take a look.
SSL Handshake
A small snippet from the same is as follows:
"Client makes a request to the server over HTTPS. Server sends a copy of its SSL certificate + public key. After verifying the identity of the server with its local trusted CA store, client generates a secret session key, encrypts it using the server's public key and sends it. Server decrypts the secret session key using its private key and sends an acknowledgment to the client. Secure channel established."
Mehaase has already explained it in detail. I will add my 2 cents to this series. I have many blog posts revolving around the SSL handshake and certificates. While most of them revolve around the IIS web server, they are still relevant to the SSL/TLS handshake in general. Here are a few for your reference:
SSL Handshake and IIS
Client certificate Authentication in SSL Handshake
Do not treat CERTIFICATES & SSL as one topic. Treat them as 2 different topics and then try to see how they work in conjunction. This will help you answer the question.
Establishing trust between communicating parties via Certificate Store
SSL/TLS communication works solely on the basis of trust. Every computer (client/server) on the internet maintains a list of Root CAs and Intermediate CAs. These lists are periodically updated. During the SSL handshake, this list is used as a reference to establish trust. For example, when the client provides a certificate to the server during the SSL handshake, the server will try to check whether the CA that issued the cert is present in its list of CAs. When it cannot do this, it declares that it was unable to do the certificate chain verification. (This is only part of the answer. It also looks at AIA for this.) The client also does a similar verification for the server certificate which it receives in Server Hello.
On Windows, you can see the certificate stores for client & Server via PowerShell. Execute the below from a PowerShell console.
    PS Cert:\> ls
    Location   : CurrentUser
    StoreNames : {TrustedPublisher, ClientAuthIssuer, Root, UserDS...}

    Location   : LocalMachine
    StoreNames : {TrustedPublisher, ClientAuthIssuer, Remote Desktop, Root...}
Browsers like Firefox and Opera don't rely on underlying OS for certificate management. They maintain their own separate certificate stores.
The SSL handshake uses both symmetric and public key cryptography. Server authentication happens by default. Client authentication is optional and depends on whether the server endpoint is configured to authenticate the client. Refer to my blog post, where I have explained this in detail.
Finally for this question
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
A certificate is simply a file whose format is defined by the X.509 standard. It is an electronic document which proves the identity of a communicating party.
HTTPS = HTTP + SSL is a protocol which defines the guidelines as to how 2 parties should communicate with each other.
MORE INFORMATION
To understand this topic, you first have to understand what certificates are and also read about certificate management. This is important.
Once this is understood, proceed with the TLS/SSL handshake. You may refer to the RFCs for this, but they are a skeleton that defines the guidelines. There are several blog posts, including mine, which explain this in detail.
If the above activity is done, then you will have a fair understanding of Certificates and SSL.