Client-side SSL theoretical question - authentication

I work at company X and we want to engage in a B2B transaction with company Y. In doing so, Y is requiring client side authentication; they already provide server-side authentication - so this would be a mutual SSL transaction.
My understanding is that I simply need to provide my CA-signed cert as part of my client side HTTPS communication. In doing so, asymmetric cryptographic guarantees (i.e., public/private key technology) ensures that I am who I claim I am - in effect, it is not possible to impersonate me. (the root CA has ensured I am who I am, that is why they signed my cert, and it's possible to prove that I didn't "manufacture" the certificate).
Here's my question: is this all I need to provide company Y?
Alternatively, do I have to provide them with another key in advance, so that they can ensure I am not a rogue who happens to have "gotten a hold" of a root-signed certificate for me (company X)? It would seem that if one has to provide this extra key to all parties he wishes to engage with on the client side, it would seem to make client-side SSL not as scalable as server-side SSL. My guess is that it's not possible for one to "get a hold" of a client side cert; that the actual transmission of the client side cert is further encoded by some state of the transaction (which is not reverse-engineerable, feasibly).
Does this make sense and am I right or am I wrong? If I'm wrong, does Y in effect need to get an "extra" pre-communication key from every party that connects to it? (which they want to verify)?
Thanks.
===
Thanks for the responses below, at least the first two have been helpful so far (others may arrive later).
Let me talk in a bit more detail about my technical concern.
Suppose company Y attempts to "re-use" my client side cert to impersonate me in another client-side transaction with another company (say "Z"). Is this even possible? I am thinking - again, perhaps poorly stated - that some part of the transmission of the client side certificate prevents the entire key from being compromised, i.e., that is not technically feasible to "re-use" a client certificate received because you could not (feasibly) reverse engineer the communication which communicate the client cert.
If this is not the case, couldn't Y re-use the cert, impersonating X when communicating (client-side) to Z?
ps: I realize that security is never 100%, just trying to understand what is technically feasible here and what is not.
Thanks very much!
===
Further technical details, I appreciate any additional input very much - this has been very quite helpful.
From a layman's point of view, my concern is that when a client sends his client cert, what is he sending? Well, he has to encrypt that cert using his private key. He sends the ciphertext with the public key, and then a receiving party can use that public key to decrypt the private-key-encoded payload, right? That makes sense, but then I wonder -- what prevents someone from hearing that communication and simply re-using the private key encoded payload in a replay attack. Just replay the exact ones and zeroes sent.
Here's what I believe prevents that -- it can be found in the TLS RFC, in multiple places, but for example in F.1.1:
F.1.1. Authentication and key exchange
TLS supports three authentication modes: authentication of both
parties, server authentication with an unauthenticated client, and
total anonymity. Whenever the server is authenticated, the channel is
secure against man-in-the-middle attacks, but completely anonymous
sessions are inherently vulnerable to such attacks. Anonymous
servers cannot authenticate clients. If the server is authenticated,
its certificate message must provide a valid certificate chain
leading to an acceptable certificate authority. Similarly,
authenticated clients must supply an acceptable certificate to the
server. Each party is responsible for verifying that the other's
certificate is valid and has not expired or been revoked.
The general goal of the key exchange process is to create a
pre_master_secret known to the communicating parties and not to
attackers. The pre_master_secret will be used to generate the
master_secret (see Section 8.1). The master_secret is required to
generate the certificate verify and finished messages, encryption
keys, and MAC secrets (see Sections 7.4.8, 7.4.9 and 6.3). By sending
a correct finished message, parties thus prove that they know the
correct pre_master_secret.
I believe it's the randomization associated with the session that prevents these replay attacks.
Sound right or am I confusing things further?

As long as you keep the private key safe (and the CA keeps their signing key safe), "company Y" doesn't need any other information to authenticate you. In other words, they can be sure that a request really came from the subject named in the certificate.
However, this doesn't mean that you are authorized to do anything. In practice, most systems that use client certificates have an "out of band" process where you provide the "subject" distinguished name that is specified in the client certificate, and the system assigns some privileges to that name.
In fact, because of some practical limitations, some systems actually associate the privileges with the certificate itself (using the issuer's name and the certificate serial number). This means that if you get a new certificate, you might have to re-enroll it, even if it has the same subject name.
A certificate only assures a relying party you have a certain name. That party needs some additional mechanism to determine what you are allowed to do.
Unlike authentication mechanisms based on "secret" (symmetric) keys—e.g., passwords—a server only needs public information for asymmetric authentication. A private signing key should never be disclosed; it's certainly not necessary for client authentication.
With symmetric, password-based authentication, the client and server get access to the same string of bytes—the secret key. With public-key cryptography, only the public key of a key pair is disclosed. The corresponding private key is never disclosed, and no practical method for figuring out the private key from the public key has been discovered.
As long as you keep your private key safe, the server at company Y cannot forge requests that appear to come from you.
Client authentication replay attacks are defeated by requiring the client to digitally sign a message that includes a number randomly generated by the server for each handshake (this is the "random" member of the "ServerHello" message). If the packet used to authenticate the client in a previous session were re-used, a server will be unable to verify the signature, and won't authenticate the replay attack.
RFC 2246, Appendix F.1.1.2 might be a more helpful reference—especially the third paragraph:
When RSA is used for key exchange, clients are authenticated using
the certificate verify message (see Section 7.4.8). The client signs
a value derived from the master_secret and all preceding handshake
messages. These handshake messages include the server certificate,
which binds the signature to the server, and ServerHello.random,
which binds the signature to the current handshake process.

Your client-side certificate (or more precisely its private key), is only as secure as your company's online and/or physical security let it be.
For extremely secure relations (which typically do not have the requirement of scaling much), it may be acceptable that the provider of the service requires an extra element in the protocol which allows them to identify your site (and more often than not, to identify a particular computer or individual within the company, which is something client certs do not fully do.)
This of course brings the question: what is warranty that this extra bit of authentication device will be more securely held by your company? (as compared to the client-site certificate itself). The standard response for this is that these extra bits of security elements are typically non-standards, possibly associated with physical devices, machine IDs and such, and are therefore less easily transportable (and the know-how about these is less common: hackers know what RSA files to look for, and what they look like, what do they know of the genesis and usage of the KBD-4.hex file ?)
Extra question: Can Y make use of my client-side certificate elsewhere?
No they [normally] cannot! The integrity of this certificate lies in your safe keeping of its private key (and, yes, a similar safekeeping from the certificate providers...). Therefore, unless they are responsible for installing the said certificate, or unless their software on your client (if any) somehow "hacks" into certificate-related storage / files / system dlls, they should not be able to reuse the certificate. That is they cannot reuse the certificate any more easily (which is theoretically NP hard) than anyone that, say, would sniff the packets related to authentication as the client establishes a session with the Y site.
Extra questionS ;-)
- What is the nature of the client cert?
- Man-in-the-middle concerns...
Before getting to these, let's clear a few things up...
The question seems to imply TLS (Transport Layer Security) which is indeed a good protocol for this purpose, but for sake of understanding, the keys (public and private) from the certs (server's and client's) could well be used with alternative protocols. And also, TLS itself offers several different possible encryption algorithms for its support (one of the initial phases of a TLS session is for both parties to negotiate the set of algorithms they'll effectively use).
Also, what goes without saying... (also goes if you say it): the respective private keys are NEVER transmitted in any fashion, encoded or not. The confusion sometimes arise because after the authentication phase, the parties exchange a key (typically for a symmetric cipher) that is used in subsequent exchanges. This key is typically randomly generated, and of totally different nature than the RSA keys whether public or private!
In a simplified fashion, the client's certificate contains the following information:
- The Certificate Authority (CA aka issuer)
- The "owner" of the certificate (aka Subject)
- Validity date range
- the PUBLIC key of the certificate
In more detail, the certificates are typically found in an X.509 wrapper (? envelope), which contains additional fields such as version number, algorithm used, certificate ID, a certificate signature (very important to ensure that the certificate received wasn't tempered with). The X.509 also provide for optional attributes, and is also used for transmitting other types of certificate-related data (such as the CRL)
Therefore the certificate's content allow the recipient to:
- ensure the certificate itself was not tempered with
- ensure that the issuer of the certificate is one accepted by the recipient
- ensure that the certificate is valid/current and not revoked
- know the public key and its underlying size and algorithm
With regards to man-in-the-middle concerns, in particular the possibility of "re-playing" a possibly recorded packet exchange from a previous session.
The protocol uses variable, possibly random MACs (Message Authentication Codes) for that purpose. Essentially, during the negotiation phase, one of the parties (not sure which, may vary...) produces a random string of sorts and sends it to the other party. This random value is then used, as-is (or, typically, with some extra processing by an algorithm known by both parties) as part of the messages sent. It being encoded with the private key of teh sending party, if the the receiving party can decode it (with sender's public key) and recognize the (again variable) MAC, then it is proof that the sender is in possession of the private key of the certificate, and hence its identity is asserted. Because the MACs vary each time, pre-recorded sessions are of no help (for this simple purpose).

Related

How to make sure the domain level SSL certificate is present in trust store while establishing the connection to a website?

According to my understanding, when we are trying to connect to a website/url, even if one of the certificates in the SSL certificate chain of the website is present in the trust store then connection is established successfully. But, I want to establish a connection only if the domain level certificate is present in the trust store. And I am not allowed create a new trust store instead need to use the default trust store. How can this be implemented in Java? TIA.
Unfortunately for you, that's not how PKIs were designed to work. The search for any trusted root certificate in the chain is a design feature of PKIs that ensures we don't have to install a certificate per domain on clients - bloating local trust stores with millions of certificates and complicating revocation and renewal of certificates.
What you're looking for is referred to certificate pinning where the client validates that the certificate presented by the server has a specific thumbprint it knows and trusts before continuing any further communication with the server on the other end. It is essentially the client authenticating the server.
Depending on your particular implementation, the validation logic can be done in the application instead of at the TLS/SSL protocol layer, meaning you can do as much (CN, Key Usage Attributes, SAN) or as little (just thumbprint)validation as you want , but typically certificate thumbprints are used since they are *guaranteed to be unique. A interception proxy or other man-in-the-middle for instance can create a certificate with valid CN entry for your domain (valid domain validation), but they cannot spoof the thumbprint.
A certificate is a unique token issued to a particular individual. It is a form of identification, similar to a government-issued photo ID which most people carry.
Certificates were designed for one purpose - to convey an identity which can be verified as authentic. It does this via a chain of trust. If a client or server trusts the issuer of a certificate, then it will automatically trust the certificate.
Put in similar terms, this is similar to the TSA specifying guidelines for which forms of identification it will accept before it will let you into the security checkpoint. As long as you possess one of those valid forms of ID, the TSA will let you through. This is how the PKI is designed, and it has to be designed that way to function efficiently. So, there is no way to do this explicitly in the PKI framework.
What you're instead asking for is a separate level of identify verification beyond what PKI provides. A possible solution could be certificate pinning, but I'm not sure this gets you anywhere. If the private key is compromised, which is probably more likely than compromising a trusted CA, then you haven't gained any additional level of security.
Instead, best practice is to implement multi-factor authentication. Using the certificate itself as a second factor really doesn't make a whole lot of sense, because it isn't truly a two-factor identity. Instead, it would make more sense to use the PKI as-is, and establish a second authentication mechanism via TOTP or some other independent token generation.

Security and getting a certificate: designing a protocol

In the beginning of an SSL query, the client sends a CLIENT_HELLO message.
The server replies with a certificate message that gives the chain of verifications going back to a known trusted agent.
Suppose for efficiency I wanted to store certificates locally for a new protocol. The current design in TLS is always to require getting the certificate. What could happen to a certificate that would require me to know?
I am trying to understand possible attack scenarios. Consider doing online banking, and suppose a certificate has been compromised. In such as case, the bank is not criminal, but they have been hacked and have to issue a new certificate. Is this reasonable?
If you consider that the bank itself is corrupt, then it seems to me there is no point in worrying about the certificate since they have your money and can just steal it. If the entity you are dealing with goes criminal, does the certificate matter?
Under what circumstances can certfiicates be revoked? I am trying to understand why SSL sends the certificate each time -- it seems really wasteful, but there is probably a good reason.
Would it be possible instead to keep all certificates stored on the client, but check a timestamp with a trusted server? It seems like one could at least send less data across the network
TLS Certificate message after `ServerHello is mandatory in mostly cases, so caching won't have any useful effect. See RFC5246
7.4.2. Server Certificate
When this message will be sent:
The server MUST send a Certificate message whenever the agreed- upon key exchange method uses certificates for authentication (this includes all key exchange methods defined in this document except DH_anon). This message will always immediately follow the ServerHello message.
TLS has its own methods to improve performance. When client sends a valid session_id in ClientHello the session can be resumed and the parties must proceed directly to the Finished messages
Also RFC5077 specifies how to resume sessions without server-side state
EDITED - added comments to specific questions
Suppose for efficiency I wanted to store certificates locally for a new protocol. The current design in TLS is always to require getting the certificate. What could happen to a certificate that would require me to know?
"always" is not correct. TLS sends certificates during handshake. Once the shared key is negotiated, the session can be resumed later by client using sessionid (the usual behaviour). Then, the server does not send the certification chain.
The server sends the certification chain. The client must verify that the presented certificate is reliable:
checking a digital signature performed with the private key of the server certificat
the certificate is issued by a trusted CA. Is supposed that client has a trust store with the root certificates of the certification authorities it trust. The client builds the certification chain presented by server until it finds the root certificate in local truststore
You can perfectly skip the sending of certificates from the server in the second step step if client has a copy of the server certificate in a local truststore
I am trying to understand possible attack scenarios. Consider doing online banking, and suppose a certificate has been compromised. In such as case, the bank is not criminal, but they have been hacked and have to issue a new certificate. Is this reasonable?
In this scenario the attacker could make a MITM attack. The certificate must be revoked by CA and client should check revocation. This is out of scope of TLS
If you consider that the bank itself is corrupt, then it seems to me there is no point in worrying about the certificate since they have your money and can just steal it. If the entity you are dealing with goes criminal, does the certificate matter?
Seems in this case the certificate is the least of the problems...
Under what circumstances can certfiicates be revoked?
Each CA stablish its own procedure. There is no a standard but there are "good practices": When certificate data changes (e.g email) or becomes invalid (Representative of a company), after a renewal revoke the older one, when key is compromised or certificate is lost
Would it be possible instead to keep all certificates stored on the client, but check a timestamp with a trusted server?
Yes it is possible as commented above: Verify a digital signature, verify revocation and stablish a refreshing mechanism
But if you're looking for performance comparing with TLS, the session resumption will probably have better results

Add RSA based encryption to WCF service without certificates

I am looking for a way to encrypt messages between client and server using the WCF. WCF offers a lot of built in security mechanisms to enrcypt traffic between client and server, but there seems to be nothing fitting my requirements.
I don't want to use certificates since they are too complicated, so don't suggest me to to use certificates please. I don't need confidentiality, so I though I'll go best using plain RSA.
I want real security, no hardcoded key or something. I was thinking about having a public/private keypair generated every time the server starts. Both keys will only be stored in RAM.
Then wen a client connects it should do exactly like SSL. Just as described here.
1.exchange some form of a private/public key pair; the server generates a key pair and keeps the private key to itself and shares the public key with the client (e.g. over a WCF message, for instance)
2.using that private/public key pair, exchange a common shared secret, e.g. an "encryption key" that will symmetrically encrypt your messages (and since it's symmetrical, the server can use the same key to decrypt the messages)
3.setup infrastructure on your client (e.g. a WCF extension called a behavior) to inspect the message before it goes out and encrypt it with your shared secret
That would be secure, wouldn't it?
Is there any existing solution to archive what I described? If not I'll create it on my own. Where do I start best? Which kind of WCF custom behaviour is the best to implement?
EDIT:
As this is NOT secure, I'll take the following approach:
When Installing the server component a new X509 certificate will be generated and automatially added to the cert store (of the server). The public part of this generated certificate will be dynamically included into the client setup. When running the client setup on the client machine the certificate will be installed into the trustet windows certificate store of the client.
So there's no extra work when installing the product and everything should be secure, just as we want it.
You've said you don't want to use certificates. I won't push certificate use on you, but one thing you are missing is that certificates serve a purpose.
A certificate proves that key you are negotiating an SSL connection with belongs to the entity you think it belongs to. If you have some way of ensuring this is the case without using certificates, by all means, use raw keys.
The problem is, in step 1:
1.exchange some form of a private/public key pair; the server generates a key pair and keeps the private key to itself and shares the public key with the client (e.g. over a WCF message, for instance)
How does the client know that the public key it received from the server wasn't intercepted by a man-in-the-middle and replaced with the MITM's key?
This is why certificates exist. If you don't want to use them, you have to come up with another way of solving this problem.
Do you have a small, well-known set of clients? Is it possible to preconfigure the server's public key on the client?
Alexandru Lungu has created an article on codeproject:
WCF Client Server Application with Custom Authentication, Authorization, Encryption and Compression
No, it would not be secure!
since there's no confidentiality, an attacker could do a men in the middle attack, and all the security is gone.
The only real secure way of encrypting messages between server and client IS to actually use digital certificates.
I'm sorry, the only two methods of providing secure communications are:
Use a public key infrastructure that includes a chain of trust relationships, a.k.a. certificates
or
Use a shared secret, a.k.a. a hardcoded key.
Nothing else addresses all of the known common attack vectors such as man-in-the-middle, replay attack, etc. That's the hard truth.
On the other hand I can offer you an alternative that may alleviate your problem somewhat: Use both.
Write a very, very simple web service whose only job is to generate symmetric keys. Publish this service via SSL. Require end user credential authentication in order to obtain a symmetric key.
Write the rest of your services without SSL but using the symmetric keys published via the first service.
That way your main app doesn't have to deal with the certificates.

Does TLS (/ handled by a load balancer) with a client certificate provide non-repudiation and message integrity?

So, a few questions:
1) Does using a client certificate during TLS provide non-repudation?
1a) Follow-up: If so, does having a load balancer handle the transaction still provide this assurance at the end server/service level?
2/2a) Same questions as above, but for message integrity.
I know the answers for MLS, but I'm not sure about TLS. If I understand correctly, TLS involves a handshake where the shared secret is established, and that is used to secure the pipe - so none of these things hold, since each message uses only the shared secret.
Does using a client certificate during TLS provide non-repudation?
No. This is unfortunate, as it engages in everything that is necessary to implement it except making the digital signature available. It might be possible to stand up in court as an expert witness and make a case that as it all happens under the hood the transaction should be non-repudiable anyway, but I would much rather prefer to be able to produce an actual digital signature in court. Then there is nothing to debate.
2/2a) Same questions as above, but for message integrity.
TLS provides message privacy, message integrity, and authentication of at least one of the peers (unless you have broken that). It doesn't provide authorization, and it doesn't provide non-repudiation as above.
Does using a client certificate during TLS provide non-repudation?
1a) Follow-up: If so, does having a load balancer handle the
transaction still provide this assurance at the end server/service
level?
TLS is about transport level security (hence the name Transport Layer Security). It aims to secure the communication between the client and the server (possible a load-balancer), according to the specification:
The primary goal of the TLS Protocol is to provide privacy and data
integrity between two communicating applications.
You could in principle keep the entire TLS exchange, in particular keeping the handshake to prove that the client-certificate signed the content of the Certificate Verify message. You would also have to keep the various generated/calculated intermediate values (in particular the master secret, and subsequently the shared key). There is one problem with this: the TLS specification requires (only with "should") the pre-master secret to be deleted. This could make proving the path back from data to client certificate rather difficult. (You would certainly have to tweak SSL/TLS stacks to record all this too.)
In addition, recording all this would be under the application protocol (assuming HTTPS here, but the same would apply to other protocols). This would certainly be another layer before you get to the actual data you want not to be repudiated. (The problem is that you may have to record the entire session for proof, without being able to select with request/response to isolate.)
You may also run into further problems when it comes to session resumption (for example) and generally parallel requests. This would certainly add to the confusion.
Overall, it's not what TLS is designed for. Non-repudiation is about being able to keep a proof of the exchange, possibly to be able to display it in court or similar. Explaining to people (who might not have the technical background) how you make the link between the interesting data and the client certificate could be challenging.
2/2a) Same questions as above, but for message integrity.
TLS guarantees the integrity of the communication (see introduction to the specification). (All of this, of course, provided that the client verifies the server certificate correctly, although you should be able to detect a MITM if you're using client certificates anyway.)
Integrity will only be guaranteed up to the point where the TLS connection ends. This will be the load-balancer itself if you're using one. (Of course, it's better to link the load-balancer to its worker nodes via a network that can be trusted.)
If you're after a system where clients can send non-repudiating messages, which can be audited at a later date, you should look into message level solutions.

How does SSL really work?

How does SSL work?
Where is the certificate installed on the client (or browser?) and the server (or web server?)?
How does the trust/encryption/authentication process start when you enter the URL into the browser and get the page from the server?
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
Note: I wrote my original answer very hastily, but since then, this has turned into a fairly popular question/answer, so I have expanded it a bit and made it more precise.
TLS Capabilities
"SSL" is the name that is most often used to refer to this protocol, but SSL specifically refers to the proprietary protocol designed by Netscape in the mid 90's. "TLS" is an IETF standard that is based on SSL, so I will use TLS in my answer. These days, the odds are that nearly all of your secure connections on the web are really using TLS, not SSL.
TLS has several capabilities:
Encrypt your application layer data. (In your case, the application layer protocol is HTTP.)
Authenticate the server to the client.
Authenticate the client to the server.
#1 and #2 are very common. #3 is less common. You seem to be focusing on #2, so I'll explain that part.
Authentication
A server authenticates itself to a client using a certificate. A certificate is a blob of data[1] that contains information about a website:
Domain name
Public key
The company that owns it
When it was issued
When it expires
Who issued it
Etc.
You can achieve confidentiality (#1 above) by using the public key included in the certificate to encrypt messages that can only be decrypted by the corresponding private key, which should be stored safely on that server.[2] Let's call this key pair KP1, so that we won't get confused later on. You can also verify that the domain name on the certificate matches the site you're visiting (#2 above).
But what if an adversary could modify packets sent to and from the server, and what if that adversary modified the certificate you were presented with and inserted their own public key or changed any other important details? If that happened, the adversary could intercept and modify any messages that you thought were securely encrypted.
To prevent this very attack, the certificate is cryptographically signed by somebody else's private key in such a way that the signature can be verified by anybody who has the corresponding public key. Let's call this key pair KP2, to make it clear that these are not the same keys that the server is using.
Certificate Authorities
So who created KP2? Who signed the certificate?
Oversimplifying a bit, a certificate authority creates KP2, and they sell the service of using their private key to sign certificates for other organizations. For example, I create a certificate and I pay a company like Verisign to sign it with their private key.[3] Since nobody but Verisign has access to this private key, none of us can forge this signature.
And how would I personally get ahold of the public key in KP2 in order to verify that signature?
Well we've already seen that a certificate can hold a public key — and computer scientists love recursion — so why not put the KP2 public key into a certificate and distribute it that way? This sounds a little crazy at first, but in fact that's exactly how it works. Continuing with the Verisign example, Verisign produces a certificate that includes information about who they are, what types of things they are allowed to sign (other certificates), and their public key.
Now if I have a copy of that Verisign certificate, I can use that to validate the signature on the server certificate for the website I want to visit. Easy, right?!
Well, not so fast. I had to get the Verisign certificate from somewhere. What if somebody spoofs the Verisign certificate and puts their own public key in there? Then they can forge the signature on the server's certificate, and we're right back where we started: a man-in-the-middle attack.
Certificate Chains
Continuing to think recursively, we could of course introduce a third certificate and a third key pair (KP3) and use that to sign the Verisign certifcate. We call this a certificate chain: each certificate in the chain is used to verify the next certificate. Hopefully you can already see that this recursive approach is just turtles/certificates all the way down. Where does it stop?
Since we can't create an infinite number of certificates, the certificate chain obviously has to stop somewhere, and that's done by including a certificate in the chain that is self-signed.
I'll pause for a moment while you pick up the pieces of brain matter from your head exploding. Self-signed?!
Yes, at the end of the certificate chain (a.k.a. the "root"), there will be a certificate that uses it's own keypair to sign itself. This eliminates the infinite recursion problem, but it doesn't fix the authentication problem. Anybody can create a self-signed certificate that says anything on it, just like I can create a fake Princeton diploma that says I triple majored in politics, theoretical physics, and applied butt-kicking and then sign my own name at the bottom.
The [somewhat unexciting] solution to this problem is just to pick some set of self-signed certificates that you explicitly trust. For example, I might say, "I trust this Verisign self-signed certificate."
With that explicit trust in place, now I can validate the entire certificate chain. No matter how many certificates there are in the chain, I can validate each signature all the way down to the root. When I get to the root, I can check whether that root certificate is one that I explicitly trust. If so, then I can trust the entire chain.
Conferred Trust
Authentication in TLS uses a system of conferred trust. If I want to hire an auto mechanic, I may not trust any random mechanic that I find. But maybe my friend vouches for a particular mechanic. Since I trust my friend, then I can trust that mechanic.
When you buy a computer or download a browser, it comes with a few hundred root certificates that it explicitly trusts.[4] The companies that own and operate those certificates can confer that trust to other organizations by signing their certificates.
This is far from a perfect system. Some times a CA may issue a certificate erroneously. In those cases, the certificate may need to be revoked. Revocation is tricky since the issued certificate will always be cryptographically correct; an out-of-band protocol is necessary to find out which previously valid certificates have been revoked. In practice, some of these protocols aren't very secure, and many browsers don't check them anyway.
Sometimes an entire CA is compromised. For example, if you were to break into Verisign and steal their root signing key, then you could spoof any certificate in the world. Notice that this doesn't just affect Verisign customers: even if my certificate is signed by Thawte (a competitor to Verisign), that doesn't matter. My certificate can still be forged using the compromised signing key from Verisign.
This isn't just theoretical. It has happened in the wild. DigiNotar was famously hacked and subsequently went bankrupt. Comodo was also hacked, but inexplicably they remain in business to this day.
Even when CAs aren't directly compromised, there are other threats in this system. For example, a government use legal coercion to compel a CA to sign a forged certificate. Your employer may install their own CA certificate on your employee computer. In these various cases, traffic that you expect to be "secure" is actually completely visible/modifiable to the organization that controls that certificate.
Some replacements have been suggested, including Convergence, TACK, and DANE.
Endnotes
[1] TLS certificate data is formatted according to the X.509 standard. X.509 is based on ASN.1 ("Abstract Syntax Notation #1"), which means that it is not a binary data format. Therefore, X.509 must be encoded to a binary format. DER and PEM are the two most common encodings that I know of.
[2] In practice, the protocol actually switches over to a symmetric cipher, but that's a detail that's not relevant to your question.
[3] Presumable, the CA actually validates who you are before signing your certificate. If they didn't do that, then I could just create a certificate for google.com and ask a CA to sign it. With that certificiate, I could man-in-the-middle any "secure" connection to google.com. Therefore, the validation step is a very important factor in the operation of a CA. Unfortunately, it's not very clear how rigorous this validation process is at the hundreds of CAs around the world.
[4] See Mozilla's list of trusted CAs.
HTTPS is combination of HTTP and SSL(Secure Socket Layer) to provide encrypted communication between client (browser) and web server (application is hosted here).
Why is it needed?
HTTPS encrypts data that is transmitted from browser to server over the network. So, no one can sniff the data during transmission.
How HTTPS connection is established between browser and web server?
Browser tries to connect to the https://payment.com.
payment.com server sends a certificate to the browser. This certificate includes payment.com server's public key, and some evidence that this public key actually belongs to payment.com.
Browser verifies the certificate to confirm that it has the proper public key for payment.com.
Browser chooses a random new symmetric key K to use for its connection to payment.com server. It encrypts K under payment.com public key.
payment.com decrypts K using its private key. Now both browser and the payment server know K, but no one else does.
Anytime browser wants to send something to payment.com, it encrypts it under K; the payment.com server decrypts it upon receipt. Anytime the payment.com server wants to send something to your browser, it encrypts it under K.
This flow can be represented by the following diagram:
I have written a small blog post which discusses the process briefly. Please feel free to take a look.
SSL Handshake
A small snippet from the same is as follows:
"Client makes a request to the server over HTTPS. Server sends a copy of its SSL certificate + public key. After verifying the identity of the server with its local trusted CA store, client generates a secret session key, encrypts it using the server's public key and sends it. Server decrypts the secret session key using its private key and sends an acknowledgment to the client. Secure channel established."
Mehaase has explained it in details already. I will add my 2 cents to this series. I have many blogposts revolving around SSL handshake and certificates. While most of this revolves around IIS web server, the post is still relevant to SSL/TLS handshake in general. Here are few for your reference:
SSL Handshake and IIS
Client certificate Authentication in SSL Handshake
Do not treat CERTIFICATES & SSL as one topic. Treat them as 2 different topics and then try to see who they work in conjunction. This will help you answer the question.
Establishing trust between communicating parties via Certificate Store
SSL/TLS communication works solely on the basis of trust. Every computer (client/server) on the internet has a list of Root CA's and Intermediate CA's that it maintains. These are periodically updated. During SSL handshake this is used as a reference to establish trust. For exampe, during SSL handshake, when the client provides a certificate to the server. The server will try to cehck whether the CA who issued the cert is present in its list of CA's . When it cannot do this, it declares that it was unable to do the certificate chain verification. (This is a part of the answer. It also looks at AIA for this.) The client also does a similar verification for the server certificate which it receives in Server Hello.
On Windows, you can see the certificate stores for client & Server via PowerShell. Execute the below from a PowerShell console.
PS Cert:> ls Location : CurrentUser StoreNames : {TrustedPublisher, ClientAuthIssuer, Root, UserDS...}
Location : LocalMachine StoreNames : {TrustedPublisher,
ClientAuthIssuer, Remote Desktop, Root...}
Browsers like Firefox and Opera don't rely on underlying OS for certificate management. They maintain their own separate certificate stores.
The SSL handshake uses both Symmetric & Public Key Cryptography. Server Authentication happens by default. Client Authentication is optional and depends if the Server endpoint is configured to authenticate the client or not. Refer my blog post as I have explained this in detail.
Finally for this question
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
Certificates is simply a file whose format is defined by X.509 standard. It is a electronic document which proves the identity of a communicating party.
HTTPS = HTTP + SSL is a protocol which defines the guidelines as to how 2 parties should communicate with each other.
MORE INFORMATION
In order to understand certificates you will have to understand what certificates are and also read about Certificate Management. These is important.
Once this is understood, then proceed with TLS/SSL handshake. You may refer the RFC's for this. But they are skeleton which define the guidelines. There are several blogposts including mine which explain this in detail.
If the above activity is done, then you will have a fair understanding of Certificates and SSL.