Encrypting a file using a private key and public key - ssl

I am a developer with very little experience of encryption. I am trying to learn more about Encryption and specifically SSL in my spare time.
Say a trusted company has a file (notepad) that contains a load of personal and confidential information. Say I wanted to ask them to send this information to a me to analyze. How would I do this?
My research is telling me that I would have to:
1) Create a self signed certificate
2) Generate the public and private key
3) Issue the public key to the trusted company and keep the private key
4) Ask the company to encrypt the information using the public key and then send it to me
5) Decrypt the file using the private key
Is that correct? How would I do this?
Should the certificate be signed by a Certificate Authority in this case as I would manually be issuing the public key to the trusted company (they know it comes from me).
As I said I am relatively new to this subject.

SSL is a protocol for encrypting network connections. It's not really relevant for manually encrypting and decrypting files. (It's also insecure and deprecated; TLS should be used instead.)
If you want to manually encrypt and decrypt files, yes, you could use a self-signed certificate and just give the public key to the company that'll send the file. However, you have to make sure to give them the public key in a way that can't be intercepted — such as physically handing it over on a flash drive, not emailing it — or you're vulnerable to a man-in-the-middle attack: someone could intercept the public key and give the company a different one instead. (If that happened, the company would unknowingly be encrypting the file for the attacker to decrypt, and the attacker would then re-encrypt the file with the real public key and send it along to you.)
If you're doing this manually and you deliver the public key in a secure way, you don't need to involve a certificate authority. The purpose of a certificate authority is to vouch for the authenticity of a certificate when it's delivered over an insecure channel. For example, when you visit stackoverflow.com, your browser gets a certificate (with a public key) from the web server. Since this certificate wasn't personally given to you by SO staff, how do you know that the web server is the real stackoverflow.com, not a man-in-the-middle attacker intercepting your connection in order to hijack your account? The website's certificate is signed by a certificate authority — DigiCert, currently — that says DigiCert has verified that it's real. Your browser is configured to trust that signature from DigiCert, because the browser developers have verified that the company is trustworthy.
PGP and GPG are good tools for securely exchanging encrypted files. They use a "web-of-trust" model based on securely exchanging certificates, rather than relying on a certificate authority.

Related

How a client verifies a certificate validity? [duplicate]

What is the series of steps needed to securely verify a ssl certificate? My (very limited) understanding is that when you visit an https site, the server sends a certificate to the client (the browser) and the browser gets the certificate's issuer information from that certificate, then uses that to contact the issuerer, and somehow compares certificates for validity.
How exactly is this done?
What about the process makes it immune to man-in-the-middle attacks?
What prevents some random person from setting up their own verification service to use in man-in-the-middle attacks, so everything "looks" secure?
Here is a very simplified explanation:
Your web browser downloads the web server's certificate, which contains the public key of the web server. This certificate is signed with the private key of a trusted certificate authority.
Your web browser comes installed with the public keys of all of the major certificate authorities. It uses this public key to verify that the web server's certificate was indeed signed by the trusted certificate authority.
The certificate contains the domain name and/or ip address of the web server. Your web browser confirms with the certificate authority that the address listed in the certificate is the one to which it has an open connection.
Your web browser generates a shared symmetric key which will be used to encrypt the HTTP traffic on this connection; this is much more efficient than using public/private key encryption for everything. Your browser encrypts the symmetric key with the public key of the web server then sends it back, thus ensuring that only the web server can decrypt it, since only the web server has its private key.
Note that the certificate authority (CA) is essential to preventing man-in-the-middle attacks. However, even an unsigned certificate will prevent someone from passively listening in on your encrypted traffic, since they have no way to gain access to your shared symmetric key.
You said that
the browser gets the certificate's issuer information from that
certificate, then uses that to contact the issuerer, and somehow
compares certificates for validity.
The client doesn't have to check with the issuer because two things :
all browsers have a pre-installed list of all major CAs public keys
the certificate is signed, and that signature itself is enough proof that the certificate is valid because the client can make sure, on his own, and without contacting the issuer's server, that that certificate is authentic. That's the beauty of asymmetric encryption.
Notice that 2. can't be done without 1.
This is better explained in this big diagram I made some time ago
(skip to "what's a signature ?" at the bottom)
It's worth noting that in addition to purchasing a certificate (as mentioned above), you can also create your own for free; this is referred to as a "self-signed certificate". The difference between a self-signed certificate and one that's purchased is simple: the purchased one has been signed by a Certificate Authority that your browser already knows about. In other words, your browser can easily validate the authenticity of a purchased certificate.
Unfortunately this has led to a common misconception that self-signed certificates are inherently less secure than those sold by commercial CA's like GoDaddy and Verisign, and that you have to live with browser warnings/exceptions if you use them; this is incorrect.
If you securely distribute a self-signed certificate (or CA cert, as bobince suggested) and install it in the browsers that will use your site, it's just as secure as one that's purchased and is not vulnerable to man-in-the-middle attacks and cert forgery. Obviously this means that it's only feasible if only a few people need secure access to your site (e.g., internal apps, personal blogs, etc.).
I KNOW THE BELOW IS LONG, BUT IT IS DETAILED, YET SIMPLIFIED ENOUGH. READ CAREFULLY AND I GUARANTEE YOU'LL START FINDING THIS TOPIC IS NOT ALL THAT COMPLICATED.
First of all, anyone can create 2 keys. One to encrypt data, and another to decrypt data. The former can be a private key, and the latter a public key, AND VICERZA.
Second of all, in simplest terms, a Certificate Authority (CA) offers the service of creating a certificate for you. How? They use certain values (the CA's issuer name, your server's public key, company name, domain, etc.) and they use their SUPER DUPER ULTRA SECURE SECRET private key and encrypt this data. The result of this encrypted data is a SIGNATURE.
So now the CA gives you back a certificate. The certificate is basically a file containing the values previously mentioned (CA's issuer name, company name, domain, your server's public key, etc.), INCLUDING the signature (i.e. an encrypted version of the latter values).
Now, with all that being said, here is a REALLY IMPORTANT part to remember: your device/OS (Windows, Android, etc.) pretty much keeps a list of all major/trusted CA's and their PUBLIC KEYS (if you're thinking that these public keys are used to decrypt the signatures inside the certificates, YOU ARE CORRECT!).
Ok, if you read the above, this sequential example will be a breeze now:
Example-Company asks Example-CA to create for them a certificate.
Example-CA uses their super private key to sign this certificate and gives Example-Company the certificate.
Tomorrow, internet-user-Bob uses Chrome/Firefox/etc. to browse to https://example-company.com. Most, if not all, browsers nowadays will expect a certificate back from the server.
The browser gets the certificate from example-company.com. The certificate says it's been issued by Example-CA. It just so happens to be that Bob's OS already has Example-CA in its list of trusted CA's, so the browser gets Example-CA's public key. Remember: this is all happening in Bob's computer/mobile/etc., not over the wire.
So now the browser decrypts the signature in the certificate. FINALLY, the browser compares the decrypted values with the contents of the certificate itself. IF THE CONTENTS MATCH, THAT MEANS THE SIGNATURE IS VALID!
Why? Think about it, only this public key can decrypt the signature in such a way that the contents look like they did before the private key encrypted them.
How about man in the middle attacks?
This is one of the main reasons (if not the main reason) why the above standard was created.
Let's say hacker-Jane intercepts internet-user-Bob's request, and replies with her own certificate. However, hacker-Jane is still careful enough to state in the certificate that the issuer was Example-CA. Lastly, hacker-Jane remembers that she has to include a signature on the certificate. But what key does Jane use to sign (i.e. create an encrypted value of the certificate main contents) the certificate?????
So even if hacker-Jane signed the certificate with her own key, you see what's gonna happen next. The browser is gonna say: "ok, this certificate is issued by Example-CA, let's decrypt the signature with Example-CA's public key". After decryption, the browser notices that the certificate contents don't match at all. Hence, the browser gives a very clear warning to the user, and it says it doesn't trust the connection.
The client has a pre-seeded store of SSL certificate authorities' public keys. There must be a chain of trust from the certificate for the server up through intermediate authorities up to one of the so-called "root" certificates in order for the server to be trusted.
You can examine and/or alter the list of trusted authorities. Often you do this to add a certificate for a local authority that you know you trust - like the company you work for or the school you attend or what not.
The pre-seeded list can vary depending on which client you use. The big SSL certificate vendors insure that their root certs are in all the major browsers ($$$).
Monkey-in-the-middle attacks are "impossible" unless the attacker has the private key of a trusted root certificate. Since the corresponding certificates are widely deployed, the exposure of such a private key would have serious implications for the security of eCommerce generally. Because of that, those private keys are very, very closely guarded.
if you're more technically minded, this site is probably what you want: http://www.zytrax.com/tech/survival/ssl.html
warning: the rabbit hole goes deep :).

TLS/SSL handshake [duplicate]

What is the series of steps needed to securely verify a ssl certificate? My (very limited) understanding is that when you visit an https site, the server sends a certificate to the client (the browser) and the browser gets the certificate's issuer information from that certificate, then uses that to contact the issuerer, and somehow compares certificates for validity.
How exactly is this done?
What about the process makes it immune to man-in-the-middle attacks?
What prevents some random person from setting up their own verification service to use in man-in-the-middle attacks, so everything "looks" secure?
Here is a very simplified explanation:
Your web browser downloads the web server's certificate, which contains the public key of the web server. This certificate is signed with the private key of a trusted certificate authority.
Your web browser comes installed with the public keys of all of the major certificate authorities. It uses this public key to verify that the web server's certificate was indeed signed by the trusted certificate authority.
The certificate contains the domain name and/or ip address of the web server. Your web browser confirms with the certificate authority that the address listed in the certificate is the one to which it has an open connection.
Your web browser generates a shared symmetric key which will be used to encrypt the HTTP traffic on this connection; this is much more efficient than using public/private key encryption for everything. Your browser encrypts the symmetric key with the public key of the web server then sends it back, thus ensuring that only the web server can decrypt it, since only the web server has its private key.
Note that the certificate authority (CA) is essential to preventing man-in-the-middle attacks. However, even an unsigned certificate will prevent someone from passively listening in on your encrypted traffic, since they have no way to gain access to your shared symmetric key.
You said that
the browser gets the certificate's issuer information from that
certificate, then uses that to contact the issuerer, and somehow
compares certificates for validity.
The client doesn't have to check with the issuer because two things :
all browsers have a pre-installed list of all major CAs public keys
the certificate is signed, and that signature itself is enough proof that the certificate is valid because the client can make sure, on his own, and without contacting the issuer's server, that that certificate is authentic. That's the beauty of asymmetric encryption.
Notice that 2. can't be done without 1.
This is better explained in this big diagram I made some time ago
(skip to "what's a signature ?" at the bottom)
It's worth noting that in addition to purchasing a certificate (as mentioned above), you can also create your own for free; this is referred to as a "self-signed certificate". The difference between a self-signed certificate and one that's purchased is simple: the purchased one has been signed by a Certificate Authority that your browser already knows about. In other words, your browser can easily validate the authenticity of a purchased certificate.
Unfortunately this has led to a common misconception that self-signed certificates are inherently less secure than those sold by commercial CA's like GoDaddy and Verisign, and that you have to live with browser warnings/exceptions if you use them; this is incorrect.
If you securely distribute a self-signed certificate (or CA cert, as bobince suggested) and install it in the browsers that will use your site, it's just as secure as one that's purchased and is not vulnerable to man-in-the-middle attacks and cert forgery. Obviously this means that it's only feasible if only a few people need secure access to your site (e.g., internal apps, personal blogs, etc.).
I KNOW THE BELOW IS LONG, BUT IT IS DETAILED, YET SIMPLIFIED ENOUGH. READ CAREFULLY AND I GUARANTEE YOU'LL START FINDING THIS TOPIC IS NOT ALL THAT COMPLICATED.
First of all, anyone can create 2 keys. One to encrypt data, and another to decrypt data. The former can be a private key, and the latter a public key, AND VICERZA.
Second of all, in simplest terms, a Certificate Authority (CA) offers the service of creating a certificate for you. How? They use certain values (the CA's issuer name, your server's public key, company name, domain, etc.) and they use their SUPER DUPER ULTRA SECURE SECRET private key and encrypt this data. The result of this encrypted data is a SIGNATURE.
So now the CA gives you back a certificate. The certificate is basically a file containing the values previously mentioned (CA's issuer name, company name, domain, your server's public key, etc.), INCLUDING the signature (i.e. an encrypted version of the latter values).
Now, with all that being said, here is a REALLY IMPORTANT part to remember: your device/OS (Windows, Android, etc.) pretty much keeps a list of all major/trusted CA's and their PUBLIC KEYS (if you're thinking that these public keys are used to decrypt the signatures inside the certificates, YOU ARE CORRECT!).
Ok, if you read the above, this sequential example will be a breeze now:
Example-Company asks Example-CA to create for them a certificate.
Example-CA uses their super private key to sign this certificate and gives Example-Company the certificate.
Tomorrow, internet-user-Bob uses Chrome/Firefox/etc. to browse to https://example-company.com. Most, if not all, browsers nowadays will expect a certificate back from the server.
The browser gets the certificate from example-company.com. The certificate says it's been issued by Example-CA. It just so happens to be that Bob's OS already has Example-CA in its list of trusted CA's, so the browser gets Example-CA's public key. Remember: this is all happening in Bob's computer/mobile/etc., not over the wire.
So now the browser decrypts the signature in the certificate. FINALLY, the browser compares the decrypted values with the contents of the certificate itself. IF THE CONTENTS MATCH, THAT MEANS THE SIGNATURE IS VALID!
Why? Think about it, only this public key can decrypt the signature in such a way that the contents look like they did before the private key encrypted them.
How about man in the middle attacks?
This is one of the main reasons (if not the main reason) why the above standard was created.
Let's say hacker-Jane intercepts internet-user-Bob's request, and replies with her own certificate. However, hacker-Jane is still careful enough to state in the certificate that the issuer was Example-CA. Lastly, hacker-Jane remembers that she has to include a signature on the certificate. But what key does Jane use to sign (i.e. create an encrypted value of the certificate main contents) the certificate?????
So even if hacker-Jane signed the certificate with her own key, you see what's gonna happen next. The browser is gonna say: "ok, this certificate is issued by Example-CA, let's decrypt the signature with Example-CA's public key". After decryption, the browser notices that the certificate contents don't match at all. Hence, the browser gives a very clear warning to the user, and it says it doesn't trust the connection.
The client has a pre-seeded store of SSL certificate authorities' public keys. There must be a chain of trust from the certificate for the server up through intermediate authorities up to one of the so-called "root" certificates in order for the server to be trusted.
You can examine and/or alter the list of trusted authorities. Often you do this to add a certificate for a local authority that you know you trust - like the company you work for or the school you attend or what not.
The pre-seeded list can vary depending on which client you use. The big SSL certificate vendors insure that their root certs are in all the major browsers ($$$).
Monkey-in-the-middle attacks are "impossible" unless the attacker has the private key of a trusted root certificate. Since the corresponding certificates are widely deployed, the exposure of such a private key would have serious implications for the security of eCommerce generally. Because of that, those private keys are very, very closely guarded.
if you're more technically minded, this site is probably what you want: http://www.zytrax.com/tech/survival/ssl.html
warning: the rabbit hole goes deep :).

Certificates, Encryption, And Authentication

Mostly, my confusion seems to be eminating from my attempts to understand security within the context of WCF. In WCF, it looks like certificates can be used for the purpose of authentication, as well as encryption. Basically, I am trying to understand:
How can an X509 certificate be used as an authentication token? Aren't ssl certificates usually made to be publically available? Wouldn't this make it impossible for them to be used for authentication purposes? If not, are there some protocols which are commonly used for this purpose?
When encrypting messages with WCF, are certificates used which have been issued only to the client, only to the server, or to both? If certificates from the client and server are both used, I'm a little unclear as to why. This mostly stems from my understanding of https, in which case only a certificate issued to the server (and chained to some certificate issued by a root CA) would be necessary to establish an encrypted connection and authenticate the server.
I'm not entirely sure this is the correct forum. My questions stemmed from trying to understand WCF, but I guess I would like to understand the theory behind this in general. If it's a good idea, please suggest the correct forum for me. I'd be happy to try to get this question migrated, if necessary.
Thanks in advance!
Well this is pretty complex question. I will try to explain some parts but avoiding as much detail as possible (even after that it will be pretty long).
How does authentication with certificate work?
If a holder of the private key signs some data, other participants can use the public key of the signer to validate the signature. This mechanism can be used for authentication. Private and public keys are stored in certificate where private key is kept safe on the holder machine whereas certificate with public key can be publicly available.
How does it relate to HTTPS?
WCF offers transport and message security. The difference between them is described here. The transport security in case of HTTP is HTTPS where only server needs issued certificate and client must to trust this certificate. This certificate is used both for authenticating server to the client and for establishing secure channel (which uses symmetric encryption).
HTTPS also offers variant called Mutual HTTPS where client must have also issued certificate and client uses the certificate to authenticate to the server.
How does message security work and what is a purpose of two certificates in that scenario?
In case of message security each message is signed, encrypted and authenticated separately = all these security informations are part of the message. In case of SOAP this is described by many specifications but generally you are interested in security bindings and X.509 Token profile.
Security binding is part of WS-SecurityPolicy assertions and it is describes how the message is secured. We have three bindings:
Symmetric security binding - symmetric encryption
Asymmetric security binding - asymmetric encryption
Transport security binding - assertion that message must be send over HTTPS or other secured transport
X.509 Token profile specifies how to transport certificates (public keys) in messages and how to use them.
Now if you have symmetric security binding you need only server certificate because
When client wants to send message to the server it will first generate random key.
It will use this key to encrypt and sign request
It will use service certificate to encrypt derived key and pass it to the request as well.
When the server receives the message it will first use its private key to decrypt that key.
It will use decrypted key to decrypt the rest of the message.
It will also use the key to encrypt the response because client knows that key.
Client will use the same key generated for request to decrypt the response
This is symmetric encryption which is much more faster then asymmetric encryption but key derivation should not be available in WS-Security 1.0. It is available in WS-Security 1.1. HTTPS internally works in similar way but the key is the same for the whole connection lifetime.
If you have asymmetric security binding you need two certificates:
Initiator must have its own certificate used to sign requests and decrypt responses
Recipient must have its own certificate used to decrypt requests and sign responses
That means following algorithm
Initiator encrypts request with recipient's public key
Initiator signs request with its private key
Recipient uses initiator's public key to validate request signature
Recipient uses its private key to decrypt request
Recipient uses initiator's public key to encrypt response
Recipient uses its private key to sign response
Initiator uses recipient's public key to validate response signature
Initiator uses its private key to decrypt response
The order of signing and encrypting can be changed - there is another WS-SecurityPolicy assertion which says what should be done first.
These were basics. It can be much more complex because message security actually allow you as many certificates as you want - you can for example use endorsing token to sign primary signature with another certificate etc.
The certificate only has the public key of a public/private key pair. It does not have the private key -- this is separate from the certificate proper. When you connect to an HTTPS server, you can trust that the server is the owner of that certificate because the server must be holding the private key (and hopefully nobody else has it) because otherwise the SSL connection is not possible. If the server did not hold the private key that pairs with the public key of its certificate, it could not present you with a valid SSL connection.
You then can decide whether or not you trust that particular certificate based on one or more certificate authorities (CA) that have signed the chain of certificates. For example, there may be just one CA that has signed this certificate. You have a trusted CA root certificate locally, and so you know that it in fact was your trusted CA that signed that server certificate, because that signature would not be possible unless that CA held the private key to the CA certificate. Once again it is merely holding the private key, in this case that proves who signed the certificate. This is how you can trust the certificate.
When you present the optional client certificate on the SSL connection, the server can trust you because 1) it can see the CA (or CA's) that sign your client certificate, and 2) it can tell that you have your private key in your possession because otherwise the SSL connection would not be possible. So it also works in reverse for the server trusting the client.
You can tell that everybody is honest if you trust that the server and client are keeping their private keys private, and if you trust the source of your root certificates that sign the server and client certificates (or chains of certificates.)

How does a ROOT CA verify a signature?

Say when using https, browser makes a request to the server and server returns its certificate including public key and the CA signature.
At this point, browser will ask its CA to verify if the given public key really belongs to the server or not?
How is this verification done by the Root cert on the browser?
To give an example:
Say serverX obtained a certificate from CA "rootCA". Browser has a copy of rootCA locally stored. When the browser pings serverX and it replies with its public key+signature. Now the root CA will use its private key to decrypt the signature and make sure it is really serverX?
is it how it works?
Your server creates a key pair, consisting of a private and a public key. The server never gives out the private key, of course, but everyone may obtain a copy of the public key. The public key is embedded within a certificate container format (X.509). This container consists of meta information related to the wrapped key, e.g. the IP address or domain name of a server, the owner of that server, an e-mail contact address, when the key was created, how long it is valid, for which purposes it may be used for, and many other possible values.
The whole container is signed by a trusted certificate authority (= CA). The CA also has a private/public key pair. You give them your certificate, they verify that the information in the container are correct (e.g. is the contact information correct, does that certificate really belong to that server) and finally sign it with their private key. The public key of the CA needs to be installed on the user system. Most well known CA certificates are included already in the default installation of your favorite OS or browser.
When now a user connects to your server, your server uses the private key to sign some random data, packs that signed data together with its certificate (= public key + meta information) and sends everything to the client. What can the client do with that information?
First of all, it can use the public key within the certificate it just got sent to verify the signed data. Since only the owner of the private key is able to sign the data correctly in such a way that the public key can correctly verify the signature, it will know that whoever signed this piece of data, this person is also owning the private key to the received public key.
But what stops a hacker from intercepting the packet, replacing the signed data with data he signed himself using a different certificate and also replace the certificate with his own one? The answer is simply nothing.
That's why after the signed data has been verified (or before it is verified) the client verifies that the received certificate has a valid CA signature. Using the already installed public CA key, it verifies that the received public key has been signed by a known and hopefully trusted CA. A certificate that is not signed is not trusted by default. The user has to explicitly trust that certificate in his browser.
Finally it checks the information within the certificate itself. Does the IP address or domain name really match the IP address or domain name of the server the client is currently talking to? If not, something is fishy!
People may wonder: What stops a hacker from just creating his own key pair and just putting your domain name or IP address into his certificate and then have it signed by a CA? Easy answer: If he does that, no CA will sign his certificate. To get a CA signature, you must prove that you are really the owner of this IP address or domain name. The hacker is not the owner, thus he cannot prove that and thus he won't get a signature.
But what if the hacker registers his own domain, creates a certificate for that, and have that signed by a CA? This works, he will get it CA signed, it's his domain after all. However, he cannot use it for hacking your connection. If he uses this certificate, the browser will immediately see that the signed public key is for domain example.net, but it is currently talking to example.com, not the same domain, thus something is wrong again.
The server certificate is signed with the private key of the CA. The browser uses the public key of the CA to verify the signature. There is no direct communication between browser and CA.
The important point is that the browser ships with the public CA key. So the browser knows beforehand all CAs it can trust.
If you don't understand this, look up the basics of Asymmetric Cryptography and Digital Signatures.
Certs are based on using an asymmetric encryption like RSA. You have two keys, conventionally called the private and public keys. Something you encrypt with the private key can only be decrypted using the public key. (And, actually, vice versa.)
You can think of the cert as being like a passport or drivers license: it's a credential that says "this is who I am; you can trust it because it was given to me by someone (like Verisign) you trust." This is done with a "signature", which can be computed using the certificate authority's public key. If the data is what the CA got originally, you can verify the cert.
The cert contains identifying information about the owner of the cert. When you receive it, you use the combination of the key you know from your trusted authority to confirm that the certificate you received is valid, and that you can therefore infer you trust the person who issued the cert.
Your browser does not ask the CA to verify, instead it has a copy of the root certs locally stored, and it will use standard cryptographic procedure to verify that the cert really is valid.
This is why when you self sign a certificate your certificate is not valid, eventhough there technically is a CA to ask, you could off course copy the self signed CA to your computer and from then on it would trust your self signed certifications.
CACert.org has this same issue, it has valid certificates but since browsers don't have its root certs in their list their certificates generate warnings until the users download the root CA's and add them to their browser.

How does SSL really work?

How does SSL work?
Where is the certificate installed on the client (or browser?) and the server (or web server?)?
How does the trust/encryption/authentication process start when you enter the URL into the browser and get the page from the server?
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
Note: I wrote my original answer very hastily, but since then, this has turned into a fairly popular question/answer, so I have expanded it a bit and made it more precise.
TLS Capabilities
"SSL" is the name that is most often used to refer to this protocol, but SSL specifically refers to the proprietary protocol designed by Netscape in the mid 90's. "TLS" is an IETF standard that is based on SSL, so I will use TLS in my answer. These days, the odds are that nearly all of your secure connections on the web are really using TLS, not SSL.
TLS has several capabilities:
Encrypt your application layer data. (In your case, the application layer protocol is HTTP.)
Authenticate the server to the client.
Authenticate the client to the server.
#1 and #2 are very common. #3 is less common. You seem to be focusing on #2, so I'll explain that part.
Authentication
A server authenticates itself to a client using a certificate. A certificate is a blob of data[1] that contains information about a website:
Domain name
Public key
The company that owns it
When it was issued
When it expires
Who issued it
Etc.
You can achieve confidentiality (#1 above) by using the public key included in the certificate to encrypt messages that can only be decrypted by the corresponding private key, which should be stored safely on that server.[2] Let's call this key pair KP1, so that we won't get confused later on. You can also verify that the domain name on the certificate matches the site you're visiting (#2 above).
But what if an adversary could modify packets sent to and from the server, and what if that adversary modified the certificate you were presented with and inserted their own public key or changed any other important details? If that happened, the adversary could intercept and modify any messages that you thought were securely encrypted.
To prevent this very attack, the certificate is cryptographically signed by somebody else's private key in such a way that the signature can be verified by anybody who has the corresponding public key. Let's call this key pair KP2, to make it clear that these are not the same keys that the server is using.
Certificate Authorities
So who created KP2? Who signed the certificate?
Oversimplifying a bit, a certificate authority creates KP2, and they sell the service of using their private key to sign certificates for other organizations. For example, I create a certificate and I pay a company like Verisign to sign it with their private key.[3] Since nobody but Verisign has access to this private key, none of us can forge this signature.
And how would I personally get ahold of the public key in KP2 in order to verify that signature?
Well we've already seen that a certificate can hold a public key — and computer scientists love recursion — so why not put the KP2 public key into a certificate and distribute it that way? This sounds a little crazy at first, but in fact that's exactly how it works. Continuing with the Verisign example, Verisign produces a certificate that includes information about who they are, what types of things they are allowed to sign (other certificates), and their public key.
Now if I have a copy of that Verisign certificate, I can use that to validate the signature on the server certificate for the website I want to visit. Easy, right?!
Well, not so fast. I had to get the Verisign certificate from somewhere. What if somebody spoofs the Verisign certificate and puts their own public key in there? Then they can forge the signature on the server's certificate, and we're right back where we started: a man-in-the-middle attack.
Certificate Chains
Continuing to think recursively, we could of course introduce a third certificate and a third key pair (KP3) and use that to sign the Verisign certifcate. We call this a certificate chain: each certificate in the chain is used to verify the next certificate. Hopefully you can already see that this recursive approach is just turtles/certificates all the way down. Where does it stop?
Since we can't create an infinite number of certificates, the certificate chain obviously has to stop somewhere, and that's done by including a certificate in the chain that is self-signed.
I'll pause for a moment while you pick up the pieces of brain matter from your head exploding. Self-signed?!
Yes, at the end of the certificate chain (a.k.a. the "root"), there will be a certificate that uses it's own keypair to sign itself. This eliminates the infinite recursion problem, but it doesn't fix the authentication problem. Anybody can create a self-signed certificate that says anything on it, just like I can create a fake Princeton diploma that says I triple majored in politics, theoretical physics, and applied butt-kicking and then sign my own name at the bottom.
The [somewhat unexciting] solution to this problem is just to pick some set of self-signed certificates that you explicitly trust. For example, I might say, "I trust this Verisign self-signed certificate."
With that explicit trust in place, now I can validate the entire certificate chain. No matter how many certificates there are in the chain, I can validate each signature all the way down to the root. When I get to the root, I can check whether that root certificate is one that I explicitly trust. If so, then I can trust the entire chain.
Conferred Trust
Authentication in TLS uses a system of conferred trust. If I want to hire an auto mechanic, I may not trust any random mechanic that I find. But maybe my friend vouches for a particular mechanic. Since I trust my friend, then I can trust that mechanic.
When you buy a computer or download a browser, it comes with a few hundred root certificates that it explicitly trusts.[4] The companies that own and operate those certificates can confer that trust to other organizations by signing their certificates.
This is far from a perfect system. Some times a CA may issue a certificate erroneously. In those cases, the certificate may need to be revoked. Revocation is tricky since the issued certificate will always be cryptographically correct; an out-of-band protocol is necessary to find out which previously valid certificates have been revoked. In practice, some of these protocols aren't very secure, and many browsers don't check them anyway.
Sometimes an entire CA is compromised. For example, if you were to break into Verisign and steal their root signing key, then you could spoof any certificate in the world. Notice that this doesn't just affect Verisign customers: even if my certificate is signed by Thawte (a competitor to Verisign), that doesn't matter. My certificate can still be forged using the compromised signing key from Verisign.
This isn't just theoretical. It has happened in the wild. DigiNotar was famously hacked and subsequently went bankrupt. Comodo was also hacked, but inexplicably they remain in business to this day.
Even when CAs aren't directly compromised, there are other threats in this system. For example, a government use legal coercion to compel a CA to sign a forged certificate. Your employer may install their own CA certificate on your employee computer. In these various cases, traffic that you expect to be "secure" is actually completely visible/modifiable to the organization that controls that certificate.
Some replacements have been suggested, including Convergence, TACK, and DANE.
Endnotes
[1] TLS certificate data is formatted according to the X.509 standard. X.509 is based on ASN.1 ("Abstract Syntax Notation #1"), which means that it is not a binary data format. Therefore, X.509 must be encoded to a binary format. DER and PEM are the two most common encodings that I know of.
[2] In practice, the protocol actually switches over to a symmetric cipher, but that's a detail that's not relevant to your question.
[3] Presumable, the CA actually validates who you are before signing your certificate. If they didn't do that, then I could just create a certificate for google.com and ask a CA to sign it. With that certificiate, I could man-in-the-middle any "secure" connection to google.com. Therefore, the validation step is a very important factor in the operation of a CA. Unfortunately, it's not very clear how rigorous this validation process is at the hundreds of CAs around the world.
[4] See Mozilla's list of trusted CAs.
HTTPS is combination of HTTP and SSL(Secure Socket Layer) to provide encrypted communication between client (browser) and web server (application is hosted here).
Why is it needed?
HTTPS encrypts data that is transmitted from browser to server over the network. So, no one can sniff the data during transmission.
How HTTPS connection is established between browser and web server?
Browser tries to connect to the https://payment.com.
payment.com server sends a certificate to the browser. This certificate includes payment.com server's public key, and some evidence that this public key actually belongs to payment.com.
Browser verifies the certificate to confirm that it has the proper public key for payment.com.
Browser chooses a random new symmetric key K to use for its connection to payment.com server. It encrypts K under payment.com public key.
payment.com decrypts K using its private key. Now both browser and the payment server know K, but no one else does.
Anytime browser wants to send something to payment.com, it encrypts it under K; the payment.com server decrypts it upon receipt. Anytime the payment.com server wants to send something to your browser, it encrypts it under K.
This flow can be represented by the following diagram:
I have written a small blog post which discusses the process briefly. Please feel free to take a look.
SSL Handshake
A small snippet from the same is as follows:
"Client makes a request to the server over HTTPS. Server sends a copy of its SSL certificate + public key. After verifying the identity of the server with its local trusted CA store, client generates a secret session key, encrypts it using the server's public key and sends it. Server decrypts the secret session key using its private key and sends an acknowledgment to the client. Secure channel established."
Mehaase has explained it in details already. I will add my 2 cents to this series. I have many blogposts revolving around SSL handshake and certificates. While most of this revolves around IIS web server, the post is still relevant to SSL/TLS handshake in general. Here are few for your reference:
SSL Handshake and IIS
Client certificate Authentication in SSL Handshake
Do not treat CERTIFICATES & SSL as one topic. Treat them as 2 different topics and then try to see who they work in conjunction. This will help you answer the question.
Establishing trust between communicating parties via Certificate Store
SSL/TLS communication works solely on the basis of trust. Every computer (client/server) on the internet has a list of Root CA's and Intermediate CA's that it maintains. These are periodically updated. During SSL handshake this is used as a reference to establish trust. For exampe, during SSL handshake, when the client provides a certificate to the server. The server will try to cehck whether the CA who issued the cert is present in its list of CA's . When it cannot do this, it declares that it was unable to do the certificate chain verification. (This is a part of the answer. It also looks at AIA for this.) The client also does a similar verification for the server certificate which it receives in Server Hello.
On Windows, you can see the certificate stores for client & Server via PowerShell. Execute the below from a PowerShell console.
PS Cert:> ls Location : CurrentUser StoreNames : {TrustedPublisher, ClientAuthIssuer, Root, UserDS...}
Location : LocalMachine StoreNames : {TrustedPublisher,
ClientAuthIssuer, Remote Desktop, Root...}
Browsers like Firefox and Opera don't rely on underlying OS for certificate management. They maintain their own separate certificate stores.
The SSL handshake uses both Symmetric & Public Key Cryptography. Server Authentication happens by default. Client Authentication is optional and depends if the Server endpoint is configured to authenticate the client or not. Refer my blog post as I have explained this in detail.
Finally for this question
How does the HTTPS protocol recognize the certificate? Why can't HTTP work with certificates when it is the certificates which do all the trust/encryption/authentication work?
Certificates is simply a file whose format is defined by X.509 standard. It is a electronic document which proves the identity of a communicating party.
HTTPS = HTTP + SSL is a protocol which defines the guidelines as to how 2 parties should communicate with each other.
MORE INFORMATION
In order to understand certificates you will have to understand what certificates are and also read about Certificate Management. These is important.
Once this is understood, then proceed with TLS/SSL handshake. You may refer the RFC's for this. But they are skeleton which define the guidelines. There are several blogposts including mine which explain this in detail.
If the above activity is done, then you will have a fair understanding of Certificates and SSL.