I have recently started my job as web application backend developer. I am bit stuck in understanding lifecycle of a Http request.
What I understood is
Every Http request first contacts a DNS server which resolves the request URL domain to a IP address.
After fetching the Webserver IP address request is forwarded to it(via PUT request). A webserver like apache handles this request and forwards this to application which has to handle this.
After this I am lost with
How response is sent by the application to the user who requested it and will Apcache involved in this?
Can I see the entire flow in my browser with some debugging tools?
Can someone refer some links to understand this in depth?
I think you are a bit wrong on your understanding of it.
If you go to www.google.com (not using any forms, just wanting the site), this is what happens:
First the browser needs to translate www.google.com to an IP address if it does not already know it. If it knows it, nothing happens at this point. If it does not know it, it contacts a DNS server to resolve the name.
Then browser will open a TCP connection to the IP address of www.google.com and send a HTTP GET request over. In this example it will be
GET / HTTP/1.1 Host: www.google.com
The server software will get this HTTP request. It will somehow generate a HTTP response and send that back trough the TCP connection. How the server does this is server software dependent. You can for example plug in application code in Apache, or just make Apache return a file from the filesystem. PHP is an application called by some software, which then generates the response sent to the browser. When the response is sent, in HTTP version 1.0 the connection is closed. HTTP 1.1 can have persistent connections though.
When the browser gets the response, it typically renders it on screen. The HTTP request is now done. A click on "search" will send a new request to the server.
GET, PUT, POST, DELETE and others are HTTP request methods. They have special meaning which you can see in the RFC.
Cookies are commonly used to identify the same user across multiple HTTP requests, called sessions. Therefore these cookies are called session cookies
You can debug the communication by using a network sniffer tool, for example Wireshark. Firefox has a third party plugin called Tamper Data that can change the request before they are sent to the server.
The HTTP RFC is a good source of how it all works.
while server receives the request from browser , the browser will be binded to some port on the host , ip address and port number of browser will be attached with the request that sends to server. server sends the responce to the ip address and port number
This is among the popular interview questions asked in various product based companies.
HTTP Is a request-response protocol. For example, a user agent initiates a request to a server, typically by opening a TCP/IP connection to a particular port on a host (port 80 by default). The request itself comprises:
a request line,
a set of request headers, and
an entity.
An HTTP server listening on that port waits for the client to send a request message. Upon receiving the request, the server sends a response that comprises:
a status line,
a set of response headers, and
an entity.
The entity in the request or response can be thought of simply as the payload, which may be binary data. The other items are readable ASCII characters. When the response has been completed, either the browser or the server may terminate the TCP/IP connection, or the browser can send another request.
I found this resource very helpful in understanding the steps taken during the HTTP lifecycle : quite interesting actually though, wasn't aware of all the intermediate steps especially w/the cache checking when determining the IP Address of a URL.
https://medium.com/#maneesha.wijesinghe1/what-happens-when-you-type-an-url-in-the-browser-and-press-enter-bb0aa2449c1a
Related
Consider the case where there exists a simple client-server web application where the client sends requests to the server. If the server sends a request to an external API, what IP and header values will be detected by the API? The ones of the client that first send the request to the server, or the ones of the server?
Only the actual IP that makes the request will be visible to the API. So if there is a chain of requests, only the last request IP will be accessible to the receiving party.
Let's assume that the client wants to authenticate himself to a HTTP proxy. The proxy is configured with kerberos, and has clearly the service name HTTP/proxy.foo.bar set in it's configs. How does the client know which service name to request the ticket to ? Does it request the ticket to the domain name he's making request to (in this case it is proxy.foo.bar indeed), or does it receive the name in the authentication sequence, in a 407 reply in this case (which doest contain the negotiate challenge, but I just don't know if there's a way to look into it) ?
I'm trying to debug the kerberos errors on a proxy which suddenly stopped authenticating some clients. The thing is, that looking in the Wireshark, I see that the client is requesting a ticket not for a service name configured on a proxy (same name he's instructed to use), HTTP/proxy.foo.bar, but for a name that the proxy IP resolves to, HTTP/host.foo.bar (well, at least it's the name that the proxy resolves to, may be though the client gets it some other way), and TGS just cannot find one, thus an error happens.
So you’ve got two questions in here (you didn't ask how to actually solve the problem, to do that more details would be needed - see comments).
You asked "The proxy is configured with kerberos, and has clearly the service name HTTP/proxy.foo.bar set in it's configs. How does the client know which service name to request the ticket to?"
A. It works pretty much like this. The client types in a URL in the web browser or clicks on a hyperlink. It looks up the IP host in DNS domain which matches the host name in the URL. Then it goes to that IP host, looking for the service defined in the URL, in this case it is the HTTP service. If it receives an HTTP 401 Negotiate challenge (it's 401, not 407) from the web server, due to it being Kerberos-protected, it goes to its KDC and requests a Kerberos service ticket for HTTP/proxy.foo.bar, zips back to proxy.foo.bar and presents the ticket to that host for the HTTP service running on it. The host validates this ticket and if all is well and the client web browser renders the HTML. You've seen the Kerberos ticket ticket when you ran klist on the client. I don't have any web references for you, this is all off the top of my head.
You also asked “Does it request the ticket to the domain name he's making request to (in this case it is proxy.foo.bar indeed), or does it receive the name in the authentication sequence, in a 407 reply in this case (which doest contain the negotiate challenge, but I just don't know if there's a way to look into it) ?”
A. Your question was a bit hard to follow but if I am understanding you correctly, the answer is the web client requests a ticket as a result of the HTTP 401 Negotiate authentication challenge from the web server (see above).
There’s many diagrams sequencing this process on the web, including here: http://www.zeroshell.org/kerberos/Kerberos-operation/
I have configured JBoss with HTTPS connector only. Now I have problem, that in case I'm using HTTP request to HTTPS connector, it returns page with one ASCII character, instead of some error page, for example 505 or something else what can user inform, about invalid request.
There is no used Apache nor any other web server, where some rules for URL rewriting could apply. Also often used change in web.xml with <transport-guarantee>CONFIDENTIAL</transport-guarantee> tag do not solve this problem, as there must be HTTP request, which is then redirected to HTTPS based on "redirectPort" param in connector configuration and in this case there is not plan to use any other port and it is not possible to combine HTTP and HTTPS listener together on one port.
Is there some way how to configure SSL listener, that way, that it refuse HTTP request, or automatically change to HTTPS?
JBoss 5.1.0
Your browser uses HTTP to talk to server, but on the server side is the security layer (around the HTTP) and it wants to do a SSL/TLS handshake. So the communication fails because client doesn't know about the security on the other side.
The client (browser) receives error message (binary data) from server. The client doesn't know what to do with them, so it presents them to a user as a web page content.
RFC-5246 states:
Error handling in the TLS Handshake protocol is very simple. When an
error is detected, the detecting party sends a message to the other
party. Upon transmission or receipt of a fatal alert message, both
parties immediately close the connection.
Undertow - a new web server in WildFly - is able to do HTTP Upgrade. But I'm not sure if the upgrade to SSL/TLS is already supported. Nevertheless, the problem with this scenario could be on the browser side.
I'm trying to build a complete web caching proxy using Boost Asio and LibCURL, I've already built the server and everything works fine. It receives http requests (GET, POST, upload using POST ...) correctly and also it sends back the responses to the browser for e.g correctly.
Now, I want to extend it, so it can handles https requests. I read about it in LibCURL web site http://curl.haxx.se/libcurl/c/libcurl-tutorial.html (proxy section), I understood how it works and I have a clear idea how it should be done. But I didn't find a good documentation about how proxies handle https requests. and:
what are the possible messages (information, format, length ...) exchanged by the source application and the proxy ?
things to consider.
...
Thanks in advance :-) .
You will receive the CONNECT command in plain text, and respond to it ditto, then the communications after that will be encrypted. If your proxy is to be an SSL endpoint, which is highly problematic given that HTTPS requires a certificate that matches the target host-address, you will then need to enter SSL mode on both connections. More probably you should just start copying bytes in both directions without attempting to process the contents.
I am creating a secure web based API that uses HTTPS; however, if I allow the users to configure it (include sending password) using a query string will this also be secure or should I force it to be done via a POST?
Yes, it is. But using GET for sensitive data is a bad idea for several reasons:
Mostly HTTP referrer leakage (an external image in the target page might leak the password[1])
Password will be stored in server logs (which is obviously bad)
History caches in browsers
Therefore, even though Querystring is secured it's not recommended to transfer sensitive data over querystring.
[1] Although I need to note that RFC states that browser should not send referrers from HTTPS to HTTP. But that doesn't mean a bad 3rd party browser toolbar or an external image/flash from an HTTPS site won't leak it.
From a "sniff the network packet" point of view a GET request is safe, as the browser will first establish the secure connection and then send the request containing the GET parameters. But GET url's will be stored in the users browser history / autocomplete, which is not a good place to store e.g. password data in. Of course this only applies if you take the broader "Webservice" definition that might access the service from a browser, if you access it only from your custom application this should not be a problem.
So using post at least for password dialogs should be preferred. Also as pointed out in the link littlegeek posted a GET URL is more likely to be written to your server logs.
Yes, your query strings will be encrypted.
The reason behind is that query strings are part of the HTTP protocol which is an application layer protocol, while the security (SSL/TLS) part comes from the transport layer. The SSL connection is established first and then the query parameters (which belong to the HTTP protocol) are sent to the server.
When establishing an SSL connection, your client will perform the following steps in order. Suppose you're trying to log in to a site named example.com and want to send your credentials using query parameters. Your complete URL may look like the following:
https://example.com/login?username=alice&password=12345)
Your client (e.g., browser/mobile app) will first resolve your domain name example.com to an IP address (124.21.12.31) using a DNS request. When querying that information, only domain specific information is used, i.e., only example.com will be used.
Now, your client will try to connect to the server with the IP address 124.21.12.31 and will attempt to connect to port 443 (SSL service port not the default HTTP port 80).
Now, the server at example.com will send its certificates to your client.
Your client will verify the certificates and start exchanging a shared secret key for your session.
After successfully establishing a secure connection, only then will your query parameters be sent via the secure connection.
Therefore, you won't expose sensitive data. However, sending your credentials over an HTTPS session using this method is not the best way. You should go for a different approach.
Yes. The entire text of an HTTPS session is secured by SSL. That includes the query and the headers. In that respect, a POST and a GET would be exactly the same.
As to the security of your method, there's no real way to say without proper inspection.
SSL first connects to the host, so the host name and port number are transferred as clear text. When the host responds and the challenge succeeds, the client will encrypt the HTTP request with the actual URL (i.e. anything after the third slash) and and send it to the server.
There are several ways to break this security.
It is possible to configure a proxy to act as a "man in the middle". Basically, the browser sends the request to connect to the real server to the proxy. If the proxy is configured this way, it will connect via SSL to the real server but the browser will still talk to the proxy. So if an attacker can gain access of the proxy, he can see all the data that flows through it in clear text.
Your requests will also be visible in the browser history. Users might be tempted to bookmark the site. Some users have bookmark sync tools installed, so the password could end up on deli.ci.us or some other place.
Lastly, someone might have hacked your computer and installed a keyboard logger or a screen scraper (and a lot of Trojan Horse type viruses do). Since the password is visible directly on the screen (as opposed to "*" in a password dialog), this is another security hole.
Conclusion: When it comes to security, always rely on the beaten path. There is just too much that you don't know, won't think of and which will break your neck.
Yes, as long as no one is looking over your shoulder at the monitor.
I don't agree with the statement about [...] HTTP referrer leakage (an external image in the target page might leak the password) in Slough's response.
The HTTP 1.1 RFC explicitly states:
Clients SHOULD NOT include a Referer
header field in a (non-secure) HTTP
request if the referring page was
transferred with a secure protocol.
Anyway, server logs and browser history are more than sufficient reasons not to put sensitive data in the query string.
Yes, from the moment on you establish a HTTPS connection everyting is secure. The query string (GET) as the POST is sent over SSL.
You can send password as MD5 hash param with some salt added. Compare it on the server side for auth.