How does the Apache 2 server process an HTTP GET request?

I was working on an IoT project in which I want to update a database remotely using an ESP8266-01 module. I have a PHP file that updates the database, and I am trying to trigger it with a GET request. Unfortunately, it wasn't working: the request showed up in the server's access.log, but the database wasn't updated. I wanted to debug this, hence the question.
The entry in the access.log is as follows:
192.168.43.150 - - [18/Mar/2017:20:23:40 +0000] "GET collectdata.php?status=1 HTTP/1.1\r\nHost: 192.168.43.92\r\n\r\n" 400 0 "-" "-"

This looks wrong: GET collectdata.php...
That needs to be a full path, e.g. GET /collectdata.php or GET /scripts/collectdata.php, or similar.
The 400 response code you're seeing in the log means "Bad request", and the lack of a leading slash (and the rest of the path, if needed) is what Apache is complaining about.
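As a rough illustration of what a well-formed request looks like on the wire, here is a minimal Python sketch that builds the request bytes and makes sure the path starts with a slash. The function name build_get_request is made up for this example, and it assumes the script sits at the document root; adjust the path to wherever collectdata.php really lives on your server.

```python
def build_get_request(host, path, query=""):
    """Build a minimal HTTP/1.1 GET request as bytes (illustrative helper)."""
    # The request target has to be an absolute path, so add the leading
    # slash if the caller forgot it.
    if not path.startswith("/"):
        path = "/" + path
    target = f"{path}?{query}" if query else path
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the header section
        "",
    ]
    # Each line ends with a real CR LF pair.
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("192.168.43.92", "collectdata.php", "status=1")
```

The same bytes are what the ESP8266 would need to send after opening the TCP connection.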

Can I use Karate to test URLs with multiple subsequent slashes? [duplicate]

I'm trying to use Karate to check the behavior of our server when paths contain multiple slashes, so I have a test like
Scenario: Multiple slashes
* url 'http://localhost'
Given path '///some//path///'
When method get
Then status 404
When running this, the Karate output/log contains all the slashes:
12:04:52.307 [pool-1-thread-1] DEBUG com.intuit.karate - request:
1 > GET http://localhost///some//path///
1 > Host: localhost
But according to the server log (simply using Apache for this demo), the actual request just has single slashes:
127.0.0.1 - - [23/Aug/2021:12:04:52 +0200] "GET /some/path/ HTTP/1.1" 404 488 "-" "Apache-HttpClient/4.5.13 (Java/11.0.4)"
So I have several questions:
Is that slash normalization the expected (and documented?) behavior?
Can I switch it off somewhere?
If the slashes are collapsed, shouldn't the log contain the actual (normalized) request, so that I see what's really sent to the server?
You can see if this thread answers your question, combined with the path keyword (scroll to the end): https://github.com/intuit/karate/issues/1561
If not, you can assume this is not directly supported. Personally I don't think such use-cases are worth automating with Karate.
Work-arounds are to make your own HTTP call by integrating any Java lib (via Karate interop) or using cURL (hack): https://stackoverflow.com/a/64352676/143475
You are welcome to contribute code to Karate.
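To give an idea of the "make your own HTTP call" route, here is a minimal sketch in Python rather than Java, so it is not something Karate provides. It writes the request line over a plain socket, so the path goes out exactly as typed with no client-side normalization. The helper name raw_get is hypothetical, and the sketch only reads back the status line.

```python
import socket


def raw_get(host, port, raw_path, timeout=5.0):
    """Send a GET whose path is copied verbatim into the request line.

    Returns the first line of the response (the status line).
    """
    request = (
        f"GET {raw_path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        response = b""
        # Read only until the end of the status line.
        while b"\r\n" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling raw_get("localhost", 80, "///some//path///") against the demo server should then show the unmodified path in the access log, assuming the server itself logs the request line as received.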

ADLS Gen 2 Storage API - Refusing Http Verbs

I'm having a problem with some endpoints within the ADLS Gen 2 API Path operations.
I can create, list, get properties of, and delete file systems just fine.
However, after adding a directory to a file system, certain verbs are failing - HEAD, GET, and DELETE.
For example, I have created a filesystem named c79b0781, with a directory path of abc/def
Call failed with status code 400 (The HTTP verb specified is invalid - it is not recognized by the server.): DELETE https://myadls.dfs.core.windows.net/c79b0781/abc?recursive=true&timeout=30
For headers, I have:
x-ms-version: 2018-11-09
I can delete the filesystem from the Azure Storage Explorer, but the API is refusing my query.
The List action is also failing with a similar error
Call failed with status code 400 (The HTTP verb specified is invalid - it is not recognized by the server.): GET https://myadls.dfs.core.windows.net/c79b0781?resource=filesystem&recursive=false&timeout=30
With headers:
x-ms-version: 2018-11-09
And finally, my Get Properties is also failing
Call failed with status code 400 (The HTTP verb specified is invalid - it is not recognized by the server.): HEAD https://myadls.dfs.core.windows.net/c79b0781?resource=filesystem&timeout=30
It seems to only happen when I add directories to the file system.
A bit more in depth:
This Test works
PUT https://myadls.dfs.core.windows.net/c79b0781?resource=filesystem
GET https://myadls.dfs.core.windows.net/c79b0781?recursive=false&resource=filesystem
DELETE https://myadls.dfs.core.windows.net/c79b0781?resource=filesystem
My second Test with directory creation
PUT https://myadls.dfs.core.windows.net/c79b0781?resource=filesystem
PUT https://myadls.dfs.core.windows.net/c79b0781/abc/123?resource=directory
After this point, the calls begin rejecting HTTP verbs
GET https://myadls.dfs.core.windows.net/c79b0781?recursive=false&resource=filesystem
Examining my directory create request closer, it looks like this:
PUT https://myadls.dfs.core.windows.net/c79b0781/abc/123?resource=directory
With Headers:
Authorization: [omitted]
Content-Length: 0
And I can see the folders in Storage explorer, I just cannot act on them after this point.
Test Case 2
I have started down a path wondering if it is permissions. So, I created a new File System through the Azure Storage Explorer with abc/def folder structure within.
Test 1 (passing)
Get List for directory "abc"
Get List for directory "abc/def"
Test 2 (failing)
Create Directory "uvw/xyz"
Get List for directory "abc" Fails here
Get List for directory "abc/def"
Get List for directory "uvw/xyz"
Once I create a directory through the api, it is as if the entire filesystem begins rejecting all HTTP requests.
This bug ended up leading me down a rabbit hole into my Flurl implementation that I am using for performing rest requests.
My Put method had no body but was calling PutJsonAsync, whereas according to the spec the service expects the content type to be application/octet-stream with a content length of 0.
I replaced the call to PutJsonAsync with PutAsync and everything magically started working.
So there seems to be some bug within Flurl itself that was triggered by my misuse of it in my wrapper code.
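For comparison, this is roughly what the body-less directory-create request looks like when built with Python's standard library instead of Flurl. It is only a sketch: the helper name is made up, the Authorization value is a placeholder, and the request is constructed but not sent. The point is that no JSON body is attached and Content-Length is 0, matching the headers shown in the question.

```python
from urllib.request import Request


def build_create_directory_request(account_url, filesystem, directory, auth_header):
    """Build (but do not send) an empty PUT that creates a directory."""
    url = f"{account_url}/{filesystem}/{directory}?resource=directory"
    return Request(
        url,
        data=None,     # no body at all, in particular no JSON payload
        method="PUT",
        headers={
            "Authorization": auth_header,  # placeholder, omitted in the question
            "Content-Length": "0",
        },
    )


req = build_create_directory_request(
    "https://myadls.dfs.core.windows.net", "c79b0781", "abc/123", "[omitted]"
)
```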

apache seems to ignore "Internal Server Error" returned by CGI script

I'm writing a Python CGI script and trying to test the behaviour of the system when I need to return Status: 500 Internal Server Error.
My script is something like that:
#!/usr/bin/env python3
print("Content-type: text/html")
print("Status: 500 Internal Server Error")
print()
When I run this script there is a report in apache access log with code 500, but it's not reported in the error log. I also don't get a "500 page" in the browser.
If an internal error is caused by some other means (e.g., a script that is not executable, or contains bad HTTP header) I do get the "normal" behaviour of internal server error.
It seems like apache is ignoring, somehow, the status returned from (my) CGI scripts. I've searched for an answer but couldn't find anything.
Just for clarity, CGI is working fine on this server in any other aspect.
Any thoughts? Am I missing something?
Thanks,
Amit
Answering my own question: it seems that I was barking up the wrong tree. Based on some clues and more empirical results, it seems that when Apache passes a request to an external script (e.g. a CGI script, PHP, etc.), it expects that script to handle any error itself. It is the script's responsibility to return a document that includes the error code and an error message. The script is also responsible for logging the error (it's usually enough to print it to standard error, where Apache picks it up and writes it to its error log).
So, for example, if my CGI script needs to report an "Internal Server Error", it is not enough to return just the header (as in my question); it should create and return the whole error message, in HTML format. In addition, it should print an error message to standard error.
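A minimal sketch of what that could look like, based on the script from my question. The helper names, the "myscript" log prefix and the error text are just illustrative:

```python
#!/usr/bin/env python3
import html
import sys


def build_error_response(code, reason, detail):
    """Return a complete CGI response: headers, blank line, HTML body."""
    headers = (
        f"Status: {code} {reason}\n"
        "Content-Type: text/html; charset=utf-8\n"
    )
    body = (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{code} {html.escape(reason)}</title></head>\n"
        f"<body><h1>{html.escape(reason)}</h1>"
        f"<p>{html.escape(detail)}</p></body></html>\n"
    )
    # The empty line separates the CGI headers from the document.
    return headers + "\n" + body


def report_error(code, reason, detail):
    # Text written to standard error normally ends up in Apache's error log.
    print(f"myscript: {detail}", file=sys.stderr)
    sys.stdout.write(build_error_response(code, reason, detail))


if __name__ == "__main__":
    report_error(500, "Internal Server Error", "database connection failed")
```

With this, the browser gets an actual error page and the error log gets a line describing the failure.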
I haven't found an official source for that, but perhaps I somehow overlooked it.

Apache custom dynamic error response

I've seen hundreds of pages explaining how to create custom error pages in Apache 2 server. My question is different. I have a web application running in Apache (it is an ISAPI DLL, but it could also be a CGI executable). My application can handle internal server errors and generate a detailed error message (for instance, including a full stack trace), which it includes in the response together with error code 500. AFAIK, Apache only lets me use redirection in order to display custom error messages: http://httpd.apache.org/docs/2.2/custom-error.html
The HTTP spec (RFC 2616, section 10) not only allows but also recommends that a detailed error message be included in the body of the response for 5xx (server error) status codes.
Link: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5
It seems that Apache won't let my custom error message go to the browser and always replaces it with its own internal error message. I believe that this is not the correct behavior, based on RFC 2616.
So my question is: Is there any setting in Apache server that will let my custom message go to the browser? Or, is there anything that can be done in my application that will instruct Apache to send my custom error message (something like some specific header field in the response)?
More on the subject:
When my ISAPI application returns error code 500, with other error information in the response body, Apache replaces it with its standard "500 Internal Server Error" message/HTML content, and inside the Error.log file I can see the "useless" "Premature end of script headers" message. I'm quite sure that my headers are fine, including the Content-Type field.
If I replace the 500 error code with any other server error code (e.g. 501) it works flawlessly and my response goes to the browser as is. The same header is sent to the Apache server, only the error code is different (501, instead of 500). With this test result in mind, one of these two must be true:
1- Apache requires some specific header field when status code is 500
2- Apache won't let custom error messages with status code 500 go to the browser.
I don't see any other alternative.
I think you're conflating two questions. You can generate a 500 response with a CGI script and include your custom body. Or you can override any 500 with any resource you want.
If you're failing to do the former, it's likely because of some subtle thing in the ISAPI interface between Apache and your module. Desk-checking the code says you should be able to either set the pseudo-header Status: 500, or basically return any ISAPI error, and end up with a 500 and your custom body.
Apache has two notions of a status code -- the one in the status line (r->status) and an error code returned separately from the module that handles the request (return HTTP_INTERNAL_SERVER_ERROR, return r->status).
The custom error message gets lost when the former is used as the latter. All of that happens in ./modules/arch/win32/mod_isapi.c in Apache. Whatever is going on, it is unique to ISAPI.

What HTTP status code is most search-engine-friendly during a planned outage?

If you have to take a site down for some type of unavoidable maintenance task (and it's not a big enough site that you have a backup server), what HTTP status code should you have your server return to minimize the possibility that search engines will think the site is gone?
I found this list of status codes from W3C, of which the following seem applicable:
503 Service Unavailable
500 Internal Server Error
408 Request Timeout
404 Not Found
I think 503 is the most appropriate, but I don't know what search engines might prefer.
From the horse's mouth:
If my site is down for maintenance, how can I tell Googlebot to come back later rather than to index the "down for maintenance" page?
You should configure your server to return a status of 503 (network unavailable) rather than 200 (successful). That lets Googlebot know to try the pages again later.
Don't send a 404 -- they may remove you from their index.
I'd probably send a 503 and an appropriate Retry-After, although I don't know if anything actually uses the header.
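If you have no server config handy, a throwaway maintenance responder along these lines would do it. This is a minimal sketch using Python's standard library; the one-hour Retry-After value, the message text and the default port are assumptions to adjust.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

RETRY_AFTER_SECONDS = 3600  # assumed maintenance window, in seconds


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with 503 and a Retry-After header."""

    def _send_maintenance(self, include_body):
        body = b"Down for maintenance. Please try again later.\n"
        self.send_response(503, "Service Unavailable")
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send_maintenance(include_body=True)

    def do_HEAD(self):
        self._send_maintenance(include_body=False)


def make_server(port=8080):
    # Bound to localhost only; put it behind whatever normally fronts the site.
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)
```

Calling make_server().serve_forever() starts it; stop it when the maintenance is done.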
According to Google the 503 code would be the way to go, since it means "the server is temporarily unavailable."
Also check out the W3C page on the same.