Apache gives 404 for encoded urls with special characters - apache

I have an application that generates xml files, and they might contain special characters. My problem is that Apache will not give me the xml file if the url with the special character is encoded.
Example:
The file ABCö.xml is accessible at http://host/path/ABCö.xml, but when I request the encoded URL http://host/path/ABC%F6.xml, Apache gives me a 404.
Is this a setting in httpd.conf, or do I need some rewriting to make the xml files accessible by both URLs?

You may have an encoding issue.
Most (all?) modern browsers use UTF-8 when encoding special characters in URLs that the user inputs directly into the address bar.
So when you enter ABCö.xml say in Firefox, it will transform ö into its UTF-8 multi-byte representation, so the end result will be
ABC%C3%B6.xml
and not the single-byte
ABC%F6.xml
Only one of them will work: the one whose bytes match the file name as it is stored on disk. Check which encoding is used in your file name.
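The difference between the two forms can be reproduced outside the browser. The following is a minimal Python sketch, not part of the original answer, that percent-encodes the same file name once as UTF-8 and once as Latin-1 with the standard library's urllib.parse.quote. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

name = "ABCö.xml"

# Percent-encode the same name using two different byte encodings.
utf8_url = quote(name, encoding="utf-8")
latin1_url = quote(name, encoding="latin-1")

print(utf8_url)    # ABC%C3%B6.xml
print(latin1_url)  # ABC%F6.xml

# Decoding only round-trips when both sides agree on the encoding.
print(unquote(latin1_url, encoding="latin-1") == name)  # True
print(unquote(latin1_url, encoding="utf-8") == name)    # False
```

If the file name on disk is stored as UTF-8, only the %C3%B6 form maps to it; if it is stored as Latin-1, only the %F6 form does.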

Related

Reading images with special characters in Apache web server

When I try to GET images that have special characters like ấ in the filename, I can't read the files on the frontend. Navigating to the URL directly also always returns a 404 error.
My server OS is CentOS, and my site is running on Apache with Node.js. I was wondering if I have to somehow change the file encoding in order to read images with special characters. All normal images work fine; it just does not seem to recognize the images with special characters at all.
There are a lot of files, which makes renaming them all not an option for me unfortunately. If anyone knows what I have to do to get the files to the correct encoding, please let me know.
Update: I've discovered a way to find the files, but I don't understand the encoding pattern. For example, a file known as kt-giấy-2.jpg can be viewed directly using kt-gia%CC%82%CC%81y-2.jpg. Does anyone know what kind of encoding this is? It doesn't line up with URI encoders.
For anyone who has this issue: my problem was that I transferred the files from Mac OS X to CentOS directly, as a zip file uploaded through cPanel. The file contents are fine and the files were readable, but their names were not in the Unicode normalization form the URLs expected, so the names need to be converted with convmv.
Mac OS X stores file names in NFD (decomposed form), while most other systems use NFC (precomposed form).
Use this command in the directory containing the files whose names you want to convert:
convmv -r -f utf8 -t utf8 --nfc --notest .
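To see why the working URL looked so odd, here is a small Python sketch (not from the original post) that normalizes the example file name to both forms with the standard unicodedata module and percent-encodes each result. The NFD form reproduces the URL from the update above.

```python
import unicodedata
from urllib.parse import quote

# "kt-giấy-2.jpg", written with the single precomposed character U+1EA5.
name = "kt-gi\u1ea5y-2.jpg"

nfc = unicodedata.normalize("NFC", name)  # precomposed: one code point for ấ
nfd = unicodedata.normalize("NFD", name)  # decomposed: a + U+0302 + U+0301

print(quote(nfc))  # kt-gi%E1%BA%A5y-2.jpg
print(quote(nfd))  # kt-gia%CC%82%CC%81y-2.jpg
print(nfc == nfd)  # False: same text on screen, different bytes on disk
```

Both strings render identically, but a web server compares bytes, so a request using the NFC form does not match a file stored under the NFD name.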

How to use the "url_dec" function in HAProxy?

I have an OPNsense firewall set up with HAProxy sitting on my WAN interface to reverse-proxy my web server.
The problem with my application (which is outsourced) is that it has a lot of unicode characters in the URL parameters. Before installing OPNsense, I was running ISA server 2006 with no problems.
As I have read in its documentation, HAProxy only supports ASCII characters. However, I have a lot of non-ASCII characters which are written by design in the URL as URL parameters.
These characters include Arabic characters and special French characters. HAProxy considers these characters illegal, making the HTTP request invalid and returning error code 400 (Invalid request). After days of debugging and checking logs, I figured out that this is the normal behavior of HAProxy.
One of the things I tried was to make HAProxy accept these characters, but it was not successful.
One last resort before trying another reverse proxy engine is to encode these characters in JavaScript. But once I encode them, how do I decode them in the HAProxy configuration?
As it stands, the HTTP response I am getting is 404 Not Found because the encoded URL parameters are not being decoded properly.
Any suggestions?

IIS8 - Slash after file name delivers file & HTTP:200 for CFM, HTML, but HTTP:404 for ASP

I'm working on a site that's recently migrated to IIS8/Windows Server 2012 R2. It's running ColdFusion 11, and serves a mix of static HTML, CFM and ASP content.
The problem I'm facing is this:
For .cfm and .htm files, it is possible to enter a URL with a trailing slash (with or without arbitrary text after the slash) and still get the file served with a HTTP 200 OK response code. For example, all of these:
myurl.com/about.cfm/foo
myurl.com/about.cfm/
myurl.com/about.cfm/typewhateveryouwantitdoesnotmatter
will deliver the same content as
myurl.com/about.cfm
Except that any relative URLs in the page will break - images, CSS, scripts, links, etc. IIS interprets about.cfm/ as a directory, and returns the rendered content of the about.cfm file for any nonexistent "file" in that "directory".
This behavior is undesirable - I'd rather it produced a 404 error.
Interestingly, the misbehavior described above does not occur for ASP content.
myurl.com/about.asp/foo
returns HTTP 404: Not Found just as one would want.
Googling this problem is tough because of the signal-to-noise ratio - I'm wading through reams of basic IIS URL rewrite advice concerning the addition of trailing slashes to directories, and having a hard time finding anything like my situation. Thanks in advance for your help!

-Apache- files from "website" not UTF8

OK, let's start from scratch. I just realized this is Apache and not phpMyAdmin, my bad.
Anyway, I needed some sort of file storage accessible through the web. I deleted the index.html so that Apache lists the other files in /var/www. Now if I open the JSON file (UTF-8 without BOM) in the browser, special characters like ä, ü, ö are not displayed correctly (normal characters are). If I download the file, everything is correct on my system.
So the file itself is fine, but the stream from Apache to the web is not in UTF-8, or something like that. That is what I would like to change.
I need this for an Android app, where I parse the content of the JSON file with the Volley library. But there the special characters come out wrong as well.
I hope this is more useful than before. My apologies for that.
The only thing that is wrong is that your browser doesn't know it should interpret the UTF-8 encoded JSON file as UTF-8. Instead it falls back to its default Latin-1 interpretation, in which certain characters will screw up, because it's using the wrong encoding to interpret the file.
That is all. The file will appear fine if it is interpreted using the correct encoding, UTF-8 in this case.
Use the View → Encoding menu of your browser to force it to UTF-8 and see it work.
Why doesn't the browser use UTF-8? Because there's no HTTP Content-Type header telling it to do so. Why is there no appropriate HTTP header set? Because you didn't tell your web server that it should set this header for .json files. How do you tell Apache to do so? By adding this line in an .htaccess file:
AddCharset UTF-8 .json
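The garbling described in the question can be reproduced with a short Python sketch (an illustration added here, not part of the original answer): the UTF-8 bytes of the file are correct, and only the decoding step on the client side goes wrong.

```python
text = "äüö"

# Bytes as they are stored in the UTF-8 file and sent by the server.
raw = text.encode("utf-8")

# What a client shows when it assumes Latin-1 because no charset was declared.
wrong = raw.decode("latin-1")

# What a client shows once it is told the content is UTF-8.
right = raw.decode("utf-8")

print(wrong)  # Ã¤Ã¼Ã¶
print(right)  # äüö
```

With the charset declared in the Content-Type header, the client takes the second path and the characters display correctly.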

Showing non-ascii characters in URL

I'm trying to make a page that will show Arabic/Hebrew in the URL.
for example: www.mydomain.co.ar/אבא.php
The problem is, when I upload the page to the Apache server and try to browse to that
page, either with "www.mydomain.co.ar/אבא.php" or the percent-encoded form
"www.mydomain.co.ar/%D7%90%D7%91%D7%90.php", I get a 404.
When I then list the directory, Apache shows àáà.php.
I know there is a way to show non-ASCII characters in a URL; Wikipedia has been doing it for ages.
My thought is that maybe an .htaccess rewrite would help? If so, how can I accomplish that?
Looks like you have to tell Apache that the file system is encoded in UTF-8 (or whatever encoding it actually uses). Maybe starting Apache with a UTF-8 locale active (LC_CTYPE=ar.utf8 or similar) helps there.
Wikipedia parses the URLs in the PHP software (and then asks the database about the right article), so this does not necessarily say how Apache does this.
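One possible explanation for the àáà.php listing, which the thread does not confirm, is that the file name was stored on the server in a single-byte Hebrew encoding rather than UTF-8. The Python sketch below assumes Windows-1255 (cp1255) purely for illustration and shows that those bytes, read as Latin-1, give exactly the name in the directory listing, while the browser requests the UTF-8 form.

```python
from urllib.parse import quote

# "אבא.php" written with escapes: alef (U+05D0), bet (U+05D1), alef.
name = "\u05d0\u05d1\u05d0.php"

# What a browser sends: percent-encoded UTF-8 bytes.
print(quote(name, encoding="utf-8"))   # %D7%90%D7%91%D7%90.php

# Assumption: the name was stored on disk as cp1255 bytes.
legacy = name.encode("cp1255")
print(legacy.decode("latin-1"))        # àáà.php

# The URL that would match a cp1255 file name.
print(quote(name, encoding="cp1255"))  # %E0%E1%E0.php
```

If that assumption holds, the requested UTF-8 bytes never match the stored name, which would explain the 404; renaming the file so that its name is stored as UTF-8 would make the browser's URL match.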