Method for getting the correct system path on Windows - libevent

I have written a simple HTTP server using libevent. Resources (folders, in my case) are accessed like this:
http://serverAddress:port/path/to/resource/
The path to the resource is extracted from the decoded URL. This works fine on Linux, where a request looks like this:
http://serverAddress:port/home/vickey/folder
but on Windows the request is
http://serverAddress:port/c:/users/vickey/folder
which results in a decoded URL of /c:/users/vickey/folder. It's possible to remove the leading slash manually to correct the problem. However, since I'm using and learning the Boost libraries in my code, I was wondering whether Boost already implements something like this. I tried native() and relative_path(). Thanks.

It's definitely possible to do what you're asking, but I would suggest a different approach. How about creating a configuration property for the server, called something like RESOURCE_BASE_PATH? The resource path received in the URL would be appended to RESOURCE_BASE_PATH to create the complete path.
This is pretty standard for FTP and HTTP servers and the like. On Windows it could be set to "c:", and on Linux it could be left blank, which would default to "/".
Also remember that Windows uses backslashes (\) as path separators, whereas Unix uses forward slashes (/).
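
The base-path idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name resolve_resource is hypothetical, the URL path is assumed to be already decoded, and rejecting ".." and backslashes is an extra safety step that the answer above does not mention.

```python
def resolve_resource(base_path: str, decoded_url_path: str) -> str:
    """Append an already-decoded URL path to a configured base path.

    base_path plays the role of RESOURCE_BASE_PATH ("c:" on Windows,
    "" on Linux). The name and the safety checks are assumptions.
    """
    segments = []
    for segment in decoded_url_path.split("/"):
        if segment in ("", "."):
            continue  # ignore empty segments and "current directory" markers
        if segment == ".." or "\\" in segment:
            # Refuse anything that could climb out of the base directory.
            raise ValueError(f"unsafe path segment: {segment!r}")
        segments.append(segment)
    # An empty base yields a path starting at "/", as the answer describes.
    return base_path.rstrip("/") + "/" + "/".join(segments)


print(resolve_resource("c:", "/users/vickey/folder"))
print(resolve_resource("", "/home/vickey/folder"))
```

With this approach the Windows request would be http://serverAddress:port/users/vickey/folder, so the drive letter never appears in the URL.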

Archiving an old PHP website: will any webhost let me totally disable query string support?

I want to archive an old website which was built with PHP. Its URLs are full of .phps and query strings.
I don't want anything to actually change from the perspective of the visitor -- the URLs should remain the same. The only actual difference is that it will no longer be interactive or dynamic.
I ran wget --recursive to spider the site and grab all the static content. So now I have thousands of files such as page.php?param1=a&param2=b. I want to serve them up as they were before, so that means they'll mostly have Content-Type: text/html, and the webserver needs to treat ? and & in the URL as literal ? and & in the files it looks up on disk -- in other words it needs to not support query strings.
And ideally I'd like to host it for free.
My first thought was Netlify, but deployment on Netlify fails if any files have ? in their filename. I'm also concerned that I may not be able to tell it that most of these files are to be served as text/html (and one as application/rss+xml) even though there's no clue about that in their filenames.
I then considered https://surge.sh/, but hit exactly the same problems.
I then tried AWS S3. It's not free but it's pretty close. I got further here: I was able to attach metadata to the files I was uploading so each would have the correct content type, and it doesn't mind the files having ? and & in their filenames. However, its webserver interprets ?... as a query string, and it looks up and serves the file without that suffix. I can't find any way to disable query strings.
Did I miss anything -- is there a way to make any of the above hosts act the way I want them to?
Is there another host which will fit the bill?
If all else fails, I'll find a way to transform all the filenames and all the links between the files. I found how to get wget to transform ? to #, which may be good enough. It would be a shame to go this route, however, since then the URLs are all changing.

I found a solution with Netlify.
I added the wget options --adjust-extension and --restrict-file-names=windows.
The --adjust-extension part adds .html at the end of filenames which were served as HTML but didn't already have that extension, so now we have for example index.php.html. This was the simplest way to get Netlify to serve these files as HTML. It may be possible to skip this and manually specify the content types of these files.
The --restrict-file-names=windows alters filenames in a few ways, the most important of which is that it replaces ? with #. This is needed since Netlify doesn't let us deploy files with ? in the name. It's a bit of a hack; this is not really what this option is meant for.
This gives static files with names like myfile.php#param1=value1&param2=value2.html and myfile.php.html.
I did some cleanup. For example, I needed to adjust a few link and resource paths to be absolute rather than relative due to how Netlify manages presence or lack of trailing slashes.
I wrote a _redirects file to define URL rewriting rules. As the Netlify redirect options documentation shows, we can test for specific query parameters and capture their values. We can use those values in the destinations, and we can specify a 200 code, which makes Netlify handle it as a rewrite rather than a redirection (i.e. the visitor still sees the original URL). An exclamation mark is needed after the 200 code if a "query-string-less" version (such as mypage.php.html) exists, to tell Netlify we are intentionally shadowing.
/mypage.php param1=:param1 param2=:param2 /mypage.php#param1=:param1&param2=:param2.html 200!
/mypage.php param1=:param1 /mypage.php#param1=:param1.html 200!
/mypage.php param2=:param2 /mypage.php#param2=:param2.html 200!
If not all query parameter combinations are actually used in the dumped files, not all of the redirect lines need to be included of course.
There's no need for a final /mypage.php /mypage.php.html 200 line, since Netlify automatically looks for a file with a .html extension added to the requested URL and serves it if found.
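
Because the rule lines follow a regular pattern, they could be generated rather than typed by hand. The Python sketch below is a hypothetical helper (redirect_rules is not part of Netlify or wget). It assumes that every combination of the given parameters may occur and that the parameters appear in the dumped filenames in the order supplied.

```python
from itertools import combinations


def redirect_rules(page, params):
    """Build _redirects lines for every non-empty subset of params."""
    rules = []
    # Emit larger subsets first, mirroring the order of the hand-written rules.
    for size in range(len(params), 0, -1):
        for subset in combinations(params, size):
            # Query-parameter matchers, e.g. "param1=:param1 param2=:param2".
            query = " ".join(f"{name}=:{name}" for name in subset)
            # Filename suffix as written by wget, e.g. "param1=:param1&param2=:param2".
            suffix = "&".join(f"{name}=:{name}" for name in subset)
            rules.append(f"{page} {query} {page}#{suffix}.html 200!")
    return rules


for line in redirect_rules("/mypage.php", ["param1", "param2"]):
    print(line)
```

Combinations that never occur in the dumped files can simply be deleted from the generated output.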
I wrote a _headers file to set the content type of my RSS file:
/rss.php
  Content-Type: application/rss+xml
I hope this helps somebody.

How to properly defang/disarm URLs with scheme ftp/ftps?

No problems with: HTTP/HTTPS
When defanging/disarming URL schemes (e.g. with python-defang):
http becomes hXXp
https becomes hXXps
So, no problem here.
But what happens with: FTP/FTPS/FXP
But how can these schemes be properly defanged?
ftp becomes fXp
how do I know, if a given URL is defanged or if it's a real URL which just makes use of the File eXchange Protocol (fxp) instead of the normal File Transfer Protocol (ftp)?
ftps becomes what? fXps?
what is the "official defanged" version of ftps?!
fxp becomes what? fXxp?
what is the "official defanged" version of fxp?!
Alternative?
Is there something like a rule of thumb for defanging/disarming: just to make sure that a URL doesn't work anymore within a browser so that the client won't open a malicious URL accidentally?

The linked project's source indicates that it only supports HTTP, HTTPS and FTP, not SFTP, FTPS or FXP, although support seems trivial to add by updating the PROTOCOL_TRANSLATIONS list in __init__.py.
FXP://, SFTP:// and FTPS:// are not supported in modern browsers. At best, clicking such a URL will show an external application launch dialog, similar to what you get with a magnet link.
As a rule of thumb, if crippling URLs is the goal, I would replace ':' with something else. Changing the protocol name itself doesn't make the URL invalid, just unlikely to be understood or to exist. It will still be parsed by extensions, plugins, etc., which may be enough to trigger bad mojo. Changing the colon turns the URLs into plain strings.
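
A small Python sketch of the colon-replacement idea follows. The "[:]" marker and the function names defang and refang are assumptions chosen for illustration, not an official convention and not the python-defang API.

```python
def defang(url: str) -> str:
    """Neutralise a URL by replacing the colon of its scheme separator."""
    scheme, separator, rest = url.partition("://")
    if not separator:
        return url  # no scheme separator found, leave the text unchanged
    # The scheme name stays readable, so ftp, ftps and fxp remain distinguishable.
    return f"{scheme}[:]//{rest}"


def refang(text: str) -> str:
    """Reverse defang() by restoring the first scheme separator."""
    return text.replace("[:]//", "://", 1)


for url in ("ftp://example.com/a", "ftps://example.com/b", "fxp://example.com/c"):
    print(defang(url))
```

Since the scheme itself is untouched, the ambiguity between a defanged ftp and a genuine fxp URL does not arise.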

Issues with intern-runner and proxyUrl that contains subfolders

I need to set up intern to test Ajax calls from a different server. I set everything up, more or less following the official wiki at this address:
https://github.com/theintern/intern/wiki/Using-Intern-to-unit-test-Ajax-calls
My config file has proxyUrl set to http://localhost:8080/sub
and http://localhost:8080/sub is set up as a reverse proxy to intern-runner at http://localhost:9000
When I run ./node_modules/.bin/intern-runner -config=tests/config from the tests root folder, the browser opens up and is able to request several files, until it tries to request the config file. That's when it receives a 404, because it requests the wrong address - http://localhost:8080/tests/config.js - without the sub folder.
I'm wondering if I'm missing something inside the config file, or if intern is not able to use proxies with subfolders. I tried to set the baseUrl parameter, but it had no effect.
Any ideas?
Update:
It seems that sometimes intern-runner uses the path provided in the config param, and sometimes it uses the one in the proxyUrl parameter inside the config file. As a workaround, I placed the config file and the tests in two folders (actually, I made a symbolic link): the first in tests/ and the second in sub/tests/. I then ran it using ./node_modules/.bin/intern-runner -config=sub/tests/config.
It works, but it's kind of stupid and I really wish there was a better way to do it.

This is indeed a limitation/bug of intern. It assumes that the proxy sits at the root of the absolute domain name, i.e. that it has a pathname of /.
An issue has been created on intern's github repository here and the corresponding pull request that fixes the problem is here. Hopefully this gets merged into the upcoming 2.1 release of intern.

HTTP Parameters not being sent in Apache 2.4 breaking functionality

So let's start with some background. I have a 3-tier system, with an API implemented in django running with mod_wsgi on an Apache2 server.
Today I decided to upgrade the server, running at DigitalOcean, from Ubuntu 12.04 to Ubuntu 14.04. Nothing special, only that Apache2 also got updated to version 2.4.7. After wasting a good part of the day figuring out that they actually changed the default folder from /var/www to /var/www/html, breaking functionality, I decided to test my API. Without touching a single line of code, some of my functions were not working.
I'll use one of the smaller functions as an example:
# Returns the location information for the specified animal, within the specified period.
@csrf_exempt  # Prevents Cross Site Request Forgery errors.
def get_animal_location_reports_in_time_frame(request):
    start_date = request.META.get('HTTP_START_DATE')
    end_date = request.META.get('HTTP_END_DATE')
    reports = ur_animal_location_reports.objects.select_related('species').filter(date__range=(start_date, end_date), species__localizable=True).order_by('-date')
    # Filter by animal if parameter sent.
    if request.META.get('HTTP_SPECIES') is not None:
        reports = reports.filter(species=request.META.get('HTTP_SPECIES'))
    # Add each information to the result object.
    response = []
    for rep in reports:
        response.append(dict(
            ID=rep.id,
            Species=rep.species.ai_species_species,
            Species_slug=rep.species.ai_species_species_slug,
            Date=str(rep.date),
            Lat=rep.latitude,
            Lon=rep.longitude,
            Verified=(rep.tracker is not None),
        ))
    # Return the object as a JSON string.
    return HttpResponse(json.dumps(response, indent=4))
After some debugging, I observed that request.META.get('HTTP_START_DATE') and request.META.get('HTTP_END_DATE') were returning None. I tried many clients, ranging from REST Clients (such as the one in PyCharm and RestConsole for Chrome) to the Android app that would normally communicate with the API, but the result was the same, those 2 parameters were not being sent.
I then decided to test whether other parameters are being sent and to my horror, they were. In the above function, request.META.get('HTTP_SPECIES') would have the correct value.
After a bit of fiddling around with the names, I observed that ALL the parameters that had a _ character in their name would not make it to the API.
So I thought, cool, I'll just use - instead of _ , that ought to work, right? Wrong. The - arrives at the API as a _!
At this point I was completely puzzled so I decided to find the culprit. I ran the API using the django development server, by running:
sudo python manage.py runserver 0.0.0.0:8000
When sending the same parameters, using the same clients, they are picked up fine by the API! Hence, django is not causing this, Ubuntu 14.04 is not causing this, the only thing that could be causing it is Apache 2.4.7!
Now moving the default folder from /var/www to /var/www/html, thus breaking functionality, all for a (in my opinion) very stupid reason is bad enough, but this is just too much.
Does anyone have an idea of what is actually happening here and why?

This is a change in Apache 2.4.
This is from Apache HTTP Server Documentation Version 2.4:
mod_cgi, mod_include, mod_isapi, ... Translation of headers to environment variables is more strict than before to mitigate some possible cross-site-scripting attacks via header injection. Headers containing invalid characters (including underscores) are now silently dropped. Environment Variables in Apache (p. 81) has some pointers on how to work around broken legacy clients which require such headers. (This affects all modules which use these environment variables.)
– Page 11
For portability reasons, the names of environment variables may contain only letters, numbers, and the underscore character. In addition, the first character may not be a number. Characters which do not match this restriction will be replaced by an underscore when passed to CGI scripts and SSI pages.
– Page 86
A pretty significant change, in other words. So you need to rewrite your application to send dashes instead of underscores, which Apache will in turn replace with underscores.
EDIT
There seems to be a way around this. If you look at this document over at apache.org, you can see that you can fix it in .htaccess by putting the value of your foo_bar into a new variable called foo-bar which in turn will be turned back to foo_bar by Apache. See example below:
SetEnvIfNoCase ^foo.bar$ ^(.*)$ fix_accept_encoding=$1
RequestHeader set foo-bar %{fix_accept_encoding}e env=fix_accept_encoding
The only downside to this is that you have to make a rule per header, but you won't have to make any changes to the code either client or server side.

Are you sure Django didn't get upgraded as well?
https://docs.djangoproject.com/en/dev/ref/request-response/
With the exception of CONTENT_LENGTH and CONTENT_TYPE, as given above, any HTTP headers in the request are converted to META keys by converting all characters to uppercase, replacing any hyphens with underscores and adding an HTTP_ prefix to the name. So, for example, a header called X-Bender would be mapped to the META key HTTP_X_BENDER.
The key bits are: Django converts '-' to an underscore and also prepends 'HTTP_' to the name. If you are already adding an HTTP_ prefix when you call the API, it might be getting doubled up, e.g. 'HTTP_HTTP_SPECIES'.
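
The naming rule quoted from the Django documentation can be mimicked in a few lines of Python. This is a sketch of the documented mapping only, not Django's actual implementation, and meta_key is a hypothetical name.

```python
def meta_key(header_name: str) -> str:
    """Return the request.META key the documented rule gives for a header."""
    key = header_name.upper().replace("-", "_")
    # CONTENT_LENGTH and CONTENT_TYPE are the documented exceptions: no prefix.
    if key in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return key
    return "HTTP_" + key


for name in ("X-Bender", "Start-Date", "Start_Date", "Species", "Content-Type"):
    print(name, "->", meta_key(name))
```

Note that Start-Date and Start_Date map to the same key, so the view code can keep reading HTTP_START_DATE once clients send the hyphenated header name.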

Resolving a remote $HOME directory via FTP/SFTP

In Objective-C, NSString has a method called
stringByExpandingTildeInPath
This method will take a string like "~/Documents" and resolve it to "/Users/Nick/Documents". The "~" tilde is resolved to the home directory of the current user of the machine the program is running on.
Now my question is this... I am writing a little FTP/SFTP utility using Cocoa and Objective-C. How could I resolve a tilde (~) path on remote machine via FTP/SFTP?
For example. A user wants to upload a file to
sftp://remote-host.com:~/
If remote-host.com is a Linux or OSX server, then this path is totally valid. However uploading a file only works when I specify the absolute path. I'm not sure if this is a limitation of the framework I'm using, ConnectionKit, or if this is something that I need to manually implement. I'm ok with the latter but, any suggestions on how?

You could try just removing the "~/" and using the rest as a relative path. Generally, the server should put you in the user's home directory by default when you connect.
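
The stripping step is simple enough to sketch. The example below is in Python rather than Objective-C, purely for illustration. The function name is hypothetical, and it assumes the server really does start the session in the user's home directory.

```python
def to_login_relative(remote_path: str) -> str:
    """Turn a "~/..." path into one relative to the login directory."""
    if remote_path == "~":
        return "."  # the home directory itself
    if remote_path.startswith("~/"):
        # Drop the "~/" prefix; an empty remainder means the home directory.
        return remote_path[2:] or "."
    # Absolute paths and forms such as "~otheruser/..." are left untouched.
    return remote_path


for path in ("~/", "~/Documents/report.txt", "/var/www/index.html"):
    print(path, "->", to_login_relative(path))
```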