Does anyone know of any reverse proxy solutions that allow the content/data of an HTTP response to be directly modified before being relayed to the requesting client?
As an example:
Proxy relays client request for pdf document to another server, response received by proxy, watermark added to pages of pdf, watermarked pdf is returned to client.
Regards,
Mike
Apache has mod_proxy and mod_proxy_html, which is used to rewrite links, headers, etc. I've only ever seen HTML or XML filters, but you should be able to write your own binary one for your PDF needs. The possible difficulty I could see is that Apache treats webpages as a stream, rather than a file. I'm not sure how to watermark a PDF doc, but if you need access to the entire file to do it, it might get complicated quickly.
Note that it would seem far easier to me to do the watermarking on the server, where you have access to the file, rather than a proxy. If server load is a concern, either a batch process, or a separate server could be an alternative solution.
I found a description of Deliverance over on the python tags, and it may be useful for what you're looking for. I have no experience with it myself, so grain of salt and all that.
http://www.openplans.org/projects/deliverance/introduction
I've had success with Pound.
I think I might go down the Squid/ICAP route.
This is for an enterprise level system, does anyone have any experience with either of these in this context?
http://wiki.squid-cache.org/Features/ICAP
Related
I am currently hosting the contents of a site with ProviderA. I have a domain registered with ProviderB. I want users to access the contents (www.providerA.com/sub/content) by visiting www.providerB.com. A domain forward is easy enough and works as intended, however, unless I embed the site in a frame (which is a big no-no), the actual URL reads www.providerA.com/sub/content despite the user inputting www.providerB.com.
I really need a solution for this. A domain masking without the use of a frame. I'm sure this has been done before. An .htaccess domain rewrite?
Your help would be hugely appreciated! I'm going nuts trying to find a solution.
For Apache
Usual way: setup mod_proxy. The apache on providerB becomes a client to providerA's apache. It gets the content and sends it back to the client.
But looks like you only have .htaccess. So no proxy, you need full configuration access for that.
So you cannot, see: How to set up proxy in .htaccess
If you have PHP on providerB
Setup a proxy written in PHP. All requests to providerB are intercepted by that PHP proxy. It gets the content from providerA and sends it back. So it does the same thing as the Apache module. However, depending on the quality of the implementation, it might fail on some requests, types, sizes, timeouts, ...
Search for "php proxy" on the web, you will see a couple available on GitHub and others. YMMV as to how difficult it is to setup, and the reliability.
No PHP but some other server side language
Obviously that could be done in another language, I checked PHP because that is what I use the most.
The best solution would be to transfer the content to providerB :-)
I want to modify my application URL from //localhost:8080/monitor/index.html to just monitor , so that on putting monitor on browser, my application should open. Is there a way to achieve this, can someone suggest the configuration changes which will be required for this.
Can I map my short URL to the existing one may be somewhere in web.xml. I am not sure about the approach any suggestions will be great.
Thanks and regards
Deb
You're mixing up several different protocol layers in your question.
If you just enter nothing but "monitor" in the browser URL bar the browser is going to first lookup "monitor" in DNS and finding nothing it will then probably send a query to Google or your configured search engine. In the past browsers have taken other steps, such as appending ".com" and prepending "www." but I don't think modern browsers do that any more.
So far, your server is not even remotely involved.
If you're a large ISP user (TimeWarner, Comcast) and use their DNS it's also possible the ISP will intercept your failed DNS lookup and route the request to a "helpful" search page (i.e. SPAM) of their own.
At this point the request is still nowhere near your server.
I suppose you could mess with the /etc/hosts file on your local system to resolve "monitor" to the proper hostname, but that's an extremely brittle solution that has to be hard coded on each machine you want to have this "shortcut" link (and which breaks when the hostname changes).
You're much better off just setting up a web shortcut in your browser that points to the right place.
Is there any way to capture post request and write it to log running apache mod_proxy (or any other mod)?
For example, I have one CMS behind apache mod_proxy and I want to capture Login textbox which uses POST verb in the apache log file, it is possible?
Thanks :).
Please take a look at mod_dumpio. All input and/or all output will be logged into error.log.
mod_security can log post data too, but a little complex.
This may suit to your needs mod_log_post (striped down version of mod_sec), but has less documentation and support. Though it might work within your purpose.
What I am trying to achieve is to have Apache's mod_proxy_balancer check if a request was already made using a Memcache store.
Basically:
Streaming media request comes in.
Check if streaming media has already been served with Memcache.
If so, can that streaming media server handle another request.
If so send request to said streaming media server.
If not send request to the next streaming media server in line.
Store key:value pair in Memcache.
My questions are:
Does mod_proxy_balancer already do this in some way?
Is there anyway to make Apache a content-aware load balancer?
Any other suggestions would be greatly appreciated too, other software, other approach, etc.
Cheers.
Looking at 'mod_proxy_balancer.c'; one could, as suggested in the comments in the file, add additional lbmethods. Something along the lines of "bymemcached_t" or "bymemcached_r" where the t and r endings denote the "bytraffic" and "byrequests" methods respectively. We would do our pseudo code above and if not found proceed to the other methods and save the result in the memcached store.
In my research I came across HAProxy which does exactly what I want from its documentation using the balance algorithm option of 'uri' just not using Memcached. Which is fine for my purposes.
I have an ssl page that also downloads an avatar from a non-ssl site. Is there anything i can do to isolate that content so that the browser does not warn user of mixed content?
Just an idea - either:
try to use an ssl url on the avatar website, if necessary by editing whatever JS/PHP/... script they provide, or:
use your scripting language of choice to grab a copy of the avatar and store it on your server, then serve it from there.
There are a number of good security reasons for the browser to warn about this situation, and attempting to directly bypass it is only likely to set off more red flags.
Ninefingers' suggestions are good, and I would suggest a third option: you can proxy the content directly through your own server using a simple binary retrieve/transmit script, if it changes frequently and is unsuitable for caching.
If all the content you want to include from foreign sites comes from a specific server and path (i.e. http://other.guy/avatar/*) you could use mod_proxy to create a reverse proxy which makes https://your.site/avatar_proxy/{xyz} mirror http://other.guy/avatar/{xyz} .This will increase your bandwidth usage and probably slow things down.