Managing Ajax CouchDB calls and IE's (HTA) aggressive cache - Apache

I'm having a rather annoying problem, and I've come up with a quite ugly hack to make it work.
I'm developing an HTA application that uses a CouchDB database (for internal company use). The problem is that there seems to be some very aggressive caching of the database queries, and it's been hard to come up with solutions.
So updated data in the database just won't show up in the browser, which still has the previous request's results in its cache, until the entire app is started anew.
Oh, and CouchDB (or its MochiWeb server) doesn't allow unknown GET variables, so the usual solution of appending some sort of timestamp just won't work.
I have found a couple of workarounds, but they're damn ugly:
Only open documents with the latest revision number (easy and nice, but won't work on views).
Use Apache as a forward proxy listening on 200+ ports, and pick one at random for each read query (that's the ugly one).
HTA accepts Ajax calls to other ports (maybe even to other domains, strange behaviour), so it works nicely; I just have a 1/200 chance that new data won't come up, but that's still better than 1/1, and I can live with that.
So what I'm asking is: is there a better solution to this? Can I hack into the MochiWeb server to modify the cache headers (and hope they're not going to be ignored)? Is there a special unknown "throwaway" key I could use in the URLs to append some random string? Or is there a way to tell HTA not to cache anything (from within the app; this is supposed to work on hundreds of computers)?

It's still ugly, but slightly less ugly than your current Apache setup: can't you use an Apache rewrite rule that lets you set an arbitrary no_cache attribute on the URL? Apache can throw it away, so CouchDB won't see it.
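A minimal sketch of that idea, assuming Apache is already sitting in front of CouchDB as a reverse proxy (the /couch/ path prefix, the 5984 port and the no_cache parameter name are all placeholders, and it assumes the client always appends no_cache as the last query parameter):

```apache
# Hypothetical reverse-proxy fragment in front of CouchDB.
# The HTA client appends &no_cache=<random> to every read to defeat IE's
# cache; the rewrite strips that parameter before the request reaches
# CouchDB, so its "unknown GET variable" check never sees it.
RewriteEngine On
# Assumes no_cache is always the last query parameter (the client controls this)
RewriteCond %{QUERY_STRING} ^(.*?)&?no_cache=[^&]*$
RewriteRule ^/couch/(.*)$ http://127.0.0.1:5984/$1?%1 [P,L]
```

Requests without a no_cache parameter don't match the condition and simply fall through to whatever proxy rule you already have.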


What's Elasticsearch and is it safe to delete Logstash?

I have an internal Apache server for testing purposes, not client-facing.
I wanted to upgrade the server to Apache 2.4, but there is no space left, so I was trying to delete some files on the server.
After checking file sizes, I found that the folder /var/lib/elasticsearch takes 80 GB of space. For example, /var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash-2015.12.08 already takes 60 GB. I'm not sure what Elasticsearch is. Is it safe if I delete this logstash index? Thanks!
Elasticsearch is a search engine, like a NoSQL database, and it stores its data in indices. What you are seeing is the data of one index.
Probably someone was using the index around 2015, when the index was timestamped.
I would just delete it.
I'm afraid that only you can answer that question. One use for Logstash + Elasticsearch is to help make sense of system logs. That combination isn't normally set up by default, so I presume someone set it up at some time for some reason, and it has obviously done some logging. Only you can know if it is still being used, or if it is safe to delete.
As other answers pointed out, Elasticsearch is a distributed search engine, and I believe an earlier user was pushing application or system logs to this Elasticsearch instance using Logstash. If you can find the source application, check whether the original log files are still there; if they are, you can go ahead and delete your index. I highly doubt anyone still needs logs from 2015, but it is really your call to check your application's archiving requirements and then take the necessary action.
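If you do decide to drop it, it is cleaner to delete the index through Elasticsearch's own HTTP API than to remove the files on disk. A sketch, assuming the node is still running on the default port 9200:

```sh
# List the indices and their sizes first, to be sure of the name
curl 'http://localhost:9200/_cat/indices?v'

# Then delete only the old logstash index; this frees its disk space
curl -XDELETE 'http://localhost:9200/logstash-2015.12.08'
```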

Visual Basic / ASMX - how to use an application cache variable?

I'm trying to amend our content management system so it'll handle SQL database failures more gracefully. It's a bunch of ASMX pages, and a Helpers.vb file in which I've written a SQL connection tester function.
Each of the ASMX pages calls the same function.
I need to create a variable I can check that's persistent and performant; otherwise I'm going to have to fall back on something disastrously slow, like reading a text file every time I set up a SQL connection string.
I've tried using application caching, but either it doesn't work in the context of my Helpers.vb file, or I've made a mess of the syntax. One problem that's already stymied some of the approaches I've found via Google: I can't use 'Imports System.Web.Caching' - IntelliSense doesn't show the 'Caching' part.
Has anyone got any example code that might get me up and running? Or an alternative approach?
@Mike,
Many thanks, now I'm using HttpRuntime.Cache correctly... it works!
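For anyone who lands on this later, the pattern that ended up working looks roughly like the sketch below; the cache key, the 30-second lifetime and the TestSqlConnection stub are illustrative stand-ins for what's actually in Helpers.vb:

```vb
Imports System.Web

Public Module Helpers

    Private Const SqlStatusKey As String = "SqlIsAvailable"

    Public Function SqlIsAvailable() As Boolean
        ' HttpRuntime.Cache works outside of page context, so it can be
        ' used from a plain helper module called by all the ASMX pages.
        Dim cached As Object = HttpRuntime.Cache(SqlStatusKey)
        If cached IsNot Nothing Then
            Return CBool(cached)
        End If

        Dim ok As Boolean = TestSqlConnection()

        ' Remember the result for 30 seconds so every request doesn't
        ' re-test the SQL server. Cache is fully qualified here, so no
        ' Imports System.Web.Caching is needed.
        HttpRuntime.Cache.Insert(SqlStatusKey, ok, Nothing, _
            DateTime.UtcNow.AddSeconds(30), _
            System.Web.Caching.Cache.NoSlidingExpiration)

        Return ok
    End Function

    Private Function TestSqlConnection() As Boolean
        ' Placeholder for the existing connection-tester function
        Return True
    End Function

End Module
```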
Thanks everyone for taking the time to post :)

Controlling access to large files in Apache

I am looking to control access to some large files (we're talking many GB here) by the use of signed URLs. The files are currently restricted by LDAP Basic authentication (mod_auth_ldap), but I need to change this to verify the signature (passed as a query parameter in the URL).
Basically, I just need to run a script to verify the signature and allow the request to proceed as if authentication had succeeded. My initial thought was just to use a simple CGI script, but as the files are so large I'm concerned about performance. So, really, this question is (probably) more like "are there any performance implications of streaming large files from a CGI script via Apache?"… and if so, "is there a better way of doing this (short of writing a dedicated authentication module)?"
If this makes any sense, help would be much appreciated :)
P.S. I wasn't sure exactly what to search for here (10 minutes of Googling were fruitless), so I may very well be duplicating someone else's post.
Have a look at the crypto cookies/sessions in Apache. One way to do this is to put a must-have-valid-session restriction on that directory, forward anyone who does not have a valid one to a CGI script, authenticate there, and then forward back to the actual download.
That way Apache can use its normal sendfile() and other optimizations.
However, keep in mind that a shell or Perl script ending with a simple 'execvp', 'exec cat' or something like that is not that expensive.
An alternative is more URL-based, like http://authmemcookie.sourceforge.net/.
Dw.
I ended up solving this with a CGI script as mentioned… cookies weren't an option because we need to be able to support clients that don't use cookies (apt).
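In case it helps anyone else, the rough shape of that script is sketched below. It's only an illustration under assumptions: the file/sig parameter names, the shared secret and the file root are placeholders, and the X-Sendfile hand-off needs mod_xsendfile (without that module the script would have to stream the bytes itself, which is exactly the overhead I was worried about).

```python
#!/usr/bin/env python
# Minimal sketch of a signature-checking CGI: verify an HMAC passed as a
# query parameter, then hand the file back to Apache via an X-Sendfile
# header so Apache's own sendfile() path serves the bytes.
import cgi
import hashlib
import hmac
import os
import sys

SECRET = b"change-me"            # shared signing secret (placeholder)
FILE_ROOT = "/srv/big-files"     # directory holding the large files (placeholder)

form = cgi.FieldStorage()
name = form.getfirst("file", "")
sig = form.getfirst("sig", "")

expected = hmac.new(SECRET, name.encode("utf-8"), hashlib.sha256).hexdigest()
path = os.path.realpath(os.path.join(FILE_ROOT, name))

# Reject bad signatures, and any path that escapes FILE_ROOT
if not hmac.compare_digest(sig, expected) or not path.startswith(FILE_ROOT + os.sep):
    sys.stdout.write("Status: 403 Forbidden\r\nContent-Type: text/plain\r\n\r\nForbidden\n")
else:
    # Apache (mod_xsendfile) replaces the empty body with the file named here
    sys.stdout.write("X-Sendfile: %s\r\nContent-Type: application/octet-stream\r\n\r\n" % path)
```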

Nice remote Apache log viewer

I have a server with 10+ virtual domains (most running MediaWiki). I'd like to be able to watch their traffic remotely with something nicer than tail -f. I could cobble something together, but was wondering if something super-deluxe already exists that involves a minimum of hacking and support. This is mostly to understand what's going on, not so much for security (though it could serve that role too). It must:
Be able to deal with vhost log files
Be able to handle updates every 10 seconds or so
Be free/open source
The nice to haves are:
Browser based display (supported by a web app/daemon on the server)
Support filters (bots, etc)
Features like counters for pages, with click to view history
Show a nice graphical display of a geographic map, timeline, etc
Identify individual browsers
Show link relationships (coming from remote site, to page, to another page)
Be able to identify logfile patterns (editing or creating a page)
I run Debian on the server.
Thanks!
Take a look at Splunk.
I'm not sure if it supports real time (~10 second) updates but there are a ton of features and it's pretty easy to get set up.
The free version has some limitations but there is also an enterprise version.
Logstash is the current answer. (=
Depending on the volume, Papertrail could be free for you. It is the closest thing to a tail -f and is searchable, archivable and also sends alerts based on custom criteria.

How to compare test website and live website

We have our production server running our website. Then we have a test server which has the exact same data but with code changes for some new functionality. This web app has over 500 pages.
Is there any program that can
Log in to the test site
Crawl through each page and then save the page as HTML
Compare it with the same page saved from the live site?
This way we can make sure that new features that we add to our test site will not break the live site when code updates are applied to production.
I am currently trying to use the WinHTTrack website copier and then comparing the test and live folders with a code comparison tool like Beyond Compare. This works OK, but a lot of files show up as changed just because of the domain name differences.
Looking forward to ideas / solutions for this problem.
Regards
Have you looked at using Watir for this? It's not exactly the thing you are looking for, but it might allow you some more granularity in your tests and ensure the site is functionally identical, rather than getting caught up on changing GUIDs, timestamps and all the other things that tend to change across any significant-size website from day to day as part of its standard functionality.
Apparently you can't make consistent, reproducible builds in your project, can you? I would recommend moving towards that in the long run; it will save you a lot of headaches. That way you would know exactly what was deployed to which server and when, so there would be no more need to bend over backwards to get the deployed sources back like this...
I know this is not a direct solution to your problem... but maybe it is worth comparing whether you would save more in the long run by investing the effort into your build process now, instead of implementing this workaround (and then improving your build process anyway - because one day you will almost surely need to do that).
wget has a --convert-links option, and there are also some options to preserve cookies that might let you do it while logged in: http://drupal.org/node/118759#comment-664498
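For example, something along these lines; the URLs and login form fields are placeholders, and the sed step is only there to normalise the hostname difference that was producing the spurious diffs:

```sh
# Log in once on the test site and keep the session cookie
wget --save-cookies cookies.txt --keep-session-cookies \
     --post-data 'user=me&pass=secret' -O /dev/null https://test.example.com/login

# Mirror both sites; wget puts each copy in a directory named after the host
wget --mirror --convert-links --page-requisites --load-cookies cookies.txt \
     https://test.example.com/
wget --mirror --convert-links --page-requisites https://www.example.com/

# Rewrite the test hostname in text files so the diff only shows real content changes
grep -rIl 'test.example.com' test.example.com/ | \
    xargs sed -i 's/test\.example\.com/www.example.com/g'

diff -ru test.example.com/ www.example.com/
```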
Use an offline downloader, download all files from both sources to your computer, then compare the folder contents using a free tool like Total Commander.
EDIT
Load both of your sources into CVS and compare them there.