Refresh browser via cron (or not) to a different page on remote request? - forwarding

I need to display pages in a tutorial fashion. I looked into NetSupport, BeamYourScreen and other possibilities, but I do not want the viewers to download anything. I cannot use GD or send screenshots because some of the pages have audio / video instructions embedded.
Basically, I need the ability to "refresh" a user's browser window to a different page via an interface on my end, whether via a form submission, JavaScript or any other type of "controller" that allows me to change the page in the viewer's browser. Perl preferred, but PHP / JavaScript / whatever works and is cross-browser. I set up a simple JavaScript page-forward timer that "works", but page load times and conversation interruptions are a huge factor.
The entire tutorial website will be developed around this ability.
I was looking into curl / cron / wget methods but found little information.
I have seen forum and chat scripts that basically perform a similar task, but there must be a simple(ish) solution in lieu of hacking up another script to suit my needs.
I do not want others to control the pages either. The site really only needs to be accessible during the tutorial; however, it "could" remain web accessible as long as user interaction was normal unless it is being controlled.
The initial site concept is based on instructing people how to properly introduce new pets into a home. It will be operated by a veterinarian who saved my pet's life. I wanted to give something back.
Possible? I would really appreciate simple examples.

Your only option is to keep polling the server for "instructions" using JavaScript. You can't send anything to the end user's browser on your own initiative, whether with curl or with wget.
Mainly, you'll have to set up a simple request/response protocol between the browser and the server.
If you want to go deeper, you can use something like cometd/meteord/etc. If not, a hidden iframe that reloads itself and receives pages with JavaScript code for the needed actions can do the trick.

Another alternative.
Use JavaScript polling and a single-character flat file: a simple flat file holding a single variable. Write the server side in Perl (it is faster and uses fewer resources than PHP). The script in the page reads the variable from the flat file on each poll and goes wherever the variable points. The controller writes to the flat file. Done.
I guess you could also rename an empty flat file and use its name as the control value. I am unsure which is faster: opening and reading a specific file, or hitting the directory and returning the file name. On the controller side, it is opening and writing to a file vs. renaming a file. Maybe they cancel each other out in resources and time?
This way the site can act as a normal site. When you want remote users to see a "presentation" (automatically being shown the site pages at the controller's pace), the controller activates polling and tells the viewers to push a start button. This allows a remote instructor to load pages for the viewers at his leisure.
It is a simple solution that works with nothing really sophisticated going on. No frames are needed either. You just need JavaScript enabled.
Any better suggestions are welcome!
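For what it's worth, here is a minimal sketch of the server side of the flat-file idea, in Python rather than Perl. The file name and the function names (set_current_page, get_current_page, poll_response) are made up for illustration. It assumes the page's JavaScript requests the poll result every few seconds and sets window.location whenever the answer is non-empty.

```python
import os
import tempfile

# Hypothetical location; the controller and the poll handler must agree on it.
STATE_FILE = os.path.join(tempfile.gettempdir(), "current_page.txt")


def set_current_page(url, path=STATE_FILE):
    """Controller side: record the page viewers should be on."""
    # Write to a temp file and rename it, so a poll never reads a half-written file.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(url)
    os.replace(tmp, path)


def get_current_page(path=STATE_FILE, default=""):
    """Poll side: return the page set by the controller, or default if none is set."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    except FileNotFoundError:
        return default


def poll_response(viewer_page, path=STATE_FILE):
    """Return the URL the viewer should load, or "" if the viewer should stay put."""
    target = get_current_page(path)
    return target if target and target != viewer_page else ""
```

When no presentation is running the controller can simply remove the file; every poll then answers "" and the site behaves like a normal site.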

It occurred to me that what you might want to use is HTML Push technology. Check out the wiki, they have several links. I have never used it myself.

Related

How to speed up Google Translate

I have a web page that has 70000 characters. As you know when doing translation through Google API you can only send up to 5000 characters at a time. Which means I have to send data to Google 14 times (70000/5000) which takes a lot of time and then my page is displayed. Is there a way to speed up the process?
Thanks
Have you tried caching the translation?
If you are using some AJAX framework (you don't mention what your web page is built with, e.g. C#), you can make the page appear faster by making the API call via AJAX.
It would look something like this (pseudo-code since we don't know what you are using):
Serve web page (almost instant)
Web page starts AJAX call:
    Break text into chunks
    Foreach chunk
        Translate via API
        Append to the page
This way the user will see the page immediately, and will also see the translation appear piece by piece as it is processed instead of having to wait until the end.
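The chunk-and-cache part could be sketched like this in Python. It is only an illustration under assumptions: chunk_text and translate_page are made-up names, `translate` stands in for whatever call you make to the translation API, and the cache is a plain dict. Chunks break after whitespace where possible, so a 70000 character page may need a few more than 14 requests.

```python
def chunk_text(text, limit=5000):
    """Split text into pieces of at most `limit` characters, breaking after whitespace if possible."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Back up to the last space or newline in the window so words stay whole.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1
        chunks.append(text[start:end])
        start = end
    return chunks


def translate_page(text, translate, cache, limit=5000):
    """Yield translated chunks one at a time, reusing cached results.

    `translate` is a stand-in for the real API call; `cache` is any dict-like store.
    """
    for chunk in chunk_text(text, limit):
        if chunk not in cache:
            cache[chunk] = translate(chunk)
        yield cache[chunk]
```

Because translate_page is a generator, each piece can be sent to the page as soon as it is ready, and a second visit to an unchanged page costs no API calls at all.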
My best bet would be to generate a page in one language, then ask Google to translate it through HTTP and display the result as your own, to make it seamless for the user. I believe that is what Google Chrome does when translating web pages.
Example of URL that makes Google translate the whole web page:
http://translate.google.com/translate?hl=en&sl=ru&tl=en&u=http%3A%2F%2Flinux.org.ru%2F
Of course, another option is to use Google Translate API and cache result if page content is not changing frequently.
Look at the JavaScript file that Google serves; it will also lead you to the CSS file. Save a copy of one or both (you may be able to fold the CSS into your own), and put the JavaScript in its own directory on your web site. Then write a snippet of code that updates your copy of the JavaScript every so many seconds or minutes. This will make the transition much faster, since you are only refreshing the content they give you. Have fun :) Ultimately you can also send the request for the text after character 5000 at the same time as the first one, which should be relatively easy to do.

Dynamic url shortening script for text input

We are looking for a script that automatically detects a URL as you type in a text input and shortens it before you press "submit". The shortening service used is http://yourls.org/
Have you tried implementing one yourself? Deploy the shortener to your own web site (it's written in PHP, as far as I can see from a cursory glance at the web site) and provide a simple Ajax endpoint which will dynamically perform a shortening conversion, then implement calls to that from the main page using JavaScript.
You might want to impose a reasonable delay to allow the user to finish typing, to avoid performing lots of unnecessary conversions of bogus URLs (which may require, e.g. writes to a file or database - I haven't looked at how the library referenced does things).
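Here is a rough Python sketch of what such an endpoint could do with the text it receives. All names are made up for illustration: `shorten` stands in for the call to your shortener, `seen` is a dict used to avoid shortening the same URL twice, and "http://sho.rt/" is a placeholder for your short domain. It assumes a URL only counts as finished once whitespace follows it, which is one way of not converting half-typed URLs.

```python
import re

# A URL is treated as finished only when it is followed by whitespace.
FINISHED_URL = re.compile(r"https?://\S+(?=\s)")


def shorten_finished_urls(text, shorten, seen, short_prefix="http://sho.rt/"):
    """Replace each completed URL in `text` with its short form.

    `shorten` is a stand-in for the real shortener call; `seen` caches results.
    """
    def replace(match):
        url = match.group(0)
        if url.startswith(short_prefix):
            return url  # already shortened on an earlier pass
        if url not in seen:
            seen[url] = shorten(url)
        return seen[url]

    return FINISHED_URL.sub(replace, text)
```

A URL at the very end of the input is left alone until the user types a space after it, so partial URLs never reach the shortener.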
I'm not sure what you're trying to achieve; if you create new shortened URLs for each substring before the user has finished typing the full URL, you will just fill your database with useless entries.
I don't see how shortening a URL before it's finished makes sense.
If you want to relieve the user from the arduous task of clicking the submit button, then initiate the submit using javascript (jQuery, or something). I'm not sure if that's what you want to do.
http://monkeytooth.net/2010/12/htaccess-php-how-to-wordpress-slugs/
This is a simple means of implementing the concept; it's a lot easier than one would think. Querying a DB, or using some other means of matching the slug/id with the one found in the URL, wouldn't be all too hard either. The linked article doesn't really go in depth on what to do next, but catching and breaking the URL apart is the essential step in making it work. I have personally used the method on several sites and it works like a charm.

Figure out if a website has restricted/password protected area

I have a big list of websites and I need to know if they have areas that are password protected.
I am thinking about doing this: downloading all of them with httrack and then writing a script that looks for keywords like "Log In" and "401 Unauthorized". But the problem is that these websites are all different, some static and some dynamic (html, cgi, php, java-applets...), and most of them won't use the same keywords...
Do you have any better ideas?
Thanks a lot!
Looking for password fields will get you so far, but won't help with sites that use HTTP authentication. Looking for 401s will help with HTTP authentication, but won't get you sites that don't use it, or ones that don't return 401. Looking for links like "log in" or "username" fields will get you some more.
I don't think that you'll be able to do this entirely automatically and be sure that you're actually detecting all the password-protected areas.
You'll probably want to take a library that is good at web automation, and write a little program yourself that reads the list of target sites from a file, checks each one, and writes the results to two files: one of "these are definitely passworded" and one of "these are not". Then you might want to manually check the ones that are not, and make modifications to your program to accommodate what you find. Using httrack is great for grabbing data, but it's not going to help with detection -- if you write your own "check for password protected area" program with a general purpose HLL, you can do more checks, and you can avoid generating more requests per site than would be necessary to determine that a password-protected area exists.
You may need to ignore robots.txt.
I recommend using the Python port of Perl's Mechanize, or whatever nice web automation library your preferred language has. Almost all modern languages will have a nice library for opening and searching through web pages, and looking at HTTP headers.
If you are not capable of writing this yourself, you're going to have a rather difficult time using httrack or wget or similar and then searching through responses.
Look for forms with password fields.
You may need to scrape the site to find the login page. Look for links with phrases like "log in", "login", "sign in", "signin", or scrape the whole site (needless to say, be careful here).
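A small sketch of that check using only Python's standard html.parser module. The class and function names (LoginHintParser, scan_page) are made up for illustration, and the phrase list is just the one above; it works on HTML you have already downloaded by whatever means.

```python
from html.parser import HTMLParser

LOGIN_PHRASES = ("log in", "login", "sign in", "signin")


class LoginHintParser(HTMLParser):
    """Collects two hints: password inputs, and links whose text looks like a login link."""

    def __init__(self):
        super().__init__()
        self.has_password_field = False
        self.login_links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "password":
            self.has_password_field = True
        elif tag == "a":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Lowercase and collapse whitespace before matching phrases.
            label = " ".join("".join(self._text).lower().split())
            if any(phrase in label for phrase in LOGIN_PHRASES):
                self.login_links.append(self._href)
            self._href = None


def scan_page(html):
    """Return (has_password_field, login_links) for one page of HTML."""
    parser = LoginHintParser()
    parser.feed(html)
    parser.close()
    return parser.has_password_field, parser.login_links
```

If a page has no password field but does have login links, fetching just those links and scanning them again is much cheaper than scraping the whole site.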
I would use httrack with several limits and then search the downloaded files for password fields.
Typically, a login form could be found within two links of the home page. Almost all ecommerce sites, web apps, etc. have login forms that are accessed just by clicking on one link on the home page, but another layer or even two of depth would almost guarantee that you didn't miss any.
I would also limit the speed at which httrack downloads, tell it not to download any non-HTML files, and prevent it from downloading external links. I'd also limit the number of simultaneous connections to the site to 2 or even 1. This should work for just about all of the sites you are looking at, and it should keep you off the hosts.deny list.
You could just use wget and do something like:
wget -A html,php,jsp,htm -S -r -o output_yoursite.txt http://www.yoursite.com
This will cause wget to download the entire site recursively, but only files whose names end in one of the suffixes given to the -A option, which in this case avoids heavy files.
The server headers will be written to the file output_yoursite.txt (wget prints headers to its log, not to standard output, hence -o rather than a shell redirect). You can then parse that file for the status code 401, which means that part of the site requires authentication, and also parse the downloaded files according to Konrad's recommendation.
Looking for 401 codes won't reliably catch them as sites might not produce links to anything you don't have privileges for. That is, until you are logged in, it won't show you anything you need to log in for. OTOH some sites (ones with all static content for example) manage to pop a login dialog box for some pages so looking for password input tags would also miss stuff.
My advice: find a spider program that you can get source for, add in whatever tests (plural) you plan on using and make it stop on the first positive result. Look for a spider that can be throttled way back, can ignore non-HTML files (maybe by making HEAD requests and looking at the MIME type) and can work with more than one site independently and simultaneously.
You might try using cURL and just attempting to connect to each site in turn (possibly put them in a text file and read each line, try to connect, repeat).
You can set up one of the callbacks to check the HTTP response code and do whatever you need from there.
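The same idea sketched with Python's standard urllib instead of cURL, so there is no callback: the status code comes back as a return value. The function names (http_status, needs_http_auth, check_sites) are made up for illustration, and the `opener` parameter exists only so the network call can be swapped out.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if the connection failed."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the status code is on the exception.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def needs_http_auth(url, opener=urllib.request.urlopen):
    """True if the server answers 401, i.e. the URL is behind HTTP authentication."""
    return http_status(url, opener) == 401


def check_sites(lines, opener=urllib.request.urlopen):
    """Map each non-empty line (one URL per line) to its status code."""
    urls = (line.strip() for line in lines)
    return {url: http_status(url, opener) for url in urls if url}
```

Typical use would be check_sites(open("sites.txt")), where sites.txt is a hypothetical file with one URL per line. This only catches HTTP authentication; form-based logins still need a look at the page itself.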

Redirection Before Page Load

I have an injection script--a start script--whose ultimate goal is to redirect to a different URL. That injection script needs to access the extension settings, so it sends a message to a global HTML file. That global file checks the settings and redirects to the appropriate URL by setting the safari.application.activeBrowserWindow.activeTab.url property.
What I'm finding is that all too often, the interim page loads first making for an annoying UX at best and introducing errors at worst. I'm assuming that this is a result of the asynchronous nature of messaging, but I haven't been able to find a way to stop it.
Is there any way to prevent the default behavior (loading the originally requested page) while still reading from extension settings?
Thanks.
It looks like this simply isn't possible given the current state of the Safari extension API.

Automate adding entries to a wiki

Once I have my renamed files I need to add them to my project's wiki page. This is a fairly repetitive manual task, so I guess I could script it but I don't know where to start.
The process is:
Go to the appropriate page on the wiki
for each team member (DeveloperA, DeveloperB, DeveloperC)
{
    for each of two files ('*_current.jpg', '*_lastweek.jpg')
    {
        Select 'Attach' link on page
        Select the 'manage' link next to the file to be updated
        Click 'Browse' button
        Browse to the relevant file (which has the same name as the previous version)
        Click 'Upload file' button
    }
}
Not necessarily looking for the full solution as I'd like to give it a go myself.
Where to begin? What language could I use to do this and how difficult would it be?
Check if the wiki you mean to talk to supports XMLRPC, because if it does it should be a snap. I wrote a tool called WikiUp to solve a similar problem (updating a delineated section on a wiki page).
If you're writing in C#, the WebClient classes might be a good place to start. I bet people could give more specific advice if you mentioned which wiki platform you are using, and whether it requires authentication, though.
I'd probably start by downloading fiddler and watching the http requests from doing it manually. Then you could use some simple scripts and regexes to build your http requests for automating the process.
Of course, if you're wildly lucky, your wiki would have a backend simple enough that you could just plug them into its db directly. :)
You might find CoScripter useful -- it's a Firefox extension that allows you to automate tasks you perform on websites. I'm not certain how you'd integrate this with the list of files you're changing on your local system, but it can certainly handle the file uploading through a web form.
A better bet is probably using cURL or a similar HTTP library with your programming language of choice. If you're on *nix, you can use the cURL command-line program inside your shell script to get this done fairly easily. (As jsight said, you will need to analyze the actual forms you're using on the webpage, using Fiddler or just looking at the form elements, and re-create the POST through cURL.)
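Whichever HTTP tool you end up with, the loop from the question can be sketched in Python like this. Big assumptions here: that the '*' in the file names is the team member's name (e.g. DeveloperA_current.jpg), and that `upload` is a function you write yourself to re-create the wiki's upload POST once you have seen it in Fiddler. The names files_to_upload and upload_all are made up for illustration.

```python
import os


def files_to_upload(directory, members, suffixes=("_current.jpg", "_lastweek.jpg")):
    """List (member, path) pairs for every expected file that exists in directory.

    Assumes files are named <member><suffix>, which the question does not confirm.
    """
    found = []
    for member in members:
        for suffix in suffixes:
            path = os.path.join(directory, member + suffix)
            if os.path.isfile(path):
                found.append((member, path))
    return found


def upload_all(directory, members, upload):
    """Call upload(member, path) for each file; `upload` stands in for the wiki-specific POST."""
    return [upload(member, path) for member, path in files_to_upload(directory, members)]
```

Keeping the wiki-specific part behind `upload` means the loop can be tried out with a function that just prints what it would send, before pointing it at the real wiki.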