Can I validate input on MTurk?

I've written a HIT on MTurk asking people for domain suggestions. Is there any way to ensure that the domain has valid syntax at the time of entry or submission?

So it turns out you can embed an iframe within the HIT. This allowed me to embed a form which I could then validate in any way I pleased. The one catch is that it requires the worker to copy the result of the form into the HIT's own form.
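For reference, here's a minimal sketch of the kind of client-side check such an embedded form could run. The element IDs are made up for illustration, and the regex is a deliberately loose syntax test, not full RFC validation:

    // Loose domain-syntax check: dot-separated labels plus a 2+ letter TLD.
    function isValidDomain(value) {
      return /^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*\.[a-z]{2,}$/i.test(value);
    }

    // "domain-form" and "domain" are hypothetical IDs in the iframe's form.
    document.getElementById('domain-form').addEventListener('submit', function (e) {
      var domain = document.getElementById('domain').value.trim();
      if (!isValidDomain(domain)) {
        e.preventDefault();  // block submission until the syntax looks sane
        alert('Please enter a valid domain, e.g. example.com');
      }
    });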

I think to do this the 'proper way' (i.e. no need to copy-paste) you'd need to use an ExternalQuestion. This can be done via either the API (various languages) or the command-line client.

Related

Can we use Yii flash messages in this scenario?

I haven't seen this scenario covered here:
Yii Framework: How to work with Flash Messages.
So, after user registration, I wish to redirect the user to a thank-you page where they can read more about what they should do and what will happen next. It's a fair amount of information, so adding that message to an already existing page is not an option, because the page would get too noisy. A temporarily displayed message isn't an option either, because there is too much text to read in a short window.
On cases like this:
Should we still use flash messages, with a conditional so that what normally exists on the page stays hidden while a success flash message is displayed?
OR
Should we simply redirect to a dedicated thank-you view (by creating the respective thankyou action)?
Is there a better option?
You could use a flash message, but those are really meant for short notices like "Your account is now created".
If you want to include a good amount of information, I think it's best to have a separate thankyou action/view that people are redirected to after the sign-up process is complete.

"Anti-XSS protection" by adding )]}' before ajax response

Google Plus returns AJAX responses with )]}' on the first line. I've heard it is protection against XSS. Are there any examples of what anyone could do, and how, without that protection?
Here's my best guess as to what's happening here.
First off, there are other aspects of the Google JSON format that aren't quite valid JSON. So, in addition to any protective purpose, they may be using this specific string to signal that the rest of the file is in Google's JSON format and needs to be interpreted accordingly.
Using this convention also means that the data feed won't execute when pulled in via a script tag, nor when the JavaScript is interpreted directly by an eval(). This ensures front-end developers are passing the content through a parser, which will keep any implanted code from executing.
So to answer your question, there are two plausible attacks that this prevents: one cross-site, through a script tag, but the more interesting one is within-site. Both attacks assume that:
a bug exists in how user data is escaped and
it is exploited in a way that allows an attacker to inject code into one of the data feeds.
As a simple example, let's say a user figured out how to take a string like example in the feed
["example"]
and change it to "];alert('example'); so that the feed becomes
[""];alert('example');"]
Now when that data shows up in another user's feed, the attacker can execute arbitrary code in that user's browser. Since it's within-site, cookies are being sent to the server, and the attacker could automate things like sharing posts or messaging people from the victim's account.
In the Google scenario, these attacks won't work, for a number of reasons. The leading )]}' characters will cause a JavaScript error before the attack code is run. Plus, since developers are forced to parse the response instead of accidentally running it through an eval(), this practice prevents injected code from executing anyway.
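To make that concrete, here's a sketch of how a same-origin consumer strips the prefix before parsing, and why a cross-origin script tag gets nothing. The /feed URL is made up; the prefix matches the one in the question:

    var PREFIX = ")]}'";

    // Same-origin request: cookies go along, and we can read the raw body.
    fetch('/feed')
      .then(function (res) { return res.text(); })
      .then(function (body) {
        if (body.indexOf(PREFIX) === 0) {
          body = body.slice(PREFIX.length);  // strip the anti-XSSI junk line
        }
        var data = JSON.parse(body);         // parse, never eval()
        console.log(data);
      });

    // An attacker's page doing <script src="https://victim.example/feed"></script>
    // can't read the body; it can only execute it, and the )]}' line throws a
    // SyntaxError before anything else runs.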
As others have said, it's protection against Cross-Site Script Inclusion (XSSI).
We explained this on Gruyere as:
Third, you should make sure that the script is not executable. The standard way of doing this is to append some non-executable prefix to it, like ])}while(1);. A script running in the same domain can read the contents of the response and strip out the prefix, but scripts running in other domains can't.

Dynamic url shortening script for text input

We are looking for a script which automatically detects a URL as you type it into a text input and shortens it before you press "submit". The shortening service used is http://yourls.org/.
Have you tried implementing one yourself? Deploy the shortener on your own web site (it's written in PHP, as far as I can tell from a cursory glance at its site) and provide a simple Ajax endpoint which dynamically performs the shortening conversion, then call that endpoint from the main page using JavaScript.
You might want to impose a reasonable delay to allow the user to finish typing, to avoid performing lots of unnecessary conversions of bogus URLs (each of which may require, e.g., a write to a file or database - I haven't looked at how the referenced library does things).
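A rough sketch of the client side of that, assuming a hypothetical /shorten endpoint you've wrapped around your YOURLS install (the element ID and delay are arbitrary):

    var timer = null;
    var input = document.getElementById('url-input');  // made-up element ID

    input.addEventListener('input', function () {
      clearTimeout(timer);                              // restart the quiet period
      timer = setTimeout(function () {
        var url = input.value.trim();
        if (!/^https?:\/\/\S+\.\S+/.test(url)) return;  // only shorten plausible URLs
        fetch('/shorten?url=' + encodeURIComponent(url))
          .then(function (res) { return res.text(); })
          .then(function (shortUrl) { input.value = shortUrl; });
      }, 500);                                          // wait for typing to stop
    });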
I'm not sure what you're trying to achieve; if you create a new shortened URL for each substring typed before the user has finished the full URL, you will just bloat your database.
I don't see how shortening a URL before it's finished makes sense.
If you want to relieve the user of the arduous task of clicking the submit button, then initiate the submit using JavaScript (jQuery, or something similar). I'm not sure if that's what you want to do.
http://monkeytooth.net/2010/12/htaccess-php-how-to-wordpress-slugs/
That article shows a simple means of implementing the concept; it's a lot easier than one would think. Querying a DB (or some other means of matching the slug/id with the one found in the URL) wouldn't be too hard either. The linked article doesn't really go in depth on what to do next, but catching and breaking the URL apart is the essential part of making it work. I have personally used the method on several sites and it works like a charm for me and the sites it was used on.

How to Verify whether a Robot is Entering Information

I have a web form which users fill in; the info is sent to the server and stored in a database. I am worried that robots might just fill in the form, leaving me with a database full of useless records. How can I prevent robots from filling in my forms? I am thinking of something like Stack Overflow's robot detection, where if it thinks you are a robot, it asks you to verify that you are not. Is there a server-side API in Perl, Java or PHP?
There are several solutions.
Use a CAPTCHA. SO uses reCAPTCHA as far as I know.
Add an extra field to your form and hide it with CSS (display: none). A normal user will not see this field and therefore will not fill it in. At submission time you check whether the field is empty; if it isn't, you are dealing with a robot that has carefully filled out all of the form fields. This technique is usually referred to as a "honeypot".
Add a JavaScript timer: at page load it starts a counter at zero, which then increases as time passes. A normal user reads and fills out your form for a while and only then submits it; a robot fills out and submits the form immediately upon receiving it. At submission you check how far the counter has moved from zero: a decent interval suggests a real user, while just a couple of seconds (or no value at all, because the robot didn't execute JavaScript) suggests a robot. This will only work if you decide to require your users to have JavaScript enabled for "write" operations. (A sketch combining this timer with the honeypot appears after this list.)
There are certainly other techniques, but these are quite simple and effective.
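As a concrete illustration, here is a minimal client-side sketch combining the honeypot and the timer. All field and element names are invented, and the decisive check must be repeated server-side, since a bot can skip your JavaScript entirely:

    // Assumed markup inside <form id="signup-form">:
    //   <input type="text" name="website" class="hp">   (hidden via CSS: .hp { display:none; })
    //   <input type="hidden" name="loaded_at">
    var form = document.getElementById('signup-form');
    form.elements['loaded_at'].value = Date.now();       // stamp the page-load time

    form.addEventListener('submit', function (e) {
      var tookMs = Date.now() - Number(form.elements['loaded_at'].value);
      // Honeypot filled, or form submitted inhumanly fast: treat as a bot.
      // The server should apply the same two tests to the posted values.
      if (form.elements['website'].value !== '' || tookMs < 3000) {
        e.preventDefault();
      }
    });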
You can use reCAPTCHA (same as stackoverflow) - they have libraries for a number of programming languages.
I've always preferred the honeypot CAPTCHA (article by Phil Haack), as it's less invasive to the user.
CAPTCHAs bring accessibility problems and will ultimately be defeated by software recognition.
I recommend reading this short article about bot traps, which include hidden fields, as Matthew Vines and New in town have already suggested.
In any case, you are still free to use both CAPTCHAs and bot traps.
CAPTCHA is great. The other thing you can do that will prevent 99% of your robot traffic, yet not annoy your users, is to validate fields.
On my site, I check for sensible text in fields like zip code and phone number. That has removed all of the non-targeted robot misinformation.
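For instance, a couple of loose checks of that sort (US-style formats assumed purely for illustration; repeat them server-side, since bots won't run your JavaScript):

    // Loose sanity checks, not strict validation.
    function looksLikeZip(value) {
      return /^\d{5}(-\d{4})?$/.test(value.trim());         // 12345 or 12345-6789
    }
    function looksLikePhone(value) {
      var digits = value.replace(/\D/g, '');                // strip punctuation
      return digits.length === 10 || digits.length === 11;  // e.g. 555-123-4567
    }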
You could create a two-step system in which a user fills in the form but then must reply to an e-mail to "activate" the record within a set period of time - say 24 hours.
In the back end, instead of populating your main table with every form submission, you could put them into a temporary table that automatically deletes any row older than your time allotment. Unless you have a serious bot problem, the table shouldn't get very big, especially if the first form has just a few fields.
A benefit of this approach is that you don't have to use a CAPTCHA or some other technology that might create accessibility problems.
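A rough sketch of that flow, kept in memory and Node/Express-flavoured purely for illustration (in practice the pending entries would live in the temporary table described above, and sendActivationMail is an assumed helper):

    var express = require('express');
    var crypto = require('crypto');
    var app = express();
    app.use(express.urlencoded({ extended: false }));

    var pending = new Map();  // token -> { form data, created timestamp }

    app.post('/register', function (req, res) {
      var token = crypto.randomBytes(16).toString('hex');
      pending.set(token, { data: req.body, created: Date.now() });
      // sendActivationMail(req.body.email, '/activate/' + token);  // assumed helper
      res.send('Check your e-mail to activate your account.');
    });

    app.get('/activate/:token', function (req, res) {
      var entry = pending.get(req.params.token);
      if (!entry) return res.status(410).send('Link expired or invalid.');
      pending.delete(req.params.token);
      // ...move entry.data into the real table here...
      res.send('Account activated.');
    });

    // Stand-in for the automatic row expiry: drop anything older than 24 hours.
    setInterval(function () {
      var cutoff = Date.now() - 24 * 60 * 60 * 1000;
      pending.forEach(function (v, k) { if (v.created < cutoff) pending.delete(k); });
    }, 60 * 60 * 1000);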

Refresh browser via cron(or not) to a different page on remote request?

I need to display pages in a tutorial fashion. I looked into NetSupport, BeamYourScreen and other possibilities, but I do not want the viewers to download anything. I cannot use GD / send screenshots, due to the audio/video instructions embedded in some of the pages.
Basically, I need the ability to "refresh" a user's browser window to a different page via an interface on my end - whether via a form submission, JavaScript or any other type of "controller" that allows me to change the page in the viewer's browser. Perl is preferred, but PHP/JavaScript - whatever works and is cross-browser. I set up a simple JavaScript page-forward timer that "works", but page load times and conversation interruptions are a huge factor.
The entire tutorial website will be developed around this ability.
I was looking into curl/cron/wget methods but found little information.
I have seen forum and chat scripts that perform a broadly similar task, but there must be a simple(ish) solution in lieu of hacking up another script to suit my needs.
I do not want others to be able to control the pages either. The site really only needs to be accessible during the tutorial; however, it could remain web-accessible as long as user interaction stayed normal except while being controlled.
The initial site concept is about instructing people how to properly introduce new pets into a home. It will be operated by a veterinarian who saved my pet's life; I wanted to give something back.
Possible? I really appreciate simple examples etc...
You have no other way but to keep polling the server for "instructions" using JavaScript. No, you can't push anything to the end user's browser with curl or wget.
Mainly, you'll have to set up a simple request/response protocol between the browser and the server.
If you want to go deeper, you can use something like cometd/meteord/etc. If not, a hidden iframe that reloads itself and receives pages containing JavaScript code for the needed actions can do the trick.
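A bare-bones version of that request/response loop, where the /instruction endpoint and its reply format (a path such as /step2.html) are assumptions:

    // Every few seconds, ask the server which page the class should be on.
    setInterval(function () {
      fetch('/instruction', { cache: 'no-store' })
        .then(function (res) { return res.text(); })
        .then(function (page) {
          page = page.trim();
          // Navigate only when the instructor points somewhere we aren't already.
          if (page && page !== window.location.pathname) {
            window.location.href = page;
          }
        });
    }, 3000);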
Another alternative.
With JavaScript polling and a single-character flatfile: keep a simple one-character flatfile holding a single variable, and write it in Perl (it is faster and uses fewer resources than PHP). The parent script polls for a JavaScript variable from the flatfile: it hits the flatfile and navigates wherever the variable points. The flatfile is written to by the controller. Done.
I guess you could also rename an empty flatfile and use that as the controller. I am unsure which is faster: opening and reading a specific file, or hitting the directory and returning the file name. On the controller side, it's opening and writing to a file versus renaming a file. Maybe they cancel each other out in resources and time?
This way the site can act as a normal site. When you want remote users to see a "presentation" (automatically being shown the site's pages at the controller's pace), the controller activates polling and tells the viewers to push a start button. This allows a remote instructor to load pages for the viewers at his leisure.
It is a simple solution with nothing really sophisticated going on. No frames are needed either; you just need JavaScript enabled.
Any better suggestions are welcome!
It occurred to me that what you might want to use is HTML push technology. Check out the wiki; it has several links. I have never used it myself, though.