Running MTurk HITs on external website - mechanicalturk

I am implementing a website on which the recruited MTurk workers will perform tasks. I plan to recruit workers using MTurk tasks, using which I will redirect them to an external website for actual work. I have the following questions relating to this plan.
Is there any foreseeable problems with this approach of running HITs? If so, how can we mitigate them?
how should I implement the authentication procedure on my external site? For example, how can I make sure the people who come to the website to perform a specific task are indeed the same group of people recruited earlier for this particular task on MTurk?
when the workers finish the task, how should I integrate the payment procedure with MTurk based on their performance? For example, say worker is owed $3 after finishing the task on my external site, is it possible for me to tell MTurk to pay him/her this amount programmatically?
The external site will be built using Python, if such detail matters.
Any suggestions and comments based on your experiences and insights in using MTurk would be much appreciated!

I am thinking through this for a similar project of mine. I've experimented as a worker myself. Here is my plan, I hope it is of use to you. (I have not implemented it yet. It is based on an academic HIT I participated in as a worker.) Here goes:
A. Create a template that has language something like:
1. Please open this web site in a new browser window:
http://your-url.xyz.blah/tasks/${token}
2. Read and follow the instructions there.
3. After completing the task, you will receive a confirmation code. Paste
it here: [________]
B. Create some random tokens for your Mechnical Turk data file:
1A1B43B327015141
09F49F2D47823E0C
B5C49A18B3DB56F4
4E93BB63B0938728
CCE7FA60BFEB3198
...
(Generate these tokens from your app; it needs to cross-reference them.)
C. Your app extracts the token from URL, looks up the task, and does whatever it needs to do. I personally don't worry about people stumbling onto a URL, since it is a one-time use token.
D. After a user completes the task on the external web site, the external app gives a confirmation code. The confirmation code should be random and opaque. Only your application will know if any particular code corresponds to a correct or incorrect answer. In fact, if you want, the correctness may not even be determined in real time -- it could be the result of an aggregation and/or comparison across multiple submissions.
E. Write some code to interact programmatically. Take the token and confirmation code supplied from the MTurk result and make sure they match with your external app. If they don't match, reject the HIT. If they match, check the correctness in your external app and approve or reject. You might consider a bonus pay structure.
So, to answer your particular questions:
I don't anticipate problems with the approach I described. That said, Mechanical Turk is both an art and a science. Perhaps more art. Writing good questions and paying Turkers appropriately is something you have to figure out with a combination of common sense, market research, and experimentation.
See (C) above. A token is designed to only be used once. Use long enough tokens and the probability of collision becomes very low.
See (E) above. The Mechanical Turk Developer Guide is a good place to start.
Please share your results back. Or have the Turkers send StackOverflow hundreds of postcards. :)
Notes:
I'm currently exploring qualification tests. I suspect they can be very useful.
I want to get a Turker's Worker ID in my external application, but I haven't figured that part out yet. I'm reading up on it; for example: Getting workerId by assignmentId
I am thinking about using the ExternalQuestion feature from the API: "... you can host the questions on your own web site using an "external" question. ... A HIT with an external question displays a web page from your web site in a frame in the Worker's web browser. Your web page displays a form for the Worker to fill out and submit. The Worker submits results using your form, and your form submits the results back to Mechanical Turk. Using your web site to display the form gives your web site control over how the question appears and how answers are collected."

You might also find PsiTurk to be useful: "PsiTurk is an open platform for conducting custom behvioral experiments on Amazon's Mechanical Turk. ... It is intended to provide most of the backend machinery necessary to run your experiment. It uses AMT's External Question HIT type, meaning that you can collect data using any website. As long as you can turn your experiment into a website, you can run it with PsiTurk!"

Related

CRUD with single drag/drop or other action via API?

This is my first post/question. If I missed an existing thread that answers my question, I missed that thread in my search and definitely appreciate you linking me! Please let me know if I should be posting/asking this elsewhere....
My question relates to Salesforce.
I have a use case where a client has a monthly batch of files that need to be made available on various cloud-based storage/distribution platforms like Box and Dropbox but also other less ubiquitous tools specific to the sector. Currently, the client is logging into each distribution platform, one-at-a-time, and uploading the files; then, if at any point any files need to be updated or removed/restricted, the client logs into each platform one-by-one and takes the necessary action. Obviously the process being described is tedious/laborious and leaves multiple gaps for error. The client and I are discussing a solution that would allow for create/read/update/delete actions in all of the distribution platforms without having to leave their Salesforce org. I am aware of existing AppExchange integrations for Box, Dropbox, etc. but they don’t quite do everything we need (to my knowledge)—they tie-in nicely and there are use cases where they are powerful tools...but—my understanding of those existing integration is that they would still each require dedicated tabs within the Object and repeated ‘drags’ and ‘drops’ of the same files to each tab. Again, the end goal here is that, for example, the client wants to drag and drop one time and have it pushed to the various platforms, etc. Or another example is they would like to choose "delete" one time from within Salesforce and have the file removed/restricted on all distribution platforms.
I am a certified SF Admin 1, so...perhaps this should be in my wheelhouse but...I feel unsure how to approach. My feeling is this is asking for a combination of integrations via API and Process/Flow work, but I am hoping for some ideas/input/guidance. Any insight or help any of you have to offer would be so greatly appreciated!!
Thanks so much!

Using Magento as the main, and creating a single sign on to integrate with other third party software

This has been something I have been trying to work on for a good long time. It first started with Prestashop as an integration with other scripts or pieces of the puzzle I needed to make for an overall website. I am currently still using Prestashop as my webstore but have since switched to Magento.
I switched to Magento because of it's complex flexibility and because overall I think it is the best solution, best backing and best overall eCommerce script to go with.
That being said, the same issues I was having with Prestashop appear to be the same I will continue to have any in aspect that I try to integrate things together in perfect harmony.
I have Magento setup, as the main portion of the website, and inside Magento in sub folders I have Wordpress installed in a folder called "articles" and I have also went with FluxBB as my message forums because of it's simplicity in not having a crap load of bloated extra features that I could care less about and that is in a sub folder called "forums".
From this point, we know that Magento, Wordpress and FluxBB all have their own way of managing users; creating, managing, and tracking them.
What I am wanting to do is find the best way to fit these three and more together for my website to make the experience for the customer as smooth and as functional as possible. After emailing the ever talented and helpful Alan Storm, he told me the best solution he was aware of working was to make a third party user management that they all point to and it manages the customers authentication. I do believe his thoughts may be the best but I wanted to put this out there here on StackOverFlow and I may post this on Magento as well to get the broad scrope of magento developers and smart guys that like challenges.
I have several thoughts, none may work, some may work half ass, or one may just be something workable. But first let me tell you what I have accomplished so far. I have done the necessary steps to integrate my overall design for the header and footer, so essentially Wordpress and FluxBB are wrapped and are contained inside Magento's outer design layer. So with that being said I have also made it where Magento will check the session to see if the user is logged in to Magento or not by saying "Hello Guest" or "Hello User". This is where I have hit a stopping point because I am out of my depth and would like assistance, whether it is something we create together out of pure challengeness or someone says if I pay them they will help me, either way I would like this accomplished. If and when I get the code figured out whether by means of paying for assistance of a group effort I would like to make it freely available for others to use the concept for their own projects.
Brain Fart #1:
Adjust the user tables for both Wordpress and FluxBB to conform more to the structure of Magento, as for the password and username/email login portion. The rest of the fields can respectively stay as they are for post counts, and etc.
From there, I would like to figure out which class in Magento does the actual input into the database when a customer is created out of registration. When I find that code, I would like to extend upon it the ability to copy the user credentials into the other two tables in the database for Wordpress and FluxBB. If necessary it can just be an added couple of fields to Wordpress and FluxBB if that seems like a better idea and yes I do mean the actual encrypted password that Magento creates, I want this to be secure as well.
From there, when we know that a customer registers with Magento the data is copied over to the other two tables then we at least have made progress, whether this progress will actually work, is still to be determined.
We then disable the login/logout and registration links in any way that we can from Wordpress and FluxBB because they will no longer be needed because we want the user to register, login and logout through one location which is Magento.
Then comes the fun part in my eyes, keep the damn session going throughout the entire website as they order products, review wordpress articles and possibly leave comments, send to friends and etc.... as well as post topics, replies and etc in the FluxBB capacity.
To me this is where the creating the fields or adding the data from Magento's customer registration comes into play, I can make it check to see if they are logged into Magento already and from there we may be able to have it validate itself. This may be over kill or this may just be how it needs to be done. But to me if the credentials are located in all three databases then they should be able to be validated by changing the code in Wordpress and FluxBB or adding code. And Yes I am aware that we will also have to do something about Profile Editing and Password Editing if a customer so desires to change their information.
But that is my first thought on this whether it is the right decision or not, I would like hear from the vast knowledge of people here who have more experience and knowledge than I get with Magento, PHP and everything else.
Brain Fart #2
This illogical idea seems like an outside stretch entirely to me because of the complexity of Magento and how it is overall setup.
But the idea is to remove/edit the Wordpress and FluxBB (and any other third party software) to pretty much ignore it's own method of registration, login, logout, edit and look to Magento for it's credentials and establishing new customers. Essentially making them an oversized module of Magento.
I just know that the way Magento is setup is to be modulerized and its complexity seems like it would take a lot more coding and troubleshooting to do this.
Brain Fart #3
Dump both Wordpress and FluxBB and look towards modules in the Magento Connection Store that pretty much has all of the functionality that I need and can add to them what is missing and not mess with trying to integrate third party software.
I love Wordpress, I think replicating it with a module, at least after the hours I have spent looking at all of the modules available that are CMS/News related is a tough call. FluxBB I could take it or leave it, if someone had an already viable solution to use phpBB or vBulletin or SimpleMachines I would go with them. I rather it be free open source software, not because I am a cheap skate but just because I support open source as much as I can.
Brain Fart #4
Can this be a cookie this, but would only be effective if they allow cookies, or could somehow addon to the session to allow things to pass through but Magento sets up different sessions or allows you too so they things to crash against each other so this may not at all be an idea or may be one as well.
I know I am not giving examples of things I have tried, files I have looked at or anything related to that and I apologize, I provide some links related but nothing specifically found so far that matches what I am trying to accomplish. And I have tried to merge things together with some fun disastrous results.
Link Examples?:
http://www.magentocommerce.com/wiki/doc/webservices-api/api/customer#customer.create
http://www.magentogarden.com/blog/how-are-passwords-encrypted-in-magento.html
http://www.nicksays.co.uk/magento_events_cheat_sheet/
http://www.magentocommerce.com/wiki/5_-_modules_and_development/customers_and_accounts/registration_fields
How to access Magento customer's session from outside Magento?
Any assistance with this would be nice, I am trying to work on several parts of the website at once and this one is troublesome and I would say that everyone is going to find it hard or have found it hard. Anyone like challenges? :)
--------- EDIT:
I have got Magento and Wordpress to work perfectly together with James Kemp's module found on CodeCanyon's website (Single Sign-On for Magento and Wordpress) and I am going to adapt it to work for FluxBB or anything else I do.
Just passing along the information... I see this was edited, don't know what was edited and don't care. Just passing along information I have since found since posting this.
I am managing/customizing a combo of magento+vanilla forums+a custom app made in Yii framework. The users are "shared" between the apps. None of the two links are good. As Alan already replied to you, the correct SSO will be with an external user database/manager. But well, not everyone is up to recoding three apps just to get 1 post a week forum and 1 article a month blog to work with magento. So we are left with less options. First of all, if you don't want (most probably not) to rewrite a good portion of already written open source project that is being updated and maintained and then maintain your changes against periodical updates (you want them), then you have to duplicate the user data over three databases. Unless the project you adapt has some way to manage users data as plugin or external module. AFAIK both of your choice don't.
So, how to implement it? Assuming you choose Magento as mother-of-all, you need it to export an API for authentication, which may work over browser using cookies and javascript but this is rather tricky, or you can use it's frontend cookie to validate the sessions doing server-server API requests from children apps. This is a preferred option as far as "classical" SSO goes. Technically, what should happen when your users open forum or blog, the respective apps detect magento's cookie and check if the session is valid and who is the user. If the user is found, his data is copied to the blog or forum tables. Then you need to start an authenticated session on blog or forum app using the newly created user record.
So far so good, but yet some work. you need to disable the user profiles management in the children apps or modify it so the data held in Magento is always the correct one and you need to invent something to synchronize the Magento's representation of user profile down to the children. This is better to be hooked up on Magento's events so every time a user changes his profile the data is updated in the children app. But there is another but too. You probably want to keep some data app specific, a display name on the forum is not necessary the FirstName+LastName from the Magento and some would like to keep it private.
The above is just what I can recall as interesting facts about keeping it running. There are certainly many other things I've left out, more or less specific. But hopefully my comment can help your brain farting.
We've tried to evaluate other options but anything without duplicate data seems to be too expensive to implement or to maintain. Maybe later. With budget and time.

How to make a Tag cloud app that post on a website?

I want to make an app where the users can post messages that will be displayed on a website. The users would need to create a username and password to be able to post.
The app would be like a twitter, but only be able to post through the app and read the last few posts and not be able to write private messages.
The website would function like a huge cloud of thoughts where everyone could go and read what others have written. Once the post hit the cloud, they can't be deleted. Only me could delete posts.
All posts would have different color and font size, it would look like a huge tag cloud on the website.
How do I make an app and a website like this?
David H
The tutorial application for Google Application Engine is an unstyled version of what you describe. They'll even host it for you for free (up to a non-trivial level of usage).
The tag cloud creation is not so very hard but without knowing your preferred language it is hard to point you to helpful libraries (there are plenty out there).
Getting people to use it will be the hard part.
added in response to comment:
Good luck on your endeavor. I would be surprised if you weren't able to learn everything you need to know and have a working web app by the time school starts. I found a simple stand alone web cloud creation library that explains what it does and will run on GAE. So now even that part is in place for you.
I'm tempted to make some pathetic reference to the sorts of computing that I did prior to high school, but I expect that you probably have SD data cards have more computational power than I had available to me. Kids these days! ;)

Can you run your own reCaptcha service, in your own web app?

Is the backend used by reCaptcha open source? Is it a simple web app that can be deployed in a given container?
Thanks,
LES
It's a web service. It is supplied by a third party.
You can integrate it into your application, but as far as the source code goes, no. Its value is not in the source code but in the images that are supplied. They're not randomly generated but come from books from those parts an OCR system failed to process. So by solving reCaptcha people are actually helping scan books. Somebody takes care of the scanning process and supplied a constant flow of new challenges. Hard to beat.
Running reCaptcha on your own server would be very cumbersome, as it requires a constant supply of image data (scanned books) to work. Also it would kind of beat a part of the purpose, that is digitizing books for the common good. Besides, I don't think it's even available.
This should be able to answer all of your questions for you: recaptcha

How would you go about making an application that automatically retrieves your bank account balance twice a day?

I'm building a utility that will hopefully keep my wife in tune with how much money we have available.
I need a simple secure way of logging into my bank account and retrieving the balance.
Something like mechanize is the only method I can think of. I'm not even sure if that would work given the properly authenticated https that banks use.
Any ideas?
Write a perl script using LWP::UserAgent. It supports HTTPS connections. The only issue might be if the site requires javascript.
Web Client Programming with Perl has a few examples to get you started if you're not too familiar with perl.
If you really want to go there, get these extensions for Firefox: Live HTTP Headers, Firebug, FireCookie, and HttpFox. Also download cURL and a scripting language that can run cURL command-line tasks (or a scripting language like PHP or Perl that has access to cURL libraries directly).
I've started down this road for some idempotent GET tasks like getting PDFs of the S&P reports (of the stocks I track) from my online brokerage, and downloading the check images for my bank account. Both tasks are repetitive and slow ways of downloading data to my computer that the financial institutions don't provide any way of making it easier.
Here's why you shouldn't: (as a shortcut I'm going to call the archetypal large bank, brokerage, or other financial institution "BloatBank")
BloatBank is not likely to make public their API for accessing this kind of information. So it can change any time and all your hard work will be for naught. Whenever they change their mechanism, you'll have to adapt.
If BloatBank finds out you've been using automatic scripting to try to access your account information, they may ban you because you've violated their terms of service.
You might screw up, and the interaction between the hodgepodge of scripts on BloatBank's server, and your scripts that access your account, might cause a Bad Thing like closing your account. Testing this kind of script is tremendously difficult because you don't have any documentation about how their online service works, and you don't have a test account you can mess with.
(a variant of the above) You think you're safe because you're issuing GET requests. But BloatBank is just a crazy bank that doesn't know anything about REST, so there are some GET requests that can mess up your account.
If someone else does use your script to maliciously sniff your online password or mess with your account, any liability coverage from BloatBank may disappear because you've opened a security hole.
Why don't you teach your wife how to login to the bank herself? Or use Quicken (or Mint, etc) and teach her how to use the auto-download feature?
Have you checked out Watir? It is fantastic for automating web-browser actions. And since it's written in Ruby, you can take the results and store them in a DB (or email them to yourself) if needed.
If you are open to AIR, I'd say build an AIR app. I have worked with mechanize and I think it's cool. AIR gives you similar features with a richer GUI (see HTMLLoader and DOM manipulation of webpage).
If I were you, I'd simply pull the page and manipulate the DOM to suit my visual needs.
Please, if you find this easy to do for your bank please post your bank's name. If I have the same one I'll be closing my account.
More to your question. The process of loading a web page inside of your code rather than in a browser can be a black art, especially if their is any javascript involved. Your best bet would probably be embedding the IE Web Browser control in your app and then simulating key strokes and mouse clicks to arrive at your balance page. Then scrape the HTML for the balance.
I could try paying for Quicken and letting it do the balance downloading. Then I'd just need to find a way to get the number out of the software automatically.
This way I'm not violating any terms of service and I'm also reducing security risk since all "hacking" goes on locally.