How many times was a Mechanical Turk HIT returned? - mechanicalturk

I currently have a set of HITs uploaded on Mechanical Turk, and I realise that the uptake (the number of participants submitting these tasks) is very low. It has been 3 days since our HITs were uploaded, and I am wondering whether the low uptake has to do with the complexity of our HIT.
My question: is there a way to fetch the number of times a HIT was accepted by a participant and then returned (without completion or submission)? The current API data structure for a HIT does not include a field with this information: https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_HITDataStructureArticle.html
Any pointers greatly appreciated.

Related

User simulation with jmeter for stress and scalability testing

Hi there. I am trying to conduct stress and scalability testing for a web application. One of the problems is that it is not clear how many users the website can handle, so I am first running a stress test with different numbers of users. Consider the picture below:
In the picture, I start with 100 users and gradually increase the load. My requirements are, for example:
Response time should not exceed 7 seconds
Throughput should not fall below 35 requests per second
Errors should not exceed 10 percent of total requests
If the test with 100 users satisfies the requirements, I will continue with 150 users and so on, until the system breaks at least 2 of the 3 requirements. I will then perform the scalability test with that number of users. Is this approach right? If not, please advise how I should simulate users.
The approach is more or less right. However, you could save time and effort by finding the breaking point of the system with a single test: start with 1 user and gradually increase the load up to the maximum.
JMeter's theoretical limit for a single Thread Group is 2,147,483,647 threads. Your actual limit will be lower, as your hardware resources are most probably limited; see the article What’s the Max Number of Users You Can Test on JMeter? for more details.
In any case, if you put a reasonably high number of threads in the Thread Group and configure the load to increase gradually, you can run your test and pay attention to the following charts:
Active Threads Over Time - shows the number of active users
Response Times Over Time - shows the system response time
Transactions Per Second - shows the throughput (the number of requests per second)
In the first stage of the test, the response time will stay roughly the same and the throughput will grow as you increase the load. At a certain point the response time will start increasing and the throughput will start going down; that is the limit of your system (the so-called "bottleneck").
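Once a ramp-up test like this has produced per-load-level numbers, the "at least 2 of 3 requirements broken" rule from the question can be checked mechanically. The following is only a minimal sketch: the LoadSample structure, the function names, and the sample figures are hypothetical and are not part of JMeter. It assumes you have already exported the average response time, throughput, and error percentage for each user count.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LoadSample:
    users: int               # active threads at this load level
    response_time_s: float   # average response time in seconds
    throughput_rps: float    # requests per second
    error_pct: float         # failed requests as a percentage of all requests


# Thresholds taken from the question; adjust to your own requirements.
MAX_RESPONSE_TIME_S = 7.0
MIN_THROUGHPUT_RPS = 35.0
MAX_ERROR_PCT = 10.0


def violations(s: LoadSample) -> int:
    """Count how many of the three requirements this sample breaks."""
    return sum([
        s.response_time_s > MAX_RESPONSE_TIME_S,
        s.throughput_rps < MIN_THROUGHPUT_RPS,
        s.error_pct > MAX_ERROR_PCT,
    ])


def breaking_point(samples: Sequence[LoadSample],
                   min_violations: int = 2) -> Optional[int]:
    """Return the lowest user count that breaks at least min_violations rules."""
    for s in sorted(samples, key=lambda x: x.users):
        if violations(s) >= min_violations:
            return s.users
    return None  # the system never broke within the tested range


def peak_throughput_users(samples: Sequence[LoadSample]) -> int:
    """Return the user count at which throughput was highest."""
    return max(samples, key=lambda s: s.throughput_rps).users


# Made-up figures, for illustration only.
example = [
    LoadSample(100, 1.2, 80.0, 0.0),
    LoadSample(150, 1.3, 120.0, 0.5),
    LoadSample(200, 2.9, 140.0, 2.0),
    LoadSample(250, 7.5, 90.0, 6.0),
    LoadSample(300, 12.0, 30.0, 15.0),
]
print("breaking point:", breaking_point(example))
print("throughput peaks at:", peak_throughput_users(example))
```

The peak-throughput load level is a rough stand-in for the point where throughput stops growing while response time climbs; reading it off the charts listed above remains the more reliable check.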

How to get a list of the top 10 closest users by driving directions without incurring Google API costs

I have a database of 50,000 users with their addresses, and every time a new user is created they want to see the top 10 closest users by driving distance.
Google starts charging at 50 cents per 1000 requests. I'm looking for a better way to do this that would limit my costs.
Also, is there a better way than having to run the API against all 50,000 users every time a new user is added?
First, select the closest users by direct (straight-line) distance. You don't need paid functionality for that.
Let's say 20 candidates would be enough, because it's very unlikely that so many addresses will be closer by straight-line distance yet farther by driving route than the others.
In that case you need only 20 requests to find the users closest by route.
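A minimal sketch of that pre-filter is shown below. It assumes every address has already been geocoded once to a (latitude, longitude) pair stored with the user, which the question does not state; the function names and the dictionary layout are hypothetical. It uses the haversine great-circle formula for the straight-line distance and keeps the k nearest users as candidates for the paid driving-distance lookup.

```python
import heapq
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_candidates(new_user: Coord,
                       users: Dict[str, Coord],
                       k: int = 20) -> List[str]:
    """Return the ids of the k users closest to new_user, nearest first."""
    return heapq.nsmallest(
        k, users, key=lambda uid: haversine_km(new_user, users[uid]))


# Only these k candidates would then be sent to the driving-distance API.
users = {"a": (0.0, 1.0), "b": (0.0, 5.0), "c": (0.0, 2.0)}
print(nearest_candidates((0.0, 0.0), users, k=2))
```

Scanning 50,000 coordinates in memory like this is cheap. For much larger tables, a bounding-box query or a spatial index in the database could replace the full scan.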

What is the difference between parsing betting website for live scores vs official website API?

I want to monitor some live scores on soccer matches. I have 2 ways to do this:
official api from the website(free)
parse websites source code myself and get data from it( need to do it every second)
What is the difference? Is calling API faster?
This can depend on a lot of factors outside this specific scenario, but given the context, yes, the API would be much faster. The difference is in what data is being sent, received, and parsed.
In either scenario you'd need some timer to tick and parse the results (website or API), so there's no performance difference in the "wait code"; the big difference is in the data that is parsed. When you call the API, you will most likely send a specific parameter or call a specific function that indicates what you're looking for. Pseudo-code example:
SoccerSiteApi.GetValue(SCORE, team1, team2);
Or
SoccerSiteApi.GetCurrentScores(team1, team2);
By calling the API, you are only sending and receiving a few hundred bytes (or more depending on data) and getting back exactly what you want, that is, you don't need to parse the scores out of the values sent back since they are the scores, so no processing time is spent doing anything additional with the data itself.
If, however, you were to parse the entire website, you would need to make an HTTP GET request (and all that entails) to fetch the entire page (which could be a couple hundred KB or even MB depending on content), then spend processing time extracting the exact data you were looking for, and you would have to do this every second.
So the biggest difference is amount of data and time spent processing it.
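To make the comparison concrete, here is a small sketch with two made-up payloads: a JSON body such as an API might return, and an HTML page padded with unrelated markup. Neither the field names nor the markup come from any real site; they are assumptions for illustration only.

```python
import json
import re
from typing import Tuple

# Hypothetical API response: only the data that was asked for.
api_response = '{"home": "Team A", "away": "Team B", "score": [2, 1]}'

# Hypothetical web page: the same two numbers buried in unrelated markup.
html_page = (
    "<html><head><title>Live scores</title></head><body>"
    + "<div class='ad'>banner</div>" * 200
    + "<span class='team'>Team A</span><span class='score'>2</span>"
    + "<span class='score'>1</span><span class='team'>Team B</span>"
    + "</body></html>"
)


def score_from_api(body: str) -> Tuple[int, ...]:
    """Decode the JSON body; the score is already a structured field."""
    return tuple(json.loads(body)["score"])


def score_from_html(page: str) -> Tuple[int, ...]:
    """Search the whole page for score elements and convert them to ints."""
    found = re.findall(r"<span class='score'>(\d+)</span>", page)
    return tuple(int(x) for x in found[:2])


print(len(api_response), "bytes via API ->", score_from_api(api_response))
print(len(html_page), "bytes via page ->", score_from_html(html_page))
```

The scraping version also breaks as soon as the site changes its markup, whereas an API response format tends to be more stable.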
Hope that can help

Rate limits and max data points per upload

noob here using Arduino Wifi (Adafruit CC3000) to send data. Or try. I have read and understood about rate limits and a limit on number of data items uploaded per connection. But I searched and cannot find what the numbers are for those limits. One per minute? I have deduced by testing that the number of data items is two, and tried to send my 6 data points in three consecutive calls of two. That crashes and burns with assorted errors, so I suspect I'm hitting the rate limit.
P.S. Two datapoints per minute work, so API key, etc is not an issue.
Can anyone tell me what these limits are?
Thanks very much for your time.
If you have a (free) developer account, you have a maximum of 25 requests per minute. This includes requests from any front-end app pulling or sampling the data, as well as all GET, PUT, and socket connections.
Src: http://forums.electricimp.com/discussion/2108/max-frequency-of-updates-to-xively-wo-getting-booted/p1
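With a budget of 25 requests per minute, requests need to be spaced at least 60 / 25 = 2.4 seconds apart. The sketch below illustrates that pacing in Python rather than Arduino code. The batch size of two data points is the value the asker deduced by testing, not a documented limit, and the send callback is a hypothetical stand-in for the actual upload call.

```python
import time
from typing import Callable, List, Sequence

REQUESTS_PER_MINUTE = 25                      # limit quoted above (free developer account)
MIN_INTERVAL_S = 60.0 / REQUESTS_PER_MINUTE   # 2.4 seconds between requests


def chunk(points: Sequence, size: int) -> List[list]:
    """Split points into consecutive batches of at most size items."""
    return [list(points[i:i + size]) for i in range(0, len(points), size)]


def send_throttled(batches: Sequence[list],
                   send: Callable[[list], None],
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic,
                   min_interval: float = MIN_INTERVAL_S) -> None:
    """Send each batch, waiting so that sends are at least min_interval apart."""
    last_sent = None
    for batch in batches:
        if last_sent is not None:
            wait = min_interval - (clock() - last_sent)
            if wait > 0:
                sleep(wait)
        send(batch)
        last_sent = clock()


# Usage sketch: six data points, two per request (the batch size is an assumption).
# send_throttled(chunk([1, 2, 3, 4, 5, 6], 2), send=print)
```

Because the limit counts every request on the account, including any front-end app reading the data, 2.4 seconds is a lower bound; a larger interval leaves headroom.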

Bumping an Amazon Mechanical Turk HIT

We have a web-based game for two players, which we offer via Amazon Mechanical Turk. For each game we need two players who enter simultaneously, or at most 1 minute apart. We noticed that in the first few minutes after we publish the HIT we get many workers, because the HIT is on the first page of their search results, but later the rate drops as the HIT moves to a later page. So, in order to get enough simultaneous workers, we had to remove the HIT and open a new one.
Is it possible, instead of deleting a HIT and opening a new one, to somehow "bump" / "poke" the old HIT to make it appear new?
This is possible on many ad websites, where after you publish an ad you can bump it to the head of the ad list.
Using TurkPrime.com, it is possible to bump a HIT with the "Restart" feature. It closes the first HIT and opens a new HIT that excludes all workers who completed the first one. The effect is that the HIT is bumped to the top of Mechanical Turk when sorted by date.
With TurkPrime you use your own Amazon MTurk account, but they have an option for requesters with no MTurk account.
The only surefire way I know of to bump a HIT is to create different HITs of the same HIT Type and add HITs to that type. That bumps the creation date of the entire HIT Type for sorting purposes, which is what workers often search for. See my projects which use this approach:
https://github.com/HarvardEconCS/turkserver-meteor
https://github.com/HarvardEconCS/TurkServer
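As a rough illustration of adding a HIT to an existing HIT Type, the sketch below wraps the boto3 MTurk client call create_hit_with_hit_type. This is not how the projects above are implemented; the function name bump_hit_type and its default values are hypothetical, and whether a new HIT actually moves the HIT Type up in worker search results depends on Mechanical Turk's sorting, as described above.

```python
def bump_hit_type(client, hit_type_id, question_xml,
                  lifetime_s=3600, max_assignments=2):
    """Post a fresh HIT under an existing HIT Type and return its HITId.

    client is expected to be a boto3 MTurk client, for example:
        import boto3
        client = boto3.client("mturk", region_name="us-east-1")
    The default lifetime and assignment count are illustrative assumptions.
    """
    response = client.create_hit_with_hit_type(
        HITTypeId=hit_type_id,          # reuse the type so workers see one listing
        MaxAssignments=max_assignments,
        LifetimeInSeconds=lifetime_s,
        Question=question_xml,          # e.g. an ExternalQuestion XML document
    )
    return response["HIT"]["HITId"]
```

Passing the client in as a parameter keeps the function easy to try against the MTurk sandbox endpoint, or against a stub, before spending money on real HITs.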
The other way to get concurrent workers for your HITs is to post on worker forums.
http://www.mturkforum.com/
http://www.cloudmebaby.com/
http://www.reddit.com/r/mturk
http://www.mturkgrind.com/
http://www.turkernation.com/
Generally, if you are going to do things that require more than two workers at the same time, you need to establish communication with the workers and schedule something in advance.