In Mechanical Turk, how to limit a HIT to being accepted by one worker at a time - mechanicalturk

In Mechanical Turk, how can I limit a HIT so that it is accepted by only one worker at a time? After the first worker has finished, another worker can accept it. The HIT has many assignments during its lifetime, but only one worker holds it at any given time.
Specifically, the HIT should be completed by workers in sequence, one by one.

There is no built-in functionality to do this. Using the API and your own server, you could configure your own application that accepts Notifications from MTurk, e.g. HITCompleted or AssignmentSubmitted, and then trigger new assignments and/or HITs.
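A minimal sketch of that notification-driven approach, assuming the HIT is created with a single assignment and one more is released each time a worker submits. The handler name, the `remaining` counter, and the shape of the `event` dict (`EventType`, `HITId`) are assumptions for illustration. `create_additional_assignments_for_hit` is the method on boto3's MTurk client, and delivery of the notification itself is not shown.

```python
def on_notification(client, event, remaining):
    """Release one more assignment after a worker submits.

    client    -- object exposing create_additional_assignments_for_hit
                 (boto3's MTurk client has this method)
    event     -- dict with "EventType" and "HITId" keys (assumed shape)
    remaining -- dict mapping HIT id -> assignments still to release
    Returns True if a new assignment was released.
    """
    if event.get("EventType") != "AssignmentSubmitted":
        return False
    hit_id = event["HITId"]
    if remaining.get(hit_id, 0) <= 0:
        return False  # all planned assignments have been released
    client.create_additional_assignments_for_hit(
        HITId=hit_id, NumberOfAdditionalAssignments=1
    )
    remaining[hit_id] -= 1
    return True
```

Because only one assignment is open at any moment, the next worker can accept the HIT only after the previous one has submitted.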

Related

Understanding why you would want to process Message Queues at a future time

So I'm trying to understand what practical problems queues solve. From reading the information on Google, I get the high-level idea:
Push a message to a queue for processing at a later time
I'm looking at an architecture from Company A, and they have different use cases for job queueing, for example:
chat messages
file conversion
searching
heavy SQL queries
Why process it at a later time?
Here's my best guess...
Let's say I have an application that can process 10 "things" at a time.
My application then maxes out its processing capacity.
An 11th request comes in, so the app puts it in the queue for later processing.
Assuming this is a valid Use Case, wouldn't adding more servers to process more "things" make sense? Is it because it's more costly to add more servers than employ a Queue and sacrifice response time a little bit?
Given my Use Case examples, what other problems would Queues solve for them?
Have you ever lined up at a bank when it is busy? You would have waited in a queue.
"But," you could say, "wouldn't adding more staff to process more customers make sense? Is it because it's more costly to add more staff than employ a Queue and sacrifice response time a little bit?"
That would be correct. It can be quite costly to staff a bank based on the peak number of customers who would arrive each day. It is cheaper to staff below this level and have some customers wait in a queue.
Also, the number of customers each day is not 100% predictable. A queue allows excess demand to wait without breaking the system.
Queues enable decoupling.
For example, imagine an online store where customers purchase an item. They select the item, provide a credit card number and click 'Purchase'. If the credit card is declined, the online store can immediately prompt them to re-enter the number. This interaction has to take place immediately while the customer is still online.
However, there is no need to have the customer wait while an invoice is generated, a record is added to the accounting system and inventory is pulled off the shelf. This can be decoupled from the ordering process. A good way to do it is to push the order into a queue, which can be handled by the next system.
If that 'next system' happens to be offline at the moment, there is no reason to cancel the whole sale. The transaction can be processed when the 'next system' comes back online. This is much better than failing the whole process just because one component (which is not required immediately) has a failure.
Bottom line: Queues are excellent. They enable better handling of failures. They make things more resilient (just wait a few minutes and try again!). They should be used whenever the process is compatible with a queuing architecture.
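The online store example can be sketched with Python's standard `queue` module. This is only an illustration: the function names and the `processed` list (standing in for invoicing, accounting and inventory) are made up for the example, and a real system would more likely use a durable message broker than an in-memory queue.

```python
import queue
import threading

orders = queue.Queue()  # buffer between the storefront and the back office
processed = []          # stands in for invoicing/accounting/inventory

def place_order(order_id, card_ok):
    """Runs while the customer waits: only the card check is synchronous."""
    if not card_ok:
        return "declined"
    orders.put(order_id)  # hand off the slow work and return immediately
    return "accepted"

def back_office_worker():
    """Runs whenever the 'next system' is online; drains the queue."""
    while True:
        order_id = orders.get()
        if order_id is None:  # sentinel: stop the worker
            orders.task_done()
            break
        processed.append(order_id)
        orders.task_done()
```

Orders placed while the worker is not running simply stay in the queue and are handled once it starts.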
Let's do scenarios.
Scenario 1, without a queue:
You request an endpoint /blabla/do-eveything/
This request does the following:
download an image from a very slow FTP server
e.g. 1.5 sec (can error; retry? add +X sec)
attach the image to an email
send an email (3 sec)
e.g. 1 sec (can error; retry? add +X sec)
confirmation received > store the confirmation with a third-party tracking service
e.g. 1.5 sec (can error; retry? add +X sec)
when tracking confirms, update your data at another third-party company for big data purposes
e.g. 2 sec (can error; retry? add +X sec)
... you get the idea
return the response e.g. 11 sec later (this is too slow), or more, or time out when everything fails
The end user says the internet was faster 20 years ago: "maybe I need to change my internet connection or change my 16 threads"
Scenario 2, queue everything you can:
You request an endpoint /blabla/do-eveything/
This request does the following:
queue the job "DO_EVERYTHING"
e.g. 0.02 sec
return the response in less than 0.250 sec
The end user says this website/app is too fast: "I can keep my 56K internet connection"
With a queue/event system, a failed job can be retried later without affecting the end user
You can pause jobs and add an unlimited number of tasks/steps after the original message
Better fault tolerance
Working with queues allows a better micro/nano service architecture and better testing, because you can test a single job instead of a full controller that does everything...
Yes, it is maybe more work and more thinking up front, but in the end there is no need to think about work while on holiday
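The "failed job can be retried later" point can be sketched as follows. This is a toy in-process loop, not any particular queue product: `run_jobs`, `max_attempts` and the handler are hypothetical names, and a real job system would usually add delays between attempts.

```python
from collections import deque

def run_jobs(jobs, handler, max_attempts=3):
    """Process queued jobs; a failed job goes back on the queue for retry.

    jobs    -- iterable of job payloads
    handler -- callable that raises an exception on failure
    Returns (done, dead): completed jobs and jobs that exhausted retries.
    """
    pending = deque((job, 1) for job in jobs)
    done, dead = [], []
    while pending:
        job, attempt = pending.popleft()
        try:
            handler(job)
            done.append(job)
        except Exception:
            if attempt < max_attempts:
                pending.append((job, attempt + 1))  # retry later
            else:
                dead.append(job)  # give up; keep it for inspection
    return done, dead
```

The request that queued the job has already returned by the time any of this runs, so retries cost the end user nothing.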

Is boost::asio async_read with a timer a good idea?

My server app needs to maintain thousands of TCP connections. At first, I used one timer per connection. When a timer expired, my code checked the database to see whether a message was ready to send and, if so, sent it to the remote client. This design works, but performance is very poor because there are thousands of timers in my app. A friend suggested removing all the timers and using one thread that checks the database and sends messages to all remote clients in a for(...) loop.
But I see a lot of articles that introduce using deadline_timer with async_read; see the link below:
http://www.boost.org/doc/libs/1_40_0/doc/html/boost_asio/example/timeouts/stream_receive_timeout.cpp
My question is: does this work well when the server has thousands of connections? I guess not. What do you think?
I think the timers are not your main performance problem. They do carry some penalty, of course, but nowhere near the cost of the I/O itself.
I could imagine that your main problem is a large delay in the chain change-in-db -> timer-expired -> send. Another problem could be that you check your whole DB whenever a timer expires. If so, you could set a flag only when an update happens, check that flag in the timer, and reset it once you have sent the update.
Can you send the changes directly after they happen, so that you avoid the timers altogether? You could use io_service->post() to trigger an update function which sends the update to all connected clients. You should also use the async_write methods so that a single client cannot block your whole application.
If you don't want to send every update immediately but only at given intervals, then your friend's suggestion to use a single timer for checking for changes and sending the updates also sounds good.
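The structure suggested here (record changes cheaply, then let one timer flush them to all clients) can be sketched in Python's asyncio rather than Boost.Asio. This only illustrates the shape of the design. The `Broadcaster` class and its method names are invented for the example, and the per-client queues stand in for asynchronous socket writes.

```python
import asyncio

class Broadcaster:
    """One periodic check for all connections instead of one timer each."""

    def __init__(self):
        self.clients = []  # one outgoing asyncio.Queue per connection
        self.pending = []  # messages recorded since the last flush

    def connect(self):
        q = asyncio.Queue()
        self.clients.append(q)
        return q

    def notify(self, message):
        """Called when the data changes; cheap, does no I/O."""
        self.pending.append(message)

    def flush(self):
        """Send everything pending to every client; returns messages sent."""
        batch, self.pending = self.pending, []
        for message in batch:
            for q in self.clients:
                q.put_nowait(message)
        return len(batch)

    async def run(self, interval, ticks):
        """The single timer: flush every `interval` seconds, `ticks` times."""
        for _ in range(ticks):
            await asyncio.sleep(interval)
            self.flush()
```

Calling `flush()` directly from `notify()` instead would correspond to sending each change as soon as it happens, with no timer at all.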

Philips Hue command limitation

First of all I'm developing my own C# library for controlling Philips Hue, which means I'm not using the official SDK. (I'm guessing that the SDK will make sure you won't have any problems)
I'm a little confused about the limitation in the Core concepts page in the API, which states:
We can’t send commands to the lights too fast. If you stick to around 10 commands per second to the /lights resource as maximum you should be fine. For /groups commands you should keep to a maximum of 1 per second.
I intend to respect this limitation, but does the limitation still apply when you are performing GET requests on the /lights resource, or is it only for sending actual commands with PUT requests to /lights/<id>/state that change the state of the light? Same question goes for the /groups resource.
Also is it even possible to damage anything by sending too many requests, or will it just take longer to get all responses?
Edit:
My overall question is: How should I understand the API limitation?
A more specific sub-question is: Should I wait 100 ms before sending another /lights command, relative to when I received a response, or relative to when I sent the previous command?
Another sub-question is: Should I consider this limitation only when using PUT requests on e.g. /lights/<id>/state, or on all request types (GET/PUT/POST/DELETE)?
I don't know if anything was changed in firmware updates, but I have discovered that the bridge might not be so simple as you would think, and that the API description isn't very clear.
I've done a little testing while running firmware 01009914.
The bridge seems to have some kind of queue of incoming commands. I sent {"bri":254} to a group 9 times, followed by 1 final command of {"bri":1}. From the first command to when the light is actually dimmed takes roughly 3-4 seconds. Each time I sent a command, the bridge replied almost instantly with a success token.
I did the same small tests sending other commands, 10 of each JSON object:
{"bri":254} 3-4 seconds
{"on":true, "bri":254} 6-7 seconds
{"on":true, "bri":254, "alert":"none", "effect":"none"} 12-13 seconds
This actually shows that each change of attributes takes roughly 0.3 seconds for the bridge to handle.
I will claim that for each attribute we change, the bridge takes about 300 ms to finish, and the limitation of commands should be understood as: As long as you stick with changing one attribute of a group each second, you should be fine.
Note: I only tried with one group consisting of three lights, and I don't know if the bridge actually does have a queue of incoming commands, and in case it does have a queue, I don't know what the limit of items in it is.
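The per-attribute estimate can be checked with a few lines of arithmetic. The figures are the ones measured above (10 commands per test); the function name is just for illustration.

```python
# (attributes per command, (min, max) seconds measured for 10 commands)
measurements = [(1, (3, 4)), (2, (6, 7)), (4, (12, 13))]

def seconds_per_attribute(attrs, seconds_range, commands=10):
    """Divide the measured time by the total number of attribute changes."""
    low, high = seconds_range
    changes = attrs * commands
    return low / changes, high / changes

estimates = [seconds_per_attribute(a, r) for a, r in measurements]
```

All three tests land between roughly 0.3 and 0.4 seconds per attribute change, which is where the 300 ms figure comes from.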
Edit:
Now we have some official clarification of the Hue System Performance.
I'm fairly certain that the 10 commands per second is a guideline to prevent failure of the Bridge, and is a technical limitation of the hardware. Any more than that and you're apt to overload the bridge. I believe this applies to commands as well as requests.
Both approaches are reasonable. For laziness' sake, you could wait for 100ms to send a response, but I would only rely on this method if you don't plan on any other interactions with the Bridge.
I would apply this limitation to all request types.
You won't damage anything if you send commands too fast. However, if you send commands too fast the bridge might become unresponsive and/or some messages can be ignored.
When it comes to the bridge, the way I think of it is that the bridge is more or less single threaded, so it works best if you make sure you don't send the next command before the previous one has returned.
In practice we've found that this works much better than waiting a fixed time between each request. In fact, you can pretty much send commands as fast as you want as long as you wait for the previous one to finish.
When you send a command to the bridge, the bridge has to then send it to the lamps through Zigbee. Since it's a mesh network in some cases the message has to make a couple of hops from lamp to lamp before it reaches the target. Depending on how many lamps you have and how many hops the signal needs to take, this can take a while. Also, it's possible that some messages randomly take much longer than others.
In general the system is not designed to handle very fast changes, but if you keep the above in mind you can make many cool effects :)
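A minimal sketch of "don't send the next command before the previous one has returned", with an optional minimum gap measured from when the previous reply arrived. `SerialSender` and its parameters are hypothetical, and `send` is assumed to be any blocking function that performs the HTTP request to the bridge and returns its reply.

```python
import time

class SerialSender:
    """Sends commands one at a time, never overlapping requests."""

    def __init__(self, send, min_interval=0.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.send = send                  # blocking; returns when the bridge replies
        self.min_interval = min_interval  # optional extra spacing, in seconds
        self.clock = clock
        self.sleep = sleep
        self._last_done = None            # time the previous reply arrived

    def submit(self, path, body):
        if self._last_done is not None:
            wait = self.min_interval - (self.clock() - self._last_done)
            if wait > 0:
                self.sleep(wait)
        reply = self.send(path, body)     # blocks until this command has returned
        self._last_done = self.clock()
        return reply
```

With `min_interval=0.0` this reduces to plain back-to-back blocking calls; a value of `0.1` would correspond to the 10 commands per second guideline for /lights.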

Long polling blocking multiple windows?

Long polling has solved 99% of my problems. There is now just one other problem. Imagine a penny auction site, where people bid. On the frontpage, there are several Auctions.
If the user opens three of these auctions, and because javascript is not multithreaded, how would you get the other pages to ever load? Won't they always get bogged down and not load because they are waiting for long polling to end? In practice, I've experienced this and I can't think of a way around it. Any ideas?
There are two ways that JavaScript gets around some of this.
While JavaScript is conceptually single-threaded, it does its I/O in separate threads using completion handlers. This means other pieces of JavaScript can be running while you are waiting for your network request to complete.
JavaScript for each page (or even each frame in each page) is isolated from JavaScript on the other pages/frames. This means that each copy of JavaScript can be running in its own thread.
A bigger issue for you is likely to be that browsers often limit the number of concurrent connections to a given site, and it sounds like you want to make many concurrent connections to the same site. In this case you will get a lock-up.
If you control both the server and the client, you will need to combine the multiple long-poll requests from the client into a single long-poll request to the server.
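On the server side, a combined poll could look roughly like this: the client sends one request listing every auction it is watching plus the version it last saw for each, and the server replies with whatever changed. The function name and the data shapes are assumptions for the sketch. A real long-poll handler would call something like this repeatedly (or wait on an event) until the result is non-empty or a timeout expires.

```python
def collect_updates(auctions, last_seen):
    """Answer one combined poll for several auctions.

    auctions  -- dict: auction id -> (version, current_bid), server state
    last_seen -- dict: auction id -> version the client already has
    Returns only the auctions that changed since the client's versions.
    """
    updates = {}
    for auction_id, seen_version in last_seen.items():
        if auction_id not in auctions:
            continue  # unknown or closed auction: nothing to report
        version, bid = auctions[auction_id]
        if version > seen_version:
            updates[auction_id] = {"version": version, "bid": bid}
    return updates
```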

Work managers threads constraint and page cannot be displayed

We have memory-intensive processing for certain functionality, and we would like to limit the number of parallel requests to this processing. We are able to configure this by using "Work Managers" in WebLogic and putting a limit on the number of threads for that servlet.
For example, if we set the maximum thread limit to 3 and there are 10 parallel requests, 7 requests wait in the queue. There could be situations where requests waiting in the queue take up to 30-40 minutes to be processed. In a simple test, we received "page cannot be displayed" due to a timeout after 15 minutes, and received the message after 1 hour.
Does anyone know if there is a setting in WebLogic to increase/decrease the timeout and avoid "page cannot be displayed"?
I would appreciate any thoughts on this.
Does anyone know if there is a setting in WebLogic to increase/decrease the timeout and avoid "page cannot be displayed"?
There might be something, but I didn't actually check, as it would be bad advice anyway. By looking for this, you are trying to solve the wrong problem. A browser is just not made for a long-running process like the one you are describing (>30 min), even if you don't mind the user waiting (not to mention that he could refresh the page and queue more and more jobs).
So, the right answer here is, in my opinion: use asynchronous processing; this is the perfect use case. When the user clicks on the button, send a JMS message to a queue (or create a Quartz job) and send the user a page with a request ID telling him to come back later. When the processing is done, update the status somewhere and make the status/result available to the user. Really, the user experience will be better this way, and you'll face fewer problems than with a browser waiting on the request.
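The "request ID, come back later" pattern can be sketched in a few lines. This uses Python's standard thread pool as a stand-in for the JMS queue or Quartz job mentioned above. The names `submit_job`, `job_status` and the in-memory `jobs` dict are hypothetical, and a real deployment would keep job state somewhere persistent.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=3)  # at most 3 heavy jobs at once
jobs = {}                                     # request ID -> Future

def submit_job(func, *args):
    """Called from the request handler: queue the work, return an ID at once."""
    request_id = str(uuid.uuid4())
    jobs[request_id] = executor.submit(func, *args)
    return request_id

def job_status(request_id):
    """Called when the user comes back to check on the request."""
    future = jobs.get(request_id)
    if future is None:
        return {"state": "unknown"}
    if not future.done():
        return {"state": "pending"}
    if future.exception() is not None:
        return {"state": "failed", "error": str(future.exception())}
    return {"state": "done", "result": future.result()}
```

The HTTP request that calls `submit_job` returns immediately, so no browser timeout is involved however long the job waits in the queue.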
1) Use some other tool (not a browser), such as wget, where you can control the timeout parameter (--timeout).
2) Why do you use HTTP? Use message-driven beans, send a JMS message to them, and don't worry about timeouts.
Perhaps Quartz can do what you need? Start a job and check in on it as you need to?