Amazon Mechanical Turk API - Using GetReviewableHITs to get all reviewable HITs regardless of HITTypeId

Current Result: Calling GetReviewableHITs without a HITTypeId returns all of the requester's HITs, even those whose status is not Reviewable (it seems to behave like the SearchHITs method).
Desired Result: Calling GetReviewableHITs without a HITTypeId returns only the requester's HITs whose status is Reviewable.
At the end of the day, I'm simply looking for an efficient means to get all of my HITs that are in the Reviewable state without having to supply a HITTypeId.
The GetReviewableHITs API documentation states that if HITTypeId is "not specified, all of the Requester's HITs are considered for the query." After testing, all HITs are indeed returned, even those whose status is neither Reviewable nor Reviewing. So this appears to be by design, and I'm looking for other ideas.
For example, do I really have to retrieve all the HITs and iterate through each one to find the reviewable ones? With thousands of HITs this doesn't scale well. Or do I need to maintain my own record of the HITTypeIds? I can't find an API that returns them, so keeping my own database just for this one API call becomes a lot of overhead.

The default call to GetReviewableHITs from the SDK I'm using (.NET) uses the "Minimal" ResponseGroup, meaning that the HITs returned did not include the Status field. A later call to GetHIT with a more detailed ResponseGroup showed that all of the returned HITs were in fact in the Reviewable state.

Is there any time difference between requests that get different responses?

I am curious about one thing.
For example let's say, I am going to update my email on a website. My userId is: 1.
My request body for a successful request is:
currentEmail: example1#gmail.com,
newEmail: example2#gmail.com,
userId: 1
I will get a successful response for the request above. But the request below will fail because there is no user with a userId of 2:
currentEmail: example1#gmail.com,
newEmail: example2#gmail.com,
userId: 2
Will there be any measurable time difference between these two? The second request won't trigger any database write.
Also, let's say I try to find a user by userId:
GET api/findUser/{userId}
If there isn't any user with the userId above, will there be any time difference between successful and failed requests?
If you're "curious", go ahead and measure it. The real results will depend on the API implementation, the DB implementation, caching at the DB and ORM levels, etc.
For example, if in the first case the API just issues a SQL UPDATE statement, the execution time should be similar; however, if the API builds a "user" DTO first, the attempt to amend a non-existent user will be faster.
In the latter case my expectation is that an attempt to get info for a user that doesn't exist will be faster, but that also "depends".
So you need to inspect the associated code execution footprint and query plans, and maybe even load test the database separately from the API.
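To make "go ahead and measure it" concrete, here is a minimal measurement sketch in Python. It is not tied to any particular API: the two callables stand in for "request for an existing user" and "request for a missing user", and the names time_calls and compare are made up for this illustration.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(call: Callable[[], object], repeats: int = 30) -> List[float]:
    """Return wall-clock durations in seconds for `repeats` invocations of `call`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def compare(existing: Callable[[], object],
            missing: Callable[[], object],
            repeats: int = 30) -> Dict[str, float]:
    """Summarise the median latency of the two request variants."""
    hit = statistics.median(time_calls(existing, repeats))
    miss = statistics.median(time_calls(missing, repeats))
    # A positive difference means the "existing user" variant was slower.
    return {
        "existing_median_s": hit,
        "missing_median_s": miss,
        "difference_s": hit - miss,
    }
```

In practice each callable would wrap one HTTP request with your client of choice. Medians are used rather than means so that a single slow outlier (a cold cache, a network hiccup) does not dominate the comparison.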

How to get a single item from my Amazon Dynamodb Table

My .NET code below always returns a search.Matches.Count of 0 even though the movie is in the table. I've literally searched the whole internet but have not been able to get an answer, even on Amazon's AWS Developer website.
Please let me know what I am doing wrong. I appreciate your help. I'm totally new to this.
' Connect to DynamoDB and load the table
client = New AmazonDynamoDBClient(config)
table = Table.LoadTable(client, "MovieTable")
' Build the scan filter
scanFilter = New ScanFilter
With scanFilter
    .AddCondition("KeyCode", ScanOperator.NotEqual, MovieName)
    .AddCondition("Status", ScanOperator.Equal, "In")
End With
' Run the scan and check the number of matches
search = table.Scan(scanFilter)
If search.Matches.Count = 1 Then getMovieName
As the documentation explains, Scan, an operation that is supposed to go through the entire table, cannot go through it in one fell swoop. Instead, it reads 1MB at a time; after 1MB of data it returns to the caller, and you're supposed to ask for the next page to continue (again, see the documentation on how).
In your case, you have a very specific filter which matches only one item, but still - Scan will return after having read 1MB of data, even if none of the items in this 1MB match your request. It doesn't wait until 1MB of results have been collected! So in your use case it is not surprising that you're getting an empty result set, with LastEvaluatedKey set signalling that there are more pages to read.
By the way, in your use case, where you are looking for just one item, a Scan of the entire table is obviously not a great choice (unless you're only doing this for debugging). A GetItem or Query operation makes more sense if you can use one, and a secondary index might be useful if you're searching by attributes that are not in the key.
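The page-following loop can be sketched as below. This is in Python rather than the .NET from the question, and it is only an illustration: scan_all is a made-up helper, and the scan argument is assumed to behave like boto3's Table.scan, i.e. it accepts an optional ExclusiveStartKey and returns a dict with "Items" plus "LastEvaluatedKey" whenever more pages remain.

```python
from typing import Any, Callable, Dict, Iterator


def scan_all(scan: Callable[..., Dict[str, Any]], **kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every matching item, following LastEvaluatedKey across pages."""
    start_key = None
    while True:
        page_args = dict(kwargs)
        if start_key is not None:
            # Resume where the previous page stopped.
            page_args["ExclusiveStartKey"] = start_key
        page = scan(**page_args)
        # A page may be empty even though later pages contain matches.
        yield from page.get("Items", [])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            # No LastEvaluatedKey means the whole table has been read.
            break
```

The important point is that the loop stops on a missing LastEvaluatedKey, not on an empty page, which is exactly the case described above.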

Yammer API - Paging

I am trying to gather a range of messages through the REST API, and am aware that you can only retrieve 20 results at a time. I have tried incrementing a page variable, but this has no effect, and I just get the same results each time no matter the page number (https://www.yammer.com/api/v1/messages.json?page=6). I have since used the newer_than and older_than parameters to page through the results, and it works to some extent, but it appears to be excluding records. I am using the following approach:
Since just setting a newer_than only results in the 20 most recent records as long as they are newer than the id that is sent in the newer_than parameter, I am also setting a dynamic older_than parameter.
1. Send a request with only a newer_than parameter. This returns the 20 most recent records (e.g. ww.yammer.com/api/v1/messages.json?newer_than=235560157).
2. Extract the ID of the 20th record in the JSON and use it to populate the older_than parameter. The result is 20 different records (e.g. ww.yammer.com/api/v1/messages.json?newer_than=235560157&older_than=405598096).
3. Repeat step 2 until no results are returned, since the newer_than and older_than parameters will eventually overlap.
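The steps above can be sketched roughly as follows. This is an illustration only, written in Python: fetch_range is a made-up name, and fetch is a hypothetical callable that performs the HTTP request with the given query parameters and returns the list of message objects (each with an "id" field) from the JSON response.

```python
from typing import Callable, Dict, List, Optional


def fetch_range(fetch: Callable[[Dict[str, int]], List[dict]], newer_than: int) -> List[dict]:
    """Collect all messages newer than `newer_than`, walking backwards with older_than."""
    collected: List[dict] = []
    older_than: Optional[int] = None
    while True:
        params = {"newer_than": newer_than}
        if older_than is not None:
            params["older_than"] = older_than
        batch = fetch(params)
        if not batch:
            # newer_than and older_than have met; nothing is left in between.
            break
        collected.extend(batch)
        # The smallest ID in this batch is the upper bound for the next request.
        older_than = min(message["id"] for message in batch)
    return collected
```

Taking the smallest ID in each batch, rather than whatever happens to be in the 20th position, avoids depending on the order in which the records come back.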
The problem is that the set of records that is returned with this method is less than the number of records that is returned for messages from the data export API. I am working under the assumption that newer message IDs are always generated with a value greater than any older messages.
Could I possibly be misunderstanding how paging through results is supposed to be implemented with the REST API?
Any help would be much appreciated!
Thanks in advance!
First of all, the page parameter works only for the search API.
Secondly, the way you are fetching messages will either return no comments on the messages or return only the top 2 comments on each message, depending on the "extended" parameter. By default it returns 2 comments on every message. To get all the comments on a message, you have to fetch them individually, message by message.
That must be causing the difference in the number of messages between the two methods mentioned.
I agree with Farhann: the REST API endpoint returns only the top two comments for any message by default. To get all the comments for a post, you have to make a separate request.
With the Data Export API, all the comments are exported along with the messages (public and private), which increases the message count, whereas the API call returns only the 2 most recent comments on any message by default.
The data export includes private messages. Private messages will not be returned by that API call.
Check if the messages you are not seeing are private messages.

What is the meaning of the different fields returned by the get login form call?

I am looking for the specific meaning of the following fields:
valueIdentifier
valueMask
fieldType
FieldInfoMultiFixed
AutoRegFieldInfoSingle
FieldInfoMultiVariable
Also, in most cases we are getting a numerical value for helpText. How do we identify whether helpText is present or not?
A lot of the stuff like FieldInfoMultiFixed/Variable is discussed in the Yodlee SDK Developer guide. Search for either one. They're basically just silly combos where people break up a single value into multiple fields (like a phone number or SSN split across 3 textboxes).
As for the helpText, every time I've seen a Yodlee tech respond they say no. The number corresponds to an internal resource identifier that is apparently not exposed through the API. I want to say I saw somebody say that it might be available for things like forum signup/registration (where it would be more useful). The SDK mentions it as if it works the way you would expect, but that is an error.
Currently Yodlee does not have helpText populated for any field, so a numerical value is associated with it instead. If any helpText gets added in the future, you will see text in that field instead of a numerical value.
Hence, if you are receiving numerical values, you should take that to mean helpText is not present.
Shreyans
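The rule above (a purely numeric helpText means no help text) can be written as a small check. This is a sketch in Python, not part of any Yodlee SDK; has_help_text is a made-up helper and it assumes the field arrives as a string, a number, or nothing at all.

```python
def has_help_text(help_text) -> bool:
    """Return True only when helpText holds real text rather than a numeric resource ID."""
    if help_text is None:
        return False
    value = str(help_text).strip()
    # An empty value or a run of digits is treated as "no help text".
    return bool(value) and not value.isdigit()
```

For example, has_help_text("12345") is False, while has_help_text("Enter your user name") is True.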

Flickr Geo queries not returning any data

I cannot get the Flickr API to return any data for lat/lon queries.
view-source:http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90
This should return something, anything. It doesn't work if I use lat/lng either. I can get some photos returned if I look up a place_id first and then use that in the query, except that the photos returned are then from anywhere and not from that place.
Eg,
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&placeId=8iTLPoGcB5yNDA19yw
I deleted out my key obviously, replace with yours to test.
Any help appreciated, I am going mad over this.
I believe that the Flickr API won't return any results if you don't put additional search terms in your query. If I recall from the documentation, this is treated as an unbounded search. Here is a quote from the documentation:
Geo queries require some sort of limiting agent in order to prevent the database from crying. This is basically like the check against "parameterless searches" for queries without a geo component.
A tag, for instance, is considered a limiting agent as are user defined min_date_taken and min_date_upload parameters — If no limiting factor is passed we return only photos added in the last 12 hours (though we may extend the limit in the future).
My app uses the same kind of geo searching so what I do is put in an additional search term of the minimum date taken, like so:
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90&min_taken_date=2005-01-01 00:00:00
Oh, and don't forget to sign your request and fill in the api_sig field. My experience is that the geo based searches don't behave consistently unless you attach your api_key and sign your search. For example, I would sometimes get search results and then later with the same search get no images when I didn't sign my query.
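As a sketch of how the bounding-box query with a date limit might be assembled, here is a small Python helper. The function name geo_search_url is made up for illustration; the parameter names are the ones used in the URLs above. It only builds the URL and does not cover signing or the api_sig field.

```python
from urllib.parse import urlencode

FLICKR_REST = "http://api.flickr.com/services/rest/"


def geo_search_url(api_key: str, bbox: str, min_taken_date: str) -> str:
    """Build a flickr.photos.search URL for a bounding box plus a limiting date."""
    params = {
        "method": "flickr.photos.search",
        "media": "photo",
        "api_key": api_key,
        "has_geo": 1,
        "extras": "geo",
        "bbox": bbox,
        # The limiting agent that keeps the geo query from being unbounded.
        "min_taken_date": min_taken_date,
    }
    # urlencode escapes the commas, the colons and the space in the date.
    return FLICKR_REST + "?" + urlencode(params)
```

Calling geo_search_url("KEY_HERE", "0,0,180,90", "2005-01-01 00:00:00") yields a URL in which the space in the date is properly escaped, which the hand-written URL above leaves raw.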