Yammer API - Paging

I am trying to gather a range of messages through the REST API, and am aware that you can only retrieve 20 results at a time. I have tried incrementing a page variable, but this has no effect; I just get the same results each time no matter the page number (https://www.yammer.com/api/v1/messages.json?page=6). I have moved on to using the newer_than and older_than parameters to page through the results, and it works to some extent, but it appears to be excluding records. I am using the approach below:
Since setting only a newer_than returns just the 20 most recent records newer than the id sent in the newer_than parameter, I am also setting a dynamic older_than parameter:
1. Send a request with only a newer_than parameter. This returns the 20 most recent records. (e.g. www.yammer.com/api/v1/messages.json?newer_than=235560157)
2. Extract the id of the 20th message in the JSON and use it to populate the older_than parameter. The result is 20 different records. (e.g. www.yammer.com/api/v1/messages.json?newer_than=235560157&older_than=405598096)
3. Repeat step 2 until no results are returned, since the newer_than and older_than parameters will eventually overlap. (A sketch of this loop follows the steps.)
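For reference, here is a minimal sketch of that loop in JavaScript (Node 18+, using the global fetch). The endpoint and parameters are the ones described above; the OAuth bearer token and the starting newer_than id are placeholder assumptions you would supply yourself:

const TOKEN = "YOUR_OAUTH_TOKEN";   // placeholder
const NEWER_THAN = 235560157;       // fixed lower bound from step 1

async function fetchAllMessages() {
    const all = [];
    let olderThan = null;
    while (true) {
        let url = "https://www.yammer.com/api/v1/messages.json?newer_than=" + NEWER_THAN;
        if (olderThan !== null) url += "&older_than=" + olderThan;
        const res = await fetch(url, { headers: { Authorization: "Bearer " + TOKEN } });
        const body = await res.json();
        const messages = body.messages || [];
        if (messages.length === 0) break; // parameters have overlapped; we're done
        all.push(...messages);
        olderThan = messages[messages.length - 1].id; // id of the oldest (20th) message in this page
    }
    return all;
}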
The problem is that the set of records returned with this method is smaller than the set of records returned for messages by the data export API. I am working under the assumption that newer messages are always assigned ids greater than those of older messages.
Could I possibly be misunderstanding how paging through results is supposed to be implemented with the REST API?
Any help would be much appreciated!
Thanks in advance!

First of all, the page parameter works only for the search API.
Secondly, the way you are fetching messages will return either no comments or only the top two comments on each message, depending on the "extended" parameter; by default it returns two comments per message. To get all the comments on a message, you have to fetch them individually, message by message.
That must be what is causing the difference in the number of messages between the two methods.

I agree with Farhann - the REST API endpoint returns only the top two comments for any message by default. To get all the comments for a post, you have to make a separate request.
The Data Export API, by contrast, exports all the comments along with each message (public and private), which increases the message count, while the REST call returns only the two most recent comments on any message by default.
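As a rough illustration, the per-message follow-up could look like this (a sketch only: it assumes Yammer's messages/in_thread endpoint and a thread_id field on each message, which you should verify against the current API docs):

// Pull the full thread for one message to get every comment,
// not just the two most recent ones.
async function fetchThread(threadId, token) {
    const url = "https://www.yammer.com/api/v1/messages/in_thread/" + threadId + ".json";
    const res = await fetch(url, { headers: { Authorization: "Bearer " + token } });
    const body = await res.json();
    return body.messages; // all messages in the thread
}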

The data export includes private messages, which will not be returned by that API call.
Check whether the messages you are not seeing are private messages.

Related

BigQuery-Java: difference between QueryResponse and GetQueryResultsResponse

In the sample code provided by Google, two classes are used to fetch results: QueryResponse and GetQueryResultsResponse.
I am not able to understand the purpose of these two classes. Do we have to use both?
We are getting data from both queryResponse.getRows() and queryResults.getRows().
I have gone through the docs but could not figure it out. What is the difference between these two classes, and which is better to use?
Those two results are virtually identical (in fact, they are identical in the raw HTTP request). The difference is how you get them.
QueryResponse is returned by jobs.query(). This method can be used to run a query, but has only limited configuration options. It is intended as a convenience function. For more query options (such as setting a destination table, allowing large results, etc), use jobs.insert(). Another limitation of jobs.query() is that it may time out before the query has completed. Partly, this is because many clients (such as in AppEngine) require all HTTP requests to finish within 30 seconds or so. If jobs.query() times out, it will still report a job id that can be used to fetch the results with jobs.get_query_results().
GetQueryResultsResponse is returned by jobs.get_query_results(). This can be used to get the results of a query started by either jobs.query() or jobs.insert(). Query results (if you don't specify a destination table) are available for 24 hours after the query completes. jobs.get_query_results() allows you to fetch these results at any time. jobs.query() only gives you the query results once.
There is a further difference between the two, which is that jobs.query() just returns the first page of results. jobs.get_query_results() can be used to get multiple pages of results.
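To make the two calls concrete, here is a hedged sketch of the flow against the BigQuery v2 REST endpoints (the project id and OAuth token are placeholders; the Java client classes above wrap these same requests):

const PROJECT = "my-project";       // placeholder
const TOKEN = "YOUR_OAUTH_TOKEN";   // placeholder
const BASE = "https://www.googleapis.com/bigquery/v2/projects/" + PROJECT;
const AUTH = { Authorization: "Bearer " + TOKEN };

async function runQuery(sql) {
    // jobs.query: convenience call; the response is the QueryResponse
    let res = await fetch(BASE + "/queries", {
        method: "POST",
        headers: { ...AUTH, "Content-Type": "application/json" },
        body: JSON.stringify({ query: sql }),
    });
    let results = await res.json();
    const jobId = results.jobReference.jobId;

    // jobs.getQueryResults: poll until the job completes
    // (each response here is a GetQueryResultsResponse)
    while (!results.jobComplete) {
        res = await fetch(BASE + "/queries/" + jobId, { headers: AUTH });
        results = await res.json();
    }

    // Page through the remaining rows using pageToken
    const rows = results.rows || [];
    let pageToken = results.pageToken;
    while (pageToken) {
        res = await fetch(BASE + "/queries/" + jobId + "?pageToken=" + pageToken, { headers: AUTH });
        results = await res.json();
        rows.push(...(results.rows || []));
        pageToken = results.pageToken;
    }
    return rows;
}

Note that when the job finishes quickly, the jobs.query response itself carries the first page of rows; the polling loop only matters when it times out.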
Hopefully this clarifies things a bit.

jsFiddle API to get row count of user's fiddles

So, I had a nice thing going on a jsFiddle where I listed all my fiddles on one page:
jsfiddle.net/show
However, they have been changing things slowly this year, and I've already had to make some changes to keep it running. The newest change is rather annoying. Of course, I like to see ALL my fiddles at once; it makes it easier to just hit ctrl+f and find what I might be looking for, but they've made that hard to do now. I used to be able to just set the limit to 99999 and see everything, but now it appears I can't go past how many I actually have (186 atm).
I tried using a start/limit solution, but when it got to the last 10 or 50 (I tried start={x}&limit=10 and start={x}&limit=50) it would die, namely because the last pull had to be an exact count. For example, I have 186, and using the by-10s solution it would die at start=180&limit=10.
I've searched the API docs but can't seem to find a row count or anything of that manner. Does anyone know of a good, feasible solution that won't have me overloading their servers doing constant single-row checks?
I'm having the same problem as you are. Then I checked the docs (Displaying user's fiddles - Result) and found out that if you include the callback=Api parameter, an additional overallResultSetCount field is included in the JSON response. I checked your fiddles, and you currently have a total of 229 public fiddles.
The solution I can think of will force you to make only two requests. The first request's parameters don't matter as long as you have callback=Api. Then you send the second request, in which your limit will be the overallResultSetCount value.
Edit:
It's not in the documentation; however, I think the result set is limited to 200 entries (hence your start/limit range of 0 - 199). I tried to query beyond the 200 range but got an Error 500. I couldn't find another user whose fiddle count is more than 200 (most of the usernames I tested, like zalun, oskar, and rpflorence, have fewer than 100 fiddles).
Based on this new observation, you can update your script like this (a sketch follows these steps):
1. I have tested that if the total fiddle count is less than 200, adding the start=0&limit=199 parameter will return all the fiddles. Hence, you can add that parameter on your initial call.
2. Check if your total result set is more than 200. If yes, update your parameters to reflect the range for the remaining result set (in this case, start=199&limit=229) and add the new result set to your old result set. Else, show/print the result set you got from your first query.
3. Repeat steps 1 and 2 if your total count reaches 400, 600, etc. (any multiple of 200).
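Putting those steps together, a sketch of the chunked approach could look like the following. The endpoint path, the start/limit semantics, and the shape of the response come from the discussion above and should be treated as assumptions, since jsFiddle's API has changed over time:

// Fetch one chunk; with callback=Api the body is JSONP-wrapped,
// so strip the Api(...) wrapper before parsing.
async function fetchChunk(user, start, limit) {
    const url = "https://jsfiddle.net/api/user/" + user +
        "/demo/list.json?callback=Api&start=" + start + "&limit=" + limit;
    const text = await (await fetch(url)).text();
    return JSON.parse(text.replace(/^Api\(/, "").replace(/\);?\s*$/, ""));
}

async function fetchAllFiddles(user) {
    const first = await fetchChunk(user, 0, 199);
    const total = first.overallResultSetCount;
    let fiddles = first.list || []; // "list" field name is an assumption
    // Result sets appear to be capped at 200 per request, so page in chunks.
    for (let start = 200; start < total; start += 200) {
        const chunk = await fetchChunk(user, start, Math.min(200, total - start));
        fiddles = fiddles.concat(chunk.list || []);
    }
    return fiddles;
}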

Getting specific Backbone.js models from a collection without getting all models first

I'm new to Backbone.js. I'm intrigued by the idea that you can just supply a URL to a collection and then proceed to create, update, delete, and get models from that collection, and have it handle all the interaction with the API.
In the small task-management sample applications and numerous demos I've seen on the web, collection.fetch() is used to pull down all models from the server and then do something with them. However, more often than not, in a real application you don't want to pull down hundreds of thousands or even millions of records by issuing a GET to the API.
Using the baked-in collection.sync method, how can I specify parameters to GET specific record sets? For example, I may want to GET records with a date of 2/1/2014, or GET records that are owned by a specific user id.
In this question, collection.find is used to do this, but does that still pull down all records to the client first and then "find" them, or does the collection.sync method know to pass arguments along with the GET to the server?
You do use fetch, but you provide options as seen in collection.fetch([options]).
So for example to obtain the one model where id is myIDvar:
collection.fetch({
    data: { id: myIDvar },
    success: function (collection, response, options) {
        // do a little dance;
    }
});
My offhand recollection is that find, findWhere and where all operate on models that have already been downloaded, with the filtering taking place on the client. I believe with fetch the filtering takes place on the server side.
You can implement some kind of pagination on the server side and update your collection with a limited number of records. In this case all your data will be up to date with the backend.
You can do it by overriding the fetch method with your own implementation, or by specifying params.
For example:
collection.fetch({data: {page: 3}})
You can also use the findWhere method here:
collection.findWhere(attributes)
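A minimal sketch of that kind of paginated collection (assuming a backend that understands page and per_page query parameters):

var Tasks = Backbone.Collection.extend({
    url: '/api/tasks',

    // Fetch one page of records; the server does the slicing.
    fetchPage: function (page, perPage) {
        return this.fetch({
            data: { page: page, per_page: perPage },
            remove: false // keep previously loaded pages in the collection
        });
    }
});

var tasks = new Tasks();
tasks.fetchPage(3, 20); // GET /api/tasks?page=3&per_page=20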

CouchDB API query with ?limit=0 returns one row - bug or feature?

I use CouchDB 1.5.0 and noticed a strange thing:
When I query some API action, for example:
curl -X GET "http://localhost:5984/mydb/_changes?limit=1"
I get the same result with limit=1, with limit=0, and with limit=-55: in all cases, one row from the start of the list.
PostgreSQL, by contrast, returns:
Zero rows when LIMIT 0
The message "ERROR: LIMIT must not be negative" when LIMIT -55
My question is mainly concerned with API design; I would like to know your opinions.
Is this a flaw, or is it good/acceptable practice?
This is how the _changes API is designed. If you do not specify the type of feed, i.e. longpoll, continuous, etc., the default is to return a list of all the changes in a single results array.
If you want a row-by-row result of the changes in the database, specify the type of feed in the URL, like so:
curl -X GET "http://localhost:5984/mydb/_changes?feed=continuous"
Another point to note: in the _changes API, using 0 for the limit parameter has the same effect as using 1.
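You can see that quirk directly; a quick sketch in JavaScript (assuming a local CouchDB with a mydb database):

// Node 18+: compare limit=0 and limit=1 on _changes.
for (const limit of [0, 1]) {
    const res = await fetch("http://localhost:5984/mydb/_changes?limit=" + limit);
    const body = await res.json();
    console.log("limit=" + limit + " ->", body.results.length, "row(s)");
    // Both iterations print one row: limit=0 behaves like limit=1.
}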

BigQuery paging issues with tableData.list()

We're trying to load 120,000 rows from a BigQuery table using tableData.list. Our first call,
https://www.googleapis.com/bigquery/v2/projects/{myProject}/datasets/{myDataSet}/tables/{myTable}/data?maxResults=90000
returns a pageToken as expected and the first 22482 rows (1 to 22482). We assume this is due to the 10MB serialized-JSON limit. Our second call, however,
https://www.googleapis.com/bigquery/v2/projects/{myProject}/datasets/{myDataSet}/tables/{myTable}/data?maxResults=90000&pageToken=CIDBB777777QOGQIBCIL6BIQSC7QK===
returns not the next rows, but 22137 rows starting at row 90001 and running to row 112137, without a pageToken.
Strangely, if we change maxResults to 100,000, we get rows starting from 100,000.
We are able to work around this by using startRowIndex to page. In this case, we start with the first call being startRowIndex=0, and we never get a pageToken in the response. We keep making calls until all rows are retrieved. However, the concern is that without pageTokens, if the row order changes while the calls are being made, the data could be invalid.
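For completeness, here is a sketch of that startRowIndex workaround in JavaScript (the project, dataset, table, and OAuth token are placeholders):

const BASE = "https://www.googleapis.com/bigquery/v2/projects/myProject" +
    "/datasets/myDataSet/tables/myTable/data";
const TOKEN = "YOUR_OAUTH_TOKEN"; // placeholder
const PAGE = 90000;

async function fetchAllRows() {
    const rows = [];
    let start = 0;
    while (true) {
        const url = BASE + "?maxResults=" + PAGE + "&startRowIndex=" + start;
        const res = await fetch(url, { headers: { Authorization: "Bearer " + TOKEN } });
        const body = await res.json();
        const got = body.rows ? body.rows.length : 0;
        if (got === 0) break;
        rows.push(...body.rows);
        start += got; // advance by rows actually returned (may be fewer than maxResults)
        if (start >= Number(body.totalRows)) break; // all rows retrieved
    }
    return rows;
}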
We are able to reproduce this behavior on multiple tables with different sizes and structures.
Is there any issue with paging, or should we be structuring our calls differently?
This is a known, high-priority bug that has since been fixed.