Youtube API problem - when searching for playlists, start-index does not work past 100 - api

I have been trying to get the full list of playlists matching a certain keyword. I have discovered however that using start-index past 100 brings the same set of results as using start-index=1. It does not matter what the max-results parameter is - still the same results. The total results returned however is way above 100, thus it cannot be that the query returned only 100 results.
What might the problem be? Is it a quota of some sort or any other authentication restriction?
As an example - the queries bring the same result set, whether you use start-index=1, or start-index=101, or start-index = 201 etc:
http://gdata.youtube.com/feeds/api/playlists/snippets?q=%22Jan+Smit+Laura%22&max-results=50&start-index=1&v=2
Any idea will be much appreciated!
Regards
Christo

I made an interface for my site, and the way I avoided this problem is to do a query for a large number, then store the results. Let your web page then break up the results and present them however is needed.
For example, if someone wants to do a search of over 100 videos, do the search and collect the results, but only present them with the first group, say 10. Then when the person wants to see the next ten, you get them from the list you stored, rather than doing a new query.
Not only does this make paging faster, but it cuts down on the constant queries to the YouTube database.
Hope this makes sense and helps.

Related

Get a random search result from Amazon CloudSearch

My query is like:
/2013-01-01/search?q=(and author:'william' category:'Videos')&q.parser=structured&expr.random=_rand&return=_all_fields&size=1
and returns a video. However, I want a random videoId on every request.
Using the expression &expr.random=_rand; I'm unable to fetch a random result and I have failed to find any solution in documentation.
How can I get a random search result on every request?
The CloudSearch docs are quite lacking. I needed something similar and came up short with Google so I just started guessing by applying what I already knew about Solr and I found a solution:
/2013-01-01/search?q=what&sort=_rand_1 desc
Note that _rand will also work however after the first search it will always return the same results which is actually ideal if you need to paginate through a random result set (it works the same way in Solr). So, to get random results every time you need to randomly generate an _something and append it to _rand.
You can accomplish this through pagination by setting the start param to a random value from 0 to hits.found and requesting size=1:
search?q=matchall&q.parser=structured&size=1&start={yourRandomNumber}
If the number of documents in your index is fluctuating, you'll need to make 2 queries: one to get the max number of results (comes back as hits.found), and another to retrieve the random result.

BigQuery-Java: difference between QueryResponse and GetQueryResultsResponse

In sample code provided by Google, 2 classes are used to fetch results. QueryResponse and GetQueryResultsResponse.
I am not able to understand purpose of these 2 classes and do we have to use these 2 classes?
We are getting data from both: queryResponse.getRows() and queryResults.getRows()
I have gone through docs but could not figure out. what is the difference between these 2 classes and which is better to use?
Those two results are virtually identical (in fact, they are identical in the raw HTTP request). The difference is how you get them.
QueryResponse is returned by jobs.query(). This method can be used to run a query, but has only limited configuration options. It is intended as a convenience function. For more query options (such as setting a destination table, allowing large results, etc), use jobs.insert(). Another limitation of jobs.query() is that it may time out before the query has completed. Partly, this is because many clients (such as in AppEngine) require all HTTP requests to finish within 30 seconds or so. If jobs.query() times out, it will still report a job id that can be used to fetch the results with jobs.get_query_results().
GetQueryResultsResponse is returned by jobs.get_query_results(). This can be used to get the results of a query started by either jobs.query() or jobs.insert(). Query results (if you don't specify a destination table) are available for 24 hours after the query completes. jobs.get_query_results() allows you to fetch these results at any time. jobs.query() only gives you the query results once.
There is a further difference between the two, which is that jobs.query() just returns the first page of results. jobs.get_query_results() can be used to get multiple pages of results.
Hopefully this clarifies things a bit.

jsFiddle API to get row count of user's fiddles

So, I had a nice thing going on a jsFiddle where I listed all my fiddles on one page:
jsfiddle.net/show
However, they have been changing things slowly this year, and I've already had to make some changes to keep it running. The newest change is rather annoying. Of course, I like to see ALL my fiddles all at once, make it easier to just hit ctrl+f and find what I might be looking for, but they' made it hard to do now. Used to I could just set the limit to 99999, and see everything, but now it appears I can't go past how many i actually have (186 atm).
I tried using a start to limit solution, but when it got to last 10|50 (i tried like start={x}&limit10 and start={x}&limit50) it would die. Namely because last pull had to be exact count. Example, I have 186, and use the by 10's solution, then it would die at start=180&limit=10.
I've search the API docs but can't seem to find a row count or anything of that manner. Anyone know of a good feasible solution that wont have me overloading there servers doing a constant single row check?
I'm having the same problem as you are. Then I checked the docs (Displaying user’s fiddles - Result) and found out that if you include callback=Api parameter, an additional overallResultSetCount field is included in the JSON response. I checked your fiddles and currently you have total of 229 public fiddles.
The solution I can think of will force you to only request twice. The first request's parameters doesn't matter as long as you have callback=Api. Then you send the second request in which your limit will be overallResultSetCount value.
Edit:
It's not in the documentation, however, I think the result set is limited to 200 entries only (hence your start/limit is from 0 - 199). I tried to query more than the 200 range but I get a Error 500. I couldn't find another user whose fiddle count is more than 200 (most of the usernames I tested only have less than 100 fiddles like zalun, oskar, and rpflorence).
Based on this new observation, you can update your script like this:
I have tested that if the total fiddle count is less than 200,
adding start=0&limit=199 parameter will only return all the
fiddles. Hence, you can add that parameter on your initial call.
Check if your total result set is more than 200. If yes, update your
parameters to reflect the range for the remaining result set (in
this case, start=199&limit=229) and add the new result set to your
old result set. Else, show/print the result set you initially got from your first query.
Repeat steps 1 and 2, if your total count reaches 400, 600, etc (any
multiple of 200).

Getting all Twitter Follows (ids) with Groovy?

I was reading an article here and it looks like he is grabbing the IDs by the 100s. I thought it was possible to grab by 5000 each time?
The reason I'm asking is because sometimes there are profiles with much larger amounts of followers and you wouldn't have enough actions to do it all in one hour if one was to grab it by 100 each time.
So is it possible to grab 5000 ids each time, if so, how would I do this?
GET statuses/followers as shown in that article has been deprecated, but did used to return batches of 100
If you're trying to get follower ids, you would use GET followers/ids. This does return batches of up to 5000, and should just require you to change the URL slightly (see example URL at the bottom of the documentation page)

Flickr Geo queries not returning any data

I cannot get the Flickr API to return any data for lat/lon queries.
view-source:http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90
This should return something, anything. Doesn't work if I use lat/lng either. I can get some photos returned if I lookup a place_id first and then use that in the query, except then all the photos returned are from anywhere and not the place id
Eg,
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&placeId=8iTLPoGcB5yNDA19yw
I deleted out my key obviously, replace with yours to test.
Any help appreciated, I am going mad over this.
I believe that the Flickr API won't return any results if you don't put additional search terms in your query. If I recall from the documentation, this is treated as an unbounded search. Here is a quote from the documentation:
Geo queries require some sort of limiting agent in order to prevent the database from crying. This is basically like the check against "parameterless searches" for queries without a geo component.
A tag, for instance, is considered a limiting agent as are user defined min_date_taken and min_date_upload parameters — If no limiting factor is passed we return only photos added in the last 12 hours (though we may extend the limit in the future).
My app uses the same kind of geo searching so what I do is put in an additional search term of the minimum date taken, like so:
http://api.flickr.com/services/rest/?method=flickr.photos.search&media=photo&api_key=KEY_HERE&has_geo=1&extras=geo&bbox=0,0,180,90&min_taken_date=2005-01-01 00:00:00
Oh, and don't forget to sign your request and fill in the api_sig field. My experience is that the geo based searches don't behave consistently unless you attach your api_key and sign your search. For example, I would sometimes get search results and then later with the same search get no images when I didn't sign my query.