Google Classroom API suddenly returning quota errors

I have routines that synchronize Class/Roster information between an SIS and Google Classroom. Everything had been running smoothly until very recently (11/1/2016). Now we're seeing the following message in all of our log files for the routines that handle Classroom syncs:
Insufficient tokens for quota group and limit 'DefaultGroupUSER-100s' of service 'classroom.googleapis.com', using the limit by ID...
We perform batch requests whenever possible and these errors are showing up in individual batch "part" responses. The fact that these errors suddenly started showing up for ALL of our Classroom routines makes me think that something changed on the Google end of things.
I've been experimenting with the throttling on our end, changing both the number of requests we send in each batch (the docs say you can send 1000 per batch) and the total number of requests/batches we send per 100 seconds (the docs say 50/s/client and 5/s/user). Interestingly, the quotas shown in the developer console are slightly different, but I assume they're meant to be interpreted in conjunction with one another.
I've throttled things down to the point where we're not even getting close to 5 requests per second, and I'm still getting these errors back from the server.
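For reference, here's a stripped-down sketch of the kind of throttling we're doing, using the Python client's batch interface (the batch size, sleep interval, and creds object are placeholders rather than our production setup):

import time
from googleapiclient.discovery import build

# `creds` is assumed to be an authorized credentials object with Classroom scopes.
service = build('classroom', 'v1', credentials=creds)

BATCH_SIZE = 20        # well under the documented 1000-requests-per-batch maximum
PAUSE_SECONDS = 10     # keeps us far below the 5 requests/second/user limit

def handle_part(request_id, response, exception):
    # The DefaultGroupUSER-100s quota errors show up here, per batch part.
    if exception is not None:
        print('part %s failed: %s' % (request_id, exception))

def sync_students(course_id, student_bodies):
    for start in range(0, len(student_bodies), BATCH_SIZE):
        batch = service.new_batch_http_request(callback=handle_part)
        for body in student_bodies[start:start + BATCH_SIZE]:
            batch.add(service.courses().students().create(
                courseId=course_id, body=body))
        batch.execute()
        time.sleep(PAUSE_SECONDS)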
Can anyone offer suggestions or solutions? Has anyone else experienced this lately?
Let me know if any additional information is needed.

Related

What is the difference between parsing betting website for live scores vs official website API?

I want to monitor live scores for some soccer matches. I have two ways to do this:
use the official API from the website (free)
parse the website's source code myself and extract the data from it (would need to do it every second)
What is the difference? Is calling the API faster?
This can depend on quite a lot outside this specific scenario, but given the context, yes, the API would be much faster. The difference is in what data is being sent, received, and parsed.
In either scenario you'd need some timer to tick and parse the results (website or API), so there's no performance difference in the "wait code"; the big difference is in the data itself that gets parsed. When you call the API, chances are you will send a specific parameter or call a specific function that indicates what you're looking for. Pseudo-code example:
SoccerSiteApi.GetValue(SCORE, team1, team2);
Or
SoccerSiteApi.GetCurrentScores(team1, team2);
By calling the API, you are only sending and receiving a few hundred bytes (or more, depending on the data) and getting back exactly what you want. You don't need to parse the scores out of the response, because the values sent back are the scores, so no processing time is spent doing anything additional with the data itself.
If, however, you were to parse the entire website, you would need to make an HTTP GET request (and all that entails) to fetch the entire page (which could be a couple of hundred KB, or even MB, depending on the content), then spend processing time extracting the exact data you were looking for, and then do this every second.
So the biggest difference is the amount of data and the time spent processing it.
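To make that concrete, here's a rough Python sketch of the two approaches; the API host, endpoint, JSON fields, and CSS selector are all made up for illustration:

import requests
from bs4 import BeautifulSoup   # only needed for the scraping approach

# API approach: a small, already-structured response.
def get_score_via_api(team1, team2):
    resp = requests.get('https://api.example-scores.com/v1/score',   # hypothetical endpoint
                        params={'team1': team1, 'team2': team2},
                        timeout=5)
    data = resp.json()            # a few hundred bytes of JSON
    return data['team1_score'], data['team2_score']

# Scraping approach: download the whole page, then dig the score out of the markup.
def get_score_via_scraping(match_url):
    resp = requests.get(match_url, timeout=5)    # potentially hundreds of KB of HTML
    soup = BeautifulSoup(resp.text, 'html.parser')
    node = soup.select_one('.live-score')        # hypothetical selector; breaks if the markup changes
    return node.get_text(strip=True)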
Hope that helps.

Using the Google Directory API to provision thousands of users

I'm trying to write an application that creates mail accounts for thousands of users using the Google Directory API. Creating them one by one works, but it is extremely slow. I tried using batch requests, which are supposed to support up to 1000 requests at once. With that, however, only around 50 users are created successfully and the rest of the requests fail with 403 errors. If I change the batch size to 40 instead, many requests after the first batch fail with 5xx errors.
If the batch requests are still subject to the same rate limits, they seem to be worthless, as I could just send those requests individually at that slow rate. Is there a better way to do this, or is there something else I should do instead?
Batching the requests will certainly save network round trips (which can be pretty expensive if you have thousands of users to process). However, the server will still have to execute the requests one by one, even if they are batched. Take a look at the Admin SDK documentation on batching:
https://developers.google.com/admin-sdk/directory/v1/guides/batch
The documentation includes this special note: "A set of n requests batched together counts toward your usage limit as n requests, not as one request. The batch request is taken apart into a set of requests before processing."
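In practice that means keeping the batches small and pacing them so the combined rate stays under the per-user quota, then retrying the parts that fail. A rough sketch with the Python client (the creds object, user bodies, and pacing values are assumptions to adapt):

import time
from googleapiclient.discovery import build

# Assumes `creds` is an authorized credential with the admin.directory.user scope
# and `user_bodies` is a list of User resource dicts.
service = build('admin', 'directory_v1', credentials=creds)

failed_ids = []

def on_result(request_id, response, exception):
    # Per-part 403/5xx errors arrive here, not as exceptions from execute().
    if exception is not None:
        failed_ids.append(request_id)

def create_users(user_bodies, batch_size=40):
    for start in range(0, len(user_bodies), batch_size):
        batch = service.new_batch_http_request(callback=on_result)
        for offset, body in enumerate(user_bodies[start:start + batch_size]):
            batch.add(service.users().insert(body=body),
                      request_id=str(start + offset))
        batch.execute()
        # The n batched inserts still count as n requests against the quota,
        # so pace the batches instead of firing them back to back.
        time.sleep(batch_size / 5.0)   # rough ceiling of ~5 requests/second
    return failed_ids                  # retry these separately, with backoff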

Why isn't an update reflected right away when posting changes to a Rally story via the REST API?

I've noticed when retrieving a story/defect after first updating it, sometimes the retrieve response returns the field values as if the update never happened. Retrying the retrieve after a short delay (~500ms) returns the updated field values as expected. Is this a known behaviour? Is there any way of avoiding this?
I'm using the Rally API 2.0 - https://rally1.rallydev.com/slm/webservice/v2.0/
The update is being performed using this URI:
POST /slm/webservice/v2.0/Defect/14173461229?key=<key> HTTP/1.1
I'm retrieving the story after update as follows:
GET /slm/webservice/v2.0/artifact?query=(ObjectId%20=%2014173461229)&start=1&pagesize=20&fetch=true HTTP/1.1
What is your integration doing that it needs to re-poll the artifact within less than a second of POSTing an update? Is there a second process doing the polling that is exposing the latency of the updates? Does your integration run multiple threads? Does the response time vary at all depending on time of day, etc.? There are any number of factors that could be at play here, but 500 ms doesn't seem like an unreasonable refresh interval given factors such as latency over HTTP/S as well as server-side database and cache updates. That said, for an in-depth look you may wish to contact Rally Support (rallysupport@rallydev.com), as they have tools that can help evaluate server-side response times for requests from a specific UserID.
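If your integration really does need to read the artifact back right away, one workable hedge is to re-poll with a short delay until the fetched value reflects your change. A rough Python sketch against the endpoints above (the session is assumed to already carry your Rally authentication, and the QueryResult layout shown is the usual WSAPI response structure):

import time
import requests

BASE = 'https://rally1.rallydev.com/slm/webservice/v2.0'

def fetch_artifact(session, object_id):
    resp = session.get(BASE + '/artifact',
                       params={'query': '(ObjectId = %d)' % object_id,
                               'start': 1, 'pagesize': 20, 'fetch': 'true'})
    return resp.json()['QueryResult']['Results'][0]

def wait_for_update(session, object_id, field, expected, delay=0.5, attempts=6):
    # Re-poll a few times; in practice the updated values show up within ~500 ms.
    for _ in range(attempts):
        artifact = fetch_artifact(session, object_id)
        if artifact.get(field) == expected:
            return artifact
        time.sleep(delay)
    raise RuntimeError('update not visible after %d attempts' % attempts)

# Usage (hypothetical field/value); `session` is a requests.Session with auth configured:
# wait_for_update(session, 14173461229, 'Severity', 'Major')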

Fetching stats on multiple tracks from soundcloud

I want to get all tracks with 0 or 1 plays, and I'm looking at the playback_count stat from the http://api.soundcloud.com/tracks/90891876.json?client_id=XXX URL, where playback_count is included in the JSON response. We have almost 1500 sound snippets. Is it possible to make a script that fetches this data ~1500 times, or will I get throttled for spamming the API? We will only use these stats a couple of times, to measure how our campaign to increase plays is going. Or is it possible to get this data in just one request?
I saw this question just earned the "Tumbleweed" badge and I felt bad.
If the tracks are all owned by the same user, you can use this endpoint:
http://api.soundcloud.com/users/{id}/tracks
If you just have a list of tracks, you can use this endpoint:
http://api.soundcloud.com/tracks?ids=123,234,765,456,etc
See the "filters" section of the docs here: http://developers.soundcloud.com/docs/api/reference#tracks
But keep in mind that although the HTTP spec does not impose a limit on the length of the query string, the default Apache settings will return an error somewhere around 4000 characters. That's probably around 400 tracks for this endpoint. Play around with it; SoundCloud may also have its own limit on the number of tracks per query.
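For example, a small script along these lines should pull all ~1500 counts in a few dozen requests; the chunk size is a guess you may need to tune against the limits mentioned above:

import requests

CLIENT_ID = 'XXX'     # your SoundCloud client_id
CHUNK = 50            # keep the query string comfortably under any length limits

def fetch_playback_counts(track_ids):
    counts = {}
    for start in range(0, len(track_ids), CHUNK):
        chunk = track_ids[start:start + CHUNK]
        resp = requests.get('http://api.soundcloud.com/tracks',
                            params={'ids': ','.join(str(i) for i in chunk),
                                    'client_id': CLIENT_ID})
        for track in resp.json():
            counts[track['id']] = track.get('playback_count', 0)
    return counts

# Tracks with 0 or 1 plays:
# low_plays = [tid for tid, c in fetch_playback_counts(all_ids).items() if c <= 1]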
Alternatively, you could embed your players on a website (server) and track them with Google Analytics. I made a script for that: http://vitorventurin.com/tracking-soundcloud-with-google-analytics/

How to skip known entries when syncing with Google Reader?

I'm writing an offline client for the Google Reader service and would like to know how best to sync with it.
There doesn't seem to be official documentation yet, and the best source I've found so far is this: http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI
Now consider this: with the information above, I can download all unread items, I can specify how many items to download, and using the Atom ID I can detect duplicate entries that I have already downloaded.
What's missing for me is a way to specify that I just want the updates since my last sync.
I can ask for the 10 latest entries (parameter n=10, parameter r=d). If I specify parameter r=o (date ascending) instead, then I can also specify parameter ot=[time of last sync], but only then, and the ascending order doesn't make sense when I just want to read some items rather than all of them.
Any idea how to solve this without downloading all items again and just rejecting duplicates? That's not a very economical way of polling.
Someone proposed that I specify that I only want the unread entries. But to make that solution work in such a way that Google Reader will not offer these entries again, I would need to mark them as read. In turn, that would mean I need to keep my own read/unread state on the client and that entries would already be marked as read when the user logs on to the online version of Google Reader. That doesn't work for me.
Cheers,
Mariano
To get the latest entries, use the standard newest-first (date descending) download, which starts from the latest entries. You will receive a "continuation" token in the XML result, looking something like this:
<gr:continuation>CArhxxjRmNsC</gr:continuation>
Scan through the results, pulling out anything new to you. You should find that either all results are new, or everything up to a point is new, and all after that are already known to you.
In the latter case, you're done, but in the former you need to find the new stuff older than what you've already retrieved. Do this by using the continuation to get the results starting from just after the last result in the set you just retrieved by passing it in the GET request as the c parameter, e.g.:
http://www.google.com/reader/atom/user/-/state/com.google/reading-list?c=CArhxxjRmNsC
Continue this way until you have everything.
The n parameter, which is a count of the number of items to retrieve, works well with this, and you can change it as you go. If the frequency of checking is user-set, and thus could be very frequent or very rare, you can use an adaptive algorithm to reduce network traffic and your processing load. Initially request a small number of the latest entries, say five (add n=5 to the URL of your GET request). If all are new, in the next request, where you use the continuation, ask for a larger number, say 20. If those are still all new, either the feed has a lot of updates or it's been a while, so continue on in groups of 100 or whatever.
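A loose sketch of that loop in Python (authentication headers and Atom parsing are glossed over; parse_atom_entries is an assumed helper that yields (id, entry) pairs in the order returned):

import re
import urllib.request

READING_LIST = ('http://www.google.com/reader/atom/user/-/state/'
                'com.google/reading-list')

def fetch_new_items(is_known, auth_headers, n=5):
    # Page backwards through the reading list until we hit an item we already have.
    continuation = None
    new_items = []
    while True:
        url = '%s?n=%d' % (READING_LIST, n)
        if continuation:
            url += '&c=' + continuation
        req = urllib.request.Request(url, headers=auth_headers)
        xml = urllib.request.urlopen(req).read().decode('utf-8')
        for entry_id, entry in parse_atom_entries(xml):   # assumed helper
            if is_known(entry_id):
                return new_items          # everything older is already stored
            new_items.append(entry)
        match = re.search(r'<gr:continuation>([^<]+)</gr:continuation>', xml)
        if not match:
            return new_items              # reached the end of the feed
        continuation = match.group(1)
        n = min(n * 4, 100)               # adaptive: ask for more each round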
However, and correct me if I'm wrong here, you also want to know, after you've downloaded an item, whether its state changes from "unread" to "read" due to the person reading it using the Google Reader interface.
One approach to this would be (a rough sketch in code follows these steps):
Update the status on Google of any items that have been read locally.
Check and save the unread count for the feed. (You want to do this before the next step, so that you guarantee that new items have not arrived between your download of the newest items and the time you check the read count.)
Download the latest items.
Calculate your read count, and compare that to Google's. If the feed has a higher read count than you calculated, you know that something's been read on Google.
If something has been read on Google, start downloading read items and comparing them with your database of unread items. You'll find some items that Google says are read that your database claims are unread; update these. Continue doing so until you've found a number of these items equal to the difference between your read count and Google's, or until the downloads get unreasonable.
If you didn't find all of the read items, c'est la vie; record the number remaining as an "unfound unread" total, which you also need to include in your next calculation of the local number you think are unread.
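A loose sketch of those steps, with reader and local_db as assumed wrappers whose method names are placeholders rather than a real API:

def reconcile_read_state(reader, local_db, feed, max_scan=300):
    # 1. Push local reads up to Google first.
    for item_id in local_db.locally_read_items(feed):
        reader.mark_as_read(item_id)

    # 2. Snapshot Google's unread count before downloading anything new.
    google_unread = reader.unread_count(feed)

    # 3. Download the latest items (the continuation loop from earlier).
    for item in reader.fetch_new_items(feed):
        local_db.store(item)

    # 4. Compare counts; a gap means something was read in the web interface.
    gap = local_db.unread_count(feed) - google_unread
    if gap <= 0:
        return

    # 5. Walk Google's read items until the gap is accounted for (or we give up).
    scanned = 0
    for item in reader.fetch_read_items(feed):
        scanned += 1
        if local_db.is_unread(item.id):
            local_db.mark_read(item.id)
            gap -= 1
        if gap == 0 or scanned >= max_scan:
            break

    # Anything still unaccounted for becomes the "unfound unread" total.
    local_db.set_unfound_unread(feed, gap)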
If the user subscribes to a lot of different blogs, it's also likely he labels them extensively, so you can do this whole thing on a per-label basis rather than for the entire feed. That should help keep the amount of data down, since you won't need to do any transfers for labels where the user didn't read anything new on Google Reader.
This whole scheme can be applied to other statuses, such as starred or unstarred, as well.
Now, as you say, this
...would mean that I need to keep my own read/unread state on the client and that the entries are already marked as read when the user logs on to the online version of Google Reader. That doesn't work for me.
True enough. Neither keeping a local read/unread state (since you're keeping a database of all of the items anyway) nor marking items read in Google (which the API supports) seems very difficult, so why doesn't this work for you?
There is one further hitch, however: the user may mark something that was read as unread on Google. This throws a bit of a wrench into the system. My suggestion there, if you really want to try to take care of this, is to assume that the user will generally only touch more recent stuff, and to download the latest couple of hundred items every time, checking the status on all of them. (This isn't all that bad; downloading 100 items took me anywhere from 0.3 s for 300 KB to 2.5 s for 2.5 MB, albeit on a very fast broadband connection.)
Again, if the user has a large number of subscriptions, he's also probably got a reasonably large number of labels, so doing this on a per-label basis will speed things up. I'd suggest, actually, that not only do you check on a per-label basis, but you also spread out the checks, checking a single label each minute rather than everything once every twenty minutes. You can also do this "big check" for status changes on older items less often than you do a "new stuff" check, perhaps once every few hours, if you want to keep bandwidth down.
This is a bit of a bandwidth hog, mainly because you need to download the full article from Google merely to check the status. Unfortunately, I can't see any way around that in the API docs we have available to us. My only real advice is to minimize the checking of status on non-new items.
The official Google Reader API hasn't been released yet; when it is, this answer may change.
Currently, you would have to call the unofficial API and disregard items you've already downloaded, which, as you said, isn't terribly efficient, since you will be re-downloading items every time even if you already have them.