Using the Google Directory API to provision thousands of users - batch-processing

I'm trying to write an application that creates mail accounts for thousands of users using the Google Directory API. Creating them one by one works, but is extremely slow. I tried to use batch requests, which are supposed to support up to 1000 requests at once. However, with that, only around 50 users are created successfully and the rest of the requests throw 403 errors. If I change the batch size to 40 instead, after the first batch many requests fail with 5xx errors.
If the batch requests are still limited by the same rate limits, they seem to be worthless, as I could just send those requests individually at that slow rate. Is there a better way to do this, or is there something else I should do instead?

Batching the requests will certainly save network round trips (which can be pretty expensive if you have thousands of users to process). However, the server will still have to execute the requests one by one even if they are batched. Take a look at the Admin SDK documentation on batching:
https://developers.google.com/admin-sdk/directory/v1/guides/batch
The special note there is explicit: "A set of n requests batched together counts toward your usage limit as n requests, not as one request. The batch request is taken apart into a set of requests before processing."
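For what it's worth, a minimal sketch of the throttled-batch pattern with the Google API Python client (the batch size of 50 and the pause are guesses to tune against your actual quota; `creds` and `users` are assumed to exist):

```python
import time
from googleapiclient.discovery import build

BATCH_SIZE = 50   # well below the documented 1000; tune against your quota
PAUSE = 2.0       # seconds between batches; also tune

failed = []

def on_response(request_id, response, exception):
    # Every request in the batch gets its own response; collect failures to retry.
    if exception is not None:
        failed.append(request_id)

def insert_users(service, users):
    for i in range(0, len(users), BATCH_SIZE):
        batch = service.new_batch_http_request(callback=on_response)
        for user in users[i:i + BATCH_SIZE]:
            batch.add(service.users().insert(body=user))
        batch.execute()    # still counts as one quota unit per user, not per batch
        time.sleep(PAUSE)  # spread batches out to stay under the rate limit

# service = build('admin', 'directory_v1', credentials=creds)
# insert_users(service, users)  # users: list of User resource dicts
```

Batching still saves you the round trips; the sleep between batches is what keeps you under the per-second quota.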

Related

Get more than 100 results from the GitHub API

I want to develop a GitHub-like issue tracker.
For that I have been working with the API below:
https://api.github.com/repos/facebook/react/issues?per_page=100
But this API returns only 100 results per request, as per the docs.
Is there a way I can get all of the issues and not just 100? I can make multiple requests, but I don't think that's a feasible way of doing it.
The issue object itself contains the author, labels, and assignees, so I need all the results at once.
Is there any way to do it?
No, there is no way to get all of the results without pagination. GitHub, like almost all major web sites, has a time limit on the amount of time a request can take. If you have a repository with, say, 150 000 issues, then any reasonable operation on all of those issues will take longer than the timeout. Therefore, it doesn't make sense for GitHub to allow you to disable pagination in this way because the request would invariably fail anyway.
Even if you use the GraphQL API, you still get a limited number of results. If you want to fetch all of the issues, you'll need to make multiple requests.
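A minimal sketch of that pagination loop in Python with `requests` (unauthenticated requests are rate-limited to 60 per hour, so pass a token for anything real):

```python
import requests

def fetch_all_issues(owner, repo, token=None):
    headers = {"Authorization": f"token {token}"} if token else {}
    issues, page = [], 1
    while True:
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}/issues",
            params={"per_page": 100, "page": page, "state": "all"},
            headers=headers,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:  # an empty page means we've seen everything
            return issues
        issues.extend(batch)
        page += 1

# issues = fetch_all_issues("facebook", "react", token="...")
```

For very large repositories you would follow the `Link: rel="next"` header instead of counting pages, but the idea is the same.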

Google Classroom API suddenly returning quota errors

I have routines that synchronize class/roster information between an SIS and Google Classroom. Everything had been running smoothly until very recently (11/1/2016). Now we're seeing the following message in all of our log files for routines that handle Classroom syncs:
Insufficient tokens for quota group and limit 'DefaultGroupUSER-100s' of service 'classroom.googleapis.com', using the limit by ID...
We perform batch requests whenever possible and these errors are showing up in individual batch "part" responses. The fact that these errors suddenly started showing up for ALL of our Classroom routines makes me think that something changed on the Google end of things.
I've been playing around with the throttling on our end, changing both the number of requests we send in each batch (the docs say you can send 1000 per batch) and the total number of requests/batches we send per 100 seconds (the docs say you can send 50/s per client and also 5/s per user). Interestingly, the quotas shown in the developer console are slightly different, but I assume they are to be interpreted in conjunction with one another.
I've throttled things down to the point where we're not even getting close to 5 requests per second and I'm still getting these errors back from the server.
Can someone provide some suggestions or solutions? Has anyone experienced this lately?
Let me know if any additional information is needed.
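For reference, the standard mitigation for this class of error is exponential backoff: retry each failing request (or batch part) after an increasing delay. A rough sketch of the pattern, with the quota-error check simplified:

```python
import random
import time

def execute_with_backoff(request, max_retries=5):
    """Retry a googleapiclient request, backing off on quota errors."""
    for attempt in range(max_retries):
        try:
            return request.execute()
        except Exception as err:  # in practice, catch googleapiclient.errors.HttpError
            if "quota" not in str(err).lower() and "Insufficient tokens" not in str(err):
                raise
            # Wait 1s, 2s, 4s, ... plus jitter before retrying.
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError("still hitting quota errors after retries")
```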

Present variable information within a single mturk HIT

I'd like to use mturk to have 10 workers visit my website, log in with a test account, and enter some information on their profile. I don't want them to see each other's entries, so each worker should get login information for a different test account when they view the HIT.
This almost looks like what mturk's template feature is for -- I could upload a CSV with the information for each test account. But if I understand correctly, that will make 10 separate HITs, and allow one worker to do all 10 of them. Is there any way to have mturk put information that varies between workers into a single HIT?
Here are the solutions I'm currently aware of:
Use the CLI to automate creation of a bunch of different HITs. This would be a lot of work, and would also make approving and retrieving the results cumbersome.
Direct workers to a survey website that's capable of doing what I want, and have them get the login information there.
Dynamically fill in part of the HIT using an AJAX request to an external website and database (sketched below). That seems like crazy overkill for something so simple.
Are there other options?
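For scale, the third option can be smaller than it sounds. A minimal sketch of a credential-dispensing endpoint in Flask (all names hypothetical; the HIT would fetch it once per worker, and a real version needs CORS headers and persistent storage):

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical pool of test accounts; in practice this would live in a database.
accounts = [{"user": f"test{i}", "password": "changeme"} for i in range(10)]

@app.route("/next-account")
def next_account():
    # Hand each caller the next unused account. A real version would record
    # which worker received it and handle the pool running out gracefully.
    if not accounts:
        return jsonify({"error": "no accounts left"}), 410
    return jsonify(accounts.pop())
```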

Fetching stats on multiple tracks from soundcloud

I want to get all tracks with 0 to 1 plays, and I'm looking at the playback_count stat from the http://api.soundcloud.com/tracks/90891876.json?client_id=XXX URL, where playback_count is included in the JSON response. We have almost 1500 sound snippets. Is it possible to make a script that fetches this data ~1500 times, or will I get throttled for spamming the API? We will only use these stats a couple of times, to measure how our campaign to increase plays is going. Or is it possible to get this data in just one request?
I saw this question just earned the "Tumbleweed" badge and I felt bad.
If the tracks are all owned by the same user, you can use this endpoint:
http://api.soundcloud.com/users/{id}/tracks
If you just have a list of tracks, you can use this endpoint:
http://api.soundcloud.com/tracks?ids=123,234,765,456,etc
See the "filters" section of the docs here: http://developers.soundcloud.com/docs/api/reference#tracks
But keep in mind that although the HTTP spec does not impose a limit on the length of the query string, default Apache settings will return an error somewhere around 4000 characters. That's probably around 400 tracks for this endpoint. Play around with it; SoundCloud may also have its own limit on the number of tracks per query.
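A sketch of the chunked fetch against the ids filter (the chunk size is a guess that keeps the query string short, and the `.json` suffix mirrors the URL in the question; `CLIENT_ID` is your own key):

```python
import requests

CLIENT_ID = "XXX"        # your client_id
track_ids = [90891876]   # extend with the rest of your ~1500 ids
CHUNK = 50               # keeps the query string well under the ~4000-char limit

quiet_tracks = []
for i in range(0, len(track_ids), CHUNK):
    ids = ",".join(str(t) for t in track_ids[i:i + CHUNK])
    resp = requests.get(
        "http://api.soundcloud.com/tracks.json",
        params={"ids": ids, "client_id": CLIENT_ID},
    )
    resp.raise_for_status()
    for track in resp.json():
        if track["playback_count"] <= 1:
            quiet_tracks.append(track["id"])

print(len(quiet_tracks), "tracks with 0-1 plays")
```

That is ~30 requests for 1500 tracks instead of 1500, which should keep you well clear of any throttling.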
You could embed your players on a website (server) and track them with Google Analytics. I made a script for that: http://vitorventurin.com/tracking-soundcloud-with-google-analytics/

Many requests in an API vs Many separate requests

I am making an application based on the Google Maps API. It requires requesting the distance between two cities, and I want distances between many cities.
So should I use a for loop and make many separate requests, or should I send all the city names in one request? Which one will work faster? And which one is better?
For sure you should avoid sending multiple requests, because each roundtrip to a server takes time.
However, when you group many requests, the single combined request can also take a long time (both to send and to process on the server) and hurt the user experience (a long wait).
In your case I suspect that the for loop would not lead to a lot of data, and server-side processing will not be too heavy either, so sending a single grouped request should be the way to go.
You can use the DirectionsService provided by Google in the Maps JavaScript API (v3).
You can find the distance between many cities: a single request takes one origin point, one destination point, and up to 8 waypoints (10 places total), and returns JSON containing all the information (distance in km, value in meters, city names, and a lot more). Please check this link: https://developers.google.com/maps/documentation/javascript/directions . I hope this answer meets your requirement.
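To make the grouped request concrete, here is a sketch against the Directions web service (the JSON flavor of the same API; `YOUR_API_KEY` is a placeholder):

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

def route_distances(origin, destination, waypoints):
    """One request covering up to 10 places: origin + 8 waypoints + destination."""
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/directions/json",
        params={
            "origin": origin,
            "destination": destination,
            "waypoints": "|".join(waypoints),
            "key": API_KEY,
        },
    )
    resp.raise_for_status()
    legs = resp.json()["routes"][0]["legs"]
    # Each leg is the stretch between two consecutive places on the route.
    return [(leg["start_address"], leg["end_address"], leg["distance"]["value"])
            for leg in legs]

# e.g. route_distances("Berlin", "Munich", ["Leipzig", "Nuremberg"])
# distance values come back in meters
```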