IBM Watson Concept Insights get related concepts (corpus) using cURL timing out - api

I am getting this error -> {"code": 500, "message": "Forwarding error"} every time I try to get related concepts from my private account and corpus. The error seems to be a timeout, since the request always dies at 2:30 (2 minutes 30 seconds).
I've replaced the sample provided by IBM to point to my account and corpus. Does anybody know why this is occurring?
curl -u "{username}":"{password}" "https://gateway.watsonplatform.net/concept-insights/api/v2/corpora/accountid/corpus/related_concepts?limit=3&level=0"
cURL result
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 44 0 44 0 0 0 0 --:--:-- 0:02:30 --:--:-- 10
Corpus status
{"id":"/corpora/accountid/corpus","documents":10,"last_updated":"0001-01-01T00:00:00Z","build_status":{"ready":10,"error":0,"processing":0}}
NOTE: I do not get this error if I use the public example provided by IBM on the API. I have also masked my account id, corpus, username, and password for this public posting.

Unfortunately, since the error is corpus specific (you mentioned you can get the API to work on the public corpus), we would need to know more information about your corpus (like the account id and corpus id) in order to help you out.
One way to provide this information privately is to open a ticket with the Bluemix support system (there are 2 options described here):
https://developer.ibm.com/bluemix/support/#support
If you list the "Watson Concept Insights" service in the ticket, we will get your information.

Related

How to filter the response via API

Wanted to know if this is possible. I have 2 APIs I am testing.
API 1 gives a list of the total jobs posted by the user.
Response =
"jobId": 15596, "jobTitle": "PHP developer"
API 2 gives the following response:
"total CVs": 19, "0-7days": 12, "status": "New Resume"
meaning that in the "New Resume" bucket we have a total of 19 CVs, and of those 19 CVs, 12 have an aging of 0-7 days. This response relates to the jobs posted.
When I hit the APIs I get the correct numbers, but on the front end API 1 will be used as a dropdown to select a job, and then New Resume, aging, and total CVs will be shown according to that job.
What I wanted to know is whether it is possible to test the two APIs together, sort of using a filter like on the front end, or whether the only way to test is to check that the response I am getting is correct.
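A minimal sketch of what testing the two APIs together (the way the front end chains them) could look like in Python: pick a job from API 1, then check API 2's numbers for that job. The URLs, parameter names, and auth below are assumptions for illustration only, not your real endpoints.
import requests

BASE = "https://example.com/api"  # hypothetical base URL, replace with the real one

def test_cv_counts_per_job(session):
    # API 1: list of jobs posted by the user (hypothetical path)
    jobs = session.get(f"{BASE}/jobs").json()
    for job in jobs:
        # API 2: CV stats for the selected job (hypothetical path and parameter name)
        stats = session.get(f"{BASE}/cv-stats", params={"jobId": job["jobId"]}).json()
        # The bucket total should never be smaller than any of its aging buckets
        assert stats["total CVs"] >= stats["0-7days"], job["jobId"]

if __name__ == "__main__":
    s = requests.Session()
    # s.auth = ("user", "password")  # whatever auth the APIs require
    test_cv_counts_per_job(s)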

How to make sense of People API responses?

When calling People API's endpoints, especially in Batch requests, we're getting many different types of error responses.
Some have useful explanation in the error message, like:
Quota exceeded for quota metric 'Daily Contact Writes (Batch requests
cost 200 quota)' and limit 'Daily Contact Writes (Batch requests cost
200 quota) per day per user' of service 'people.googleapis.com' for
consumer 'project_number:XXX'.
You can detect these and handle them properly, e.g. wait for 24 hours before retrying that request, but some are more cryptic, such as:
Resource has been exhausted (e.g. check quota).
This does mention rate-limiting, but for which quota? Is it per-user or per GCP project? When can we retry this?
Note that we're getting this for the first batch call when syncing a user account, so I'm guessing this is not per-user quota, but there's no mention of such rate-limits in the docs.
Specifically, having issues handling:
"Sync quota exceeded"
"Resource has been exhausted (e.g. check quota)"
"MY_CONTACTS_OVERFLOW_COUNT"
Here's what I have so far, feel free to edit this answer to add more insights:
Authentication or Google backend issues:
"invalid_grant": bad access token
"Insufficient Permission": access token doesn't contain required scope
"The service is currently unavailable.": Google issue
"Internal error encountered.": Google issue
"Authentication backend unavailable.": Google issue
Quota and rate-limiting:
"Sync quota exceeded": ???
"Quota exceeded for quota metric X": A specific quota had been exceeded (per min / daily will be part of the message)
"Resource has been exhausted (e.g. check quota)": ???
"MY_CONTACTS_OVERFLOW_COUNT": ???
Bad requests:
"Request contains an invalid argument": something is wrong in the request, usually a Person object with some illegal info item
"Request contains a person.etag that is different than the current person.etag": An attempt to update a person that was recently updated on Google's side, need to fetch again
"Request person.etag is different than the current person.etag": same as above
"Requested entity was not found": An attempt to update a no-longer existing person
"Contact person resources are not found": same as above
"Contact group name is empty, expected to be non empty": An attempt to create/update a group with an empty name.
"Contact group name already exists": An attempt to create a group with the same name
"MY_CONTACTS_OVERFLOW_COUNT" happens when you try to insert contacts to a google account, but they already have the maximum number of contacts.
I am not 100% sure, but this limit seems to be ~20,000 for "normal"/"free" google accounts.
edit- The limit is 25,000, since 2011: https://workspaceupdates.googleblog.com/2011/05/need-more-contacts-in-gmail-contacts.html
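As a rough illustration (not an official mapping), the observations above can be turned into a simple retry policy. The categories and wait times below are assumptions based on this list, not documented behaviour:
RETRYABLE_BACKEND = (
    "The service is currently unavailable.",
    "Internal error encountered.",
    "Authentication backend unavailable.",
)

def seconds_before_retry(message, attempt):
    """Return how long to wait before retrying a failed People API call,
    or None if retrying will not help."""
    if any(msg in message for msg in RETRYABLE_BACKEND):
        return min(2 ** attempt, 60)          # transient Google issue: exponential backoff
    if "Quota exceeded for quota metric" in message and "per day" in message:
        return 24 * 3600                      # daily quota: retry the next day
    if "Sync quota exceeded" in message or "Resource has been exhausted" in message:
        return min(60 * 2 ** attempt, 3600)   # unknown quota: back off aggressively
    if "MY_CONTACTS_OVERFLOW_COUNT" in message:
        return None                           # account already at the contact limit
    return None                               # bad request: fix the request instead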

How to build an ongoing alert that catches sudden spikes for a certain http error code?

I could really use an ongoing alert that catches a sudden rise (spike) in a certain error code (such as 404 or 502, etc.).
I tried giving this some thought on how to achieve that, and... Well... I could really use your help with the script :-)
From my understanding, the search query should "know", or "sense", the normal traffic (not sure for how long, maybe 1-2 hours) and alert when there is a spike in the error code compared to 1-2 hours ago.
I think the error code spike threshold should be more than 5% of total traffic, while occurring for longer than 90 seconds.
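Sketched roughly in Python, just to pin down the rule I have in mind (the sampling interval, the baseline factor, and the data shape are assumptions, not requirements):
def should_alert(samples, error_code, baseline_pct, pct_threshold=5.0, duration_s=90):
    """samples: list of (timestamp_seconds, {status_code: count}) buckets,
    baseline_pct: the error code's share of traffic 1-2 hours ago."""
    breach_started = None
    for ts, counts in samples:
        total = sum(counts.values()) or 1
        pct = 100.0 * counts.get(error_code, 0) / total
        # a spike = above the absolute threshold AND well above the earlier baseline
        if pct > pct_threshold and pct > 2 * baseline_pct:
            if breach_started is None:
                breach_started = ts
            if ts - breach_started >= duration_s:
                return True
        else:
            breach_started = None
    return False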
Here is a Splunk Query I use today, I appreciate your help tuning it to what I described above:
tag=NginxLogs host=www1 OR host=www2 |stats count by status|eventstats sum(count) as total|eval perc=round((count/total)*100,2)|where status="404" AND perc>5
The top command automatically provides the count and percent.
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Top
tag=NginxLogs host=www1 OR host=www2
| top status
| search percent > 5 AND status > 399
If you have the url, http request method, and user in your splunk logs, you can add them as part of this alert. Example:
tag=NginxLogs host=www1 OR host=www2
| eventstats distinct_count(userid) as NoOfUsersAffected by requestUri,status,httpmethod
| top status,httpmethod,NoOfUsersAffected by requestUri
| search NoOfUsersAffected > 2 AND ((status>499 AND percent > 5) OR (status=400 AND percent > 95))
You can use the following alert message:
$result.percent$ % ($result.count$ calls) has StatusCode $result.status$ for
$result.requestUri$ - $result.httpmethod$.
$result.NoOfUsersAffected$ users were affected
You will get an alert like:
21.19 % (850 calls) has StatusCode 500 for https://app.test.com/hello - GET.
90 users were affected

Podio Create Item rate limit after 25 calls

I have to create items in Podio using the API. When I let my program run at full speed I noticed that after 5-6 items I get an error response from Podio saying:
{
"error_propagate":false,
"error":"rate_limit",
"error_description":"You have hit the rate limit. Please wait 300 seconds before trying again",
"request":{
"url":"http://api.podio.com/oauth/token",
"query_string":"",
"method":"POST"
}
}
I thought the rate limit was 5000 calls/hour, and I get this error after 25 calls...
I added a thread.sleep in my code, and now it seems to be better, but even when I let the thread sleep for 10 seconds I still got this error; I have now set the thread.sleep to 20 seconds and it seems to work.
Is there a hidden rate limit on the number of calls per second?
I think you are using username/password authentication here. The token request endpoint has a lower limit, from my experience. So the best way to solve this is to store and reuse the access tokens instead of re-authenticating every time your program runs.
The Podio API client libraries provide convenience methods to do this. See these links:
http://podio.github.io/podio-dotnet/sessions/
http://podio.github.io/podio-php/sessions
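As a rough Python sketch of the same token-reuse idea (not the official client libraries linked above; the cache file and the grant parameters are assumptions based on the standard OAuth password flow, so check Podio's docs for the exact fields):
import json, os, time
import requests

TOKEN_FILE = "podio_token.json"  # hypothetical cache location

def get_access_token(client_id, client_secret, username, password):
    # Reuse a cached token if it has not expired yet
    if os.path.exists(TOKEN_FILE):
        with open(TOKEN_FILE) as f:
            cached = json.load(f)
        if cached.get("expires_at", 0) > time.time() + 60:
            return cached["access_token"]
    # Otherwise authenticate once and cache the result
    resp = requests.post("https://api.podio.com/oauth/token", data={
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
    })
    resp.raise_for_status()
    token = resp.json()
    token["expires_at"] = time.time() + token.get("expires_in", 3600)
    with open(TOKEN_FILE, "w") as f:
        json.dump(token, f)
    return token["access_token"]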
The rate limit is 1000 calls/hour, so you can add sleeps accordingly.

How to avoid hitting the 10 sec limit per user

We run multiple short queries in parallel, and hit the 10 sec limit.
According to the docs, throttling might occur if we hit a limit of 10 API requests per user per project.
We send a "start query job", and then we call the "getGueryResutls()" with timeoutMs of 60,000, however, we get a response after ~ 1 sec, we look for JOB Complete in the JSON response, and since it is not there, we need to send the GetQueryResults() again many times and hit the threshold, that is causing an error, not a slowdown. the sample code is below.
Our questions are:
1. What is a "user"? Is it an App Engine user, or a user id that we can put in the connection string or in the query itself?
2. Is it really per BigQuery API project?
3. What is the behavior? We got an error, "Exceeded rate limits: too many user/method api request limit for this user_method", and not the throttling behavior the docs describe, and our whole process fails.
4. As seen below in the code, why do we get the response after 1 sec and not according to our timeout? Are we doing something wrong?
Thanks a lot.
Here is a code sample:
while res is None or 'jobComplete' not in res or not res['jobComplete']:
    try:
        res = self.service.jobs().getQueryResults(projectId=self.project_id,
                                                  jobId=jobId, timeoutMs=60000,
                                                  maxResults=maxResults).execute()
    except HTTPException:
        if independent:
            raise
Are you saying that even though you specify timeoutMs=60000, it is returning within 1 second but the job is not yet complete? If so, this is a bug.
The quota limits for getQueryResults are actually currently much higher than 10 requests per second. The reason the docs say only 10 is because we want to have the ability to throttle it down to that amount if someone is hitting us too hard. If you're currently seeing an error on this API, it is likely that you're calling it at a very high rate.
I'll try to reproduce the problem where we don't wait for the timeout ... if that is really what is happening it may be the root of your problems.
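In the meantime, one way to keep the request rate down is to back off whenever that rate-limit error comes back. A rough sketch, assuming the google-api-python-client HttpError and matching on the error message (an assumption, not an official error code):
import time
from googleapiclient.errors import HttpError

def get_query_results_with_backoff(service, project_id, job_id, max_results, max_attempts=8):
    """Poll getQueryResults, sleeping exponentially longer each time the rate limit is hit."""
    for attempt in range(max_attempts):
        try:
            return service.jobs().getQueryResults(
                projectId=project_id, jobId=job_id,
                timeoutMs=60000, maxResults=max_results).execute()
        except HttpError as err:
            if 'Exceeded rate limits' in str(err):
                time.sleep(2 ** attempt)  # back off before polling again
                continue
            raise
    raise RuntimeError('getQueryResults kept hitting the rate limit')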
def query_results_long(self, jobId, maxResults, res=None):
    start_time = query_time = None
    while res is None or 'jobComplete' not in res or not res['jobComplete']:
        if start_time:
            logging.info('requested for query results ended after %s', query_time)
        time.sleep(2)
        start_time = datetime.now()
        res = self.service.jobs().getQueryResults(projectId=self.project_id,
                                                  jobId=jobId, timeoutMs=60000,
                                                  maxResults=maxResults).execute()
        query_time = datetime.now() - start_time
    return res
Then in the App Engine log I had this:
requested for query results ended after 0:00:04.959110