TableData.insertAll with templateSuffix - frequent 503 errors - google-bigquery

We are using TableData.insertAll with a templateSuffix and are experiencing frequent 503 errors with our usage pattern.
We set the templateSuffix based on two pieces of information: the name of the event being inserted and the date of the event being inserted, e.g. 'NewPlayer20160712'. The table ID is set to 'events'.
In most cases this works as expected, but relatively often an insert fails and returns an error. Approximately 1 in every 200 inserts fails, which seems far too frequent to be expected behaviour.
The core of our event ingestion service looks like this:
// Handle all rows in rowsBySuffix
async.mapLimit(Object.keys(rowsBySuffix), 5, function(suffix) {
  // Construct request for suffix
  var request = {
    projectId: "tactile-analytics",
    datasetId: "discoducksdev",
    tableId: "events",
    resource: {
      "kind": "bigquery#tableDataInsertAllRequest",
      "skipInvalidRows": true,
      "ignoreUnknownValues": true,
      "templateSuffix": suffix, // E.g. NewPlayer20160712
      "rows": rowsBySuffix[suffix]
    },
    auth: jwt // valid google.auth.JWT instance
  };
  // Insert all rows into BigQuery
  var cb = arguments[arguments.length - 1];
  bigquery.tabledata.insertAll(request, function(err, result) {
    if (err) {
      console.log("Error insertAll. err=" + JSON.stringify(err) + ", request.resource=" + JSON.stringify(request.resource));
    }
    cb(err, result);
  });
}, arguments[arguments.length - 1]);
A typical error would look like this:
{
   "code": 503,
   "errors": [
      {
         "domain": "global",
         "reason": "backendError",
         "message": "Error encountered during execution. Retrying may solve the problem."
      }
   ]
}
The resource part for the insertAll that fails looks like this:
{
   "kind": "bigquery#tableDataInsertAllRequest",
   "skipInvalidRows": true,
   "ignoreUnknownValues": true,
   "templateSuffix": "GameStarted20160618",
   "rows": [
      {
         "insertId": "1f4786eaccd1c16d7ce865fea4c7af89",
         "json": {
            "eventName": "gameStarted",
            "eventSchemaHash": "unique-schema-hash-value",
            "eventTimestamp": 1466264556,
            "userId": "f769dc78-3210-4fd5-a2b0-ca4c48447578",
            "sessionId": "821f8f40-ed08-49ff-b6ac-9a1b8194286b",
            "platform": "WEBPLAYER",
            "versionName": "1.0.0",
            "versionCode": 12345,
            "ts_param1": "2016-06-04 00:00",
            "ts_param2": "2014-01-01 00:00",
            "i_param0": 598,
            "i_param1": 491,
            "i_param2": 206,
            "i_param3": 412,
            "i_param4": 590,
            "i_param5": 842,
            "f_param0": 5945.442,
            "f_param1": 1623.4111,
            "f_param2": 147.04747,
            "f_param3": 6448.521,
            "b_param0": true,
            "b_param1": false,
            "b_param2": true,
            "b_param3": true,
            "s_param0": "Im guesior ti asorne usse siorst apedir eamighte rel kin.",
            "s_param1": "Whe autiorne awayst pon, lecurt mun.",
            "eventHash": "1f4786eaccd1c16d7ce865fea4c7af89",
            "collectTimestamp": "1468346812",
            "eventDate": "2016-06-18"
         }
      }
   ]
}
We have noticed that, if we avoid including the name of the event in the suffix (e.g. the NewPlayer part) and instead just have the date as the suffix, then we never experience these errors.
Is there any way that this can be made to work reliably?

Backend errors happen; we usually see about 5 in every 10,000 requests, and we simply retry. Since you see a more constant rate and can provide a reproducible use case, file a ticket on the BigQuery issue tracker. That way, if there is something wrong with your project, it can be investigated.
https://code.google.com/p/google-bigquery/
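Since the 503 backendError shown above is explicitly marked as retryable, a minimal sketch of wrapping the insertAll call from the question in a retry loop with exponential backoff could look like this (the retryInsertAll helper and the retry limits are illustrative assumptions, not part of the original service):
// Retry insertAll a few times with exponential backoff before reporting the error.
// "bigquery" and "request" are the objects from the question's code; the
// limits and delays below are assumptions, so adjust them to your own setup.
function retryInsertAll(request, attempt, maxAttempts, cb) {
  bigquery.tabledata.insertAll(request, function(err, result) {
    var retryable = err && (err.code === 500 || err.code === 503);
    if (retryable && attempt < maxAttempts) {
      var delayMs = Math.pow(2, attempt) * 500; // 500 ms, 1 s, 2 s, ...
      return setTimeout(function() {
        retryInsertAll(request, attempt + 1, maxAttempts, cb);
      }, delayMs);
    }
    cb(err, result);
  });
}

// Usage inside the mapLimit iterator shown in the question:
// retryInsertAll(request, 0, 5, cb);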

Related

Aggregating Fields from AWS CloudWatch Logs

I am pretty new to CloudWatch, unfortunately, and have a rather complex task. After some extensive work, we have logs that break down all of our clicks into the reasons each click was tossed or did not work as well as it could have. Each log comes in with a title of "Successful Parse Report".
Successful Parse Report: {
  "<IDENTIFIER>": {
    "b39fjoa...................................................": {
      "errorCounts": {},
      "recordCounts": {
        "optouts": 0,
        "clicks": 100
      },
      "warningCounts": {
        "<FIELD 1>": {
          "<REASON1>": 1
        },
        "<FIELD2>": {
          "<REASON2>": 100
        },
        "<FIELD3>": {
          "<REASON3>": 100
        },
        "<FIELD4>": {
          "<REASON4>": 12
        },
        "<FIELD5>": {
          "<REASON5>": 9
        },
        "<FIELD6>": {
          "<REASON6>": 14
        }
      }
    }
  }
}
As you can see, there are a few fields. This means 100 clicks happened, and each "REASON" is a different thing I need aggregated, with the "FIELD" as the identifier on the graph. I also need this done across all logs for each of the lambdas I am using this for.
My question is: knowing that each log I am trying to aggregate starts with the same "Successful Parse Report", how can I aggregate all of the clicks from all of my logs, aggregate each "REASON" across all of the same logs with the "FIELD" as the identifier for each "REASON", and graph it so I can visualize these counts in real time as they come in?

Apache superset with mongoDB(NO SQL database)

I am using MongoDB. My task is to build dashboard charts for the data, so I am using Apache Superset. I connected MongoDB to Apache Drill, as it won't connect directly with Superset, and then connected Apache Drill to Apache Superset. My collection is nested. How can I process this nested data so it can be used for dashboard charts? My data looks as below:
{
  "_id": {
    "$oid": "6229d3cfdbfc81a8777e4821"
  },
  "jobs": [
    {
      "job_ID": {
        "$oid": "62289ded8079821eb24760e0"
      },
      "New": false,
      "Expired": false
    },
    {
      "job_ID": {
        "$oid": "6228a252fb4554dd5c48202a"
      },
      "New": true,
      "Expired": true
    },
    {
      "job_ID": {
        "$oid": "622af1c391b290d34701af9f"
      },
      "New": true,
      "Expired": false
    }
  ],
  "email": "mani2090996#ail.com"
}
I am querying in Apache Drill as follows:
SELECT flat.fill FROM (SELECT FLATTEN(t.jobs) AS fill FROM mongo.recruitingdb.flatten.`Vendorjobs` t) flat WHERE flat.fill.New = flase;
And I am getting a parsing error:
org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Encountered "." at line 1, column 123.
Superset doesn't really handle nested data very well. Drill does however, so you'll have to craft queries to produce columns that can be visualized.
Take a look here: https://drill.apache.org/docs/json-data-model/
and here: https://drill.apache.org/docs/querying-complex-data-introduction/.
UPDATE:
Try the query below. The FROM clause may not be exactly right, but you should get the idea from this.
Note that you can access maps in Drill in two ways:
tablename.mapname.field OR
mapname['field']
You can do this for any level of nesting.
SELECT mongoTable.jobs.job_ID.`$oid` AS job_ID,
       mongoTable.jobs.`New` AS new,
       mongoTable.jobs.`Expired` AS expired
FROM (
  SELECT flatten(jobs) AS jobs
  FROM mongo.recruitingdb.flatten.`Vendorjobs` AS t1
  WHERE t1.jobs.New = false
) AS mongoTable

Karate API : * def from response is throwing syntax error

I am setting up an E2E test and chaining my request/responses. I am defining variables from each response and using them in the next call.
It's working up to a point, and then a problem surfaces when defining variables off the 2nd response.
If I def operationId, operationSubject, or operationStatus (e.g. response.operationId), it works.
If I store anything from the results (e.g. response.results.0.personId), it throws this error:
Expected ; but found .0
response.results.0.personId
My response:
{
  "operationId": "922459ecxxxxx",
  "operationSubject": "BATCH_ENROLLMENT",
  "operationStatus": "PROCESSED",
  "results": {
    "0": {
      "personId": "367a73b5xxxx",
      "status": "PRE_AUTH",
      "email": "mquinter+TEST.69387488#email.com",
      "loanNumber": null
    },
    "1": {
      "personId": "56f060fd-e34xxxxxx",
      "status": "PRE_AUTH",
      "email": "mquintxxxx#email.com",
      "loanNumber": null
    }
  }
}
That's not how to access data in JSON. See this similar question: https://stackoverflow.com/a/71847841/143475
Maybe you meant to do this:
* def foo = response.results[0].personId
I see the issue - it wasn't finding the response because I wasn't giving it enough time before the next call.
I put a sleep in there and it's working as expected.
Thanks

Collapsing a group using Google Sheets API

So, as a workaround to difficulties creating a new sheet with groups, I am trying to create and collapse these groups in a separate call to batchUpdate. I can request an addDimensionGroup successfully, but when I request updateDimensionGroup to collapse the group I just created, either in the same API call or in a separate one, I get this error:
{
  "error": {
    "code": 400,
    "message": "Invalid requests[1].updateDimensionGroup: dimensionGroup.depth must be \u003e 0",
    "status": "INVALID_ARGUMENT"
  }
}
But I'm passing depth as 0, as seen in the following JSON which I send in my request:
{
  "requests": [{
    "addDimensionGroup": {
      "range": {
        "dimension": "ROWS",
        "sheetId": 0,
        "startIndex": 2,
        "endIndex": 5
      }
    }
  }, {
    "updateDimensionGroup": {
      "dimensionGroup": {
        "range": {
          "dimension": "ROWS",
          "sheetId": 0,
          "startIndex": 2,
          "endIndex": 5
        },
        "depth": 0,
        "collapsed": true
      },
      "fields": "*"
    }
  }],
  "includeSpreadsheetInResponse": true
}
...
I'm not entirely sure what I am supposed to provide for "fields". The documentation for UpdateDimensionGroupRequest says it is supposed to be a string ("string (FieldMask format)"), but the FieldMask definition itself shows the possibility of multiple paths and doesn't tell me how they are supposed to be separated in a single string.
What am I doing wrong here?
The error message is actually instructing you that the dimensionGroup.depth value must be > 0:
If you call spreadsheets.get() on your sheet, and request only the DimensionGroup data, you'll note that your created group is actually at depth 1:
GET https://sheets.googleapis.com/v4/spreadsheets/{SSID}?fields=sheets(rowGroups)&key={API_KEY}
This makes sense, since the depth is (per API spec):
depth (number): The depth of the group, representing how many groups have a range that wholly contains the range of this group.
Note that any given DimensionGroup "wholly contains its own range" by definition.
If your goal is to change the status of the DimensionGroup, then you need to set its collapsed property:
{
  "requests": [
    {
      "updateDimensionGroup": {
        "dimensionGroup": {
          "range": {
            "sheetId": <your sheet id>,
            "dimension": "ROWS",
            "startIndex": 2,
            "endIndex": 5
          },
          "collapsed": true,
          "depth": 1
        },
        "fields": "collapsed"
      }
    }
  ]
}
For this particular Request, the only attribute you can set is collapsed - the other properties are used to identify the desired DimensionGroup to manipulate. Thus, specifying fields: "*" is equivalent to fields: "collapsed". This is not true for the majority of requests, so specifying fields: "*" and then omitting a non-required request parameter is interpreted as "Delete that missing parameter from the server's representation".
To change a DimensionGroup's depth, you must add or remove other DimensionGroups that encompass it.
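For completeness, a minimal sketch of sending that corrected request with the Node.js googleapis client might look like the following; the spreadsheetId, the auth object, and the range values are placeholders rather than values taken from the question:
// Collapse an existing row group via spreadsheets.batchUpdate.
// spreadsheetId, auth and the range indices below are placeholders.
const { google } = require('googleapis');

async function collapseGroup(auth, spreadsheetId) {
  const sheets = google.sheets({ version: 'v4', auth });
  await sheets.spreadsheets.batchUpdate({
    spreadsheetId: spreadsheetId,
    requestBody: {
      requests: [{
        updateDimensionGroup: {
          dimensionGroup: {
            range: { sheetId: 0, dimension: 'ROWS', startIndex: 2, endIndex: 5 },
            depth: 1,           // matches what spreadsheets.get reports for the group
            collapsed: true
          },
          fields: 'collapsed'   // only the collapsed flag is being changed
        }
      }]
    }
  });
}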

square connect api batch processing

I need assistance with batch processing, especially in adding tax codes to items.
I'm experimenting with the Square batch processing feature, and my sample case is to create 2 items and add the tax code to them. In all there are 4 requests: 2 for creating the items and 2 to 'put' the tax code on them. I have tried the following orders:
1. create the two items; add the taxes
2. create one item; add tax code to that item; create second item, add code to the second item.
In both instances, the result is the same - the taxes are applied to only one item. For the second item, the response I get is:
{
  "status_code": 404,
  "body": {
    "type": "not_found",
    "message": "NotFound"
  },
  "request_id": 4
}
To help with the investigation, here's the sample JSON that I use in the cURL request:
{
  "requests": [
    {
      "method": "POST",
      "relative_path": "\/v1\/me\/items",
      "access_token": "XXX-YYY",
      "body": {
        "id": 126,
        "name": "TestItem",
        "description": "TestItemDescription",
        "category_id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
        "variations": [
          {
            "name": "var1",
            "pricing_type": "FIXED_PRICING",
            "price_money": {
              "currency_code": "CAD",
              "amount": 400
            },
            "sku": "123444:QWEFASDERRG"
          }
        ]
      },
      "request_id": 1
    },
    {
      "method": "PUT",
      "relative_path": "\/v1\/me\/items\/126\/fees\/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
      "access_token": "XXX-YYY",
      "request_id": 2
    },
    {
      "method": "POST",
      "relative_path": "\/v1\/me\/items",
      "access_token": "XXX-YYY",
      "body": {
        "id": 127,
        "name": "TestItem1",
        "description": "TestItemDescription1",
        "category_id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
        "variations": [
          {
            "name": "var1",
            "pricing_type": "FIXED_PRICING",
            "price_money": {
              "currency_code": "CAD",
              "amount": 400
            },
            "sku": "123444:QWEFASDERRG1"
          }
        ]
      },
      "request_id": 3
    },
    {
      "method": "PUT",
      "relative_path": "\/v1\/me\/items\/127\/fees\/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
      "access_token": "XXX-YYY",
      "request_id": 4
    }
  ]
}
Below is the full response that I receive, indicating successful creation of two items and only one successful tax push:
[
  {
    "status_code": 200,
    "body": {
      "visibility": "PUBLIC",
      "available_online": false,
      "available_for_pickup": false,
      "id": "126",
      "description": "TestItemDescription",
      "name": "TestItem",
      "category_id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
      "category": {
        "id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
        "name": "Writing Instruments"
      },
      "variations": [
        {
          "pricing_type": "FIXED_PRICING",
          "track_inventory": false,
          "inventory_alert_type": "NONE",
          "id": "4c70909b-90bd-4742-b772-e4fabe636557",
          "name": "var1",
          "price_money": {
            "currency_code": "CAD",
            "amount": 400
          },
          "sku": "123444:QWEFASDERRG",
          "ordinal": 1,
          "item_id": "126"
        }
      ],
      "modifier_lists": [],
      "fees": [],
      "images": []
    },
    "request_id": 1
  },
  {
    "status_code": 200,
    "body": {},
    "request_id": 2
  },
  {
    "status_code": 200,
    "body": {
      "visibility": "PUBLIC",
      "available_online": false,
      "available_for_pickup": false,
      "id": "127",
      "description": "TestItemDescription1",
      "name": "TestItem1",
      "category_id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
      "category": {
        "id": "DF1F51FB-11D6-4232-B138-2ECE3D89D206",
        "name": "Writing Instruments"
      },
      "variations": [
        {
          "pricing_type": "FIXED_PRICING",
          "track_inventory": false,
          "inventory_alert_type": "NONE",
          "id": "6de8932f-603e-4cd9-99ad-67f6c7777ffd",
          "name": "var1",
          "price_money": {
            "currency_code": "CAD",
            "amount": 400
          },
          "sku": "123444:QWEFASDERRG1",
          "ordinal": 1,
          "item_id": "127"
        }
      ],
      "modifier_lists": [],
      "fees": [],
      "images": []
    },
    "request_id": 3
  },
  {
    "status_code": 404,
    "body": {
      "type": "not_found",
      "message": "NotFound"
    },
    "request_id": 4
  }
]
I have checked by listing the items, and both items, with their item IDs, are present in the inventory. So my questions are: why is the tax applied to one item and not to the other, and how do I resolve it?
From the Square docs:
Note the following when using the Submit Batch endpoint:
You cannot include more than 30 requests in a single batch.
Recursive requests to the Submit Batch endpoint are not allowed (i.e., none of the requests included in a batch can itself be a request to this endpoint).
There is no guarantee of the order in which batched requests are performed.
(emphasis mine).
If you want to use the batch API, you will have to create parent entities like items first, then in a separate batch request apply any child entities like fees, discounts, etc. Alternatively, you can just make separate requests; there may not be much benefit from using the batch API in this case.
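As a rough sketch of that two-step approach in Node.js (the submitBatch helper, the https://connect.squareup.com/v1/batch endpoint URL, and the abbreviated item bodies are assumptions for illustration, not verbatim from Square's documentation):
// Two sequential batch calls: create the items first, then attach the fee,
// so the fee PUTs never race the item creation. The endpoint URL, helper and
// the trimmed-down bodies are illustrative assumptions.
const fetch = require('node-fetch');

async function submitBatch(requests, accessToken) {
  const res = await fetch('https://connect.squareup.com/v1/batch', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer ' + accessToken,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ requests: requests })
  });
  return res.json();
}

async function createItemsThenApplyFees(accessToken) {
  // Batch 1: create both items (bodies abbreviated from the question).
  await submitBatch([
    { method: 'POST', relative_path: '/v1/me/items', access_token: accessToken,
      body: { id: 126, name: 'TestItem' }, request_id: 1 },
    { method: 'POST', relative_path: '/v1/me/items', access_token: accessToken,
      body: { id: 127, name: 'TestItem1' }, request_id: 2 }
  ], accessToken);

  // Batch 2: the items now exist, so the fee PUTs can no longer 404.
  await submitBatch([
    { method: 'PUT', access_token: accessToken, request_id: 3,
      relative_path: '/v1/me/items/126/fees/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB' },
    { method: 'PUT', access_token: accessToken, request_id: 4,
      relative_path: '/v1/me/items/127/fees/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB' }
  ], accessToken);
}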