Extract element from output array in a Copy Data activity - azure-data-factory-2

I have a copy data activity that dynamically adds a datetime suffix to the sink file name, which is based on utcnow(). This corresponds to the start datetime in the copy data activity. I am looking to extract the 'start' element from the executionDetails array in the output:
{
"dataRead": 0,
"dataWritten": 86,
"filesWritten": 1,
"sourcePeakConnections": 1,
"sinkPeakConnections": 1,
"rowsRead": 0,
"rowsCopied": 0,
"copyDuration": 4,
"throughput": 0,
"errors": [],
"effectiveIntegrationRuntime": "FXL",
"usedParallelCopies": 1,
"executionDetails": [
{
"source": {
"type": "SqlServer"
},
"sink": {
"type": "AzureBlobFS"
},
"status": "Succeeded",
"start": "2019-08-06T12:29:20.477586Z",
"duration": 4,
"usedParallelCopies": 1,
"detailedDurations": {
"queuingDuration": 3,
"transferDuration": 1
}
}
]
}
Assuming the activity is called CopyData, I want to set the value of start to a variable. I am struggling to get this: a simple @activity('CopyData').output.executionDetails.start does not work, telling me I need to supply an integer index into the executionDetails array. However, trying @activity('CopyData').output.executionDetails[3] errors, telling me the valid range is (0,0). I am looking for a way to extract the datetime stamp into a string variable.
I can store executionDetails in an array variable, but am still unable to extract the start value from it afterwards.

I already worked it out: the range is (0,0) because executionDetails contains only one element, which holds the various values. So I just need to index the array with [0] and then reference the start value:
@activity('CopyData').output.executionDetails[0].start
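To actually put this into a pipeline variable, a Set Variable activity along these lines should work (a sketch only; the activity and variable names, such as copyStart, are placeholders):
{
    "name": "Set copy start",
    "type": "SetVariable",
    "dependsOn": [
        { "activity": "CopyData", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
        "variableName": "copyStart",
        "value": "@activity('CopyData').output.executionDetails[0].start"
    }
}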

How to set one schema for n-1 elements and another for the nth element

For example, look at the schema:
"type": "array",
"items": {
"oneOf":[
{"type": "number"},
{"type": "string"}
]
}
}
The schema would then allow a list of numbers where the last element is a string, and the string can be optional.
For example:
{
"list_of_items" : [ 3, 4, 5, "hello"]
}
is allowed.
{
"list_of_items" : [ 3, "hello" , 4, 5 ]
}
is not allowed, because a string is only allowed in the last position.
From reading the documentation, I see how to allow a fixed number of numbers followed by a text element, or all numbers, but I don't see how to allow n-1 numbers with text for the nth.
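For reference, the fixed-count form mentioned above can be sketched roughly like this in draft-04 JSON Schema (assuming, purely for illustration, two numbers followed by an optional string):
{
  "type": "array",
  "items": [
    {"type": "number"},
    {"type": "number"},
    {"type": "string"}
  ],
  "additionalItems": false,
  "minItems": 2
}
This only works when the count is known up front, which is not the n-1 case being asked about.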
It could be that this is not something json-schema can do.
Thanks.

Problems matching a long value in Rest Assured json body

I have the following response:
[
{
"id": 53,
"fileUri": "abc",
"filename": "abc.jpg",
"fileSizeBytes": 578466,
"createdDate": "2018-10-15",
"updatedDate": "2018-10-15"
},
{
"id": 54,
"fileUri": "xyz",
"filename": "xyz.pdf",
"fileSizeBytes": 88170994,
"createdDate": "2018-10-15",
"updatedDate": "2018-10-15"
}
]
and I am trying to match the id value to the object in JUnit like so:
RestAssured.given() //
.expect() //
.statusCode(HttpStatus.SC_OK) //
.when() //
.get(String.format("%s/%s/file", URL_BASE, id)) //
.then() //
.log().all() //
.body("", hasSize(2)) //
.body("id", hasItems(file1.getId(), file2.getId()));
But when the match occurs it tries to match an int to a long. Instead I get this output:
java.lang.AssertionError: 1 expectation failed.
JSON path id doesn't match.
Expected: (a collection containing <53L> and a collection containing <54L>)
Actual: [53, 54]
How does one tell Rest Assured that the value is indeed a long even though it might be short enough to fit in an int? I can cast the file's id to an int and it works, but that seems sloppy.
The problem is that when the JSON is deserialized to Java types, an int is chosen for values small enough to fit in one. One solution is to compare int values.
Instead of
.body("id", hasItems(file1.getId(), file2.getId()));
use
.body("id", hasItems(new Long(file1.getId()).intValue(), new Long(file2.getId()).intValue()));

bigquery nested object : No such field

I have a table with this schema:
I'm trying to upload some data from Google Cloud Storage using the Python client. The file is newline-delimited JSON. Most of my lines don't have the field "passenger_origin.accuracy", but when the field is present I get the following error:
Error while reading
data, error message: JSON parsing error in row starting at position
2122510: No such field: driver_origin.accuracy. (error code: invalid)
Error while reading
data, error message: JSON parsing error in row starting at position
2126317: No such field: passenger_origin.accuracy. (error code:
invalid)
Example of an invalid row :
{
"id": 1479443,
"is_obsolete": 0,
"seat_count": 1,
"is_ticket_checked": 0,
"score": 0.3709318902,
"is_multimodal": 0,
"fake_paths": 0,
"passenger_origin": {
"id": 2204,
"poi_uuid": "15b4e52c-7c58-442c-98df-1eb06079f6bb",
"user_id": 1987,
"accuracy": 250.0,
"disabled": 0,
"last_update": "2017-03-10T15:15:39",
"created": "2016-02-05T17:06:26",
"modified_by_user": 1,
"is_recurrent": 0,
"source": 1,
"hidden_by_user": 0,
"kind": 2,
},
"driver_origin": {
"id": 412491,
"poi_uuid": "47e90b6d-e178-4e02-9f02-f4ea5f8beaa1",
"user_id": 71471,
"disabled": 0,
"last_update": "2017-11-02T10:09:09",
"created": "2017-11-02T10:09:09",
"modified_by_user": 0,
"is_recurrent": 0,
"source": 1,
"hidden_by_user": 0,
"kind": 2,
},
"passenger_destination": {
"id": 2203,
"poi_uuid": "c531c3ca-47f0-4003-8098-1272fee8d018",
"user_id": 1987,
"accuracy": 250.0,
"disabled": 0,
"last_update": "2017-03-10T15:12:42",
"created": "2016-02-05T17:06:19",
"modified_by_user": 1,
"is_recurrent": 0,
"source": 1,
"hidden_by_user": 0,
"kind": 1,
}
}
The table was created before the data is uploaded and has not been modified since. I don't understand why the upload is failing on these fields. Do the RECORD fields have to be REPEATED?
To ignore the fields that aren't present in the schema, use a combination of:
configuration.load.ignoreUnknownValues
configuration.load.maxBadRecords
Setting the first to true and the second to some arbitrarily-high number, e.g. 100000, will enable the load to succeed even if there are extra fields.
The problem was that configuration.load.autodetect was set to True. I set it to False and the problem was fixed.
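For reference, a minimal sketch of how these settings map onto the google-cloud-bigquery Python client (the table id and GCS URI below are placeholders):
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = False              # keep the existing table schema instead of re-inferring it
job_config.ignore_unknown_values = True    # drop fields that are not in the schema
job_config.max_bad_records = 100000        # tolerate rows that still fail to parse

load_job = client.load_table_from_uri(
    "gs://my-bucket/rides/*.json",    # placeholder GCS URI
    "my-project.my_dataset.rides",    # placeholder table id
    job_config=job_config,
)
load_job.result()  # wait for the load to complete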

Setting date format in Google Sheets using API and Python

I'm trying to set the date format on a column so that dates are displayed like this: 14-Aug-2017. This is the way I'm doing it:
requests = [
{
'repeatCell':
{
'range':
{
'startRowIndex': 1,
'startColumnIndex': 4,
'endColumnIndex': 4
},
'cell':
{
"userEnteredFormat":
{
"numberFormat":
{
"type": "DATE",
"pattern": "dd-mmm-yyyy"
}
}
},
'fields': 'userEnteredFormat.numberFormat'
}
}
]
body = {"requests": requests}
response = service.spreadsheets().batchUpdate(spreadsheetId=SHEET, body=body).execute()
I want all the cells in column E except the header cell to be updated, hence the range definition. I used http://wescpy.blogspot.co.uk/2016/09/formatting-cells-in-google-sheets-with.html and https://developers.google.com/sheets/api/samples/formatting as the basis for this approach.
However, the cells don't show their contents using that format. They continue to be in "Automatic" format, either showing the numeric value that I'm storing (the number of days from 1st Jan 1900) or (sometimes) the date.
Adding sheetId to the range definition doesn't alter the outcome.
I'm not getting an error back from the service and the response only contains the spreadsheetId and an empty replies structure [{}].
What am I getting wrong?
I've found the error - the endColumnIndex needs to be 5, not 4.
I didn't read that first linked article carefully enough!
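For completeness, a sketch of the corrected request; the only change from the original is endColumnIndex (the end indices are exclusive, so column E, zero-based index 4, needs an endColumnIndex of 5). It is sent with the same batchUpdate call as before.
requests = [
    {
        'repeatCell': {
            'range': {
                'startRowIndex': 1,     # skip the header row
                'startColumnIndex': 4,  # column E (zero-based)
                'endColumnIndex': 5     # exclusive, so this covers column E only
            },
            'cell': {
                'userEnteredFormat': {
                    'numberFormat': {
                        'type': 'DATE',
                        'pattern': 'dd-mmm-yyyy'
                    }
                }
            },
            'fields': 'userEnteredFormat.numberFormat'
        }
    }
]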

Is it possible to turn an array returned by the Mongo GeoNear command (using Ruby/Rails) into a Plucky object?

As a total newbie I have been trying to get the geoNear command working in my Rails application, and it appears to be working fine. The major annoyance for me is that it is returning an array with strings rather than keys which I can call on to pull out data.
Having dug around, I understand that MongoMapper uses Plucky to turn the query result into a friendly object which can be handled easily, but I haven't been able to find out how to transform the result of my geoNear query into a Plucky object.
My questions are:
(a) Is it possible to turn this into a Plucky object, and how do I do that?
(b) If it is not possible how can I most simply and systematically extract each record and each field?
Here is the query in my controller:
@mult = 3963 * (3.14159265 / 180 ) # Scale to miles on earth
@results = @db.command( {'geoNear' => "places", 'near'=> @search.coordinates , 'distanceMultiplier' => @mult, 'spherical' => true})
Here is the object I'm getting back (with document content removed for simplicity):
{"ns"=>"myapp-development.places", "near"=>"1001110101110101100100110001100010100010000010111010", "results"=>[{"dis"=>0.04356444023196527, "obj"=>{"_id"=>BSON::ObjectId('4ee6a7d210a81f05fe000001'),...}}], "stats"=>{"time"=>0, "btreelocs"=>0, "nscanned"=>1, "objectsLoaded"=>1, "avgDistance"=>0.04356444023196527, "maxDistance"=>0.0006301239824196907}, "ok"=>1.0}
Help is much appreciated!!
OK, so let's say you store the results in a variable called places_near:
places_near = t.command( {'geoNear' => "places", 'near'=> [50,50] , 'distanceMultiplier' => 1, 'spherical' => true})
This command returns a hash that has a key (results) which maps to a list of results for the query. The returned document looks like this:
{
"ns": "test.places",
"near": "1100110000001111110000001111110000001111110000001111",
"results": [
{
"dis": 69.29646421910687,
"obj": {
"_id": ObjectId("4b8bd6b93b83c574d8760280"),
"y": [
1,
1
],
"category": "Coffee"
}
},
{
"dis": 69.29646421910687,
"obj": {
"_id": ObjectId("4b8bd6b03b83c574d876027f"),
"y": [
1,
1
]
}
}
],
"stats": {
"time": 0,
"btreelocs": 1,
"btreelocs": 1,
"nscanned": 2,
"nscanned": 2,
"objectsLoaded": 2,
"objectsLoaded": 2,
"avgDistance": 69.29646421910687
},
"ok": 1
}
To iterate over the results, just iterate as you would over any list in Ruby:
places_near['results'].each do |result|
  distance = result['dis']   # distance for this match
  document = result['obj']   # the matched document, e.g. document['_id'], document['category']
end