I want to delete a key from a JSON object. I found two examples shared by @Peter Thomas. I tried both and, unfortunately, neither worked.
Example 1
* def json = { a: 1, b: 2 }
* def key = 'b'
* if (true) karate.remove('json', key)
* match json == { a: 1 }
Error
javascript evaluation failed: if (true) karate.remove('json', key), unexpected path: b
Example 2
* def json = { a: 1, b: 2 }
* def key = 'b'
* if (true) delete json[key]
* match json == { a: 1 }
Error
actual: {a=1, b=2}, expected: {a=1}, reason: actual value has 1 more key(s) than expected: {b=2}
You probably are on an old version. Upgrade. Or maybe it is a bug, so follow this process: https://github.com/karatelabs/karate/wiki/How-to-Submit-an-Issue
I am using the following SQL (from another question) which contains temporary functions.
create temp function extract_keys(input string) returns array<string> language js as """
return Object.keys(JSON.parse(input));
""";
create temp function extract_values(input string) returns array<string> language js as """
return Object.values(JSON.parse(input));
""";
create temp function extract_all_leaves(input string) returns string language js as '''
function flattenObj(obj, parent = '', res = {}){
for(let key in obj){
let propName = parent ? parent + '.' + key : key;
if(typeof obj[key] == 'object'){
flattenObj(obj[key], propName, res);
} else {
res[propName] = obj[key];
}
}
return JSON.stringify(res);
}
return flattenObj(JSON.parse(input));
''';
select col || replace(replace(key, 'value', ''), '.', '-') as col, value,
from your_table,
unnest([struct(extract_all_leaves(data) as json)]),
unnest(extract_keys(json)) key with offset
join unnest(extract_values(json)) value with offset
using(offset)
I want to save the above query as a view, but a view cannot include temporary functions, so I planned to define them as persistent user-defined functions that the view can call.
When defining the functions, I'm having trouble getting the input and output types right. Here are the three user-defined functions.
CREATE OR REPLACE FUNCTION `dataset.json_extract_all_leaves`(Obj String)
RETURNS String
LANGUAGE js AS """
function flattenObj(obj, parent = '', res = {}){
for(let key in obj){
let propName = parent ? parent + '.' + key : key;
if(typeof obj[key] == 'object'){
flattenObj(obj[key], propName, res);
} else {
res[propName] = obj[key];
}
}
return JSON.stringify(res);
}
return flattenObj(JSON.parse(input));
"""
CREATE OR REPLACE FUNCTION `dataset.json_extract_keys`(input String)
RETURNS Array<String>
LANGUAGE js AS """
return Object.keys(JSON.parse(input));
"""
And finally:
CREATE OR REPLACE FUNCTION `dataform.json_extract_values`(input STRING)
RETURNS Array<String>
LANGUAGE js AS """
return Object.values(JSON.parse(input));
"""
Those three functions are created successfully, but when I come to use them in this view
WITH extract_all AS (
select
id,
field,
created,
key || replace(replace(key, 'value', ''), '.', '-') as key_name, value,
FROM `dataset.raw_keys_and_values`,
unnest([struct(`dataset.json_extract_all_leaves`(setting_value) as json)]),
unnest(`dataset.json_extract_keys`(json)) key with offset
join unnest(`dataset.json_extract_values`(json)) value with offset
using(offset)
)
SELECT *
FROM
extract_all
This fails with the following error
Error: Multiple errors occurred during the request. Please see the `errors` array for complete details. 1. Failed to coerce output value "{\"value\":true}" to type ARRAY<STRING>
I understand there's a mismatch between the value json_extract_values actually returns and its declared return type, but I can't tell whether the problem is in the SQL or in the JavaScript UDF.
Revised Answer
I've given the original ask another read and contrasted with some experimentation in my test data set.
While I'm unable to reproduce the given error, I did experience related difficulty with the following line:
unnest([struct(`dataset.json_extract_all_leaves`(setting_value) as json)]),
Put simply, the function being called takes a string (presumably a stringified JSON value) and returns a similarly stringified JSON value with the result. Because UNNEST only accepts arrays, the author wraps the output in [struct( and )], which may be the issue. To yield the same result as in my original answer below while keeping the original functions, I propose updating the SQL block to the following:
create temp function extract_keys(input string) returns array<string> language js as """
return Object.keys(JSON.parse(input));
""";
create temp function extract_values(input string) returns array<string> language js as """
return Object.values(JSON.parse(input));
""";
create temp function extract_all_leaves(input string) returns string language js as '''
function flattenObj(obj, parent = '', res = {}){
for(let key in obj){
let propName = parent ? parent + '.' + key : key;
if(typeof obj[key] == 'object'){
flattenObj(obj[key], propName, res);
} else {
res[propName] = obj[key];
}
}
return JSON.stringify(res);
}
return flattenObj(JSON.parse(input));
''';
WITH extract_all AS (
select
id,
field,
created,
properties
FROM
UNNEST([
STRUCT<id int, field string, created DATE, properties string>(1, 'michael', DATE(2022, 5, 1), '[[{"name":"Andy","age":7},{"name":"Mark","age":5},{"name":"Courtney","age":6}], [{"name":"Austin","age":8},{"name":"Erik","age":6},{"name":"Michaela","age":6}]]'),
STRUCT<id int, field string, created DATE, properties string>(2, 'sarah', DATE(2022, 5, 2), '[{"name":"Angela","age":9},{"name":"Ryan","age":7},{"name":"Andrew","age":7}]'),
STRUCT<id int, field string, created DATE, properties string>(3, 'rosy', DATE(2022, 5, 3), '[{"name":"Brynn","age":4},{"name":"Cameron","age":3},{"name":"Rebecca","age":5}]')
])
AS myData
)
SELECT
id,
field,
created,
key,
value
FROM (
SELECT
*
FROM extract_all,
UNNEST(extract_keys(extract_all_leaves(properties))) key WITH OFFSET
JOIN UNNEST(extract_values(extract_all_leaves(properties))) value WITH OFFSET
USING(OFFSET)
)
Put simply - remove the extract_all_leaves line with its array casting and perform it in the offset-joined pair of keys and values, then put all that in a subquery so you can cleanly pull out just the columns you want.
And to explicitly answer the asked question, I believe the issue is in the SQL because of the type casting in the offending line and my own inability to get it to cleanly pair with the subsequent UNNEST queries against its output.
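To see what the two offset-joined arrays actually contain, the UDF bodies can be run outside BigQuery. The following is a minimal sketch in plain JavaScript (Node.js is assumed); the sample value is hypothetical, and pairing by array position stands in for the USING(OFFSET) join.

```javascript
// Standalone copy of the extract_all_leaves body, runnable outside BigQuery.
function flattenObj(obj, parent = '', res = {}) {
  for (const key in obj) {
    const propName = parent ? parent + '.' + key : key;
    if (typeof obj[key] === 'object') {
      flattenObj(obj[key], propName, res);
    } else {
      res[propName] = obj[key];
    }
  }
  return JSON.stringify(res);
}

// Hypothetical sample value, not taken from the question's table.
const sample = '{"value": true, "limits": {"max": 5, "tags": ["a", "b"]}}';

const flattened = flattenObj(JSON.parse(sample));
const keys = Object.keys(JSON.parse(flattened));     // what extract_keys returns
const values = Object.values(JSON.parse(flattened)); // what extract_values returns

// Pair the two arrays by position, as the USING(OFFSET) join does in SQL.
const pairs = keys.map((key, offset) => ({ key, value: String(values[offset]) }));

console.log(flattened);
console.log(pairs);
```

One thing the sketch makes visible: because typeof null is 'object', a null leaf is recursed into and never written to the result, so it would appear in neither array.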
Original Answer
I gather that you've got some sort of JSON object in your setting_value field and you're trying to produce a result that shows the keys and values of that object alongside the other columns in your dataset.
As others mentioned in the comments, it's hard to work out precisely why your query isn't working without any sample data, so I'm happy to revisit this if you can provide a record or two that I can validate against. In lieu of that, I've created some sample records in the same spirit as what you provided, and here's an end-to-end example that yields my guess at what you're aiming for.
Based on your use of joining by the offset, I'm supposing that you're really just wanting to see all the keys and their values, paired with the other columns. Assuming that's true, I propose using a different JavaScript function that yields an array of all the key/value pairs instead of two separate functions to yield their own arrays. It simplifies the query (and more importantly, works):
create temp function extract_all_leaves(input string) returns string language js as r'''
function flattenObj(obj, parent = '', res = {}){
for(let key in obj){
let propName = parent ? parent + '.' + key : key;
if(typeof obj[key] == 'object'){
flattenObj(obj[key], propName, res);
} else {
res[propName] = obj[key];
}
}
return JSON.stringify(res);
}
return flattenObj(JSON.parse(input));
''';
create temp function extract_key_values(input string) returns array<struct<key string, value string>> language js as r"""
var parsed = JSON.parse(input);
var keys = Object.keys(parsed);
var result = [];
for (var ii = 0; ii < keys.length; ii++) {
var o = {key: keys[ii], value: parsed[keys[ii]]};
result.push(o);
}
return result;
""";
WITH extract_all AS (
select
id,
field,
created,
properties
FROM
UNNEST([
--STRUCT<id int, field string, created DATE, properties string>(1, 'michael', DATE(2022, 5, 1), '[[{"name":"Andy","age":7},{"name":"Mark","age":5},{"name":"Courtney","age":6}], [{"name":"Austin","age":8},{"name":"Erik","age":6},{"name":"Michaela","age":6}]]'),
STRUCT<id int, field string, created DATE, properties string>(2, 'sarah', DATE(2022, 5, 2), '[{"name":"Angela","age":9},{"name":"Ryan","age":7},{"name":"Andrew","age":7}]'),
STRUCT<id int, field string, created DATE, properties string>(3, 'rosy', DATE(2022, 5, 3), '[{"name":"Brynn","age":4},{"name":"Cameron","age":3},{"name":"Rebecca","age":5}]')
])
AS myData
)
SELECT
id,
field,
created,
key,
value
FROM (
SELECT
*
FROM extract_all
CROSS JOIN UNNEST(extract_key_values(extract_all_leaves(properties)))
)
And I believe this yields a result more like what you're seeking:
id  field  created     key     value
2   sarah  2022-05-02  0.name  Angela
2   sarah  2022-05-02  0.age   9
2   sarah  2022-05-02  1.name  Ryan
2   sarah  2022-05-02  1.age   7
2   sarah  2022-05-02  2.name  Andrew
2   sarah  2022-05-02  2.age   7
3   rosy   2022-05-03  0.name  Brynn
3   rosy   2022-05-03  0.age   4
3   rosy   2022-05-03  1.name  Cameron
3   rosy   2022-05-03  1.age   3
3   rosy   2022-05-03  2.name  Rebecca
3   rosy   2022-05-03  2.age   5
Of course, if this isn't at all where you're trying to get to, share a sample record or two and I'll revisit it.
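The result rows can be checked without BigQuery by running the two function bodies directly. This is a minimal sketch in plain JavaScript (Node.js is assumed) using the properties value of the id 2 sample record; the camelCase function names are mine, and the String(...) conversion is an assumption added to mirror the declared STRING column type.

```javascript
// Standalone version of extract_all_leaves: flattens nested JSON into dotted keys.
function extractAllLeaves(input) {
  const res = {};
  (function walk(obj, parent) {
    for (const key in obj) {
      const propName = parent ? parent + '.' + key : key;
      if (typeof obj[key] === 'object') walk(obj[key], propName);
      else res[propName] = obj[key];
    }
  })(JSON.parse(input), '');
  return JSON.stringify(res);
}

// Standalone version of extract_key_values: one {key, value} object per leaf.
function extractKeyValues(input) {
  const parsed = JSON.parse(input);
  // String(...) is an assumption of this sketch, matching the STRING value column.
  return Object.keys(parsed).map((key) => ({ key, value: String(parsed[key]) }));
}

// The properties value of the sample record with id 2.
const properties = '[{"name":"Angela","age":9},{"name":"Ryan","age":7},{"name":"Andrew","age":7}]';

const keyValueRows = extractKeyValues(extractAllLeaves(properties));
console.log(keyValueRows);
```

Each element corresponds to one output row for id 2, so the CROSS JOIN UNNEST only has to repeat the id, field, and created columns next to it.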
I'm looking for a way to retrieve the index of a value via a metatable. This is my attempt:
local mt = { __index =
{
index = function(t, value)
local value = 0
for k, entry in ipairs(t) do
if (entry == value) then
value = k
end
end
return value
end
}
}
t = {
"foo", "bar"
}
setmetatable(t,mt)
print(t.index(t,"foo"))
The result is 0 instead of 1. Where am I wrong?
My attempt:
local mt = {
__index = function(t,value)
for index, val in pairs(t) do
if value == val then
return index
end
end
end
}
t = {
"foo",
"bar",
"aaa",
"bbb",
"aaa"
}
setmetatable(t,mt)
print(t["aaa"]) -- 3
print(t["asd"]) -- nil
print(t["bbb"]) -- 4
print(t["aaa"]) -- 3
print(t["bar"]) -- 2
print(t["foo"]) -- 1
Result is 0 instead of 1. Where [am I] wrong?
The code for the index function is wrong; the problem is not related to the (correct) metatable usage. You're shadowing the parameter value when you declare local value = 0. Subsequent entry == value comparisons yield false as the strings don't equal 0. Rename either the parameter or the local variable:
index = function(t, value)
local res = 0
for k, entry in ipairs(t) do
if entry == value then
res = k
end
end
return res
end
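An early return instead of a local variable works as well and avoids scanning the rest of the table once a match is found. Note that it returns the index of the first match, whereas the loop above returns the index of the last one.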
To prevent such errors from happening again, consider getting a linter like Luacheck, which will warn you if you shadow variables. Some editors support Luacheck out of the box; otherwise there are usually decent plugins available.
I'm working on an algorithm-type challenge and debugging via print statements, and I can't figure out why the values for the keys are not what I am expecting.
var mapNums = mutableMapOf<Int, Int>()
//imaginary array
//var nums = [34,28,11,21,3,34,8,7,34,7,31,7,3,28,18]
var count = 0
for (n in nums) {
if (mapNums.containsKey(n)) {
count ++
mapNums[n] = count
} else if (!mapNums.containsKey(n)) {
count = 1
mapNums[n] = count
}
}
println(mapNums)
//prints {34=2, 28=4, 11=1, 21=1, 3=3, 8=1, 7=2, 31=1, 18=1}
As you can see, the keys and values aren't what they're supposed to be, and I am not sure why.
You can use the following code to generate the desired map:
val nums = intArrayOf(34, 28, 11, 21, 3, 34, 8, 7, 34, 7, 31, 7, 3, 28, 18).toList()
println(nums.groupingBy { it }.eachCount())
Here groupingBy creates a Grouping source using the same element as the key selector. Then eachCount groups elements from the Grouping source by key and counts elements in each group.
You can also refer to the documentation for more information about groupingBy and eachCount.
It's because the single count variable is declared outside the loop and shared by every key, so a repeated key continues counting from whatever value the previous key left behind.
Instead you should get the current count from the map, then put it back one higher:
val nums = intArrayOf(34,28,11,21,3,34,8,7,34,7,31,7,3,28,18)
val mapNums = mutableMapOf<Int, Int>()
for (n in nums) {
val count = mapNums[n] ?: 0
mapNums[n] = count + 1
}
println(mapNums) // {34=3, 28=2, 11=1, 21=1, 3=2, 8=1, 7=3, 31=1, 18=1}
First, look up n in the map. If it is already a key, increment its value by 1 using the plus method. If it is not, the lookup returns null, and the Elvis operator sets the value to 1 instead.
var mapNums = mutableMapOf<Int, Int>()
//imaginary array
var nums = arrayOf(34,28,11,21,3,34,8,7,34,7,31,7,3,28,18)
for (n in nums) {
mapNums[n] = mapNums[n]?.plus(1) ?: 1
}
println(mapNums)
I have a stream of data from a CSV. It is a flat structured database.
E.g.:
a,b,c,d
a,b,c,e
a,b,f
This essentially transforms into:
Node id,Nodename,parent id,level
100, a , 0 , 1
200, b , 100 , 2
300, c , 200 , 3
400, d , 300 , 4
500, e , 300 , 4
600, f , 200 , 3
Can this be done using Pentaho? I have gone through the transformation steps. But nothing strikes me as usable for this purpose. Please let me know if there is any step that I may have missed.
Your CSV file contains a graph or tree definition. The output format is richer than the input (node_id needs to be generated, parent_id needs to be resolved, level needs to be set). There are a few issues you will face when processing this kind of CSV file in Pentaho Data Integration:
Data loading & processing:
Rows do not have the same length (sometimes 4 nodes, sometimes 3 nodes).
Load whole rows, then split each row into nodes and process one node per record stream item.
You can calculate output values in the same step as where the nodes are split.
Solution Steps:
CSV file input: Load data from CSV. Settings: No header row; Delimiter = ';' (a character that does not occur in the data, so each line is read whole); One output column named rowData
Modified Java Script Value: Split rowData to nodes and calculate output values: nodeId, nodeName, parentId, nodeLevel [See the code below]
Sort rows: Sort rows by nodeName. [a,b,c,d,a,b,c,e,a,b,f >> a,a,a,b,b,c,c,d,e,f]
Unique rows: Delete duplicate rows by nodeName. [a,a,a,b,b,c,c,d,e,f >> a,b,c,d,e,f]
Text file output: Write out results.
Modified Java Script Value Code:
function writeRow(nodeId, nodeName, parentId, nodeLevel){
var newRow = createRowCopy(getOutputRowMeta().size());
var rowIndex = getInputRowMeta().size();
newRow[rowIndex++] = nodeId;
newRow[rowIndex++] = nodeName;
newRow[rowIndex++] = parentId;
newRow[rowIndex++] = nodeLevel;
putRow(newRow);
}
var nodeIdsMap = {
a: "100",
b: "200",
c: "300",
d: "400",
e: "500",
f: "600",
g: "700",
h: "800",
}
// rowData from record stream (CSV input step)
var nodes = rowData.split(",");
for (var i = 0; i < nodes.length; i++){
var nodeId = nodeIdsMap[nodes[i]];
var parentNodeId = (i == 0) ? "0" : nodeIdsMap[nodes[i-1]];
var level = i + 1;
writeRow(nodeId, nodes[i], parentNodeId, level);
}
trans_Status = SKIP_TRANSFORMATION;
Modified Java Script Value Field Settings:
Fieldname; Type; Replace value 'Fieldname' or 'Rename to'
nodeId; String; N
nodeName; String; N
parentId; String; N
nodeLevel; String; N
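The split, sort, and unique steps above can also be simulated outside Pentaho to check the expected output. This is a minimal sketch in plain JavaScript (Node.js is assumed) that uses no Pentaho API. Unlike the script above, it generates the ids (100, 200, ...) in order of first appearance instead of hardcoding them, which is an assumption of the sketch, and like the Unique rows step it assumes node names are unique across the tree.

```javascript
// The three sample lines from the question, one string per CSV row.
const csvRows = ['a,b,c,d', 'a,b,c,e', 'a,b,f'];

const idsByName = new Map(); // nodeName -> generated nodeId
const records = [];

for (const rowData of csvRows) {
  const nodes = rowData.split(',');
  nodes.forEach((nodeName, i) => {
    // Skip names already emitted; same effect as the "Unique rows" step.
    if (idsByName.has(nodeName)) return;
    // Assumed id scheme: 100, 200, ... in order of first appearance.
    const nodeId = (idsByName.size + 1) * 100;
    idsByName.set(nodeName, nodeId);
    // The parent is the previous node on the same line; the root gets 0.
    const parentId = i === 0 ? 0 : idsByName.get(nodes[i - 1]);
    records.push({ nodeId, nodeName, parentId, nodeLevel: i + 1 });
  });
}

// Equivalent of the "Sort rows" step.
records.sort((x, y) => x.nodeName.localeCompare(y.nodeName));
console.log(records);
```

For the sample input this produces the six rows from the question's target table, for example e with id 500, parent 300, level 4.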