Filtering down a Karate test response object to get a sub-list?

Given this feature file:
Feature: test

Scenario: filter response
* def response =
"""
[
  {
    "a": "a",
    "b": "a",
    "c": "a"
  },
  {
    "d": "ab",
    "e": "ab",
    "f": "ab"
  },
  {
    "g": "ac",
    "h": "ac",
    "i": "ac"
  }
]
"""
* match response[1] contains { e: 'ab' }
How can I filter the response down so that it is equal to:
{
  "d": "ab",
  "e": "ab",
  "f": "ab"
}
Is there a built-in way to do this, in the same way you can filter a List using a Java stream?

Sample code:
Feature: test

Scenario: filter response
* def response =
"""
[
  {
    "a": "a",
    "b": "a",
    "c": "a"
  },
  {
    "d": "ab",
    "e": "ab",
    "f": "ab"
  },
  {
    "g": "ac",
    "h": "ac",
    "i": "ac"
  }
]
"""
# define the predicate as a JS function
* def filt = function(x){ return x.e == 'ab' }
# grab all items from the response as a list
* def items = get response[*]
# karate.filter() applies the predicate to each item
* def res = karate.filter(items, filt)
* print res
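
Note: karate.filter() is the built-in equivalent of a Java stream filter. As a hedged alternative (assuming Karate 1.0+, where the GraalVM JS engine exposes the response as a native JS array), an arrow-function one-liner should also work:
# filter directly with a JS arrow function (Karate 1.0+ assumption)
* def res = response.filter(x => x.e == 'ab')
* match res[0] contains { e: 'ab' }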

Related

Constructing a pandas DataFrame with columns and sub-columns from dict

I have a dict of the following form
dict = {
    "Lightweight_model_20221103_downscale_1536px_RecOut": {
        "CRR": "75.379",
        "Sum Time": 33132,
        "Sum Detection Time": 18406,
        "images": {
            "uk_UA_02 (1).jpg": {
                "Time": "877",
                "Time_detection": "469"
            },
            "uk_UA_02 (10).jpg": {
                "Time": "914",
                "Time_detection": "323"
            },
            "uk_UA_02 (11).jpg": {
                "Time": "1169",
                "Time_detection": "428"
            },
            "uk_UA_02 (12).jpg": {
                "Time": "881",
                "Time_detection": "371"
            },
            "uk_UA_02 (13).jpg": {
                "Time": "892",
                "Time_detection": "335"
            }
        }
    },
    "Lightweight_model_20221208_RecOut": {
        "CRR": "71.628",
        "Sum Time": 41209,
        "Sum Detection Time": 25301,
        "images": {
            "uk_UA_02 (1).jpg": {
                "Time": "916",
                "Time_detection": "573"
            },
            "uk_UA_02 (10).jpg": {
                "Time": "927",
                "Time_detection": "442"
            },
            "uk_UA_02 (11).jpg": {
                "Time": "1150",
                "Time_detection": "513"
            },
            "uk_UA_02 (12).jpg": {
                "Time": "1126",
                "Time_detection": "531"
            },
            "uk_UA_02 (13).jpg": {
                "Time": "921",
                "Time_detection": "462"
            }
        }
    }
}
and I want to make a DataFrame with sub-columns in the output, like in this image: https://i.stack.imgur.com/hGrKo.png
But I don't understand how to expand the sub-dicts in ['images']. When I use this code:
df = pd.DataFrame.from_dict(dict, orient='index')
df_full = pd.concat([df.drop(['images'], axis=1), df['images'].apply(pd.Series)], axis=1)
I get dictionaries with filenames in the columns: https://i.stack.imgur.com/8LlUW.png
How do I expand the nested dicts and convert them to sub-columns?
Here is one way to do it with the help of Pandas json_normalize, MultiIndex.from_product, and concat methods:
import pandas as pd

df = pd.DataFrame.from_dict(dict, orient='index')

# Save the first columns and add a second, empty level header
tmp = df[["CRR", "Sum Time", "Sum Detection Time"]]
tmp.columns = [tmp.columns, ["", "", ""]]
dfs = [tmp]

# Process the "images" column
df = pd.DataFrame.from_dict(df["images"].to_dict(), orient='index')

# Create a new second-level column header for each column in df
for col in df.columns:
    tmp = pd.json_normalize(df[col])
    tmp.index = df.index
    tmp.columns = pd.MultiIndex.from_product([[col], tmp.columns])
    dfs.append(tmp)

# Concat everything into a new dataframe
new_df = pd.concat(dfs, axis=1)
Then:
print(new_df)
outputs the expected dataframe with a two-level column header (each image expanded into Time / Time_detection sub-columns).
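An alternative, hedged sketch for the same result (it assumes the input dict is renamed to results, to avoid shadowing the dict builtin): flatten each model entry into (column, sub-column) tuple keys and let pandas build the MultiIndex:
import pandas as pd

# `results` stands for the asker's input dict (renamed from `dict`)
def flatten(entry):
    # top-level scalars get an empty second-level header
    row = {(k, ""): v for k, v in entry.items() if k != "images"}
    # each image becomes a first-level column with Time / Time_detection sub-columns
    for img, times in entry["images"].items():
        for k, v in times.items():
            row[(img, k)] = v
    return row

new_df = pd.DataFrame.from_dict(
    {model: flatten(entry) for model, entry in results.items()},
    orient="index",
)
new_df.columns = pd.MultiIndex.from_tuples(new_df.columns)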

Convert PyTorch AutoTokenizer to TensorFlow TextVectorization

I have a PyTorch encoder loaded on my PC with transformers.
I saved it to JSON with tokenizer.save_pretrained(...) and now I need to load it on another PC with TensorFlow TextVectorization, as I don't have access to the transformers library there.
How can I convert it? I read about tf.keras.preprocessing.text.tokenizer_from_json but it does not work.
In the PyTorch JSON I have:
{
"version": "1.0",
"truncation": null,
"padding": null,
"added_tokens": [...],
"normalizer": {...},
"pre_tokenizer": {...},
"post_processor": {...},
"decoder": {...},
"model": {...}
}
while TensorFlow's TextVectorization expects:
def __init__(
    self,
    max_tokens=None,
    standardize="lower_and_strip_punctuation",
    split="whitespace",
    ngrams=None,
    output_mode="int",
    output_sequence_length=None,
    pad_to_max_tokens=False,
    vocabulary=None,
    idf_weights=None,
    sparse=False,
    ragged=False,
    **kwargs,
):
or, with tokenizer_from_json, these kinds of fields:
config = tokenizer_config.get("config")
word_counts = json.loads(config.pop("word_counts"))
word_docs = json.loads(config.pop("word_docs"))
index_docs = json.loads(config.pop("index_docs"))
# Integer indexing gets converted to strings with json.dumps()
index_docs = {int(k): v for k, v in index_docs.items()}
index_word = json.loads(config.pop("index_word"))
index_word = {int(k): v for k, v in index_word.items()}
word_index = json.loads(config.pop("word_index"))
tokenizer = Tokenizer(**config)
You can simply use tf.keras.preprocessing.text.tokenizer_from_json(), but you may need to correct the format of the JSON first.
Sample: the example fits a Tokenizer on "I love cats", maps its characters through a StringLookup layer, and round-trips the tokenizer through JSON:
import tensorflow as tf

text = "I love cats"
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=10000, oov_token='<oov>')
tokenizer.fit_on_texts([text])

# input: map single characters to integer ids
vocab = ["a", "b", "c", "d", "e", "f", "g", "h", "I", "j", "k", "l", "m",
         "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "_"]
data = tf.constant([["_", "_", "_", "I"], ["l", "o", "v", "e"], ["c", "a", "t", "s"]])
layer = tf.keras.layers.StringLookup(vocabulary=vocab)
sequences_mapping_string = layer(data)
sequences_mapping_string = tf.reshape(sequences_mapping_string, (1, 12))

print('result: ' + str(sequences_mapping_string))
print('tokenizer.to_json(): ' + str(tokenizer.to_json()))

# round-trip the word tokenizer through JSON
new_tokenizer = tf.keras.preprocessing.text.tokenizer_from_json(tokenizer.to_json())
print('new_tokenizer.to_json(): ' + str(new_tokenizer.to_json()))
Output:
result: tf.Tensor([[27 27 27 9 12 15 22 5 3 1 20 19]], shape=(1, 12), dtype=int64)
tokenizer.to_json(): {"class_name": "Tokenizer", "config": {"num_words": 10000, "filters": "!\"#$%&()*+,-./:;<=>?#[\\]^_`{|}~\t\n", "lower": true, "split": " ", "char_level": false, "oov_token": "<oov>", "document_count": 1, "word_counts": "{\"i\": 1, \"love\": 1, \"cats\": 1}", "word_docs": "{\"cats\": 1, \"love\": 1, \"i\": 1}", "index_docs": "{\"4\": 1, \"3\": 1, \"2\": 1}", "index_word": "{\"1\": \"<oov>\", \"2\": \"i\", \"3\": \"love\", \"4\": \"cats\"}", "word_index": "{\"<oov>\": 1, \"i\": 2, \"love\": 3, \"cats\": 4}"}}
new_tokenizer.to_json(): {"class_name": "Tokenizer", "config": {"num_words": 10000, "filters": "!\"#$%&()*+,-./:;<=>?#[\\]^_`{|}~\t\n", "lower": true, "split": " ", "char_level": false, "oov_token": "<oov>", "document_count": 1, "word_counts": "{\"i\": 1, \"love\": 1, \"cats\": 1}", "word_docs": "{\"cats\": 1, \"love\": 1, \"i\": 1}", "index_docs": "{\"4\": 1, \"3\": 1, \"2\": 1}", "index_word": "{\"1\": \"<oov>\", \"2\": \"i\", \"3\": \"love\", \"4\": \"cats\"}", "word_index": "{\"<oov>\": 1, \"i\": 2, \"love\": 3, \"cats\": 4}"}}
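
If the goal is only to reuse the vocabulary on the TensorFlow side, here is a hedged sketch (it assumes the HuggingFace tokenizer.json has a WordLevel/WordPiece-style "model" section with a "vocab" mapping, and that reserved tokens are stripped, since TextVectorization reserves "" for padding and "[UNK]" for OOV):
import json
import tensorflow as tf

# "tokenizer.json" is an assumed path to the saved HuggingFace tokenizer
with open("tokenizer.json") as f:
    tok = json.load(f)

# keep token order, drop reserved tokens (an assumption about the vocab
# layout; adjust for your tokenizer's actual special tokens)
vocab = [t for t in tok["model"]["vocab"] if t not in ("", "[UNK]")]

vectorizer = tf.keras.layers.TextVectorization(vocabulary=vocab)
print(vectorizer(["i love cats"]))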

Make a property bag from a list of keys and values

I have a list containing the keys and another list containing the values (obtained by splitting a log line). How can I combine the two to make a property bag in Kusto?
let headers = pack_array("A", "B", "C");
datatable(RawData:string)
[
"1,2,3",
"4,5,6",
]
| expand fields = split(RawData, ",")
| expand dict = ???
Expected:
dict
-----
{"A": 1, "B": 2, "C": 3}
{"A": 4, "B": 5, "C": 6}
Here's one option that uses a combination of:
mv-apply: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/mv-applyoperator
pack(): https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/packfunction
make_bag(): https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/make-bag-aggfunction
let keys = pack_array("A", "B", "C");
datatable(RawData:string)
[
    "1,2,3",
    "4,5,6",
]
| project values = split(RawData, ",")
| mv-apply with_itemindex = i key = keys to typeof(string) on (
    summarize dict = make_bag(pack(key, values[i]))
)
values             dict
-----------------  -------------------------------
["1", "2", "3"]    {"A": "1", "B": "2", "C": "3"}
["4", "5", "6"]    {"A": "4", "B": "5", "C": "6"}

Accessing an array in Vega-Lite

I need to perform an operation in Vega-Lite/Kibana 6.5 similar to the one below: I need to divide the y axis by "data.values[0].b". How can I perform this operation?
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "A simple bar chart with embedded data.",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {"field": "b", "type": "quantitative"}
  }
}
Please take a look at the Transform Calculate topic in the Vega-Lite docs.
You can do:
"transform": [
  {"calculate": "datum.a / datum.b", "as": "y"}
]
Notice that 'datum' is the keyword used to access the data set.
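If you specifically need to divide every y value by the first row's b (data.values[0].b), one hedged option is a window transform with first_value, assuming your Vega-Lite version supports it (Kibana 6.5 bundles an older Vega-Lite, so verify first):
"transform": [
  {
    "window": [{"op": "first_value", "field": "b", "as": "first_b"}],
    "frame": [null, null]
  },
  {"calculate": "datum.b / datum.first_b", "as": "b_norm"}
],
"encoding": {
  "x": {"field": "a", "type": "ordinal"},
  "y": {"field": "b_norm", "type": "quantitative"}
}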

C3JS: Access the value shown on the X axis

I have a simple bar chart like this:
Here is my C3JS code:
var chart = c3.generate({
    data: {
        json: [
            {"A": 67, "B": 10, "site": "Google", "C": 12},
            {"A": 10, "B": 20, "site": "Amazon", "C": 12},
            {"A": 25, "B": 10, "site": "Stackoverflow", "C": 8},
            {"A": 20, "B": 22, "site": "Yahoo", "C": 12},
            {"A": 76, "B": 30, "site": "eBay", "C": 9}
        ],
        mimeType: 'json',
        keys: {
            x: 'site',
            value: ['A', 'B', 'C']
        },
        type: 'bar',
        selection: {
            enabled: true
        },
        onselected: function(d, element) {
            alert('selected x: ' + chart.selected()[0].x +
                  ' value: ' + chart.selected()[0].value +
                  ' name: ' + chart.selected()[0].name);
        },
        groups: [
            ['A', 'B', 'C']
        ]
    },
    axis: {
        x: {
            type: 'category'
        }
    }
});
After a chart element is selected (clicked), the alert shows the X, Value, and Name attributes of the first selected element, for example "selected x: 0 value: 67 name: A" after clicking the top-left chart element. How can I get the value shown on the X axis? In this case it is "Google".
The categories property is populated when the x-axis is declared to be of type category, as it is here. So to get the data from the x-axis you need to call the .categories() function.
onselected: function(d,element){alert(chart.categories()[d.index]);}
https://jsfiddle.net/4bos2qzx/1/
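
For completeness, a hedged variant of the handler that reads everything from the callback argument d (C3 data points carry id, index, and value):
onselected: function(d, element) {
    // d.index is the category index, d.value the bar's value,
    // d.id the series name; chart.categories() maps index -> label
    alert('site: ' + chart.categories()[d.index] +
          ', value: ' + d.value +
          ', series: ' + d.id);
}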