Karate - How to construct two tables, using lines from each to validate against the other [duplicate] - api

I want to use a single row under Examples in Cucumber, like below:
Examples:
| data1 | data2|paymentOp|
| MySql | uk1 |??????????|
Here paymentOp is a number that I get from a Java method which takes a List as an argument. The method returns several numbers, and I want each of them passed in under paymentOp.
The obvious way to iterate is to copy the row and paste it again in the table, but I don't want that because the method's result is dynamic: it may return 2 or 5 numbers.
Is it possible to achieve this using Karate?
If so, how should I proceed? Any lead would be much appreciated!

You can combine Examples: with dynamic behavior. Please read this example (especially the second one): https://github.com/intuit/karate/blob/master/karate-demo/src/test/java/demo/outline/examples.feature
Since you have difficulties reading the docs and examples (:P) here is a simple example. Take some time to understand it carefully.
Background:
* def data = { one: 1, two: 2, three: 3 }
Scenario Outline:
* match data.<key> == <value>
Examples:
| key | value |
| one | 1 |
| two | 2 |
| three | 3 |

Cypher: How to create a recursive cost query with alternates?

I have the following structure:
(:pattern)-[:contains]->(:pattern)
...basically a hierarchy of patterns that use other patterns as content. These constitute trees.
Certain patterns are generated by certain generators:
(:generator)-[:canProduce]->(:pattern)
The canProduce relationship has a cost value associated with it as a property. Multiple generators can create the same pattern.
I would like to figure out, with a query, what patterns I need to generate to produce a particular output - and which generators to choose to have the lowest cost. I started like this:
MATCH (p:pattern {name: 'preciousPattern'})-[:contains *]->(ps:pattern) RETURN ps
so far so good. The results don't contain the starting pattern, so I made this:
MATCH (p:pattern {name: 'preciousPattern'})-[:contains *]->(ps:pattern)
WITH p+collect(ps) as list
UNWIND list as patterns
RETURN patterns
That does not feel elegant, and it also does not provide the hierarchy.
I can of course do a path query (MATCH path = ...), but the results don't seem very useful.
Also, now I need to connect the cost from the generator relationship.
I tried this:
MATCH (p:pattern {name: 'awesome'})-[:contains *]->(ps:pattern)
WITH p+collect(ps) as list
UNWIND list as rec
CALL {
WITH rec
MATCH (rec)-[r:canGenerate]-(g:generator)
return r.GenCost as GenCost, g.name AS GenName
}
return rec.name, GenCost , GenName
The problem now is that if a pattern that is part of another pattern can be generated by multiple generators, I just get duplicate entries in one list. What I want is a separate list for each alternative, so that I can compute the cost of each.
This is my pattern tree:
Awesome
  input1
  input2
  input3
Input 3 can be generated by 2 different generators. I now get:
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input3 | 1.25 | TestGen3
input4 | 1.4 | TestGen4
What I want is this: two lists (or n in the general case, where I might have n possible paths). One:
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input3 | 1.25 | TestGen3
and one:
Awesome | 2 | MainGen
input1 | 3 | TestGen1
input2 | 2.5 | TestGen2
input4 | 1.4 | TestGen4
each set representing one alternative set, so that I can calculate the costs and compare.
I have no idea how to do something like that. Any suggestions?
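One way to think about the "one list per alternative" requirement is as a Cartesian product over the generator choices for each pattern. The following is only a minimal sketch in Python, outside the database, assuming the query results have already been collected into a mapping from pattern name to its candidate (generator, cost) pairs. The `candidates` data and the `alternatives` helper are hypothetical, and input3 is assumed to be the pattern with two generators.

```python
from itertools import product

# Hypothetical data mirroring the question: pattern -> [(generator, cost), ...].
# Assumption: input3 is the pattern that two generators can produce.
candidates = {
    "Awesome": [("MainGen", 2.0)],
    "input1": [("TestGen1", 3.0)],
    "input2": [("TestGen2", 2.5)],
    "input3": [("TestGen3", 1.25), ("TestGen4", 1.4)],
}

def alternatives(candidates):
    """Yield (total_cost, assignment) for every way of picking one generator per pattern."""
    patterns = list(candidates)
    for combo in product(*(candidates[p] for p in patterns)):
        assignment = [(p, gen, cost) for p, (gen, cost) in zip(patterns, combo)]
        total = sum(cost for _, _, cost in assignment)
        yield total, assignment

all_alternatives = list(alternatives(candidates))
# The alternative with the lowest total cost
cheapest_total, cheapest = min(all_alternatives, key=lambda alt: alt[0])

for total, assignment in all_alternatives:
    print(round(total, 2), [(p, g) for p, g, _ in assignment])
```

The number of alternatives grows multiplicatively with the number of patterns that have several generators. If only the cheapest set is needed and the choices are independent, taking the minimum-cost generator per pattern gives the same result without enumerating everything.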

Returning a value based on multiple conditions in excel

Consider the following data:
Item | Overall | Individual | newColumn
A | Fail | Pass | blank
A | Fail | Fail | blank
B | Fail | Pass | issue
B | Fail | Pass | issue
C | Pass | Pass | blank
I have the logic built out for the first 3 columns already. There are two levels of fails in this data:
overall, and
individual.
If any of the individual fail, the overall fails. Sometimes the overall can fail even though all the individuals are fine. This logic is already built out.
I am trying to find a formula for the newColumn. If all the individuals are a pass for a given item (for example item B), but the overall is still a fail, the cell should return the text "issue". It is OK if it returns "issue" twice; I am not sure whether that part can be de-duplicated. I've tried various forms of COUNTIFS/AND/OR and creating columns that count distinct values, but I always find a scenario that breaks the logic.
Try this:
=IF(COUNTIFS($A$2:$A$6,A2,$C$2:$C$6,"Fail"),"blank",IF(B2="Fail","Issue","blank"))
As required
If you add a new column with the formula:
=IF(B2="Fail",IF(COUNTIFS(A:A,A2,C:C,"fail")=0,"issue",""),"")
Then this should work under these assumptions:
For each item, if one of the Overall values is "Fail", they are all "Fail"
The only two possible values are "Pass" and "Fail" for columns B & C
If you require the word blank instead of a blank cell then use:
=IF(B2="Fail",IF(COUNTIFS(A:A,A2,C:C,"fail")=0,"issue","blank"),"blank")
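To check the logic of the formulas above against the sample data, here is a small sketch in Python that mirrors the COUNTIFS-based rule: an item is flagged only when its Overall is "Fail" and none of its rows has an Individual "Fail". The `new_column` function name is made up for illustration, and the comparison is lower-cased because COUNTIFS matches text case-insensitively.

```python
# Sample rows from the question: (Item, Overall, Individual).
rows = [
    ("A", "Fail", "Pass"),
    ("A", "Fail", "Fail"),
    ("B", "Fail", "Pass"),
    ("B", "Fail", "Pass"),
    ("C", "Pass", "Pass"),
]

def new_column(rows):
    # Counterpart of COUNTIFS(A:A, A2, C:C, "fail") > 0:
    # the set of items that have at least one individual fail.
    items_with_individual_fail = {
        item for item, _, individual in rows if individual.lower() == "fail"
    }
    result = []
    for item, overall, _ in rows:
        if overall.lower() == "fail" and item not in items_with_individual_fail:
            result.append("issue")
        else:
            result.append("blank")
    return result

flags = new_column(rows)
print(flags)  # ['blank', 'blank', 'issue', 'issue', 'blank']
```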

How to apply a custom filtering function on a Spark DataFrame

I have a DataFrame of the form:
A_DF = |id_A: Int|concatCSV: String|
and another one:
B_DF = |id_B: Int|triplet: List[String]|
Examples of concatCSV could look like:
"StringD, StringB, StringF, StringE, StringZ"
"StringA, StringB, StringX, StringY, StringZ"
...
while a triplet is something like:
("StringA", "StringF", "StringZ")
("StringB", "StringU", "StringR")
...
I want to produce the Cartesian product of A_DF and B_DF, e.g.:
| id_A: Int | concatCSV: String | id_B: Int | triplet: List[String] |
| 14 | "StringD, StringB, StringF, StringE, StringZ" | 21 | ("StringA", "StringF", "StringZ")|
| 14 | "StringD, StringB, StringF, StringE, StringZ" | 45 | ("StringB", "StringU", "StringR")|
| 18 | "StringA, StringB, StringX, StringY, StringG" | 21 | ("StringA", "StringF", "StringZ")|
| 18 | "StringA, StringB, StringX, StringY, StringG" | 45 | ("StringB", "StringU", "StringR")|
| ... | | | |
Then keep just the records that have at least two substrings (e.g. StringA, StringB) from A_DF("concatCSV") that appear in B_DF("triplet"), i.e. use filter to exclude those that don't satisfy this condition.
First question is: can I do this without converting the DFs into RDDs?
Second question is: can I ideally do the whole thing in the join step--as a where condition?
I have tried experimenting with something like:
val cartesianRDD = A_DF
.join(B_DF,"right")
.where($"triplet".exists($"concatCSV".contains(_)))
but where cannot be resolved. I tried filter instead of where, but still no luck. Also, for some strange reason, the type annotation for cartesianRDD is SchemaRDD and not DataFrame. How did I end up with that? Finally, the short snippet above is incomplete anyway, as it would also keep records where just one substring from concatCSV is found in triplet.
So, third question is: Should I just change to RDDs and solve it with a custom filtering function?
Finally, last question: Can I use a custom filtering function with DataFrames?
Thanks for the help.
The function CROSS JOIN is implemented in Hive, so you could first do the cross-join using Hive SQL:
A_DF.registerTempTable("a")
B_DF.registerTempTable("b")
// sqlContext should be really a HiveContext
val result = sqlContext.sql("SELECT * FROM a CROSS JOIN b")
Then you can filter down to your expected output using two udf's. One that converts your string to an array of words, and a second one that gives us the length of the intersection of the resulting array column and the existing column "triplet":
import scala.collection.mutable.WrappedArray
import org.apache.spark.sql.functions.{col, udf}
// Split the CSV string into an array of trimmed tokens
val splitArr = udf { (s: String) => s.split(",").map(_.trim) }
// Count the elements that the two array columns have in common
val commonLen = udf { (a: WrappedArray[String], b: WrappedArray[String]) =>
  a.intersect(b).length
}
val temp = result
  .withColumn("concatArr", splitArr(col("concatCSV")))
  .withColumn("comm", commonLen(col("triplet"), col("concatArr")))
  .filter(col("comm") >= 2) // keep rows with at least two common substrings
  .drop("comm")
  .drop("concatArr")
temp.show
+----+--------------------+----+--------------------+
|id_A| concatCSV|id_B| triplet|
+----+--------------------+----+--------------------+
| 14|StringD, StringB,...| 21|[StringA, StringF...|
| 18|StringA, StringB,...| 21|[StringA, StringF...|
+----+--------------------+----+--------------------+
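The filtering rule itself (cross join, split the CSV string, keep pairs with at least two tokens in common) can be checked without a Spark cluster. The following is a plain-Python sketch of that rule only, not Spark code. The in-memory lists stand in for A_DF and B_DF and use the sample strings from the question, and the function names are hypothetical.

```python
from itertools import product

# Hypothetical in-memory stand-ins for A_DF and B_DF,
# filled with the sample values from the question.
a_rows = [
    (14, "StringD, StringB, StringF, StringE, StringZ"),
    (18, "StringA, StringB, StringX, StringY, StringZ"),
]
b_rows = [
    (21, ["StringA", "StringF", "StringZ"]),
    (45, ["StringB", "StringU", "StringR"]),
]

def split_csv(s):
    # Same job as the splitArr udf: split on commas and trim whitespace
    return [part.strip() for part in s.split(",")]

def common_count(tokens, triplet):
    # Same job as the commonLen udf; set intersection is enough here
    # because the tokens in each row are assumed to be distinct
    return len(set(tokens) & set(triplet))

def cross_join_filter(a_rows, b_rows, min_common=2):
    kept = []
    # product() pairs every A row with every B row, i.e. a cross join
    for (id_a, csv_text), (id_b, triplet) in product(a_rows, b_rows):
        if common_count(split_csv(csv_text), triplet) >= min_common:
            kept.append((id_a, id_b))
    return kept

matches = cross_join_filter(a_rows, b_rows)
print(matches)  # [(14, 21), (18, 21)]
```

The surviving (id_A, id_B) pairs are the same two rows shown in the output table above.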

Value to table header in Pentaho

Hi, I'm quite new to Pentaho Spoon and I have a problem:
I have a table like this:
model | type | color| q
--1---| --1-- | blue | 1
--1---| --2-- | blue | 2
--1---| --1-- | red | 1
--1---| --2-- | red | 3
--2---| --1-- | blue | 4
--2---| --2-- | blue | 5
And I would like to create a separate table (to export as CSV or Excel) for each model, with one row per type, the color values as column headers, and q as the cell value:
table-1.csv
type | blue | red
--1--| -1-- | -1-
--2--| -2-- | -3-
table-2.csv
type | blue
--1--| -4-
--2--| -5-
I tried the row denormalizer but got nowhere.
Any suggestion?
Typically it's helpful to see what you have done in order to offer help, but I know how counterintuitive the "help" on this step is.
Make sure you sort the rows on Model and Type before sending them to the denormalizer step. Then give this a try:
As for splitting the output into files, there are a few ways to handle that. Take a look at the Switch/Case step using the Model field.
Also, if you haven't found them already, take a look at the sample files that come with the PDI download. They should be in ...pdi-ce-6.1.0.1-196\data-integration\samples. They can be more helpful than the online documentation sometimes.
Row denormalizer can't be used here if the number of colors is unknown. Also, you can't define text output fields dynamically.
There are a few ways that I can see without using Java and JS steps. One of them is based on the following idea: we can prepare rows with two columns:
Row             Model
type|blue|red   1
1|1|1           1
2|2|3           1
type|blue       2
1|4             2
2|5             2
Then we can prepare a filename for each row using the Model field and output all rows using a text output step whose file name is taken from the filename field. That way all records are exported into two files without additional effort.
Here you can find sample transformation: copy-paste me into new transformation
Please note that this is a sample solution that works only with CSV. It also works only if you have the same number of colors for each type inside a model. It's just a hint on how to use Spoon, not a complete solution.
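To make the target shape concrete, here is a minimal Python sketch of the same idea outside Pentaho: group the rows by model, build one header per model from the colors that occur in it, and key each result by a filename derived from the model. It illustrates the data transformation only and is not a PDI transformation. `pivot_per_model` is a hypothetical helper, and the output is kept in memory rather than written to disk.

```python
import csv
import io
from collections import defaultdict

# Rows from the question: (model, type, color, q).
rows = [
    (1, 1, "blue", 1), (1, 2, "blue", 2),
    (1, 1, "red", 1), (1, 2, "red", 3),
    (2, 1, "blue", 4), (2, 2, "blue", 5),
]

def pivot_per_model(rows):
    """Return {filename: csv_text} with one pivoted table per model."""
    # Nested mapping: model -> type -> color -> q
    by_model = defaultdict(lambda: defaultdict(dict))
    for model, type_, color, q in rows:
        by_model[model][type_][color] = q

    files = {}
    for model, types in by_model.items():
        # Header colors in first-seen order for this model
        colors = []
        for cells in types.values():
            for color in cells:
                if color not in colors:
                    colors.append(color)
        buf = io.StringIO()
        writer = csv.writer(buf, delimiter="|", lineterminator="\n")
        writer.writerow(["type"] + colors)
        for type_ in sorted(types):
            # A missing color for a type becomes an empty cell
            writer.writerow([type_] + [types[type_].get(c, "") for c in colors])
        # The filename is derived from the model, like the filename field above
        files[f"table-{model}.csv"] = buf.getvalue()
    return files

files = pivot_per_model(rows)
for name, text in files.items():
    print(name)
    print(text)
```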

Generate automatically all the variables and value labels in SPSS

I have the variable labels and value labels in a table in my database, like this
id_variable_label | variable_label | id_value_label | value_label | id_father_label
---------------------------------------------------------------------------------------------------------
1 | father_label | null | null | null
null | father_label | 1 | child01 | 1
null | father_label | 2 | child02 | 1
Is there a way to automatically generate all the variable and value labels when I import the data from my database through an ODBC connection?
There isn't a direct way to do this, but if you read that table as an SPSS dataset, it would be pretty simple to generate the labels with a little Python code.
Note also that if your labeling is static, you can use APPLY DICTIONARY to copy labels from one dataset to another, so saving one fully labeled file would allow you to propagate that to others that are similarly structured.
You can use SPSS syntax to create variable and value labels.
See the SPSS commands VARIABLE LABELS and VALUE LABELS.
Here's a tutorial that explains how you can use them.
You could generate the syntax from your database.
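As a rough illustration of generating that syntax, the sketch below turns rows shaped like the label table in the question into VARIABLE LABELS and VALUE LABELS commands. Several things are assumptions: the `variable_names` mapping from id_variable_label to an SPSS variable name is hypothetical (the question does not say how the variables are named), `build_syntax` is an invented helper, and labels are assumed to contain no quote characters.

```python
# Rows shaped like the label table in the question:
# (id_variable_label, variable_label, id_value_label, value_label, id_father_label)
label_rows = [
    (1, "father_label", None, None, None),
    (None, "father_label", 1, "child01", 1),
    (None, "father_label", 2, "child02", 1),
]

# Hypothetical mapping from id_variable_label to the SPSS variable name.
variable_names = {1: "var1"}

def build_syntax(label_rows, variable_names):
    lines = []
    value_labels = {}
    for var_id, var_label, value_id, value_label, father_id in label_rows:
        if var_id is not None:
            # A row with a variable id describes the variable itself
            lines.append(f"VARIABLE LABELS {variable_names[var_id]} '{var_label}'.")
        elif value_id is not None:
            # A row with a value id belongs to the variable named by id_father_label
            value_labels.setdefault(father_id, []).append((value_id, value_label))
    for father_id, pairs in value_labels.items():
        body = " ".join(f"{vid} '{vlabel}'" for vid, vlabel in pairs)
        lines.append(f"VALUE LABELS {variable_names[father_id]} {body}.")
    return "\n".join(lines)

syntax = build_syntax(label_rows, variable_names)
print(syntax)
```

The generated text could then be saved as a syntax file and run after the data import.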