I have a rather complex query and I'm not sure if it's doable in PostgreSQL.
I have 3 tables: checklists, checklist_items, and a junction table that maintains a many-to-many relationship between checklists and checklist_items.
checklists
| id | description |
| -------- | ---------------- |
| 01 | 'Upon Dispatch' |
| 02 | 'Upon Return' |
checklist_items
| id | type |elements |
| -- | ----------| -------------------------|
| 01 | 'combobox'| [] |
| 02 | 'text' | [] |
| 03 | 'select' | ['option 1', 'option 2'] |
junction_table
| checklist_id | checklist_item_id |
| ------------ | ----------------- |
| 01 | 01 |
| 01 | 02 |
| 01 | 03 |
| 02 | 02 |
I was trying to write a query to get all checklists along with all checklist_items associated with each individual checklist.
Something along these lines:
[
{
id: '01',
description: 'Upon Dispatch',
items: [
{
id: '01',
type: 'combobox',
elements: []
},
{
id: '02',
type: 'text',
elements: []
},
{
id: '03',
type: 'select',
elements: ['option 1', 'option 2']
}
]
},
{
id: '02',
description: 'Upon Return',
items: [
{
id: '02',
type: 'text',
elements: []
}
]
}
]
My current solution is to fetch all checklists, iterate over them to find their checklist_items in junction_table, populate the checklist_items with a join, and add an items field to each checklist.
const checklists = await knex.select('*').from('checklists');
for (let checklist of checklists) {
let data = await knex('junction_table')
.join(
'checklist_items',
'junction_table.checklist_item_id',
'=',
'checklist_items.id'
)
.select('id', 'type', 'elements')
.where({ checklist_id: checklist.id });
checklist.items = data;
}
Is there a better way to achieve this, preferably without iterating over the checklists array in code?
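For what it's worth, I suspect the aggregation can be pushed into a single query. A minimal sketch in plain PostgreSQL, assuming the three tables above and PostgreSQL 9.4+ for json_agg with FILTER (untested, and runnable from knex via knex.raw):
SELECT
  c.id,
  c.description,
  COALESCE(
    json_agg(json_build_object('id', ci.id, 'type', ci.type, 'elements', ci.elements))
      FILTER (WHERE ci.id IS NOT NULL),
    '[]'::json
  ) AS items
FROM checklists c
LEFT JOIN junction_table jt ON jt.checklist_id = c.id
LEFT JOIN checklist_items ci ON ci.id = jt.checklist_item_id
GROUP BY c.id, c.description;
The LEFT JOINs keep checklists that have no items, and the FILTER/COALESCE pair turns those into an empty items array instead of [null].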
How do I parameterize the item block in the code below?
Scenario Outline: parameterization
* text query =
"""
{
"add":"Product",
"item":[
{"pn":"12345","qn":1,"m":"mk"}
]
}
"""
Given url baseURL
And request { query: '#(query)' }
And header Accept = 'application/json'
When method post
Then status 200
Examples:
| item_num |
| 12345 |
| 67890 |
Scenario Outline:
* def json = { add: 'Product', item: [{ pn: '<itemNum>', qn: 1, m: 'mk'}]}
* print json
Examples:
| itemNum |
| 12345 |
| 67890 |
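Combining that with the request flow from the question, the whole outline could be driven by the Examples table (a sketch, untested, assuming baseURL is configured as before):
Scenario Outline: parameterized item
* def query = { add: 'Product', item: [{ pn: '<itemNum>', qn: 1, m: 'mk' }] }
Given url baseURL
And request query
And header Accept = 'application/json'
When method post
Then status 200
Examples:
| itemNum |
| 12345 |
| 67890 |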
I found a lot of introductions to parameterized tests/test cases for TestCafe, but the syntax is completely different from the one I am using; I guess they're for the discontinued paid version. How can I do the same thing with the free version? I'm not looking for user roles specifically, I want to write tests with parameters in general.
Do you want to do something like this? It works perfectly for me:
import { Selector } from 'testcafe';
fixture `Your fixture`
.page `http://some_url.com`
const testCases = [
    { name: 'name1', param: 'param1' },
    { name: 'name2', param: 'param2' }
    // ...more cases
];
for (const c of testCases) {
test(`Test ${c.name}`, async t => {
yourTestMethod(c.param)
});
}
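To make the loop concrete, here is a self-contained variant with an inline assertion in place of the yourTestMethod placeholder (the page URL and the #search selector are hypothetical):
import { Selector } from 'testcafe';
fixture `Parameterized fixture`
    .page `http://some_url.com`;
const testCases = [
    { name: 'name1', param: 'param1' },
    { name: 'name2', param: 'param2' }
];
for (const c of testCases) {
    test(`Test ${c.name}`, async t => {
        // type the parameter into a search box and verify it was accepted
        await t
            .typeText(Selector('#search'), c.param)
            .expect(Selector('#search').value).eql(c.param);
    });
}
Because TestCafe discovers tests by executing the file, calling test() inside a plain for loop registers one independent test per entry.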
An additional twist can be added by using a combination of JS and YAML:
import YamlTableReader, {fixtureData, TestData} from "./YamlTableReader";
var table = fixtureData `
| ID | N1 | N2 | Equals |
| Should Be equal | 1 | 1 | true |
| Shouldn't be equal | 1 | 2 | false |
| Shouldn't be equal | 1 | "hans" | false |
| Should be equal | hans | "hans" | true |
`;
table.forEach(row => {
    test(`Should be equal: ${row["ID"]}`, async t => {
        await t.expect(row["N1"] === row["N2"]).eql(row["Equals"]);
    });
});
Sample sources for this can be found here: https://github.com/deicongmbh/jasmine-param-tests
I have a table with an article_id field, a nested repeated field called author_names, and a string field called extra_informations that contains a JSON string.
Here is an example row of the table:
[
{
"article_id": "2732930586",
"author_names": [
{
"AuN": "h kanahashi",
"AuId": "2591665239",
"AfN": null,
"AfId": null,
"S": "1"
},
{
"AuN": "t mukai",
"AuId": "2607493793",
"AfN": null,
"AfId": null,
"S": "2"
},
{
"AuN": "y yamada",
"AuId": "2606624579",
"AfN": null,
"AfId": null,
"S": "3"
},
{
"AuN": "k shimojima",
"AuId": "2606600298",
"AfN": null,
"AfId": null,
"S": "4"
},
{
"AuN": "m mabuchi",
"AuId": "2606138976",
"AfN": null,
"AfId": null,
"S": "5"
},
{
"AuN": "t aizawa",
"AuId": "2723380540",
"AfN": null,
"AfId": null,
"S": "6"
},
{
"AuN": "k higashi",
"AuId": "2725066679",
"AfN": null,
"AfId": null,
"S": "7"
}
],
"extra_informations": "{
\"DN\": \"Experimental study for improvement of crashworthiness in AZ91 magnesium foam controlling its microstructure.\",
\"S\":[{\"Ty\":1,\"U\":\"https://shibaura.pure.elsevier.com/en/publications/experimental-study-for-improvement-of-crashworthiness-in-az91-mag\"}],
\"VFN\":\"Materials Science and Engineering\",
\"FP\":283,
\"LP\":287,
\"RP\":[{\"Id\":2024275625,\"CoC\":5},{\"Id\":2035451257,\"CoC\":5}, {\"Id\":2141952446,\"CoC\":5},{\"Id\":2126566553,\"CoC\":6}, {\"Id\":2089573897,\"CoC\":5},{\"Id\":2069241702,\"CoC\":7}, {\"Id\":2000323790,\"CoC\":6},{\"Id\":1988924750,\"CoC\":16}],
\"ANF\":[
{\"FN\":\"H.\",\"LN\":\"Kanahashi\",\"S\":1},
{\"FN\":\"T.\",\"LN\":\"Mukai\",\"S\":2},
{\"FN\":\"Y.\",\"LN\":\"Yamada\",\"S\":3},
{\"FN\":\"K.\",\"LN\":\"Shimojima\",\"S\":4},
{\"FN\":\"M.\",\"LN\":\"Mabuchi\",\"S\":5},
{\"FN\":\"T.\",\"LN\":\"Aizawa\",\"S\":6},
{\"FN\":\"K.\",\"LN\":\"Higashi\",\"S\":7}
],
\"BV\":\"Materials Science and Engineering\",\"BT\":\"a\"}"
}
]
In extra_informations.ANF I have a nested array that contains some more author name information.
The nested repeated author_names field has a sub-field author_names.S which can be mapped onto extra_informations.ANF.S for a join. Using this mapping I am trying to achieve the following table:
| article_id | author_names.AuN | S | extra_informations.ANF.FN | extra_informations.ANF.LN |
| ---------- | ---------------- | - | ------------------------- | ------------------------- |
| 2732930586 | h kanahashi | 1 | H. | Kanahashi |
| 2732930586 | t mukai | 2 | T. | Mukai |
| 2732930586 | y yamada | 3 | Y. | Yamada |
| 2732930586 | k shimojima | 4 | K. | Shimojima |
| 2732930586 | m mabuchi | 5 | M. | Mabuchi |
| 2732930586 | t aizawa | 6 | T. | Aizawa |
| 2732930586 | k higashi | 7 | K. | Higashi |
The primary problem I faced is that when I convert the JSON string using JSON_EXTRACT(extra_informations, "$.ANF"), it does not give me an array; instead it gives me the string form of the nested repeated array, which I could not convert into an array.
Is it possible to produce such a table using standard SQL in BigQuery?
Option 1
This is based on the REGEXP_REPLACE function plus a few more functions (REPLACE, SPLIT, etc.) to manipulate the result. Note: we need the extra manipulation because wildcards and filters are not supported in JSONPath expressions in BigQuery.
#standard SQL
SELECT
article_id, author.AuN, author.S,
REPLACE(SPLIT(extra, '","')[OFFSET(0)], '"FN":"', '') FirstName,
REPLACE(SPLIT(extra, '","')[OFFSET(1)], 'LN":"', '') LastName
FROM `table` , UNNEST(author_names) author
LEFT JOIN UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(extra_informations, '$.ANF'), r'\[{|}\]', ''), '},{')) extra
ON author.S = CAST(REPLACE(SPLIT(extra, '","')[OFFSET(2)], 'S":', '') AS INT64)
Option 2
To overcome this BigQuery "limitation" for JSONPath, you can use a custom function as the example below shows.
Note: it uses jsonpath-0.8.0.js, which can be downloaded from https://code.google.com/archive/p/jsonpath/downloads and is assumed to be uploaded to Google Cloud Storage as gs://your_bucket/jsonpath-0.8.0.js
#standard SQL
CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS STRING
LANGUAGE js AS """
try { var parsed = JSON.parse(json);
return jsonPath(parsed, json_path);
} catch (e) { return null }
"""
OPTIONS (
library="gs://your_bucket/jsonpath-0.8.0.js"
);
SELECT
article_id, author.AuN, author.S,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(@.S==', CAST(author.S AS STRING), ')].FN')) FirstName,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(@.S==', CAST(author.S AS STRING), ')].LN')) LastName
FROM `table`, UNNEST(author_names) author
As you can see, now you can do all the magic in one simple JSONPath expression.
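Note: if your BigQuery supports JSON_EXTRACT_ARRAY (added to standard SQL later), the string surgery in Option 1 collapses into a plain join over the extracted array. A sketch, assuming the same schema:
#standard SQL
SELECT
  article_id, author.AuN, author.S,
  JSON_EXTRACT_SCALAR(anf, '$.FN') FirstName,
  JSON_EXTRACT_SCALAR(anf, '$.LN') LastName
FROM `table`, UNNEST(author_names) author
LEFT JOIN UNNEST(JSON_EXTRACT_ARRAY(extra_informations, '$.ANF')) anf
ON CAST(JSON_EXTRACT_SCALAR(anf, '$.S') AS INT64) = CAST(author.S AS INT64)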
I have a dataframe that I can import into Elasticsearch without any problem, but each row is created as a new record. I want to use the destination number as the document ID and update that document with the additional records.
from StringIO import StringIO
import pandas as pd
u_cols = ["destination", "status", "batchid", "dateint", "message", "senddate"]
audit_trail = StringIO('''
918968400000 | DELIVRD | abcd_xyz-e89a4ebd3729675c | 20150226103700 | "some company is advertising here" | 2015-04-02 13:12:18
918968400000 | DELIVRD | efgh_xyz-e89a4ebd3729675c | 20160226103700 | "some company is advertising here" | 2016-04-02 13:12:18
8918968400000 | FAILED | abcd_xyz-e89a4ebd3729675c | 20150826103700 | "some company is advertising here" | 2015-08-02 13:12:18
8918968400000 | DELIVRD | xyz_abc-e89a4ebd3729675c | 20140226103700 | "some company is advertising here" | 2014-04-02 13:12:18
918968400000 | FAILED | abcd_pqr-e89a4ebd3729675c | 20150221103700 | "some company is advertising here" | 2015-04-21 13:12:18
''')
df11 = pd.read_csv(audit_trail, sep="|", names = u_cols )
import json
tmp = df11.to_json(orient = "records")
df_json= json.loads(tmp)
mylist=[]
for doc in df_json:
action = { "_index": "myindex3", "_type": "myindex1type", "_source": doc }
mylist.append(action)
import elasticsearch
from elasticsearch import helpers
es = elasticsearch.Elasticsearch('http://23.23.186.196:9200')
helpers.bulk(es, mylist)
In the above case I expect only 2 documents: one with ID 918968400000 holding 3 records, and the other with ID 8918968400000 holding only 2 records. These records will be nested something like this:
doc={"campaigns" : [{"status": "FAILED", "batchid": "abcd_xyz-e89a4ebd3729675c", "dateint": 20150826103700, "message" : "some company is advertising here", "senddate": "2015-08-02 13:12:18"},
{"status": "DELIVRD", "batchid": "xyz_abc-e89a4ebd3729675c", "dateint": 20140226103700, "message" : "some company is advertising here", "senddate": "2014-04-02 13:12:18" }]}
res = es.index(index="test-index", doc_type='tweet', id=8918968400000, body=doc)
I need the pandas dataframe to use the bulk API to insert the data as shown above. Is that possible?
Update
I changed the op type from index to update. This stores all the fields that I need, but it does not store them as nested objects.
mylist = []
mydoc = {}
for id, doc in enumerate(df_json):
    mydoc[id] = {}
    mydoc[id]['doc'] = doc
    mydoc[id]['doc_as_upsert'] = True
    action = { "_index": "myindex9", "_type": "myindex1type", "_id": doc['destination'], "_op_type": "update", "_source": mydoc[id] }
    mylist.append(action)
Is there any way to store rows as nested objects?
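One way that should get there (a sketch, untested): group the dataframe by destination and upsert each group as a nested campaigns array in a single bulk call. Note that doc_as_upsert merges fields, so re-running this replaces the campaigns array rather than appending to it.
mylist = []
for destination, group in df11.groupby('destination'):
    # every row for this destination becomes one entry in the nested array
    campaigns = group.drop('destination', axis=1).to_dict(orient='records')
    action = {
        "_op_type": "update",
        "_index": "myindex9",
        "_type": "myindex1type",
        "_id": destination,
        "doc": {"campaigns": campaigns},
        "doc_as_upsert": True,
    }
    mylist.append(action)
helpers.bulk(es, mylist)
To truly append to an existing campaigns array across separate runs, a scripted update (ctx._source.campaigns.add(...)) would be needed instead of doc_as_upsert.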