Not able to read '#' symbol from a csv file in karate [duplicate]

I want to write data-driven tests, passing dynamic values read from an external file (CSV).
I am able to pass dynamic values from the CSV for simple strings (account number and affiliate id below). But, using embedded expressions, how can I pass dynamic values from the CSV file for the "DealerReportFormats" JSON array below?
Any help is highly appreciated!!
Scenario Outline: Dealer dynamic requests
Given path '/dealer-reports/retrieval'
And request read('../DealerTemplate.json')
When method POST
Then status 200
Examples:
| read('../DealerData.csv') |
DealerTemplate.json is below
{
    "DealerId": "FIXED",
    "DealerName": "FIXED",
    "DealerType": "FIXED",
    "DealerCredentials": {
        "accountNumber": "#(DealerCredentials_AccountNumber)",
        "affiliateId": "#(DealerCredentials_AffiliateId)"
    },
    "DealerReportFormats": [
        {
            "name": "SalesReport",
            "format": "xml"
        },
        {
            "name": "CustomerReport",
            "format": "txt"
        }
    ]
}
DealerData.csv:
DealerCredentials_AccountNumber,DealerCredentials_AffiliateId
testaccount1,123
testaccount2,12345
testaccount3,123456

CSV is only for "flat" structures, so trying to mix that with JSON is too ambitious in my honest opinion. Please look for another framework if needed :)
That said, I see 2 options:
a) use proper quoting and escaping in the CSV
b) refer to JSON files
Here is an example:
Scenario Outline:
* json foo = foo
* print foo
Examples:
| read('test.csv') |
And test.csv is:
foo,bar
"{ a: 'a1', b: 'b1' }",test1
"{ a: 'a2', b: 'b2' }",test2
I leave it as an exercise for you if you want to escape double-quotes. It is possible.
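For reference, standard CSV escaping doubles any double-quote inside a quoted field, so a double-quoted variant of the same data could look like this (untested sketch):
foo,bar
"{ ""a"": ""a1"", ""b"": ""b1"" }",test1
"{ ""a"": ""a2"", ""b"": ""b2"" }",test2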
Option (b) is to refer to stand-alone JSON files and read them:
foo,bar
j1.json,test1
j2.json,test2
And you can do * def foo = read(foo) in your feature.

Text processing to fetch the attributes

Find below the input data:
[{"acc_id": 166211981, "archived": true, "access_key": "ALLLJNXXXXXXXPU4C7GA", "secret_key": "X12J6SixMaFHoXXXXZW707XXX24OXXX", "created": "2018-10-03T05:56:01.208069Z", "description": "Data Testing", "id": 11722990697, "key_field": "Ae_Appl_Number", "last_modified": "2018-10-03T08:44:20.324237Z", "list_type": "js_variables", "name": "TEST_AE_LI_KEYS_003", "project_id": 1045199007354, "s3_path": "opti-port/dcp/ue.1045199007354/11722990697"}, {"acc_id": 166211981, "archived": false, "access_key": "ALLLJNXXXXXXXPU4C7GA", "secret_key": "X12J6SixMaFHoXXXXZW707XXX24OXXX", "created": "2018-10-03T08:46:32.535653Z", "description": "Data Testing", "id": 11724290732, "key_field": "Ae_Appl_Number", "last_modified": "2018-10-03T10:11:13.167798Z", "list_type": "js_variables", "name": "TEST_AE_LI_KEYS_001", "project_id": 1045199007354, "s3_path": "opti-port/dcp/ue.1045199007354/11724290732"}]
I want the output file to contain below data:
11722990697,TEST_AE_LI_KEYS_003,opti-port/dcp/ue.1045199007354/11722990697
11724290732,TEST_AE_LI_KEYS_001,opti-port/dcp/ue.1045199007354/11724290732
I am able to achieve the same by taking one record at a time and processing it using awk, but I am getting the field names also.
Find below my trial:
R=`cat in.txt | awk -F '},' '{print $1}'`
echo $R | awk -F , '{print $7 " " $11 " " $13}'
I want it to be done for the entire file, without the field names.
AWK/SED is not the right tool for parsing JSON files. Use jq:
[root@localhost]# jq -r '.[] | "\(.id),\(.name),\(.s3_path)"' abc.json
11722990697,TEST_AE_LI_KEYS_003,opti-port/dcp/ue.1045199007354/11722990697
11724290732,TEST_AE_LI_KEYS_001,opti-port/dcp/ue.1045199007354/11724290732
If you don't want to install any other software, then you can use Python as well, which is found on most Linux machines:
[root@localhost]# cat parse_json.py
#!/usr/bin/env python
# Import the json module
import json

# Open the json file in read-only mode and load the json data.
# It will load the data into a Python list of dictionaries.
with open('abc.json') as fh:
    data = json.load(fh)

# To print the whole data structure
# print(data)

# To print the name key from the first and second record
# print(data[0]["name"])
# print(data[1]["name"])

# Now to get both the records use a for loop
for i in range(0, 2):
    print("%s,%s,%s" % (data[i]["id"], data[i]["name"], data[i]["s3_path"]))
[root@localhost]# ./parse_json.py
11722990697,TEST_AE_LI_KEYS_003,opti-port/dcp/ue.1045199007354/11722990697
11724290732,TEST_AE_LI_KEYS_001,opti-port/dcp/ue.1045199007354/11724290732
Assuming the input data is in a file called input.json, you can use a Python script to fetch the attributes. Put the following content in a file called fetch_attributes.py:
import json

with open("input.json") as fh:
    data = json.load(fh)

with open("output.json", "w") as of:
    for record in data:
        of.write("%s,%s,%s\n" % (record["id"], record["name"], record["s3_path"]))
Then, run the script as:
python fetch_attributes.py
Code Explanation
import json - Importing Python's json library to parse the JSON.
with open("input.json") as fh: - Opening the input file and getting the file handler in if.
data = json.load(fh) - Loading the JSON input file using load() method from the json library which will populate the data variable with a Python dictionary.
with open("output.json", "w") as of: - Opening the output file in write mode and getting the file handler in of.
for record in data: - Loop over the list of records in the JSON.
of.write("%s,%s,%s\n" % (record["id"],record["name"],record["s3_path"])) - Fetching the required attributes from each record and writing them in the file.

How to generate CREATE TABLE script from an existing table

I created a table with the BigQuery interface. A large table. And I would like to export the schema of this table in Standard SQL (or Legacy SQL) syntax.
Is it possible?
Thanks!
You can get the DDL for a table with this query:
SELECT t.ddl
FROM `your_project.dataset.INFORMATION_SCHEMA.TABLES` t
WHERE t.table_name = 'your_table_name'
;
As can be read in this question, it is not possible to do so, and there is a feature request to obtain the output schema of a standard SQL query, but it seems it was never implemented. Depending on what your use case is, apart from using bq, another workaround is to run the query with LIMIT 0. Results are returned immediately (tested with a 100B-row table) with the schema field names and types.
Knowing this, you could also automate the procedure in your favorite scripting language. As an example, I used Cloud Shell as the CLI and made the API calls with curl. It makes three successive calls: the first one executes the query and a jobId is obtained (unnecessary fields are not included in the request URL), then we obtain the dataset and table IDs corresponding to that particular job and, finally, the schema is retrieved.
I used the jq tool, which comes preinstalled in Cloud Shell, to parse the responses (see the jq manual), and wrapped everything in a shell function:
result_schema()
{
    QUERY=$1
    authToken="$(gcloud auth print-access-token)"
    projectId=$(gcloud config get-value project 2>/dev/null)
    # get the jobId
    jobId=$(curl -H "Authorization: Bearer $authToken" \
        -H "Content-Type: application/json" \
        "https://www.googleapis.com/bigquery/v2/projects/$projectId/queries?fields=jobReference%2FjobId" \
        -d "$( echo "{
        \"query\": "\""$QUERY" limit 0\"",
        \"useLegacySql\": false
        }")" 2>/dev/null | jq -j .jobReference.jobId)
    # get the destination table
    read -r datasetId tableId <<< $(curl -H "Authorization: Bearer $authToken" \
        "https://www.googleapis.com/bigquery/v2/projects/$projectId/jobs/$jobId?fields=configuration(query(destinationTable(datasetId%2CtableId)))" 2>/dev/null | jq -j '.configuration.query.destinationTable.datasetId, " ", .configuration.query.destinationTable.tableId')
    # get the resulting schema
    curl -H "Authorization: Bearer $authToken" "https://www.googleapis.com/bigquery/v2/projects/$projectId/datasets/$datasetId/tables/$tableId?fields=schema" 2>/dev/null | jq .schema.fields
}
Then we can invoke the function, querying a 100B-row public dataset (don't specify LIMIT 0, as the function adds it automatically):
result_schema 'SELECT year, month, CAST(wikimedia_project as bytes) AS project_bytes, language AS lang FROM `bigquery-samples.wikipedia_benchmark.Wiki100B` GROUP BY year, month, wikimedia_project, language'
with the following output as the schema (note how the casts and aliases in the selected fields modify the returned schema):
[
    {
        "name": "year",
        "type": "INTEGER",
        "mode": "NULLABLE"
    },
    {
        "name": "month",
        "type": "INTEGER",
        "mode": "NULLABLE"
    },
    {
        "name": "project_bytes",
        "type": "BYTES",
        "mode": "NULLABLE"
    },
    {
        "name": "lang",
        "type": "STRING",
        "mode": "NULLABLE"
    }
]
This field array can then be copy/pasted (or further automated) into the fields editor when creating a new table using the UI.
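If you prefer a scripting language over curl, the same LIMIT 0 trick can be sketched with the google-cloud-bigquery Python client (a minimal sketch, assuming the library is installed and default credentials are configured; result_schema is just an illustrative helper name):
from google.cloud import bigquery
import json

def result_schema(query):
    # Run the query with LIMIT 0: it returns immediately with no rows,
    # but the row iterator still carries the resulting schema.
    client = bigquery.Client()
    rows = client.query(query + " LIMIT 0").result()
    return [{"name": f.name, "type": f.field_type, "mode": f.mode} for f in rows.schema]

print(json.dumps(result_schema(
    "SELECT year, month, CAST(wikimedia_project AS BYTES) AS project_bytes, language AS lang "
    "FROM `bigquery-samples.wikipedia_benchmark.Wiki100B` "
    "GROUP BY year, month, wikimedia_project, language"), indent=2))
The printed list has the same shape as the fields array above, so it can be fed into the same copy/paste or automation step.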
I am not sure how it is possible using Standard SQL or Legacy SQL syntax, but you can get the schema in JSON format using the command line.
From the bq documentation, the command to do it would be:
bq show --schema --format=prettyjson [PROJECT_ID]:[DATASET].[TABLE] > [PATH_TO_FILE]

How to import csv file to postgresql table without using copy command

I am planning to use the INSERT command or something like bulk insert, but not the COPY command. Please help!
You tagged your question with ruby, so a Ruby-ish way could be:
Install the smarter_csv gem (https://github.com/tilo/smarter_csv), which lets you parse each line into a hash where the column title is used as the key.
inserts = SmarterCSV.process('/path/to/file.csv')
# [
# { col_name: "value from row 1", ... },
# { col_name: "value from row 2", ... }
# ]
Then you might use whatever ORM or database connector you like, e.g. ActiveRecord:
MyModel.insert_all(inserts)

JSON file not loading into redshift

I have issues using the COPY command in Redshift to load JSON objects. I am receiving a file in a JSON format which fails when attempting to use the COPY command; however, when I adjust the JSON file as shown below, it works. This is not an ideal solution, as I am not permitted to modify the JSON file.
This works fine:
{
    "id": 1,
    "name": "Major League Baseball"
}
{
    "id": 2,
    "name": "National Hockey League"
}
This does not work (notice the extra square brackets)
[
    {"id":1,"name":"Major League Baseball"},
    {"id":2,"name":"National Hockey League"}
]
This is my jsonpaths file:
{
    "jsonpaths": [
        "$['id']",
        "$['name']"
    ]
}
The problem with the COPY command is that it does not accept a regular JSON document. Instead, it expects one JSON object per line, which is shown in the documentation but not obviously mentioned.
Hence, every line is supposed to be valid JSON, but the full file is not. That's why it works when you modify your file.
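If preprocessing the file before loading is an option, the fix can be as simple as reshaping the array into that one-object-per-line form. A minimal Python sketch (hypothetical file names):
import json

# Read the original file (the one wrapped in square brackets).
with open("input.json") as fh:
    records = json.load(fh)

# Write one JSON object per line, which is the layout Redshift's COPY expects.
with open("input_lines.json", "w") as of:
    for record in records:
        of.write(json.dumps(record) + "\n")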

Windows scripting to parse a HL7 file

I have a HUGE file with a lot of HL7 segments. It must be split into 1000 (or so) smaller files.
Since it has HL7 data, there is a pattern (logic) to go by. Each data chunk starts with "MSH|" and ends when the next segment starts with "MSH|".
The script must be Windows (cmd) based or VBScript, as I cannot install any software on that machine.
File structure:
MSH|abc|123|....
s2|sdsd|2323|
...
..
MSH|ns|43|...
...
..
..
MSH|sdfns|4343|...
...
..
asds|sds
MSH|sfns|3|...
...
..
as|ss
The file in the above example must be split into 2 or 3 files. Also, the file comes from UNIX, so newlines must remain as they are in the source file.
Any help?
This is a sample script that I used to parse large HL7 files into separate files, with the new file names based on the data file name. It uses REBOL, which does not require installation, i.e. the core version does not make any registry entries.
I have a more generalised version that scans an incoming directory, splits the files it finds into single messages, and then waits for the next file to arrive.
Rebol [
    file: %split-hl7.r
    author: "Graham Chiu"
    date: 17-Feb-2010
    purpose: {split HL7 messages into single messages}
]
fn: %05112010_0730.dat
outdir: %05112010_0730/
if not exists? outdir [
    make-dir outdir
]
data: read fn
cnt: 0
filename: join copy/part form fn -4 + length? form fn "-"
separator: rejoin [ newline "MSH"]
parse/all data [
    some [
        [ copy result to separator | copy result to end ]
        (
            write to-file rejoin [ outdir filename cnt ".txt" ] result
            print "Got result"
            ?? result
            cnt: cnt + 1
        )
        1 skip
    ]
]
HL7 has a lot of segment types - I assume you know that each message in your file starts with an MSH segment. So, have you tried parsing the file for the string "(newline)MSH|"? Just keep a running buffer and dump it into an output file when it gets too big.
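To illustrate that running-buffer idea (the question asks for cmd/VBScript, so treat this purely as a sketch of the logic, here in Python, with the file names and size limit as assumptions):
MAX_BYTES = 1000000  # rough target size per output file (assumption)

def flush(lines, n):
    # write the buffered segments to the next numbered chunk file
    if lines:
        with open("chunk_%04d.hl7" % n, "wb") as out:
            out.writelines(lines)

buffer, size, count = [], 0, 0
with open("input.hl7", "rb") as fh:  # binary mode keeps the UNIX newlines untouched
    for line in fh:
        # only start a new file on a message boundary, once the buffer is big enough
        if line.startswith(b"MSH|") and size >= MAX_BYTES:
            flush(buffer, count)
            count += 1
            buffer, size = [], 0
        buffer.append(line)
        size += len(line)
flush(buffer, count)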