Query header in s3 select nodejs - sql

I am using an S3 Select query with a WHERE clause to retrieve data from S3.
The query works and returns the expected result when there is no WHERE clause. When I add a WHERE clause, the filtered data is correct, but the keys in each object come from the first row after the header and not from the header itself.
Example : csv file
A B C
1 2 3
1 5 6
Query : select * from s3object s where s._1 = '1' limit 100
Expected Output : [{A : 1, B:2, C:3}, {A:1, B:5, C:6}]
Actual Output : [{1:1, 2:5, 3:6}]
This is the params object I am using to query :
let params = {
  Bucket: S3_BUCKET,
  Key: S3_PATH,
  Expression: "select * from s3object s where s._1 = '1' limit 100",
  ExpressionType: "SQL",
  InputSerialization: {
    CSV: {
      FileHeaderInfo: "NONE",
      RecordDelimiter: "\n",
      FieldDelimiter: ","
    }
  },
  OutputSerialization: {
    CSV: {}
  }
};
I get the same output even when I use FileHeaderInfo : "USE", and change the query to select * from s3object s where id = '22' and s.date > '2020-05-01' limit 100
AWS Doc : https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html

So it seems that it is not possible to get the header row back along with the query results from S3 Select. We can query by header name or by column number, but with a WHERE clause we should use header names, and in that case the header row is not included in the results.
So I have now hardcoded the headers in the API call that runs the S3 Select query, and I attach them to the results.
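The workaround described above can be sketched roughly as follows. This is only an illustration: `rowsToObjects` and `HEADERS` are hypothetical names, the sample text stands in for the CSV returned by the query, and the splitting assumes simple CSV with no quoted fields that contain commas or newlines.

```javascript
// Hypothetical helper: pair hardcoded header names with the CSV rows
// returned by S3 Select. Assumes no quoted fields with embedded commas.
function rowsToObjects(csvText, headers) {
  return csvText
    .split("\n")
    .filter((line) => line.trim() !== "") // drop the trailing empty line
    .map((line) => {
      const values = line.split(",");
      const record = {};
      headers.forEach((name, i) => {
        record[name] = values[i];
      });
      return record;
    });
}

// Hardcoded header names; they must match the header row of the file.
const HEADERS = ["A", "B", "C"];

// Stand-in for the CSV text collected from the query response.
const sample = "1,2,3\n1,5,6\n";
console.log(rowsToObjects(sample, HEADERS));
```

The values stay strings here, since CSV carries no type information.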

Changing the params to the following should work.
let params = {
  Bucket: S3_BUCKET,
  Key: S3_PATH,
  ExpressionType: "SQL",
  Expression: "select * from s3object s where s.A = '1' limit 100",
  InputSerialization: {
    CSV: {
      FileHeaderInfo: "USE",
      RecordDelimiter: "\n",
      FieldDelimiter: ","
    }
  },
  OutputSerialization: {
    JSON: {}
  }
};
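With JSON output, each matching record comes back as one JSON object per line, keyed by the header names. A minimal sketch of turning that text into an array, assuming the payloads of the Records events in the response stream have already been concatenated into a single string and the default newline record delimiter is in effect (`parseJsonRecords` is a hypothetical helper, and the sample payload is made up for illustration):

```javascript
// Hypothetical helper: parse newline-delimited JSON records into an array.
function parseJsonRecords(payloadText) {
  return payloadText
    .split("\n")
    .filter((line) => line.trim() !== "") // skip the trailing empty line
    .map((line) => JSON.parse(line));
}

// Illustrative payload for the example CSV file, not captured from a real call.
const payload = '{"A":"1","B":"2","C":"3"}\n{"A":"1","B":"5","C":"6"}\n';
console.log(parseJsonRecords(payload));
```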

Related

Null values in CSV Scenario Outline [duplicate]

I am able to read a csv file and convert it to json by
def expectedResponse = read('classpath:somefile.csv')
Suppose I have csv file as below
name,age
praveen,29
joseph,20
1. It converts all elements to strings and stores them in the variable as JSON. How can I keep a number as a number? The strings cause a match failure when I later compare against the actual response.
2. How do I get the value 20? That is, by specifying joseph, I want to get the age.
I wrote the JsonPath as
get expectedResponse $.[?(@.member == '<name>')].age
I get the name from Examples, so it is joseph at runtime. But I get the error reason: not equal (Integer : JSONArray). It does not return the age alone (an Integer value).
Or is there a better way to get it?
The CSV format does not contain any type information, so everything defaults to "string" and you have to convert it yourself. But this is easy using karate.map().
* text users =
"""
name,age
praveen,29
joseph,20
"""
* csv users = users
* match users == [{ name: 'praveen', age: '29' }, { name: 'joseph', age: '20' }]
* def fun = function(x){ x.age = ~~x.age; return x }
* def users = karate.map(users, fun)
* match users == [{ name: 'praveen', age: 29 }, { name: 'joseph', age: 20 }]
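The same two steps (convert the string to a number, then look up one age by name) can be sketched in plain JavaScript outside Karate. This is only an illustration of the logic; `ageOf` is a hypothetical helper. It returns a single value instead of an array, which is the difference behind the Integer : JSONArray mismatch in the question.

```javascript
// Rows as they come out of the CSV: every value is a string.
const users = [
  { name: "praveen", age: "29" },
  { name: "joseph", age: "20" },
];

// Convert the age column to a number, leaving other fields untouched.
const typed = users.map((u) => ({ ...u, age: Number(u.age) }));

// Hypothetical helper: return the age of the first row with the given name,
// or undefined when no row matches.
function ageOf(rows, name) {
  const row = rows.find((r) => r.name === name);
  return row ? row.age : undefined;
}

console.log(ageOf(typed, "joseph")); // 20
```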

Querying an array of objects in JSONB

I have a table with a column of the data type JSONB. Each row in the column has a JSON that looks something like this:
[
{
"A":{
"AA": "something",
"AB": false
}
},
{
"B": {
"BA":[
{
"BAAA": [1,2,3,4]
},
{
"BABA": {
....
}
}
]
}
}
]
Note: the JSON is a complete mess of lists and objects, and it has a total of 300 lines. Not my data but I am stuck with it. :(
I am using postgresql version 12
How would I write the following queries:
Return all rows where the value of AB is false.
Return the values of BAAA in each row.
You can find the AB = false rows with a JSON Path query:
select *
from test
where data @@ '$[*].A.AB == false'
If you don't know where exactly the key AB is located, you can use:
select *
from test
where data @@ '$[*].**.AB == false'
To display all elements from the array as rows, you can use:
select id, e.*
from test
cross join jsonb_array_elements(jsonb_path_query_first(data, '$[*].B.BA.BAAA')) with ordinality as e(item, idx)
I include a column "id" as a placeholder for the primary key column, so that the source of the array element can be determined in the output.

SQL Server - "for json path" statement does not return more than 2984 lines of JSON string

I'm trying to generate a huge amount of data as a complex, nested JSON string using the "for json path" statement, and I'm using multiple functions to create different parts of this JSON string, as follows:
declare @queue nvarchar(max)
select @queue = (
    select x.ID as layoutID
    , l.Title as layoutName
    , JSON_QUERY(queue_objects (@productID, x.ID)) as [objects]
    from Layouts x
    inner join LayoutLanguages l on l.LayoutID = x.ID
    where x.ID = @layoutid
    group by x.ID, l.Title
    for json path
)
select @queue as JSON
Thus far, JSON would be:
{
"root": [{
"layouts": [{
"layoutID": 5
, "layoutName": "foo"
, "objects": []
}]
}]
}
and the "queue_objects" function then would be called to fill out 'objects' array:
queue_objects
select 0 as objectID
, case when (select inherited_counter(@layoutID,0)) > 0 then 'false' else 'true' end as editable
, JSON_QUERY(queue_properties (p.Table2ID)) as propertyObjects
, JSON_QUERY('[]') as inherited
from productList p
where p.Table1ID = @productID
group by p.Table2ID
for json path
And then JSON would be:
{
"root": [{
"layouts": [{
"layoutID": 5
, "layoutName": "foo"
, "objects": [{
"objectID": 1000
, "editable": "true"
, "propertyObjects": []
, "inherited": []
}, {
"objectID": 2000
, "editable": "false"
, "propertyObjects": []
, "inherited": []
}]
}]
}]
}
Also "inherited_counter" and "queue_properties" functions would be called to fill corresponding keys.
This is just a sample, the code won't work as I'm not putting functions here.
But my question is: is it the functions calling one another that makes the server return a broken JSON string, or is it the server itself that cannot handle JSON strings longer than 2984 lines?
EDIT: what I mean by 2984 lines is this: the server does not return the string line by line; it returns broken JSON, and after running it through a beautifier it comes to 2984 lines.
As I wrote in my comment to the OP, this is probably because SSMS limits how many characters it displays per column in the result grid. It has no impact on the actual result: the result contains all the data, SSMS just doesn't display all of it.
To fix this, you can increase the number of characters SSMS retrieves.
I would not recommend that ("how long is a piece of string"); instead, select the result into an nvarchar(max) variable and PRINT that variable. That should give you the text, although PRINT itself truncates Unicode strings at 4,000 characters, so a very long string may need to be printed in chunks.
Hope this helps!

ranked full text search results using Lucene with modeshape

I'm trying to get full-text search working with modeshape. I'm particularly interested in ranked results based on lucene index. Here is my repository configuration
"indexProviders": {
"lucene": {
"classname": "lucene",
"directory": "${user.home}/repository/indexes"
}
},
"indexes": {
"textFromFiles": {
"kind": "text",
"provider": "lucene",
"nodeType": "nt:resource",
"columns": "jcr:data(BINARY)"
}
},
I noticed a Lucene index created at the specified location. I added 10-15 files with varying numbers of occurrences of the search term to the repository, and tried searching for some words. I am printing the score as shown below
QueryManager querymgr = session.getWorkspace().getQueryManager();
String query = "SELECT file.* FROM [nt:hierarchyNode] as file LEFT JOIN [nt:resource] as data ON ISCHILDNODE(data , file) WHERE "
+ "contains(data.*, '" + searchText + "')";
Query createQuery = querymgr.createQuery(query, Query.JCR_SQL2);
QueryResult result = createQuery.execute();
RowIterator rows = result.getRows();
while(rows.hasNext()){
Row nextRow = rows.nextRow();
LOGGER.info("score : {}", nextRow.getScore());
}
But, here score is always 1.0 for all results.
Also tried a simpler query without join...
SELECT data.* FROM [nt:resource] as data WHERE contains(data.*, 'searchterm')
but no luck

Bluemix SQLDB Query - Can't figure out JSON Parameter Markings

In my Node-RED Bluemix application, I'm trying to make a SQLDB query, but I can't find sufficient documentation or examples on how to use the parameter markers in the query. Are there any examples, or further insight into what I am doing wrong? Here is the flow I am having trouble with:
[
{
"id":"7924a83a.03355",
"type":"websocket-listener",
"path":"/ws/dbdata",
"wholemsg":"false"
},
{
"id":"b84efad2.9a2a58",
"type":"function",
"name":"Parse JSON",
"func":"msg.payload = JSON.parse(msg.payload);\nvar begin = msg.payload[0].split(\" \");\nbegin[1] = begin[1]+\":00\";\nvar date1 = begin[0].split(\"-\");\nvar processStart = date1[2]+\"-\"+date1[0]+\"-\"+date1[1]+\" \"+begin[1];\n\nvar end = msg.payload[0].split(\" \");\nend[1] = end[1]+\":00\";\nvar date2 = end[0].split(\"-\");\nvar processEnd = date2[2]+\"-\"+date2[0]+\"-\"+date2[1]+\" \"+end[1];\n\nmsg.payload[0] = processStart;\nmsg.payload[1] = processEnd;\nreturn msg;",
"outputs":1,"noerr":0,"x":381.79998779296875,"y":164.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[["4f92b16a.cf981"]]
},
{
"id":"3e20f8a4.06451",
"type":"websocket in",
"name":"dbInput",
"server":"7924a83a.03355",
"client":"",
"x":159.8000030517578,"y":164.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[["b84efad2.9a2a58"]]
},
{
"id":"68a4a35.5983f5c",
"type":"debug",
"name":"",
"active":true,"console":"false",
"complete":"true",
"x":970.7999877929688,"y":162.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[]
},
{
"id":"5a0aed1c.34279c",
"type":"sqldb in",
"service":"LabSensors-sqldb",
"query":"",
"params":"{msg.begin},{msg.end}",
"name":"db Request",
"x":787.7999877929688,"y":163.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[["68a4a35.5983f5c"]]
},
{
"id":"e08c4a85.e95e68",
"type":"debug",
"name":"",
"active":true,"console":"false",
"complete":"true",
"x":791.7999877929688,"y":233.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[]
},
{
"id":"4f92b16a.cf981",
"type":"function",
"name":"Construct Query",
"func":"msg.begin = msg.payload[0];\nmsg.end = msg.payload[1];\nmsg.payload = \"SELECT * FROM IOT WHERE TIME >= '?' AND TIME < '?'\";\nreturn msg;",
"outputs":1,"noerr":0,"x":583.7999877929688,"y":163.8000030517578,"z":"3f9da5d2.b3f0aa",
"wires":[["5a0aed1c.34279c",
"e08c4a85.e95e68"]]
}
]
In the node-red documentation for the SQLDB query node it says:
"Parameter Markers is a comma delimited set of json paths. These will replace any question marks that you place in your query, in the order that they appear."
Have you tried removing the curly braces, i.e. to set the "params" field in the node to just "msg.begin,msg.end"?
You just need to remove the single quotes around the parameter markers. This is the correct statement:
msg.payload = "SELECT * FROM IOT WHERE TIME >= ? AND TIME < ?";
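To illustrate the behaviour the documentation describes (a comma-delimited list of JSON paths whose values replace the question marks in order), here is a rough sketch. It is not the SQLDB node's actual implementation: `resolvePath` and `resolveParams` are hypothetical helpers, and the timestamps are made-up sample values.

```javascript
// Illustration only: NOT the SQLDB node's real code.
// Walk a dotted path such as "msg.begin" through a root object.
function resolvePath(root, path) {
  return path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), root);
}

// Turn a comma-delimited list of paths into an ordered list of values.
function resolveParams(msg, paramSpec) {
  return paramSpec
    .split(",")
    .map((p) => p.trim())
    .map((p) => resolvePath({ msg }, p));
}

// Sample message as the "Construct Query" function would build it.
const msg = {
  begin: "2015-03-01 10:00:00", // made-up sample value
  end: "2015-03-02 10:00:00", // made-up sample value
  payload: "SELECT * FROM IOT WHERE TIME >= ? AND TIME < ?",
};

const values = resolveParams(msg, "msg.begin,msg.end");
const markerCount = (msg.payload.match(/\?/g) || []).length;

// Each unquoted ? should line up with exactly one resolved value.
console.log(values, markerCount === values.length);
```

Keeping the markers unquoted matters because a quoted '?' is an ordinary string literal in SQL and not a placeholder.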