SQL Server JSON Query Filtered Return

I have the following JSON data in a single column (varchar(max)) related to a user:
[{
"ExtensionData": {},
"IsDefault": false,
"MethodType": "OneWaySMS"
},
{
"ExtensionData": {},
"IsDefault": false,
"MethodType": "TwoWayVoiceMobile"
},
{
"ExtensionData": {},
"IsDefault": false,
"MethodType": "PhoneAppOTP"
},
{
"ExtensionData": {},
"IsDefault": true,
"MethodType": "PhoneAppNotification"
}]
Is there any way to query the user's record and leverage a subquery (or something like it) to return just the MethodType that is in the same block as "IsDefault": true?
So the column returned would just say "PhoneAppNotification", based on the example above?

Try this:
select MethodType
from [table]
cross apply OPENJSON([column], '$')
with (IsDefault varchar(80), MethodType varchar(80))
where IsDefault = 'true'
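For readers who want to sanity-check the filter outside SQL Server, here is the same "pick the default MethodType" logic sketched in Python (illustrative only; it assumes the column holds the JSON array from the question):

```python
import json

# The JSON array stored in the varchar(max) column (from the question)
column_value = '''[
  {"ExtensionData": {}, "IsDefault": false, "MethodType": "OneWaySMS"},
  {"ExtensionData": {}, "IsDefault": false, "MethodType": "TwoWayVoiceMobile"},
  {"ExtensionData": {}, "IsDefault": false, "MethodType": "PhoneAppOTP"},
  {"ExtensionData": {}, "IsDefault": true, "MethodType": "PhoneAppNotification"}
]'''

def default_method_type(json_str):
    """Return the MethodType of the entry flagged IsDefault, mirroring the WHERE clause."""
    return next(m["MethodType"] for m in json.loads(json_str) if m["IsDefault"])

print(default_method_type(column_value))  # PhoneAppNotification
```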

Related

Athena query JSON Array without struct

In Athena, how can I structure a SELECT statement to query the data below by timestamp? The data is stored as a string:
[{
"data": [{
"ct": "26.7"
}, {
"ct": "24.9",
}, {
"ct": "26.8",
}],
"timestamp": "1658102460"
}, {
"data": [{
"ct": "26.7",
}, {
"ct": "25.0",
}],
"timestamp": "1658102520"
}]
I tried the below but it just came back empty.
SELECT json_extract_scalar(insights, '$.timestamp') as ts
FROM history
What I am trying to get to is returning only the data where a timestamp is between X & Y.
When I try doing this as a struct and a cross join with unnest it's very very slow so I am trying to find another way.
json_extract_scalar will not help here because it returns only one value. Trino has vastly improved its JSON path support, but Athena runs a much older version of the Presto engine, which does not support it. So you need to cast to an array and use unnest (trailing commas removed from the JSON):
-- sample data
WITH dataset (json_str) AS (
values ('[{
"data": [{
"ct": "26.7"
}, {
"ct": "24.9"
}, {
"ct": "26.8"
}],
"timestamp": "1658102460"
}, {
"data": [{
"ct": "26.7"
}, {
"ct": "25.0"
}],
"timestamp": "1658102520"
}]')
)
-- query
select mp['timestamp'] timestamp,
mp['data'] data
from dataset,
unnest(cast(json_parse(json_str) as array(map(varchar, json)))) as t(mp)
Output:
timestamp  | data
-----------|--------------------------------------------
1658102460 | [{"ct":"26.7"},{"ct":"24.9"},{"ct":"26.8"}]
1658102520 | [{"ct":"26.7"},{"ct":"25.0"}]
After that you can apply filtering and process data.
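After the unnest, the filtering itself is ordinary predicate logic; here it is sketched in Python for illustration (the range bounds are arbitrary example values standing in for X and Y):

```python
import json

# The sample data from the question, with trailing commas removed
rows = json.loads(
    '[{"data":[{"ct":"26.7"},{"ct":"24.9"},{"ct":"26.8"}],"timestamp":"1658102460"},'
    '{"data":[{"ct":"26.7"},{"ct":"25.0"}],"timestamp":"1658102520"}]'
)

def between(rows, lo, hi):
    # Keep only entries whose timestamp falls in [lo, hi], as the WHERE clause would
    return [r for r in rows if lo <= int(r["timestamp"]) <= hi]

print([r["timestamp"] for r in between(rows, 1658102500, 1658102600)])  # ['1658102520']
```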

Parse JSON values with arbitrary keys

I have a column in a table that is a JSON string. Part of these strings have the following format:
{
...
"rules": {
"rule_1": {
"results": [],
"isTestMode": true
},
"rule_2": {
"results": [],
"isTestMode": true
},
"rule_3": {
"results": [
{
"required": true,
"amount": 99.31
}
],
"isTestMode": true
},
"rule_4": {
"results": [],
"isTestMode": false
},
...
}
...
}
Within this nested "rules" object, I want to return true if results[0]["required"] = true AND "isTestMode" = false for any of the rules. The catch is that "rule_1", "rule_2", ... "rule_x" can have arbitrary names that aren't known in advance.
Is it possible to write a query that will iterate over all keys in "rules" and check whether any one of them matches this condition? Is there any other way to achieve this?
If the keys were known in advance then I could do something like this:
WHERE
(JSON_ARRAY_LENGTH(JSON_EXTRACT(json, '$.rules.rule_1.results')) = 1
AND JSON_EXTRACT_SCALAR(json, '$.rules.rule_1.results[0].required') = 'true'
AND JSON_EXTRACT_SCALAR(json, '$.rules.rule_1.isTestMode') = 'false')
OR (JSON_ARRAY_LENGTH(JSON_EXTRACT(json, '$.rules.rule_2.results')) = 1
AND JSON_EXTRACT_SCALAR(json, '$.rules.rule_2.results[0].required') = 'true'
AND JSON_EXTRACT_SCALAR(json, '$.rules.rule_2.isTestMode') = 'false')
OR ...
You can extract rules property and transform it to MAP(varchar, json) and process it:
WITH dataset AS (
SELECT * FROM (VALUES
(JSON '{
"rules": {
"rule_1": {
"results": [],
"isTestMode": true
},
"rule_2": {
"results": [],
"isTestMode": true
},
"rule_3": {
"results": [
{
"required": true,
"amount": 99.31
}
],
"isTestMode": true
},
"rule_4": {
"results": [],
"isTestMode": false
}
}
}')
) AS t (json_value))
select cardinality(
filter(
map_values(cast(json_extract(json_value, '$.rules') as MAP(varchar, json))), -- transform into MAP and get its values
js -> cast(json_extract(js, '$.isTestMode') as BOOLEAN) -- check isTestMode
AND cast(json_extract(js, '$.results[0].required') as BOOLEAN) -- check required of first element of `results`
)) > 0
from dataset
This returns true for the provided data.
I was able to solve this with a regex. It's not ideal, and I would still like to know whether this can be done using the built-in JSON functions.
WHERE REGEXP_LIKE(json, '.*{"results":\[{"required":true,"amount":\d+\.\d+}\],"isTestMode":false}.*')
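For illustration, the key-agnostic check from the question (results[0]["required"] = true AND isTestMode = false for any rule) can be sketched in Python; the sample document below is made up so that exactly one rule matches:

```python
import json

# Hypothetical document: rule_4 satisfies required=true and isTestMode=false
doc = json.loads(
    '{"rules": {"rule_1": {"results": [], "isTestMode": true}, '
    '"rule_4": {"results": [{"required": true, "amount": 99.31}], "isTestMode": false}}}'
)

def any_rule_matches(doc):
    # Iterate over rule values without knowing the key names, like map_values(...) does
    return any(
        rule.get("isTestMode") is False
        and rule.get("results")
        and rule["results"][0].get("required") is True
        for rule in doc["rules"].values()
    )

print(any_rule_matches(doc))  # True
```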

Indexes: Search by Boolean?

I'm having some trouble with FaunaDB Indexes. FQL is quite powerful but the docs seem to be limited (for now) to only a few examples/use cases. (Searching by String)
I have a collection of Orders, with a few fields: status, id, client, material and date.
My goal is to search/filter for orders depending on their Status, OPEN OR CLOSED (Boolean true/false).
Here is the Index I created:
CreateIndex({
name: "orders_all_by_open_asc",
unique: false,
serialized: true,
source: Collection("orders"),
terms: [{ field: ["data", "status"] }],
values: [
{ field: ["data", "unique_id"] },
{ field: ["data", "client"] },
{ field: ["data", "material"] },
{ field: ["data", "date"] }
]
})
So with this Index, I want to specify either TRUE or FALSE and get all corresponding orders, including their data (fields).
I'm having two problems:
When I pass TRUE or FALSE using the JavaScript driver, nothing is returned :( Is it possible to search by Booleans at all, or only by String/Number?
Here is my Query (in FQL, using the Shell):
Match(Index("orders_all_by_open_asc"), true)
And unfortunately, nothing is returned. I'm probably doing this wrong.
Second (slightly unrelated) question. When I create an Index and specify a bunch of Values, it seems the data returned is in Array format, with only the values, not the Fields. An example:
[
1001,
"client1",
"concrete",
"2021-04-13T00:00:00.000Z",
],
[
1002,
"client2",
"wood",
"2021-04-13T00:00:00.000Z",
]
This format is bad for me, because my front-end expects receiving an Object with the Fields as a key and the Values as properties. Example:
data:
{
unique_id : 1001,
client : "client1",
material : "concrete",
date: "2021-04-13T00:00:00.000Z"
},
{
unique_id : 1002,
client : "client2",
material : "wood",
date: "2021-04-13T00:00:00.000Z"
},
etc..
Is there any way to get the Field as well as the Value when using Index values, or will it always return an Array (and not an object)?
Could I use a Lambda or something for this?
I do have another Query that uses Map and Lambda to good effect, and returns the entire document, including the Ref and Data fields:
Map(
Paginate(
Match(Index("orders_by_date"), date),
),
Lambda('item', Get(Var('item')))
)
This works very nicely but unfortunately, it also performs one Get request per Document returned and that seems very inefficient.
This new Index I'm wanting to build, to filter by Order Status, will be used to return hundreds of Orders, hundreds of times a day. So I'm trying to keep it as efficient as possible, but if it can only return an Array it won't be useful.
Thanks in advance!! Indexes are great but hard to grasp, so any insight will be appreciated.
You didn't show us exactly what you have done, so here's an example that shows that filtering on boolean values does work using the index you created as-is:
> CreateCollection({ name: "orders" })
{
ref: Collection("orders"),
ts: 1618350087320000,
history_days: 30,
name: 'orders'
}
> Create(Collection("orders"), { data: {
unique_id: 1,
client: "me",
material: "stone",
date: Now(),
status: true
}})
{
ref: Ref(Collection("orders"), "295794155241603584"),
ts: 1618350138800000,
data: {
unique_id: 1,
client: 'me',
material: 'stone',
date: Time("2021-04-13T21:42:18.784Z"),
status: true
}
}
> Create(Collection("orders"), { data: {
unique_id: 2,
client: "you",
material: "muslin",
date: Now(),
status: false
}})
{
ref: Ref(Collection("orders"), "295794180038328832"),
ts: 1618350162440000,
data: {
unique_id: 2,
client: 'you',
material: 'muslin',
date: Time("2021-04-13T21:42:42.437Z"),
status: false
}
}
> CreateIndex({
name: "orders_all_by_open_asc",
unique: false,
serialized: true,
source: Collection("orders"),
terms: [{ field: ["data", "status"] }],
values: [
{ field: ["data", "unique_id"] },
{ field: ["data", "client"] },
{ field: ["data", "material"] },
{ field: ["data", "date"] }
]
})
{
ref: Index("orders_all_by_open_asc"),
ts: 1618350185940000,
active: true,
serialized: true,
name: 'orders_all_by_open_asc',
unique: false,
source: Collection("orders"),
terms: [ { field: [ 'data', 'status' ] } ],
values: [
{ field: [ 'data', 'unique_id' ] },
{ field: [ 'data', 'client' ] },
{ field: [ 'data', 'material' ] },
{ field: [ 'data', 'date' ] }
],
partitions: 1
}
> Paginate(Match(Index("orders_all_by_open_asc"), true))
{ data: [ [ 1, 'me', 'stone', Time("2021-04-13T21:42:18.784Z") ] ] }
> Paginate(Match(Index("orders_all_by_open_asc"), false))
{ data: [ [ 2, 'you', 'muslin', Time("2021-04-13T21:42:42.437Z") ] ] }
It's a little more work, but you can compose whatever return format that you like:
> Map(
Paginate(Match(Index("orders_all_by_open_asc"), false)),
Lambda(
["unique_id", "client", "material", "date"],
{
unique_id: Var("unique_id"),
client: Var("client"),
material: Var("material"),
date: Var("date"),
}
)
)
{
data: [
{
unique_id: 2,
client: 'you',
material: 'muslin',
date: Time("2021-04-13T21:42:42.437Z")
}
]
}
It's still an array of results, but each result is now an object with the appropriate field names.
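The reshaping done by that Lambda (pairing each positional index value with its field name) is essentially a zip; a minimal Python illustration using the values from the example above:

```python
# Field names declared in the index's values, in order
fields = ["unique_id", "client", "material", "date"]
# One tuple as returned by Paginate(Match(...)) for this index
row = [2, "you", "muslin", "2021-04-13T21:42:42.437Z"]

# Build the object the front-end expects, as the Lambda over index values does
record = dict(zip(fields, row))
print(record["client"])  # you
```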
Not too familiar with FQL, but I am somewhat familiar with SQL languages. Database languages usually treat all of your values as strings until they don't need to anymore. Instead, your query should use the string value that FQL is expecting; I believe it should be "OPEN" or "CLOSED" in your case. You can simply have an if statement in JavaScript to determine whether to search for "OPEN" or "CLOSED".
To answer your second question: I don't know for FQL, but if that is what is returned, then your approach with a Lambda seems fine. There's not much else you can do from your end other than hope for a different way to get entries in API form somewhere in the future. At the end of the day, an O(n) operation in this context is not too bad, and only having to return a hundred or so orders shouldn't be the most painful thing in the world.
If you are truly worried about this, you can break up the request into portions, so you return only the first 100, then when frontend wants the next set, you send the next 100. You can cache the results too to make it very fast from the front-end perspective.
Another suggestion; maybe I am wrong and failed at searching the docs, but I will post it anyway in case it's helpful.
My index was failing to return objects, example data here is the client field:
"data": {
"status": "LIVRAISON",
"open": true,
"unique_id": 1001,
"client": {
"name": "TEST1",
"contact_name": "Bob",
"email": "bob@client.com",
"phone": "555-555-5555"
}
}
Here, the client field returned as null even though it was specified in the Index.
From reading the docs, here: https://docs.fauna.com/fauna/current/api/fql/indexes?lang=javascript#value
In the Value Objects section, I was able to understand that for Objects, the Index Field must be defined as an Array, one for each Object key. Example for my data:
{ field: ['data', 'client', 'name'] },
{ field: ['data', 'client', 'contact_name'] },
{ field: ['data', 'client', 'email'] },
{ field: ['data', 'client', 'phone'] },
This was slightly confusing, because my beginner brain expected that defining the 'client' field would simply return the entire object, like so:
{ field: ['data', 'client'] },
The only part about this in the docs was this sentence: The field ["data", "address", "street"] refers to the street field contained in an address object within the document’s data object.
This is enough information, but maybe it would deserve its own section, with a longer example? Of course the simple sentence works, but with a sub-section called 'Adding Objects to Fields' or something, this would make it extra-clear.
Hoping my moments of confusion will help out. Loving FaunaDB so far, keep up the great work :)

How to delete a jsonb item in a nested psql array

I have a table users:
`CREATE TABLE users(
id SERIAL PRIMARY KEY NOT NULL,
username TEXT UNIQUE,
saved_articles JSONB[]
)`
I added a user like so:
"INSERT INTO users (username, saved_articles) VALUES (?, array[]::jsonb[]) RETURNING id, username, saved_articles"
After adding some articles I have this data shape:
{ id: 1,
username: 'test',
saved_articles:
[ { url: 'test',
title: '',
author: '',
source: '',
content:"",
urlToImage: ''
},
{ url: 'not-test',
title: '',
author: '',
source: '',
content:"",
urlToImage: ''
}
]
}
I want to be able to delete a specific item from the saved_articles array based on the url value.
For example, if my url value is 'test', after running the query my data should look like:
{ id: 1,
username: 'test',
saved_articles:
[ { url: 'not-test',
title: '',
author: '',
source: '',
content:"",
urlToImage: ''
}
]
}
First of all, the format of the JSONB column's value should be validated. That can be tested by casting to JSONB in a SELECT statement such as
SELECT '{ "id": "1",
"username": "test",
"saved_articles":
[ { "url": "test",
"title": "",
"author": "",
"source": "",
"content":"",
"urlToImage": ""
},
{ "url": "not-test",
"title": "",
"author": "",
"source": "",
"content":"",
"urlToImage": ""
}
]}'::jsonb
to see whether it returns an error.
Then, remove the desired element from the array by using the jsonb_array_elements(json_data -> 'saved_articles') function together with the ->> 'url' != 'test' criterion.
Next, reconstruct the array from the remaining elements using jsonb_build_array and jsonb_object_agg.
As the last step, concatenate the part that doesn't contain that array, extracted by json_data #- '{saved_articles}':
SELECT js0||jsonb_object_agg( 'saved_articles', js1 ) AS "Result JSONB"
FROM
(
SELECT json_data #- '{saved_articles}' AS js0, jsonb_build_array( js ) AS js1
FROM tab
CROSS JOIN jsonb_array_elements(json_data -> 'saved_articles') js
WHERE js ->> 'url' != 'test'
) q
GROUP BY js0
Demo
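The intent of the SQL above (drop the array element whose url matches, keep the rest) can be sketched in Python for illustration; the row literal below is a simplified version of the question's data:

```python
import json

# Simplified row: only the fields needed to show the deletion
row = json.loads(
    '{"id": 1, "username": "test", "saved_articles": '
    '[{"url": "test", "title": ""}, {"url": "not-test", "title": ""}]}'
)

def drop_article(row, url):
    # Keep every saved article whose url differs, as the != 'test' predicate does
    out = dict(row)
    out["saved_articles"] = [a for a in row["saved_articles"] if a["url"] != url]
    return out

print([a["url"] for a in drop_article(row, "test")["saved_articles"]])  # ['not-test']
```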

DataTables ajax requires explicit json collection name for the returned datasource?

I recently ran into a problem when implementing the ajax functionality of jquery DataTables. Until I actually gave my json object collection an explicit name I couldn't get anything to display. Shouldn't there be a default data source if nothing named is returned?
Client-side control setup (includes a hidden field that supplies data to a dynamic anchor):
$('#accountRequestStatus').dataTable(
{
"destroy": true, // within a method that will be called multiple times with new/different data
"processing": true,
"ajax":
{
"type": "GET",
"url": "@Url.Action("SomeServerMethod", "SomeController")",
"data": { methodParam1: 12341, methodParam2: 123423, requestType: 4123421 }
}
, "paging": false
, "columns": [
{ "data": "DataElement1" },
{ "data": "DataElement2", "title": "Col1" },
{ "data": "DataElement3", "title": "Col2" },
{ "data": "DataElement4", "title": "Col3" },
{ "data": "DataElement5", "title": "Col4" },
]
, "columnDefs": [
{
"targets": 0, // hiding first column, userId
"visible": false,
"searchable": false,
"sortable": false
},
{
"targets": 5, // creates action link using the hidden data for that row in column [userId]
"render": function (data, type, row) {
return "<a href='@Url.Action("ServerMethod", "Controller")?someParam=" + row["DataElement1"] + "'>Details</a>"
},
"searchable": false,
"sortable": false
}
]
});
Here's a snippet of my server side code that returns the json collection.
tableRows is a collection of models containing the data to be displayed.
var json = this.Json(new { data = tableRows });
json.JsonRequestBehavior = JsonRequestBehavior.AllowGet;
return json;
As I said before, the ajax call returned data but wouldn't display it until I gave the collection a name. Maybe I missed this required step in the documentation, but wouldn't it make sense for the control to wire up to a single returned collection as the default data source and not require the name? Figuring out the name thing took about two hours of messing around trying different things. That's all I'm saying.
Maybe this'll help someone else too...
dataTables does actually have a dataSrc property! dataTables will look for either a data or an aaData section in the JSON; that's why you finally got it to work with new { data = tableRows }. That is, if dataSrc is not specified! If your JSON differs from that convention, you must specify dataSrc:
If you return an unnamed array / collection [{...},{...}]:
ajax: {
url: "@Url.Action("SomeServerMethod", "SomeController")",
dataSrc: ""
}
If you return a JSON array named something other than data or aaData, like customers:
ajax: {
url: "@Url.Action("SomeServerMethod", "SomeController")",
dataSrc: "customers"
}
If the content is nested like { a : { b : [{...},{...}] }}
ajax: {
url: "@Url.Action("SomeServerMethod", "SomeController")",
dataSrc: "a.b"
}
If you have really complex JSON or need to manipulate the JSON in any way, like cherry picking from the content - dataSrc can also be a function :
ajax: {
url: "@Url.Action("SomeServerMethod", "SomeController")",
dataSrc: function(json) {
//do what ever you want
//return an array containing JSON / object literals
}
}
Hope the above clears things up!
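To make the dataSrc lookup rules concrete, here is a rough Python sketch of how a dataSrc string could be resolved against the returned JSON; the real dataTables implementation is JavaScript and handles more edge cases, so treat this as an approximation:

```python
def resolve_data_src(payload, data_src="data"):
    """Resolve a dataSrc-style string against a decoded JSON payload.

    "" means the payload itself is the array; otherwise walk a
    dotted path like "a.b" (default "data", per dataTables' convention).
    """
    if data_src == "":
        return payload
    node = payload
    for key in data_src.split("."):
        node = node[key]
    return node

print(resolve_data_src({"customers": [1, 2]}, "customers"))  # [1, 2]
print(resolve_data_src({"a": {"b": [3]}}, "a.b"))            # [3]
print(resolve_data_src([4, 5], ""))                          # [4, 5]
```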