How to parse JSON data in a column with Druid SQL?

I'm trying to parse JSON data in a column with Druid SQL in Superset SQL Lab. My table looks like this:
 id | json_scores
----+------------------------
  0 | {"foo": 20, "bar": 10}
  1 | {"foo": 30, "bar": 10}
I'm looking for something similar to json_extract in MySQL, e.g.
SELECT *
FROM my_table
WHERE json_extract(json_scores, '$.foo') > 10;

Druid doesn't support the json_extract function. Druid supports only ANSI SQL 92, which does not understand JSON as a data type.
Supported data types are listed on this page: https://docs.imply.io/latest/druid/querying/sql-data-types/
You can use any of the expressions listed here: https://druid.apache.org/docs/latest/misc/math-expr.html#string-functions
In your case, consider using regexp_extract:
regexp_extract(json_scores, '(?<=\"foo\":\s)(\d+)(?=,)', 0) AS foo,
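As a rough sketch (not part of the original answer), the whole filter might look like this, assuming Druid SQL's REGEXP_EXTRACT and CAST functions and the table/column names from the question:
SELECT *
FROM my_table
-- capture group 1 is the run of digits after "foo":, cast so the comparison is numeric
WHERE CAST(REGEXP_EXTRACT(json_scores, '"foo":\s*(\d+)', 1) AS BIGINT) > 10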

Related

How to remove array wrapper from MariaDB server

I want to remove the array wrapper surrounding a query result, as I'm running a for loop to push the objects into an array. This is my query:
"SELECT * FROM jobs WHERE id = ? FOR JSON PATH, WITHOUT_ARRAY_WRAPPER"
but I'm getting this result in Postman:
{
"status": "Failed",
"message": "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'JSON PATH, WITHOUT_ARRAY_WRAPPER' at line 1"
}
FOR JSON PATH is a feature of Microsoft SQL Server. There is a standard for JSON in SQL, but don't expect most SQL servers to follow it.
You can get a single JSON object for each row with json_object.
-- {"id": 2, "name": "Bar"}
select
json_object('id', id, 'name', name)
from jobs
where id = 2
Rather than querying each job individually and appending to an array, you can do this in a single query: use the in operator to fetch all the desired rows at once, then json_arrayagg to aggregate them into a single array.
-- [{"id": 1, "name": "Foo"},{"id": 3, "name": "Baz"}]
select
json_arrayagg( json_object('id', id, 'name', name) )
from jobs
where id in (1, 3)
This is much more efficient. In general, if you're running SQL queries in a loop, there's a better way.
Demonstration.

How to extract nested json with SQL

I have a table named 'my doc', and there is a column called 'element' which contains nested JSON. The structure is as below.
I want to extract league_id. How can I do that with SQL? Thanks
{
"api": {
"Results
Different SQL implementations have different levels of JSON support.
E.g. MySQL can handle a JSON field in its tables:
-- returns {"a": 1, "b": 2}:
SELECT JSON_OBJECT('a', 1, 'b', 2);
https://www.sitepoint.com/use-json-data-fields-mysql-databases/
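For the league_id in the question, a rough sketch using MySQL's JSON_EXTRACT; the JSON path is an assumption, since the posted structure is truncated:
-- adjust the path to match the real structure under "api"
SELECT JSON_EXTRACT(element, '$.api.results[0].league_id') AS league_id
FROM `my doc`;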

Presto Produce JSON results

I have a JSON table which was created by:
CREATE TABLE `normaldata_source`(
`column1` int,
`column2` string,
`column3` struct<column4:string>)
A sample data is:
{
"column1": 9,
"column2": "Z",
"column3": {
"column4": "Y"
}
}
If I do
SELECT column3
FROM normaldata_source
it will produce the result {column4=Y}. However, I want it in JSON form: {"column4": "Y"}
Is this possible?
Edit: This query gives me the following result:
SELECT CAST(column3 AS JSON) as column3_json
FROM normaldata_source
As of Trino 357 (formerly known as Presto SQL), you can now cast a row to JSON and it will preserve the column names:
WITH normaldata_source(column1, column2, column3) AS (
VALUES (9, 'Z', cast(row('Y') as row(column4 varchar)))
)
SELECT cast(column3 as json)
FROM normaldata_source
=>
_col0
-----------------
{"column4":"Y"}
(1 row)
I encountered this same problem and was thoroughly stumped on how to proceed in light of deep compositional nesting/structs. I'm using Athena (managed Presto with the Hive connector, from AWS). In the end, I worked around it by doing a CTAS (create table as select) where I selected the complex column I wanted, under the conditions I wanted, and wrote it to an external table with an underlying SerDe format of JSON. Then, via the Hive connector's $path magic column (or by listing the files under the external table location), I obtained the resulting files and streamed the JSON out of those.
I know this isn't a direct answer to the question at hand - I believe we have to wait for https://github.com/trinodb/trino/pull/3613 in order to support arbitrary struct/array compositions -> JSON. But maybe this will help someone else who assumed they'd be able to do this.
Although I originally saw this as an annoying workaround, I'm now starting to think it was the right call for my application anyway.
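A rough sketch of that CTAS workaround, with an assumed output table name and S3 location (format and external_location are Athena CTAS table properties):
-- writes the selected column out as JSON files under the given S3 prefix
CREATE TABLE normaldata_json
WITH (format = 'JSON', external_location = 's3://my-bucket/normaldata-json/') AS
SELECT column3
FROM normaldata_source;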

SQL to convert JSON object into array of objects in Postgres

I have a column of JSON type in a postgres table. It currently has values like this
{"value": "abc"}
I want to write a SQL query that can change this to
[{"value": "abc", "timestamp": 1465373673}]
The "timestamp": 1465373673 part will be hard-coded.
Any ideas on how this SQL query can be written?
You can use json_build_array and json_build_object:
UPDATE test
set a = json_build_array(
json_build_object('value', a->'value', 'timestamp', 1465373673)
);
Here's a fiddle.
Use the concatenation operator and the function jsonb_build_array():
select jsonb_build_array('{"value": "abc"}'::jsonb || '{"timestamp": 1465373673}');
jsonb_build_array
---------------------------------------------
[{"value": "abc", "timestamp": 1465373673}]
(1 row)
Read JSON Functions and Operators.
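Putting that together as an update against the question's table, a sketch (my_table and data are placeholder names for the table and its json column):
-- concatenate the hard-coded timestamp, wrap in an array, cast back to the column's json type
UPDATE my_table
SET data = jsonb_build_array(data::jsonb || '{"timestamp": 1465373673}'::jsonb)::json;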

How do I load a CSV file that contains a JSON field to Amazon Athena

I have a CSV (tab-separated) file in S3 that needs to be queried on a JSON field.
uid\tname\taddress
1\tmoorthi\t{"rno":123,"code":400111}
2\tkiranp\t{"rno":124,"street":"kemp road"}
How can I query this data in Amazon Athena?
I should be able to query it like:
select uid
from table1
where address['street']="kemp road";
You could try using the json_extract() function.
From Extracting Data from JSON - Amazon Athena:
You may have source data containing JSON-encoded strings that you do not necessarily want to deserialize into a table in Athena. In this case, you can still run SQL operations on this data, using the JSON functions available in Presto.
WITH dataset AS (
SELECT '{"name": "Susan Smith",
"org": "engineering",
"projects": [{"name":"project1", "completed":false},
{"name":"project2", "completed":true}]}'
AS blob
)
SELECT
json_extract(blob, '$.name') AS name,
json_extract(blob, '$.projects') AS projects
FROM dataset
This example shows how json_extract() can be used to extract fields from JSON. Thus, you might be able to do something like the following, using json_extract_scalar() so the value comes back as a plain string and can be compared with a single-quoted literal:
select uid
from table1
where json_extract_scalar(address, '$.street') = 'kemp road';
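For completeness, a sketch of how the tab-separated file could be exposed to Athena in the first place; the table name, column types, and S3 path are assumptions, and the header row is skipped via a table property:
-- the address column is kept as a plain string and parsed with the JSON functions at query time
CREATE EXTERNAL TABLE table1 (
  uid int,
  name string,
  address string
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
LOCATION 's3://my-bucket/path/to/data/'
TBLPROPERTIES ('skip.header.line.count' = '1');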