In Amazon Athena,
I have some data file in form of one JSON on one row
{ a: 1, b: 2 }
{ a: 2, b: 4 }
{ a: 3, b: 6 }
is it possible to output entire row of data while the table is created with only field a?
SELECT ??? FROM table1 WHERE a > 1;
Output
{ a: 2, b: 4 }
{ a: 3, b: 6 }
Thanks
Yes if your column is a JSON then you can extract it in Athena as so:
SELECT *
FROM my_table
WHERE cast(json_extract_scalar(column_name,
'$.a') as integer) > 1;
There is function concat( sting,string,...) in hive which can be used but I doubt Athena supports UDFs .
Sample code
SELECT concat('{ a: ' , cast(a as string),', b:' ,cast(a as string), ' }' )FROM table1 WHERE a > 1;
Update after comment
In case you have to create table form limited column create view on top of base table.
Related
I'm trying to migrate our current ES to CrateDB and one of the issues I'm facing is searching for two specific values within the same object when this object is part of an array of objects.
CREATE TABLE test.artefact (
id INTEGER,
metadata ARRAY(OBJECT(STATIC) AS (
key_id INTEGER,
value TEXT
))
);
insert into test.artefact(id, metadata) values (
1,
[
{
"key_id" = 1,
"value" = 'TEST1'
},
{
"key_id" = 2,
"value" = 'TEST2'
}
]
);
So basically, I'm trying to search metadata providing key_id and value.
A select like this one finds artefact 1 as a match, even when key and value are in different objects:
select * from test.artefact where 1 = ANY(metadata['key_id']) AND 'TEST2' = ANY(metadata['value'])
I have tried other functions, like UNNEST, with no luck.
Copy from CrateDB Community:
One way that should work is
SELECT *
FROM test.artefact
WHERE {key_id = 1, value = 'TEST2'} = ANY(metadata)
however this is probably not the most performant way.
together with the queries on the fields it might be quick enough.
SELECT *
FROM test.artefact
WHERE
1 = ANY(metadata['key_id'])
AND 'TEST2' = ANY(metadata['value'])
AND {key_id = 1, value = 'TEST2'} = ANY(metadata)
I have a JSON file (array of objects) which I have to convert into a table format using a PostgreSQL query.
Follow Sample Data.
"b", "c", "d", "e" are to be extracted as separate tables as they are arrays and in these arrays, there are objects
I have tried using json_populate_recordset() but it only works if I have a single array.
[{a:"1",b:"2"},{a:"10",b:"20"}]
I have referred to some links and codes.
jsonb_array_element example
postgreSQL functions
Expected Output
Sample Data:
{
"b":[
{columnB1:value, columnB2:value},
{columnB1:value, columnB2:value},
],
"c":[
{columnC1:value, columnC2:value, columnC3:value},
{columnC1:value, columnC2:value, columnC3:value},
{columnC1:value, columnC2:value, columnC3:value}
],
"d":[
{columnD1:value, columnD2:value},
{columnD1:value, columnD2:value},
],
"e":[
{columnE1:value, columnE2:value},
]
}
expected output
b should be one table in which columnA1 and columnA2 are displayed with their values.
Similarly table c, d, e with their respective columns and values.
Expected Output
You can use jsonb_to_recordset() but you need to unnest your JSON. You need to do this inline as this is a JSON Processing Function which cannot used derived values.
I am using validated JSON as simplified and formatted at end of this answer
To unnest your JSON use below notation which extracts JSON object field with the given key.
--one level
select '{"a":1}'::json->'a'
result : 1
--two levels
select '{"a":{"b":[2]}}'::json->'a'->'b'
result : [2]
We now expand this to include json_to_recordset()
select * from
json_to_recordset(
'{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b' --inner table b
)
as x("f1" int, "f2" int); --fields from table b
or using json_array_elements. Either way we need to list our fields. With second solution type will be json not int so you cant sum etc
with b as (select json_array_elements('{"a":{"b":[{"f1":2,"f2":4},{"f1":3,"f2":6}]}}'::json->'a'->'b') as jx)
select jx->'f1' as f1, jx->'f2' as f2 from b;
Output
f1 f2
2 4
3 6
We now use your data structure in jsonb_to_recordset()
select * from jsonb_to_recordset( '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'b') as x(columnname1b text, columnname2b text);
Output:
columnname1b columnname2b
value1b value2b
value value
For table c
select * from jsonb_to_recordset( '{"a":{"b":[{"columnname1b":"value1b","columnname2b":"value2b"},{"columnname1b":"value","columnname2b":"value"}],"c":[{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"},{"columnname1":"value","columnname2":"value"}]}}'::jsonb->'a'->'c') as x(columnname1 text, columnname2 text);
Output
columnname1 columnname2
value value
value value
value value
Sample JSON
{
"a": {
"b": [
{
"columnname1b": "value1b",
"columnname2b": "value2b"
},
{
"columnname1b": "value",
"columnname2b": "value"
}
],
"c": [
{
"columnname1": "value",
"columnname2": "value"
},
{
"columnname1": "value",
"columnname2": "value"
},
{
"columnname1": "value",
"columnname2": "value"
}
]
}
}
Well, I came up with some ideas, here is one that worked. I was able to get one table at a time.
https://www.postgresql.org/docs/9.5/functions-json.html
I am using json_populate_recordset.
The column used in the first select statement comes from a table whose column is a JSON type which we are trying to extract into a table.
The 'tablename from column' in the json_populate_recordset function, is the table we are trying to extract followed with b its columns and datatypes.
WITH input AS(
SELECT cast(column as json) as a
FROM tablename
)
SELECT b.*
FROM input c,
json_populate_recordset(NULL::record,c.a->'tablename from column') as b(columnname1 datatype, columnname2 datatype)
Assume i have a table called MyTable and this table have a JSON type column called myjson and this column have next value as a json array hold multiple objects, for example like next:
[
{
"budgetType": "CF",
"financeNumber": 1236547,
"budget": 1000000
},
{
"budgetType": "ENVELOPE",
"financeNumber": 1236888,
"budget": 2000000
}
]
So how i can search if the record has any JSON objects inside its JSON array with financeNumber=1236547
Something like this:
SELECT
t.*
FROM
"MyTable",
LATERAL json_to_recordset(myjson) AS t ("budgetType" varchar,
"financeNumber" int,
budget varchar)
WHERE
"financeNumber" = 1236547;
Obviously not tested on your data, but it should provide a starting point.
with a as(
SELECT json_array_elements(myjson)->'financeNumber' as col FROM mytable)
select exists(select from a where col::text = '1236547'::text );
https://www.postgresql.org/docs/current/functions-json.html
json_array_elements return setof json, so you need cast.
Check if a row exists: Fastest check if row exists in PostgreSQL
I have table (orders) with jsonb[] column named steps in Postgres db.
I need create SQL query to select records where Step1 and Step2 and Step3 has success status
[
{
"step_name"=>"Step1",
"status"=>"success",
"timestamp"=>1636120240
},
{
"step_name"=>"Step2",
"status"=>"success",
"timestamp"=>1636120275
},
{
"step_name"=>"Step3",
"status"=>"success",
"timestamp"=>1636120279
},
{
"step_name"=>"Step4",
"timestamp"=>1636120236
"status"=>"success"
}
]
table structure
id | name | steps (jsonb)
'Normalize' steps into a list of JSON items and check whether every one of them has "status":"success". BTW your example is not valid JSON. All => need to be replaced with : and a comma is missing.
select id, name from orders
where
(
select bool_and(j->>'status' = 'success')
from jsonb_array_elements(steps) j
where j->>'step_name' in ('Step1','Step2','Step3') -- if not all steps but only these are needed
);
You can use JSON value contain operation for check condition exist or not
Demo
select
*
from
test
where
steps #> '[{"step_name":"Step1","status":"success"},{"step_name":"Step2","status":"success"},{"step_name":"Step3","status":"success"}]'
I have 4 tables in PostgreSQL:
HomeSearch(id, client_id, name)
HomeSearchNote(id, homesearch_id, text)
Client(id, user_id)
User(id)
I'm trying to query HomeSearch and return a JSON as follows but only for those entries in HomeSearch for which the Client.user_id is equal to a certain value (say, 100):
HomeSearch {
id:
name:
client: {
id:
user: {
id
}
}
notes: [{
id:
homesearch_id:
text:
},
...]
}
My SQL statement is:
SELECT
*,
( SELECT row_to_json(client) from client where homeSearch.client_id=client.id ) client,
( SELECT json_agg(row_to_json(homeSearchNote)) from homeSearchNote where homeSearchNote.homesearch_id=homeSearch.id) notes
FROM homeSearch
WHERE client->>'user_id'=100
LIMIT 5;
However, this returns:
ERROR: column "client" does not exist
LINE 19: WHERE client->>user_id=100
If I run the query without the WHERE clause in PGAdmin, I can clearly see a table with a 'client' column of type JSON.
Can anyone comment what would be the right way to place / write the WHERE clause ?
Much appreciated!
On way you can re-write the sql is as
With a as
(SELECT *,
( SELECT row_to_json(client) from client where homeSearch.client_id=client.id ) client,
( SELECT json_agg(row_to_json(homeSearchNote)) from homeSearchNote where homeSearchNote.homesearch_id=homeSearch.id) notes
FROM homeSearch
)
Select * from a WHERE client->>'user_id'=100
LIMIT 5;