I have a database table called ex_table, and location is one of its columns.
When I run a query, the column shows an array structure.
I need to extract an element from that array.
My query was
SELECT location FROM ex_table
and it returns
[{country=BD, state=NIL, city=NIL}]
How do I select only city from the location column?
Try the following:
WITH dataset AS (
SELECT location
FROM ex_table
)
SELECT places.city
FROM dataset, UNNEST (location) AS t(places)
Because this is an array of objects, you need to flatten the data. In Athena this is done with the UNNEST syntax. More information can be found in the AWS documentation.
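To make the flattening step concrete, here is a minimal Python sketch of what the query above computes. It is not Athena code: the sample rows and the helper name unnest_city are assumptions made purely for illustration.

```python
# Illustrative model of CROSS JOIN UNNEST; the sample rows are assumed.
ex_table = [
    {"location": [{"country": "BD", "state": "NIL", "city": "NIL"}]},
    {"location": [{"country": "US", "state": "NY", "city": "Albany"},
                  {"country": "US", "state": "CA", "city": "Fresno"}]},
]

def unnest_city(rows):
    """Yield one city per array element, like UNNEST(location) AS t(places)."""
    for row in rows:
        for places in row["location"]:
            yield places["city"]

cities = list(unnest_city(ex_table))
print(cities)
```

In this model a row whose location array is empty contributes no output rows, which is also how a cross join against an empty unnested set behaves.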
Hi, I have a table like the one below, and I am trying to unnest a string column in it, but I have been unable to do so.
id | data
1 | {"$google_analytics_client_id":"xxxx","fullName":"A","phoneNumber":"+xxxxx","userId":"263175"}
2 | {"$google_analytics_client_id":"xxx","fullName":"B","phoneNumber":"+xxxxx","userId":"263143"}
I am trying to get the id and userId. The data column is stored as a string.
The current code is as below, where I plan to see what's being returned so that I can select it.
select *
from table
unnest(data)
You can use the following to retrieve a value from a JSON or JSON formatted string object:
select
id,
json_value(data, '$.userId') as userId,
from sample_data
Information on JSON functions can be found here:
https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_value
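As a rough illustration of what that extraction does, the sketch below parses each JSON string and pulls out the userId key in Python. It only models the idea and is not BigQuery code; the in-memory sample_data list and the helper name user_ids are assumptions.

```python
import json

# Assumed stand-in for the table: (id, data) pairs, with data held as a string.
sample_data = [
    (1, '{"$google_analytics_client_id":"xxxx","fullName":"A","phoneNumber":"+xxxxx","userId":"263175"}'),
    (2, '{"$google_analytics_client_id":"xxx","fullName":"B","phoneNumber":"+xxxxx","userId":"263143"}'),
]

def user_ids(rows):
    """Return (id, userId) pairs; userId is None when the key is absent."""
    result = []
    for row_id, data in rows:
        parsed = json.loads(data)  # the string must be valid JSON
        result.append((row_id, parsed.get("userId")))
    return result

extracted = user_ids(sample_data)
print(extracted)
```

No unnesting is needed here because each row holds a single JSON object rather than an array.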
How would I query to select length(int) which is within the array 'details' which is within the 'packets' column? Hopefully, the image attached will explain better than me!
I've tried SELECT packets.details.length FROM test.ssh_data which doesn't work.
This gives me the following error:
illegal column/field reference 'packets.details.length' with intermediate collection 'details' of type 'ARRAY<STRUCT<datestamp:STRING,length:INT>>'
Thank you in advance!
In Impala's nested types support, arrays and maps are treated as nested tables, so you need to reference them in the FROM clause to unnest them. Here you can add the array to the FROM clause, taking care to refer to it via sd, the alias of the table that contains it. For example:
SELECT d.length FROM test.ssh_data sd, sd.packets.details d
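The join above can be pictured as a nested loop in which the inner loop walks the array that belongs to the current outer row. The Python sketch below models only that idea; the sample rows, including the assumption that packets is a struct holding the details array, are invented for illustration.

```python
# Assumed sample rows: packets is modelled as a struct containing a details array.
ssh_data = [
    {"packets": {"details": [{"datestamp": "2017-01-01", "length": 60},
                             {"datestamp": "2017-01-02", "length": 1500}]}},
    {"packets": {"details": []}},
]

# "FROM test.ssh_data sd, sd.packets.details d": for each row sd,
# iterate over the elements d of that row's own details array.
lengths = [d["length"] for sd in ssh_data for d in sd["packets"]["details"]]
print(lengths)
```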
I am using two columns from the iris dataset as an example - sepal_length and sepal_width.
I have two tables:
create table iris(sepal_length real, sepal_width real);
create table raw_json(data jsonb);
And inside the JSON file I have data like this
[{"sepal_width":3.5,"sepal_length":5.1},{"sepal_width":3.0,"sepal_length":4.9}]
First thing I do is copy raw_json from '/data.json';
So far I have only been able to figure out how to use jsonb_array_elements.
select jsonb_array_elements(data) from raw_json; gives back
jsonb_array_elements
-------------------------------------------
{"sepal_width": 3.5, "sepal_length": 5.1}
{"sepal_width": 3.0, "sepal_length": 4.9}
I want to insert (append actually) the data from raw_json table into iris table. I have figured out that I need to use either jsonb_to_recordset or json_populate_recordset. But how?
Also, could this be done without the raw_json table?
PS - Almost all the existing SE questions use a raw json string inside their queries. So that didn't work for me.
Call jsonb_array_elements in the FROM clause, then extract the fields from each JSON element it returns:
INSERT INTO iris(sepal_length,sepal_width)
select (j->>'sepal_length' ) ::real,
(j->>'sepal_width' ) ::real
from raw_json cross join jsonb_array_elements(data) as j;
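For readers who want to check which rows that statement would produce, here is a small Python sketch of the same transformation. It does not touch Postgres: the raw_json list stands in for the table, and iris_rows is a hypothetical name for the rows that would be inserted.

```python
import json

# Assumed stand-in for the raw_json table: one row holding the whole JSON array.
raw_json = [
    '[{"sepal_width":3.5,"sepal_length":5.1},{"sepal_width":3.0,"sepal_length":4.9}]',
]

# One output row per array element, with columns in the order used by
# INSERT INTO iris(sepal_length, sepal_width).
iris_rows = [
    (float(j["sepal_length"]), float(j["sepal_width"]))
    for data in raw_json
    for j in json.loads(data)
]
print(iris_rows)
```

The column order matters: the JSON lists sepal_width first, while the target columns start with sepal_length, so the values are picked by key rather than by position.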
I am taking samples from a Bayesian statistical model, serializing them with Avro, uploading them to S3, and querying them with Athena.
I need help writing a query that unnests an array in the table.
The CREATE TABLE query looks like:
CREATE EXTERNAL TABLE `model_posterior`(
`job_id` bigint,
`model_id` bigint,
`parents` array<struct<`feature_name`:string,`feature_value`:bigint, `is_zid`:boolean>>,
`posterior_samples` struct <`parameter`:string,`is_scaled`:boolean,`samples`:array<double>>)
The "samples" array in the "posterior_samples" column is where the samples are stored. I have managed to unnest the "posterior_samples" struct with the following query:
WITH samples AS (
SELECT model_id, parents, sample, sample_index
FROM posterior_db.model_posterior
CROSS JOIN UNNEST(posterior_samples.samples) WITH ORDINALITY AS t (sample, sample_index)
WHERE job_id = 111000020709
)
SELECT * FROM samples
Now what I want is to unnest the parents column. Each record in this column is an array of structs. I am trying to create a column that just has an array of values for the "feature_value" keys in that array of structs. (The reason why I want an array is that the parents array can have a length > 1).
In other words for each array in the parents row, I want an array of the same size. That array should contain only the values of the "feature_value" key from the structs in the original array.
Any advice on how to solve this?
Thanks.
You can use the transform function.
Assuming a table named samples with the structure described in your question, you can write a query along these lines:
SELECT *, transform(parents, parent -> parent.feature_value) AS only_feature_values
FROM samples
Note: this may not be syntactically perfect, but you can experiment with it.
Hope this would help. Cheers :)
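To show what the lambda in transform is expected to produce, here is a minimal Python sketch of the same element-wise mapping. The sample rows and their values are assumptions; only the field names come from the question.

```python
# Assumed sample rows using the field names from the CREATE TABLE statement.
samples = [
    {"model_id": 1, "parents": [
        {"feature_name": "a", "feature_value": 10, "is_zid": False},
        {"feature_name": "b", "feature_value": 20, "is_zid": True},
    ]},
    {"model_id": 2, "parents": [
        {"feature_name": "c", "feature_value": 30, "is_zid": False},
    ]},
]

def only_feature_values(parents):
    """Map each struct to its feature_value, keeping array length and order."""
    return [parent["feature_value"] for parent in parents]

# Keep the original columns and add the derived array, like SELECT *, transform(...).
result = [{**row, "only_feature_values": only_feature_values(row["parents"])}
          for row in samples]
print([row["only_feature_values"] for row in result])
```

Unlike UNNEST, this keeps one output row per input row, which is what the question asks for.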
One of the great things about postgres is that it allows indexing into a json object.
I have a column of data formatted a little bit like this:
{"Items":
[
{"RetailPrice":6.1,"EffectivePrice":0,"Multiplier":1,"ItemId":"53636"},
{"RetailPrice":0.47,"EffectivePrice":0,"Multiplier":1,"ItemId":"53404"}
]
}
What I'd like to do is find the average RetailPrice of each row with these data.
Something like
select avg(json_extract_path_text(item_json, 'RetailPrice'))
but really I need to do this for each item in the items json object. So for this example, the value in the queried cell would be 3.285
How can I do this?
Could work like this:
WITH cte(tbl_id, json_items) AS (
SELECT 1
, '{"Items": [
{"RetailPrice":6.1,"EffectivePrice":0,"Multiplier":1,"ItemId":"53636"}
,{"RetailPrice":0.47,"EffectivePrice":0,"Multiplier":1,"ItemId":"53404"}]}'::json
)
SELECT tbl_id, round(avg((elem->>'RetailPrice')::numeric), 3) AS avg_retail_price
FROM cte c
, json_array_elements(c.json_items->'Items') elem
GROUP BY 1;
The CTE just substitutes for a table like:
CREATE TABLE tbl (
tbl_id serial PRIMARY KEY
, json_items json
);
json_array_elements() (Postgres 9.3+), which unnests the JSON array, is the key piece.
I am using an implicit JOIN LATERAL here. Much like in this related example:
Query for element of array in JSON column
For an index to support this kind of query consider this related answer:
Index for finding an element in a JSON array
For details on how to best store EAV data:
Is there a name for this database structure?
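As a cross-check on the arithmetic of the per-row average in the question, the Python sketch below computes the same value outside the database. It assumes each Items array is non-empty, uses Decimal as a stand-in for the numeric cast, and the names tbl and avg_retail_price are hypothetical.

```python
import json
from decimal import Decimal

# Assumed stand-in for the table: (tbl_id, json_items) pairs.
tbl = [
    (1, '{"Items": ['
        '{"RetailPrice":6.1,"EffectivePrice":0,"Multiplier":1,"ItemId":"53636"},'
        '{"RetailPrice":0.47,"EffectivePrice":0,"Multiplier":1,"ItemId":"53404"}]}'),
]

def avg_retail_price(json_items):
    """Average RetailPrice over the Items array, rounded to 3 decimal places."""
    # parse_float=Decimal avoids binary floating-point rounding in the sum.
    items = json.loads(json_items, parse_float=Decimal)["Items"]
    prices = [Decimal(item["RetailPrice"]) for item in items]
    return round(sum(prices) / len(prices), 3)

averages = {tbl_id: avg_retail_price(json_items) for tbl_id, json_items in tbl}
print(averages)
```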