How to use Hive's explode() function for a complex struct?

My hive table looks like this :
CREATE EXTERNAL TABLE sample(id STRING,products STRUCT<urls:ARRAY<STRUCT<url:STRING>>,product_names:ARRAY<STRUCT<name:STRING>>,user:ARRAY<STRUCT<user_id:STRING>>>)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/input/sample';
Is there any way to explode the products field so that the url, name, and user_id values end up in three separate columns?
Any suggestions would be appreciated.

You should be able to explode your three arrays as follows:
SELECT url, product_name, user_id FROM sample
LATERAL VIEW explode(products.urls) A AS url
LATERAL VIEW explode(products.product_names) B AS product_name
LATERAL VIEW explode(products.user) C AS user_id;
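Note that each exploded element is itself a struct (e.g. STRUCT<url:STRING>), so to get the plain string values you may need to reference the inner fields. A sketch, assuming the table definition above:

```sql
-- Each LATERAL VIEW yields one struct-typed column; drill into its single field.
SELECT u.url, p.name, us.user_id
FROM sample
LATERAL VIEW explode(products.urls) A AS u
LATERAL VIEW explode(products.product_names) B AS p
LATERAL VIEW explode(products.user) C AS us;
```

Also be aware that stacking several LATERAL VIEWs produces a cross product of the three arrays, so the row count multiplies when the arrays each contain more than one element.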

Related

Fetch data from database while comparing id with a jsonb column holding JSON data in PostgreSQL

I have a column called vendor of jsonb type with JSON data like [{"id":"1","name":"Dev"}].
I want to select rows by using this column in the WHERE clause, something like WHERE vendor.id = 1.
How can I do that? Any help will be appreciated.
You can use the containment operator @>:
select *
from the_table
where vendor @> '[{"id":"1"}]'::jsonb;
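If you also want to pull fields out of the matching element, jsonb_array_elements() can expand the array. A sketch against the same hypothetical the_table:

```sql
-- Expand the vendor array to one row per element, then filter on the id key.
SELECT v->>'name' AS vendor_name
FROM the_table
CROSS JOIN jsonb_array_elements(vendor) AS v
WHERE v->>'id' = '1';
```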

Query from array structure

I have a db table called ex_table with a column called location.
When I run a query, it shows an array structure,
and I need to extract an array element.
My query was
SELECT location FROM ex_table
and it shows
[{country=BD, state=NIL, city=NIL}]
How do I select only city from the location column?
Try the following:
WITH dataset AS (
  SELECT location
  FROM ex_table
)
SELECT places.city
FROM dataset, UNNEST(location) AS t(places)
As this is an array of objects, you need to flatten the data. This is done using the UNNEST syntax in Athena; more info on this can be found in the AWS documentation.
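The CTE is not strictly necessary; the same flattening can be written directly against the table. A sketch, assuming ex_table as described above:

```sql
-- CROSS JOIN UNNEST gives one row per element of the location array.
SELECT places.city
FROM ex_table
CROSS JOIN UNNEST(location) AS t(places);
```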

postgresql - Appending data from JSON file to a table

I am using two columns from the iris dataset as an example - sepal_length and sepal_width.
I have two tables
create table iris(sepal_length real, sepal_width real);, and
create table raw_json(data jsonb);
And inside the JSON file I have data like this
[{"sepal_width":3.5,"sepal_length":5.1},{"sepal_width":3.0,"sepal_length":4.9}]
The first thing I do is run copy raw_json from '/data.json';
So far I have only been able to figure out how to use jsonb_array_elements.
select jsonb_array_elements(data) from raw_json; gives back
jsonb_array_elements
-------------------------------------------
{"sepal_width": 3.5, "sepal_length": 5.1}
{"sepal_width": 3.0, "sepal_length": 4.9}
I want to insert (append actually) the data from raw_json table into iris table. I have figured out that I need to use either jsonb_to_recordset or json_populate_recordset. But how?
Also, could this be done without the raw_json table?
PS - Almost all the existing SE questions use a raw json string inside their queries. So that didn't work for me.
You need to extract the fields from the jsonb values produced by jsonb_array_elements() in the FROM clause:
INSERT INTO iris(sepal_length, sepal_width)
SELECT (j->>'sepal_length')::real,
       (j->>'sepal_width')::real
FROM raw_json CROSS JOIN jsonb_array_elements(data) AS j;
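Since the question mentions jsonb_to_recordset, the same insert can also be written with it; the column list after AS supplies the record type. A sketch, assuming the tables defined above:

```sql
-- jsonb_to_recordset turns each array element into a typed row directly.
INSERT INTO iris(sepal_length, sepal_width)
SELECT r.sepal_length, r.sepal_width
FROM raw_json
CROSS JOIN jsonb_to_recordset(data) AS r(sepal_length real, sepal_width real);
```

As for skipping the raw_json table entirely: plain COPY loads into a table, not into an expression, so a staging table (or a client-side script) is the usual route.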

Querying using like on JSONB field

I have a field in a PostgreSQL database with a JSONB type in the format ["tag1","tag2"], and I am trying to implement a search that feeds a predictive dropdown (i.e. if a user types "t" and the column above exists, both tags are returned).
Any suggestions on how to do this?
I tried the query below, but it is not working:
SELECT table.tags::JSONB from table where table.tags::TEXT like 't%';
One way to do that is with the jsonb_array_elements_text() function (https://www.postgresql.org/docs/current/static/functions-json.html).
Example test:
SELECT *
FROM jsonb_array_elements_text($$["tag1","tag2","xtag1","ytag1"]$$::jsonb)
WHERE value LIKE 't%';
value
-------
tag1
tag2
(2 rows)
Since jsonb_array_elements_text() produces a set of rows, and in your case there is no condition other than LIKE, a LATERAL join (https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-LATERAL) should do the job:
SELECT T.tags
FROM table T,
LATERAL jsonb_array_elements_text(T.tags) A
WHERE A.value LIKE 't%';
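If you want each matching row once, rather than one output row per matching tag, an EXISTS test over the unnested elements is one option. A sketch against the same hypothetical table (quoted here because "table" is a reserved word):

```sql
-- Return the row once if ANY of its tags starts with 't'.
SELECT t.tags
FROM "table" t
WHERE EXISTS (
  SELECT 1
  FROM jsonb_array_elements_text(t.tags) AS a(value)
  WHERE a.value LIKE 't%'
);
```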

Can I get an average of values in a JSON array using Postgres?

One of the great things about postgres is that it allows indexing into a json object.
I have a column of data formatted a little bit like this:
{"Items":
[
{"RetailPrice":6.1,"EffectivePrice":0,"Multiplier":1,"ItemId":"53636"},
{"RetailPrice":0.47,"EffectivePrice":0,"Multiplier":1,"ItemId":"53404"}
]
}
What I'd like to do is find the average RetailPrice of each row with these data.
Something like
select avg(json_extract_path_text(item_json, 'RetailPrice'))
but really I need to do this for each item in the items json object. So for this example, the value in the queried cell would be 3.285
How can I do this?
It could work like this:
WITH cte(tbl_id, json_items) AS (
  SELECT 1
       , '{"Items": [
            {"RetailPrice":6.1,"EffectivePrice":0,"Multiplier":1,"ItemId":"53636"}
          ,{"RetailPrice":0.47,"EffectivePrice":0,"Multiplier":1,"ItemId":"53404"}]}'::json
)
SELECT tbl_id, round(avg((elem->>'RetailPrice')::numeric), 3) AS avg_retail_price
FROM cte c
   , json_array_elements(c.json_items->'Items') elem
GROUP BY 1;
The CTE just substitutes for a table like:
CREATE TABLE tbl (
  tbl_id serial PRIMARY KEY
, json_items json
);
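Against such a table, the same aggregation can be run directly. A sketch, where tbl and its columns are the hypothetical names from the definition above:

```sql
-- Average RetailPrice per row, unnesting the Items array in the FROM list.
SELECT tbl_id
     , round(avg((elem->>'RetailPrice')::numeric), 3) AS avg_retail_price
FROM tbl
   , json_array_elements(json_items->'Items') elem
GROUP BY 1;
```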
json_array_elements() (Postgres 9.3+), which unnests the json array, is instrumental here.
I am using an implicit JOIN LATERAL, much like in this related example:
Query for element of array in JSON column
For an index to support this kind of query consider this related answer:
Index for finding an element in a JSON array
For details on how to best store EAV data:
Is there a name for this database structure?