Hive and sql( any sql query) - sql

I am posting question here related to hive dataware house?I have below sample data
Id,name,transctions
1,apple,(1,1,1,1)
2,mango,(1,1,1)
3,kiwi,(1,1,1)
...........
.......
In transctions column my data is array type. I am expecting below output.
Id,name,transtion
1,apple,4
2,mango,3
3,kiwi,3

I am guessing you want the size of the array:
select id, name, [size(transactions)][1]
from t;

Related

Transforming JSON data to relational data

I want to display data from SQL Server where the data is in JSON format. But when the select process, the data does not appear:
id
item_pieces_list
0
[{"id":2,"satuan":"BOX","isi":1,"aktif":true},{"id":4,"satuan":"BOX10","isi":1,"aktif":true}]
1
[{"id":0,"satuan":"AMPUL","isi":1,"aktif":"true"},{"id":4,"satuan":"BOX10","isi":5,"aktif":true}]
I've written a query like this, but nothing appears. Can anyone help?
Query :
SELECT id, JSON_Value(item_pieces_list, '$.satuan') AS Name
FROM [cisea.bamedika.co.id-hisys].dbo.medicine_alkes AS medicalkes
Your Path is wrong. Your JSON is an array, and you are trying to retrieve it as a flat object
SELECT id, JSON_Value(item_pieces_list,'$[0].satuan') AS Name
FROM [cisea.bamedika.co.id-hisys].dbo.medicine_alkes
Only in the case of data without the [] (array sign) you could use your original query '$.satuan', but since you are using an array I change it to retrieve only the first element in the array '$[0].satuan'

extract values inside an array column in amazon athena

I have a table in athena aws where the column 'metadata_stopinfo' has the structure that you can see in the image.
I am trying to extract values that are inside that array, however when I try
SELECT
"json_extract_scalar"(metadata_stopinfo, '$.city')
FROM "table"
I have the following problem
SYNTAX_ERROR: line 2:5: Unexpected parameters (array(row("address" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"carrierreference" varchar,"contacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"mobilephone" varchar,"name" varchar,"officephone" varchar,"userid" varchar)),"containerinfo" array(row("containerid" varchar,"containeridtype" varchar,"equipmentcode" varchar,"equipmenttype" varchar)),"conveyancelinenumber" varchar,"conveyancetype" varchar,"conveyancetypeoriginal" varchar,"dateinfo" row("arrivalestimateddate" varchar,"arrivalestimateddateend" varchar,"arrivalestimatedendoffset" varchar,"arrivalestimatedoffset" varchar,"arrivalrequesteddate" varchar,"deliveryestimateddate" varchar,"deliveryestimateddateend" varchar,"deliveryestimatedendoffset" varchar,"deliveryestimatedoffset" varchar,"deliveryrequesteddate" varchar,"deliveryrequesteddateend" varchar,"deliveryrequestedendoffset" varchar,"deliveryrequestedoffset" varchar,"departureestimateddate" varchar,"departureestimateddateend" varchar,"departureestimatedendoffset" varchar,"departureestimatedoffset" varchar,"departurerequesteddate" varchar,"pickuprequesteddate" varchar,"pickuprequesteddateend" varchar,"pickuprequestedendoffset" varchar,"pickuprequestedoffset" varchar,"pickupestimateddate" varchar,"pickupestimateddateend" varchar,"pickupestimatedendoffset" varchar,"pickupestimatedoffset" varchar),"deliverynotenumber" varchar,"instructions" array(row("customerspecificsubtype" varchar,"header" boolean,"instructionsubtype" varchar,"instructiontype" varchar,"text" varchar)),"locationid" varchar,"partnercarrieraddress" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"partnercarriercontacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"name" varchar,"officephone" varchar)),"partnercarrierid" varchar,"partnercarriername" varchar,"partnerid" varchar,"partnername" varchar,"partnertimezone" varchar,"partnertype" varchar,"productquantity" row("number" double,"originalunitofmeasure" varchar,"quantitytype" varchar,"unitofmeasure" varchar),"sequencenumber" bigint,"shipmentidentifier" varchar,"stoptype" varchar,"transportinfo" row("description" varchar,"transportcode" varchar,"transportoriginalcode" varchar),"vesselinfo" row("lloydsnumber" varchar,"shipsradiocallnumber" varchar,"vesselname" varchar,"vesselnumber" varchar,"voyagetripnumber" varchar))), varchar(6)) for function json_extract_scalar. Expected: json_extract_scalar(varchar(x), JsonPath) , json_extract_scalar(json, JsonPath)
My question is, how can i extract values inside de column ?
json_extract_scalar unsurprisingly works with json (note that even if yur data was in json format, json_extract_scalar(metadata_stopinfo, '$.city') still would not have worked cause your data is an array), while your column contains array's of row's, so you need to work with it correspondingly. For example you can use indexes to access elements in array (in presto array indexes start from 1):
SELECT
metadata_stopinfo[1] r
FROM "table"
And then access the fields:
The fields may be of any SQL type, and are accessed with field reference operator .
SELECT
metadata_stopinfo[1].city city
FROM "table"
Also you can flatten the array with unnest:
SELECT r.city
FROM "table",
unnest(metadata_stopinfo) as t(r)

Extracting data from JSON field in Amazon Redshift

I am trying to extract some data from a JSON field in Redshift.
Given below is a sample view of the data I am working with.
{"fileFormat":"excel","data":{"name":John,"age":24,"dateofbirth":1993,"Class":"Computer Science"}}
I am able to extract data for the first level namely data corresponding to
fileFormat and data as below:
select CONFIGURATION::JSON -> 'fileFormat' from table_name;
I am trying to extract information under data like name, age,dateofbirth
You could use Redshift's native function json_extract_path_text
- https://docs.aws.amazon.com/redshift/latest/dg/JSON_EXTRACT_PATH_TEXT.html
SELECT
json_extract_path_text(
configuration,
'data',
'name'
)
AS name,
json_extract_path_text(
configuration,
'data',
'age'
)
AS age,
etc
FROM
yourTable

extract data in hive

I have data in a hive table column as below:
customer reason=Other#customer reason free text=Space#customer type=Regular#customer end date=2020-12-31 19:50:00#customer offering criterion=0#customer type=KK#Customer factor=0.00#customer period=0#customer type=TN#customer value=0#customer plan type=M#cttype type=KK#
I want to extract data value 0 against customer value.
I tried below query but it is giving full 'customer value=0' but i want only 0.
Please suggest.
select regexp_replace(regexp_extract(information,'customer period=[^#]*',0),'customer period=','') from detail;
The question is not clear at all.
If you want to limit the output, use LIMIT 1 (where 1 is number of rows you want to limit to)
Please give the following a shot,
---table creation----
create table ch7(custdetails map string,string>)
row format delimited
collection items terminated by '#'
map keys terminated by '=';
----data loading----
Load data local inpath 'your-file-location-in-local' overwrite into table ch7;
----data selection-----
select custdetails["customer value"] from ch7;
Hope this helps!!

explode function in hive

I have the following sample data and I am trying to explode it in hive.. I used split but I know I am missing something..
["[[-80.742426,35.23248],[-80.740424,35.23184],[-80.739583,35.231562],[-80.735935,35.23041],[-80.728624,35.228069],[-80.727753,35.227836],[-80.727294,35.227741],[-80.726762,35.227647],[-80.726321,35.227594],[-80.725687,35.227544],[-80.725134,35.227535],[-80.721502,35.227615],[-80.691298,35.216202],[-80.688009,35.215396],[-80.686516,35.215016],[-80.598433,35.234307]]"]
I used the below query
select explode(split(col, ',')) from sample2;
and the result is this
["[[-80.742426
35.23248]
[-80.740424
35.23184]
[-80.739583
35.231562]
[-80.735935
35.23041]
[-80.728624
35.228069]
[-80.727753
35.227836]
[-80.71143
35.227831]
[-80.711007
35.227795]
[-80.710638
35.227741]
[-80.673884
35.21014]
[-80.672358
35.209481]
[-80.672036
35.209356]
[-80.671686
35.209234]
[-80.67124
35.209099]
[-80.670815
35.209006]
[-80.670267
35.208906]
[-80.669612
35.208833]
[-80.668924
35.208806]
[-80.598433
35.234307]]"]
I need it in below format
[-80.742426,35.23248]
[-80.740424,35.23184]
[-80.739583,35.231562]
[-80.735935,35.23041]
[-80.728624,35.228069]
[-80.727753,35.227836]
[-80.727294,35.227741]
[-80.726762,35.227647]
[-80.726321,35.227594]
[-80.725687,35.227544]
[-80.725134,35.227535]
[-80.721502,35.227615]
[-80.691298,35.216202]
[-80.688009,35.215396]
[-80.686516,35.215016]
[-80.684281,35.214466]
[-80.68396,35.214395]
[-80.683375,35.214231]
[-80.682908,35.214079]
[-80.682444,35.213905]
[-80.682045,35.213733]
[-80.68062,35.213112]
[-80.678078,35.211983]
[-80.676836,35.211447]
[-80.598433,35.234307]
Any help over here..?
You have your data set as arrays of array and you want to explode your data at first level only, so use LATERAL VIEW explode(colname) to explode at the first level.
Below is the SELECT query with explode():
SELECT col1 FROM sample2 LATERAL VIEW EXPLODE(col) explodeVal AS col1;
output generated from your input data set as below:
[-80.742426,35.23248]
[-80.740424,35.23184]
[-80.739583,35.231562]
[-80.735935,35.23041]
[-80.728624,35.228069]
[-80.727753,35.227836]
[-80.727294,35.227741]
[-80.726762,35.227647]
[-80.726321,35.227594]
[-80.725687,35.227544]
[-80.725134,35.227535]
[-80.721502,35.227615]
[-80.691298,35.216202]
[-80.688009,35.215396]
[-80.686516,35.215016]
[-80.684281,35.214466]
[-80.68396,35.214395]
[-80.683375,35.214231]
[-80.682908,35.214079]
[-80.682444,35.213905]
[-80.682045,35.213733]
[-80.68062,35.213112]
[-80.678078,35.211983]
[-80.676836,35.211447]
[-80.598433,35.234307]