I need to insert a line break with the concat_ws function in Hive SQL

I have a table:
Account Code
109 ABC
109 ZYX
109 BVC
I want the data in this format:
Account Code
109 ABC nextline ZYX nextline BVC
I have used the code below:
Select concat('/n',(collect_list(code))) from table
Group by account
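No answer was recorded for this question. As a hedged note: Hive's concat_ws takes a separator followed by strings or an array of strings, so concat_ws('\n', collect_list(code)) grouped by account is the likely shape of the query. The plain concat call above is not given a separator role, and '/n' is a slash followed by the letter n rather than a newline escape. The Python sketch below only illustrates the intended group-then-join behaviour; the function name is hypothetical and it does not run Hive.

```python
from collections import defaultdict

# Sample rows from the question: (account, code).
rows = [(109, "ABC"), (109, "ZYX"), (109, "BVC")]


def join_codes_per_account(rows, sep="\n"):
    """Collect codes per account (like collect_list) and join them (like concat_ws)."""
    grouped = defaultdict(list)
    for account, code in rows:
        grouped[account].append(code)
    return {account: sep.join(codes) for account, codes in grouped.items()}


result = join_codes_per_account(rows)
print(result[109])
```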


Mark the record with the lowest value in a group in SQL

I have a table that looks like the below:
ID1 ID2 Name
111 223 ABC
111 225 ABC
111 227 ABC
113 234 DEF
113 242 DEF
113 248 DEF
113 259 DEF
113 288 DEF
What I am trying to achieve is to mark, in a SELECT statement, the record that has the lowest ID2 value in every ID1 group, e.g.:
ID1 ID2 Name R
111 223 ABC Y
111 225 ABC
111 227 ABC
113 234 DEF Y
113 242 DEF
113 248 DEF
113 259 DEF
113 288 DEF
116 350 GHI Y
116 356 GHI
How do I achieve this in a SELECT statement?
The window functions should do the trick. Use dense_rank() instead of row_number() if you want tied rows to be marked as well.
Select *
,R = case when row_number() over (partition by ID1,Name order by ID2) = 1
then 'Y'
else ''
end
From YourTable
I should add: the window functions can be invaluable. They are well worth the time spent experimenting with them.
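A minimal sketch of the row_number() approach above, run through Python's built-in sqlite3 module rather than the asker's database (an assumption; window functions need SQLite 3.25 or newer). The query uses CASE ... END AS R because the R = ... alias form in the answer is SQL Server style. Table and column names follow the answer, and the rows are the first eight from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID1 INTEGER, ID2 INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (111, 223, "ABC"), (111, 225, "ABC"), (111, 227, "ABC"),
        (113, 234, "DEF"), (113, 242, "DEF"), (113, 248, "DEF"),
        (113, 259, "DEF"), (113, 288, "DEF"),
    ],
)

# Number the rows inside each (ID1, Name) group by ascending ID2;
# the row numbered 1 holds the lowest ID2 and gets the 'Y' marker.
query = """
SELECT ID1, ID2, Name,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY ID1, Name ORDER BY ID2) = 1
            THEN 'Y' ELSE '' END AS R
FROM YourTable
ORDER BY ID1, ID2
"""
marked = conn.execute(query).fetchall()
for row in marked:
    print(row)
```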

Pig script to transpose on the basis of certain criteria

I have a file containing data in the following format:
abc 123 456
cde 45 32
efg 322 654
abc 445 856
cde 65 21
efg 147 384
abc 815 078
efg 843 286
and so on.
How can I transpose it into the following format using Pig:
abc 123 456 cde 45 32 efg 322 654
abc 445 856 cde 65 21 efg 147 384
abc 815 078 efg 843 286
Also, in case cde is missing after abc, it should insert blank spaces instead, since it is a fixed-width file.
I tried grouping, but it did not work for me.
Well, you can do it by writing a custom loader. The easiest approach is to extend PigStorage and override the getNext() method so that it calls the record reader three times instead of once and produces a single combined Tuple.
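The record-merging logic such a loader would need can be sketched outside Pig. The Python below is not a Pig loader and uses no Pig API; it only illustrates combining consecutive lines into one record and padding a missing key with blanks. The key order and the field width of 12 characters are assumptions, since the question does not state the actual widths.

```python
KEYS = ["abc", "cde", "efg"]  # assumed key order within one output record
WIDTH = 12                    # assumed fixed width of each field


def transpose(lines, keys=KEYS, width=WIDTH):
    """Merge consecutive lines into one fixed-width record per cycle of keys."""
    position = {key: i for i, key in enumerate(keys)}
    records, current, last = [], {}, -1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        idx = position[line.split()[0]]
        # A key that does not come later than the previous one starts a new record.
        if idx <= last:
            records.append(current)
            current = {}
        current[idx] = line
        last = idx
    if current:
        records.append(current)
    # Missing keys become blank padding so the later fields keep their columns.
    return [
        "".join(rec.get(i, "").ljust(width) for i in range(len(keys))).rstrip()
        for rec in records
    ]


lines = [
    "abc 123 456", "cde 45 32", "efg 322 654",
    "abc 445 856", "cde 65 21", "efg 147 384",
    "abc 815 078", "efg 843 286",
]
for record in transpose(lines):
    print(repr(record))
```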

Result is wrong when retrieving the date

I'm working with PostgreSQL. I have two database tables, and I want to get the min and max dates stored in the daterange column of table1, which is of type character varying. table1 and table2 are related through sid. I want the max and min dates from table1 for the rows whose sid also appears in table2. Please find the demo here. The result is wrong.
table1:
sid daterange
100 5/25/2017
101 1/24/2017
102 4/4/2014
103 11/12/2007
104 4/24/2012
105 01/15/2017
106 1/1/2017
107 3/11/2016
108 10/10/2001
109 1/10/2016
110 12/12/2016
111 4/24/2017
112 06/28/2015
113 5/24/2017
114 5/22/2017
table2:
sid description
100 success
101 pending
104 pending
105 success
106 success
107 success
110 success
111 pending
112 failed
113 failed
114 pending
Below is my query:
select min(daterange) as minDate,max(daterange) as maxDate from (SELECT to_date(table1.daterange, 'DD/MM/YYYY') as daterange FROM table1,table2 where
table1.sid = table2.sid) tt;
The result, shown below, is wrong (the mindate and maxdate displayed are wrong dates).
mindate maxdate
2013-12-07 2019-01-07
Please advise. The daterange column in table1 is of type character varying. I cannot use ::date to convert to the date type, because I need to use this query in my Java Hibernate code and the Java code does not recognize ::
You have day and month mixed up in the date format string.
It should be:
to_date(table1.daterange, 'MM/DD/YYYY')
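As a quick cross-check of the month-first reading, the sample data can be parsed in Python (not PostgreSQL; the function name is hypothetical). The sketch keeps only the table1 rows whose sid appears in table2 and parses each string as month/day/year.

```python
from datetime import datetime

# Sample data from the question.
table1 = {
    100: "5/25/2017", 101: "1/24/2017", 102: "4/4/2014", 103: "11/12/2007",
    104: "4/24/2012", 105: "01/15/2017", 106: "1/1/2017", 107: "3/11/2016",
    108: "10/10/2001", 109: "1/10/2016", 110: "12/12/2016", 111: "4/24/2017",
    112: "06/28/2015", 113: "5/24/2017", 114: "5/22/2017",
}
table2_sids = [100, 101, 104, 105, 106, 107, 110, 111, 112, 113, 114]


def min_max_dates(table1, sids, fmt="%m/%d/%Y"):
    """Parse the joined rows as month/day/year and return (min, max)."""
    dates = [datetime.strptime(table1[s], fmt).date() for s in sids if s in table1]
    return min(dates), max(dates)


min_date, max_date = min_max_dates(table1, table2_sids)
print(min_date, max_date)
```

With the month-first format the earliest joined date is 4/24/2012 (sid 104) and the latest is 5/25/2017 (sid 100).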

SQL Query to order data based on other column value

I have the set of data below (the current data), where System_ID is the ID of a particular system and the PRE_SYSTEM_ID columns hold the IDs of the systems it depends on. I need the rows ordered so that systems with no dependencies come first, then systems with one dependency, and so on.
The current result:
System_ID PRE_SYSTEM_ID1 PRE_SYSTEM_ID2 PRE_SYSTEM_ID3 PRE_SYSTEM_ID4
106 100
105
112 105 100 109
100
109 100 105
119 100 109 105 112
102 112 109
104 109 106
The actual result should be like below:
Order System_ID PRE_SYSTEM_ID1 PRE_SYSTEM_ID2 PRE_SYSTEM_ID3 PRE_SYSTEM_ID4
1 100
2 105
3 106 100
4 109 100 105
5 112 105 100 109
6 119 100 109 105 112
7 104 109 106
8 102 112 109 104
The query for the current result is simply
Select * from ImpactedSystem;
Sorting by the various PRE_SYSTEM_IDn columns using the nulls first clause should produce the order you want:
select *
from ImpactedSystem
order by PRE_SYSTEM_ID1 nulls first,
PRE_SYSTEM_ID2 nulls first,
PRE_SYSTEM_ID3 nulls first,
PRE_SYSTEM_ID4 nulls first,
SYSTEM_ID
Finally sort by SYSTEM_ID, to order the values with no dependent IDs.
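The effect of the nulls first ordering can be emulated on the sample rows in Python. This is a sketch, not the Oracle query itself, and it assumes the dependency values in the flattened listing fill the PRE_SYSTEM_ID columns from left to right, which the original layout does not confirm.

```python
# (System_ID, PRE_SYSTEM_ID1..4); None stands for NULL.
rows = [
    (106, 100, None, None, None),
    (105, None, None, None, None),
    (112, 105, 100, 109, None),
    (100, None, None, None, None),
    (109, 100, 105, None, None),
    (119, 100, 109, 105, 112),
    (102, 112, 109, None, None),
    (104, 109, 106, None, None),
]


def nulls_first(value):
    """Sort key that places None before any number, like NULLS FIRST."""
    return (value is not None, value if value is not None else 0)


def sort_key(row):
    system_id, *pre_ids = row
    # Compare the dependency columns in order, then fall back to System_ID.
    return tuple(nulls_first(p) for p in pre_ids) + (system_id,)


ordered = sorted(rows, key=sort_key)
order_ids = [row[0] for row in ordered]
print(order_ids)
```

Under that column assumption the emulation places 119 before 112, so the result is close to the requested listing but not identical to it; an ordering that strictly follows the dependencies may need a different approach.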
You can also use the query below to obtain the result:
select *
from ImpactedSystem
order by DECODE(pre_system_id1, null, 1),
DECODE(pre_system_id2, null, 1),
DECODE(pre_system_id3, null, 1),
DECODE(pre_system_id4, null, 1);

Using sequences to create group ID

I'm attempting to create group_ids based on a set of item_ids. The only indication that the item_ids are part of a single group is the fact that item_ids are sequential. For example, based on the first two columns below, the output I want is the third:
item item_id group_id
ABC 282 2
ABC 283 2
ABC 284 2
ABC 285 2
ABC 051 3
ABC 052 3
ABC 189 4
ABC 231 5
ABC 232 5
ABC 233 5
ABC 234 5
ABC 247 6
ABC 248 6
ABC 249 6
ABC 250 6
ABC 091 7
ABC 092 7
The group_id doesn't necessarily have to be sequential itself; it only has to be unique. I attempted this with the following code:
create sequence seq
start with 1
minvalue 1
increment by 1
cache 20;
select seq.nextval from dual; --to initialize the sequence
select
item,
item_id,
case when diff = 1 then seq.currval else seq.nextval end group_id
from
(
select
item,
item_id,
item_id - lag(item_id, 1, 0) over (order by 1) diff
from
(
select
item,
item_id
from
table
)
);
But I get the following output:
item item_id group_id
ABC 282 2
ABC 283 3
ABC 284 4
ABC 285 5
ABC 051 6
ABC 052 7
ABC 189 8
ABC 231 9
ABC 232 10
ABC 233 11
ABC 234 12
ABC 247 13
ABC 248 14
ABC 249 15
ABC 250 16
ABC 091 17
ABC 092 18
When looking for the cause of the problem, I found an excellent explanation by user ShannonSeverance that details why my solution won't work. However, it didn't provide any suggestions on how to move forward.
Does anyone have any ideas?
You have a problem, because SQL tables are inherently unordered. The following "should" logically work, although it won't in practice:
select ii.*, (item_id - rownum) as grp_id
from item_ids ii;
For a run of consecutive item_ids taken in order, item_id minus the row number is constant. You can use that difference as a group, at least for a given item. To handle multiple items, concatenate the values together:
select ii.*, item||'-'||(item_id - rownum) as grp_id
from item_ids ii;
To really make this work, you need to add an order by -- this guarantees the ordering of the results from the select. This might work, assuming that there are "holes" between the groups:
select ii.*, item||'-'||(item_id - rownum) as grp_id
from item_ids ii
order by item, item_id;
Otherwise, you need some other column to determine the proper ordering for the items.
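The "item_id minus row number" idea can be checked on the sample data with a short Python sketch. It sorts first and numbers the rows afterwards, which corresponds to an analytic row_number() over (partition by item order by item_id) rather than to ROWNUM, because Oracle assigns ROWNUM before ORDER BY is applied. Treating item_id as an integer (so 051 becomes 51) and the function name are assumptions.

```python
# Sample rows from the question: (item, item_id).
rows = [
    ("ABC", 282), ("ABC", 283), ("ABC", 284), ("ABC", 285),
    ("ABC", 51), ("ABC", 52), ("ABC", 189),
    ("ABC", 231), ("ABC", 232), ("ABC", 233), ("ABC", 234),
    ("ABC", 247), ("ABC", 248), ("ABC", 249), ("ABC", 250),
    ("ABC", 91), ("ABC", 92),
]


def assign_groups(rows):
    """Label each row with item-(item_id - row number within the item)."""
    counters = {}
    labelled = []
    for item, item_id in sorted(rows):  # order by item, item_id
        n = counters.get(item, 0) + 1   # row number restarts for each item
        counters[item] = n
        # Consecutive ids rise in step with n, so the difference stays constant.
        labelled.append((item, item_id, f"{item}-{item_id - n}"))
    return labelled


groups = assign_groups(rows)
for row in groups:
    print(row)
```

The labels are unique per run of consecutive ids but not sequential, which matches the requirement stated in the question.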