Hive - sql max number with multiple rows

For the raw data below, how do I get the maximum value per customer_id across the whole row, with null in the remaining columns? I can compute the maximum, but I cannot get it into the form shown under Results.
Raw data
customer_id name location itemno_1 itemno_2 itemno_3 itemno_4 itemno_5
123 Ashley M CA 10 null 10 null null
123 Ashley M CA null 12 null 12 null
143 Donald P FL 15 15 0 1 10
187 Alicia P GA 15 9 null null null
1736 Mike H CT null 8 8 9 null
1736 Mike H CT null null null null null
1876 David M CA null null null null null
532 Matthew T CA null 9 10 10 null
Results
customer_id name location itemno_1 itemno_2 itemno_3 itemno_4 itemno_5
123 Ashley M CA null 12 null null null
143 Donald P FL 15 null null null null
187 Alicia P GA 15 null null null null
1736 Mike H CT null null null null null
1876 David M CA null null null null null
532 Matthew T CA null null null 10 null

The query below produces your expected result (I have tested it). I assumed that if two itemno columns share the same maximum value, the value is kept in the lowest-numbered column. For example, for customer_id = 123 both itemno_2 and itemno_4 have the value 12; itemno_2 keeps 12 and itemno_4 becomes null.
select customer_id, name, location1
,CASE WHEN (i1 >= i2 or i2 is null)
AND (i1 >= i3 or i3 is null)
AND (i1 >= i4 or i4 is null)
AND (i1 >= i5 or i5 is null)
THEN i1
ELSE null
END as itemno_1
,CASE WHEN (i2 >= i1 or i1 is null)
AND (i2 >= i3 or i3 is null)
AND (i2 >= i4 or i4 is null)
AND (i2 >= i5 or i5 is null)
AND (i1 <> i2 or i1 is null)
THEN i2
ELSE null
END as itemno_2
,CASE WHEN (i3 >= i1 or i1 is null)
AND (i3 >= i2 or i2 is null)
AND (i3 >= i4 or i4 is null)
AND (i3 >= i5 or i5 is null)
AND (i1 <> i3 or i1 is null)
AND (i2 <> i3 or i2 is null)
THEN i3
ELSE null
END as itemno_3
,CASE WHEN (i4 >= i1 or i1 is null)
AND (i4 >= i2 or i2 is null)
AND (i4 >= i3 or i3 is null)
AND (i4 >= i5 or i5 is null)
AND (i1 <> i4 or i1 is null)
AND (i2 <> i4 or i2 is null)
AND (i3 <> i4 or i3 is null)
THEN i4
ELSE null
END as itemno_4
,CASE WHEN (i5 >= i1 or i1 is null)
AND (i5 >= i2 or i2 is null)
AND (i5 >= i3 or i3 is null)
AND (i5 >= i4 or i4 is null)
AND (i1 is null or i1 <> i5)
AND (i2 is null or i2 <> i5)
AND (i3 is null or i3 <> i5)
AND (i4 is null or i4 <> i5)
THEN i5
ELSE null
END as itemno_5
from (
select customer_id, name, location1
,max(itemno_1) as i1
,max(itemno_2) as i2
,max(itemno_3) as i3
,max(itemno_4) as i4
,max(itemno_5) as i5
from default.stack2
group by customer_id, name, location1) a
order by customer_id;
The same result can also be achieved by writing a UDF, instead of the CASE expressions, that finds the maximum of the 5 columns and returns the row in the expected form.
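As a rough illustration of the per-row logic such a UDF would have to implement, here is a minimal Python sketch. The function name keep_first_max is hypothetical, and this is plain Python rather than the Hive UDF API; it only shows the "keep the first maximum, null out the rest" rule applied to the already aggregated values i1..i5.

```python
from typing import List, Optional


def keep_first_max(values: List[Optional[int]]) -> List[Optional[int]]:
    """Keep the largest non-null value at its first position; null out the rest."""
    present = [v for v in values if v is not None]
    if not present:
        # Every column is null, so the row stays all null.
        return [None] * len(values)
    top = max(present)
    first = values.index(top)  # lowest-numbered column wins on ties
    return [top if i == first else None for i in range(len(values))]


# Per-customer maxima (i1..i5) for customer_id 123 and 143 from the sample data.
row_123 = keep_first_max([10, 12, 10, 12, None])
row_143 = keep_first_max([15, 15, 0, 1, 10])
print(row_123)  # [None, 12, None, None, None]
print(row_143)  # [15, None, None, None, None]
```

The tie-breaking matches the assumption stated in the answer: when two columns share the maximum, the lowest-numbered one keeps it.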

Related

Row wise and conditions average in sql

Input table:
Atm_ID C1 C2 C3 C4 C5
R12673 2 5 3 1 10
R34721 3 5 2 1 8
R27835 1 2 2 8 6
I want the average per Atm_ID, but only over the values that are greater than or equal to 3, dividing by the number of values that are greater than or equal to 3.
I need the following output:
Atm_ID Average
R12673 6
R34721 5.33
R27835 7
Can anyone please help me?

You can use a CTE to build an intermediate result and then compute the desired output from it. Here tot_avg counts the columns whose value is >= 3:
WITH CTE AS (SELECT Atm_ID,
CASE WHEN C1 >= 3 THEN C1 ELSE 0 END C1,
CASE WHEN C2 >= 3 THEN C2 ELSE 0 END C2,
CASE WHEN C3 >= 3 THEN C3 ELSE 0 END C3,
CASE WHEN C4 >= 3 THEN C4 ELSE 0 END C4,
CASE WHEN C5 >= 3 THEN C5 ELSE 0 END C5,
CASE WHEN C1 >= 3 THEN 1 ELSE 0 END +
CASE WHEN C2 >= 3 THEN 1 ELSE 0 END +
CASE WHEN C3 >= 3 THEN 1 ELSE 0 END +
CASE WHEN C4 >= 3 THEN 1 ELSE 0 END +
CASE WHEN C5 >= 3 THEN 1 ELSE 0 END tot_avg
FROM atm_data -- placeholder table name; the question does not give one
)
-- 1.0 * forces decimal division; NULLIF avoids dividing by zero when no value is >= 3
SELECT Atm_ID, 1.0 * (C1 + C2 + C3 + C4 + C5) / NULLIF(tot_avg, 0) Average
FROM CTE;
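To check the conditional-average approach against the sample rows, here is a small sketch that runs an equivalent query in SQLite through Python's sqlite3 module. The table name atm is an assumption (the question does not name the table), and SQLite stands in for whatever database the asker actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns follow the question.
conn.execute("CREATE TABLE atm (Atm_ID TEXT, C1 INT, C2 INT, C3 INT, C4 INT, C5 INT)")
conn.executemany(
    "INSERT INTO atm VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("R12673", 2, 5, 3, 1, 10),
        ("R34721", 3, 5, 2, 1, 8),
        ("R27835", 1, 2, 2, 8, 6),
    ],
)

query = """
WITH cte AS (
    SELECT Atm_ID,
           -- sum of the values that are >= 3
           CASE WHEN C1 >= 3 THEN C1 ELSE 0 END +
           CASE WHEN C2 >= 3 THEN C2 ELSE 0 END +
           CASE WHEN C3 >= 3 THEN C3 ELSE 0 END +
           CASE WHEN C4 >= 3 THEN C4 ELSE 0 END +
           CASE WHEN C5 >= 3 THEN C5 ELSE 0 END AS total,
           -- number of values that are >= 3
           CASE WHEN C1 >= 3 THEN 1 ELSE 0 END +
           CASE WHEN C2 >= 3 THEN 1 ELSE 0 END +
           CASE WHEN C3 >= 3 THEN 1 ELSE 0 END +
           CASE WHEN C4 >= 3 THEN 1 ELSE 0 END +
           CASE WHEN C5 >= 3 THEN 1 ELSE 0 END AS cnt
    FROM atm
)
SELECT Atm_ID, ROUND(1.0 * total / NULLIF(cnt, 0), 2) AS Average
FROM cte
"""
averages = dict(conn.execute(query).fetchall())
for atm_id, average in averages.items():
    print(atm_id, average)
```

For the sample data this gives 6, 5.33 and 7, matching the output requested in the question.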

SQL - update field with a value, if corresponding values are all the same

I have an interesting task at hand that I'm trying to figure out how to do.
Let's say I have the following data in a table:
Num1 Acct Amt Type1 Type2 AmtX AmtY AcctBadInd
X12 111 90 X 1 NULL NULL NULL
X12 222 -90 X 1 NULL NULL NULL
X12 333 90 X 1 NULL NULL NULL
Y33 111 75 Y 1 NULL NULL NULL
Y33 444 -75 Y 1 NULL NULL NULL
Z44 111 55 Y 1 NULL NULL NULL
Z44 111 55 Y 0 NULL NULL NULL
Z44 444 -65 Y 1 NULL NULL NULL
Below are a couple examples. Only caveat is that a given Num1 can have any number of records but always >= 2. So it could be 2,3,4,5 and the same logic would apply in all cases.
Num1 = X12 - Verify that ABS(AMT) is the same for all Type2=1 records. If all 3 records have the same ABS(AMT), then SET AMTX=ABS(AMT) for that Num1. Alternatively, if Type1 was Y for X12, we would instead update AmtY = ABS(AMT)
Num1 = Y33 - In this case we again want to verify that ABS(AMT) is the same where Type2=1. If they are equal, then because Type1=Y, we would set AmtY =75
Num1=Z44 - in this case again verify that ABS(AMT) is the same for Type2=1. If they are not equal, then dont update AmtY, but rather set AcctBadInd = 1
End Result
Num1 Acct Amt Type1 Type2 AmtX AmtY AcctBadInd
X12 111 90 X 1 90 NULL NULL
X12 222 -90 X 1 90 NULL NULL
X12 333 90 X 1 90 NULL NULL
Y33 111 75 Y 1 NULL 75 NULL
Y33 444 -75 Y 1 NULL 75 NULL
Z44 111 55 Y 1 NULL NULL 1
Z44 444 -65 Y 1 NULL NULL 1
Z44 111 55 Y 0 NULL NULL NULL
I'm struggling with this, and I'm not expecting an answer but at least a hint or any help so I can get on my way. More so, if this is doable in a way that I imagine without writing god knows how much code.

If I understand correctly, here is how you can do it:
with cte as (
select Num1, Type2, min(abs(Amt)) minAmt, max(abs(Amt)) maxAmt, count(*) cnt
from tbl
group by Num1, Type2
)
update t1
set AmtX = case when cnt > 1 and t1.Type2 = 1 and t1.Type1 = 'X' and minAmt = maxAmt then minAmt end
, AmtY = case when cnt > 1 and t1.Type2 = 1 and t1.Type1 = 'Y' and minAmt = maxAmt then minAmt end
, AcctBadInd = case when cnt > 1 and t1.Type2 = 1 and minAmt <> maxAmt then 1 end
from tbl t1
join cte on t1.Num1 = cte.Num1
and t1.Type2 = cte.Type2

Using window functions
with t as (
select *,
AmtXnew = case when min(ABS(AMT)) over(partition by Num1, Type2) = max(ABS(AMT)) over(partition by Num1, Type2)
and Type2 = 1 and Type1 ='X' then ABS(AMT)
else NULL end,
AmtYnew = case when min(ABS(AMT)) over(partition by Num1, Type2) = max(ABS(AMT)) over(partition by Num1, Type2)
and Type2 = 1 and Type1 ='Y' then ABS(AMT)
else NULL end,
AcctBadIndnew = case when min(ABS(AMT)) over(partition by Num1, Type2) <> max(ABS(AMT)) over(partition by Num1, Type2)
and Type2 = 1 then 1 else NULL end
from tbl
)
update t set AmtX = AmtXnew,AmtY = AmtYnew, AcctBadInd = AcctBadIndnew;
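The rule behind both answers can also be traced outside the database. The following Python sketch applies it to the sample rows from the question; the variable names are illustrative only, and it is meant as a way to check the expected result, not as a replacement for the UPDATE statements.

```python
from collections import defaultdict

# (Num1, Acct, Amt, Type1, Type2) rows from the question.
rows = [
    ("X12", 111, 90, "X", 1),
    ("X12", 222, -90, "X", 1),
    ("X12", 333, 90, "X", 1),
    ("Y33", 111, 75, "Y", 1),
    ("Y33", 444, -75, "Y", 1),
    ("Z44", 111, 55, "Y", 1),
    ("Z44", 111, 55, "Y", 0),
    ("Z44", 444, -65, "Y", 1),
]

# Distinct ABS(Amt) values per Num1, looking only at Type2 = 1 rows.
abs_amounts = defaultdict(set)
for num1, _acct, amt, _type1, type2 in rows:
    if type2 == 1:
        abs_amounts[num1].add(abs(amt))

updated = []
for num1, acct, amt, type1, type2 in rows:
    amt_x = amt_y = bad = None
    if type2 == 1:
        if len(abs_amounts[num1]) == 1:
            # All absolute amounts agree: fill AmtX or AmtY depending on Type1.
            if type1 == "X":
                amt_x = abs(amt)
            elif type1 == "Y":
                amt_y = abs(amt)
        else:
            # Absolute amounts differ: flag the account instead.
            bad = 1
    updated.append((num1, acct, amt, type1, type2, amt_x, amt_y, bad))

for row in updated:
    print(row)
```

Rows with Type2 = 0 are left untouched, which matches the last Z44 row in the expected end result.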

SQL - COMPARING MULTIPLE FIELD IN A TABLE

I want to compare the values of 3 fields in a table.
Example:
column 1 - 10 items
column 2 - 7 items with 3 NULL
column 3 - 4 items with 6 NULL
Can anybody help me?
items column1 column2 column3
1 BK1 NULL BK1
2 RK1 RK1 RK1
3 SK1 SK2 NULL
4 AK1 AK1 AK2
5 CK1 CK2 CK2
6 DK1 NULL NULL
7 EK1 EK1 NULL
8 FK1 NULL NULL
9 GK1 GK1 NULL
10 HK1 NULL NULL
Result
items column1 column2 column3 RESULT
1 BK1 NULL BK1 OK
2 RK1 RK1 RK1 OK
3 SK1 SK2 NULL NOT EQUAL
4 AK1 AK1 AK2 NOT EQUAL
5 CK1 CK2 CK2 NOT EQUAL
6 DK1 NULL NULL OK
7 EK1 EK1 NULL OK
8 FK1 NULL NULL OK
9 GK1 GK1 NULL OK
10 HK1 NULL NULL OK

I hope the following is helpful:
select * from tablename
where (column1 is not null and column2 is not NULL and column3 is not NULL)
and (column1 = column2 or column2 = column3 or column3 = column1 )
Please check the SQLFiddle example:
http://sqlfiddle.com/#!3/fe38f/1

select * from tablename
where
(c1 is null or ((c2 is null or c1 = c2) and (c3 is null or c1 = c3)))
and
(c2 is null or ((c1 is null or c1 = c2) and (c3 is null or c2 = c3)))
and
(c3 is null or ((c1 is null or c1 = c3) and (c2 is null or c2 = c3)))
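To produce the RESULT column from the question rather than only filtering rows, the same NULL-tolerant comparison can be put into a CASE expression. The sketch below runs it in SQLite via Python's sqlite3 module; the table name compare_tbl is hypothetical, and it assumes column1 is never NULL, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the question.
conn.execute("CREATE TABLE compare_tbl (items INT, column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO compare_tbl VALUES (?, ?, ?, ?)",
    [
        (1, "BK1", None, "BK1"),
        (2, "RK1", "RK1", "RK1"),
        (3, "SK1", "SK2", None),
        (4, "AK1", "AK1", "AK2"),
        (5, "CK1", "CK2", "CK2"),
        (6, "DK1", None, None),
        (7, "EK1", "EK1", None),
        (8, "FK1", None, None),
        (9, "GK1", "GK1", None),
        (10, "HK1", None, None),
    ],
)

query = """
SELECT items,
       -- A NULL is ignored; every non-NULL value must equal column1.
       CASE WHEN (column2 IS NULL OR column2 = column1)
             AND (column3 IS NULL OR column3 = column1)
            THEN 'OK' ELSE 'NOT EQUAL' END AS result
FROM compare_tbl
ORDER BY items
"""
results = dict(conn.execute(query).fetchall())
for item, result in results.items():
    print(item, result)
```

With the sample rows, items 3, 4 and 5 come out as NOT EQUAL and the rest as OK, which is the result table shown in the question.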

Conditional UNPIVOT in TSQL

Let's say I have a table with an ID column, and several property columns
MyTable (ID, P1, P2, P3, P4)
ID P1 P2 P3 P4
1 A1 B C1 D1
2 C1 C2 B NULL
3 C2 Z NULL NULL
4 X A1 C1 NULL
I need to write a query that counts how many times each distinct property value occurs, no matter which column it is in.
Value Count
A1 2
B 2
C1 3
C2 2
X1 1
...
I think I can get this by using UNPIVOT (correct me, if I am wrong)
Now, how can I get similar count but grouped by a number of non-null values in the row (the count of non-null values per row may, or may not include key columns, doesn't matter), i.e. output like this:
Value NonNullCount Count
A1 3 1
A1 4 1
B 3 1
B 4 1
C1 2 3
C1 4 1
C2 3 1
C2 2 1
...

Here is one method, using cross apply for the unpivot:
select vals.p, t.NonNullCount, count(*)
from (select t.*,
((case when p1 is not null then 1 else 0 end) +
(case when p2 is not null then 1 else 0 end) +
(case when p3 is not null then 1 else 0 end) +
(case when p4 is not null then 1 else 0 end)
) as NonNullCount
from MyTable t
) t cross apply
(values (p1), (p2), (p3), (p4)) vals(p)
where vals.p is not null
group by vals.p, t.NonNullCount;
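For engines without CROSS APPLY, the same unpivot can be written with UNION ALL. The sketch below is one way to do that, run in SQLite through Python's sqlite3 module against the four sample rows; it is an alternative formulation under that assumption, not the T-SQL from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INT, P1 TEXT, P2 TEXT, P3 TEXT, P4 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A1", "B", "C1", "D1"),
        (2, "C1", "C2", "B", None),
        (3, "C2", "Z", None, None),
        (4, "X", "A1", "C1", None),
    ],
)

query = """
WITH counted AS (
    SELECT P1, P2, P3, P4,
           -- number of non-NULL property columns in the row
           CASE WHEN P1 IS NOT NULL THEN 1 ELSE 0 END +
           CASE WHEN P2 IS NOT NULL THEN 1 ELSE 0 END +
           CASE WHEN P3 IS NOT NULL THEN 1 ELSE 0 END +
           CASE WHEN P4 IS NOT NULL THEN 1 ELSE 0 END AS NonNullCount
    FROM MyTable
),
vals AS (
    -- unpivot: one output row per (row, property column)
    SELECT P1 AS p, NonNullCount FROM counted
    UNION ALL SELECT P2, NonNullCount FROM counted
    UNION ALL SELECT P3, NonNullCount FROM counted
    UNION ALL SELECT P4, NonNullCount FROM counted
)
SELECT p, NonNullCount, COUNT(*) AS cnt
FROM vals
WHERE p IS NOT NULL
GROUP BY p, NonNullCount
ORDER BY p, NonNullCount
"""
unpivoted = conn.execute(query).fetchall()
for row in unpivoted:
    print(row)
```

Each output row is (value, NonNullCount, count), so for example A1 appears once in a row with 3 non-null values and once in a row with 4.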

Using Boolean to determine 5-way Where clause

I'm looking at 5 different columns (db made badly unfortunately). If of the five columns two have one "1" value and one "2" value I want this record to be excluded from the results. However, if it only has one of the two values I want it to be included.
I have this so far, but I'm certain it will not include the record if it has even one of the two values.
NOT ((Ew.DocRecvd1 = 10 OR Ew.DocRecvd1 = 11) OR
(Ew.DocRecvd2 = 10 OR Ew.DocRecvd2 = 11) OR
(Ew.DocRecvd3 = 10 OR Ew.DocRecvd3 = 11) OR
(Ew.DocRecvd4 = 10 OR Ew.DocRecvd4 = 11) OR
(Ew.DocRecvd5 = 10 OR Ew.DocRecvd5 = 11))
Thanks.

I would suggest that you count the number of values in each group that you want. I would do it in a subquery, because that makes the code more readable and maintainable.
Here is an example:
from (select t.*,
((case when t.DocRecvd1 in (10, 11) then 1 else 0 end) +
(case when t.DocRecvd2 in (10, 11) then 1 else 0 end) +
(case when t.DocRecvd3 in (10, 11) then 1 else 0 end) +
(case when t.DocRecvd4 in (10, 11) then 1 else 0 end) +
(case when t.DocRecvd5 in (10, 11) then 1 else 0 end)
) as Num1s,
<something similar> as Num2s
from table t
) t
where Num1s = 2 and Num2s = 1;

You state the filter conditions simply in the where clause. Given a table
create table foobar
(
id int not null primary key ,
c1 int not null ,
c2 int not null ,
c3 int not null ,
c4 int not null ,
c5 int not null ,
)
go
You can say
select *
from foobar
where not ( 2 = case c1 when 1 then 1 else 0 end
+ case c2 when 1 then 1 else 0 end
+ case c3 when 1 then 1 else 0 end
+ case c4 when 1 then 1 else 0 end
+ case c5 when 1 then 1 else 0 end
and 1 = case c1 when 2 then 1 else 0 end
+ case c2 when 2 then 1 else 0 end
+ case c3 when 2 then 1 else 0 end
+ case c4 when 2 then 1 else 0 end
+ case c5 when 2 then 1 else 0 end
)
The other approach, which might run faster, is to use a mask table containing the conditions you want to exclude. Something like this one:
create table mask
(
c1 tinyint null ,
c2 tinyint null ,
c3 tinyint null ,
c4 tinyint null ,
c5 tinyint null ,
unique clustered ( c1,c2,c3,c4,c5) ,
)
In your case, there are only 30 conditions to be excluded:
c1 c2 c3 c4 c5
---- ---- ---- ---- ----
NULL NULL 1 1 2
NULL NULL 1 2 1
NULL NULL 2 1 1
NULL 1 NULL 1 2
NULL 1 NULL 2 1
NULL 1 1 NULL 2
NULL 1 1 2 NULL
NULL 1 2 NULL 1
NULL 1 2 1 NULL
NULL 2 NULL 1 1
NULL 2 1 NULL 1
NULL 2 1 1 NULL
1 NULL NULL 1 2
1 NULL NULL 2 1
1 NULL 1 NULL 2
1 NULL 1 2 NULL
1 NULL 2 NULL 1
1 NULL 2 1 NULL
1 1 NULL NULL 2
1 1 NULL 2 NULL
1 1 2 NULL NULL
1 2 NULL NULL 1
1 2 NULL 1 NULL
1 2 1 NULL NULL
2 NULL NULL 1 1
2 NULL 1 NULL 1
2 NULL 1 1 NULL
2 1 NULL NULL 1
2 1 NULL 1 NULL
2 1 1 NULL NULL
(30 row(s) affected)
The actual query is trivial then (and if you have a covering index on the columns to be tested, the test is done with index seeks and so should perform extremely well):
select *
from dbo.foobar t
where not exists ( select *
from mask m
where t.c1 = m.c1
and t.c2 = m.c2
and t.c3 = m.c3
and t.c4 = m.c4
and t.c5 = m.c5
)
The advantage of this approach is that the ruleset is table-driven, meaning future changes to the rules are just data modifications to your mask table.
You could also use a positive set of rules, but in your case, the set is bigger (>200 positive cases as opposed to the 30 negative cases).
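The 30 mask rows listed above are the distinct arrangements of two 1s, one 2 and two NULLs across the five columns. As a quick check of that count, and as one possible way to generate the rows instead of typing them by hand, here is a short Python sketch using itertools; None stands in for NULL.

```python
from itertools import permutations

# Distinct orderings of the multiset {1, 1, 2, NULL, NULL} over columns c1..c5.
# permutations() treats equal elements as distinct, so a set removes duplicates.
mask_rows = set(permutations((1, 1, 2, None, None)))

print(len(mask_rows))  # 30

# Render each mask row the way it appears in the listing.
for row in mask_rows:
    print(" ".join("NULL" if value is None else str(value) for value in row))
```

The count agrees with 5! / (2! * 1! * 2!) = 30.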

OK, I think I've found the result I wanted.
I used the following in the WHERE clause of my query:
NOT
(2 =
(CASE WHEN Ew.DocRecvd1 = 10 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd2 = 10 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd3 = 10 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd4 = 10 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd5 = 10 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd1 = 11 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd2 = 11 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd3 = 11 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd4 = 11 THEN 1 ELSE 0 END
+
CASE WHEN Ew.DocRecvd5 = 11 THEN 1 ELSE 0 END))
In my DB each of these two documents can appear in only one of the five places within a record, so the count cannot go over 2 for the two documents I'm looking for.
Kudos to Nicholas Carey and Gordon Linoff for keying me into what I could do and look for!
