SQL: how to find a multi-column maximum in a group?

How can I write an SQL query (DB2) that will run on this table:
+---+---+---+----+
| A | B | C | V |
+---+---+---+----+
| 1 | 1 | 1 | k1 |
| 1 | 1 | 2 | k1 |
| 1 | 2 | 3 | k2 |
| 2 | 3 | 4 | k2 |
| 1 | 2 | 3 | k3 |
| 1 | 3 | 5 | k3 |
| 1 | 4 | 6 | k3 |
+---+---+---+----+
and produce this result
+---+---+---+----+
| A | B | C | V |
+---+---+---+----+
| 1 | 1 | 2 | k1 |
| 2 | 3 | 4 | k2 |
| 1 | 4 | 6 | k3 |
+---+---+---+----+
That is, it selects the row with the maximum "tuple" (A, B, C) in each group.
Put differently, for two rows R1 and R2:
if R1.A <> R2.A, return the row where A = Max(R1.A, R2.A)
else if R1.B <> R2.B, return the row where B = Max(R1.B, R2.B)
otherwise return the row where C = Max(R1.C, R2.C)

I think row_number() does what you want -- if by "group" you mean V:
select t.*
from (select t.*,
             row_number() over (partition by v order by a desc, b desc, c desc) as seqnum
      from t
     ) t
where seqnum = 1;
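To check the idea without a DB2 instance, here is a minimal sketch that runs the same query shape against the question's sample rows using Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (needed for window functions); the in-memory database and the `ranked` alias are illustrative choices, not part of the original answer. The cross-check relies on the fact that Python compares tuples lexicographically, which mirrors `order by a desc, b desc, c desc`.

```python
import sqlite3

# Sample rows from the question: (A, B, C, V)
rows = [
    (1, 1, 1, "k1"), (1, 1, 2, "k1"),
    (1, 2, 3, "k2"), (2, 3, 4, "k2"),
    (1, 2, 3, "k3"), (1, 3, 5, "k3"), (1, 4, 6, "k3"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Rank rows inside each group v by (a, b, c) descending and keep rank 1.
query = """
SELECT a, b, c, v
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY v
                                ORDER BY a DESC, b DESC, c DESC) AS seqnum
      FROM t) AS ranked
WHERE seqnum = 1
ORDER BY v
"""
result = conn.execute(query).fetchall()

# Cross-check in plain Python: the largest (a, b, c) tuple per group.
expected = {}
for a, b, c, v in rows:
    expected[v] = max(expected.get(v, (a, b, c)), (a, b, c))

for row in result:
    print(row)
```

If two rows in a group tie on all three columns, `row_number()` still returns only one of them; `rank()` would return both.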

Related

How to insert or update a column using SQL based on sorted number of items for each item group

I have two tables 'Product' and 'product_Desc'
+-----------+-------------+
| ProductID | ProductName |
+-----------+-------------+
| 1 | A |
| 2 | B |
+-----------+-------------+
+----+-----------+-------------+-----------+
| Id | ProductID | ProductDec | SortOrder |
+----+-----------+-------------+-----------+
| 1 | 1 | Aero-pink | |
| 2 | 1 | Aero-white | |
| 3 | 1 | Aero-green | |
| 4 | 1 | Aero-Orange | |
| 5 | 2 | Baloon-1 | |
| 6 | 2 | Baloon-2 | |
| 7 | 2 | Baloon-3 | |
+----+-----------+-------------+-----------+
Now, what is the Sql code that can update 'sortOrder' column sequentially for each group of ProductID as shown below:
+----+-----------+-------------+-----------+
| Id | ProductID | ProductDec | SortOrder |
+----+-----------+-------------+-----------+
| 1 | 1 | Aero-pink | 1 |
| 2 | 1 | Aero-white | 2 |
| 3 | 1 | Aero-green | 3 |
| 4 | 1 | Aero-Orange | 4 |
| 5 | 2 | Baloon-1 | 1 |
| 6 | 2 | Baloon-2 | 2 |
| 7 | 2 | Baloon-3 | 3 |
+----+-----------+-------------+-----------+
Please note that these are sample tables, actual tables have thousands of records.
Would appreciate your help on this. Thank you
with cte
as
(
    select SortOrder,
           row_number() over (partition by ProductID order by Id) as newPerProductOrder
    from product_Desc
)
update cte
set SortOrder = newPerProductOrder
where (SortOrder <> newPerProductOrder or SortOrder is null)
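Updating through a CTE like this is a SQL Server feature. As a rough, portable sketch of the same per-group numbering, the snippet below uses a correlated subquery instead (a swapped-in technique, not the answer's method) and runs it on the sample rows with Python's sqlite3 module. Each row's SortOrder becomes the count of rows with the same ProductID and an Id less than or equal to its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_Desc ("
    "Id INTEGER PRIMARY KEY, ProductID INTEGER, ProductDec TEXT, SortOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO product_Desc (Id, ProductID, ProductDec) VALUES (?, ?, ?)",
    [
        (1, 1, "Aero-pink"), (2, 1, "Aero-white"), (3, 1, "Aero-green"),
        (4, 1, "Aero-Orange"), (5, 2, "Baloon-1"), (6, 2, "Baloon-2"),
        (7, 2, "Baloon-3"),
    ],
)

# Position within the product = number of rows of that product up to this Id.
conn.execute("""
UPDATE product_Desc
SET SortOrder = (SELECT COUNT(*)
                 FROM product_Desc AS p
                 WHERE p.ProductID = product_Desc.ProductID
                   AND p.Id <= product_Desc.Id)
""")

sort_orders = conn.execute(
    "SELECT Id, ProductID, SortOrder FROM product_Desc ORDER BY Id"
).fetchall()
for row in sort_orders:
    print(row)
```

The subquery is evaluated once per row, so on large tables the window-function form is likely to be the better choice where the database supports it.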

Merge groups if they contain the same value

I have the following table:
+-----+----+---------+
| grp | id | sub_grp |
+-----+----+---------+
| 10 | A2 | 1 |
| 10 | B4 | 2 |
| 10 | F1 | 2 |
| 10 | B3 | 3 |
| 10 | C2 | 4 |
| 10 | A2 | 4 |
| 10 | H4 | 5 |
| 10 | K0 | 5 |
| 10 | Z3 | 5 |
| 10 | F1 | 5 |
| 10 | A1 | 5 |
| 10 | A | 6 |
| 10 | B | 6 |
| 10 | B | 7 |
| 10 | C | 7 |
| 10 | C | 8 |
| 10 | D | 8 |
| 20 | A | 1 |
| 20 | B | 1 |
| 20 | B | 2 |
| 20 | C | 2 |
| 20 | C | 3 |
| 20 | D | 3 |
+-----+----+---------+
Within every grp, my goal is to merge all the sub_grp sharing at least one id.
More than 2 sub_grp can be merged together.
The expected result should be:
+-----+----+---------+
| grp | id | sub_grp |
+-----+----+---------+
| 10 | A2 | 1 |
| 10 | B4 | 2 |
| 10 | F1 | 2 |
| 10 | B3 | 3 |
| 10 | C2 | 1 |
| 10 | A2 | 1 |
| 10 | H4 | 2 |
| 10 | K0 | 2 |
| 10 | Z3 | 2 |
| 10 | F1 | 2 |
| 10 | A1 | 2 |
| 10 | A | 6 |
| 10 | B | 6 |
| 10 | B | 6 |
| 10 | C | 6 |
| 10 | C | 6 |
| 10 | D | 6 |
| 20 | A | 1 |
| 20 | B | 1 |
| 20 | B | 1 |
| 20 | C | 1 |
| 20 | C | 1 |
| 20 | D | 1 |
+-----+----+---------+
Here is a SQL Fiddle with the test values: http://sqlfiddle.com/#!9/13666c/2
I am trying to solve this either with a stored procedure or queries.
This is an evolution from my previous problem: Merge rows containing same values
My understanding of the problem
Merge sub_grp (for a given grp) if any one of the IDs in one sub_grp match any one of the IDs in another sub_grp. A given sub_grp can be merged with only one other (the earliest in ascending order) sub_grp.
Disclaimer
This code may work. Not tested as OP did not provide DDLs and data scripts.
Solution
UPDATE t
SET sub_grp = final.new_sub_grp
FROM tbl AS t
INNER JOIN
    -- For each grp, sub_grp combination return a matching new_sub_grp
    ( SELECT a.grp, a.sub_grp, MIN(MatchGrp.sub_grp) AS new_sub_grp
      FROM tbl AS a
      -- CROSS APPLY excludes cases where there is no matching sub_grp and thus nothing to update.
      CROSS APPLY
          -- Find the earliest (if more than one sub-group is a match) matching sub-group where one of the IDs matches
          ( SELECT TOP 1 b.grp, b.sub_grp
            FROM tbl AS b
            -- b.sub_grp < a.sub_grp - this will only look at the earlier sub-groups avoiding the "double linking"
            WHERE b.grp = a.grp AND b.sub_grp < a.sub_grp AND b.id = a.id
            ORDER BY b.sub_grp ) AS MatchGrp
      -- Only return one record per grp, sub_grp combo
      GROUP BY a.grp, a.sub_grp ) AS final
    ON final.grp = t.grp AND final.sub_grp = t.sub_grp
-- Chains (e.g. sub-groups 6, 7, 8 in grp 10) need the statement re-run until no rows change.
You can re-number sub groups afterwards as a separate update statement with the help of DENSE_RANK window function.
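Merging sub-groups that share an id is a connected-components problem, so a single pass of pairwise matching is not enough when the links form a chain. The sketch below is not the answer's SQL; it is a small union-find in Python, run on the question's rows, that labels every component with its smallest sub_grp. The function name `merge_sub_groups` and its helpers are hypothetical.

```python
rows = [  # (grp, id, sub_grp) from the question
    (10, "A2", 1), (10, "B4", 2), (10, "F1", 2), (10, "B3", 3),
    (10, "C2", 4), (10, "A2", 4), (10, "H4", 5), (10, "K0", 5),
    (10, "Z3", 5), (10, "F1", 5), (10, "A1", 5), (10, "A", 6),
    (10, "B", 6), (10, "B", 7), (10, "C", 7), (10, "C", 8),
    (10, "D", 8), (20, "A", 1), (20, "B", 1), (20, "B", 2),
    (20, "C", 2), (20, "C", 3), (20, "D", 3),
]


def merge_sub_groups(rows):
    """Relabel each row with the smallest sub_grp of its connected component."""
    parent = {}  # (grp, sub_grp) -> parent node; roots point to themselves

    def find(node):
        # Follow parent links to the root, halving the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            # Attach the larger root under the smaller one, so the root
            # of a component is always its smallest sub_grp.
            parent[max(rx, ry)] = min(rx, ry)

    first_seen = {}  # (grp, id) -> first sub-group node containing that id
    for grp, id_, sub in rows:
        node = (grp, sub)
        parent.setdefault(node, node)
        key = (grp, id_)
        if key in first_seen:
            union(first_seen[key], node)
        else:
            first_seen[key] = node

    return [(grp, id_, find((grp, sub))[1]) for grp, id_, sub in rows]


merged = merge_sub_groups(rows)
for row in merged:
    print(row)
```

In a pure SQL setting the same effect can be reached by repeating an update like the one above until it changes no rows.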

Avoid repeated values in a join

I have two tables - a header and a matrix/details.
*Header Table*
+----+--------+-----+
| ID | Parent | Qty |
+----+--------+-----+
| 1 | A | 10 |
| 2 | B | 20 |
| 3 | C | 30 |
+----+--------+-----+
*Matrix / Details Table*
+----+--------+------+
| ID | Child | Qty |
+----+--------+------+
| 1 | X | 100 |
| 1 | Y | 1000 |
| 2 | X | 200 |
| 2 | Y | 2000 |
| 3 | X | 30 |
| 3 | Y | 300 |
| 3 | Z | 3000 |
+----+--------+------+
I'm joining these two tables on ID.
I don't want the result to repeat the values from the header table.
I expect a result like the following:
*Current Result* *Expected Result*
+----+--------+-----+-------+------+ +----+--------+-----+-------+------+
| ID | Parent | Qty | Child | Qty | | ID | Parent | Qty | Child | Qty |
+----+--------+-----+-------+------+ +----+--------+-----+-------+------+
| 1 | A | 10 | X | 100 | | 1 | A | 10 | X | 100 |
| 1 | A | 10 | Y | 1000 | | | | | Y | 1000 |
| 2 | B | 20 | X | 200 | | 2 | B | 20 | X | 200 |
| 2 | B | 20 | Y | 2000 | | | | | Y | 2000 |
| 3 | C | 30 | X | 30 | | 3 | C | 30 | X | 30 |
| 3 | C | 30 | Y | 300 | | | | | Y | 300 |
| 3 | C | 30 | Z | 3000 | | | | | Z | 3000 |
+----+--------+-----+-------+------+ +----+--------+-----+-------+------+
Is this possible? If not, is there an alternative solution?
Thanks in advance...
If you are using SQL Server, try the query below.
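;WITH CTE_1
AS
-- List the columns explicitly: both tables have ID and Qty, and a CTE cannot expose duplicate column names.
(SELECT H.ID, H.Parent, H.Qty AS Quantity, M.Child, M.Qty,
        ROW_NUMBER() OVER (PARTITION BY H.ID ORDER BY M.Child) AS RNO
 FROM Header H
 JOIN [Matrix / Details] M
   ON H.ID = M.ID)
SELECT CASE WHEN RNO = 1 THEN CAST(C.ID AS VARCHAR(50)) ELSE '' END AS ID,
       CASE WHEN RNO = 1 THEN C.Parent ELSE '' END AS Parent,
       CASE WHEN RNO = 1 THEN CAST(C.Quantity AS VARCHAR(50)) ELSE '' END AS Quantity,
       C.Child, C.Qty
FROM CTE_1 C
-- Sort on the underlying columns, not on the blanked-out aliases.
ORDER BY C.ID, C.RNO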
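For readers without SQL Server, the following is a small sketch of the same blank-out-the-repeats idea, run on the question's data with Python's sqlite3 module (SQLite 3.25 or newer is assumed for `ROW_NUMBER()`). The table name `Details` and the `shown_*` and `numbered` aliases are placeholders invented for this example. Header values are shown only on the first detail row of each ID; ordering uses the real ID and the row number, not the blanked-out columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Header  (ID INTEGER, Parent TEXT, Qty INTEGER);
CREATE TABLE Details (ID INTEGER, Child TEXT, Qty INTEGER);
INSERT INTO Header  VALUES (1, 'A', 10), (2, 'B', 20), (3, 'C', 30);
INSERT INTO Details VALUES (1, 'X', 100), (1, 'Y', 1000),
                           (2, 'X', 200), (2, 'Y', 2000),
                           (3, 'X', 30),  (3, 'Y', 300), (3, 'Z', 3000);
""")

query = """
WITH numbered AS (
    SELECT h.ID AS id, h.Parent AS parent, h.Qty AS parent_qty,
           d.Child AS child, d.Qty AS child_qty,
           ROW_NUMBER() OVER (PARTITION BY h.ID ORDER BY d.Child) AS rno
    FROM Header AS h
    JOIN Details AS d ON d.ID = h.ID
)
SELECT CASE WHEN rno = 1 THEN CAST(id AS TEXT) ELSE '' END         AS shown_id,
       CASE WHEN rno = 1 THEN parent ELSE '' END                   AS shown_parent,
       CASE WHEN rno = 1 THEN CAST(parent_qty AS TEXT) ELSE '' END AS shown_qty,
       child, child_qty
FROM numbered
ORDER BY id, rno
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

This kind of formatting is often easier to do in the reporting or presentation layer, since the blanked rows only make sense in exactly this sort order.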

How to aggregate column on changing criteria in SQL (multiple SUMIFS)

Consider the following simplified example:
Table JobTitles
| PersonID | JobTitle | StartDate | EndDate |
|----------|----------|-----------|---------|
| A | A1 | 1 | 5 |
| A | A2 | 6 | 10 |
| A | A3 | 11 | 15 |
| B | B1 | 2 | 4 |
| B | B2 | 5 | 7 |
| B | B3 | 8 | 11 |
| C | C1 | 5 | 12 |
| C | C2 | 13 | 14 |
| C | C3 | 15 | 18 |
Table Transactions:
| PersonID | TransDate | Amt |
|----------|-----------|-----|
| A | 2 | 5 |
| A | 3 | 10 |
| A | 12 | 5 |
| A | 12 | 10 |
| B | 3 | 5 |
| B | 3 | 10 |
| B | 10 | 5 |
| C | 16 | 10 |
| C | 17 | 5 |
| C | 17 | 10 |
| C | 17 | 5 |
Desired Output:
| PersonID | JobTitle | StartDate | EndDate | Amt |
|----------|----------|-----------|---------|-----|
| A | A1 | 1 | 5 | 15 |
| A | A2 | 6 | 10 | 0 |
| A | A3 | 11 | 15 | 15 |
| B | B1 | 2 | 4 | 15 |
| B | B2 | 5 | 7 | 0 |
| B | B3 | 8 | 11 | 5 |
| C | C1 | 5 | 12 | 0 |
| C | C2 | 13 | 14 | 0 |
| C | C3 | 15 | 18 | 30 |
To me this is JobTitles LEFT OUTER JOIN Transactions with some type of moving criteria for the TransDate -- that is, I want to SUM Transaction.Amt if Transactions.TransDate is between JobTitles.StartDate and JobTitles.EndDate per each PersonID.
Feels like some type of partition or window function, but my SQL skills are not strong enough to create an elegant solution. In Excel, this equates to:
SUMIFS(Transaction[Amt], JobTitles[PersonID], Results[#[PersonID]], Transactions[TransDate], ">" & Results[#[StartDate]], Transactions[TransDate], "<=" & Results[#[EndDate]])
Moreover, I want to be able to perform this same logic over several flavors of Transaction tables.
The basic query is:
select jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate,
       coalesce(sum(t.Amt), 0) as amt
from JobTitles jt
left join Transactions t
       on jt.PersonId = t.PersonId
      and t.TransDate between jt.StartDate and jt.EndDate
group by jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate;
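As a quick check, this sketch loads the two sample tables into an in-memory SQLite database through Python's sqlite3 module and runs the same left join with the date range in the join condition. The added `ORDER BY` is only there to make the output order predictable. Putting the range test in the `ON` clause (rather than in `WHERE`) is what keeps job titles with no transactions in the result, where `COALESCE` turns their NULL sum into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JobTitles (PersonID TEXT, JobTitle TEXT, StartDate INTEGER, EndDate INTEGER);
CREATE TABLE Transactions (PersonID TEXT, TransDate INTEGER, Amt INTEGER);
INSERT INTO JobTitles VALUES
    ('A', 'A1', 1, 5),  ('A', 'A2', 6, 10),  ('A', 'A3', 11, 15),
    ('B', 'B1', 2, 4),  ('B', 'B2', 5, 7),   ('B', 'B3', 8, 11),
    ('C', 'C1', 5, 12), ('C', 'C2', 13, 14), ('C', 'C3', 15, 18);
INSERT INTO Transactions VALUES
    ('A', 2, 5),   ('A', 3, 10), ('A', 12, 5),  ('A', 12, 10),
    ('B', 3, 5),   ('B', 3, 10), ('B', 10, 5),
    ('C', 16, 10), ('C', 17, 5), ('C', 17, 10), ('C', 17, 5);
""")

totals = conn.execute("""
SELECT jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate,
       COALESCE(SUM(t.Amt), 0) AS Amt
FROM JobTitles AS jt
LEFT JOIN Transactions AS t
       ON jt.PersonID = t.PersonID
      AND t.TransDate BETWEEN jt.StartDate AND jt.EndDate
GROUP BY jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate
ORDER BY jt.PersonID, jt.StartDate
""").fetchall()

for row in totals:
    print(row)
```

Note that `BETWEEN` is inclusive at both ends, while the Excel formula in the question uses a strict `>` on the start date; adjust the comparison if the start date should be excluded.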

how to write a query to get multilevel data

I have four tables as below:
tblAccount
Id is primary key.
+----+-----------------+
| Id | AccName |
+----+-----------------+
| 1 | AccountA |
| 2 | AccountB |
+----+-----------------+
tblLocation
Id is primary key.
+----+---------------+
| Id | LocName |
+----+---------------+
| 1 | LocationA |
| 2 | LocationB |
| 3 | LocationC |
+----+---------------+
tblAccountwiseLocation
Id is primary key. LocId and AccId are foreign keys.
+----+---------------+---------------+
| Id | LocId | AccId |
+----+---------------+---------------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 1 | 2 |
| 5 | 2 | 2 |
| 6 | 3 | 2 |
+----+---------------+---------------+
tblRSCMaster
Id is primary key. LocId and AccId are foreign keys.
+----+---------------+---------------+----------------+------------------+
| Id | LocId | AccId | RSCNo | DateOfAddition |
+----+---------------+---------------+----------------+------------------+
| 1 | 1 | 1 | Acc1_Loc1_1_14 | 15/01/2014 |
| 2 | 2 | 1 | Acc1_Loc2_1_14 | 15/01/2014 |
| 3 | 3 | 1 | Acc1_Loc2_1_14 | 15/01/2014 |
| 4 | 1 | 2 | Acc2_Loc1_1_14 | 15/01/2014 |
| 5 | 2 | 2 | Acc2_Loc2_1_14 | 15/01/2014 |
| 6 | 3 | 2 | Acc2_Loc3_1_14 | 15/01/2014 |
| 7 | 1 | 1 | Acc1_Loc1_2_14 | 15/02/2014 |
| 8 | 2 | 1 | Acc1_Loc2_2_14 | 15/02/2014 |
| 9 | 3 | 1 | Acc1_Loc3_2_14 | 15/02/2014 |
| 10 | 1 | 2 | Acc2_Loc1_2_14 | 15/02/2014 |
| 11 | 2 | 2 | Acc2_Loc2_2_14 | 15/02/2014 |
| 12 | 3 | 2 | Acc2_Loc3_2_14 | 15/02/2014 |
| 13 | 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 14 | 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 15 | 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 16 | 1 | 2 | Acc2_Loc1_3_14 | 15/03/2014 |
| 17 | 2 | 2 | Acc2_Loc2_3_14 | 15/03/2014 |
| 18 | 3 | 2 | Acc2_Loc3_3_14 | 15/03/2014 |
| 19 | 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 20 | 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 21 | 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 22 | 1 | 2 | Acc2_Loc1_4_14 | 15/04/2014 |
| 23 | 2 | 2 | Acc2_Loc2_4_14 | 15/04/2014 |
| 24 | 3 | 2 | Acc2_Loc3_4_14 | 15/04/2014 |
| 25 | 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 26 | 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 27 | 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
| 28 | 1 | 2 | Acc2_Loc1_5_14 | 15/05/2014 |
| 29 | 2 | 2 | Acc2_Loc2_5_14 | 15/05/2014 |
| 30 | 3 | 2 | Acc2_Loc3_5_14 | 15/05/2014 |
+----+---------------+---------------+----------------+------------------+
Acc1_Loc1_1_14 represents the RSC for LocationA of AccountA for Jan 2014.
I need to get an output as below from tblRSCMaster.
+---------------+---------------+----------------+------------------+
| LocId | AccId | RSCNo | DateOfAddition |
+---------------+---------------+----------------+------------------+
| 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
+---------------+---------------+----------------+------------------+
Each account has multiple locations and each location has multiple RSCs.
I need to get last three RSCs for each location for AccountA.
I have tried the below query:
SELECT tblAccountwiseLocation.LocId,tblAccountwiseLocation.AccId,tblRSCMaster.RSCNo,tblRSCMaster.DateOfAddition FROM tblAccountwiseLocation
INNER JOIN tblRSCMaster ON tblAccountwiseLocation.LocId= tblRSCMaster.LocId
where tblRSCMaster.AccId=1
But I am not getting the proper output.
Please help me out.
Thank you all in advance.
You can wrap the existing query inside a common table expression, and use ROW_NUMBER() to get only the last 3 (by tblRSCMaster.DateOfAddition) entries per tblAccountwiseLocation.LocId.
WITH cte AS (
SELECT tblAccountwiseLocation.LocId,
tblAccountwiseLocation.AccId,
tblRSCMaster.RSCNo,
tblRSCMaster.DateOfAddition,
ROW_NUMBER() OVER (PARTITION BY tblAccountwiseLocation.LocId
ORDER BY tblRSCMaster.DateOfAddition DESC) rn
FROM tblAccountwiseLocation
INNER JOIN tblRSCMaster
ON tblAccountwiseLocation.LocId = tblRSCMaster.LocId
AND tblAccountwiseLocation.AccId = tblRSCMaster.AccId
WHERE tblRSCMaster.AccId=1
)
SELECT LocId, AccId, RSCNo, DateOfAddition
FROM cte
WHERE rn <= 3
ORDER BY LocId, AccId, DateOfAddition
An SQLfiddle to test with.
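If no fiddle is at hand, the query can be tried locally with a sketch like the one below, which uses Python's sqlite3 module (SQLite 3.25 or newer assumed for `ROW_NUMBER()`). The rows are regenerated from the naming pattern in the question rather than copied one by one, and dates are stored as ISO `YYYY-MM-DD` text so that they sort chronologically; both are assumptions of this example, as are the `al` and `m` aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblAccountwiseLocation ("
    "Id INTEGER PRIMARY KEY, LocId INTEGER, AccId INTEGER)"
)
conn.execute(
    "CREATE TABLE tblRSCMaster ("
    "Id INTEGER PRIMARY KEY, LocId INTEGER, AccId INTEGER, "
    "RSCNo TEXT, DateOfAddition TEXT)"
)

# Every account (1, 2) has every location (1, 2, 3).
pairs = [(loc, acc) for acc in (1, 2) for loc in (1, 2, 3)]
conn.executemany(
    "INSERT INTO tblAccountwiseLocation (LocId, AccId) VALUES (?, ?)", pairs
)

# One RSC per location and account for each month from Jan to May 2014.
for month in range(1, 6):
    for loc, acc in pairs:
        conn.execute(
            "INSERT INTO tblRSCMaster (LocId, AccId, RSCNo, DateOfAddition) "
            "VALUES (?, ?, ?, ?)",
            (loc, acc, f"Acc{acc}_Loc{loc}_{month}_14", f"2014-{month:02d}-15"),
        )

# Number the RSCs of each location from newest to oldest, keep the first 3.
last_three = conn.execute("""
WITH cte AS (
    SELECT al.LocId, al.AccId, m.RSCNo, m.DateOfAddition,
           ROW_NUMBER() OVER (PARTITION BY al.LocId
                              ORDER BY m.DateOfAddition DESC) AS rn
    FROM tblAccountwiseLocation AS al
    JOIN tblRSCMaster AS m
      ON al.LocId = m.LocId AND al.AccId = m.AccId
    WHERE m.AccId = 1
)
SELECT LocId, AccId, RSCNo, DateOfAddition
FROM cte
WHERE rn <= 3
ORDER BY LocId, DateOfAddition
""").fetchall()

for row in last_three:
    print(row)
```

If DateOfAddition is stored as `dd/mm/yyyy` text in the real table, it would need converting to a date type before ordering, otherwise the sort is alphabetical.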
Is this what you need?
select m.*
from (select m.*, row_number() over (partition by m.AccId
                                     order by m.DateOfAddition desc) as seqnum
      from tblRSCMaster m
      where m.LocId = 1
     ) m
where seqnum <= 3
order by AccId, DateOfAddition;
I think you need to filter on the locid rather than on the AccId to get what you want.