Finding row of "duplicate" data with greatest value

I have a table setup as follows:
Key   Code   Date
------------------
5     2      2018
5     1      2017
8     1      2018
8     2      2017
I need to retrieve only the key and code where:
Code=2 AND Date > the other record's date
So based on this data above, I need to retrieve:
Key 5 with code=2
Key 8 does not meet the criteria since code 2's date is lower than code 1's date
I tried joining the table to itself, but this returned incorrect data:
Select d1.key, d1.code
from data d1
Join data d2 on d1.key = d2.key
Where d1.code = 2 and d1.date > d2.date
This query returned rows with incorrect values.

Perhaps you want this:
select d.*
from data d
where d.code = 2 and
d.date > (select d2.date
from data d2
where d2.key = d.key and d2.code = 1
);
If you just want the key, I would go for aggregation:
select d.key
from data d
group by d.key
having max(case when d.code = 2 then d.date end) > max(case when d.code <> 2 then d.date end);
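
To check the aggregation approach against the sample data, here is a minimal sketch that runs the same HAVING logic through Python's built-in sqlite3 module. The function name keys_where_code2_is_latest and the in-memory table are assumptions made for illustration; only the query shape comes from the answer above.

```python
import sqlite3


def keys_where_code2_is_latest(rows):
    """Return keys whose code-2 row is dated later than every other row for that key."""
    conn = sqlite3.connect(":memory:")
    # "key" and "date" are quoted because they are keywords in several SQL dialects.
    conn.execute('CREATE TABLE data ("key" INTEGER, code INTEGER, "date" INTEGER)')
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    query = '''
        SELECT d."key"
        FROM data AS d
        GROUP BY d."key"
        HAVING MAX(CASE WHEN d.code = 2 THEN d."date" END)
             > MAX(CASE WHEN d.code <> 2 THEN d."date" END)
        ORDER BY d."key"
    '''
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


# Sample data from the question: (key, code, date)
sample = [(5, 2, 2018), (5, 1, 2017), (8, 1, 2018), (8, 2, 2017)]
print(keys_where_code2_is_latest(sample))  # prints [5]
```

A key that has only a code-2 row is not returned, because the second MAX is NULL and the comparison is then not true.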

Use ROW_NUMBER to number each key's rows by date in ascending order. With two rows per key, as in your sample data, row 2 is the later one, so keep it only when its code is 2:
DECLARE @table TABLE ([key] INT, code INT, DATE INT)
INSERT @table
SELECT 5, 2, 2018
UNION ALL
SELECT 5, 1, 2017
UNION ALL
SELECT 8, 1, 2018
UNION ALL
SELECT 8, 2, 2017
SELECT [key], code, DATE
FROM (
SELECT [key], code, DATE, ROW_NUMBER() OVER (
PARTITION BY [key] ORDER BY DATE
) rn
FROM @table
) x
WHERE rn = 2 AND code = 2

Related

Is there a way to use parameters in subquery with groupby and having in sql query?

I am trying to rank the rows for each alphabet and then select a given rank in the HAVING clause, using parameters.
Example:
Here is my table data:
Alphabet Date
A 2019-12-19 12:31:43.633
A 2019-12-19 12:31:43.650
B 2019-11-07 11:37:08.560
select *
from (select alphabet, date,
dense_Rank() over (partition by alphabet order by date) RankOrder
from mytable
) A
group by alphabet, date, RankOrder
having RankOrder = 1
By using the above query here is my Result:
Alphabet Date Rank Order
A 2019-12-19 12:31:43.633 1
B 2019-11-07 11:37:08.560 1
What if I had to do this for multiple alphabets using parameters, declared like this?
declare @palphabet int='A', @pyear nvarchar(20)=2019
How can I add the parameters to the above query?

You can add a WHERE clause:
select a.*
from (select alphabet, date, dense_rank() over (partition by alphabet order by date) RankOrder
from mytable mt
where mt.Alphabet = @palphabet and year(mt.date) = @pyear
) a
where RankOrder = 1;
I don't think GROUP BY with a HAVING clause is required here: you have raw data rather than aggregated data, and where RankOrder = 1 already returns only the top-ranked row for each alphabet.
Note: @palphabet can't be an INT because the base table contains text values, so change the type of that variable.

I am not totally clear what you want, but I suspect you are asking how to reduce the rows returned to a specific alphabet and a specific year. Something like this should work for you.
declare @mytable table (Alphabet char(1), MyDate Datetime)
insert @mytable values
('A', '2019-12-19 12:31:43.633')
, ('A', '2019-12-19 12:31:43.650')
, ('B', '2019-11-07 11:37:08.560')
declare @Alphabet char(1) = 'A'
, @Year int = 2019
select *
from
(
select t.Alphabet
, t.MyDate
, RankOrder = dense_Rank() over (partition by t.Alphabet order by t.MyDate)
from @mytable t
--filter the rows here instead of in the final select statement
where t.Alphabet = @Alphabet
and t.MyDate >= convert(char(4), @Year) + '0101'
--strictly less than, so midnight on 1 January of the next year is excluded
and t.MyDate < convert(char(4), @Year + 1) + '0101'
) A
where A.RankOrder = 1
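
As a rough, runnable illustration of filtering inside the derived table with parameters, the sketch below uses Python's sqlite3 module with named placeholders in place of T-SQL variables. The function name first_ranked_row, the placeholder names and the text-typed date column are assumptions for the sake of the example, and DENSE_RANK needs SQLite 3.25 or later.

```python
import sqlite3


def first_ranked_row(rows, letter, year):
    """Return the rank-1 row(s) for one alphabet within one calendar year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (alphabet TEXT, mydate TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    query = """
        SELECT alphabet, mydate
        FROM (
            SELECT alphabet, mydate,
                   DENSE_RANK() OVER (PARTITION BY alphabet ORDER BY mydate) AS rank_order
            FROM mytable
            -- filter before ranking, using a half-open date range
            WHERE alphabet = :letter
              AND mydate >= :range_start
              AND mydate <  :range_end
        ) AS ranked
        WHERE rank_order = 1
    """
    params = {
        "letter": letter,
        # ISO-formatted text compares in chronological order
        "range_start": f"{year}-01-01",
        "range_end": f"{year + 1}-01-01",
    }
    result = conn.execute(query, params).fetchall()
    conn.close()
    return result


sample_rows = [
    ("A", "2019-12-19 12:31:43.633"),
    ("A", "2019-12-19 12:31:43.650"),
    ("B", "2019-11-07 11:37:08.560"),
]
print(first_ranked_row(sample_rows, "A", 2019))
# prints [('A', '2019-12-19 12:31:43.633')]
```

Ordering by the date ascending makes rank 1 the earliest row; adding DESC to the ORDER BY would make rank 1 the most recent one instead.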

I want to get the most recent date and time from the above query; that is the reason I used rank, filtering on rank 1, which is the most recent. Also, I used GROUP BY to group my rank according to the alphabet:
Using Group by:
A 2019-12-19 12:31:43.633 1
A 2019-12-19 12:31:43.650 2
A 2019-12-19 12:31:43.667 3
B 2019-11-07 11:37:08.560 1
B 2019-11-07 11:37:08.577 2
Having is allowing me to select only the rank with 1 which is what I need.
A 2019-12-19 12:31:43.633 1
B 2019-11-07 11:37:08.560 1
C 2019-10-30 15:06:36.643 1
D 2019-11-05 16:16:17.920 1
What if I had to do the same using parameters? The problem is that my parameter is in the format @date='F2019/2020', while my date format is 2019-12-19 12:31:43.633. How do I select a particular alphabet and the most recent date for that alphabet using a parameter?

T-SQL - Copying & Transposing Data

I'm trying to copy data from one table to another, while transposing it and combining it into appropriate rows, with different columns in the second table.
First time posting. Yes this may seem simple to everyone here. I have tried for a couple hours to solve this. I do not have much support internally and have learned a great deal on this forum and managed to get so much accomplished with your other help examples. I appreciate any help with this.
Table 1 has the data in this format.
Type Date Value
--------------------
First 2019 1
First 2020 2
Second 2019 3
Second 2020 4
Table 2 already has the Date rows populated and columns created. It is waiting for the Values from Table 1 to be placed in the appropriate column/row.
Date First Second
------------------
2019 1 3
2020 2 4

For an update, I might use two joins:
update t2
set first = tf.value,
second = ts.value
from table2 t2 left join
table1 tf
on t2.date = tf.date and tf.type = 'First' left join
table1 ts
on t2.date = ts.date and ts.type = 'Second'
where tf.date is not null or ts.date is not null;

Use conditional aggregation:
select date, max(case when type='First' then value end) as First,
max(case when type='Second' then value end) as Second
from table1
group by date

You can do conditional aggregation:
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date;
After that, you can use a CTE to update the second table:
with cte as (
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date
)
update t2
set t2.First = t1.First,
t2.Second = t1.Second
from table2 t2 inner join
cte t1
on t1.date = t2.date;
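
For anyone who wants to try the pivot-then-update idea without a SQL Server instance, here is a small sketch using Python's sqlite3 module. It is an adaptation rather than the T-SQL above: the UPDATE uses correlated subqueries instead of joining to a CTE, and the function name pivot_into_table2 is made up for this example.

```python
import sqlite3


def pivot_into_table2():
    """Fill table2's First/Second columns from table1 using conditional aggregation."""
    conn = sqlite3.connect(":memory:")
    # Column names are quoted because several of them are SQL keywords.
    conn.executescript('''
        CREATE TABLE table1 ("type" TEXT, "date" INTEGER, "value" INTEGER);
        CREATE TABLE table2 ("date" INTEGER, "first" INTEGER, "second" INTEGER);
        INSERT INTO table1 VALUES ('First', 2019, 1), ('First', 2020, 2),
                                  ('Second', 2019, 3), ('Second', 2020, 4);
        INSERT INTO table2 ("date") VALUES (2019), (2020);
    ''')
    # Each subquery picks the value of one type for the row's date.
    conn.execute('''
        UPDATE table2
        SET "first" = (SELECT MAX(CASE WHEN t1."type" = 'First' THEN t1."value" END)
                       FROM table1 AS t1
                       WHERE t1."date" = table2."date"),
            "second" = (SELECT MAX(CASE WHEN t1."type" = 'Second' THEN t1."value" END)
                        FROM table1 AS t1
                        WHERE t1."date" = table2."date")
    ''')
    rows = conn.execute(
        'SELECT "date", "first", "second" FROM table2 ORDER BY "date"'
    ).fetchall()
    conn.close()
    return rows


print(pivot_into_table2())  # prints [(2019, 1, 3), (2020, 2, 4)]
```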

Seems like you're after a PIVOT
DECLARE @Table1 TABLE
(
[Type] NVARCHAR(100)
, [Date] INT
, [Value] INT
);
DECLARE @Table2 TABLE(
[Date] int
,[First] int
,[Second] int
)
INSERT INTO @Table1 (
[Type]
, [Date]
, [Value]
)
VALUES ( 'First', 2019, 1 )
, ( 'First', 2020, 2 )
, ( 'Second', 2019, 3 )
, ( 'Second', 2020, 4 );
INSERT INTO @Table2 (
[Date]
)
VALUES (2019),(2020)
--Show us what's in the tables
SELECT * FROM @Table1
SELECT * FROM @Table2
--How to pivot the data from Table 1
SELECT * FROM @Table1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make a column for each [Type] value listed here
) AS [pvt] --Table alias
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
--Using that we can update @Table2
UPDATE [tbl2]
SET [tbl2].[First] = pvt.[First]
,[tbl2].[Second] = pvt.[Second]
FROM @Table1 tbl1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make a column for each [Type] value listed here
) AS [pvt] --Table alias
INNER JOIN @Table2 tbl2 ON [tbl2].[Date] = [pvt].[Date]
--Results from @Table2 after the update
SELECT * FROM @Table2
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4

Update Temp Table with Group By

Getting this error
Msg 157, Level 15, State 1, Line 20
An aggregate may not appear in the set list of an UPDATE statement.
My UPDATE statement:
UPDATE #Results
SET CustomerName = dbo.GetCustomerNameByCustomerId(CustomerId),
TotalIncremental = Sum(IncrementalDollarsDebitCredit),
TotalDeficiency = 0
FROM
IncrementalCreditHeader ICH
INNER JOIN
IncrementalCreditHistory IC ON IC.IncrementalCreditID = ICH.IncrementalCreditID
WHERE
IC.BillingPeriodStartDate < = '2015-07-01 00:00:00.000'
AND ICH.ARCreatedFlag = 'Y' AND ICH.ActiveFlag = 1

In general, SQL Server lets you update through a common table expression, so you can compute the total with a window function and assign it, like this:
create table temp (client_id int, amount int, total_amount int)
insert into temp (client_id, amount)
select 1, 10 union all
select 1, 15 union all
select 2, 5 union all
select 2, 7 union all
select 2, 15
;with cte as (
select total_amount, sum(amount) over(partition by client_id) as new_total_amount
from temp
)
update cte set
total_amount = new_total_amount
--------------------------------
client_id amount total_amount
1 10 25
1 15 25
2 5 27
2 7 27
2 15 27
In your question, however, you have to add the #Results table to your update query; otherwise it's not clear to SQL Server which rows to update.
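
The per-client totals above can be reproduced with the sketch below, which uses Python's sqlite3 module. Two things differ from the T-SQL and are assumptions of this example: the table is called client_amounts, and the UPDATE goes through a correlated subquery because SQLite cannot update through a CTE. The window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client_amounts (client_id INTEGER, amount INTEGER, total_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO client_amounts (client_id, amount) VALUES (?, ?)",
    [(1, 10), (1, 15), (2, 5), (2, 7), (2, 15)],
)

# The window function repeats each client's total on every one of its rows.
windowed = conn.execute(
    """
    SELECT client_id, amount, SUM(amount) OVER (PARTITION BY client_id)
    FROM client_amounts
    ORDER BY client_id, amount
    """
).fetchall()

# Store the same totals; the subquery is correlated on client_id.
conn.execute(
    """
    UPDATE client_amounts
    SET total_amount = (SELECT SUM(t.amount)
                        FROM client_amounts AS t
                        WHERE t.client_id = client_amounts.client_id)
    """
)
updated = conn.execute(
    "SELECT client_id, amount, total_amount FROM client_amounts ORDER BY client_id, amount"
).fetchall()
conn.close()

print(updated)
# prints [(1, 10, 25), (1, 15, 25), (2, 5, 27), (2, 7, 27), (2, 15, 27)]
```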

How can I Pivot a table in DB2? [duplicate]

I have table A, below, where for each unique id, there are three codes with some value.
ID Code Value
---------------------
11 1 x
11 2 y
11 3 z
12 1 p
12 2 q
12 3 r
13 1 l
13 2 m
13 3 n
I have a second table B with format as below:
Id Code1_Val Code2_Val Code3_Val
Here there is just one row for each unique id. I want to populate this second table B from first table A for each id from the first table.
For the first table A above, the second table B should come out as:
Id Code1_Val Code2_Val Code3_Val
---------------------------------------------
11 x y z
12 p q r
13 l m n
How can I achieve this in a single SQL query?

select Id,
max(case when Code = '1' then Value end) as Code1_Val,
max(case when Code = '2' then Value end) as Code2_Val,
max(case when Code = '3' then Value end) as Code3_Val
from TABLEA
group by Id
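
Here is a hedged sketch of that MAX(CASE ...) pivot feeding an INSERT, run through Python's sqlite3 module rather than DB2. The function name fill_table_b and the extra row for id 14 are hypothetical additions; the extra row is there to show that an id with a missing code still gets a row, with NULL in the missing columns.

```python
import sqlite3


def fill_table_b(rows):
    """Pivot (id, code, value) rows from A into one row per id in B."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id INTEGER, code INTEGER, value TEXT)")
    conn.execute(
        "CREATE TABLE B (id INTEGER, code1_val TEXT, code2_val TEXT, code3_val TEXT)"
    )
    conn.executemany("INSERT INTO A VALUES (?, ?, ?)", rows)
    # CASE yields NULL for non-matching codes; MAX ignores NULLs within each id.
    conn.execute(
        """
        INSERT INTO B (id, code1_val, code2_val, code3_val)
        SELECT id,
               MAX(CASE WHEN code = 1 THEN value END),
               MAX(CASE WHEN code = 2 THEN value END),
               MAX(CASE WHEN code = 3 THEN value END)
        FROM A
        GROUP BY id
        """
    )
    result = conn.execute("SELECT * FROM B ORDER BY id").fetchall()
    conn.close()
    return result


table_a = [
    (11, 1, "x"), (11, 2, "y"), (11, 3, "z"),
    (12, 1, "p"), (12, 2, "q"), (12, 3, "r"),
    (13, 1, "l"), (13, 2, "m"), (13, 3, "n"),
    (14, 1, "k"),  # hypothetical id with only one code, not in the question
]
for row in fill_table_b(table_a):
    print(row)
```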

SELECT Id,
max(DECODE(Code, 1, Value)) AS Code1_Val,
max(DECODE(Code, 2, Value)) AS Code2_Val,
max(DECODE(Code, 3, Value)) AS Code3_Val
FROM A
group by Id

If your version doesn't have DECODE(), you can also use this:
INSERT INTO B (id, code1_val, code2_val, code3_val)
WITH Ids (id) as (SELECT DISTINCT id
FROM A) -- Only to construct list of ids
SELECT Ids.id, a1.value, a2.value, a3.value
FROM Ids -- or substitute the actual id table
JOIN A a1
ON a1.id = ids.id
AND a1.code = 1
JOIN A a2
ON a2.id = ids.id
AND a2.code = 2
JOIN A a3
ON a3.id = ids.id
AND a3.code = 3
(Works on my V6R1 DB2 instance.)

Here is an example that uses UNION ALL and aggregation:
insert into B (ID,Code1_Val,Code2_Val,Code3_Val)
select Id, max(V1),max(V2),max(V3) from
(
select ID,Value V1,'' V2,'' V3 from A where Code=1
union all
select ID,'' V1, Value V2,'' V3 from A where Code=2
union all
select ID,'' V1, '' V2,Value V3 from A where Code=3
) AG
group by ID

Here is the SQL Query:
insert into pivot_insert_table(id,code1_val,code2_val, code3_val)
select * from (select id,code,value from pivot_table)
pivot(max(value) for code in (1,2,3)) order by id ;

WITH Ids (id) as
(
SELECT DISTINCT id FROM A
)
SELECT Ids.id,
(select sub.value from A sub where Ids.id=sub.id and sub.code=1 fetch first rows only) Code1_Val,
(select sub.value from A sub where Ids.id=sub.id and sub.code=2 fetch first rows only) Code2_Val,
(select sub.value from A sub where Ids.id=sub.id and sub.code=3 fetch first rows only) Code3_Val
FROM Ids

You want to pivot your data. Since DB2 has no pivot function, you can use DECODE (basically a CASE expression), wrapped in an aggregate so that each id collapses to one row.
The syntax should be:
SELECT Id,
MAX(DECODE(Code, 1, Value)) AS Code1_Val,
MAX(DECODE(Code, 2, Value)) AS Code2_Val,
MAX(DECODE(Code, 3, Value)) AS Code3_Val
FROM A
GROUP BY Id

SELECT DISTINCT for data groups

I have following table:
ID Data
1 A
2 A
2 B
3 A
3 B
4 C
5 D
6 A
6 B
etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:
DataID Data
101 A
102 A
102 B
103 C
104 D
So DataID 102 would represent data (A,B), DataID 103 would represent data (C), etc. This is so that I can rewrite my original table in this form:
ID DataID
1 101
2 102
3 102
4 103
5 104
6 102
How can I do that?
PS. Code to generate the first table:
CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, 'A'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'

In my opinion, you have to create a custom aggregate that concatenates the data (for strings, a CLR approach is recommended for performance reasons).
Then I would group by ID and number the distinct groupings with row_number() or dense_rank(), your choice. Anyway, it should look like this:
with groupings as (
select ID, concat(data) groups -- concat is the custom aggregate described above
from Table1
group by ID
)
select ID, groups, dense_rank() over (order by groups) as DataID from groupings
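
To make the idea concrete without writing a custom aggregate, the sketch below does the concatenation step on the client side in Python and only uses SQL (through sqlite3) to fetch the rows in order. The function name number_data_groups and the first_data_id parameter are assumptions for illustration; numbering distinct groups in sorted order mirrors what dense_rank() over the concatenated value would do.

```python
import sqlite3
from itertools import groupby


def number_data_groups(rows, first_data_id=101):
    """Map each id to a DataID shared by all ids that have the same set of data values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER, data TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", rows)
    ordered = conn.execute(
        "SELECT DISTINCT id, data FROM t1 ORDER BY id, data"
    ).fetchall()
    conn.close()

    # One signature per id: the sorted tuple of its data values.
    signature_by_id = {
        row_id: tuple(data for _, data in group)
        for row_id, group in groupby(ordered, key=lambda r: r[0])
    }
    # Number the distinct signatures in sorted order, starting at first_data_id.
    data_id_by_signature = {
        signature: first_data_id + position
        for position, signature in enumerate(sorted(set(signature_by_id.values())))
    }
    return {
        row_id: data_id_by_signature[signature]
        for row_id, signature in signature_by_id.items()
    }


t1_rows = [
    (1, "A"), (2, "A"), (2, "B"), (3, "A"), (3, "B"),
    (4, "C"), (5, "D"), (6, "A"), (6, "B"),
]
print(number_data_groups(t1_rows))
# prints {1: 101, 2: 102, 3: 102, 4: 103, 5: 104, 6: 102}
```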

The following query using CASE will give you the result shown below.
From there on, getting the distinct datagroups and proceeding further should not really be a problem.
SELECT
id,
MAX(CASE data WHEN 'A' THEN data ELSE '' END) +
MAX(CASE data WHEN 'B' THEN data ELSE '' END) +
MAX(CASE data WHEN 'C' THEN data ELSE '' END) +
MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM #t1
GROUP BY id
ID DataGroups
1 A
2 AB
3 AB
4 C
5 D
6 AB
However, this kind of logic will only work if the "Data" values are both fixed and known beforehand.
In your case, you do say that is the case. However, considering that you also say there are 1000 of them, this will frankly be a ridiculous-looking query :-)
LuckyLuke's suggestion above would be the more generic, and probably saner, way to implement the solution in your case.

From your sample data (having added the missing 2,'A' tuple), the following gives the renumbered (and uniquified) data:
with NonDups as (
select t1.id
from #t1 t1 left join #t1 t2
on t1.id > t2.id and t1.data = t2.data
group by t1.id
having COUNT(t1.data) > COUNT(t2.data)
), DataAddedBack as (
select ID,data
from #t1 where id in (select id from NonDups)
), Renumbered as (
select DENSE_RANK() OVER (ORDER BY id) as ID,Data from DataAddedBack
)
select * from Renumbered
Giving:
1 A
2 A
2 B
3 C
4 D
I think then, it's a matter of relational division to match up rows from this output with the rows in the original table.

Just to share my own dirty solution that I'm using for the moment:
SELECT DISTINCT t1.id, D.data
FROM #t1 t1
CROSS APPLY (
SELECT CAST(Data AS VARCHAR) + ','
FROM #t1 t2
WHERE t2.id = t1.id
ORDER BY Data ASC
FOR XML PATH('') )
D ( Data )
And then proceeding analogously to LuckyLuke's solution.
