Complex Query where record is max in set - sql

I have some data that looks like this
CandidateCategory
candidateCategoryId
candidateId
categoryId
I want to return all records where a specific categoryId is the most recent entry for its candidate, i.e. the row holding max(candidateCategoryId).
So if a candidate has 5 categories, I want the record for, say, category 23, but only if that is the most recent category added, i.e. its candidateCategoryId is higher than all others for that candidate.
Using MS SQL 2012
Sample data in format
candidateCategoryId candidateId categoryId
100 1 10
101 1 11
102 1 50
103 1 23
104 1 40
no result, 23 isn't the max candidateCategoryId
candidateCategoryId candidateId categoryId
200 2 20
201 2 31
202 2 12
203 2 23
return result, 23 is the max candidateCategoryId for this candidate.

Try getting the max CandidateCategoryID per CandidateID first, then re-join back to the table:
select
yd2.*
from
( select yd.candidateID,
max( yd.candidateCategoryId ) as maxCandCatID
from YourData yd
group by yd.candidateID ) MaxPerID
JOIN YourData yd2
on MaxPerID.candidateID = yd2.candidateID
AND MaxPerID.maxCandCatID = yd2.CandidateCategoryID
AND yd2.categoryID = 23
So, from your sample data, the inner prequery "MaxPerID" will generate two rows...
CandidateID MaxCandCatID (and ultimately corresponds to category ID)
1 104 40
2 203 23
Then, re-joining to your original table on these two columns, together with the AND categoryID = 23 condition, returns only the row for the second candidate.
To clarify for others who posted answers: the asker does not appear to want the highest category ID. Look at the sample data: the CandidateCategoryID values are added sequentially, like an auto-incrementing number. So they want the most recent entry for a given candidate (hence candidates 1 and 2), and only if that last entry is for category 23 do they want that row.
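To try the "max per candidate, then re-join" idea without a SQL Server instance, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The table name CandidateCategory comes from the question; using SQLite and the variable name latest_is_23 are assumptions made only for this illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for testing only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CandidateCategory ("
    "candidateCategoryId INTEGER PRIMARY KEY, candidateId INTEGER, categoryId INTEGER)"
)
conn.executemany(
    "INSERT INTO CandidateCategory VALUES (?, ?, ?)",
    [
        (100, 1, 10), (101, 1, 11), (102, 1, 50), (103, 1, 23), (104, 1, 40),
        (200, 2, 20), (201, 2, 31), (202, 2, 12), (203, 2, 23),
    ],
)

# Inner query: latest candidateCategoryId per candidate.
# Outer join: keep that latest row only when its category is the requested one.
latest_is_23 = conn.execute(
    """
    SELECT cc.candidateCategoryId, cc.candidateId, cc.categoryId
    FROM (SELECT candidateId, MAX(candidateCategoryId) AS maxCandCatId
          FROM CandidateCategory
          GROUP BY candidateId) AS MaxPerId
    JOIN CandidateCategory AS cc
      ON cc.candidateId = MaxPerId.candidateId
     AND cc.candidateCategoryId = MaxPerId.maxCandCatId
    WHERE cc.categoryId = ?
    """,
    (23,),
).fetchall()

print(latest_is_23)  # [(203, 2, 23)]
```

Candidate 1 is excluded because its latest row (104) belongs to category 40, which matches the expected result in the question.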

select *
from (select t.*
from tbl t
join (select candidateid,
max(candidatecategoryid) as lastid
from tbl
group by candidateid) v
on t.candidateid = v.candidateid
and t.candidatecategoryid = v.lastid) x
where categoryid = 23

This is a basic "greatest-n-per-group" problem. You can solve it with not exists, among other ways:
select t.*
from somedata t
where not exists (select 1
from somedata t2
where t2.candidateId = t.candidateId and
t2.candidateCategoryId > t.candidateCategoryId
);
EDIT:
If you only want candidates whose most recent category is 23, then add another condition:
select t.*
from somedata t
where not exists (select 1
from somedata t2
where t2.candidateId = t.candidateId and
t2.candidateCategoryId > t.candidateCategoryId
) and
t.categoryId = 23;
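As a rough check of the NOT EXISTS approach, the sketch below runs it on the question's sample data in an in-memory SQLite database. It correlates the subquery on candidateId, which is an assumption about the asker's intent ("most recent row per candidate"); the table name somedata is taken from the answer, and the variable name rows is made up for the example.

```python
import sqlite3

# SQLite is used only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE somedata ("
    "candidateCategoryId INTEGER PRIMARY KEY, candidateId INTEGER, categoryId INTEGER)"
)
conn.executemany(
    "INSERT INTO somedata VALUES (?, ?, ?)",
    [
        (100, 1, 10), (101, 1, 11), (102, 1, 50), (103, 1, 23), (104, 1, 40),
        (200, 2, 20), (201, 2, 31), (202, 2, 12), (203, 2, 23),
    ],
)

# A row qualifies when it is in category 23 and no later row
# (higher candidateCategoryId) exists for the same candidate.
rows = conn.execute(
    """
    SELECT t.candidateCategoryId, t.candidateId, t.categoryId
    FROM somedata AS t
    WHERE t.categoryId = 23
      AND NOT EXISTS (SELECT 1
                      FROM somedata AS t2
                      WHERE t2.candidateId = t.candidateId
                        AND t2.candidateCategoryId > t.candidateCategoryId)
    """
).fetchall()

print(rows)  # [(203, 2, 23)]
```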

Another way to skin this cat. Using your sample data, we can create a temporary table for testing:
CREATE TABLE #candidates (CandidateCategoryId int, CandidateId int, CategoryId int)
INSERT INTO #candidates
SELECT 100, 1, 10
UNION
SELECT 101, 1, 11
UNION
SELECT 102, 1, 50
UNION
SELECT 103, 1, 23
UNION
SELECT 104, 1, 40
UNION
SELECT 200, 2, 20
UNION
SELECT 201, 2, 31
UNION
SELECT 202, 2, 12
UNION
SELECT 203, 2, 23
SELECT * FROM #candidates c
JOIN
(
SELECT CandidateId,MAX(CategoryId) AS CategoryId FROM #candidates
GROUP BY CandidateId
) tmp
ON c.CandidateId = tmp.CandidateId
AND c.CategoryId = tmp.CategoryId
And get results that look like
CandidateCategoryId | CandidateId | CategoryId
----------------------------------------------
201 | 2 | 31
102 | 1 | 50

I came up with this
select candidateCategoryId
from candidateCategory
where candidateCategoryId in (
select max(candidateCategoryId)
from candidateCategory
group by candidateId )
and categoryId = 23

select *
from yourtable
where candidateCategoryId = (select max(candidateCategoryId) from yourtable)

declare @categoryId int
select @categoryId = 23;
with cte as
(
select candidateCategoryId, candidateId, categoryId,
rn = row_number() over (partition by candidateId order by candidateCategoryId desc)
from yourtable
)
select *
from cte c
where exists
(
select *
from cte x
where x.candidateId = c.candidateId
and x.rn = 1
and x.categoryId = @categoryId
)

Related

Select the non repeating/Distinct value in SQL

I'm trying to select only the records whose id is not repeated. When I use DISTINCT, it collapses the duplicated records and still returns one row for each repeated id.
How can I write SQL that picks just the records whose id isn't repeated?
INPUT
id name age location
1 a 22 usa
1 a 23 usa
2 b 44 uk
3 e 33 eu
3 f 55 eu
8 k 49 usa
OUTPUT
id name age location
2 b 44 uk
8 k 49 usa
OK, here is how you can do it:
select * from (
select * , count(*) over (partition by id) cn
from tablename
) t
where cn = 1
Try this:
SELECT *
FROM [Input]
WHERE ID IN (
SELECT ID FROM [Input] GROUP BY ID HAVING COUNT(ID) = 1
)
This should achieve the output you're after:
SELECT *
FROM yourtable
WHERE id IN (
SELECT id
FROM yourtable
GROUP BY id
HAVING COUNT(*) = 1)
You can use a SQL common table expression (CTE) as follows:
create table #mytable (id int, name nvarchar(100), age int, location nvarchar(50))
insert into #mytable values
(1,'a',22,'usa'),(1,'a',23,'usa'),(2,'b',44,'uk'),(3,'e',33,'eu'),(3,'f',55,'Tunisia'),(8,'k',49,'Palestine');
with
cte1 as(select * from #mytable),
cte2 as (select id, count(1) N from #mytable group by id),
cte3 as (select TA.id,TA.name,TA.age,TA.location from cte1 TA inner join cte2 TB on TA.id=TB.id where TB.N=1)
select * from cte3
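The answers above all reduce to "keep the ids that occur exactly once". A small sketch of that idea, run against the question's INPUT rows in an in-memory SQLite database, might look like this. The table name people and the variable name unique_rows are assumptions for the example; SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER, location TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, "a", 22, "usa"), (1, "a", 23, "usa"),
        (2, "b", 44, "uk"),
        (3, "e", 33, "eu"), (3, "f", 55, "eu"),
        (8, "k", 49, "usa"),
    ],
)

# Keep only rows whose id appears exactly once in the table.
unique_rows = conn.execute(
    """
    SELECT id, name, age, location
    FROM people
    WHERE id IN (SELECT id FROM people GROUP BY id HAVING COUNT(*) = 1)
    ORDER BY id
    """
).fetchall()

print(unique_rows)  # [(2, 'b', 44, 'uk'), (8, 'k', 49, 'usa')]
```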

Join query result in duplicated rows

-----------tblDListTest---------
id listid trackingcode
1 125 trc1
2 125 trc1
3 125 trc1
4 126 trc4
5 126 trc5
---------------------------------
---------tblTrcWeightTest----------
id weight trackingcode
1 20 trc1
2 30 trc1
3 40 trc1
4 50 trc4
5 70 trc5
I need to display each trackingcode with its weight.
In tblDListTest, there are 3 records for listid 125.
I want to display only those 3 records, each with a weight.
I am using this query:
set transaction isolation level read uncommitted
select DL.id, DL.listid, DL.trackingcode, tw.weight
from tblDListTest DL
inner join tblTrcWeightTest tw on DL.trackingcode = tw.trackingcode
where DL.listid = 125
My query result :
id listid trackingcode weight
1 125 trc1 20
1 125 trc1 30
1 125 trc1 40
2 125 trc1 20
2 125 trc1 30
2 125 trc1 40
3 125 trc1 20
3 125 trc1 30
3 125 trc1 40
But I want following result .
id listid trackingcode weight
1 125 trc1 20
2 125 trc1 30
3 125 trc1 40
You need a unique key (any combination of fields that yields a unique value) in at least one of the tables.
In your example, trc1 appears 3 times in each table.
SQL doesn't know how to pair those rows, so it returns the Cartesian product of the possible combinations (3 x 3 = 9 rows).
If you can't join on a unique value, you can use SELECT DISTINCT DL.listid, DL.trackingcode, tw.weight .... Note that DL.id has to be left out of the select list, because with it all nine rows are distinct.
There are duplicates between your tables. You would want to see something like this:
;WITH DL (id, listid, trackingcode) AS (
SELECT CONVERT(int, id), listid, trackingcode FROM (
VALUES
('1','125','trc1'),
('2','125','trc1'),
('3','125','trc1'),
('4','126','trc4'),
('5','126','trc5')
) AS A (id, listid, trackingcode)
),
tw (id, weight, trackingcode) AS (
SELECT CONVERT(int, id), weight, trackingcode FROM (
VALUES
('1','20','trc1'),
('2','30','trc1'),
('3','40','trc1'),
('4','50','trc4'),
('5','70','trc5')
) AS A (id, weight, trackingcode)
)
SELECT DISTINCT DL.listid,
DL.trackingcode,
tw.weight
FROM DL
INNER JOIN tw ON DL.trackingcode = tw.trackingcode
WHERE DL.listid = 125
You can use row_number() to enumerate the values and then use that for the join:
select dl.id, dl.listid, dl.trackingcode, tw.weight
from (select dl.*, row_number() over (partition by trackingcode order by id) as seqnum
from tblDListTest dl
) dl inner join
(select tw.*, row_number() over (partition by trackingcode order by id) as seqnum
from tblTrcWeightTest tw
) tw
on dl.trackingcode = tw.trackingcode and dl.seqnum = tw.seqnum
where dl.listid = 125;
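To see the difference between the plain join and the row_number() pairing, here is a hedged sketch that runs both against the question's data in an in-memory SQLite database (window functions need SQLite 3.25 or newer). Using SQLite instead of SQL Server, and the variable names plain_join and paired, are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblDListTest (id INTEGER, listid INTEGER, trackingcode TEXT);
    CREATE TABLE tblTrcWeightTest (id INTEGER, weight INTEGER, trackingcode TEXT);
    INSERT INTO tblDListTest VALUES
        (1, 125, 'trc1'), (2, 125, 'trc1'), (3, 125, 'trc1'),
        (4, 126, 'trc4'), (5, 126, 'trc5');
    INSERT INTO tblTrcWeightTest VALUES
        (1, 20, 'trc1'), (2, 30, 'trc1'), (3, 40, 'trc1'),
        (4, 50, 'trc4'), (5, 70, 'trc5');
    """
)

# Joining on trackingcode alone pairs every trc1 row with every trc1 weight.
plain_join = conn.execute(
    """
    SELECT dl.id, dl.listid, dl.trackingcode, tw.weight
    FROM tblDListTest AS dl
    JOIN tblTrcWeightTest AS tw ON dl.trackingcode = tw.trackingcode
    WHERE dl.listid = 125
    """
).fetchall()

# Numbering the rows per trackingcode on both sides gives a one-to-one pairing.
paired = conn.execute(
    """
    SELECT dl.id, dl.listid, dl.trackingcode, tw.weight
    FROM (SELECT d.*, ROW_NUMBER() OVER (PARTITION BY trackingcode ORDER BY id) AS seqnum
          FROM tblDListTest AS d) AS dl
    JOIN (SELECT w.*, ROW_NUMBER() OVER (PARTITION BY trackingcode ORDER BY id) AS seqnum
          FROM tblTrcWeightTest AS w) AS tw
      ON dl.trackingcode = tw.trackingcode AND dl.seqnum = tw.seqnum
    WHERE dl.listid = 125
    ORDER BY dl.id
    """
).fetchall()

print(len(plain_join))  # 9
print(paired)  # [(1, 125, 'trc1', 20), (2, 125, 'trc1', 30), (3, 125, 'trc1', 40)]
```

The pairing is positional: the n-th row for a tracking code in one table is matched with the n-th row in the other, so it only makes sense if ordering by id reflects how the rows belong together.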
You can just use something like this.
CREATE TABLE #tblDListTest (
ID INT,
listid INT,
trackingcode VARCHAR(20)
)
CREATE TABLE #tblTrcWeightTest (
ID INT,
weight INT,
trackingcode VARCHAR(20)
)
INSERT INTO #tblDListTest (ID,listid,trackingcode)
VALUES (1, 125, 'trc1'),
(2, 125, 'trc1'),
(3, 125, 'trc1'),
(4, 126, 'trc4'),
(5, 126, 'trc5')
INSERT INTO #tblTrcWeightTest (ID,weight,trackingcode)
VALUES (1, 20, 'trc1'),
(2, 30, 'trc1'),
(3, 40, 'trc1'),
(4, 50, 'trc4'),
(5, 70, 'trc5')
SELECT A.ID, A.listid, A.trackingcode, B.weight
FROM #tblDListTest A
JOIN #tblTrcWeightTest B ON B.ID = A.ID
WHERE A.listid = 125
You can use a subquery with cross apply:
select twt.id, tt.listid, twt.trackingcode, twt.weight
from tblTrcWeightTest twt cross apply (
select top 1 tdt.listid
from tblDListTest tdt
where tdt.trackingcode = twt.trackingcode
) tt
where twt.trackingcode = 'trc1';

Max and Min value's corresponding records

I have a scenario where I need the field values that correspond to the "Max" and "Min" records.
Please find the sample data below.
-----------------------------------------------------------------------
ID Label ProcessedDate
-----------------------------------------------------------------------
1 Label1 11/01/2016
2 Label2 11/02/2016
3 Label3 11/03/2016
4 Label4 11/04/2016
5 Label5 11/05/2016
I have the "ID" field populated in another table as a foreign key. While querying the records in that table by the "ID" field, I need to get the "Label" of the row with the "Max" processed date and of the row with the "Min" processed date.
-----------------------------------------------------------------------
ID LabelID GroupingField
-----------------------------------------------------------------------
1 1 101
2 2 101
3 3 101
4 4 101
5 5 101
6 1 102
7 2 102
8 3 102
9 4 102
And the final result set I expect it to look something like this.
-----------------------------------------------------------------------
GroupingField FirstProcessed LastProcessed
-----------------------------------------------------------------------
101 Label1 Label5
102 Label1 Label4
I have 'almost' managed to get the above result using a rank function, but I am still not satisfied with it, so I am looking for a better option.
Thanks,
Prakazz
CREATE TABLE #Details (ID INT,LabelID INT,GroupingField INT)
CREATE TABLE #Details1 (ID INT,Label VARCHAR(100),ProcessedDate VARCHAR(100))
INSERT INTO #Details1 (ID ,Label ,ProcessedDate )
SELECT 1,'Label1','11/01/2016' UNION ALL
SELECT 2,'Label2','11/02/2016' UNION ALL
SELECT 3,'Label3','11/03/2016' UNION ALL
SELECT 4,'Label4','11/04/2016' UNION ALL
SELECT 5,'Label5','11/05/2016'
INSERT INTO #Details (ID ,LabelID ,GroupingField )
SELECT 1,1,101 UNION ALL
SELECT 2,2,101 UNION ALL
SELECT 3,3,101 UNION ALL
SELECT 4,4,101 UNION ALL
SELECT 5,5,101 UNION ALL
SELECT 6,1,102 UNION ALL
SELECT 7,2,102 UNION ALL
SELECT 8,3,102 UNION ALL
SELECT 9,4,102
;WITH CTE (GroupingField , MAXId ,MinId) AS
(
SELECT GroupingField,MAX(LabelID) MAXId,MIN(LabelID) MinId
FROM #Details
GROUP BY GroupingField
)
SELECT GroupingField ,B.Label FirstProcessed, A.Label LastProcessed
FROM CTE
JOIN #Details1 A ON MAXId = A.ID
JOIN #Details1 B ON MinId = B.ID
You can use the SQL ROW_NUMBER() function with PARTITION BY, in combination with GROUP BY, as follows:
;with cte as (
select
t.Label, t.ProcessedDate,
g.GroupingField,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate ASC) minD,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate DESC) maxD
from tbl t
inner join GroupingFieldTbl g
on t.ID = g.LabelID
)
select GroupingField, max(FirstProcessed) FirstProcessed, max(LastProcessed) LastProcessed
from (
select
GroupingField,
FirstProcessed = CASE when minD = 1 then Label else null end,
LastProcessed = CASE when maxD = 1 then Label else null end
from cte
where
minD = 1 or maxD = 1
) t
group by GroupingField
order by GroupingField
I also used a CTE to make the code easier to read and understand.
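For readers who want to verify the ROW_NUMBER() approach against the expected result, here is a minimal sketch using an in-memory SQLite database (3.25 or newer for window functions). The table names labels and details, the variable name result, and the ISO date strings are assumptions for the example; ISO dates are used so that plain string ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE labels (ID INTEGER, Label TEXT, ProcessedDate TEXT);
    CREATE TABLE details (ID INTEGER, LabelID INTEGER, GroupingField INTEGER);
    INSERT INTO labels VALUES
        (1, 'Label1', '2016-11-01'), (2, 'Label2', '2016-11-02'),
        (3, 'Label3', '2016-11-03'), (4, 'Label4', '2016-11-04'),
        (5, 'Label5', '2016-11-05');
    INSERT INTO details VALUES
        (1, 1, 101), (2, 2, 101), (3, 3, 101), (4, 4, 101), (5, 5, 101),
        (6, 1, 102), (7, 2, 102), (8, 3, 102), (9, 4, 102);
    """
)

# Rank each group's rows by date in both directions, then pick the label
# at rank 1 from each ranking with a conditional aggregate.
result = conn.execute(
    """
    WITH ranked AS (
        SELECT d.GroupingField, l.Label,
               ROW_NUMBER() OVER (PARTITION BY d.GroupingField
                                  ORDER BY l.ProcessedDate ASC) AS minD,
               ROW_NUMBER() OVER (PARTITION BY d.GroupingField
                                  ORDER BY l.ProcessedDate DESC) AS maxD
        FROM details AS d
        JOIN labels AS l ON l.ID = d.LabelID
    )
    SELECT GroupingField,
           MAX(CASE WHEN minD = 1 THEN Label END) AS FirstProcessed,
           MAX(CASE WHEN maxD = 1 THEN Label END) AS LastProcessed
    FROM ranked
    GROUP BY GroupingField
    ORDER BY GroupingField
    """
).fetchall()

print(result)  # [(101, 'Label1', 'Label5'), (102, 'Label1', 'Label4')]
```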

How can I get the first result for each account in this SQL query?

I'm trying to write a query that follows this logic:
Find the first following status code of an account that had a previous status code of X.
So if I have a table of:
id account_num status_code
64 1 X
82 1 Y
72 2 Y
87 1 Z
91 2 X
103 2 Z
The results would be:
id account_num status_code
82 1 Y
103 2 Z
I've come up with a couple of solutions, but I'm not all that great with SQL, so they've been pretty inelegant thus far. I was hoping that someone here might be able to point me in the right direction.
View:
SELECT account_num, id
FROM table
WHERE status_code = 'X'
Query:
SELECT table.account_num, min(table.id)
FROM table
INNER JOIN view
ON table.account_num = view.account_num
WHERE table.id > view.id
GROUP BY table.account_num
At this point I have the id that I need, but I'd have to write ANOTHER query that uses the id to look up the status_code.
Edit: To add some context, I'm trying to find calls that have a status_code of X. If a call has a status_code of X we want to dial it a different way the next time we make an attempt. The aim of this query is to provide a report that will show the results of the second dial if the first dial resulted an X status code.
Here's a SQL Server solution.
UPDATE
The idea is to avoid the nested loop joins that Olaf's queries produce, because they have roughly O(N * M) complexity and are thus very bad for performance. A merge join has O(N log N + M log M) complexity, which is much better for real-world scenarios.
The query below works as follows:
RankedCTE is a subquery that assigns a row number to each id, partitioned by account and sorted by id, which represents the time. So for the sample data the output of this
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
would be:
id account_num status_code item_rank
----------- ----------- ----------- ----------
87 1 Z 1
82 1 Y 2
64 1 X 3
103 2 Z 1
91 2 X 2
72 2 Y 3
Once we have them numbered we join the result on itself like this:
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
*
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
which gives us each event and its preceding event in the same row:
id account_num status_code item_rank id account_num status_code item_rank
----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
87 1 Z 1 82 1 Y 2
82 1 Y 2 64 1 X 3
103 2 Z 1 91 2 X 2
91 2 X 2 72 2 Y 3
Finally, we just have to keep the pairs where the preceding event has code "X" and the following event does not:
WITH RankedCTE AS
(
SELECT
id,
account_num,
status_code,
ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
FROM dbo.Test
)
SELECT
A.id,
A.account_num,
A.status_code
FROM
RankedCTE A
INNER JOIN RankedCTE B ON
A.account_num = B.account_num
AND A.item_rank = B.item_rank - 1
AND A.status_code <> 'X'
AND B.status_code = 'X'
Query plans for this query and @Olaf Dietsche's solution (one of the versions) are below.
Data setup script
CREATE TABLE dbo.Test
(
id int not null PRIMARY KEY,
account_num int not null,
status_code nchar(1)
)
GO
INSERT dbo.Test (id, account_num, status_code)
SELECT 64 , 1, 'X' UNION ALL
SELECT 82 , 1, 'Y' UNION ALL
SELECT 72 , 2, 'Y' UNION ALL
SELECT 87 , 1, 'Z' UNION ALL
SELECT 91 , 2, 'X' UNION ALL
SELECT 103, 2, 'Z'
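The ranked self-join described in this answer can be checked end to end with the setup data. The sketch below does so in an in-memory SQLite database (3.25 or newer for window functions); SQLite in place of SQL Server, the table name Test without the dbo schema, and the variable name following are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Test (id INTEGER PRIMARY KEY, account_num INTEGER NOT NULL, status_code TEXT)"
)
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [(64, 1, "X"), (82, 1, "Y"), (72, 2, "Y"), (87, 1, "Z"), (91, 2, "X"), (103, 2, "Z")],
)

# Rank events per account from newest (1) to oldest, then pair each event A
# with the event B directly before it (B.item_rank = A.item_rank + 1).
# Keep pairs where B is 'X' and A is not.
following = conn.execute(
    """
    WITH RankedCTE AS (
        SELECT id, account_num, status_code,
               ROW_NUMBER() OVER (PARTITION BY account_num ORDER BY id DESC) AS item_rank
        FROM Test
    )
    SELECT A.id, A.account_num, A.status_code
    FROM RankedCTE AS A
    JOIN RankedCTE AS B
      ON A.account_num = B.account_num
     AND A.item_rank = B.item_rank - 1
     AND A.status_code <> 'X'
     AND B.status_code = 'X'
    ORDER BY A.id
    """
).fetchall()

print(following)  # [(82, 1, 'Y'), (103, 2, 'Z')]
```

Note that this shape returns one row for every 'X' that is directly followed by a non-'X' event, so an account with several such 'X' rows would appear more than once.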
SQL Fiddle with subselect
select id, account_num, status_code
from mytable
where id in (select min(t1.id)
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num)
and SQL Fiddle with join, both for MS SQL Server 2012, both returning the same result.
select id, account_num, status_code
from mytable
join (select min(t1.id) as min_id
from mytable t1
join mytable t2 on t1.account_num = t2.account_num
and t1.id > t2.id
and t2.status_code = 'X'
group by t1.account_num) t on id = min_id
SELECT A1.ID, A1.ACCOUNT_NUM, A1.STATUS_CODE
FROM ACCOUNT A1
JOIN (SELECT A3.ACCOUNT_NUM, MIN(A3.ID) AS MIN_ID
FROM ACCOUNT A3
WHERE EXISTS
(SELECT 1
FROM ACCOUNT A2
WHERE A3.ACCOUNT_NUM = A2.ACCOUNT_NUM
AND A2.STATUS_CODE = 'X'
AND A2.ID < A3.ID)
GROUP BY A3.ACCOUNT_NUM
) SUB
ON A1.ID = SUB.MIN_ID
Here's an SQLFIDDLE
Here's a query, with your data, checked under PostgreSQL:
SELECT t0.*
FROM so13594339 t0 JOIN
(SELECT min(t1.id), t1.account_num
FROM so13594339 t1, so13594339 t2
WHERE t1.account_num = t2.account_num AND t1.id > t2.id AND t2.status_code = 'X'
GROUP BY t1.account_num
) z
ON t0.id = z.min AND t0.account_num = z.account_num;

TSQL Sweepstakes Script

I need to run a sweepstakes script to get X amount of winners from a customers table. Each customer has N participations. The table looks like this
CUSTOMER-A 5
CUSTOMER-B 8
CUSTOMER-C 1
I can always script to have CUSTOMER-A,B and C inserted 5, 8 and 1 times respectively in a temp table and then select randomly using order by newid() but would like to know if there's a more elegant way to address this.
(Update: Added final query.)
(Update2: Added single query to avoid temp table.)
Here's the hard part using a recursive CTE plus the final query that shows "place".
Code
CREATE TABLE #cust (
CustomerID int IDENTITY,
ParticipationCt int
)
CREATE TABLE #list (
CustomerID int,
RowNumber int
)
INSERT INTO #cust (ParticipationCt) VALUES (5)
INSERT INTO #cust (ParticipationCt) VALUES (8)
INSERT INTO #cust (ParticipationCt) VALUES (1)
INSERT INTO #cust (ParticipationCt) VALUES (3)
INSERT INTO #cust (ParticipationCt) VALUES (4)
SELECT * FROM #cust
;WITH t AS (
SELECT
lvl = 1,
CustomerID,
ParticipationCt
FROM #Cust
UNION ALL
SELECT
lvl = lvl + 1,
CustomerID,
ParticipationCt
FROM t
WHERE lvl < ParticipationCt
)
INSERT INTO #list (CustomerID, RowNumber)
SELECT
CustomerID,
ROW_NUMBER() OVER (ORDER BY NEWID())
FROM t
--<< All rows
SELECT * FROM #list ORDER BY RowNumber
--<< All customers by "place"
SELECT
CustomerID,
ROW_NUMBER() OVER (ORDER BY MIN(RowNumber)) AS Place
FROM #list
GROUP BY CustomerID
Results
CustomerID ParticipationCt
----------- ---------------
1 5
2 8
3 1
4 3
5 4
CustomerID RowNumber
----------- -----------
4 1
1 2
1 3
2 4
1 5
5 6
2 7
2 8
4 9
2 10
2 11
2 12
1 13
5 14
5 15
3 16
5 17
1 18
2 19
2 20
4 21
CustomerID Place
----------- -----
4 1
1 2
2 3
5 4
3 5
Single Query with No Temp Table
It is possible to get the answer with a single query that does not use a temp table. This works fine, but I personally like the temp table version better so you can validate the interim results.
Code (Single Query)
;WITH List AS (
SELECT
lvl = 1,
CustomerID,
ParticipationCt
FROM #Cust
UNION ALL
SELECT
lvl = lvl + 1,
CustomerID,
ParticipationCt
FROM List
WHERE lvl < ParticipationCt
),
RandomOrder AS (
SELECT
CustomerID,
ROW_NUMBER() OVER (ORDER BY NEWID()) AS RowNumber
FROM List
)
SELECT
CustomerID,
ROW_NUMBER() OVER (ORDER BY MIN(RowNumber)) AS Place
FROM RandomOrder
GROUP BY CustomerID
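The same "expand to one row per participation, shuffle, rank by first appearance" idea can be sketched in an in-memory SQLite database (3.25 or newer), with random() standing in for NEWID(). The table name Cust, the helper string EXPAND, the extra FirstDraw step, and the variable names tickets and places are assumptions for this example, not part of the original script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cust (CustomerID INTEGER PRIMARY KEY, ParticipationCt INTEGER)")
conn.executemany("INSERT INTO Cust VALUES (?, ?)", [(1, 5), (2, 8), (3, 1), (4, 3), (5, 4)])

# Recursive CTE: emit one row per participation ("ticket") for every customer.
EXPAND = """
    WITH RECURSIVE List(lvl, CustomerID, ParticipationCt) AS (
        SELECT 1, CustomerID, ParticipationCt FROM Cust
        UNION ALL
        SELECT lvl + 1, CustomerID, ParticipationCt FROM List
        WHERE lvl < ParticipationCt
    )
"""

tickets = conn.execute(EXPAND + "SELECT CustomerID FROM List").fetchall()

# Shuffle the tickets, then rank customers by their first ticket in the shuffle.
places = conn.execute(
    EXPAND
    + """,
    RandomOrder AS (
        SELECT CustomerID, ROW_NUMBER() OVER (ORDER BY random()) AS RowNumber
        FROM List
    ),
    FirstDraw AS (
        SELECT CustomerID, MIN(RowNumber) AS FirstRow
        FROM RandomOrder
        GROUP BY CustomerID
    )
    SELECT CustomerID, ROW_NUMBER() OVER (ORDER BY FirstRow) AS Place
    FROM FirstDraw
    ORDER BY Place
    """
).fetchall()

print(len(tickets))  # 21 tickets in total (5 + 8 + 1 + 3 + 4)
print(places)        # order varies from run to run
```

Taking the first X rows of places gives X distinct winners, with customers holding more tickets more likely to place early.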
Try this:
Select Top X CustomerId
From (Select CustomerId,
Rand(CustomerId) *
Count(*) /
(Select Count(*)
From Table) Sort
From Table
Group By CustomerId) Z
Order By Sort Desc
EDIT: The above assumed multiple rows per customer, one row per participation. Sorry, the following assumes one row per customer, with a column Participations holding the number of participations for that customer.
Select Top 23 CustomerId
From ( Select CustomerId,
Participations - RAND(CustomerId) *
(Select SUM(Participations ) From customers) sort
from customers) Z
Order By sort desc