Find the top value for each parent - sql

I'm sure this is a common request but I wouldn't know how to ask for it formally.
I encountered this a long time ago when I was in the Army. A soldier has multiple physical fitness tests, but the test that counts is the most recent one. The soldier also has multiple marksmanship qualifications, but only the most recent qualification with the assigned weapon is significant.
How do you create a view that itemizes the most significant child of the parent?

Use:
SELECT p.*, x.*
FROM PARENT p
JOIN CHILD x ON x.parent_id = p.id
-- latest date per parent; grouping by the child id as well would return every child row
JOIN (SELECT c.parent_id,
MAX(c.date_column) AS max_date
FROM CHILD c
GROUP BY c.parent_id) y ON y.parent_id = x.parent_id
AND y.max_date = x.date_column
Assuming SQL Server 2005+:
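-- number the children per parent inside the CTE, then join to the parent;
-- selecting p.* and c.* inside the CTE fails when both tables have an id column
WITH summary AS (
SELECT c.*,
ROW_NUMBER() OVER (PARTITION BY c.parent_id
ORDER BY c.date_column DESC) AS rn
FROM CHILD c)
SELECT p.*, s.*
FROM PARENT p
JOIN summary s ON s.parent_id = p.id
WHERE s.rn = 1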
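As a quick check of the ROW_NUMBER approach above, here is a minimal sketch using table variables. The table and column names (@parent, @child, date_column, score) and the sample rows are made up for illustration; only the pattern comes from the answer. The multi-row VALUES lists and the date type assume SQL Server 2008 or later.

```sql
-- Hypothetical sample data; names and values are assumptions for illustration.
DECLARE @parent TABLE (id int PRIMARY KEY, name varchar(50));
DECLARE @child  TABLE (id int PRIMARY KEY, parent_id int, date_column date, score int);

INSERT INTO @parent VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO @child VALUES
    (10, 1, '2010-01-15', 250),
    (11, 1, '2010-06-20', 270),
    (12, 2, '2010-03-01', 290);

WITH summary AS (
    -- rn = 1 marks the most recent child row of each parent
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY c.parent_id
                              ORDER BY c.date_column DESC) AS rn
    FROM @child c)
SELECT p.name, s.date_column, s.score
FROM @parent p
JOIN summary s ON s.parent_id = p.id
WHERE s.rn = 1;
```

With this data the query should return one row per parent: the 2010-06-20 test for Smith and the 2010-03-01 test for Jones.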

Although I'm not quite sure what you mean by "itemizing", you can do something like this:
Select ..
From Soldier
Left Join FitnessTest
On FitnessTest.SoldierId = Soldier.Id
And FitnessTest.TestDate = (
Select Max(FT1.TestDate)
From FitnessTest As FT1
Where FT1.SoldierId = FitnessTest.SoldierId
)
Left Join MarksmanshipTest
On MarksmanshipTest.SoldierId = Soldier.Id
And MarksmanshipTest.TestDate = (
Select Max(MT1.TestDate)
From MarksmanshipTest As MT1
Where MT1.SoldierId = MarksmanshipTest.SoldierId
)
This assumes that a soldier cannot have two fitness tests, or two marksmanship tests, with the same datetime value.

No significant difference from the previous two answers, but a little more detail perhaps:
create table soldier ( soldierId int primary key,
name varchar(100) )
create table fitnessTest ( soldierId int foreign key references soldier,
occurred datetime, result int )
create table marksmanshipTest ( soldierId int foreign key references soldier,
occurred datetime, result int )
;with
mostRecentFitnessTest as
(
select
fitnessTest.soldierId,
fitnessTest.result,
-- partition by soldier so that each soldier gets a row numbered 1
row_number() over (partition by soldierId order by occurred desc) as row
from fitnessTest
),
mostRecentMarksmanshipTest as
(
select
marksmanshipTest.soldierId,
marksmanshipTest.result,
row_number() over (partition by soldierId order by occurred desc) as row
from marksmanshipTest
)
select
soldier.soldierId,
soldier.name,
mostRecentFitnessTest.result,
mostRecentMarksmanshipTest.result
from soldier
left outer join mostRecentFitnessTest on
mostRecentFitnessTest.soldierId = soldier.soldierId
and mostRecentFitnessTest.row = 1
left outer join mostRecentMarksmanshipTest on
mostRecentMarksmanshipTest.soldierId = soldier.soldierId
and mostRecentMarksmanshipTest.row = 1

Related

SQL Server: Query for products with matching tags

I have been pondering over this for the past few hours but I cannot find a solution.
I have products in one table, tags in another table and a product/tag link table.
Now I want to retrieve all products which have the same tags as a certain product.
Here are the tables (simplified):
PRODUCT:
id varchar(36) (primary key)
Name varchar(50)
TAG:
id varchar(36) (primary key)
Name varchar(50)
PRODUCTTAG:
id varchar(36) (primary key)
ProductID varchar(36)
TagID varchar(36)
I found quite a few answers here on Stack Overflow about returning full and partial matches. However, I am looking for a query which only gives full matches.
Example:
Product A has tags 1, 2, 3
Product B has tags 1, 2
Product C has tags 1, 2, 3
Product D has tags 1, 2, 3, 4
If I query for product A, only product C should be found - as it is the only one having exactly the same tags.
Is this even possible?
Yes, yes, try this way:
with aa as (
select count(*) as cnt
from [PRODUCTTAG]
where ProductID = '19A947C0-6A0F-4A6F-9675-48FBE30A877D'
), bb as
(
select ProductID, count(*) as cnt
from [PRODUCTTAG]
group by ProductID
)
select b.ProductID
from [dbo].[PRODUCTTAG] a join
[dbo].[PRODUCTTAG] b on a.TagID = b.TagID cross join
aa join
bb on aa.cnt = bb.cnt and b.ProductID = bb.ProductID
where a.ProductID = '19A947C0-6A0F-4A6F-9675-48FBE30A877D'
and b.ProductID <> a.ProductID
-- keep only products whose shared tags cover the whole tag set (assumes no duplicate product/tag pairs)
group by b.ProductID, aa.cnt
having count(*) = aa.cnt
declare @PRODUCTTAG table(id int identity(1,1), ProductID int, TagID int)
insert into @PRODUCTTAG VALUES
(1,1),(1,2),(1,3)
,(2,1),(2,2)
,(3,1),(3,2),(3,3)
,(4,1),(4,2),(4,3),(4,4)
--note: this finds products that share a tag count with another product; it does not compare the tags themselves
;With CTE as
(
select ProductID, count(*) as smallCount
FROM @PRODUCTTAG
group by ProductID
)
,CTE1 as
(
select smallCount, count(smallCount) as BigCount
from cte
group by smallCount
)
,CTE2 as
(
select * from cTE c
where exists(
select smallCount from cte1 c1
where BigCount>1 and c1.smallCount=c.smallCount
)
)
select * from cte2
--depending upon the output expected join this with @PRODUCTTAG, @Product, @Tag
--like this
--select * from @PRODUCTTAG PT
--where exists(
--select * from cte2 c2 where pt.productid=c2.productid
--)
Or tell us what the final output should look like.
This is a case where I find it simpler to combine all the tags into a single string and compare the strings. But that is painful in SQL Server before 2017, when STRING_AGG was introduced.
So, there is a set based solution:
with pt as (
select pt.*, count(*) over (partition by productid) as cnt
from producttag pt
)
select pt.productid
from pt join
pt pt2
on pt.cnt = pt2.cnt and
pt.productid <> pt2.productid and
pt.tagid = pt2.tagid
where pt2.productid = @x
group by pt.productid, pt.cnt
having count(*) = pt.cnt;
This matches every product to your given product based on the tags. The having clause then ensures that the number of matching tags is the same for the two products. Because the join only considers matching tags, all the tags are the same.
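To see the set-based query at work, here is a minimal sketch run against the example from the question. The integer ids (A = 1, B = 2, C = 3, D = 4) and the table variable are assumptions for illustration, since the real tables use varchar(36) keys. The inline variable initialization assumes SQL Server 2008 or later.

```sql
-- Sample data mirrors the question's example; integer ids are an assumption.
DECLARE @producttag TABLE (productid int, tagid int, PRIMARY KEY (productid, tagid));
INSERT INTO @producttag VALUES
    (1,1),(1,2),(1,3),
    (2,1),(2,2),
    (3,1),(3,2),(3,3),
    (4,1),(4,2),(4,3),(4,4);

DECLARE @x int = 1;  -- product A

WITH pt AS (
    -- attach each product's total tag count to every one of its rows
    SELECT t.productid, t.tagid,
           COUNT(*) OVER (PARTITION BY t.productid) AS cnt
    FROM @producttag t
)
SELECT pt.productid
FROM pt
JOIN pt pt2
  ON pt.cnt = pt2.cnt
 AND pt.productid <> pt2.productid
 AND pt.tagid = pt2.tagid
WHERE pt2.productid = @x
GROUP BY pt.productid, pt.cnt
HAVING COUNT(*) = pt.cnt;
```

With this data the query should return only product 3 (C): B and D are dropped by the tag-count comparison in the join before the tags are even compared.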

SELECT TOP inside INNER JOIN

I created this simple database in SQL Server:
create database product_test
go
use product_test
go
create table product
(
id int identity primary key,
label varchar(255),
description text,
price money,
);
create table picture
(
id int identity primary key,
p_path text,
product int foreign key references product(id)
);
insert into product
values ('flip phone 100', 'back 2 the future stuff.', 950),
('flip phone 200', 's;g material', 1400)
insert into picture
values ('1.jpg', 1), ('2.jpg', 1), ('3.jpg', 2)
What I want is to select all products and only one picture for each product. Any help is greatly appreciated.
I'm a fan of outer apply for this purpose:
select p.*, pi.id, pi.p_path
from product p outer apply
(select top 1 pi.*
from picture pi
where pi.product = p.id
) pi;
You can include an order by to get one particular picture (say, the one with the lowest or highest id). Or, order by newid() to get a random one.
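As an illustration of the ORDER BY variant, here is a minimal sketch with table variables. The third product, which has no picture, and the varchar p_path column are assumptions added to show how OUTER APPLY behaves when no row matches.

```sql
-- Hypothetical copies of the question's tables; the third product is an assumption.
DECLARE @product TABLE (id int PRIMARY KEY, label varchar(255));
DECLARE @picture TABLE (id int PRIMARY KEY, p_path varchar(255), product int);

INSERT INTO @product VALUES (1, 'flip phone 100'), (2, 'flip phone 200'), (3, 'no picture yet');
INSERT INTO @picture VALUES (1, '1.jpg', 1), (2, '2.jpg', 1), (3, '3.jpg', 2);

SELECT p.id, p.label, pic.p_path
FROM @product p
OUTER APPLY (SELECT TOP 1 pc.p_path
             FROM @picture pc
             WHERE pc.product = p.id
             ORDER BY pc.id) pic;   -- lowest picture id wins
```

With this data, product 1 should get 1.jpg, product 2 should get 3.jpg, and product 3 should still appear with a NULL p_path. Switching to CROSS APPLY would drop product 3 instead.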
Have you tried using a correlated sub-query?
SELECT *, (SELECT TOP 1 p_path FROM picture WHERE product = p.id ORDER BY id) AS picture
FROM product p
Hope this helps,
SELECT
*,
(
SELECT TOP 1 p2.p_path
FROM dbo.picture p2
WHERE p.id = p2.product
) AS picture
FROM dbo.product p
Or with join:
SELECT
*
FROM dbo.product p
INNER JOIN
(
SELECT p2.product, MIN(p2.p_path) AS p_path
FROM dbo.picture p2
GROUP BY p2.product
) AS pt
ON p.id = pt.product
But for the join version you need to change p_path to a varchar type, because MIN cannot be applied to a text column.
I would use a windowing function like this:
SELECT *
FROM product
JOIN (
SELECT id, product, p_path,
row_number() OVER (PARTITION BY product ORDER BY id ASC) as RN
FROM picture
) pic ON product.id = pic.product AND pic.RN = 1
As you can see here I am selecting the picture with the lowest id (ORDER BY id ASC) -- you can change this order by to your requirements.
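Just group by and take the min or max. Use a left join in case there is no picture:
-- text columns cannot be grouped or aggregated, so cast them to varchar(max)
select pr.id, pr.label, cast(pr.description as varchar(max)) as description, pr.price
, min(cast(pic.p_path as varchar(max))) as p_path
from product pr
left join picture pic
on pic.product = pr.id
group by pr.id, pr.label, cast(pr.description as varchar(max)), pr.price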

How to do conditional update on columns using CTE?

I have a table CUST with following layout. There are no constraints. I do see that one ChildID has more than one ParentID associated with it. (Please see the records for ChildID = 115)
Here is what I need -
Wherever one child has more than one parent, I want to update those ParentID and ParentName values with the ParentID and ParentName that have the maximum match_per. So in the image below, I want ParentID 1111 and ParentName LEE YOUNG WOOK to replace the values in all records where ChildId = 115 (since the match_per 0.96 is the maximum within the given set). In case there are two parents with equal maximum match_per, I want to pick any one of them.
I know it is possible using CTE but I don't know how to update CTE. Can anybody help?
One way of doing it
WITH CTE1 AS
(
SELECT *,
CASE WHEN match_per =
MAX(match_per) OVER (PARTITION BY ChildId)
THEN CAST(ParentId AS CHAR(10)) + ParentName
END AS parentDetailsForMax
FROM CUST
), CTE2 AS
(
SELECT *,
MAX(parentDetailsForMax) OVER (PARTITION BY ChildId) AS maxParentDetailsForMax
FROM CTE1
)
UPDATE CTE2
SET ParentId = CAST(LEFT(maxParentDetailsForMax,10) AS int),
ParentName = SUBSTRING(maxParentDetailsForMax,11,8000) -- the name starts after the 10-character id
Getting both the parent id and parent name is a bit tricky. I think the logic is easiest using cross apply:
with toupdate as (
select t.*, p.parentId as new_parentId, p.parentName as new_parentName,
max(t.match_per) over (partition by t.childid) as max_match_per,
count(*) over (partition by t.childid) as numparents
from CUST t cross apply
(select top 1 p.*
from CUST p
where p.childid = t.childid
order by p.match_per desc
) p
)
update toupdate
set parentId = new_ParentId,
parentName = new_ParentName
where numparents > 1;
As a note: the fact that parent id and parent name are both stored in the table, potentially multiple times seems like a problem. I would expect to look up the name, given the id, to reduce data redundancy.
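For a small end-to-end illustration of updating through a CTE, here is a sketch that swaps the cross apply for FIRST_VALUE window functions, which assumes SQL Server 2012 or later. It is an alternative formulation, not the method from the answers above. Only ChildID 115, ParentID 1111, LEE YOUNG WOOK and 0.96 come from the question; every other row and the @cust table variable are made up.

```sql
-- Hypothetical rows; only the 115 / 1111 / 'LEE YOUNG WOOK' / 0.96 row comes from the question.
DECLARE @cust TABLE (ChildID int, ParentID int, ParentName varchar(50), match_per decimal(4,2));
INSERT INTO @cust VALUES
    (115, 1111, 'LEE YOUNG WOOK', 0.96),
    (115, 2222, 'PARENT B',       0.80),
    (115, 3333, 'PARENT C',       0.75),
    (116, 4444, 'PARENT D',       0.90);

WITH toupdate AS (
    -- both windows use the same ordering, so id and name come from the same row;
    -- ParentID is a tie-breaker when two parents share the maximum match_per
    SELECT ParentID, ParentName,
           FIRST_VALUE(ParentID)   OVER (PARTITION BY ChildID ORDER BY match_per DESC, ParentID) AS new_ParentID,
           FIRST_VALUE(ParentName) OVER (PARTITION BY ChildID ORDER BY match_per DESC, ParentID) AS new_ParentName
    FROM @cust
)
UPDATE toupdate
SET ParentID = new_ParentID,
    ParentName = new_ParentName;

SELECT ChildID, ParentID, ParentName, match_per
FROM @cust
ORDER BY ChildID, match_per DESC;
```

After the update, all three rows for ChildID 115 should carry ParentID 1111 and LEE YOUNG WOOK, while the single row for ChildID 116 is rewritten with its own values and so stays the same.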
Try something like this. The first CTE gets MAX(match_per) for each ChildID. The second then uses that MaxMatchPer to find the corresponding ParentID.
; WITH CTE AS (
SELECT ChildID,MAX(match_per) AS MaxMatchPer
FROM tbl
GROUP BY ChildID
), CTE1 AS (
SELECT t.ParentID, c.ChildID
FROM tbl t
JOIN CTE c
ON c.ChildID = t.ChildID
AND c.MaxMatchPer = t.match_per
)
UPDATE t
SET ParentID = c.ParentID
FROM tbl t
LEFT JOIN CTE1 c
ON c.ChildID = t.ChildID
Also, this is poor normalization. You should have neither ParentName nor ChildName in this table.

take each maximum value of a column and get information from another table

I have two tables:
create table saller(
id_saller int IDENTITY PRIMARY KEY,
name varchar(50),
branch varchar(10)
);
create table sale(
id_sale int IDENTITY PRIMARY KEY,
amount float,
id_saller int,
CONSTRAINT fk_saller FOREIGN KEY (id_saller)REFERENCES saller(id_saller)
);
I want to get the largest sale amount for each branch,
along with the name and id of the saller responsible for that sale.
I tried this:
SELECT saller.name, saller.id_saller,maxv.branch, maxv.maxbranch
FROM saller
INNER JOIN sale
ON saller.id_saller = sale.id_saller
INNER JOIN (
SELECT saller.branch,saller.id_saller,MAX(sale.amount) AS maxbranch
FROM saller
INNER JOIN sale
ON saller.id_saller = sale.id_saller
GROUP BY saller.branch,saller.id_saller
) AS maxv ON(sale.id_saller = maxv.id_saller)
One way to do it, if you want to return exactly one row per branch even when there are ties:
SELECT branch, id_saller, name, amount
FROM
(
SELECT r.branch, s.id_saller, r.name, s.amount,
ROW_NUMBER() OVER (PARTITION BY r.branch ORDER BY s.amount DESC) rnum
FROM sale s JOIN saller r
ON s.id_saller = r.id_saller
) q
WHERE q.rnum = 1
Or, if you want every row that ties for the highest value:
SELECT branch, id_saller, name, amount
FROM
(
SELECT r.branch, s.id_saller, r.name, s.amount,
RANK() OVER (PARTITION BY r.branch ORDER BY s.amount DESC) rank
FROM sale s JOIN saller r
ON s.id_saller = r.id_saller
) q
WHERE q.rank = 1
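To make the difference between the two queries concrete, here is a minimal sketch with a deliberate tie. The saller names, branches and amounts are made up for illustration; the table variables stand in for the saller and sale tables.

```sql
-- Hypothetical data with a tie at 500 in the North branch.
DECLARE @saller TABLE (id_saller int PRIMARY KEY, name varchar(50), branch varchar(10));
DECLARE @sale   TABLE (id_sale int PRIMARY KEY, amount float, id_saller int);

INSERT INTO @saller VALUES (1, 'Ann', 'North'), (2, 'Bob', 'North'), (3, 'Cid', 'South');
INSERT INTO @sale VALUES (1, 500, 1), (2, 500, 2), (3, 300, 3), (4, 100, 1);

SELECT q.branch, q.id_saller, q.name, q.amount, q.rnum, q.rnk
FROM (
    SELECT r.branch, s.id_saller, r.name, s.amount,
           -- ROW_NUMBER breaks ties arbitrarily; RANK gives tied rows the same number
           ROW_NUMBER() OVER (PARTITION BY r.branch ORDER BY s.amount DESC) AS rnum,
           RANK()       OVER (PARTITION BY r.branch ORDER BY s.amount DESC) AS rnk
    FROM @sale s
    JOIN @saller r ON s.id_saller = r.id_saller
) q
WHERE q.rnk = 1;
```

Filtering on rnk = 1 should return three rows here (Ann and Bob for North, Cid for South). Filtering on rnum = 1 instead would return two rows, and which of Ann or Bob is kept is not guaranteed unless the ORDER BY includes a tie-breaker such as s.id_sale.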
Based on your question, I don't understand where branch comes in; maybe it belongs to a table that has not been mentioned. But to retrieve the seller id and name, you can try this:
SELECT saller.name, saller.id_saller
FROM saller
INNER JOIN sale
ON saller.id_saller = sale.id_saller
WHERE sale.Amount = (Select Max(Amount) from sale)

Recursive query in SQL Server

I have a table with following structure
Table name: matches
That basically stores which product is matching which product. I need to process this table
And store in a groups table like below.
Table Name: groups
group_ID stores the MIN Product_ID of the Product_IDs that form a group. To give an example, let's say
If A matches B and B matches C, then three rows should go to the groups table in the format (A, A), (A, B), (A, C)
I have tried looking into correlated subqueries and CTEs, but I have not been able to implement this.
I need to do this all in SQL.
Thanks for the help .
Try this:
;WITH CTE
AS
(
SELECT DISTINCT
M1.Product_ID Group_ID,
M1.Product_ID
FROM matches M1
LEFT JOIN matches M2
ON M1.Product_Id = M2.matching_Product_Id
WHERE M2.matching_Product_Id IS NULL
UNION ALL
SELECT
C.Group_ID,
M.matching_Product_Id
FROM CTE C
JOIN matches M
ON C.Product_ID = M.Product_ID
)
SELECT * FROM CTE ORDER BY Group_ID
You can use OPTION (MAXRECURSION n) to cap the recursion depth; the default limit is 100.
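To try the recursive CTE on the A, B, C example from the question, here is a minimal sketch. The @matches table variable, its varchar(10) columns and the extra X/Y pair are assumptions for illustration. It also assumes the matches form chains without cycles and that each chain starts at a product that never appears as a match.

```sql
-- Hypothetical data: A matches B, B matches C, and a separate pair X/Y.
DECLARE @matches TABLE (Product_ID varchar(10), matching_Product_Id varchar(10));
INSERT INTO @matches VALUES ('A', 'B'), ('B', 'C'), ('X', 'Y');

WITH CTE AS (
    -- Anchor: products that never appear as someone else's match are group roots.
    SELECT DISTINCT M1.Product_ID AS Group_ID, M1.Product_ID
    FROM @matches M1
    LEFT JOIN @matches M2 ON M1.Product_ID = M2.matching_Product_Id
    WHERE M2.matching_Product_Id IS NULL
    UNION ALL
    -- Recursive step: follow each match link, carrying the root along.
    SELECT C.Group_ID, M.matching_Product_Id
    FROM CTE C
    JOIN @matches M ON C.Product_ID = M.Product_ID
)
SELECT Group_ID, Product_ID
FROM CTE
ORDER BY Group_ID, Product_ID
OPTION (MAXRECURSION 50);
```

With this data the result should be (A, A), (A, B), (A, C), (X, X), (X, Y). Note that Group_ID here is the root of each chain, which equals the MIN Product_ID only when the root happens to sort first.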
Something like this (not tested)
with match_groups as (
select product_id,
matching_product_id,
product_id as group_id
from matches
where product_id not in (select matching_product_id from matches)
union all
select m.product_id, m.matching_product_id, p.group_id
from matches m
join match_groups p on m.product_id = p.matching_product_id
)
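-- emit the root itself plus every product reached from it, so the last product in a chain is not lost
select group_id, group_id as product_id
from match_groups
union
select group_id, matching_product_id
from match_groups
order by group_id;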
Sample of the Recursive Level:
DECLARE @VALUE_CODE AS VARCHAR(5);
--SET @VALUE_CODE = 'A' -- Specify a level
WITH ViewValue AS
(
SELECT ValueCode
, ValueDesc
, PrecedingValueCode
FROM ValuesTable
WHERE PrecedingValueCode IS NULL
UNION ALL
SELECT A.ValueCode
, A.ValueDesc
, A.PrecedingValueCode
FROM ValuesTable A
INNER JOIN ViewValue V ON
V.ValueCode = A.PrecedingValueCode
)
SELECT ValueCode, ValueDesc, PrecedingValueCode
FROM ViewValue
--WHERE PrecedingValueCode = @VALUE_CODE -- Specific level
--WHERE PrecedingValueCode IS NULL -- Root