SQL Server: Query for products with matching tags

I have been puzzling over this for the past few hours but cannot find a solution.
I have products in one table, tags in another table, and a product/tag link table.
Now I want to retrieve all products that have exactly the same tags as a given product.
Here are the tables (simplified):
PRODUCT:
id varchar(36) (primary key)
Name varchar(50)
TAG:
id varchar(36) (primary key)
Name varchar(50)
PRODUCTTAG:
id varchar(36) (primary key)
ProductID varchar(36)
TagID varchar(36)
I have found quite a few answers here on Stack Overflow about returning full and partial matches. However, I am looking for a query that returns only full matches.
Example:
Product A has tags 1, 2, 3
Product B has tags 1, 2
Product C has tags 1, 2, 3
Product D has tags 1, 2, 3, 4
If I query for product A, only product C should be found - as it is the only one having exactly the same tags.
Is this even possible?

Yes, yes, try this way:
with aa as (
select count(*) as cnt
from [dbo].[PRODUCTTAG]
where ProductID = '19A947C0-6A0F-4A6F-9675-48FBE30A877D'
), bb as
(
select ProductID, count(*) as cnt
from [dbo].[PRODUCTTAG]
group by ProductID
)
select b.ProductID
from [dbo].[PRODUCTTAG] a join
[dbo].[PRODUCTTAG] b on a.TagID = b.TagID cross join
aa join
bb on aa.cnt = bb.cnt and b.ProductID = bb.ProductID
where a.ProductID = '19A947C0-6A0F-4A6F-9675-48FBE30A877D'
and b.ProductID <> a.ProductID -- exclude the queried product itself
group by b.ProductID, aa.cnt
having count(*) = aa.cnt -- every tag must match, not just one (assumes no duplicate ProductID/TagID rows)

declare @PRODUCTTAG table(id int identity(1,1),ProductID int,TagID int)
insert into @PRODUCTTAG VALUES
(1,1),(1,2),(1,3)
,(2,1),(2,2)
,(3,1),(3,2),(3,3)
,(4,1),(4,2),(4,3),(4,4)
;With CTE as
(
select ProductID,count(*)smallCount
FROM @PRODUCTTAG
group by ProductID
)
,CTE1 as
(
select smallCount, count(smallCount)BigCount
from cte
group by smallCount
)
,CTE2 as
(
select * from cTE c
where exists(
select smallCount from cte1 c1
where BigCount>1 and c1.smallCount=c.smallCount
)
)
select * from cte2
--depending upon the output expected join this with @PRODUCTTAG,@Product,@Tag
--like this
--select * from @PRODUCTTAG PT
--where exists(
--select * from cte2 c2 where pt.productid=c2.productid
--)
Or tell us what the final output should look like.

This is a case where I find it simpler to combine all the tags into a single string and compare the strings. But that is painful in SQL Server versions without STRING_AGG, which was only added in SQL Server 2017.
So, here is a set-based solution:
with pt as (
select pt.*, count(*) over (partition by productid) as cnt
from producttag pt
)
select pt.productid
from pt join
pt pt2
on pt.cnt = pt2.cnt and
pt.productid <> pt2.productid and
pt.tagid = pt2.tagid
where pt2.productid = @x
group by pt.productid, pt.cnt
having count(*) = pt.cnt;
This matches every product to your given product based on the tags. The having clause then ensures that the number of matching tags is the same for the two products. Because the join only considers matching tags, all the tags are the same.
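To see the set-based query behave as described, here is a minimal sketch run against the example from the question. The table variable `@producttag`, the integer ids (A=1, B=2, C=3, D=4) and the composite primary key are assumptions made for illustration; the real tables use varchar(36) keys.

```sql
-- Hypothetical sample data mirroring the question: A=1, B=2, C=3, D=4.
declare @producttag table (
    productid int,
    tagid     int,
    primary key (productid, tagid)   -- assumes a tag is linked to a product at most once
);

insert into @producttag (productid, tagid) values
    (1, 1), (1, 2), (1, 3),
    (2, 1), (2, 2),
    (3, 1), (3, 2), (3, 3),
    (4, 1), (4, 2), (4, 3), (4, 4);

declare @x int = 1;   -- the product to match against (product A)

with pt as (
    select p.productid, p.tagid,
           count(*) over (partition by p.productid) as cnt   -- tags per product
    from @producttag p
)
select pt.productid
from pt
join pt pt2
  on pt.cnt = pt2.cnt                 -- same number of tags
 and pt.productid <> pt2.productid    -- skip the product itself
 and pt.tagid = pt2.tagid             -- only matching tags survive the join
where pt2.productid = @x
group by pt.productid, pt.cnt
having count(*) = pt.cnt;             -- all of the candidate's tags matched
-- returns one row: productid = 3 (product C)
```

Product B is dropped by the `cnt` comparison (2 tags versus 3), and product D likewise (4 versus 3), so only C remains.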

Related

SELECT TOP inside INNER JOIN

I created this simple database in SQL Server:
create database product_test
go
use product_test
go
create table product
(
id int identity primary key,
label varchar(255),
description text,
price money,
);
create table picture
(
id int identity primary key,
p_path text,
product int foreign key references product(id)
);
insert into product
values ('flip phone 100', 'back 2 the future stuff.', 950),
('flip phone 200', 's;g material', 1400)
insert into picture
values ('1.jpg', 1), ('2.jpg', 1), ('3.jpg', 2)
What I want is to select all products and only one picture for each product. Any help is greatly appreciated.
I'm a fan of outer apply for this purpose:
select p.*, pi.id, pi.p_path
from product p outer apply
(select top 1 pi.*
from picture pi
where pi.product = p.id
) pi;
You can include an order by to get one particular picture (say, the one with the lowest or highest id). Or, order by newid() to get a random one.
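As a sketch of the ORDER BY variant mentioned above, the following uses table variables in place of the question's tables so it can be run on its own. The names `@product`, `@picture`, the alias `pc`, the varchar type for `p_path` and the extra product without pictures are assumptions added for illustration.

```sql
-- Hypothetical stand-ins for the product and picture tables.
declare @product table (id int primary key, label varchar(255));
declare @picture table (id int primary key, p_path varchar(255), product int);

insert into @product (id, label) values
    (1, 'flip phone 100'), (2, 'flip phone 200'), (3, 'no pictures yet');
insert into @picture (id, p_path, product) values
    (1, '1.jpg', 1), (2, '2.jpg', 1), (3, '3.jpg', 2);

select p.id, p.label, pic.id as picture_id, pic.p_path
from @product p
outer apply (
    select top 1 pc.id, pc.p_path
    from @picture pc
    where pc.product = p.id
    order by pc.id          -- lowest picture id; "order by newid()" would pick a random one
) pic
order by p.id;
-- 1, flip phone 100,  1,    1.jpg
-- 2, flip phone 200,  3,    3.jpg
-- 3, no pictures yet, NULL, NULL
```

Because this is OUTER APPLY rather than CROSS APPLY, the product with no pictures is still returned, with NULLs in the picture columns.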
Have you tried using a correlated sub-query?
SELECT *, (SELECT TOP 1 p_path FROM picture WHERE product = p.id ORDER BY id) AS p_path
FROM product p
Hope this helps,
SELECT
*,
(
SELECT TOP 1 p2.p_path
FROM dbo.picture p2
WHERE p.id = p2.product
) AS picture
FROM dbo.product p
Or with join:
SELECT
*
FROM dbo.product p
INNER JOIN
(
SELECT p2.product, MIN(p2.p_path) AS p_path
FROM dbo.picture p2
GROUP BY p2.product
) AS pt
ON p.id = pt.product
But you need to change p_path to a varchar type first, because MIN cannot be applied to a text column.
I would use a windowing function like this:
SELECT *
FROM product
JOIN (
SELECT id, product, p_path,
row_number() OVER (PARTITION BY product ORDER BY id ASC) as RN
FROM picture
) pic ON product.id = pic.product AND pic.RN = 1
As you can see here I am selecting the picture with the lowest id (ORDER BY id ASC) -- you can change this order by to your requirements.
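Just group by and take the min or max.
Use a left join in case there is no picture:
-- assumes description and p_path are varchar columns; text columns cannot be grouped or aggregated
select pr.ID, pr.label, pr.description, pr.price
, min(pic.p_path) as p_path
from product pr
left join picture pic
on pic.product = pr.ID
group by pr.ID, pr.label, pr.description, pr.price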

How can I select unique rows using a WHERE/HAVING clause and compare with another table

I can't understand how to get unique rows (with duplicates removed) from one table by comparing it with data in another table.
In my case I have two tables.
I want to get the unique rows from tblproduct after comparing with tblviewer: first take the rows in tblviewer for a given viewerid, then take their productid values, and then compare those with tblproduct.
For example, if I take viewerid = 123, two rows are selected, with productid 12001 and 11001. I then want the rows from tblproduct whose productid matches one of those values.
select *
from tblproduct
where productid =
(
select distinct(productid)
from tblviewer
where viewerid = 123
)
There are a few ways to do this. You can do a standard INNER JOIN to the table to filter the results:
Select Distinct P.*
From tblProduct P
Join tblViewer V On V.ProductId = P.ProductId
Where V.ViewerId = 123
Alternatively, you could use EXISTS as well - this eliminates the need to use a DISTINCT altogether:
Select *
From tblProduct P
Where Exists
(
Select *
From tblViewer V
Where V.ProductId = P.ProductId
And V.ViewerId = 123
)
Or, you could also use an IN, as suggested by the other answers:
Select *
From tblProduct
Where ProductId In
(
Select ProductId
From tblViewer
Where ViewerId = 123
)
I think you just want to use an IN clause; you will not need DISTINCT.
select *
from tblproduct
where productid in
(
select productid
from tblviewer
where viewerid = 123
)
I'm not sure what you're asking, but I think it is,
select *
from tblproduct
where productid in
(
select distinct(productid)
from tblviewer
)

How to get all children of a given id in a SQL Server query

I have two tables in SQL Server database:
category(
itemid,
parentid
)
ArticleAssignedCategories(
categid,
artid
)
categid is a foreign key referencing itemid.
I want to get the count of artids for a given itemid and all of its children (children are the categories whose parentid is the given itemid, and their descendants).
For example, if the given itemid = 1 and the category table contains (3,1), (4,1), (5,3),
then 3, 4 and 5 are all children of 1.
Can anyone help me write a good query?
Recursive queries can be done using a CTE:
with CTE(itemid, parentid)
as (
-- start with some category
select itemid, parentid
from category where itemid = <some_itemid>
union all
-- recursively add children
select c.itemid, c.parentid
from category c
join CTE on c.parentid = CTE.itemid
)
select count(*)
from ArticleAssignedCategories a
join CTE on CTE.itemid = a.categid
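Here is a minimal sketch of the recursive CTE above run against the example rows from the question. The table variables, the root rows with a NULL parentid, the unrelated category 6 and the article ids are all assumptions added for illustration.

```sql
-- Hypothetical sample data: 3 and 4 are children of 1, 5 is a child of 3.
declare @category table (itemid int, parentid int);
declare @ArticleAssignedCategories table (categid int, artid int);

insert into @category (itemid, parentid) values
    (1, null), (3, 1), (4, 1), (5, 3), (6, null);
insert into @ArticleAssignedCategories (categid, artid) values
    (1, 100), (3, 101), (5, 102), (5, 103), (6, 104);

declare @itemid int = 1;   -- the category whose subtree is counted

with CTE (itemid, parentid) as (
    -- anchor: the starting category
    select itemid, parentid
    from @category
    where itemid = @itemid
    union all
    -- recursive step: categories whose parent is already in the result
    select c.itemid, c.parentid
    from @category c
    join CTE on c.parentid = CTE.itemid
)
select count(*) as article_count
from @ArticleAssignedCategories a
join CTE on CTE.itemid = a.categid;
-- article_count = 4 (articles 100, 101, 102, 103; article 104 belongs to category 6)
```

If one article can be assigned to several categories in the same subtree, `count(distinct a.artid)` would avoid counting it more than once.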
Here is the query. I hope this may help you
select b.artid,count(b.artid) from category a
inner join ArticleAssignedCategories b on a.itemid = b.categid
group by b.artid

Recursive query in SQL Server

I have a table with following structure
Table name: matches
That basically stores which product is matching which product. I need to process this table
And store in a groups table like below.
Table Name: groups
group_ID stores the MIN Product_ID of the Product_IDS that form a group. To give an example let's say
If A is matching B and B is Matching C then three rows should go to group table in format (A, A), (A, B), (A, C)
I have tried looking into correlated subqueries and CTEs, but I have not been able to implement this.
I need to do this all in SQL.
Thanks for the help .
Try this:
;WITH CTE
AS
(
SELECT DISTINCT
M1.Product_ID Group_ID,
M1.Product_ID
FROM matches M1
LEFT JOIN matches M2
ON M1.Product_Id = M2.matching_Product_Id
WHERE M2.matching_Product_Id IS NULL
UNION ALL
SELECT
C.Group_ID,
M.matching_Product_Id
FROM CTE C
JOIN matches M
ON C.Product_ID = M.Product_ID
)
SELECT * FROM CTE ORDER BY Group_ID
You can use OPTION(MAXRECURSION n) to control recursion depth.
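As a sketch of where the OPTION (MAXRECURSION n) hint goes, the query above can be run against a small table variable. The `@matches` table, its integer ids and the value 200 are assumptions chosen for illustration; the approach also assumes each group forms a chain starting at a product that never appears as a matching product.

```sql
-- Hypothetical sample data: 1 matches 2, 2 matches 3, and 10 matches 11.
declare @matches table (Product_ID int, matching_Product_Id int);

insert into @matches (Product_ID, matching_Product_Id) values
    (1, 2), (2, 3), (10, 11);

with CTE as (
    -- anchor: products that are never the "matching" side start a group
    select distinct M1.Product_ID as Group_ID, M1.Product_ID
    from @matches M1
    left join @matches M2
      on M1.Product_ID = M2.matching_Product_Id
    where M2.matching_Product_Id is null
    union all
    -- recursive step: follow each match to the next product in the chain
    select C.Group_ID, M.matching_Product_Id
    from CTE C
    join @matches M
      on C.Product_ID = M.Product_ID
)
select Group_ID, Product_ID
from CTE
order by Group_ID, Product_ID
option (maxrecursion 200);   -- default limit is 100 levels; 0 removes the limit
-- (1,1), (1,2), (1,3), (10,10), (10,11)
```

The hint belongs at the very end of the outer statement, not inside the CTE definition.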
Something like this (not tested)
with match_groups as (
select product_id,
matching_product_id,
product_id as group_id
from matches
where product_id not in (select matching_product_id from matches)
union all
select m.product_id, m.matching_product_id, p.group_id
from matches m
join match_groups p on m.product_id = p.matching_product_id
)
select group_id, product_id
from match_groups
order by group_id;
Sample of the Recursive Level:
DECLARE @VALUE_CODE AS VARCHAR(5);
--SET @VALUE_CODE = 'A' -- Specify a level
WITH ViewValue AS
(
SELECT ValueCode
, ValueDesc
, PrecedingValueCode
FROM ValuesTable
WHERE PrecedingValueCode IS NULL
UNION ALL
SELECT A.ValueCode
, A.ValueDesc
, A.PrecedingValueCode
FROM ValuesTable A
INNER JOIN ViewValue V ON
V.ValueCode = A.PrecedingValueCode
)
SELECT ValueCode, ValueDesc, PrecedingValueCode
FROM ViewValue
--WHERE PrecedingValueCode = @VALUE_CODE -- Specific level
--WHERE PrecedingValueCode IS NULL -- Root

Ranking before grouping problem in SQL Server 2005

HI,
This should be easy but I don't understand enough about how grouping works.
Basically I have 2 tables "Categories" and "Items"
Categories
ID
CategoryName
Items
ID
CategoryID
ItemName
Photo
Score
All I want to do is get 1 row for each category which contains the Category ID, the Category Name and the photo that belongs to the highest scoring item.
So I have tried joining the categories to the items and grouping by the CategoryID. Trouble is that I want to order the items so that the highest scoring items are at the top before it does the groupings to make sure that the photo is from the current highest scoring item in that category. If I select MAX(I.score) I can get the highest score but I'm not sure how to get accompanying photo as MAX(photo) will obviously give me the photo with the highest file name alphabetically.
I hope I've explained that well.
You could try something like (Full example)
DECLARE @Categories TABLE(
ID INT,
CategoryName VARCHAR(50)
)
DECLARE @Items TABLE(
ID INT,
CategoryID INT,
ItemName VARCHAR(50),
Photo VARCHAR(50),
Score FLOAT
)
INSERT INTO @Categories (ID,CategoryName) SELECT 1, 'Cat1'
INSERT INTO @Categories (ID,CategoryName) SELECT 2, 'Cat2'
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 1, 1, 'Item1', 'PItem1', 1
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 2, 1, 'Item2', 'PItem2', 2
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 3, 1, 'Item3', 'PItem3', 3
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 4, 2, 'Item4', 'PItem4', 5
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 5, 2, 'Item5', 'PItem5', 2
SELECT *
FROM (
SELECT c.ID,
c.CategoryName,
i.Photo,
i.Score,
ROW_NUMBER() OVER(PARTITION BY i.CategoryID ORDER BY i.Score DESC) RowID
FROM @Categories c INNER JOIN
@Items i ON c.ID = i.CategoryID
) CatItems
WHERE RowID = 1
Using ROW_NUMBER you can select the items you require.
You need to aggregate first and join back like this.
(If you change grouping, you need to change JOIN)
SELECT
...
FROM
(
select
max(Score) AS MaxScore,
CategoryID
FROM
Items
GROUP BY
CategoryID
) M
JOIN
Items I ON M.CategoryID = I.CategoryID AND M.MaxScore = I.Score
JOIN
Categories C ON I.CategoryID = C.ID
This is a pretty common problem, and one that SQL Server doesn't solve particularly well. Something like this should do the trick, though:
select
c.ID,
c.CategoryName,
item.*
from Categories c
join (
select
ID,
CategoryID,
ItemName,
Photo,
Score,
row_number() over (order by CategoryID, Score desc) -
rank() over (order by CategoryID) as rownum
from Items) item on item.CategoryID = c.ID and item.rownum = 0
While there is no explicit group by clause, this (for practical purposes) groups the Categories records and gives you a joined statement that allows you to view any property of the highest scoring item.
You can use row numbers to rank items per category:
select *
from (
select
row_number() over (partition by c.id order by i.score desc) rn
, *
from Categories c
join Items i on c.ID = i.CategoryID
) sub
where rn = 1
In SQL 2005, you can't reference a row_number() directly in a where, so it's wrapped in a subquery.
Exactly as you worded it:
"the Category ID, the Category Name and the photo that belongs to the highest scoring item." (Now here I surmise you really meant "...highest scoring item in that category", no?)
Select CategoryID, c.Categoryname, Photo
From items i Join Categories c
On c.ID = i.CategoryId
Where Score = (Select Max(Score) From Items
Where CategoryID = i.CategoryId)
If you really meant the highest scoring item on the whole items table, then just omit the predicate in the subquery
Select CategoryID, c.Categoryname, Photo
From items i Join Categories c
On c.ID = i.CategoryId
Where Score = (Select Max(Score) From Items)
Both of these queries will return multiple rows per group if more than one item in the group ties for the highest score.
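If exactly one row per category is needed even when scores tie, one option is to rank with ROW_NUMBER and add a tie-breaker to its ORDER BY. This is a minimal sketch under assumed sample data; the table variables, the tied scores and the choice of the lowest item ID as tie-breaker are illustrative, not part of the question.

```sql
-- Hypothetical sample data: two items tie for the top score in Cat1.
declare @Categories table (ID int, CategoryName varchar(50));
declare @Items table (ID int, CategoryID int, Photo varchar(50), Score float);

insert into @Categories (ID, CategoryName)
select 1, 'Cat1' union all
select 2, 'Cat2';

insert into @Items (ID, CategoryID, Photo, Score)
select 1, 1, 'PItem1', 3 union all
select 2, 1, 'PItem2', 3 union all
select 3, 2, 'PItem3', 5 union all
select 4, 2, 'PItem4', 2;

select ID, CategoryName, Photo
from (
    select c.ID, c.CategoryName, i.Photo,
           row_number() over (partition by i.CategoryID
                              order by i.Score desc, i.ID) as rn   -- i.ID breaks ties
    from @Categories c
    join @Items i on i.CategoryID = c.ID
) ranked
where rn = 1
order by ID;
-- 1, Cat1, PItem1
-- 2, Cat2, PItem3
```

Replacing `row_number()` with `rank()` and dropping `i.ID` from the ordering would instead return every tied item, which matches the behaviour of the MAX(Score) queries.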