Find projects which belong to several categories - sql

I have a projects table (columns id and name), a categories table (columns id and name) and a join table projects_categories (columns project_id and category_id).
I want to write a query that returns all the projects which belong to a set of categories. For instance if I have
projects
id name
1 foo
2 bar
categories
id name
1 bee
2 gee
projects_categories
project_id category_id
1 1
1 2
2 1
I want to write a query that returns the project "foo" if I pass the category ids 1 and 2. I have tried the following query, but it returns all the projects that belong to category 1 OR 2, not category 1 AND 2.
SELECT "projects".*
FROM "projects"
INNER JOIN "projects_categories" ON "projects_categories"."project_id" = "projects"."id"
WHERE "projects_categories"."category_id" IN (1, 2)
The following query does not return any result:
SELECT "projects".*
FROM "projects"
INNER JOIN "projects_categories" ON "projects_categories"."project_id" = "projects"."id"
WHERE "projects_categories"."category_id" = 1
AND "projects_categories"."category_id" = 2
I understand why these queries return these results, but can't figure out how to write the query I need.

You can try with EXISTS :
SELECT "p".*
FROM "projects" "p"
WHERE EXISTS (
SELECT *
FROM "projects_categories" "pc"
WHERE "pc"."category_id" IN (1, 2)
AND "pc"."project_id" = "p"."id"
GROUP BY "pc"."project_id"
HAVING COUNT(DISTINCT "pc"."category_id") = 2
)
Or JOIN :
SELECT "p".*
FROM "projects" "p"
JOIN (
SELECT "pc"."project_id"
FROM "projects_categories" "pc"
WHERE "pc"."category_id" IN (1, 2)
GROUP BY "pc"."project_id"
HAVING COUNT(DISTINCT "pc"."category_id") = 2
) "tpc" ON "tpc"."project_id" = "p"."id"
Or IN :
SELECT "p".*
FROM "projects" "p"
WHERE "p"."id" IN (
SELECT "pc"."project_id"
FROM "projects_categories" "pc"
WHERE "pc"."category_id" IN (1, 2)
GROUP BY "pc"."project_id"
HAVING COUNT(DISTINCT "pc"."category_id") = 2
)
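As a rough check of the GROUP BY / HAVING idea above, here is a minimal sketch that runs the IN variant against the question's sample data in an in-memory SQLite database. The helper name projects_in_all and the use of Python's sqlite3 module are assumptions made for illustration; the original question does not name a database engine.

```python
import sqlite3

# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects_categories (project_id INTEGER, category_id INTEGER);
    INSERT INTO projects VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO projects_categories VALUES (1, 1), (1, 2), (2, 1);
""")


def projects_in_all(category_ids):
    """Hypothetical helper: names of projects linked to every given category id."""
    wanted = sorted(set(category_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.name
        FROM projects p
        WHERE p.id IN (
            SELECT pc.project_id
            FROM projects_categories pc
            WHERE pc.category_id IN ({placeholders})
            GROUP BY pc.project_id
            -- count the distinct categories matched, not the project id
            HAVING COUNT(DISTINCT pc.category_id) = ?
        )
        ORDER BY p.name
    """
    return [name for (name,) in conn.execute(sql, (*wanted, len(wanted)))]


print(projects_in_all([1, 2]))  # ['foo']
print(projects_in_all([1]))     # ['bar', 'foo']
```

Passing the number of wanted ids as the last parameter keeps the HAVING threshold in step with the IN list, so the same query works for any number of categories.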

The issue is that the query looks at a single row of the join table and asks its category column to hold two different values at once. The query below separates the two conditions into two derived tables, then joins them into one result set.
SELECT A.ProjectName, A.CategoryId, B.CategoryId
FROM
(SELECT P.Name [ProjectName], PC.CategoryId, PC.ProjectId
FROM #Projects P
INNER JOIN #Project_Categories PC
ON PC.ProjectId = P.ID
WHERE PC.CategoryId = 1
) A
INNER JOIN (SELECT P.Name [ProjectName], PC.CategoryId, PC.ProjectId
FROM #Projects P
INNER JOIN #Project_Categories PC
ON PC.ProjectId = P.ID
WHERE PC.CategoryId = 2
) B
ON A.ProjectId = B.ProjectId
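The same self-join idea can be sketched more compactly by joining the join table twice, once per required category. This is only an illustration on the question's sample data, run through Python's sqlite3 module (an assumption; the answer above is written in T-SQL with temp tables).

```python
import sqlite3

# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects_categories (project_id INTEGER, category_id INTEGER);
    INSERT INTO projects VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO projects_categories VALUES (1, 1), (1, 2), (2, 1);
""")

# One alias of the join table per required category: a project survives
# only if it has a matching row for category 1 AND a matching row for category 2.
in_both = conn.execute("""
    SELECT p.name
    FROM projects p
    JOIN projects_categories a ON a.project_id = p.id AND a.category_id = 1
    JOIN projects_categories b ON b.project_id = p.id AND b.category_id = 2
""").fetchall()

print(in_both)  # [('foo',)]
```

This form needs one extra join per category, so it suits a small, fixed set of ids better than a variable-length list.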

Related

Count with exists in SQL

Why is this query not returning the count of the results? How do I get it to show the count?
SELECT COUNT (*) AS MWith
FROM member m
JOIN Channel mc ON mc.MemberID = m.id
JOIN Client c ON c.id = m.clientid
JOIN packages p ON p.id = m.packageid
WHERE Enroll > '2018'
AND EXISTS (
SELECT * FROM
activity a
WHERE a.memberid = m.id
AND a.code IN ('785', 'a599')
)
GROUP BY m.id;
OUTPUT
MWith
1
1
1
The query returns one row per member (or an empty set when nothing matches) because of the GROUP BY clause.
The workarounds are:
Remove the GROUP BY, since m.id is not part of the output anyway
Use GROUP BY ALL
The original query with the GROUP BY removed:
SELECT COUNT(*) AS MWith
FROM member m
JOIN Channel mc
ON mc.MemberID = m.id
JOIN Client c
ON c.id = m.clientid
JOIN packages p
ON p.id = m.packageid
WHERE Enroll > '2018'
AND EXISTS
(
SELECT *
FROM activity a
WHERE a.memberid = m.id
AND a.code IN ( '785', 'a599' )
)
-- GROUP BY m.id;
Other, simpler examples to show a difference:
-- returns an empty resultset
SELECT COUNT(*) FROM sys.databases
WHERE 1=0
GROUP BY name
-- returns: a single row with 0
SELECT COUNT(*) FROM sys.databases
WHERE 1=0
-- Another example with GROUP BY ALL
-- it returns one row per grouped value, with expected count = 0
SELECT COUNT(*) FROM sys.databases
WHERE 1=0
GROUP BY ALL name
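The difference between the grouped and ungrouped COUNT(*) can be reproduced outside SQL Server. The sketch below uses an in-memory SQLite database with a hypothetical three-row member table (an assumption, since sys.databases and GROUP BY ALL are specific to SQL Server and are not available in SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the member table in the question.
conn.execute("CREATE TABLE member (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO member VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

# With GROUP BY: one row per matching id, each with a count of 1.
grouped = conn.execute(
    "SELECT COUNT(*) FROM member WHERE id > 1 GROUP BY id").fetchall()

# Without GROUP BY: a single row holding the total.
total = conn.execute(
    "SELECT COUNT(*) FROM member WHERE id > 1").fetchall()

# When nothing matches, GROUP BY yields no rows at all ...
empty_grouped = conn.execute(
    "SELECT COUNT(*) FROM member WHERE 1 = 0 GROUP BY id").fetchall()

# ... while the ungrouped aggregate still yields one row with 0.
empty_total = conn.execute(
    "SELECT COUNT(*) FROM member WHERE 1 = 0").fetchall()

print(grouped, total, empty_grouped, empty_total)
```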

SQL Filtering rows with no duplicate value

I'm new to SQL and I'm trying to find a way to obtain only the rows whose value in a specific column of a table is not duplicated in any other row.
For example the Table below is called T1 and contains:
ID|Branch ID
1 444
2 333
3 444
4 111
5 555
6 333
The result I want will be
ID|Branch ID
4 111
5 555
So only showing non duplicate rows
Edit: I want to apply this to a large relational code. Here is a snippet of where I want it to be added
FROM dbo.LogicalLine
INNER JOIN dbo.Page ON dbo.LogicalLine.page_id = dbo.Page.id
INNER JOIN dbo.Branch ON dbo.LogicalLine.branch_id = dbo.Branch.id
The table LogicalLine has a column called branch_id containing duplicate id values. I wish to filter those out, showing only the non-duplicate branch_id values as in the example above, and then INNER JOIN the Branch table to LogicalLine, which I have already done.
Added -Full Code here:
SELECT
(SELECT name
FROM ParentDevice
WHERE (Dev1.type NOT LIKE '%cable%') AND (id = Dev1.parent_device_id))T1_DeviceID,
(SELECT name
FROM Symbol
WHERE (id = CP1.symbol_id) AND (type NOT LIKE '%cable%'))T1_DeviceName,
(SELECT name
FROM Location
WHERE (id = Page.location_id))T1_Location,
(SELECT name
FROM Installation
WHERE (id = Page.installation_id))T1_Installation,
(SELECT name
FROM ParentDevice
WHERE (Dev2.type NOT LIKE '%cable%') AND (id = Dev2.parent_device_id))T2_DeviceID,
(SELECT name
FROM Symbol
WHERE ( id = CP2.symbol_id) AND (type NOT LIKE '%cable%'))T2_DeviceName,
(SELECT name
FROM Location
WHERE (id = PD2.location_id))T2_Location,
(SELECT name
FROM Installation
WHERE (id = Page.installation_id))T2_Installation,
(SELECT devicefamily
FROM Device
WHERE (type LIKE '%cable%') AND (id = SymCable.device_id))CablePartNumber,
(SELECT name
FROM ParentDevice
WHERE (id = DevCable.parent_device_id) AND (DevCable.type LIKE '%cable%'))CableTag
FROM dbo.LogicalLine
INNER JOIN dbo.Page ON dbo.LogicalLine.page_id = dbo.Page.id
INNER JOIN dbo.Branch ON dbo.LogicalLine.branch_id = dbo.Branch.id
LEFT OUTER JOIN dbo.Symbol AS SymCable ON dbo.LogicalLine.cable_id = SymCable.id
LEFT OUTER JOIN dbo.Device AS DevCable ON SymCable.device_id = DevCable.id
LEFT OUTER JOIN dbo.ParentDevice AS ParentCable ON DevCable.parent_device_id = ParentCable.id
INNER JOIN dbo.SymbolCP AS CP1 ON dbo.Branch.cp1_id = CP1.id
INNER JOIN dbo.SymbolCP AS CP2 ON dbo.Branch.cp2_id = CP2.id
INNER JOIN dbo.Symbol AS S1 ON CP1.symbol_id = S1.id
INNER JOIN dbo.Symbol AS S2 ON CP2.symbol_id = S2.id
INNER JOIN dbo.Device AS Dev1 ON S1.device_id = Dev1.id
INNER JOIN dbo.Device AS Dev2 ON S2.device_id = Dev2.id
INNER JOIN dbo.ParentDevice AS PD1 ON Dev1.parent_device_id = PD1.id
INNER JOIN dbo.ParentDevice AS PD2 ON Dev2.parent_device_id = PD2.id
INNER JOIN dbo.Location AS L1 ON PD1.location_id = L1.id
INNER JOIN dbo.Location AS L2 ON PD2.location_id = L2.id
INNER JOIN dbo.Installation AS I1 ON L1.installation_id = I1.id
INNER JOIN dbo.Installation AS I2 ON L2.installation_id = I2.id
WHERE
(PD1.project_id = @Projectid) AND (dbo.LogicalLine.drawingmode LIKE '%Single Line%');
Select Id, BranchId from table t
Where not exists
(Select * from table
where id != t.Id
and BranchId = t.BranchId)
or
Select Min(Id) Id, BranchId
From table
Group By BranchId
Having count(*) = 1
EDIT: to modify as requested, add either a Group By ... Having clause or a Where Not Exists clause to your complete SQL query:
Select l.Id BranchId, [plus whatever else you have in your select clause]
FROM LogicalLine l
join Page p ON p.id = l.page_Id
join Branch b ON b.Id = l.branch_id
Group By l.branch_id, [Plus whatever else you have in Select clause]
Having count(*) = 1
or
Select l.Id BranchId, [plus whatever else you have in your select clause]
FROM LogicalLine l
join Page p on p.id = l.page_Id
join Branch b on b.Id = l.branch_id
Where not exists
(Select * from LogicalLine
where id != l.Id
and branch_id = l.branch_id)
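Both approaches can be checked against the T1 sample from the question. This is a minimal sketch using Python's sqlite3 module (an assumption; the question targets SQL Server), with the column renamed to branch_id for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (id INTEGER PRIMARY KEY, branch_id INTEGER)")
conn.executemany("INSERT INTO T1 VALUES (?, ?)",
                 [(1, 444), (2, 333), (3, 444), (4, 111), (5, 555), (6, 333)])

# Variant 1: keep a row only if no other row shares its branch_id.
not_exists_rows = conn.execute("""
    SELECT t.id, t.branch_id
    FROM T1 t
    WHERE NOT EXISTS (
        SELECT 1 FROM T1 o
        WHERE o.id <> t.id AND o.branch_id = t.branch_id)
    ORDER BY t.id
""").fetchall()

# Variant 2: group by branch_id and keep groups of exactly one row.
# MIN(id) is safe here because each surviving group holds a single id.
group_rows = conn.execute("""
    SELECT MIN(id) AS id, branch_id
    FROM T1
    GROUP BY branch_id
    HAVING COUNT(*) = 1
    ORDER BY MIN(id)
""").fetchall()

print(not_exists_rows)  # [(4, 111), (5, 555)]
print(group_rows)       # [(4, 111), (5, 555)]
```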

SELECT common entities only based on different corresponding entities

3 Tables
Client -
CID Name
1 Ana
2 Bana
3 Cana
ClientProgram (Bridge Table) -
CID PID
1 4
1 5
1 8
2 10
Program -
PID Program
4 X
5 Y
8 Z
10 G
Desired Output:
Name Program
Ana X
Ana Y
I want to extract only those Clients which are common/exist in different Programs I choose (say X and Y in this case)
Query attempt:
SELECT
C.Name
,P.Program
FROM ClientProgram CP
INNER JOIN Client C
ON CP.CID=C.CID
INNER JOIN Program P
ON CP.PID=P.PID
INNER JOIN ClientProgram CP1
ON CP.CID=CP1.CID
WHERE P.Program = 'X' OR P.Program = 'Y'
AND CP.CID = CP1.CID
This, however, pulls in all clients, not only those which exist in multiple programs.
;WITH cte AS (
SELECT
c.Name
,p.Program
,COUNT(*) OVER (PARTITION BY c.CID) as ProgramCount
FROM
Program p
INNER JOIN ClientProgram cp
ON p.PID = cp.PID
INNER JOIN Client c
On cp.CID = c.CID
WHERE
p.Program IN ('X','Y')
)
SELECT Name, Program
FROM
cte
WHERE
ProgramCount > 1
The use of COUNT(*) OVER will be a problem if PID is not unique in Program or if the combination of CID and PID in ClientProgram is not unique. However, I would assume uniqueness based on what I see.
If not, you can go a route like this:
;WITH cte AS (
SELECT
cp.CID
FROM
Program p
INNER JOIN ClientProgram cp
ON p.PID = cp.PID
WHERE
p.Program IN ('X','Y')
GROUP BY
cp.CID
HAVING
COUNT(DISTINCT p.PID) > 1
)
SELECT
c.Name
,p.Program
FROM
cte t
INNER JOIN Client c
ON t.CID = c.CID
INNER JOIN ClientProgram cp
ON t.CID = cp.CID
INNER JOIN Program p
ON cp.PID = p.PID
AND p.Program IN ('X','Y')
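The GROUP BY / HAVING route can be tried on the question's sample data. The sketch below uses Python's sqlite3 module and lower-case table names (both assumptions for illustration), and it adds one extra row, Bana in program X, which is not in the question, to show a client in only one of the chosen programs being filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (cid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE program (pid INTEGER PRIMARY KEY, program TEXT);
    CREATE TABLE client_program (cid INTEGER, pid INTEGER);
    INSERT INTO client VALUES (1, 'Ana'), (2, 'Bana'), (3, 'Cana');
    INSERT INTO program VALUES (4, 'X'), (5, 'Y'), (8, 'Z'), (10, 'G');
    -- (2, 4) is an extra row added for illustration: Bana is in X only.
    INSERT INTO client_program VALUES (1, 4), (1, 5), (1, 8), (2, 10), (2, 4);
""")

common_rows = conn.execute("""
    SELECT c.name, p.program
    FROM client c
    JOIN client_program cp ON cp.cid = c.cid
    JOIN program p ON p.pid = cp.pid
    WHERE p.program IN ('X', 'Y')
      AND c.cid IN (
          -- clients enrolled in both chosen programs
          SELECT cp2.cid
          FROM client_program cp2
          JOIN program p2 ON p2.pid = cp2.pid
          WHERE p2.program IN ('X', 'Y')
          GROUP BY cp2.cid
          HAVING COUNT(DISTINCT p2.program) = 2)
    ORDER BY c.name, p.program
""").fetchall()

print(common_rows)  # [('Ana', 'X'), ('Ana', 'Y')]
```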
This is kind of a roundabout way of doing it. There is probably a better way, but this will do it. I threw in scripts for temp tables in case someone else wants to improve on it. You could use a temp table instead of the CTE, for example.
create table #client(cid int,name varchar(20))
create table #clientprogram (cid int, pid int)
create table #program( pid int, program varchar(20))
insert into #client
values(1,'Ana')
,(2,'Bana')
,(3,'Cana')
insert into #clientprogram
values (1,4)
,(1,5)
,(1,8)
,(2,10)
,(2,4)
insert into #program
values (4,'x')
,(5,'y')
,(8,'z')
,(10,'g')
;WITH CHECKPLEASE AS (
Select c.Name,ISNULL(p.Program,p2.PRogram) Program
from #client c
inner join #clientprogram cp
on c.CID = cp.CID
left join #program p
on cp.PID = p.PID
and p.PRogram = 'X'
left join #program p2
on cp.PID = p2.PID
and p2.Program = 'Y'
where ISNULL(p.Program,p2.PRogram) is not null
)
Select *
From CHECKPLEASE
where Name in (
SELECT Name
From CHECKPLEASE
group by Name
having COUNT(*) > 1)

SELECT one entry with two left joins. SQL

Example
Given two products
id name
1 product A
2 product B
And for each product I have attributes
id product_id value
1 1 1
2 1 2
3 2 3
4 2 4
And I need to select products by value of attributes.
I need products which have attributes with 1 AND 2 values.
This query doesn't work:
SELECT *
FROM product
LEFT JOIN attribute ON product.id = attribute.product_id
WHERE attribute.value = 1 AND attribute.value = 2;
Do a group by to find product id's with both 1 and 2 attributes. Select from products where product id found by that group by:
SELECT *
FROM product_table
WHERE id IN (select product_id
from attribute_table
where value in (1,2)
group by product_id
having count(distinct value) = 2)
Alternative solution, double join:
SELECT *
FROM product_table
JOIN attribute_table a1 ON product_table.id = a1.product_id
AND a1.value = 1
JOIN attribute_table a2 ON product_table.id = a2.product_id
AND a2.value = 2
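Both answers above can be compared side by side on the question's sample data. This is a small sketch using Python's sqlite3 module (an assumption; the question does not say which database is used), with the table names product and attribute taken from the question's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, product_id INTEGER, value INTEGER);
    INSERT INTO product VALUES (1, 'product A'), (2, 'product B');
    INSERT INTO attribute VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 2, 4);
""")

# GROUP BY variant: product ids that carry both values 1 and 2.
by_group = conn.execute("""
    SELECT p.id, p.name
    FROM product p
    WHERE p.id IN (
        SELECT product_id
        FROM attribute
        WHERE value IN (1, 2)
        GROUP BY product_id
        HAVING COUNT(DISTINCT value) = 2)
""").fetchall()

# Double-join variant: one join per required value.
by_join = conn.execute("""
    SELECT p.id, p.name
    FROM product p
    JOIN attribute a1 ON a1.product_id = p.id AND a1.value = 1
    JOIN attribute a2 ON a2.product_id = p.id AND a2.value = 2
""").fetchall()

print(by_group)  # [(1, 'product A')]
print(by_join)   # [(1, 'product A')]
```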
SELECT *
FROM product p
LEFT JOIN attribute a ON p.id = a.product_id
WHERE a.value IN ('1','2')
To rephrase your question: you need the products that have both 1 and 2 among the values of their attributes:
SELECT product.*
-- , array_agg(attribute.value) attribute_values
-- uncomment the line above, if needed
FROM product
LEFT JOIN attribute ON product.id = attribute.product_id
GROUP BY product.id
HAVING array_agg(attribute.value) @> ARRAY[1, 2];
If you mean values 1 OR 2
SELECT *
FROM product p
LEFT JOIN attribute a ON p.id = a.product_id
WHERE a.value IN ('1', '2');

selecting the max values based on a count

How can I retrieve the max of each ValueCount based on the FirmID? I need the data to be output like so.
My code is below:
SELECT
F.FirmID,
F.Name,
DL.ValueId,
DL.ValueName,
count(DL.ValueName) AS ValueCount
FROM
dbo.Jobs AS J
INNER JOIN DimensionValues AS DV ON
DV.CrossRef = J.JobId
INNER JOIN dbo.DimensionLists AS DL ON
DV.ValueId = DL.ValueId
INNER JOIN Firms AS F ON
F.FirmId = J.ClientFirmId
WHERE
DL.DimensionId = 4
GROUP BY
F.FirmID,
F.Name,
DL.ValueName,
DL.ValueId
This produces something like:
firmid | value | count
1 1 5
1 2 10
2 3 1
2 1 6
I need to return the records with counts 10 and 6.
EDIT : SQL 2005 answer deleted.
Then you could push your results into a temporary table (or table variable) and do something like this...
SELECT
*
FROM
TempTable
WHERE
ValueCount = (SELECT MAX(ValueCount) FROM TempTable AS Lookup WHERE FirmID = TempTable.FirmID)
Or...
SELECT
*
FROM
TempTable
INNER JOIN
(SELECT FirmID, MAX(ValueCount) AS ValueCount FROM TempTable GROUP BY FirmID) AS lookup
ON lookup.FirmID = TempTable.FirmID
AND lookup.ValueCount = TempTable.ValueCount
These will give multiple records if any ValueCount is tied with another for the same FirmID. As such, you could try this...
SELECT
*
FROM
TempTable
WHERE
value = (
SELECT TOP 1
value
FROM
TempTable as lookup
WHERE
FirmID = TempTable.FirmID
ORDER BY
ValueCount DESC
)
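The difference between the MAX-join and the top-one lookup shows up only when counts tie. The sketch below uses Python's sqlite3 module with a hypothetical temp_table holding the question's four sample rows plus a tied pair for firm 3 (the tied rows and the snake_case column names are assumptions). SQLite has no TOP, so LIMIT 1 stands in for it, and value_id is added to the ORDER BY to make the tie-break deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_table (firm_id INTEGER, value_id INTEGER, value_count INTEGER)")
conn.executemany("INSERT INTO temp_table VALUES (?, ?, ?)", [
    (1, 1, 5), (1, 2, 10), (2, 3, 1), (2, 1, 6),  # rows from the question
    (3, 7, 4), (3, 8, 4),                         # hypothetical tie for firm 3
])

# Join against the per-firm maximum: ties produce several rows per firm.
max_rows = conn.execute("""
    SELECT t.firm_id, t.value_id, t.value_count
    FROM temp_table t
    JOIN (SELECT firm_id, MAX(value_count) AS value_count
          FROM temp_table
          GROUP BY firm_id) m
      ON m.firm_id = t.firm_id AND m.value_count = t.value_count
    ORDER BY t.firm_id, t.value_id
""").fetchall()

# Pick a single value per firm: exactly one row per firm, even on ties.
top1_rows = conn.execute("""
    SELECT t.firm_id, t.value_id, t.value_count
    FROM temp_table t
    WHERE t.value_id = (
        SELECT l.value_id
        FROM temp_table l
        WHERE l.firm_id = t.firm_id
        ORDER BY l.value_count DESC, l.value_id
        LIMIT 1)
    ORDER BY t.firm_id
""").fetchall()

print(max_rows)   # firm 3 appears twice
print(top1_rows)  # firm 3 appears once
```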
For this problem you need to produce the result set of the query in order to determine the max ValueCount, then run the query again to pull just the records with that max ValueCount. You can do this in many ways, such as repeating the main query as subqueries or, in SQL Server 2005/2008, using a CTE. I think the subqueries get a little messy and would prefer the CTE, but SQL Server 2000 does not offer that option, so I've used a temp table instead. I run the query once to get MaxValueCount per firm and save that into a temp table, then run the query again and join against the temp table to get just the records with MaxValueCount.
create table #tempMax
(
FirmID int,
MaxValueCount int
)
insert #tempMax
SELECT t.FirmID, MAX(t.ValueCount) AS MaxValueCount
FROM (
SELECT F.FirmID, F.Name, DL.ValueId, DL.ValueName
, count(DL.ValueName) AS ValueCount
FROM dbo.Jobs AS J
INNER JOIN DimensionValues AS DV ON DV.CrossRef = J.JobId
INNER JOIN dbo.DimensionLists AS DL ON DV.ValueId = DL.ValueId
INNER JOIN Firms AS F ON F.FirmId = J.ClientFirmId
WHERE DL.DimensionId = 4
GROUP BY F.FirmID, F.Name, DL.ValueName, DL.ValueId) t
GROUP BY t.FirmID
SELECT t.FirmID, t.Name, t.ValueID, t.ValueName, t.ValueCount
FROM (
SELECT F.FirmID, F.Name, DL.ValueId, DL.ValueName
, count(DL.ValueName) AS ValueCount
FROM dbo.Jobs AS J
INNER JOIN DimensionValues AS DV ON DV.CrossRef = J.JobId
INNER JOIN dbo.DimensionLists AS DL ON DV.ValueId = DL.ValueId
INNER JOIN Firms AS F ON F.FirmId = J.ClientFirmId
WHERE DL.DimensionId = 4
GROUP BY F.FirmID, F.Name, DL.ValueName, DL.ValueId) t
INNER JOIN #tempMax m ON t.FirmID = m.FirmID and t.ValueCount = m.MaxValueCount
DROP TABLE #tempMax
You should be able to use a derived table for this:
SELECT F.FirmID,
F.Name,
DL.ValueId,
DL.ValueName,
T.ValueCount
FROM Jobs J
INNER JOIN DimensionValues DV
ON DV.Crossref = J.JobID
INNER JOIN DimensionList DL
ON DV.ValueID = DL.ValueID
INNER JOIN Firms F
ON F.FirmID = J.ClientFirmID
--derived table
INNER JOIN (SELECT FirmID, MAX(ValueName) ValueCount FROM DimensionList GROUP BY FirmID) T
ON T.FirmID = F.FirmID
WHERE DL.DimensionId = 4
Here TBL1 and TBL2 both stand for your query:
SELECT *
FROM TBL1
WHERE
TBL1.ValueCount = (SELECT MAX(TBL2.ValueCount) FROM TBL2 WHERE TBL2.FIRMID = TBL1.FIRMID)