SQL Elaborate Joins Query - sql

I'm trying to solve the below problem.
I feel like it is possible, but I can't seem to get it.
Here's the scenario:
Table 1 (Assets)
1 Asset-A
2 Asset-B
3 Asset-C
4 Asset-D
Table 2 (Attributes)
1 Asset-A Red
2 Asset-A Hard
3 Asset-B Red
4 Asset-B Hard
5 Asset-B Heavy
6 Asset-C Blue
7 Asset-C Hard
If I am looking for assets that have the same attributes as Asset-A, the query should identify Asset-B, since Asset-B has every attribute that Asset-A has (Heavy should be ignored, since Asset-A specifies nothing that conflicts with it). Also, how would I get only the attributes that Asset-A and Asset-B have in common?
Seems simple, but I can't nail it...
The actual table I am using is almost exactly Table 2: simply an association of an AssetId and an AttributeId, so:
PK: Id
int: AssetId
int: AttributeId
I only included the idea of the asset table to simplify the question.

SELECT ato.id, ato.value
FROM (
SELECT id
FROM assets a
WHERE NOT EXISTS
(
SELECT NULL
FROM attributes ata
LEFT JOIN
attributes ato
ON ato.id = a.id
AND ato.value = ata.value
WHERE ata.id = 1
AND ato.id IS NULL
)
) ao
JOIN attributes ato
ON ato.id = ao.id
JOIN attributes ata
ON ata.id = 1
AND ata.value = ato.value
Or, in SQL Server 2005 (with sample data to check):
WITH assets AS
(
SELECT 1 AS id, 'A' AS name
UNION ALL
SELECT 2 AS id, 'B' AS name
UNION ALL
SELECT 3 AS id, 'C' AS name
UNION ALL
SELECT 4 AS id, 'D' AS name
),
attributes AS
(
SELECT 1 AS id, 'Red' AS value
UNION ALL
SELECT 1 AS id, 'Hard' AS value
UNION ALL
SELECT 2 AS id, 'Red' AS value
UNION ALL
SELECT 2 AS id, 'Hard' AS value
UNION ALL
SELECT 2 AS id, 'Heavy' AS value
UNION ALL
SELECT 3 AS id, 'Blue' AS value
UNION ALL
SELECT 3 AS id, 'Hard' AS value
)
SELECT ato.id, ato.value
FROM (
SELECT id
FROM assets a
WHERE a.id <> 1
AND NOT EXISTS
(
SELECT ata.value
FROM attributes ata
WHERE ata.id = 1
EXCEPT
SELECT ato.value
FROM attributes ato
WHERE ato.id = a.id
)
) ao
JOIN attributes ato
ON ato.id = ao.id
JOIN attributes ata
ON ata.id = 1
AND ata.value = ato.value

I don't completely understand the first part of your question, identifying assets based on their attributes.
Making some assumptions about column names, the following query would yield the common attributes between Asset-A and Asset-B:
SELECT ta.Name
FROM [Table 2] ta
JOIN [Table 1] a ON a.ID = ta.AssetID AND a.Name = 'Asset-A'
JOIN [Table 2] tb ON tb.Name = ta.Name
JOIN [Table 1] b ON b.ID = tb.AssetID AND b.Name = 'Asset-B'
GROUP BY ta.Name

Select * From Assets A
Where A.AssetId <> 'Asset-A'
And (Select Count(*)
From Attributes At1
Where At1.AssetId = 'Asset-A'
And Not Exists
(Select * From Attributes At2
Where At2.AssetId = A.AssetId
And At2.Attribute = At1.Attribute)) = 0

select at2.asset, count(*)
from attribute at1
inner join attribute at2 on at1.value = at2.value
where at1.asset = 'Asset-A'
and at2.asset != 'Asset-A'
group by at2.asset
having count(*) = (select count(*) from attribute where asset = 'Asset-A');

Find all assets that have every attribute that "A" has (but may also have additional attributes):
SELECT Other.ID
FROM Assets Other
WHERE
Other.ID <> 'Asset-A' -- do not return Asset A as a match to itself
AND NOT EXISTS (SELECT NULL FROM Attributes AttA WHERE
AttA.AssetID='Asset-A'
AND NOT EXISTS (SELECT NULL FROM Attributes AttOther WHERE
AttOther.AssetID=Other.ID AND AttOther.AttributeID = AttA.AttributeID
)
)
I.e., "find any asset where there is no attribute of A that is not also an attribute of this asset".
Find all assets that have exactly the same attributes as "A":
SELECT Other.ID
FROM Assets Other
WHERE
Other.ID <> 'Asset-A' -- do not return Asset A as a match to itself
AND NOT EXISTS (SELECT NULL FROM Attributes AttA WHERE
AttA.AssetID='Asset-A'
AND NOT EXISTS (SELECT NULL FROM Attributes AttOther WHERE
AttOther.AssetID=Other.ID
AND AttOther.AttributeID = AttA.AttributeID
)
)
AND NOT EXISTS (SELECT NULL FROM Attributes AttaOther WHERE
AttaOther.AssetID=Other.ID
AND NOT EXISTS (SELECT NULL FROM Attributes AttaA WHERE
AttaA.AssetID='Asset-A'
AND AttaA.AttributeID = AttaOther.AttributeID
)
)
I.e., "find any asset where there is no attribute of A that is not also an attribute of this asset, and where there is no attribute of this asset that is not also an attribute of A."
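To check the double NOT EXISTS logic against the sample data from the question, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. It assumes a single Attributes table with text AssetID and AttributeID columns (the question says the real table uses integer ids), and the helper names supersets_of and exact_matches_of are made up for illustration.

```python
import sqlite3

# Sample data from the question; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attributes (AssetID TEXT, AttributeID TEXT);
    INSERT INTO Attributes VALUES
        ('Asset-A', 'Red'),  ('Asset-A', 'Hard'),
        ('Asset-B', 'Red'),  ('Asset-B', 'Hard'), ('Asset-B', 'Heavy'),
        ('Asset-C', 'Blue'), ('Asset-C', 'Hard');
""")

# "No attribute of the target is missing from this asset."
SUPERSET_SQL = """
    SELECT DISTINCT Other.AssetID
    FROM Attributes Other
    WHERE Other.AssetID <> :target
      AND NOT EXISTS (
            SELECT 1 FROM Attributes AttA
            WHERE AttA.AssetID = :target
              AND NOT EXISTS (
                    SELECT 1 FROM Attributes AttOther
                    WHERE AttOther.AssetID = Other.AssetID
                      AND AttOther.AttributeID = AttA.AttributeID))
    ORDER BY Other.AssetID
"""


def supersets_of(target):
    """Assets that have every attribute of `target` (extras allowed)."""
    rows = conn.execute(SUPERSET_SQL, {"target": target}).fetchall()
    return [r[0] for r in rows]


def exact_matches_of(target):
    """Assets whose attribute set equals that of `target`."""
    # Equality of sets is containment in both directions.
    return [a for a in supersets_of(target) if target in supersets_of(a)]
```

Calling supersets_of('Asset-A') on this data finds Asset-B only, while exact_matches_of('Asset-A') finds nothing because Asset-B also has Heavy. Note that an asset with no attributes at all never appears here, since the sketch reads only the association table.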

This solution works as prescribed, thanks for the input.
WITH Atts AS
(
SELECT
DISTINCT
at1.[Attribute]
FROM
Attribute at1
WHERE
at1.[Asset] = 'Asset-A'
)
SELECT
DISTINCT
Asset,
(
SELECT
COUNT(ta2.[Attribute])
FROM
Attribute ta2
INNER JOIN
Atts b
ON
b.[Attribute] = ta2.[attribute]
WHERE
ta2.[Asset] = ta.Asset
)
AS [Count]
FROM
Atts a
INNER JOIN
Attribute ta
ON
a.[Attribute] = ta.[Attribute]

Find all assets that have all the same attributes as asset-a:
select att2.Asset from attribute att1
inner join attribute att2 on att2.Attribute = att1.Attribute and att1.Asset <> att2.Asset
where att1.Asset = 'Asset-A'
group by att2.Asset, att1.Asset
having COUNT(*) = (select COUNT(*) from attribute where Asset=att1.Asset)

I thought maybe I can do this with LINQ and then work my way backwards with:
var result = from productsNotA in DevProducts
where productsNotA.Product != "A" &&
!(
from productsA in DevProducts
where productsA.Product == "A"
select productsA.Attribute
).Except
(
from productOther in DevProducts
where productOther.Product == productsNotA.Product
select productOther.Attribute
).Any() // Single() would throw on an empty or multi-element result
select new {productsNotA.Product};
result.Distinct()
I thought that translating this back to SQL with LinqPad would result in a pretty SQL query. However, it didn't :). DevProducts is my test table with the columns Product and Attribute. I thought I'd post the LINQ query anyway; it might be useful to people who are playing around with LINQ.
If you can optimize the LINQ query above, please let me know (it might result in better SQL ;))

I'm using following DDL
CREATE TABLE Attributes (
Asset VARCHAR(100)
, Name VARCHAR(100)
, UNIQUE(Asset, Name)
)
Second question is easy
SELECT Name
FROM Attributes
WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
AND Asset = 'B'
First question is not more difficult
SELECT Asset
FROM Attributes
WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
GROUP BY Asset
HAVING COUNT(*) = (SELECT COUNT(*) FROM Attributes WHERE Asset = 'A')
Edit:
I left AND Asset != 'A' out of the WHERE clause of the second snippet for brevity
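Here is a small sketch that runs both queries against the question's sample data, using SQLite via Python's sqlite3 module as a stand-in engine. The asset names 'A', 'B', 'C' and the variable names are assumptions for illustration. The counting approach relies on the UNIQUE(Asset, Name) constraint: with duplicate rows, COUNT(*) would no longer count distinct attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attributes (
        Asset VARCHAR(100),
        Name  VARCHAR(100),
        UNIQUE (Asset, Name)
    );
    INSERT INTO Attributes VALUES
        ('A', 'Red'),  ('A', 'Hard'),
        ('B', 'Red'),  ('B', 'Hard'), ('B', 'Heavy'),
        ('C', 'Blue'), ('C', 'Hard');
""")

# Attributes that A and B have in common.
common = [r[0] for r in conn.execute("""
    SELECT Name FROM Attributes
    WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
      AND Asset = 'B'
    ORDER BY Name
""")]

# Assets other than A that carry every attribute of A:
# keep only the rows whose attribute A also has, then require that
# the number of such rows equals the number of attributes A has.
matches = [r[0] for r in conn.execute("""
    SELECT Asset FROM Attributes
    WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
      AND Asset <> 'A'
    GROUP BY Asset
    HAVING COUNT(*) = (SELECT COUNT(*) FROM Attributes WHERE Asset = 'A')
    ORDER BY Asset
""")]
```

On this data, common holds Hard and Red, and matches holds only B; C shares just one of A's two attributes, so its count falls short.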

Related

Conditional IN Statement to be used inside Postgres function

I am working on Postgres and I have two tables vehicles and vehicles_flag. There are no relations between the two tables and hence we can not join two tables to fetch the required data.
The table structure is below (vehicle_flag table may not contain all the id present in the vehicle table) :
[Table structure]
I am writing a function that will accept multiple input parameters. I have to select the vehicle id from the vehicle_flag table only if the flag value is true; otherwise, I have to ignore the vehicle_flag table. My aim is to achieve something like this, but it turns out the CASE expression expects scalar output:
select count(id) from vehicles
where
vehicles.id in (case
when #hasbluetooth =1 then (select distinct id from vehicle_flags where flag='bluetooth' and value = '1')
else
(select distinct id from vehicles)
end)
and
vehicles.id in (case
when #hasac =1 then (select distinct id from vehicle_flags where flag='ac' and value = '1')
else
(select distinct id from vehicles)
end)
Kindly suggest any solution to achieve this.
I suspect you want:
select v.*
from vehicles v
left join vehicle_flags vf on vf.id = v.id
group by v.id
having
(#hasbluetooth = 0 or bool_or(vf.flag = 'bluetooth' and vf.value = 1))
and (#hasac = 0 or bool_or(vf.flag = 'ac' and vf.value = 1))
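The "parameter is off OR the flag is present" pattern can be tried out without a Postgres server. The sketch below is an approximation in SQLite via Python's sqlite3 module: SQLite has no bool_or, so MAX over a 0/1 condition is swapped in, wrapped in COALESCE for vehicles with no flag rows. The sample rows, the integer type of value, and the function name matching_vehicles are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicles (id INTEGER PRIMARY KEY);
    CREATE TABLE vehicle_flags (id INTEGER, flag TEXT, value INTEGER);
    INSERT INTO vehicles VALUES (1), (2), (3);
    INSERT INTO vehicle_flags VALUES
        (1, 'bluetooth', 1), (1, 'ac', 1),
        (2, 'bluetooth', 1), (2, 'ac', 0);
""")

# MAX(condition) plays the role of Postgres' bool_or(condition).
QUERY = """
    SELECT v.id
    FROM vehicles v
    LEFT JOIN vehicle_flags vf ON vf.id = v.id
    GROUP BY v.id
    HAVING (:hasbluetooth = 0
            OR COALESCE(MAX(vf.flag = 'bluetooth' AND vf.value = 1), 0) = 1)
       AND (:hasac = 0
            OR COALESCE(MAX(vf.flag = 'ac' AND vf.value = 1), 0) = 1)
    ORDER BY v.id
"""


def matching_vehicles(hasbluetooth, hasac):
    """Return vehicle ids; a parameter of 0 disables that filter."""
    params = {"hasbluetooth": hasbluetooth, "hasac": hasac}
    return [r[0] for r in conn.execute(QUERY, params)]
```

With both parameters 0, every vehicle is returned, including vehicle 3, which has no rows in vehicle_flags. That is the reason for the LEFT JOIN rather than an inner join.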

check if row exists with specific value

I have 2 tables task and taskattributes. There is a linking between 2 tables with taskid. Each taskid has multiple attributes represented by key,value.
I would like to find out whether a specific key exists for a task.
For example, I want to find all the tasks that do not have key 'A'.
Use a correlated subquery with NOT EXISTS:
select a.taskid, b.key, b.value
from task a inner join taskattributes b on a.taskid=b.taskid
where not exists
(select 1 from taskattributes c where c.taskid=b.taskid and c.key='A')
With not exists:
select *
from task t
where not exists (
select 1 from taskattributes
where taskid = t.taskid and key = 'A'
)
One simple solution uses aggregation:
SELECT
t.taskid,
t.name
FROM task t
INNER JOIN taskattributes ta
ON t.taskid = ta.taskid
GROUP BY
t.taskid,
t.name
HAVING
COUNT(CASE WHEN "key" = 'A' THEN 1 END) = 0;
If you are using Postgres 9.4 or later, you may use FILTER in the HAVING clause:
HAVING COUNT(*) FILTER (WHERE "key" = 'A') = 0
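One difference between the approaches is worth checking: the aggregation query uses an INNER JOIN, so a task with no attribute rows at all never reaches the HAVING clause, whereas NOT EXISTS on the task table still reports it. A minimal sketch in SQLite via Python's sqlite3 module, with made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (taskid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE taskattributes (taskid INTEGER, "key" TEXT, value TEXT);
    INSERT INTO task VALUES (1, 't1'), (2, 't2'), (3, 't3');
    -- Task 3 deliberately has no attribute rows.
    INSERT INTO taskattributes VALUES
        (1, 'A', 'x'), (1, 'B', 'y'),
        (2, 'B', 'z');
""")

# NOT EXISTS: starts from task, so attribute-less tasks are included.
not_exists = [r[0] for r in conn.execute("""
    SELECT t.taskid FROM task t
    WHERE NOT EXISTS (SELECT 1 FROM taskattributes ta
                      WHERE ta.taskid = t.taskid AND ta."key" = 'A')
    ORDER BY t.taskid
""")]

# Aggregation over an INNER JOIN: attribute-less tasks drop out.
aggregated = [r[0] for r in conn.execute("""
    SELECT t.taskid FROM task t
    INNER JOIN taskattributes ta ON ta.taskid = t.taskid
    GROUP BY t.taskid
    HAVING COUNT(CASE WHEN ta."key" = 'A' THEN 1 END) = 0
    ORDER BY t.taskid
""")]
```

Here not_exists contains tasks 2 and 3, while aggregated contains only task 2. Switching the aggregation query to a LEFT JOIN would make the two agree.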

Join three rows if the same value in one column

There is a Postgres database and the table has three columns. The data structure is in external system so I can not modify it.
Every object is represented by three rows (identified by column element_id - rows with the same value in this column represents the same object), for example:
key value element_id
-----------------------------------
status active 1
name exampleNameAAA 1
city exampleCityAAA 1
status inactive 2
name exampleNameBBB 2
city exampleCityBBB 2
status inactive 3
name exampleNameCCC 3
city exampleCityCCC 3
I want to get all values describing every object (name, status and city).
For this example the output should be like:
exampleNameAAA | active | exampleCityAAA
exampleNameBBB | inactive | exampleCityBBB
exampleNameCCC | inactive | exampleCityCCC
I know how to join two rows:
select a.value as name,
b.value as status
from the_table a
join the_table b
on a.element_id = b.element_id
and b."key" = 'status'
where a."key" = 'name';
How can I join all three rows?
You can try below
select a.value as name,
b.value as status,c.value as city
from t1 a
join t1 b
on a.element_id = b.element_id and b."keys" = 'status'
join t1 c on a.element_id = c.element_id and c."keys" = 'city'
where a."keys" = 'name';
OUTPUT
name status city
exampleNameAAA active exampleCityAAA
exampleNameBBB inactive exampleCityBBB
exampleNameCCC inactive exampleCityCCC
One option is to simply add another join for each value you need (this is one of the big disadvantages of the EAV (anti-)pattern you are using):
select a.value as name,
b.value as status,
c.value as city
from the_table a
join the_table b on a.element_id = b.element_id and b."key" = 'status'
join the_table c on a.element_id = c.element_id and c."key" = 'city'
where a."key" = 'name';
Another option is to aggregate all key/value pairs for an element into a JSON then you can easily access each one without additional joins:
select t.element_id,
t.obj ->> 'city' as city,
t.obj ->> 'status' as status,
t.obj ->> 'name' as name
from (
select e.element_id, jsonb_object_agg("key", value) as obj
from the_table e
group by e.element_id
) t;
If the table is really big, this might be a lot slower than the join version because of the aggregation step. If you limit the query to only some elements (e.g. by adding a where element_id = 1 or where element_id in (1,2,3)), it should be quite fast.
It has the advantage that you always have all key/value pairs for each element_id available, regardless of which ones you need. The inner select could be put into a view to make things easier.
Online example: https://rextester.com/MSZOWU37182
Seems like you want to PIVOT
One way to do that is via conditional aggregation.
select
-- t.element_id,
max(case when t.key = 'name' then t.value end) as name,
max(case when t.key = 'status' then t.value end) as status,
max(case when t.key = 'city' then t.value end) as city
from the_table t
group by t.element_id;
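The conditional-aggregation pivot is portable enough to try in SQLite. The sketch below loads the question's sample rows through Python's sqlite3 module and runs the same MAX(CASE ...) query; only the variable name pivot is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table ("key" TEXT, value TEXT, element_id INTEGER);
    INSERT INTO the_table VALUES
        ('status', 'active',         1),
        ('name',   'exampleNameAAA', 1),
        ('city',   'exampleCityAAA', 1),
        ('status', 'inactive',       2),
        ('name',   'exampleNameBBB', 2),
        ('city',   'exampleCityBBB', 2),
        ('status', 'inactive',       3),
        ('name',   'exampleNameCCC', 3),
        ('city',   'exampleCityCCC', 3);
""")

# Each CASE yields the value for one key and NULL otherwise;
# MAX ignores the NULLs, leaving one value per column per element.
pivot = conn.execute("""
    SELECT MAX(CASE WHEN t."key" = 'name'   THEN t.value END) AS name,
           MAX(CASE WHEN t."key" = 'status' THEN t.value END) AS status,
           MAX(CASE WHEN t."key" = 'city'   THEN t.value END) AS city
    FROM the_table t
    GROUP BY t.element_id
    ORDER BY t.element_id
""").fetchall()

for row in pivot:
    print(" | ".join(row))
```

If an element lacks one of the keys, its column simply comes back as NULL (None in Python, which the print loop above does not handle), whereas the inner-join version would drop the whole element.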
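Or use crosstab (from the tablefunc extension). The two-argument form lists the categories explicitly, so each value lands in the intended column:
select
-- element_id,
name,
status,
city
from crosstab (
'select t.element_id, t.key, t.value
from the_table t
order by 1',
$$values ('name'), ('status'), ('city')$$
) as ct (element_id int, name varchar(30), status varchar(30), city varchar(30));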
But if you do like those joins, here's a way
select
-- el.element_id,
nm.value as name,
st.value as status,
ci.value as city
from
(
select distinct t.element_id
from the_table t
where t.key in ('name','status','city')
) as el
left join the_table as nm on (nm.element_id = el.element_id and nm.key = 'name')
left join the_table as st on (st.element_id = el.element_id and st.key = 'status')
left join the_table as ci on (ci.element_id = el.element_id and ci.key = 'city');

How to get a single record in a one to many relation based on the value of this record

I have two tables, one contains a list of projects, another contains requests for this project. I would like to get the project number and the requests record. The status can be: Red, Yellow, Green, Open
How can I make sure that the one status record shown follows this logic: when there is a Red one, show the Red one; when there is no Red one but there is a Yellow one, show the Yellow one; and so on?
;WITH
numberTest as(
SELECT dbo.ServiceRequest.ID as numId,
ROW_NUMBER() OVER (PARTITION BY dbo.ServiceRequest.Project_ID order by Project_ID) AS RN1
FROM dbo.ServiceRequest
),
CTEVrequest AS
(
SELECT dbo.ServiceRequest.ID, dbo.ServiceRequest.Project_ID
FROM dbo.ServiceRequest
LEFT JOIN numberTest ON numberTest.numId = dbo.ServiceRequest.ID
WHERE numberTest.RN1 = 1
AND
dbo.ServiceRequest.ID = CASE
WHEN EXISTS(
select srvReq.ID
from dbo.ServiceRequest as srvReq
where requestStatus.ServiceStatus = 'R' AND srvReq.Project_ID = dbo.ServiceRequest.Project_ID)
THEN (select srvReq.ID
from dbo.ServiceRequest as srvReq
where requestStatus.ServiceStatus = 'R' AND srvReq.Project_ID = dbo.ServiceRequest.Project_ID)
WHEN EXISTS(
select srvReq.ID
from dbo.ServiceRequest as srvReq
where requestStatus.ServiceStatus = 'Y' AND srvReq.Project_ID = dbo.ServiceRequest.Project_ID)
THEN (select srvReq.ID
from dbo.ServiceRequest as srvReq
where requestStatus.ServiceStatus = 'Y' AND srvReq.Project_ID = dbo.ServiceRequest.Project_ID)
END)
SELECT DISTINCT
dbo.Project.ProjectNumber,
dbo.Project.ID,
CTEVrequest.ServiceReqStatus,
CTEVrequest.ServiceStatus
FROM dbo.Project
LEFT JOIN CTEVrequest ON CTEVrequest.Project_ID = dbo.Project.ID
LEFT JOIN dbo.Project ON dbo.ServiceRequest.Project_ID = dbo.Project.ID
The problem is I get the "Subquery returned more than 1 value" error and I have no clue how to make the result check if there exists a record with Red and if not select the one with Yellow and so on.
Probably I did not understand you right, but looks like your query is supposed to do this:
SELECT *
FROM dbo.Project p
OUTER APPLY
(
SELECT TOP 1 sr.*
FROM dbo.ServiceRequest sr
WHERE sr.Project_ID = p.ID
/* AND sr.ServiceStatus in ('R', 'Y') */
ORDER BY
CASE
WHEN sr.ServiceStatus = 'R' THEN 1
WHEN sr.ServiceStatus = 'Y' THEN 2
ELSE 3
END
) sr
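OUTER APPLY with TOP 1 is specific to SQL Server. As a rough way to experiment with the same "best row per project" idea elsewhere, the sketch below uses SQLite via Python's sqlite3 module and swaps in a correlated subquery with LIMIT 1 inside a LEFT JOIN. The sample rows, the status codes 'G' and 'O', and the tie-break on the request ID are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project (ID INTEGER PRIMARY KEY, ProjectNumber TEXT);
    CREATE TABLE ServiceRequest (
        ID INTEGER PRIMARY KEY,
        Project_ID INTEGER,
        ServiceStatus TEXT
    );
    INSERT INTO Project VALUES (1, 'P-1'), (2, 'P-2'), (3, 'P-3');
    INSERT INTO ServiceRequest VALUES
        (10, 1, 'G'), (11, 1, 'R'), (12, 1, 'Y'),
        (20, 2, 'O'), (21, 2, 'Y');
""")

# For each project, pick the single request with the most urgent status.
rows = conn.execute("""
    SELECT p.ProjectNumber, sr.ID, sr.ServiceStatus
    FROM Project p
    LEFT JOIN ServiceRequest sr
           ON sr.ID = (SELECT s.ID
                       FROM ServiceRequest s
                       WHERE s.Project_ID = p.ID
                       ORDER BY CASE s.ServiceStatus
                                    WHEN 'R' THEN 1
                                    WHEN 'Y' THEN 2
                                    WHEN 'G' THEN 3
                                    ELSE 4
                                END,
                                s.ID          -- deterministic tie-break
                       LIMIT 1)
    ORDER BY p.ID
""").fetchall()
```

Project P-1 resolves to its Red request, P-2 to its Yellow one, and P-3, which has no requests, is still listed with NULLs. Because only one row is ever selected per project, the "Subquery returned more than 1 value" error cannot occur.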
Why don't you create a table that functions as a key for the statuses? For example:
IF OBJECT_ID('tempdb..#LevelKey') IS NOT NULL
DROP TABLE #LevelKey;
CREATE TABLE #LevelKey
(LevelText [varchar](255),
LevelValue INT);
INSERT INTO #LevelKey VALUES
('Red',1),
('Orange',2),
('Yellow',3),
('Green',4);
Now, join the project level to this key. When you want to grab the highest rank, you could do something like:
SELECT TOP 1 *
FROM Table
INNER JOIN #LevelKey ON LevelText = Whatever
ORDER BY LevelValue ASC
But that only works for one. So I'm not even sure why I included that. To bring back a group, you're going to have to do something like this:
SELECT ProjectID, MIN(LevelValue) AS LevelValue -- lowest value = highest rank (Red = 1)
FROM Table
INNER JOIN #LevelKey ON LevelText = Whatever
GROUP BY ProjectID
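A sketch of the key-table idea end to end, in SQLite via Python's sqlite3 module. The Requests table and its columns are hypothetical stand-ins for the placeholders above. Since Red is ranked 1 in the key table, the most urgent status per project is the one with the smallest LevelValue, and a second join translates that number back into its text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LevelKey (LevelText TEXT, LevelValue INTEGER);
    INSERT INTO LevelKey VALUES
        ('Red', 1), ('Orange', 2), ('Yellow', 3), ('Green', 4);

    -- Hypothetical request table: one row per request status.
    CREATE TABLE Requests (ProjectID INTEGER, Status TEXT);
    INSERT INTO Requests VALUES
        (1, 'Green'), (1, 'Red'), (1, 'Yellow'),
        (2, 'Green'), (2, 'Yellow');
""")

rows = conn.execute("""
    SELECT worst.ProjectID, k.LevelText
    FROM (SELECT r.ProjectID, MIN(k.LevelValue) AS LevelValue
          FROM Requests r
          JOIN LevelKey k ON k.LevelText = r.Status
          GROUP BY r.ProjectID) AS worst
    JOIN LevelKey k ON k.LevelValue = worst.LevelValue
    ORDER BY worst.ProjectID
""").fetchall()
```

This returns Red for project 1 and Yellow for project 2. It yields the status only; fetching the full request row would need a further join back to the request table.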

How to join three tables with distinct

I'm trying to join three tables to pull back a list of distinct blog posts with associated assets (images etc.), but I keep coming a cropper. The three tables are tblBlog, tblAssetLink and tblAssets. The Blog table holds the blog posts, the Assets table holds the assets, and the AssetLink table links the two together.
tblBlog.BID is the PK in blog, tblAssets.AID is the PK in Assets.
This query works but pulls back multiple posts for the same record. I've tried to use select distinct and group by and even union but as my knowledge is pretty poor with SQL - they all error.
I'd like to also discount any assets that are marked as deleted (tblAssets.Deleted = true) but not hide the associated Blog post (if that's not marked as deleted). If anyone can help - it would be much appreciated! Thanks.
Here's my query so far....
SELECT dbo.tblBlog.BID,
dbo.tblBlog.DateAdded,
dbo.tblBlog.PMonthName,
dbo.tblBlog.PDay,
dbo.tblBlog.Header,
dbo.tblBlog.AddedBy,
dbo.tblBlog.PContent,
dbo.tblBlog.Category,
dbo.tblBlog.Deleted,
dbo.tblBlog.Intro,
dbo.tblBlog.Tags,
dbo.tblAssets.Name,
dbo.tblAssets.Description,
dbo.tblAssets.Location,
dbo.tblAssets.Deleted AS Expr1,
dbo.tblAssetLink.Priority
FROM dbo.tblBlog
LEFT OUTER JOIN dbo.tblAssetLink
ON dbo.tblBlog.BID = dbo.tblAssetLink.BID
LEFT OUTER JOIN dbo.tblAssets
ON dbo.tblAssetLink.AID = dbo.tblAssets.AID
WHERE ( dbo.tblBlog.Deleted = 'False' )
ORDER BY dbo.tblAssetLink.Priority, tblBlog.DateAdded DESC
EDIT
Changed the Where and the order by....
Expected output:
tblBlog.BID = 123
tblBlog.DateAdded = 12/04/2015
tblBlog.Header = This is a header
tblBlog.AddedBy = Persons name
tblBlog.PContent = *text*
tblBlog.Category = Category name
tblBlog.Deleted = False
tblBlog.Intro = *text*
tblBlog.Tags = Tag, Tag, Tag
tblAssets.Name = some.jpg
tblAssets.Description = Asset desc
tblAssets.Location = Location name
tblAssets.Priority = True
Use OUTER APPLY:
DECLARE @b TABLE ( BID INT )
DECLARE @a TABLE ( AID INT )
DECLARE @ba TABLE
(
BID INT ,
AID INT ,
Priority INT
)
INSERT INTO @b
VALUES ( 1 ),
( 2 )
INSERT INTO @a
VALUES ( 1 ),
( 2 ),
( 3 ),
( 4 )
INSERT INTO @ba
VALUES ( 1, 1, 1 ),
( 1, 2, 2 ),
( 2, 1, 1 ),
( 2, 2, 2 )
SELECT *
FROM @b b
OUTER APPLY ( SELECT TOP 1
a.*
FROM @ba ba
JOIN @a a ON a.AID = ba.AID
WHERE ba.BID = b.BID
ORDER BY Priority
) o
Output:
BID AID
1 1
2 1
Something like:
SELECT b.BID ,
b.DateAdded ,
b.PMonthName ,
b.PDay ,
b.Header ,
b.AddedBy ,
b.PContent ,
b.Category ,
b.Deleted ,
b.Intro ,
b.Tags ,
o.Name ,
o.Description ,
o.Location ,
o.Deleted AS Expr1 ,
o.Priority
FROM dbo.tblBlog b
OUTER APPLY ( SELECT TOP 1
a.* ,
al.Priority
FROM dbo.tblAssetLink al
JOIN dbo.tblAssets a ON al.AID = a.AID
WHERE b.BID = al.BID
ORDER BY al.Priority
) o
WHERE b.Deleted = 'False'
Note that joining three tables does not require one column shared by all of them. tblBlog joins to tblAssetLink on BID, and tblAssetLink joins to tblAssets on AID, so the second join on AID is valid; the link table is what connects the other two.
Based on your comment ("i would like to get is just one asset per blog post (top one ordered by Priority)"), you can change your query as follows. I suggest replacing the join with dbo.tblAssetLink by a filtered one, which contains only one (highest priority) link for every blog.
SELECT dbo.tblBlog.BID,
dbo.tblBlog.DateAdded,
dbo.tblBlog.PMonthName,
dbo.tblBlog.PDay,
dbo.tblBlog.Header,
dbo.tblBlog.AddedBy,
dbo.tblBlog.PContent,
dbo.tblBlog.Category,
dbo.tblBlog.Deleted,
dbo.tblBlog.Intro,
dbo.tblBlog.Tags,
dbo.tblAssets.Name,
dbo.tblAssets.Description,
dbo.tblAssets.Location,
dbo.tblAssets.Deleted AS Expr1,
filteredAssetLink.Priority
FROM dbo.tblBlog
LEFT OUTER JOIN
(SELECT BID, AID, [Priority],
ROW_NUMBER() OVER (PARTITION BY BID ORDER BY [Priority] DESC) as N
FROM dbo.tblAssetLink) AS filteredAssetLink
ON dbo.tblBlog.BID = filteredAssetLink.BID
AND filteredAssetLink.N = 1 -- filter in ON so posts without assets are kept
LEFT OUTER JOIN dbo.tblAssets
ON filteredAssetLink.AID = dbo.tblAssets.AID
WHERE dbo.tblBlog.Deleted = 'False'
ORDER BY tblBlog.DateAdded DESC
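None of the queries above skips assets marked as deleted, which the question also asked for. Here is a minimal sketch of one way to combine both requirements, run in SQLite via Python's sqlite3 module rather than SQL Server. The tables are trimmed to a few columns, Deleted is stored as 0/1, the sample rows are invented, and it assumes the lowest Priority number should win.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblBlog (BID INTEGER PRIMARY KEY, Header TEXT, Deleted INTEGER);
    CREATE TABLE tblAssets (AID INTEGER PRIMARY KEY, Name TEXT, Deleted INTEGER);
    CREATE TABLE tblAssetLink (BID INTEGER, AID INTEGER, Priority INTEGER);

    INSERT INTO tblBlog VALUES
        (1, 'first', 0), (2, 'second', 0), (3, 'gone', 1), (4, 'bare', 0);
    INSERT INTO tblAssets VALUES
        (1, 'a.jpg', 1), (2, 'b.jpg', 0), (3, 'c.jpg', 0);
    INSERT INTO tblAssetLink VALUES
        (1, 1, 1), (1, 2, 2), (2, 3, 1), (2, 2, 2), (3, 3, 1);
""")

# One row per non-deleted post, with its best non-deleted asset (if any).
rows = conn.execute("""
    SELECT b.BID, b.Header, a.Name
    FROM tblBlog b
    LEFT JOIN tblAssets a
           ON a.AID = (SELECT al.AID
                       FROM tblAssetLink al
                       JOIN tblAssets x ON x.AID = al.AID
                       WHERE al.BID = b.BID
                         AND x.Deleted = 0     -- ignore deleted assets
                       ORDER BY al.Priority
                       LIMIT 1)
    WHERE b.Deleted = 0
    ORDER BY b.BID
""").fetchall()
```

Post 1 falls back to b.jpg because its top-priority asset is deleted, post 4 has no assets but is still listed, and the deleted post 3 is left out. In SQL Server the same filter (the asset's Deleted flag) could go inside the OUTER APPLY subquery shown earlier.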