I have two tables, task and taskattributes, linked by taskid. Each taskid has multiple attributes, each represented by a key and a value.
I would like to find out whether a specific key exists for a task.
For example, I want to find all the tasks that do not have the key 'A'.
Use a correlated subquery with NOT EXISTS:
select a.taskid, b.key, b.value
from task a inner join taskattributes b on a.taskid=b.taskid
where not exists
(select 1 from taskattributes c where c.taskid=a.taskid and c.key='A')
With not exists:
select *
from task t
where not exists (
select 1 from taskattributes
where taskid = t.taskid and key = 'A'
)
One simple solution uses aggregation:
SELECT
t.taskid,
t.name
FROM task t
-- LEFT JOIN keeps tasks that have no attributes at all
LEFT JOIN taskattributes ta
ON t.taskid = ta.taskid
GROUP BY
t.taskid,
t.name
HAVING
COUNT(CASE WHEN ta."key" = 'A' THEN 1 END) = 0;
If you are using Postgres 9.4 or later, you may use FILTER in the HAVING clause:
HAVING COUNT(*) FILTER (WHERE "key" = 'A') = 0
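To compare the NOT EXISTS and aggregation approaches side by side, here is a minimal sketch that uses Python's sqlite3 module as a stand-in database. The sample rows and task names are made up for illustration, and the aggregation uses a LEFT JOIN so that a task with no attributes at all is still reported as missing key 'A'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (taskid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE taskattributes (taskid INTEGER, "key" TEXT, value TEXT);
    INSERT INTO task VALUES (1, 'build'), (2, 'test'), (3, 'deploy');
    -- task 1 has key A, task 2 only has key B, task 3 has no attributes
    INSERT INTO taskattributes VALUES (1, 'A', 'x'), (1, 'B', 'y'), (2, 'B', 'z');
""")

# Anti-join: keep a task only if no attribute row with key 'A' exists for it.
not_exists_sql = """
    SELECT t.taskid
    FROM task t
    WHERE NOT EXISTS (
        SELECT 1 FROM taskattributes ta
        WHERE ta.taskid = t.taskid AND ta."key" = 'A'
    )
    ORDER BY t.taskid
"""

# Conditional aggregation: count the 'A' rows per task and keep the zero counts.
aggregate_sql = """
    SELECT t.taskid
    FROM task t
    LEFT JOIN taskattributes ta ON ta.taskid = t.taskid
    GROUP BY t.taskid
    HAVING COUNT(CASE WHEN ta."key" = 'A' THEN 1 END) = 0
    ORDER BY t.taskid
"""

missing_a_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
missing_a_aggregate = [row[0] for row in conn.execute(aggregate_sql)]
print(missing_a_not_exists, missing_a_aggregate)
```

Both queries return tasks 2 and 3 on this data. With an INNER JOIN in the aggregation, task 3 would be dropped before the HAVING clause is evaluated.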
Let's say I have a table events with structure:
id      value_array
-------------------
XXXX    [a,b,c,d]
...     ...
I have a second table values_of_interest with structure:
value
-----
x
y
z
a
I want to find the ids that have any of the values found in values_of_interest. All else being equal, what would be the most performant SQL to make this happen? (I am using BigQuery, but feel free to answer more generally.)
My current thought is:
SELECT
DISTINCT e.id
FROM
events e, values_of_interest vi
WHERE
EXISTS(
SELECT
value
FROM
UNNEST(e.value_array) value
JOIN
vi ON vi.value = e.value
)
A few quick options for BigQuery Standard SQL:
Option 1
select id
from `project.dataset.events`
where exists (
select 1
from `project.dataset.values_of_interest`
where value in unnest(value_array)
)
Option 2
select id
from `project.dataset.events` t
where (
select count(1)
from t.value_array as value
join `project.dataset.values_of_interest`
using(value)
) > 0
I would write this using exists and a join:
select e.id
from `project.dataset.events` e
where exists (select 1
from unnest(e.value_array) val join
`project.dataset.values_of_interest` voi
on val = voi.value
);
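The EXISTS-plus-join shape can be tried locally with a rough sketch in Python's sqlite3. SQLite has no array type, so this assumes the array is stored as one row per element in a hypothetical event_values table, which is roughly what UNNEST produces per event. The ids and values are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- event_values holds one row per array element (a stand-in for UNNEST)
    CREATE TABLE event_values (id TEXT, value TEXT);
    CREATE TABLE values_of_interest (value TEXT);
    INSERT INTO event_values VALUES
        ('e1', 'a'), ('e1', 'b'), ('e1', 'c'), ('e1', 'd'),
        ('e2', 'p'), ('e2', 'q'),
        ('e3', 'z'), ('e3', 'a');
    INSERT INTO values_of_interest VALUES ('x'), ('y'), ('z'), ('a');
""")

# EXISTS reports each event once, even when several of its values match.
sql = """
    SELECT e.id
    FROM (SELECT DISTINCT id FROM event_values) e
    WHERE EXISTS (
        SELECT 1
        FROM event_values val
        JOIN values_of_interest voi ON voi.value = val.value
        WHERE val.id = e.id
    )
    ORDER BY e.id
"""
matching_ids = [row[0] for row in conn.execute(sql)]
print(matching_ids)
```

Event e3 matches on both 'z' and 'a' but appears only once, which is why no DISTINCT is needed on the outer query when EXISTS is used.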
I am working on Postgres and I have two tables, vehicles and vehicle_flags. There are no relations between the two tables, and hence we cannot join them to fetch the required data.
The table structure is below (the vehicle_flags table may not contain every id present in the vehicles table):
[Table structure]
I am writing a function that will accept multiple input parameters. I have to select vehicle ids from the vehicle_flags table only if the flag value is true; otherwise, I have to ignore the vehicle_flags table. My aim is to achieve something like this, but it turns out the CASE expression expects a scalar output:
select count(id) from vehicles
where
vehicles.id in (case
when #hasbluetooth =1 then (select distinct id from vehicle_flags where flag='bluetooth' and value = '1')
else
(select distinct id from vehicles)
end)
and
vehicles.id in (case
when #hasac =1 then (select distinct id from vehicle_flags where flag='ac' and value = '1')
else
(select distinct id from vehicles)
end)
Kindly suggest any solution to achieve this.
I suspect you want:
select v.*
from vehicles v
left join vehicle_flags vf on vf.id = v.id
group by v.id
having
(#hasbluetooth = 0 or bool_or(vf.flag = 'bluetooth' and vf.value = '1'))
and (#hasac = 0 or bool_or(vf.flag = 'ac' and vf.value = '1'))
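The optional-filter pattern can be checked with a small sketch in Python's sqlite3. Several things here are assumptions: SQLite has no bool_or, so MAX over a 0/1 comparison stands in for it; the named parameters :bt and :ac replace the #hasbluetooth and #hasac placeholders; count_vehicles is a hypothetical helper; and the rows are invented.

```python
import sqlite3

def count_vehicles(conn, has_bluetooth, has_ac):
    """Count vehicles, applying each flag filter only when it is requested."""
    sql = """
        SELECT COUNT(*) FROM (
            SELECT v.id
            FROM vehicles v
            LEFT JOIN vehicle_flags vf ON vf.id = v.id
            GROUP BY v.id
            -- MAX(...) = 1 plays the role of Postgres bool_or(...)
            HAVING (:bt = 0 OR MAX(vf.flag = 'bluetooth' AND vf.value = '1') = 1)
               AND (:ac = 0 OR MAX(vf.flag = 'ac' AND vf.value = '1') = 1)
        )
    """
    params = {"bt": int(has_bluetooth), "ac": int(has_ac)}
    return conn.execute(sql, params).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicles (id INTEGER PRIMARY KEY);
    CREATE TABLE vehicle_flags (id INTEGER, flag TEXT, value TEXT);
    INSERT INTO vehicles VALUES (1), (2), (3), (4);
    -- vehicle 4 has no flag rows at all
    INSERT INTO vehicle_flags VALUES
        (1, 'bluetooth', '1'), (1, 'ac', '1'),
        (2, 'bluetooth', '1'), (2, 'ac', '0'),
        (3, 'ac', '1');
""")

no_filter = count_vehicles(conn, False, False)
bluetooth_only = count_vehicles(conn, True, False)
ac_only = count_vehicles(conn, False, True)
both = count_vehicles(conn, True, True)
print(no_filter, bluetooth_only, ac_only, both)
```

When a filter is switched off, its half of the HAVING clause is true for every vehicle, including vehicle 4, which has no rows in vehicle_flags. That is the effect the CASE expression in the question was aiming for.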
Is the following problematic?
DELETE a
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData a
WHERE EXISTS
(
SELECT *
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData b
WHERE
b.[Past28Days] = 1 AND
a.[Index] = b.[Index]
HAVING SUM(b.Amount) = 0
)
The reason I'm slightly uneasy about using the above script is that the following errors when I run it:
SELECT *
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData b
WHERE
b.[Past28Days] = 1
HAVING SUM(b.Amount) = 0
I understand why this script errors: the select is not grouped on anything, so the processor does not like the aggregation in the HAVING clause.
But as a sub-query this error does not occur. Why? Is this a problematic approach?
EDIT
Ended up using the following:
DELETE a
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData a
WHERE a.[Index] IN
(
SELECT [Index]
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData
WHERE [Past28Days] = 1
GROUP BY [Index]
HAVING SUM(Amount) = 0
)
But, as suggested in the answer, simply adding the GROUP BY to the sub-query is more readable:
DELETE a
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData a
WHERE EXISTS
(
SELECT *
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData b
WHERE
b.[Past28Days] = 1 AND
a.[Index] = b.[Index]
GROUP BY b.[Index]
HAVING SUM(b.Amount) = 0
)
It is legal to omit GROUP BY and still perform aggregations, so HAVING is still a way of limiting results:
select sum(x)
from
(
select 1 x union all select 2
) a
having sum(x) = 3
EXISTS() works because everything in the select list is ignored. EXISTS() looks for rows only, and terminates as soon as one is found. You might add GROUP BY b.[Index] to make the intent clear to anyone reviewing the code later, or rewrite it using an inner join and a derived table:
DELETE a
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData a
INNER JOIN
(
SELECT b.[Index]
FROM WHAnalysis.dbo.tb_r12027dxi_CalculatedData b
WHERE
b.[Past28Days] = 1
GROUP BY b.[Index]
HAVING SUM(b.Amount) = 0
) b1
ON a.[Index] = b1.[Index]
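To see which rows such a delete removes, here is a hedged sketch in Python's sqlite3. It uses the IN form with an explicit GROUP BY, because SQLite does not accept the DELETE ... FROM ... JOIN syntax of SQL Server. The table and column names (calculated_data, idx, past28days, amount) are simplified stand-ins, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calculated_data (idx INTEGER, past28days INTEGER, amount INTEGER);
    INSERT INTO calculated_data VALUES
        (1, 1, 5), (1, 1, -5), (1, 0, 10),  -- recent amounts sum to 0
        (2, 1, 3), (2, 1, 4),               -- recent amounts sum to 7
        (3, 0, 0);                          -- no recent rows at all
""")

# The grouped subquery picks the idx values whose recent amounts net to zero;
# the outer DELETE then removes every row for those idx values.
conn.execute("""
    DELETE FROM calculated_data
    WHERE idx IN (
        SELECT idx
        FROM calculated_data
        WHERE past28days = 1
        GROUP BY idx
        HAVING SUM(amount) = 0
    )
""")

remaining = conn.execute(
    "SELECT idx, COUNT(*) FROM calculated_data GROUP BY idx ORDER BY idx"
).fetchall()
print(remaining)
```

Note that all three rows for idx 1 are deleted, including the one with past28days = 0, since the filter on past28days only applies inside the subquery. Idx 3 survives because it has no recent rows, so it never forms a group.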
Let's say I have a table, e.g.:
Request No. Type Status
---------------------------
1 New Renewed
and then another table
Action ID Request No LastUpdated
------------------------------------
1 1 06-10-2010
2 1 07-14-2010
3 1 09-30-2010
How can I join the second table with the first table but only get the latest record from the second table (e.g. ordered by LastUpdated DESC)?
SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)
Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = re.request_no
AND re2.LastUpdated > re.LastUpdated)
SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
A sub-query gets the date of the last updated record and matches it against the joined row, which prevents the other records from being joined.
Granted, depending on how precise the LastUpdated field is, this can return multiple rows when two records are updated on the same date. Any other implementation that compares only LastUpdated has the same problem, so either the precision would have to be increased, or some other logic or distinguishing characteristic would be needed to prevent multiple rows from being returned.
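One possible form of that "other logic" is to break ties on the action id. The sketch below uses Python's sqlite3 with made-up table names (requests, actions) and assumes that a higher action_id means a later action, which may not hold in your data. Dates are stored as ISO strings so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (request_no INTEGER PRIMARY KEY, type TEXT, status TEXT);
    CREATE TABLE actions (action_id INTEGER PRIMARY KEY, request_no INTEGER, last_updated TEXT);
    INSERT INTO requests VALUES (1, 'New', 'Renewed');
    -- actions 3 and 4 share the same date, so the date alone cannot pick one
    INSERT INTO actions VALUES
        (1, 1, '2010-06-10'), (2, 1, '2010-07-14'),
        (3, 1, '2010-09-30'), (4, 1, '2010-09-30');
""")

# Keep an action only if no other action for the same request is newer,
# where "newer" falls back to the higher action_id when the dates are equal.
sql = """
    SELECT r.request_no, a.action_id, a.last_updated
    FROM requests r
    JOIN actions a ON a.request_no = r.request_no
    WHERE NOT EXISTS (
        SELECT 1
        FROM actions newer
        WHERE newer.request_no = a.request_no
          AND (newer.last_updated > a.last_updated
               OR (newer.last_updated = a.last_updated
                   AND newer.action_id > a.action_id))
    )
"""
latest = conn.execute(sql).fetchall()
print(latest)
```

Without the action_id comparison, both action 3 and action 4 would be returned for request 1.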
-- ActionID is left out of the SELECT and GROUP BY: grouping by it would return every action row
SELECT r.RequestNo, r.Type, r.Status, MAX(a.LastUpdated) AS LastUpdated
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status
We can use TOP 1 with an ORDER BY clause. For instance, if your tables are RequestTable(ID,Type,Status) and ActionTable(ActionID,RequestID,LastUpdated), the query will look like this (note that it returns a single row overall, not one row per request):
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC
I'm trying to solve the below problem.
I feel like it is possible, but I can't seem to get it.
Here's the scenario:
Table 1 (Assets)
1 Asset-A
2 Asset-B
3 Asset-C
4 Asset-D
Table 2 (Attributes)
1 Asset-A Red
2 Asset-A Hard
3 Asset-B Red
4 Asset-B Hard
5 Asset-B Heavy
6 Asset-C Blue
7 Asset-C Hard
If I am looking for something that has the same attributes as Asset-A, then it should identify Asset-B, since Asset-B has every attribute that Asset-A has (it should disregard Heavy, since Asset-A does not specify anything different or similar). Also, if I wanted only the attributes that Asset-A AND Asset-B have in common, how would I get that?
Seems simple, but I can't nail it...
The actual table I am using is almost precisely Table 2: simply an association of an AssetId and an AttributeId, so:
PK: Id
int: AssetId
int: AttributeId
I only included the idea of the asset table to simplify the question.
SELECT ato.id, ato.value
FROM (
SELECT id
FROM assets a
WHERE NOT EXISTS
(
SELECT NULL
FROM attributes ata
LEFT JOIN
attributes ato
ON ato.id = a.id
AND ato.value = ata.value
WHERE ata.id = 1
AND ato.id IS NULL
)
) ao
JOIN attributes ato
ON ato.id = ao.id
JOIN attributes ata
ON ata.id = 1
AND ata.value = ato.value
Or, in SQL Server 2005 (with sample data to check):
WITH assets AS
(
SELECT 1 AS id, 'A' AS name
UNION ALL
SELECT 2 AS id, 'B' AS name
UNION ALL
SELECT 3 AS id, 'C' AS name
UNION ALL
SELECT 4 AS id, 'D' AS name
),
attributes AS
(
SELECT 1 AS id, 'Red' AS value
UNION ALL
SELECT 1 AS id, 'Hard' AS value
UNION ALL
SELECT 2 AS id, 'Red' AS value
UNION ALL
SELECT 2 AS id, 'Hard' AS value
UNION ALL
SELECT 2 AS id, 'Heavy' AS value
UNION ALL
SELECT 3 AS id, 'Blue' AS value
UNION ALL
SELECT 3 AS id, 'Hard' AS value
)
SELECT ato.id, ato.value
FROM (
SELECT id
FROM assets a
WHERE a.id <> 1
AND NOT EXISTS
(
SELECT ata.value
FROM attributes ata
WHERE ata.id = 1
EXCEPT
SELECT ato.value
FROM attributes ato
WHERE ato.id = a.id
)
) ao
JOIN attributes ato
ON ato.id = ao.id
JOIN attributes ata
ON ata.id = 1
AND ata.value = ato.value
I don't completely understand the first part of your question, identifying assets based on their attributes.
Making some assumptions about column names, the following query would yield the common attributes between Asset-A and Asset-B:
SELECT ta.Name
FROM [Table 2] ta
JOIN [Table 1] a ON a.ID = ta.AssetID AND a.Name = 'Asset-A'
JOIN [Table 2] tb ON tb.Name = ta.Name
JOIN [Table 1] b ON b.ID = tb.AssetID AND b.Name = 'Asset-B'
GROUP BY ta.Name
Select * From Assets O
Where O.AssetId <> 'Asset-A'
And (Select Count(*)
From Attributes At1
Where At1.AssetId = 'Asset-A'
And Not Exists
(Select * From Attributes At2
Where At2.AssetId = O.AssetId
And At2.Attribute = At1.Attribute)) = 0
select at2.asset, count(*)
from attribute at1
inner join attribute at2 on at1.value = at2.value
where at1.asset = 'Asset-A'
and at2.asset != 'Asset-A'
group by at2.asset
having count(*) = (select count(*) from attribute where asset = "Asset-A");
Find all assets that have every attribute that "A" has (but may also have additional attributes):
SELECT Other.ID
FROM Assets Other
WHERE
Other.AssetID <> 'Asset-A' -- do not return Asset A as a match to itself
AND NOT EXISTS (SELECT NULL FROM Attributes AttA WHERE
AttA.AssetID='Asset-A'
AND NOT EXISTS (SELECT NULL FROM Attributes AttOther WHERE
AttOther.AssetID=Other.ID AND AttOther.AttributeID = AttA.AttributeID
)
)
I.e., "find any asset where there is no attribute of A that is not also an attribute of this asset".
Find all assets that have exactly the same attributes as "A":
SELECT Other.ID
FROM Assets Other
WHERE
Other.AssetID <> 'Asset-A' -- do not return Asset A as a match to itself
AND NOT EXISTS (SELECT NULL FROM Attributes AttA WHERE
AttA.AssetID='Asset-A'
AND NOT EXISTS (SELECT NULL FROM Attributes AttOther WHERE
AttOther.AssetID=Other.ID
AND AttOther.AttributeID = AttA.AttributeID
)
)
AND NOT EXISTS (SELECT NULL FROM Attributes AttaOther WHERE
AttaOther.AssetID=Other.ID
AND NOT EXISTS (SELECT NULL FROM Attributes AttaA WHERE
AttaA.AssetID='Asset-A'
AND AttaA.AttributeID = AttaOther.AttributeID
)
)
I.e., "find any asset where there is no attribute of A that is not also an attribute of this asset, and where there is no attribute of this asset that is not also an attribute of A."
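The two double-negation queries can be exercised with a short sketch in Python's sqlite3. The schema is simplified to assets(asset) and attributes(asset, attribute), find_assets is a hypothetical helper, and Asset-E is an invented extra row added only so that the exact-match case returns something.

```python
import sqlite3

# "No attribute of the target is missing from the candidate asset o."
HAS_ALL_OF_TARGET = """
    NOT EXISTS (
        SELECT 1 FROM attributes ta
        WHERE ta.asset = :target
          AND NOT EXISTS (
              SELECT 1 FROM attributes tb
              WHERE tb.asset = o.asset AND tb.attribute = ta.attribute))
"""
# "No attribute of the candidate asset o is missing from the target."
HAS_NOTHING_EXTRA = """
    NOT EXISTS (
        SELECT 1 FROM attributes tb
        WHERE tb.asset = o.asset
          AND NOT EXISTS (
              SELECT 1 FROM attributes ta
              WHERE ta.asset = :target AND ta.attribute = tb.attribute))
"""

def find_assets(conn, target, exact=False):
    """Return assets covering all of target's attributes (exactly, if exact=True)."""
    conditions = [HAS_ALL_OF_TARGET] + ([HAS_NOTHING_EXTRA] if exact else [])
    sql = ("SELECT o.asset FROM assets o WHERE o.asset <> :target AND "
           + " AND ".join(conditions) + " ORDER BY o.asset")
    return [row[0] for row in conn.execute(sql, {"target": target})]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assets (asset TEXT PRIMARY KEY);
    CREATE TABLE attributes (asset TEXT, attribute TEXT);
    INSERT INTO assets VALUES
        ('Asset-A'), ('Asset-B'), ('Asset-C'), ('Asset-D'), ('Asset-E');
    INSERT INTO attributes VALUES
        ('Asset-A', 'Red'), ('Asset-A', 'Hard'),
        ('Asset-B', 'Red'), ('Asset-B', 'Hard'), ('Asset-B', 'Heavy'),
        ('Asset-C', 'Blue'), ('Asset-C', 'Hard'),
        ('Asset-E', 'Hard'), ('Asset-E', 'Red');
""")

supersets = find_assets(conn, "Asset-A")
exact_matches = find_assets(conn, "Asset-A", exact=True)
print(supersets, exact_matches)
```

Asset-D has no attributes, so it fails the first condition (Red and Hard are both missing) and is excluded from both results.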
This solution works as prescribed, thanks for the input.
WITH Atts AS
(
SELECT
DISTINCT
at1.[Attribute]
FROM
Attribute at1
WHERE
at1.[Asset] = 'Asset-A'
)
SELECT
DISTINCT
Asset,
(
SELECT
COUNT(ta2.[Attribute])
FROM
Attribute ta2
INNER JOIN
Atts b
ON
b.[Attribute] = ta2.[attribute]
WHERE
ta2.[Asset] = ta.Asset
)
AS [Count]
FROM
Atts a
INNER JOIN
Attribute ta
ON
a.[Attribute] = ta.[Attribute]
Find all assets that have all the same attributes as asset-a:
select att2.Asset from attribute att1
inner join attribute att2 on att2.Attribute = att1.Attribute and att1.Asset <> att2.Asset
where att1.Asset = 'Asset-A'
group by att2.Asset, att1.Asset
having COUNT(*) = (select COUNT(*) from attribute where Asset=att1.Asset)
I thought maybe I can do this with LINQ and then work my way backwards with:
var result = from productsNotA in DevProducts
where productsNotA.Product != "A" &&
(
from productsA in DevProducts
where productsA.Product == "A"
select productsA.Attribute
).Except
(
from productOther in DevProducts
where productOther.Product == productsNotA.Product
select productOther.Attribute
).Any() == false
select new {productsNotA.Product};
result.Distinct()
I thought that translating this back to SQL with LinqPad would result in a pretty SQL query. However, it didn't :). DevProducts is my test table with the columns Product and Attribute. I thought I'd post the LINQ query anyway, as it might be useful to people who are playing around with LINQ.
If you can optimize the LINQ query above, please let me know (it might result in better SQL ;))
I'm using the following DDL:
CREATE TABLE Attributes (
Asset VARCHAR(100)
, Name VARCHAR(100)
, UNIQUE(Asset, Name)
)
The second question is easy:
SELECT Name
FROM Attributes
WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
AND Asset = 'B'
The first question is not much more difficult:
SELECT Asset
FROM Attributes
WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
GROUP BY Asset
HAVING COUNT(*) = (SELECT COUNT(*) FROM Attributes WHERE Asset = 'A')
Edit:
I left AND Asset != 'A' out of the WHERE clause of the second snippet for brevity.
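For anyone who wants to try both snippets, here is a minimal sketch in Python's sqlite3 using the DDL above. The rows are taken from the question's example, with assets named 'A', 'B' and 'C', and the AND Asset != 'A' condition is put back in. The counting approach relies on the UNIQUE(Asset, Name) constraint, since duplicate rows would inflate COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Attributes (
        Asset VARCHAR(100),
        Name VARCHAR(100),
        UNIQUE (Asset, Name)
    );
    INSERT INTO Attributes VALUES
        ('A', 'Red'), ('A', 'Hard'),
        ('B', 'Red'), ('B', 'Hard'), ('B', 'Heavy'),
        ('C', 'Blue'), ('C', 'Hard');
""")

# Attributes that A and B have in common.
common_sql = """
    SELECT Name FROM Attributes
    WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
      AND Asset = 'B'
    ORDER BY Name
"""

# Assets whose count of A-attributes equals the number of attributes A has.
superset_sql = """
    SELECT Asset FROM Attributes
    WHERE Name IN (SELECT Name FROM Attributes WHERE Asset = 'A')
      AND Asset != 'A'
    GROUP BY Asset
    HAVING COUNT(*) = (SELECT COUNT(*) FROM Attributes WHERE Asset = 'A')
    ORDER BY Asset
"""

common_names = [row[0] for row in conn.execute(common_sql)]
matching_assets = [row[0] for row in conn.execute(superset_sql)]
print(common_names, matching_assets)
```

Asset C shares only Hard with A, so its count is 1 rather than 2 and it is filtered out by the HAVING clause.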