SQL Query to retrieve data while excluding a set of rows - sql

I have basically four tables (SQL Server):
Objects: id, ObjectName
Components: id, ComponentName
ObjectsDetails: ObjectID, ComponentID
ExclusionTable: id, ComponentID
Basically, these tables describe Objects and what Objects are made of (what components)
For example, Object "A" may be made out of component "A" and component "B".
In this case, the tables would be populated this way:
Objects:
id ObjectName
1 A
Components:
id ComponentName
1 A
2 B
ObjectsDetails:
ObjectID ComponentID
1 1
1 2
Now, "ExclusionTable" may have a list of components that are to be excluded from a search (therefore, excluding entire objects if the object is made out of at least one of those components).
For example, I would like to ask:
"Give me all the Objects that are not made out of components A and B".
Therefore, my question is:
Is there a way to write a query for that? No views and no stored procedures, please: my SQL engine does not support them.
I tried something like:
SELECT DISTINCT ObjectName FROM Objects INNER JOIN ObjectsDetails ON Objects.id = ObjectsDetails.ObjectID
WHERE ObjectsDetails.ComponentID NOT IN (1,2)
in case ExclusionTable tells us that Components A and B needs to be excluded.
Of course, that doesn't work...
I tried a few variations using WHERE NOT EXISTS (SELECT * FROM ExclusionTable) but I am not proficient enough in SQL to understand how to get it to work using one query only (if it is even possible).
Thanks!

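You should avoid queries with [NOT] IN (SELECT ...) when a join can do the job. NOT IN in particular returns no rows at all if the subquery produces a NULL. Try a LEFT JOIN instead: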
SELECT DISTINCT ObjectName
FROM Objects
INNER JOIN ObjectsDetails ON Objects.id = ObjectsDetails.ObjectID
LEFT JOIN ExclusionTable on ExclusionTable.ComponentId = ObjectsDetails.ComponentID
where ExclusionTable.ComponentId is null;
This will retrieve only rows for which the ComponentID is not in ExclusionTable. Note that an object still shows up if at least one of its components is not excluded.
Update:
SELECT ObjectName
FROM Objects
INNER JOIN ObjectsDetails ON Objects.id = ObjectsDetails.ObjectID
LEFT JOIN ExclusionTable on ExclusionTable.ComponentId = ObjectsDetails.ComponentID
group by ObjectName
having count(distinct ObjectsDetails.ComponentID) = sum(case when ExclusionTable.id is null then 1 else 0 end)
New approach: I think the only other way I could do it is to compare the number of components per object with the number of that object's components that are not on the exclusion list. When these numbers are equal, no component is on the excluded list and we can show the object.
I'm sorry I can't test this right now. Please use EXPLAIN select ... to compare the queries, if they work.
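For what it's worth, here is a minimal sketch of the counting approach that can actually be run. It assumes Python's built-in sqlite3 module as a stand-in for the real engine, and the sample rows (objects A, B, C and their components) are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real engine; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Objects (id INTEGER PRIMARY KEY, ObjectName TEXT);
CREATE TABLE ObjectsDetails (ObjectID INTEGER, ComponentID INTEGER);
CREATE TABLE ExclusionTable (id INTEGER PRIMARY KEY, ComponentID INTEGER);
-- Hypothetical data: A = components 1 and 2, B = 2 and 3, C = 3 only.
INSERT INTO Objects VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO ObjectsDetails VALUES (1, 1), (1, 2), (2, 2), (2, 3), (3, 3);
-- Components 1 and 2 are excluded.
INSERT INTO ExclusionTable VALUES (1, 1), (2, 2);
""")

# Keep an object only when every one of its components is absent
# from ExclusionTable (total components = non-excluded components).
allowed_objects = [row[0] for row in conn.execute("""
SELECT o.ObjectName
FROM Objects o
INNER JOIN ObjectsDetails od ON o.id = od.ObjectID
LEFT JOIN ExclusionTable et ON et.ComponentID = od.ComponentID
GROUP BY o.id, o.ObjectName
HAVING COUNT(DISTINCT od.ComponentID)
     = SUM(CASE WHEN et.id IS NULL THEN 1 ELSE 0 END)
ORDER BY o.ObjectName
""")]
print(allowed_objects)
```

With this data only object C survives, because A and B each contain at least one excluded component. Objects with no components at all are dropped by the inner join, which may or may not be what you want.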

Basically, if you need to get all objects not made from A or B, you need to get all objects EXCEPT those made from A or B.
SELECT DISTINCT Id, ObjectName
FROM Objects
WHERE Id NOT IN (
SELECT DISTINCT ObjectDetails.ObjectID
FROM ObjectDetails
INNER JOIN Components ON ObjectDetails.ComponentID = Components.Id
WHERE Components.ComponentName = 'A' OR Components.ComponentName = 'B'
)
Would that be what you're looking for?
EDIT: Of course, you can omit the join if you already have the component ids - then just put those in the where clause to filter them out.

select Objects.id, Objects.ObjectName
from Objects
left outer join
( select od.ObjectID from ObjectsDetails od inner join ExclusionTable et
on od.ComponentID = et.ComponentID) excludedid
on Objects.id = excludedid.ObjectID
where excludedid.ObjectID is null
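Since the question mentions NOT EXISTS, here is a rough sketch of how a correlated NOT EXISTS could express the same exclusion in a single query. It is run through Python's sqlite3 module only so it can be tried out; the sample data, including object D with no components, is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the real engine; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Objects (id INTEGER PRIMARY KEY, ObjectName TEXT);
CREATE TABLE ObjectsDetails (ObjectID INTEGER, ComponentID INTEGER);
CREATE TABLE ExclusionTable (id INTEGER PRIMARY KEY, ComponentID INTEGER);
-- Hypothetical data: A = 1 and 2, B = 2 and 3, C = 3, D has no components.
INSERT INTO Objects VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO ObjectsDetails VALUES (1, 1), (1, 2), (2, 2), (2, 3), (3, 3);
INSERT INTO ExclusionTable VALUES (1, 1), (2, 2);
""")

# An object qualifies when no row links it to an excluded component.
not_excluded = [row[0] for row in conn.execute("""
SELECT o.ObjectName
FROM Objects o
WHERE NOT EXISTS (
    SELECT 1
    FROM ObjectsDetails od
    INNER JOIN ExclusionTable et ON et.ComponentID = od.ComponentID
    WHERE od.ObjectID = o.id
)
ORDER BY o.ObjectName
""")]
print(not_excluded)
```

Unlike NOT IN, NOT EXISTS is not thrown off by NULLs in the subquery, and it keeps objects that have no components at all (D here).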

Related

sql join: Detect changes

Periodically, I want to compare a global sql table (called "resource") with a local backup one (called "region_db") to see if a field has been changed. The field I'm monitoring this way is called "state", and the primary key is called "id". Currently I'm doing
SELECT id, state FROM resource
Then manually going through the resulting rows in a loop. For each (id, state) tuple, I do
SELECT state FROM region_db WHERE id = :id
And check if the state from the local region_db matches the one from the global resource db. I'm able to detect two cases this way: 1) when a new id is added to resource, and 2) when the state of an existing row changes.
However, I'm missing the case where a row is deleted from the resource table.
I'm thinking about using JOINs but not sure about how to efficiently distinguish between the three cases (modify existing, add new, and delete row from resource table) while minimizing the number of JOINs / DB operations.
You can use full join:
select coalesce(r.id, reg.id) as id,
(case when r.id is null then 'DELETED'
when reg.id is null then 'CREATED'
else 'UPDATED'
end)
from resource r full join
region_db reg
on r.id = reg.id
where r.id is null or reg.id is null or r.state <> reg.state; -- something changed
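If your engine has no FULL JOIN, the same three-way classification can be approximated with two LEFT JOINs glued together by UNION ALL. The sketch below assumes Python's sqlite3 module as a test bed and uses invented rows; the change_type column name is also just an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE region_db (id INTEGER PRIMARY KEY, state TEXT);
-- Hypothetical data: id 2 changed, id 3 is new, id 4 was deleted upstream.
INSERT INTO resource VALUES (1, 'up'), (2, 'down'), (3, 'up');
INSERT INTO region_db VALUES (1, 'up'), (2, 'up'), (4, 'up');
""")

# First branch: rows present in resource (new or modified).
# Second branch: rows that only exist in the local backup (deleted upstream).
changes = conn.execute("""
SELECT r.id,
       CASE WHEN reg.id IS NULL THEN 'CREATED' ELSE 'UPDATED' END AS change_type
FROM resource r
LEFT JOIN region_db reg ON reg.id = r.id
WHERE reg.id IS NULL OR r.state <> reg.state
UNION ALL
SELECT reg.id, 'DELETED'
FROM region_db reg
LEFT JOIN resource r ON r.id = reg.id
WHERE r.id IS NULL
ORDER BY 1
""").fetchall()
print(changes)
```

Note that the <> comparison does not flag a change when one of the two states is NULL, so nullable state columns would need an extra check.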
WITH joined AS (
SELECT
resource.id AS id,
region_db.state AS region_state,
resource.state AS global_state
FROM
resource
INNER JOIN
region_db
ON
resource.id = region_db.id
) SELECT * FROM joined WHERE region_state <> global_state;
This query returns the rows whose state differs between the two tables. If you do a left join instead of an inner join in the WITH query, you can also get records that have been added but not backed up yet to region_db; their region_state is NULL, so the final filter needs an extra OR region_state IS NULL to keep them. Likewise, with a right join, you can get records that have been deleted from resource but not propagated yet, by checking for a NULL global_state.
Hopefully this helps.
You could use a UNION ALL to show the differences between the tables: group the combined rows by id and state and keep the groups where count(*) = 1, meaning the (id, state) pair appears in only one of the two tables.
SELECT id,state
FROM (
SELECT id, state FROM resource
UNION ALL
SELECT id,state FROM region_db
) tbl
GROUP BY id, state
HAVING count(*) = 1
ORDER BY id;

SQL: Find entries with matching criteria in different table

I have two tables, Event and EventTag
CREATE TABLE event (
id INT PRIMARY KEY,
content TEXT
);
CREATE TABLE event_tag (
event_id INT,
type VARCHAR(255),
value VARCHAR(255)
);
Each event has zero or more tags. The query I'd like to express in SQL is:
Give me all Event (all columns in the table) that have associated tags with EventTag.type="foo" and EventTag.value="bar".
This is easy for one tag criterion (for example, with a join and a where, as answered here), but how do I tackle the situation of two or more criteria? So: Give me the events that have an associated tag "foo" equal to "bar" and (!) an event tag "qux" equal to "quux"? I thought about joining the tag table 'n' times, but I'm not sure if it's a good idea.
The best way to solve this problem is to not use the EAV database model (Entity-Attribute-Value). You're running into just the first of many problems with this anti-pattern. A quick Google search on "EAV model" should reveal some of the other problems in store for you if you choose not to redesign. Normally your Event table should have a column for foo and a column for qux.
One possible solution that you can use, if you insist (or are forced) to go down this path:
SELECT id, content
FROM Event
WHERE id IN
(
SELECT
E.id
FROM
Event E
INNER JOIN Event_Tag T ON
T.event_id = E.id AND
(
(T.type = 'foo' AND T.value = 'bar') OR
(T.type = 'qux' AND T.value = 'quux')
)
GROUP BY
E.id
HAVING
COUNT(*) = 2
)
If you put your various type/value pairs into a temporary table or as a CTE then you can JOIN to that instead of listing out all of the pairs that you want. That syntax will be dependent on your RDBMS though.
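As a rough illustration of the CTE idea, the sketch below puts the wanted type/value pairs into a CTE named wanted (a made-up name) and requires that an event match all of them. It runs against SQLite through Python's sqlite3 module, so the CTE syntax shown is SQLite's and may differ on your RDBMS; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE event_tag (event_id INTEGER, type TEXT, value TEXT);
INSERT INTO event VALUES (1, 'both tags'), (2, 'only foo'), (3, 'no tags');
-- Event 1 carries a duplicate foo=bar tag on purpose.
INSERT INTO event_tag VALUES
    (1, 'foo', 'bar'), (1, 'foo', 'bar'), (1, 'qux', 'quux'),
    (2, 'foo', 'bar'), (2, 'qux', 'other');
""")

# The tags are de-duplicated first so that COUNT(*) counts distinct
# matched pairs; an event qualifies when it matches every wanted pair.
matching = conn.execute("""
WITH wanted(type, value) AS (
    VALUES ('foo', 'bar'), ('qux', 'quux')
)
SELECT e.id, e.content
FROM event e
INNER JOIN (SELECT DISTINCT event_id, type, value FROM event_tag) AS t
        ON t.event_id = e.id
INNER JOIN wanted w ON w.type = t.type AND w.value = t.value
GROUP BY e.id, e.content
HAVING COUNT(*) = (SELECT COUNT(*) FROM wanted)
ORDER BY e.id
""").fetchall()
print(matching)
```

Adding a third criterion then only means adding a row to wanted; the rest of the query stays the same.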
Use the OR operator for multiple criteria. Note that this returns events that match either pair, not necessarily both:
SELECT e.* FROM event e JOIN event_tag et ON e.id = et.event_id WHERE ((et.type = 'foo' AND et.value = 'bar') OR (et.type = 'po' AND et.value = 'yo'))
Or, if the values are dynamic, you can write a parameterized query in whatever programming language talks to the database.
For example, in Java I can do it using
SELECT e.* FROM event e JOIN event_tag et ON e.id = et.event_id WHERE (et.type = ? AND et.value = ?)
where I assign the above SQL string to a query and set the parameters for it.
Select id from EVENT ev
INNER JOIN EVENT_TAG et
ON ev.id = et.event_ID
WHERE et.type = 'foo'
AND et.value = 'bar'
Obviously you can put anything you want between the parentheses to find whatever types you want.

If my subquery returns multiple values, do I need to perform normalization?

SELECT Title
FROM movie
WHERE Movie_no = (SELECT Movie_no
FROM Customer
INNER JOIN issues ON customer.`Cus_id`=issues.`Cus_id`
WHERE NAME = 'Shyam')
No. You just need to use IN or ANY:
SELECT Title
FROM movie
WHERE Movie_no IN (SELECT Movie_no
FROM Customer INNER JOIN
issues
ON customer.Cus_id = issues.Cus_id
WHERE NAME = 'Shyam'
);
I mean, you might need to normalize your data for other reasons, but a subquery returning more than one value would not be such a reason. And, your data structure seems reasonable, based on this one query.
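Here is a small runnable sketch of the IN version, using Python's sqlite3 module as a test bed. The column layout is an assumption (in particular that Movie_no lives in issues), and the rows are invented so that Shyam has two issued movies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (Movie_no INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Customer (Cus_id INTEGER PRIMARY KEY, Name TEXT);
-- Assumption: issues links a customer to each movie issued to them.
CREATE TABLE issues (Cus_id INTEGER, Movie_no INTEGER);
INSERT INTO movie VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
INSERT INTO Customer VALUES (10, 'Shyam'), (11, 'Ravi');
INSERT INTO issues VALUES (10, 1), (10, 3), (11, 2);
""")

# The subquery yields two Movie_no values for Shyam; IN accepts the whole set.
titles = [row[0] for row in conn.execute("""
SELECT Title
FROM movie
WHERE Movie_no IN (SELECT i.Movie_no
                   FROM Customer c
                   INNER JOIN issues i ON c.Cus_id = i.Cus_id
                   WHERE c.Name = 'Shyam')
ORDER BY Title
""")]
print(titles)
```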

Setting up database structure

I am trying to figure out the best way to restructure my database as I didn't plan ahead and now I am a little stuck on this part :)
I have a Table called Campaigns and a Table called Data Types.
Each campaign is a unique record that holds about 10 fields of data.
The data types contains 3 fields - ID, Type, Description
When You create a campaign, you can select as many data types as you would like.
1, 2 or all 3 of them.
My concern / question is - How can I store what the user selected with the campaign record?
I need to be able to pull in the campaign details but also know which data types were selected.
How I originally had it set up was with the data types in one field, comma separated, but I learned that is not ideal.
What would be the best way to accomplish this? Storing the data as XML ?
UPDATE -
Here is an example of the query I was trying to get to work (it's probably way off).
BEGIN
SET NOCOUNT ON;
BEGIN
SELECT *
FROM (SELECT A.[campaignID] as campaignID,
A.[campaignTitle],
A.[campaignDesc],
A.[campaignType],
A.[campaignStatus],
A.[duration],
A.[whoCreated],
B.[campaignID],
B.[dataType],
(SELECT *
FROM Tags_Campaign_Settings
WHERE campaignID = @campaignID) AS dataTypes
FROM Tags_Campaigns AS A
INNER JOIN
Tags_Campaign_Settings AS B
ON A.[campaignID] = B.[campaignID]
WHERE A.[campaignID] = @campaignID
) AS a
FOR XML PATH ('campaigns'), TYPE, ELEMENTS, ROOT ('root');
END
END
Create a join table called Campaign_DataType with campaignId and dataTypeId. Make sure they're foreign key constrained to the respective tables. When you query for campaign data, you can either create a separate query to get the data type information based on the campaignId, or you can do a left outer join to fetch campaigns and their data types together.
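A minimal sketch of that layout, assuming SQLite through Python's sqlite3 module as a stand-in for SQL Server. The Campaign and DataType columns are cut down to the bare minimum, and the column names title, type and description as well as the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Campaign (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE DataType (id INTEGER PRIMARY KEY, type TEXT, description TEXT);
-- One row per (campaign, selected data type) pair.
CREATE TABLE Campaign_DataType (
    campaignId INTEGER NOT NULL REFERENCES Campaign(id),
    dataTypeId INTEGER NOT NULL REFERENCES DataType(id),
    PRIMARY KEY (campaignId, dataTypeId)
);
INSERT INTO Campaign VALUES (1, 'Spring'), (2, 'Summer');
INSERT INTO DataType VALUES (1, 'Type1', ''), (2, 'Type2', ''), (3, 'Type3', '');
-- Campaign 1 selected Type1 and Type3; campaign 2 selected nothing.
INSERT INTO Campaign_DataType VALUES (1, 1), (1, 3);
""")

# Left outer joins keep campaigns that have no data types selected.
campaign_rows = conn.execute("""
SELECT c.id, c.title, dt.type
FROM Campaign c
LEFT OUTER JOIN Campaign_DataType cd ON cd.campaignId = c.id
LEFT OUTER JOIN DataType dt ON dt.id = cd.dataTypeId
ORDER BY c.id, dt.id
""").fetchall()
print(campaign_rows)
```

Each campaign comes back once per selected data type, and a campaign with no selection comes back once with a NULL type.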
If you want to collapse the 3 data types into the same row, then give the following a shot. It's definitely on the hacky side, and it'll only work with a fixed number of data types. If you add another data type, you'll have to update this query to support it.
SELECT
Campaign.ID,
Campaign.foo,
Campaign.bar,
dataType1.hasDataType1,
dataType2.hasDataType2,
dataType3.hasDataType3
FROM
Campaign
LEFT OUTER JOIN
( SELECT
1 as hasDataType1,
Campaign_DataType.campaignID
FROM
DataType
INNER JOIN Campaign_DataType ON Campaign_DataType.dataTypeId = DataType.id
WHERE
DataType.Type = 'Type1'
) dataType1 ON dataType1.campaignID = Campaign.ID
LEFT OUTER JOIN
( SELECT
1 as hasDataType2,
Campaign_DataType.campaignID
FROM
DataType
INNER JOIN Campaign_DataType ON Campaign_DataType.dataTypeId = DataType.id
WHERE
DataType.Type = 'Type2'
) dataType2 ON dataType2.campaignID = Campaign.ID
LEFT OUTER JOIN
( SELECT
1 as hasDataType3,
Campaign_DataType.campaignID
FROM
DataType
INNER JOIN Campaign_DataType ON Campaign_DataType.dataTypeId = DataType.id
WHERE
DataType.Type = 'Type3'
) dataType3 ON dataType3.campaignID = Campaign.ID
The record you receive for each Campaign will have three fields: hasDataType1, hasDataType2, hasDataType3. These columns will be 1 for yes, NULL for no.
Looks to me like what you want here is a crosstab query. Take a look at:
Sql Server 2008 Cross Tab Query
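One common way to write such a crosstab without any engine-specific PIVOT syntax is conditional aggregation. The sketch below is only an illustration under assumed names (a Campaign_DataType join table and hasDataType1..3 output columns), run on SQLite through Python's sqlite3 module with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Campaign (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE DataType (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE Campaign_DataType (campaignId INTEGER, dataTypeId INTEGER);
INSERT INTO Campaign VALUES (1, 'Spring'), (2, 'Summer');
INSERT INTO DataType VALUES (1, 'Type1'), (2, 'Type2'), (3, 'Type3');
INSERT INTO Campaign_DataType VALUES (1, 1), (1, 3);
""")

# One output row per campaign; each flag is 1 if that type was selected,
# NULL otherwise (a CASE without ELSE yields NULL, and MAX ignores NULLs).
crosstab = conn.execute("""
SELECT c.id,
       MAX(CASE WHEN dt.type = 'Type1' THEN 1 END) AS hasDataType1,
       MAX(CASE WHEN dt.type = 'Type2' THEN 1 END) AS hasDataType2,
       MAX(CASE WHEN dt.type = 'Type3' THEN 1 END) AS hasDataType3
FROM Campaign c
LEFT JOIN Campaign_DataType cd ON cd.campaignId = c.id
LEFT JOIN DataType dt ON dt.id = cd.dataTypeId
GROUP BY c.id
ORDER BY c.id
""").fetchall()
print(crosstab)
```

Like any fixed-column crosstab, this still has to be edited whenever a new data type is added.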

Get values from all sub-divided child tables

I have three tables as below.
TransactionTable
----------------
TransactionID
Status
Value
FileNo (int)
FileType - 'E' indicates Email, 'D' Indicates Document
EmailTable
----------
EmailFileNo (Identity)
ReceivedDate
....
....
....
DocumentsTable
---------------
DocFileNo (Identity)
ReceivedDate
.....
.....
There is a one-to-many relationship between EmailTable and TransactionTable, and also between DocumentsTable and TransactionTable.
What is the name for this type of relationship? I just used the term sub-divided child tables.
I need to select TransactionID, ReceivedDate, Value where status is 'P'...
I could get the result using
Select A.TransactionID, IsNull(B.ReceivedDate, C.ReceivedDate) as ReceivedDate, A.Value
From TransactionTable as A
Left outer join EmailTable as B on A.FileNo = B.EmailFileNo and A.FileType='E'
Left outer join DocumentsTable as C on A.FileNo = C.DocFileNo and A.FileType = 'D'
where A.Status = 'P'
The above query gives me the result as expected... Is this the way it should be done or is there a better way to handle such scenarios ?
Edit : Included the where clause, which got missed during copy paste operation. Thanks for pointing this out.
Your query looks good. The only comment I'd make is that I don't see you satisfying the Status='P' condition that you specified in your requirements.
Select A.TransactionID, IsNull(B.ReceivedDate, C.ReceivedDate) as ReceivedDate, A.Value
From TransactionTable as A
Left outer join EmailTable as B
on A.FileNo = B.EmailFileNo
and A.FileType='E'
Left outer join DocumentsTable as C
on A.FileNo = C.DocFileNo
and A.FileType = 'D'
where A.Status = 'P'
Someone might have a better response, but that's pretty much it. You could opt for COALESCE instead of ISNULL, which permits a variable number of arguments, so you can add a third option if both the Email and Documents dates are NULL for some reason.
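To make the COALESCE suggestion concrete, here is a hedged sketch run on SQLite through Python's sqlite3 module. The sample rows and the '1900-01-01' fallback are hypothetical; on SQL Server the fallback would have to be type-compatible with ReceivedDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TransactionTable (
    TransactionID INTEGER PRIMARY KEY, Status TEXT, Value REAL,
    FileNo INTEGER, FileType TEXT);
CREATE TABLE EmailTable (EmailFileNo INTEGER PRIMARY KEY, ReceivedDate TEXT);
CREATE TABLE DocumentsTable (DocFileNo INTEGER PRIMARY KEY, ReceivedDate TEXT);
INSERT INTO EmailTable VALUES (1, '2020-01-01');
INSERT INTO DocumentsTable VALUES (1, '2020-02-02');
-- 100 points at an email, 101 at a document, 102 at a missing file,
-- 103 is not in status 'P'.
INSERT INTO TransactionTable VALUES
    (100, 'P', 10.0, 1, 'E'), (101, 'P', 20.0, 1, 'D'),
    (102, 'P', 30.0, 9, 'E'), (103, 'X', 40.0, 1, 'E');
""")

# COALESCE returns its first non-NULL argument, so the third value
# only appears when neither child table supplied a date.
pending = conn.execute("""
SELECT A.TransactionID,
       COALESCE(B.ReceivedDate, C.ReceivedDate, '1900-01-01') AS ReceivedDate,
       A.Value
FROM TransactionTable AS A
LEFT OUTER JOIN EmailTable AS B
       ON A.FileNo = B.EmailFileNo AND A.FileType = 'E'
LEFT OUTER JOIN DocumentsTable AS C
       ON A.FileNo = C.DocFileNo AND A.FileType = 'D'
WHERE A.Status = 'P'
ORDER BY A.TransactionID
""").fetchall()
print(pending)
```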
Everything that follows is just commentary on the schema. The table structure has a problem, but I'm sure you're now coding after these tables are already established, so this isn't necessarily a call for action. You probably have to live with them as they are.
My instinctive response would have been to assign TransactionId to the child tables, because they are not formally children right now. They are autonomous objects that TransactionTable happens to refer to.
I had a similar problem before, where I had a key column that didn't have a clear definition, and I eventually opted against it. It's not possible to build a formal constraint/foreign key for FileNo on TransactionTable, because FileNo could refer to either of the two tables.
(Incidentally your status = 'P' check is missing from your query.)
Also, if you keep adding new file types beyond 'E' and 'D', you are going to have to keep extending the query to new tables. A File table of some form, with the key fields on it, might have been one way of resolving this. [For all I know, you may already have some sort of File table.]
Not sure if any of this helps you, though. There's no way to improve upon your query without changing the table structures.