How to return rows matched in a table without multiple EXISTS clauses? - sql

I want to pull back results from one table that match ALL specified values where the specified values are in another table. I can do it like this:
SELECT * FROM Contacts
WHERE
EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = '8C62E5DE-00FC-4994-8127-000B02E10DA5')
AND EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = 'D2E90AA0-AC93-4406-AF93-0020009A34BA')
AND EXISTS etc...
However that falls over when I get up to about 40 EXISTS clauses. The error message is "The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query."

The gist of this is to
Select all contacts with any GUID from the IN statement
Use a DISTINCT COUNT to get a count for each contactid on matching GUID's
Use the HAVING to retain only those contacts that equal the amount of matching GUID's you've put into the IN statement
SQL Statement
SELECT *
FROM dbo.Contacts c
INNER JOIN (
SELECT c.ID
FROM dbo.Contacts c
INNER JOIN dbo.ContactClassifications cc ON c.ID = cc.ContactID
WHERE cc.ClassificationID IN ('..', '..', 38 other GUIDS)
GROUP BY
c.ID
HAVING COUNT(DISTINCT cc.ClassificationID) = 40
) cc ON cc.ID = c.ID
Test script at data.stackexchange

One solution is to demand that no classification exists without a matching contact. That's a double negation:
select *
from contacts c
where not exists
(
select *
from ContactClassifications cc
where not exists
(
select *
from ContactClassifications cc2
where cc2.ContactID = c.ID
and cc2.ClassificationID = cc.ClassificationID
)
)
This type of problem is known as relational division.

SELECT c.*
FROM Contacts c
INNER JOIN
(cc.ContactID, COUNT(DISTINCT cc.ClassificationID) as num_class
FROM ContactClassifications
WHERE ClassificationID IN (....)
GROUP BY cc.ContactID
) b ON c.ID = b.ContactID
WHERE b.num_class = [number of distinct values - how many different values you put in "IN"]
If you run SQLServer 2005 and higher, you can do pretty much the same with CROSS APPLY, supposedly more efficiently

Related

Remove duplicates from result in sql

i have following sql in java project:
select distinct * from drivers inner join licenses on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false
And result i database looks like this:
And i would like to get only one result instead of two duplicated results.
How can i do that?
Solution 1 - That's Because one of data has duplicate value write distinct keyword with only column you want like this
Select distinct id, distinct creation_date, distinct modification_date from
YourTable
Solution 2 - apply distinct only on ID and once you get id you can get all data using in query
select * from yourtable where id in (select distinct id from drivers inner join
licenses
on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false )
Enum fields name on select, using COALESCE for fields which value is null.
usually you dont query distinct with * (all columns), because it means if one column has the same value but the rest isn't, it will be treated as a different rows. so you have to distinct only the column you want to, then get the data
I suspect that you want left joins like this:
select *
from users u left join
drivers d
on d.user_id = u.id and d.status = 'WAITING' left join
licenses l
on d.user_id = l.issuer_id and l.state = 'ISSUED'
where u.is_deleted = false and
(d.user_id is not null or l.issuer_id is not null);

comparison query taking ages

My query is quite simple:
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a, COMPANIES b
where a.Postcode = b.Postcode
and a.Adres = b.Adres
and (
select COUNT(COMPANYID)
from USERS
where COMPANYID=a.ID
)>(
select COUNT(COMPANYID)
from USERS
where COMPANYID=b.ID
)
Database: sql server 2008 r2
What I'm trying to do:
The table of COMPANIES contains double entries. I want to know the ones that are connected to the most amount of users. So I only have to change the foreign keys of those with the least. ( I already know the id's of the doubles)
Right now it's taking a lot of time to complete. I was wondering if if could be done faster
Try this version. It should be only a little faster. The COUNT is quite slow. I've added a.ID <> b.ID to avoid few cases earlier.
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a INNER JOIN COMPANIES b
ON
a.ID <> b.ID
and a.Postcode = b.Postcode
and a.Adres = b.Adres
and (
select COUNT(COMPANYID)
from USERS
where COMPANYID=a.ID
)>(
select COUNT(COMPANYID)
from USERS
where COMPANYID=b.ID
)
The FROM ... INNER JOIN ... ON ... is a preferred SQL construct to join tables. It may be faster too.
One approach would be to pre-calculate the COMPANYID count before doing the join since you'll be repeatedly calculating it in the main query. i.e. something like:
insert into #CompanyCount (ID, IDCount)
select COMPANYID, COUNT(COMPANYID)
from USERS
group by COMPANYID
Then your main query:
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a
inner join #CompanyCount aCount on aCount.ID = a.ID
inner join COMPANIES b on b.Postcode = a.Postcode and b.Adres = a.Adres
inner join #CompanyCount bCount on bCount.ID = b.ID and aCount.IDCount > bCount.IDCount
If you want all instances of a even though there is no corresponding b then you'd need to have left outer joins to b and bCount.
However you need to look at the query plan - which indexes are you using - you probably want to have them on the IDs and the Postcode and Adres fields as a minimum since you're joining on them.
Build an index on postcode and adres
The database probably executes the subselects for every row. (Just guessing here, veryfy it in the explain plan. If this is the case you can rewrite the query to join with the inline views (note this is how it would look in oracle hop it works in sql server as well):
select distinct a.ID, a.adres, a.place, a.postalcode
from
COMPANIES a,
COMPANIES b,
(
select COUNT(COMPANYID) cnt, companyid
from USERS
group by companyid) cntA,
(
select COUNT(COMPANYID) cnt, companyid
from USERS
group by companyid) cntb
where a.Postcode = b.Postcode
and a.Adres = b.Adres
and a.ID<>b.ID
and cnta.cnt>cntb.cnt

Filter a SQL Server table dynamically using multiple joins

I am trying to filter a single table (master) by the values in multiple other tables (filter1, filter2, filter3 ... filterN) using only joins.
I want the following rules to apply:
(A) If one or more rows exist in a filter table, then include only those rows from the master that match the values in the filter table.
(B) If no rows exist in a filter table, then ignore it and return all the rows from the master table.
(C) This solution should work for N filter tables in combination.
(D) Static SQL using JOIN syntax only, no Dynamic SQL.
I'm really trying to get rid of dynamic SQL wherever possible, and this is one of those places I truly think it's possible, but just can't quite figure it out. Note: I have solved this using Dynamic SQL already, and it was fairly easy, but not particularly efficient or elegant.
What I have tried:
Various INNER JOINS between master and filter tables - works for (A) but fails on (B) because the join removes all records from the master (left) side when the filter (right) side has no rows.
LEFT JOINS - Always returns all records from the master (left) side. This fails (A) when some filter tables have records and some do not.
What I really need:
It seems like what I need is to be able to INNER JOIN on each filter table that has 1 or more rows and LEFT JOIN (or not JOIN at all) on each filter table that is empty.
My question: How would I accomplish this without resorting to Dynamic SQL?
In SQL Server 2005+ you could try this:
WITH
filter1 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable1 f ON join_condition
),
filter2 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable2 f ON join_condition
),
…
SELECT m.*
FROM masterdata m
INNER JOIN filter1 f1 ON m.ID = f1.ID AND f1.HasMatched = f1.AllHasMatched
INNER JOIN filter2 f2 ON m.ID = f2.ID AND f2.HasMatched = f2.AllHasMatched
…
My understanding is, filter tables without any matches simply must not affect the resulting set. The output should only consist of those masterdata rows that have matched all the filters where matches have taken place.
SELECT *
FROM master_table mt
WHERE (0 = (select count(*) from filter_table_1)
OR mt.id IN (select id from filter_table_1)
AND (0 = (select count(*) from filter_table_2)
OR mt.id IN (select id from filter_table_2)
AND (0 = (select count(*) from filter_table_3)
OR mt.id IN (select id from filter_table_3)
Be warned that this could be inefficient in practice. Unless you have a specific reason to kill your existing, working, solution, I would keep it.
Do inner join to get results for (A) only and do left join to get results for (B) only (you will have to put something like this in the where clause: filterN.column is null) combine results from inner join and left join with UNION.
Left Outer Join - gives you the MISSING entries in master table ....
SELECT * FROM MASTER M
INNER JOIN APPRENTICE A ON A.PK = M.PK
LEFT OUTER JOIN FOREIGN F ON F.FK = M.PK
If FOREIGN has keys that is not a part of MASTER you will have "null columns" where the slots are missing
I think that is what you looking for ...
Mike
First off, it is impossible to have "N number of Joins" or "N number of filters" without resorting to dynamic SQL. The SQL language was not designed for dynamic determination of the entities against which you are querying.
Second, one way to accomplish what you want (but would be built dynamically) would be something along the lines of:
Select ...
From master
Where Exists (
Select 1
From filter_1
Where filter_1 = master.col1
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_1
)
Intersect
Select 1
From filter_2
Where filter_2 = master.col2
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_2
)
...
Intersect
Select 1
From filter_N
Where filter_N = master.colN
Union All
Select 1
From ( Select 1 )
Where Not Exists (
Select 1
From filter_N
)
)
I have previously posted a - now deleted - answer based on wrong assumptions on you problems.
But I think you could go for a solution where you split your initial search problem into a matter of constructing the set of ids from the master table, and then select the data joining on that set of ids. Here I naturally assume you have a kind of ID on your master table. The filter tables contains the filter values only. This could then be combined into the statement below, where each SELECT in the eligble subset provides a set of master ids, these are unioned to avoid duplicates and that set of ids are joined to the table with data.
SELECT * FROM tblData INNER JOIN
(
SELECT id FROM tblData td
INNER JOIN fa on fa.a = td.a
UNION
SELECT id FROM tblData td
INNER JOIN fb on fb.b = td.b
UNION
SELECT id FROM tblData td
INNER JOIN fc on fc.c = td.c
) eligible ON eligible.id = tblData.id
The test has been made against the tables and values shown below. These are just an appendix.
CREATE TABLE tblData (id int not null primary key identity(1,1), a varchar(40), b datetime, c int)
CREATE TABLE fa (a varchar(40) not null primary key)
CREATE TABLE fb (b datetime not null primary key)
CREATE TABLE fc (c int not null primary key)
Since you have filter tables, I am assuming that these tables are probably dynamically populated from a front-end. This would mean that you have these tables as #temp_table (or even a materialized table, doesn't matter really) in your script before filtering on the master data table.
Personally, I use the below code bit for filtering dynamically without using dynamic SQL.
SELECT *
FROM [masterdata] [m]
INNER JOIN
[filter_table_1] [f1]
ON
[m].[filter_column_1] = ISNULL(NULLIF([f1].[filter_column_1], ''), [m].[filter_column_1])
As you can see, the code NULLs the JOIN condition if the column value is a blank record in the filter table. However, the gist in this is that you will have to actively populate the column value to blank in case you do not have any filter records on which you want to curtail the total set of the master data. Once you have populated the filter table with a blank, the JOIN condition NULLs in those cases and instead joins on itself with the same column from the master data table. This should work for all the cases you mentioned in your question.
I have found this bit of code to be faster in terms of performance.
Hope this helps. Please let me know in the comments.

Inner join query

Please go thourgh Attached Image where i descirbed my scenario:
I want SQL Join query.
Have a look at something like
SELECT *
FROM Orders o
WHERE EXISTS (
SELECT 1
FROM OrderBooks ob INNER JOIN
Books b ON ob.BookID = b.BookID
WHERE o.OrderID = ob.OrderID
AND b.IsBook = #IsBook
)
The query will return all orders based on the given criteria.
So, what it does is, when #IsBook = 1 it will return all Orders where there exists 1 or more entries linked to this order that are Books. And if #IsBook = 0 it will return all Orders where there exists 1 or more entries linked to this order that are not Books.
Inner join is a method that is used to combine two or more tables together on base of common field from both tables. the both keys must be of same type and of length in regardless of name.
here is an example,
Table1
id Name Sex
1 Akash Male
2 Kedar Male
similarly another table
Table2
id Address Number
1 Nadipur 18281794
2 Pokhara 54689712
Now we can perform inner join operation using the following Sql statements
select A.id, A.Name, B.Address, B.Number from Table1 A
INNER JOIN Table2 B
ON A.id = B.id
Now the above query gives one to one relation details.

INNER JOIN vs IN

SELECT C.* FROM StockToCategory STC
INNER JOIN Category C ON STC.CategoryID = C.CategoryID
WHERE STC.StockID = #StockID
VS
SELECT * FROM Category
WHERE CategoryID IN
(SELECT CategoryID FROM StockToCategory WHERE StockID = #StockID)
Which is considered the correct (syntactically) and most performant approach and why?
The syntax in the latter example seems more logical to me but my assumption is the JOIN will be faster.
I have looked at the query plans and havent been able to decipher anything from them.
Query Plan 1
Query Plan 2
The two syntaxes serve different purposes. Using the Join syntax presumes you want something from both the StockToCategory and Category table. If there are multiple entries in the StockToCategory table for each category, the Category table values will be repeated.
Using the IN function presumes that you want only items from the Category whose ID meets some criteria. If a given CategoryId (assuming it is the PK of the Category table) exists multiple times in the StockToCategory table, it will only be returned once.
In your exact example, they will produce the same output however IMO, the later syntax makes your intent (only wanting categories), clearer.
Btw, yet a third syntax which is similar to using the IN function:
Select ...
From Category
Where Exists (
Select 1
From StockToCategory
Where StockToCategory.CategoryId = Category.CategoryId
And StockToCategory.Stock = #StockId
)
Syntactically (semantically too) these are both correct. In terms of performance they are effectively equivalent, in fact I would expect SQL Server to generate the exact same physical plans for these two queries.
T think There are just two ways to specify the same desired result.
for sqlite
table device_group_folders contains 10 records
table device_groups contains ~100000 records
INNER JOIN: 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs INNER JOIN device_groups ON device_groups.parent = select_childs.uuid;
WHERE 31 ms
WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT device_groups.uuid FROM select_childs, device_groups WHERE device_groups.parent = select_childs.uuid;
IN <1 ms
SELECT device_groups.uuid FROM device_groups WHERE device_groups.parent IN (WITH RECURSIVE select_childs(uuid) AS (
SELECT uuid FROM device_group_folders WHERE uuid = '000B:653D1D5D:00000003'
UNION ALL
SELECT device_group_folders.uuid FROM device_group_folders INNER JOIN select_childs ON parent = select_childs.uuid
) SELECT * FROM select_childs);