How to use a combination of columns as a condition in SQL?

I'm working on an integration project and have created a batchlog table that records which combinations have already been exported; new data is then checked against that table. This worked well as long as I stored just one ID in the batchlog table, say the customer ID, and selected new rows from the Customer table like this:
SELECT *
FROM Customer
WHERE CusId NOT IN (SELECT CusID FROM IntegrationBatchlog)
However, the solution is now more complex: the same row from the Customer table is exported several times in combination with other data. I now have a couple of separate stored procedures, more columns in the IntegrationBatchlog table (CusID, OrderTypeID and PaymentMethod), and join clauses in my select, so it looks more like this:
SELECT * FROM Customer c
JOIN....
JOIN...
JOIN...
WHERE there is not a row with that CusID AND OrderTypeID AND PaymentMethod in batchlog table yet.
So I need to check whether this exact combination has already been exported. How do you do that when the batchlog table has three ID columns and you want to exclude only those rows where all three IDs are already present in the same batchlog row?

One way is to do a LEFT JOIN to the IntegrationBatchLog table and only insert rows that aren't present.
select *
from Customer c
LEFT OUTER JOIN IntegrationBatchLog i
on c.CusId = i.CusId
and c.OrderTypeID = i.OrderTypeID
and c.PaymentMethod = i.PaymentMethod
where
i.CusId is null
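To make the LEFT JOIN approach concrete, here is a minimal runnable sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the asker's unspecified database, and a hypothetical Candidates table standing in for the result of the Customer joins; the sample rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; Candidates is a
# hypothetical table representing the joined Customer result set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Candidates (CusID INTEGER, OrderTypeID INTEGER, PaymentMethod TEXT);
    CREATE TABLE IntegrationBatchlog (CusID INTEGER, OrderTypeID INTEGER, PaymentMethod TEXT);
    INSERT INTO Candidates VALUES (1, 10, 'CARD'), (1, 20, 'CARD'), (2, 10, 'CASH');
    INSERT INTO IntegrationBatchlog VALUES (1, 10, 'CARD');
""")

# Anti-join: the LEFT JOIN yields NULLs on the batchlog side when no row
# matches all three columns, and the WHERE clause keeps only those rows.
not_exported = conn.execute("""
    SELECT c.CusID, c.OrderTypeID, c.PaymentMethod
    FROM Candidates c
    LEFT OUTER JOIN IntegrationBatchlog i
      ON  c.CusID = i.CusID
      AND c.OrderTypeID = i.OrderTypeID
      AND c.PaymentMethod = i.PaymentMethod
    WHERE i.CusID IS NULL
    ORDER BY c.CusID, c.OrderTypeID
""").fetchall()

print(not_exported)
```

Customer 1 still shows up for order type 20 even though the combination (1, 10, 'CARD') has already been logged, which is the behaviour the question asks for.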

Use NOT EXISTS rather than NOT IN. It allows matching on multiple columns.
This is standard SQL:
SELECT * FROM Customer c
JOIN....
JOIN...
JOIN...
WHERE NOT EXISTS (
SELECT * FROM IntegrationBatchlog I
WHERE C.CusID = I.CusID
AND C.OrderTypeID = I.OrderTypeID
AND C.PaymentMethod = I.PaymentMethod)
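As a rough sketch of how the NOT EXISTS check could drive a repeatable export, the example below selects the pending combinations, logs them, and runs the same query again. SQLite via Python's sqlite3 and the Candidates table (a stand-in for the joined Customer result) are assumptions for illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; Candidates is a
# hypothetical table representing the joined Customer result set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Candidates (CusID INTEGER, OrderTypeID INTEGER, PaymentMethod TEXT);
    CREATE TABLE IntegrationBatchlog (CusID INTEGER, OrderTypeID INTEGER, PaymentMethod TEXT);
    INSERT INTO Candidates VALUES (1, 10, 'CARD'), (1, 20, 'CARD'), (2, 10, 'CASH');
    INSERT INTO IntegrationBatchlog VALUES (1, 10, 'CARD');
""")

# A row is pending when no batchlog row matches it on all three columns.
PENDING_SQL = """
    SELECT c.CusID, c.OrderTypeID, c.PaymentMethod
    FROM Candidates c
    WHERE NOT EXISTS (
        SELECT 1
        FROM IntegrationBatchlog i
        WHERE i.CusID = c.CusID
          AND i.OrderTypeID = c.OrderTypeID
          AND i.PaymentMethod = c.PaymentMethod)
    ORDER BY c.CusID, c.OrderTypeID
"""

first_run = conn.execute(PENDING_SQL).fetchall()

# Record what was just exported, then check again.
conn.executemany("INSERT INTO IntegrationBatchlog VALUES (?, ?, ?)", first_run)
second_run = conn.execute(PENDING_SQL).fetchall()

print(first_run)
print(second_run)
```

The second run comes back empty because every combination is now present in the batchlog.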

SELECT ...
FROM Customer c JOIN ...
WHERE NOT EXISTS (SELECT *
FROM IntegrationBatchLog I
WHERE I.CusID = c.CusId AND
I.OrderTypeId = c.OrderTypeID ...)

Maybe NOT EXISTS would work here. Here's an example from the MySQL docs (I don't know your DB) - http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html
The second example:
"What kind of store is present in no cities?"
SELECT DISTINCT store_type FROM stores
WHERE NOT EXISTS (SELECT * FROM cities_stores
WHERE cities_stores.store_type = stores.store_type);
Maybe yours could be:
SELECT * FROM Customer c
JOIN....
JOIN...
JOIN...
WHERE NOT EXISTS (
SELECT * FROM batchlog WHERE
c.CusID = batchlog.CusID AND
c.OrderTypeID = batchlog.OrderTypeID AND
c.PaymentMethod = batchlog.PaymentMethod
)

Related

Remove duplicates from result in sql

I have the following SQL in a Java project:
select distinct * from drivers inner join licenses on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false
And the result in the database looks like this:
I would like to get only one result instead of two duplicated results.
How can I do that?
Solution 1 - That happens because the rows differ in at least one column. Write the DISTINCT keyword once, followed by only the columns you want, like this:
SELECT DISTINCT id, creation_date, modification_date
FROM YourTable
Solution 2 - Apply DISTINCT only to the ID, and once you have the IDs, fetch all the data with an IN query:
select * from yourtable where id in (select distinct id from drivers inner join
licenses
on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false )
Enumerate the field names in the SELECT, using COALESCE for fields whose value is NULL.
Usually you don't combine DISTINCT with * (all columns): if two rows agree on one column but differ in any other, they are treated as different rows. Apply DISTINCT only to the columns you care about, then fetch the rest of the data.
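A small sketch of why DISTINCT * does not collapse the rows. It assumes the duplicates come from one driver matching two license rows (the original screenshot is not available, so this is a guess), and uses SQLite via Python's sqlite3 with made-up data and simplified tables.

```python
import sqlite3

# Assumption: one driver joined to two licenses is what produces the
# "duplicates"; tables are reduced to the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drivers (user_id INTEGER, status TEXT);
    CREATE TABLE licenses (id INTEGER, issuer_id INTEGER, state TEXT);
    INSERT INTO drivers VALUES (1, 'WAITING');
    INSERT INTO licenses VALUES (100, 1, 'ISSUED'), (101, 1, 'ISSUED');
""")

# DISTINCT over every column: the two rows differ in licenses.id,
# so both survive.
all_columns = conn.execute("""
    SELECT DISTINCT *
    FROM drivers d
    INNER JOIN licenses l ON d.user_id = l.issuer_id
""").fetchall()

# DISTINCT over the driver columns only: the rows are now identical
# and collapse into one.
driver_columns = conn.execute("""
    SELECT DISTINCT d.user_id, d.status
    FROM drivers d
    INNER JOIN licenses l ON d.user_id = l.issuer_id
""").fetchall()

print(len(all_columns), driver_columns)
```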
I suspect that you want left joins like this:
select *
from users u left join
drivers d
on d.user_id = u.id and d.status = 'WAITING' left join
licenses l
on d.user_id = l.issuer_id and l.state = 'ISSUED'
where u.is_deleted = false and
(d.user_id is not null or l.issuer_id is not null);

How to select records which don't exist in another table or have a different status?

I am trying to select records from a temp table based on another temp table which holds their previous statuses (StatusHistory).
So if the record doesn't exist in the status table, then it should be selected. If the status of the record is different than the one in the StatusHistory table, then the record should be selected. Otherwise, if it exists with the same status in the StatusHistory table, then it should be ignored.
I have this SQL but it doesn't seem to be the best solution. Can you please point me to a better way to achieve that assuming that there are thousands of records in the tables? Would it be possible to achieve the same result with a JOIN statement?
SELECT AI.item
FROM #AllItems AI
WHERE NOT EXISTS (
SELECT * FROM #StatusHistory HS
WHERE HS.itemId = AI.itemId
) OR NOT AI.itemStatus IN ( SELECT HS.itemStatusHistory
FROM #StatusHistory HS
WHERE HS.itemId = AI.itemId
AND HS.itemId = AI.itemId )
Yes, you can do this with a LEFT JOIN.
SELECT AI.item
FROM #AllItems AI
LEFT JOIN #StatusHistory HS ON AI.itemId = HS.itemId
AND AI.itemStatus = HS.itemStatusHistory
WHERE HS.itemId IS NULL
A better solution, however, is to use NOT EXISTS:
SELECT AI.item
FROM #AllItems AI
WHERE NOT EXISTS
(
SELECT 1 FROM #StatusHistory SH
WHERE SH.itemId = AI.itemId
AND SH.itemStatusHistory = AI.itemStatus
);
As pointed out by Aaron, this usually performs better than a LEFT JOIN.
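Here is a minimal sketch that runs both variants on the same made-up data to check that they agree. It assumes SQLite via Python's sqlite3, so the temp tables are written as ordinary tables without the SQL Server # prefix.

```python
import sqlite3

# Assumption: plain SQLite tables replace the #temp tables of the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AllItems (itemId INTEGER, item TEXT, itemStatus TEXT);
    CREATE TABLE StatusHistory (itemId INTEGER, itemStatusHistory TEXT);
    INSERT INTO AllItems VALUES
        (1, 'unchanged', 'OPEN'),
        (2, 'changed', 'CLOSED'),
        (3, 'new', 'OPEN');
    INSERT INTO StatusHistory VALUES (1, 'OPEN'), (2, 'OPEN');
""")

# Select items that have no history row with the same id AND status.
not_exists_rows = conn.execute("""
    SELECT AI.item
    FROM AllItems AI
    WHERE NOT EXISTS (
        SELECT 1 FROM StatusHistory SH
        WHERE SH.itemId = AI.itemId
          AND SH.itemStatusHistory = AI.itemStatus)
    ORDER BY AI.itemId
""").fetchall()

# Same result expressed as a LEFT JOIN anti-join.
left_join_rows = conn.execute("""
    SELECT AI.item
    FROM AllItems AI
    LEFT JOIN StatusHistory HS
      ON AI.itemId = HS.itemId AND AI.itemStatus = HS.itemStatusHistory
    WHERE HS.itemId IS NULL
    ORDER BY AI.itemId
""").fetchall()

print(not_exists_rows, left_join_rows)
```

Item 1 is ignored because it exists with the same status; item 2 is selected because its status changed; item 3 is selected because it has no history at all.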

Select Combination of columns from Table A not in Table B

I have two tables holding all the contacts on an account, their titles, etc. For data migration purposes I need to select all ContactIds with their AccountID from Table A that do not exist in Table B. It's the combination of both the ContactId and the AccountID that has to match. I have tried the following:
Select * from Final_Combined_Result wfcr
WHERE NOT EXISTS (Select Contact_ID, Account_ID from Temp_WFCR)
I know this is completely off, but I have looked at a couple of other questions on here but was unable to find an appropriate solution.
I have also tried this:
Select * from Final_Combined_Result wfcr
WHERE NOT EXISTS
(Select Contact_ID, Account_ID from Temp_WFCR as tc
where tc.Account_ID=wfcr.Account_InternalID
AND tc.Account_ID=wfcr.Contact_InternalID)
This seems to be correct but I would like to make sure.
Select wfcr.ContactsId, wfcr.AccountID
from Final_Combined_Result wfcr
left join Temp_WFCR t_wfcr ON t_wfcr.ContactsIds = wfcr.ContactsId
AND t_wfcr.AccountID = wfcr.AccountID
WHERE t_wfcr.AccountID is null
See this great explanation of joins
@juergend's answer shows the left join.
With NOT EXISTS you put the join conditions inside the subselect; it would look like this:
Select wfcr.*
from
Final_Combined_Result wfcr
WHERE NOT EXISTS
(Select 1 -- the selected values don't matter here; only the join conditions restrict the result
from
Temp_WFCR t
where t.Contact_ID = wfcr.Contact_InternalID
and t.account_id = wfcr.Account_InternalID
)
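To show why the correlation inside the subselect matters, the sketch below compares the question's first (uncorrelated) attempt with the correlated version. SQLite via Python's sqlite3 and the sample rows are assumptions for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Final_Combined_Result (Contact_InternalID INTEGER, Account_InternalID INTEGER);
    CREATE TABLE Temp_WFCR (Contact_ID INTEGER, Account_ID INTEGER);
    INSERT INTO Final_Combined_Result VALUES (1, 100), (1, 200), (2, 100);
    INSERT INTO Temp_WFCR VALUES (1, 100);
""")

# Uncorrelated: the subquery returns a row as soon as Temp_WFCR is
# non-empty, so NOT EXISTS is false for every outer row.
uncorrelated = conn.execute("""
    SELECT * FROM Final_Combined_Result wfcr
    WHERE NOT EXISTS (SELECT Contact_ID, Account_ID FROM Temp_WFCR)
""").fetchall()

# Correlated on both columns: only exact (contact, account) pairs
# already present in Temp_WFCR are excluded.
correlated = conn.execute("""
    SELECT wfcr.*
    FROM Final_Combined_Result wfcr
    WHERE NOT EXISTS (
        SELECT 1 FROM Temp_WFCR t
        WHERE t.Contact_ID = wfcr.Contact_InternalID
          AND t.Account_ID = wfcr.Account_InternalID)
    ORDER BY wfcr.Contact_InternalID, wfcr.Account_InternalID
""").fetchall()

print(uncorrelated, correlated)
```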

Compare 2 tables and find the missing record

I have 2 database tables:
customers and customers_1
I have 100 customers in the customers table but only 99 customers in the customers_1 table. I would like to write a query that will compare the 2 tables and will result in the missing row.
I have tried this following SQL:
select * from customers c where in (select * from customers_1)
But this will only check for the one table.
Your query won't work as written. You have to compare a column against the subquery and use NOT IN instead of IN:
select *
from customers c
where customerid not in (select customerid from customers_1)
However, since you are on SQL Server 2008, you can use EXCEPT:
SELECT * FROM customers
EXCEPT
SELECT * FROM customers_1;
This gives you the rows that are in the customers table but not in the customers_1 table:
EXCEPT returns any distinct values from the left query that are not also found on the right query.
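A quick sketch of EXCEPT on made-up data, assuming SQLite via Python's sqlite3 (which also supports EXCEPT). Note that with SELECT * the comparison covers every column, so a row whose name differs between the tables is reported as well, not only a row that is missing entirely.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; columns and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER, name TEXT);
    CREATE TABLE customers_1 (customerid INTEGER, name TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO customers_1 VALUES (1, 'Ann'), (3, 'Cyrus');
""")

# Whole-row comparison: customer 2 is missing, customer 3 differs by name.
whole_rows = sorted(conn.execute("""
    SELECT * FROM customers
    EXCEPT
    SELECT * FROM customers_1
""").fetchall())

# Key-only comparison: only the truly missing customer remains.
keys_only = sorted(conn.execute("""
    SELECT customerid FROM customers
    EXCEPT
    SELECT customerid FROM customers_1
""").fetchall())

print(whole_rows, keys_only)
```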
This is easy. Just join them with a left outer join and check for NULL in the table which has the 99 rows. It will look something like this.
SELECT * FROM customers c
LEFT JOIN customers_1 c1 ON c.some_key = c1.some_key
WHERE c1.some_key IS NULL
Instead of NOT IN, consider using NOT EXISTS. It often performs at least as well, and unlike NOT IN it still returns the expected rows when the subquery column contains NULLs. Your query would look like:
SELECT * FROM Customer c WHERE NOT EXISTS (SELECT 1 FROM Customer_1 c1 WHERE c.Customer_Id = c1.Customer_Id)
SELECT 1 is just for readability so everyone will know that I don't care about the actual data.
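One practical difference between the two forms is how they react to a NULL in the subquery column. The sketch below, assuming SQLite via Python's sqlite3 and made-up rows, runs both against a customers_1 table that contains a NULL id.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the NULL row is made up
# to demonstrate the difference.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER);
    CREATE TABLE customers_1 (customerid INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO customers_1 VALUES (1), (3), (NULL);
""")

# NOT IN: "2 NOT IN (1, 3, NULL)" evaluates to unknown, so no row qualifies.
not_in_rows = conn.execute("""
    SELECT customerid FROM customers
    WHERE customerid NOT IN (SELECT customerid FROM customers_1)
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so customer 2 is found.
not_exists_rows = conn.execute("""
    SELECT c.customerid FROM customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM customers_1 c1
        WHERE c1.customerid = c.customerid)
""").fetchall()

print(not_in_rows, not_exists_rows)
```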

How to return rows matched in a table without multiple EXISTS clauses?

I want to pull back results from one table that match ALL specified values where the specified values are in another table. I can do it like this:
SELECT * FROM Contacts
WHERE
EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = '8C62E5DE-00FC-4994-8127-000B02E10DA5')
AND EXISTS (SELECT 1 FROM dbo.ContactClassifications WHERE ContactID = Contacts.ID AND ClassificationID = 'D2E90AA0-AC93-4406-AF93-0020009A34BA')
AND EXISTS etc...
However that falls over when I get up to about 40 EXISTS clauses. The error message is "The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query."
The gist of this is to:
1. Select all contacts with any GUID from the IN list.
2. Use COUNT(DISTINCT ...) to count the matching GUIDs for each contact ID.
3. Use HAVING to keep only those contacts whose count equals the number of GUIDs you put into the IN list.
SQL Statement
SELECT *
FROM dbo.Contacts c
INNER JOIN (
SELECT c.ID
FROM dbo.Contacts c
INNER JOIN dbo.ContactClassifications cc ON c.ID = cc.ContactID
WHERE cc.ClassificationID IN ('..', '..', 38 other GUIDS)
GROUP BY
c.ID
HAVING COUNT(DISTINCT cc.ClassificationID) = 40
) cc ON cc.ID = c.ID
Test script at data.stackexchange
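The GROUP BY / HAVING approach can be sketched on a tiny data set as follows. SQLite via Python's sqlite3 is an assumption, and the short strings 'A', 'B', 'C' are hypothetical stand-ins for the classification GUIDs.

```python
import sqlite3

# Assumption: 'A', 'B', 'C' stand in for GUIDs; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (ID INTEGER, Name TEXT);
    CREATE TABLE ContactClassifications (ContactID INTEGER, ClassificationID TEXT);
    INSERT INTO Contacts VALUES (1, 'has both'), (2, 'has one'), (3, 'has both plus extra');
    INSERT INTO ContactClassifications VALUES
        (1, 'A'), (1, 'B'),
        (2, 'A'),
        (3, 'A'), (3, 'B'), (3, 'C'), (3, 'A');
""")

wanted = ["A", "B"]  # every one of these must be present for a contact
placeholders = ", ".join("?" for _ in wanted)

# Keep contacts whose number of distinct matching classifications equals
# the number of wanted values. DISTINCT guards against duplicate rows
# such as the repeated (3, 'A').
sql = f"""
    SELECT c.ID
    FROM Contacts c
    INNER JOIN (
        SELECT cc.ContactID
        FROM ContactClassifications cc
        WHERE cc.ClassificationID IN ({placeholders})
        GROUP BY cc.ContactID
        HAVING COUNT(DISTINCT cc.ClassificationID) = ?
    ) m ON m.ContactID = c.ID
    ORDER BY c.ID
"""
matching = [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

print(matching)
```

The query stays the same size however many values are in the list, which avoids the growing chain of EXISTS clauses from the question.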
One solution is to demand that no classification exists that the contact does not have. That's a double negation:
select *
from contacts c
where not exists
(
select *
from ContactClassifications cc
where not exists
(
select *
from ContactClassifications cc2
where cc2.ContactID = c.ID
and cc2.ClassificationID = cc.ClassificationID
)
)
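A minimal sketch of the double-negation query on made-up data, assuming SQLite via Python's sqlite3. As written, the query checks against every classification that appears anywhere in ContactClassifications; to test a specific list, the outer subquery would presumably need to be restricted to those values.

```python
import sqlite3

# Assumption: 'A' and 'B' stand in for GUIDs; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (ID INTEGER);
    CREATE TABLE ContactClassifications (ContactID INTEGER, ClassificationID TEXT);
    INSERT INTO Contacts VALUES (1), (2), (3);
    INSERT INTO ContactClassifications VALUES (1, 'A'), (1, 'B'), (2, 'A');
""")

# "There is no classification for which this contact has no row."
has_all = [row[0] for row in conn.execute("""
    SELECT c.ID
    FROM Contacts c
    WHERE NOT EXISTS (
        SELECT 1
        FROM ContactClassifications cc
        WHERE NOT EXISTS (
            SELECT 1
            FROM ContactClassifications cc2
            WHERE cc2.ContactID = c.ID
              AND cc2.ClassificationID = cc.ClassificationID))
    ORDER BY c.ID
""")]

print(has_all)
```

Contact 2 lacks 'B' and contact 3 has no classifications at all, so only contact 1 is returned.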
This type of problem is known as relational division.
SELECT c.*
FROM Contacts c
INNER JOIN
(SELECT cc.ContactID, COUNT(DISTINCT cc.ClassificationID) as num_class
FROM ContactClassifications cc
WHERE ClassificationID IN (....)
GROUP BY cc.ContactID
) b ON c.ID = b.ContactID
WHERE b.num_class = [number of distinct values - how many different values you put in "IN"]
If you run SQL Server 2005 or higher, you can do much the same with CROSS APPLY, supposedly more efficiently.