Find rows which have never satisfied a condition - sql

Say I have a table of customers with three possible statuses: loan default, open loan, paid in full.
How can I find the customers who never defaulted?
Example: John and Alex had multiple loans with different statuses.
id | customer | status
----------------------
1 john default
1 john open
1 john paid
2 alex open
2 alex paid
John defaulted once and Alex never defaulted. A simple where status <> "default" attempt doesn't work because it incorrectly includes John's non-defaulted cases. The result should give me:
id | customer
-------------
2 alex

You can use aggregation and having:
select id, customer
from t
group by id, customer
having sum(case when status = 'default' then 1 else 0 end) = 0;
The having clause counts the number of defaults for each customer and returns those customers with no defaults.
If you have a separate table of customers, I would recommend not exists:
select c.*
from customers c
where not exists (select 1
from t
where t.id = c.id and t.status = 'default'
);
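To check the aggregation approach against the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The in-memory database and the table name t are assumptions for illustration; the original question does not name a database engine. The naive filter is run alongside it to show why it wrongly keeps John.

```python
import sqlite3

# Assumption: an in-memory SQLite table named t mirrors the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, customer TEXT, status TEXT);
    INSERT INTO t VALUES
        (1, 'john', 'default'), (1, 'john', 'open'), (1, 'john', 'paid'),
        (2, 'alex', 'open'), (2, 'alex', 'paid');
""")

# Count defaults per customer and keep only the groups with zero defaults.
having_rows = conn.execute("""
    SELECT id, customer
    FROM t
    GROUP BY id, customer
    HAVING SUM(CASE WHEN status = 'default' THEN 1 ELSE 0 END) = 0
""").fetchall()

# The naive row-level filter keeps John because of his non-default rows.
naive_rows = conn.execute("""
    SELECT DISTINCT id, customer
    FROM t
    WHERE status <> 'default'
    ORDER BY id
""").fetchall()

print(having_rows)
print(naive_rows)
```

The difference is that HAVING evaluates the condition once per customer, while WHERE evaluates it once per row.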

Something like
select distinct `customer` from `customers`
where `customer` not in (
select `customer` from `customers` where `status` = 'default'
);

The ALL() operator with a correlated sub-query works here:
WITH cte AS (
SELECT * FROM (VALUES
(1, 'john', 'default'),
(1, 'john', 'open'),
(1, 'john', 'paid'),
(2, 'alex', 'open'),
(2, 'alex', 'paid')
) AS x(id, customer, status)
)
SELECT *
FROM cte AS a
WHERE 'default' <> ALL (
SELECT status
FROM cte AS b
WHERE a.id = b.id
);
If you want just user and/or id, do select distinct «your desired columns» instead of select *.

Related

SQL for selecting values in a single column by 'AND' condition

I have table data like below
PersonId | Eat
--------------
111 Carrot
111 Apple
111 Orange
222 Carrot
222 Apple
333 Carrot
444 Orange
555 Apple
I need an sql query which return the total number of PersonId's who eat both Carrot and Apple.
In the above example the result is, Result : 2. (PersonId's 111 and 222)
An MS SQL query like 'select count(distinct PersonId) from Person where Eat = 'Carrot' and Eat = 'Apple'' does not work, because no single row can have both values.
You can actually get the count without using a subquery to determine the persons who eat both. Assuming that the rows are unique:
select ( count(distinct case when eat = 'carrot' then personid end) +
count(distinct case when eat = 'apple' then personid end) -
count(distinct personid)
) as num_both
from t
where eat in ('carrot', 'apple')
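The arithmetic behind this query is inclusion-exclusion: (carrot eaters) + (apple eaters) - (people who eat either) = (people who eat both). A rough sketch that runs it with Python's sqlite3 module on the question's data; the table name t and the exact casing of the values are assumptions, since SQLite compares strings case-sensitively by default.

```python
import sqlite3

# Assumption: an in-memory SQLite table named t holds the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (personid INTEGER, eat TEXT);
    INSERT INTO t VALUES
        (111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
        (222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
        (444, 'Orange'), (555, 'Apple');
""")

# Return the three intermediate counts plus the combined expression.
carrot, apple, either, num_both = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN eat = 'Carrot' THEN personid END),
           COUNT(DISTINCT CASE WHEN eat = 'Apple' THEN personid END),
           COUNT(DISTINCT personid),
           COUNT(DISTINCT CASE WHEN eat = 'Carrot' THEN personid END)
         + COUNT(DISTINCT CASE WHEN eat = 'Apple' THEN personid END)
         - COUNT(DISTINCT personid)
    FROM t
    WHERE eat IN ('Carrot', 'Apple')
""").fetchone()

print(carrot, apple, either, num_both)
```

On this data the intermediate counts are 3, 3 and 4, which gives 3 + 3 - 4 = 2.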
SELECT PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT PersonID FROM Person WHERE Eat = 'Apple'
You can use conditional aggregation of a sort:
select
personid
from <yourtable>
group by
personid
having
count (case when eat = 'carrot' then 1 else null end) >= 1
and count (case when eat = 'apple' then 1 else null end) >= 1
In this example, I use STRING_AGG to simplify the count by collapsing 'Apple' and 'Carrot' into a single string comparison:
create table #EatTemp
(
PersonId int,
Eat Varchar(50)
)
INSERT INTO #EatTemp VALUES
(111, 'Carrot')
,(111, 'Apple')
,(111, 'Orange')
,(222, 'Carrot')
,(222, 'Apple')
,(333, 'Carrot')
,(444, 'Orange')
,(555, 'Apple')
SELECT Count(PersonId) WhoEatCarrotAndApple FROM
(
SELECT PersonId,
STRING_AGG(Eat, ';')
WITHIN GROUP (ORDER BY Eat) Eat
FROM #EatTemp
WHERE Eat IN ('Apple', 'Carrot')
GROUP BY PersonId
) EatAgg
WHERE Eat = 'Apple;Carrot'
You can use EXISTS statements to achieve your goal. Below is a full set of code you can use to test the results. In this case, this returns a count of 2 since PersonId 111 and 222 match the criteria you specified in your post.
CREATE TABLE Person
( PersonId INT
, Eat VARCHAR(10));
INSERT INTO Person
VALUES
(111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
(222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
(444, 'Orange'), (555, 'Apple');
SELECT COUNT(DISTINCT PersonId)
FROM Person AS p
WHERE EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Apple'
AND p.PersonId = e1.PersonId)
AND EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Carrot'
AND p.PersonId = e1.PersonId);
EXISTS statements have a few advantages:
No chance of changing the granularity of your data since you aren't joining in your FROM clause.
Easy to add additional conditions as needed. Just add more EXISTS statements in your WHERE clause.
The condition is cleanly encapsulated in the EXISTS, so code intent is clear.
If you ever need complex conditions like existence of a value in another table based on specific filter conditions, then you can easily add this without introducing table joins in your main query.
Some alternative solutions such as PersonId IN (SUBQUERY) can introduce unexpected behavior in certain conditions, particularly when the subquery returns a NULL value.
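The NULL behavior mentioned in the last point is easiest to see with the negated form, NOT IN. The following is a small sketch using Python's sqlite3 module; the person table and the row with a NULL PersonId are hypothetical and added only to trigger the effect.

```python
import sqlite3

# Assumption: a hypothetical person table where one 'Apple' row has a NULL id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (personid INTEGER, eat TEXT);
    INSERT INTO person VALUES
        (111, 'Carrot'), (111, 'Apple'), (333, 'Carrot'), (NULL, 'Apple');
""")

# Carrot eaters who do not eat apples, written with NOT IN.
# The NULL in the subquery makes the comparison for 333 evaluate to NULL,
# so the row is silently filtered out.
not_in_rows = conn.execute("""
    SELECT personid
    FROM person
    WHERE eat = 'Carrot'
      AND personid NOT IN (SELECT personid FROM person WHERE eat = 'Apple')
""").fetchall()

# The same question written with NOT EXISTS is unaffected by the NULL.
not_exists_rows = conn.execute("""
    SELECT p.personid
    FROM person AS p
    WHERE p.eat = 'Carrot'
      AND NOT EXISTS (SELECT 1
                      FROM person AS a
                      WHERE a.eat = 'Apple'
                        AND a.personid = p.personid)
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```

NOT IN returns no rows at all here, while NOT EXISTS returns person 333 as expected.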
select
count(PersonID)
from Person
where eat = 'Carrot'
and PersonID in (select PersonID
from Person
where eat = 'Apple');
This selects the persons who eat carrots and keeps only those who also appear among the persons who eat apples.
SELECT COUNT (A.personID) FROM
(SELECT distinct PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT distinct PersonID FROM Person WHERE Eat = 'Apple') as A

Compare rows with each other and keep only one row according to a condition [duplicate]

I have a requirement as per below:
If more than 1 comment exist (group of name, lastname, door, amount) and one of them includes NULL then keep only the record with the NULL comment and discard the others.
If NULL is not one of them and the comment includes NOT AVAILABLE and REQUIRES. Keep NOT AVAILABLE - discard REQUIRES.
Name Lastname Comment Amount Door
------------------------------------------------------------
John R. NULL 250 1
John R. NULL 250 1
John R. New design is available 250 1
John R. Not available 250 2
John R. Requires additional comment 250 2
John R. XYZ 200 3
John R. Requires more information 200 4
John R. Requires more information 200 4
John R. Requires more information 200 4
John R. ABC 200 4
Result should look like:
Name Lastname Comment Amount Door
-------------------------------------------------------------
John R. NULL 250 1
John R. Not available 250 2
John R. XYZ 200 3
John R. Requires more information 200 4
John R. Requires more information 200 4
John R. Requires more information 200 4
John R. ABC 200 4
The check should apply only to groups that have more than one comment. Within such a group, the NULL comment or the Not available comment should be kept and the others discarded; if neither is present, the rows should pass through as they are.
I am trying to write a CTE to get the result but not sure how to compare the comment section. Something like below
WITH RNs AS
(
SELECT
name,
lastname,
door,
package,
DENSE_RANK() OVER (PARTITION BY name,lastname, comment, amount, door
ORDER BY name, lastname, amount, door ASC) AS RN
FROM
test
)
I'm sure someone might come up with something a bit more elegant, however this produces the desired output with your sample data.
This partitions by your requirement for the grouping classification to order the rows sequentially per group, and within each group additionally by a second ordering criteria to rank the not available/requires comments.
It then computes a sum per group to count the number of null/not available occurrences in that group.
Finally, it selects the first row from each group, or every row of a group that has no null/not available occurrences.
with cte as (
select *, Row_Number() over (
partition by name, lastname, amount, door
order by case when comment is null then 0 when comment like 'not available%' then 2 when comment like 'requires%' then 3 else 1 end -- NULL sorts first explicitly, so rn=1 is the NULL row whenever one exists
) rn,
Sum(case when comment is null or comment like 'not available%' then 1 else 0 end) over (partition by name, lastname, amount, door) gp
from test
)
select [Name], Lastname, Comment, Amount, Door
from cte
where rn=1 or gp=0
order by door, comment desc
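For anyone who wants to try the row-number-plus-group-sum idea outside SQL Server, here is a rough sketch using Python's sqlite3 module. It assumes a SQLite build with window functions (3.25 or later) and relies on SQLite's LIKE being case-insensitive for ASCII. The ordering CASE in this sketch ranks NULL comments first explicitly, which is my own choice so that the first row of a group is deterministic.

```python
import sqlite3

# Assumption: an in-memory SQLite copy of the question's test table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (name TEXT, lastname TEXT, comment TEXT, amount INTEGER, door INTEGER)"
)
rows = [
    ("John", "R.", None, 250, 1),
    ("John", "R.", None, 250, 1),
    ("John", "R.", "New design is available", 250, 1),
    ("John", "R.", "Not available", 250, 2),
    ("John", "R.", "Requires additional comment", 250, 2),
    ("John", "R.", "XYZ", 200, 3),
    ("John", "R.", "Requires more information", 200, 4),
    ("John", "R.", "Requires more information", 200, 4),
    ("John", "R.", "Requires more information", 200, 4),
    ("John", "R.", "ABC", 200, 4),
]
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", rows)

# rn orders each group so the row to keep comes first;
# gp counts the NULL / 'not available' rows in the group.
kept = conn.execute("""
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY name, lastname, amount, door
                   ORDER BY CASE WHEN comment IS NULL THEN 0
                                 WHEN comment LIKE 'not available%' THEN 2
                                 WHEN comment LIKE 'requires%' THEN 3
                                 ELSE 1 END
               ) AS rn,
               SUM(CASE WHEN comment IS NULL OR comment LIKE 'not available%'
                        THEN 1 ELSE 0 END)
                   OVER (PARTITION BY name, lastname, amount, door) AS gp
        FROM test
    )
    SELECT door, comment
    FROM cte
    WHERE rn = 1 OR gp = 0
""").fetchall()

for row in kept:
    print(row)
```

Groups with gp = 0 (doors 3 and 4 here) pass through untouched, and the other groups are reduced to their first row.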
The following produces your desired result. It's not pretty because it has to detect the 3 different conditions for keeping a row.
It checks whether a NULL comment exists in the table for the same Name/Lastname/Door/Amount and if so discards any other rows.
It checks whether a NOT AVAILABLE exists in the table for the same Name/Lastname/Door/Amount and if so discards any REQUIRES rows.
Any other rows are left untouched.
create table #Test ([Name] varchar(12), Lastname varchar(12), Comment varchar(128), Amount money, Door int);
insert into #Test ([Name], Lastname, Comment, Amount, Door)
values
('John', 'R.', NULL, 250, 1),
('John', 'R.', NULL, 250, 1),
('John', 'R.', 'New design is available', 250, 1),
('John', 'R.', 'Not available', 250, 2),
('John', 'R.', 'Requires additional comment', 250, 2),
('John', 'R.', 'XYZ', 200, 3),
('John', 'R.', 'Requires more information', 200, 4),
('John', 'R.', 'Requires more information', 200, 4),
('John', 'R.', 'Requires more information', 200, 4),
('John', 'R.', 'ABC', 200, 4);
WITH RNs AS (
SELECT
[Name]
, Lastname
, Door
, Amount
, Comment
, ROW_NUMBER() OVER (PARTITION BY [Name], Lastname, Amount, Door
ORDER BY CASE WHEN Comment IS NULL THEN 1 ELSE 0 END DESC) AS RN
, CASE WHEN EXISTS (
SELECT 1
FROM #Test T1
WHERE T.[Name] = T1.[Name]
AND T.Lastname = T1.Lastname
AND T.Door = T1.Door
AND T.Amount = T1.Amount
AND T1.Comment IS NULL
) THEN 1 ELSE 0 END HasNull
, CASE WHEN EXISTS (
SELECT 1
FROM #Test T1
WHERE T.[Name] = T1.[Name]
AND T.Lastname = T1.Lastname
AND T.Door = T1.Door
AND T.Amount = T1.Amount
AND T1.Comment LIKE '%not available%'
) THEN 1 ELSE 0 END HasNotAvailable
FROM #Test T
)
SELECT *
FROM Rns
WHERE (HasNull = 1 AND RN = 1)
OR (HasNotAvailable = 1 AND Comment NOT LIKE '%requires%')
OR (HasNull = 0 AND HasNotAvailable = 0)
ORDER BY Door, Comment;
Note: If you present your sample data as DDL+DML (as above), it's much easier and faster for people to answer.

Query to exclude two or more records if they match a single value

I have a database table in which multiple customers can be assigned to multiple types. I am having trouble formulating a query that will exclude all customer records that match a certain type. For example:
ID CustomerName Type
=========================
111 John Smith TFS-A
111 John Smith PRO
111 John Smith RWAY
222 Jane Doe PRO
222 Jane Doe TFS-A
333 Richard Smalls PRO
444 Bob Rhoads PRO
555 Jacob Jones TFS-B
555 Jacob Jones TFS-A
What I want is to pull only those people who are marked PRO but not marked TFS. If they are PRO and TFS, exclude them.
Any help is greatly appreciated.
You can get all 'PRO' customers and use NOT EXISTS clause to exclude the ones that are also 'TFS':
SELECT DISTINCT ID, CustomerName
FROM mytable AS t1
WHERE [Type] = 'PRO' AND NOT EXISTS (SELECT 1
FROM mytable AS t2
WHERE t1.ID = t2.ID AND [Type] LIKE 'TFS%')
solution using EXCEPT
WITH TestData
AS (
SELECT *
FROM (
VALUES ( 111, 'John Smith', 'TFS-A' )
, ( 111, 'John Smith', 'PRO' )
, ( 111, 'John Smith', 'RWAY' )
, ( 222, 'Jane Doe', 'PRO' )
, ( 222, 'Jane Doe', 'TFS-A' )
, ( 333, 'Richard Smalls', 'PRO' )
, ( 444, 'Bob Rhoads', 'PRO' )
, ( 555, 'Jacob Jones', 'TFS-B' )
, ( 555, 'Jacob Jones', 'TFS-A' ))
AS t (ID, CustomerName, [Type])
)
SELECT ID, CustomerName
FROM TestData
WHERE [Type] = 'PRO'
EXCEPT
SELECT ID, CustomerName
FROM TestData
WHERE [Type] LIKE 'TFS%'
Select DISTINCT(Customername),ID
FROM tablename
WHERE NOT (ID IN (SELECT ID FROM tablename WHERE type='PRO')
AND ID IN (SELECT ID FROM tablename WHERE type LIKE 'TFS%'))
EDIT: now added working TFS clause
This gets all customers that do not have both TYPE PRO and TFS, for example.
SQLFIDDLE:http://sqlfiddle.com/#!9/da4f9/2
Try This :
SELECT *
FROM table a
WHERE Type = 'PRO'
AND NOT EXISTS(SELECT 1
FROM table b
WHERE a.ID = b.ID
AND LEFT(Type, 3) = 'TFS')
I know this question has been answered, but my answer is different. Everyone else's solution involves two queries, which is what I call "double-dipping": you have to access the same table twice. It's better to avoid this when possible for better performance. Check this out:
SELECT ID,
CustomerName,
MIN([type]) AS [Type] --doesn't matter if it's MIN or MAX
FROM yourTable
WHERE [Type] = 'PRO' --only load values that matter. Ignore RWAY
OR [Type] LIKE 'TFS-_' --notice I use a "_" instead of "%". That because "_" is a wildcard for a single character
--instead of wildcard looking for any number of characters because normally it's best to be as narrow as possible to be more efficient
GROUP BY ID,CustomerName
HAVING SUM(CASE
WHEN [Type] = 'PRO' --This is where it returns values that only have type PRO
THEN 9999
ELSE 1
END
) = 9999
Let me explain my funky HAVING logic. It's a SUM() in which PRO counts as 9999 and TFS-_ counts as 1, so when the sum is EXACTLY 9999, the customer has PRO and no TFS. I can't just use COUNT(*) = 1, because a customer with only one TFS and no PRO would be returned, which of course would be incorrect.
Results:
ID CustomerName Type
----------- -------------- -----
444 Bob Rhoads PRO
333 Richard Smalls PRO
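To sanity-check the single-pass idea, here is a small sketch with Python's sqlite3 module. It runs the weighted-sum query and, next to it, a variant of my own that uses two explicit counters instead of the 9999 weight. The customers table and column names are assumptions for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite table named customers with the question's rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, customer_name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (111, "John Smith", "TFS-A"), (111, "John Smith", "PRO"),
        (111, "John Smith", "RWAY"),
        (222, "Jane Doe", "PRO"), (222, "Jane Doe", "TFS-A"),
        (333, "Richard Smalls", "PRO"), (444, "Bob Rhoads", "PRO"),
        (555, "Jacob Jones", "TFS-B"), (555, "Jacob Jones", "TFS-A"),
    ],
)

# Weighted sum: one PRO row contributes 9999, each TFS row contributes 1.
weighted = conn.execute("""
    SELECT id, customer_name
    FROM customers
    WHERE type = 'PRO' OR type LIKE 'TFS-_'
    GROUP BY id, customer_name
    HAVING SUM(CASE WHEN type = 'PRO' THEN 9999 ELSE 1 END) = 9999
    ORDER BY id
""").fetchall()

# Variant with two explicit counters: at least one PRO row and no TFS rows.
counters = conn.execute("""
    SELECT id, customer_name
    FROM customers
    GROUP BY id, customer_name
    HAVING SUM(CASE WHEN type = 'PRO' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN type LIKE 'TFS%' THEN 1 ELSE 0 END) = 0
    ORDER BY id
""").fetchall()

print(weighted)
print(counters)
```

Both queries read the table once. The two-counter form does not depend on a customer having exactly one PRO row, which the weighted sum implicitly assumes.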

SQL Select with Priority

I need to select top 1 most valid discount for a given FriendId.
I have the following tables:
DiscountTable - describes different discount types
DiscountId, Percent, Type, Rank
1 , 20 , Friend, 2
2 , 10 , Overwrite, 1
Then I have another two tables (both list FriendIds)
Friends
101
102
103
Overwrites
101
105
I have to select top 1 most valid discount for a given FriendId. So for the above data this would be sample output
Id = 101 => gets "Overwrite" discount (higher rank)
Id = 102 => gets "Friend" discount (only in friends table)
Id = 103 => gets "Friend" discount (only in friends table)
Id = 105 => gets "Overwrite" discount
Id = 106 => gets NO discount as it exists in neither the Friends nor the Overwrites table
INPUT => SINGLE friendId (int).
OUTPUT => Single DISCOUNT Record (DiscountId, Percent, Type)
Overwrites and Friend tables are the same. They only hold list of Ids (single column)
Having multiple tables of identical structure is usually bad practice; a single table with ID and Type would suffice. You could then use it in a JOIN to your DiscountTable:
;WITH cte AS (SELECT ID,[Type] = 'Friend'
FROM Friends
UNION ALL
SELECT ID,[Type] = 'Overwrite'
FROM Overwrites
)
SELECT TOP 1 a.[Type]
FROM cte a
JOIN DiscountTable DT
ON a.[Type] = DT.[Type]
WHERE ID = '105'
ORDER BY [Rank]
Note, non-existent ID values will not return.
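Here is a minimal sketch of the same UNION ALL plus join idea, run through Python's sqlite3 module so it can be tried without SQL Server. The column names pct and discount_rank, the helper best_discount, and the use of LIMIT 1 in place of TOP 1 are my assumptions; a lower rank number is treated as higher priority, as in the question's example.

```python
import sqlite3

# Assumption: in-memory SQLite tables shaped like the question's three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE discount (discount_id INTEGER, pct INTEGER, type TEXT, discount_rank INTEGER);
    INSERT INTO discount VALUES (1, 20, 'Friend', 2), (2, 10, 'Overwrite', 1);
    CREATE TABLE friends (friend_id INTEGER);
    INSERT INTO friends VALUES (101), (102), (103);
    CREATE TABLE overwrites (friend_id INTEGER);
    INSERT INTO overwrites VALUES (101), (105);
""")


def best_discount(friend_id):
    """Return (discount_id, pct, type) for the best-ranked discount, or None."""
    return conn.execute("""
        SELECT d.discount_id, d.pct, d.type
        FROM (SELECT friend_id, 'Friend' AS type FROM friends
              UNION ALL
              SELECT friend_id, 'Overwrite' AS type FROM overwrites) AS e
        JOIN discount AS d ON d.type = e.type
        WHERE e.friend_id = ?
        ORDER BY d.discount_rank
        LIMIT 1
    """, (friend_id,)).fetchone()


for fid in (101, 102, 105, 106):
    print(fid, best_discount(fid))
```

An id that appears in neither list, such as 106, yields None rather than a row, which matches the note above.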
This will get you all the FriendIds and the associated discount of the highest rank. It's an older hack that doesn't require using top or row numbering.
select
elig.FriendId,
min([Rank] * 10000 + DiscountId) % 10000 as DiscountId,
min([Rank] * 10000 + [Percent]) % 10000 as [Percent]
from
DiscountTable as dt
inner join (
select FriendId, 'Friend' as Type from Friends union all
select FriendId, 'Overwrite' from Overwrites
) as elig /* for eligible? */
on elig.Type = dt.Type
group by
elig.FriendId
create table discounts (id int, percent1 int, type1 varchar(12), rank1 int)
insert into discounts
values (1 , 20 , 'Friend', 2),
(2 , 10 , 'Overwrite', 1)
create table friends (friendid int)
insert into friends values (101),(102), (103)
create table overwrites (overwriteid int)
insert into overwrites values (101),(105)
select ids, isnull(percent1,0) as discount from (
select case when friendid IS null and overwriteid is null then 'no discount'
when friendid is null and overwriteid is not null then 'overwrite'
when friendid is not null and overwriteid is null then 'friend'
when friendid is not null and overwriteid is not null then (select top 1 TYPE1 from discounts order by rank1) -- rank 1 is the highest priority, so sort ascending
else '' end category
,ids
from tcase left outer join friends
on tcase.ids = friends.friendid
left join overwrites
on tcase.ids = overwrites.overwriteid
) category1 left join discounts
on category1.category=discounts.type1

Query Results from Two Different Tables

I'm trying to write a query to produce a dataset from two or more tables, and I'm having trouble writing the query. I apologize in advance for my lack of knowledge in SQL.
Table 1 consists of basic customer account info and Table 2 consists of customer contract details where one customer account can have multiple contracts, both inactive and active
Table 1 and Table 2 can be joined with the values contained under a column named acct_id.
I would like the query to show only acct_ids where account status (acct_status) is "active" from Table 1, and that do not have an "active" contract from Table 2.
The problem is that in Table 2, more than one contract can be associated with an acct_id, and those contracts can be in different statuses.
If my where clause just focuses on the contract status values from table 2, my dataset won't be accurate. It will only return acct_ids that have contracts with those values.
for example:
acct_iD 123 has 6 contracts: 1 active contract, 4 cancelled contracts, 1 cancel in progress contract
acct_ID 456 has 3 contracts: 3 cancelled contracts
acct_ID 789 has 4 contracts: 2 active contracts, 2 cancelled contracts
acct_ID 012 has 1 contract: 1 cancelled contract
I would like my query result to show only acct_IDs: 456 and 012 as it truly represents that they do not have "active" contracts
I'm using SQL Server Management Studio 2008 R2.
select acct_id
from table1
where acct_status = 'active' and
acct_id not in (select acct_id from table2 where contract_status = 'active')
SELECT A.*
FROM Table1 A
WHERE A.acct_status = 'active'
AND NOT A.acct_id in (SELECT acct_id FROM Table2 WHERE contract_status = 'active')
Avoid the horror of IN and sub-selects by utilising LEFT OUTER JOINs like so:
SELECT A.*
FROM Table1 A
LEFT OUTER JOIN table2 B
ON b.acct_id = A.acct_id
AND B.contract_status = 'active'
WHERE A.acct_status = 'active'
AND b.acct_id IS NULL
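A quick way to convince yourself that the LEFT JOIN anti-join and a NOT EXISTS filter agree is to run both on the example from the question. This is only a sketch using Python's sqlite3 module; the column name contract_status and the assumption that all four accounts are active in Table 1 come from the question's description, not from a real schema.

```python
import sqlite3

# Assumption: in-memory SQLite tables; all four accounts are 'active' in table1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (acct_id TEXT, acct_status TEXT)")
conn.execute("CREATE TABLE table2 (acct_id TEXT, contract_status TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, 'active')",
    [("123",), ("456",), ("789",), ("012",)],
)
contracts = (
    [("123", "active")] + [("123", "cancelled")] * 4 + [("123", "cancel in progress")]
    + [("456", "cancelled")] * 3
    + [("789", "active")] * 2 + [("789", "cancelled")] * 2
    + [("012", "cancelled")]
)
conn.executemany("INSERT INTO table2 VALUES (?, ?)", contracts)

# Anti-join: join only to active contracts, then keep accounts with no match.
left_join_rows = conn.execute("""
    SELECT a.acct_id
    FROM table1 AS a
    LEFT OUTER JOIN table2 AS b
        ON b.acct_id = a.acct_id
       AND b.contract_status = 'active'
    WHERE a.acct_status = 'active'
      AND b.acct_id IS NULL
    ORDER BY a.acct_id
""").fetchall()

# Same question expressed with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT a.acct_id
    FROM table1 AS a
    WHERE a.acct_status = 'active'
      AND NOT EXISTS (SELECT 1
                      FROM table2 AS b
                      WHERE b.acct_id = a.acct_id
                        AND b.contract_status = 'active')
    ORDER BY a.acct_id
""").fetchall()

print(left_join_rows)
print(not_exists_rows)
```

The contract status filter has to sit in the ON clause of the join. Moving it to the WHERE clause would turn the outer join back into an inner join and return nothing.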
IF OBJECT_ID(N'tempdb..#Customer', N'U') is not null drop table #Customer
select
identity(int,1,1) as Customer_ID
, 'John Doe, the Ranger' as name
, '012' as Acct_ID
, 1 as Active
into #Customer
insert into #Customer (name, Acct_ID, Active) values ('Kermit the Frog', '789',1)
select * from #Customer
GO
IF OBJECT_ID(N'tempdb..#Contracts', N'U') is not null drop table #Contracts
select
identity(int,1,1) as Contract_ID
, 1 as Customer_ID
, '012' as Acct_ID
, 123.45 as amt
, 1 as Active
into #Contracts
insert into #Contracts (Customer_ID, Acct_ID, amt, active) values (1, '012', 234.56, 1)
insert into #Contracts (Customer_ID, Acct_ID, amt, active) values (2, '788', 9.56,1)
insert into #Contracts (Customer_ID, Acct_ID, amt, active) values (1, '789', 111.56, 0)
select * from #Contracts A
select *
from #Customer A
where a.Active=1
and (a.Acct_ID not in (select Acct_ID from #Contracts where Active=1))